
AttributeError: module 'quant_cuda' has no attribute 'vecquant4matmul' #1433

Closed
1 task done
thistleknot opened this issue Apr 21, 2023 · 2 comments
Labels: bug (Something isn't working), stale

Comments

@thistleknot

Describe the bug

Unable to run 4-bit quantized models on GPU despite having correct GPTQ installed.

Is there an existing issue for this?

  • I have searched the existing issues

Reproduction

pip uninstall -y quant-cuda
git clone https://github.com/qwopqwop200/GPTQ-for-LLaMa.git -b cuda
cd GPTQ-for-LLaMa
python setup_cuda.py install
python server.py --model vicuna-13b-GPTQ-4bit-128g --wbits 4 --groupsize 128 --chat --listen --auto-devices --gpu-memory 3500MiB
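One way to narrow this down after the `setup_cuda.py install` step is to check what the installed `quant_cuda` extension actually exposes, since an old `quant-cuda` pip package or a stale build can shadow the freshly compiled one. The sketch below is a hypothetical diagnostic, not part of the original repro; the kernel names are taken from the GPTQ-for-LLaMa cuda branch and may differ in other forks.

```python
import importlib

# Kernel names as exported by the GPTQ-for-LLaMa cuda branch at the time
# of this issue (assumption; other forks/branches may export a different set).
EXPECTED_KERNELS = (
    "vecquant2matmul",
    "vecquant3matmul",
    "vecquant4matmul",
    "vecquant8matmul",
)

def check_kernels(module_name="quant_cuda"):
    """Return {kernel_name: present} for the named extension module,
    or None if the module cannot be imported at all (build/install failed)."""
    try:
        mod = importlib.import_module(module_name)
    except ImportError:
        return None
    return {name: hasattr(mod, name) for name in EXPECTED_KERNELS}

if __name__ == "__main__":
    kernels = check_kernels()
    if kernels is None:
        print("quant_cuda is not importable; the build/install step failed")
    else:
        for name, present in kernels.items():
            print(f"{name}: {'OK' if present else 'MISSING'}")
```

If `vecquant4matmul` shows as MISSING while the module imports fine, the webui is picking up a different `quant_cuda` build than the one just compiled (e.g. a leftover egg from a previous install).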

Screenshot

No response

Logs

Traceback (most recent call last):
  File "/mnt/distvol/text-generation-webui/modules/callbacks.py", line 66, in gentask
    ret = self.mfunc(callback=_callback, **self.kwargs)
  File "/mnt/distvol/text-generation-webui/modules/text_generation.py", line 252, in generate_with_callback
    shared.model.generate(**kwargs)
  File "/mnt/distvol/python_user/python_user/gpt/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/mnt/distvol/python_user/python_user/gpt/lib/python3.9/site-packages/transformers/generation/utils.py", line 1485, in generate
    return self.sample(
  File "/mnt/distvol/python_user/python_user/gpt/lib/python3.9/site-packages/transformers/generation/utils.py", line 2524, in sample
    outputs = self(
  File "/mnt/distvol/python_user/python_user/gpt/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/distvol/python_user/python_user/gpt/lib/python3.9/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/mnt/distvol/python_user/python_user/gpt/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 687, in forward
    outputs = self.model(
  File "/mnt/distvol/python_user/python_user/gpt/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/distvol/python_user/python_user/gpt/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 577, in forward
    layer_outputs = decoder_layer(
  File "/mnt/distvol/python_user/python_user/gpt/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/distvol/python_user/python_user/gpt/lib/python3.9/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/mnt/distvol/python_user/python_user/gpt/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 292, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
  File "/mnt/distvol/python_user/python_user/gpt/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/distvol/python_user/python_user/gpt/lib/python3.9/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/mnt/distvol/python_user/python_user/gpt/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 196, in forward
    query_states = self.q_proj(hidden_states).view(bsz, q_len, self.num_heads, self.head_dim).transpose(1, 2)
  File "/mnt/distvol/python_user/python_user/gpt/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/distvol/python_user/python_user/gpt/lib/python3.9/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/mnt/distvol/text-generation-webui/repositories/GPTQ-for-LLaMa/quant.py", line 426, in forward
    quant_cuda.vecquant4matmul(x, self.qweight, y, self.scales, self.qzeros, self.groupsize)
AttributeError: module 'quant_cuda' has no attribute 'vecquant4matmul'

System Info

Python 3.9
Oracle Linux 8
CUDA 11.7
4GB VRAM
Quadro P1000
thistleknot added the bug label on Apr 21, 2023

thistleknot commented Apr 21, 2023

I followed advice from another post and uninstalled gptq from pip,

but then when running the same model, I get a new error (a new error is a better error):

To create a public link, set `share=True` in `launch()`.
Traceback (most recent call last):
  File "/mnt/distvol/text-generation-webui/modules/callbacks.py", line 66, in gentask
    ret = self.mfunc(callback=_callback, **self.kwargs)
  File "/mnt/distvol/text-generation-webui/modules/text_generation.py", line 252, in generate_with_callback
    shared.model.generate(**kwargs)
  File "/mnt/distvol/python_user/python_user/gpt/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/mnt/distvol/python_user/python_user/gpt/lib/python3.9/site-packages/transformers/generation/utils.py", line 1485, in generate
    return self.sample(
  File "/mnt/distvol/python_user/python_user/gpt/lib/python3.9/site-packages/transformers/generation/utils.py", line 2524, in sample
    outputs = self(
  File "/mnt/distvol/python_user/python_user/gpt/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/distvol/python_user/python_user/gpt/lib/python3.9/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/mnt/distvol/python_user/python_user/gpt/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 687, in forward
    outputs = self.model(
  File "/mnt/distvol/python_user/python_user/gpt/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/distvol/python_user/python_user/gpt/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 577, in forward
    layer_outputs = decoder_layer(
  File "/mnt/distvol/python_user/python_user/gpt/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/distvol/python_user/python_user/gpt/lib/python3.9/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/mnt/distvol/python_user/python_user/gpt/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 292, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
  File "/mnt/distvol/python_user/python_user/gpt/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/distvol/python_user/python_user/gpt/lib/python3.9/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/mnt/distvol/python_user/python_user/gpt/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 196, in forward
    query_states = self.q_proj(hidden_states).view(bsz, q_len, self.num_heads, self.head_dim).transpose(1, 2)
  File "/mnt/distvol/python_user/python_user/gpt/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/distvol/python_user/python_user/gpt/lib/python3.9/site-packages/accelerate/hooks.py", line 160, in new_forward
    args, kwargs = module._hf_hook.pre_forward(module, *args, **kwargs)
  File "/mnt/distvol/python_user/python_user/gpt/lib/python3.9/site-packages/accelerate/hooks.py", line 280, in pre_forward
    set_module_tensor_to_device(module, name, self.execution_device, value=self.weights_map[name])
  File "/mnt/distvol/python_user/python_user/gpt/lib/python3.9/site-packages/accelerate/utils/offload.py", line 123, in __getitem__
    return self.dataset[f"{self.prefix}{key}"]
  File "/mnt/distvol/python_user/python_user/gpt/lib/python3.9/site-packages/accelerate/utils/offload.py", line 170, in __getitem__
    weight_info = self.index[key]
KeyError: 'model.layers.17.self_attn.q_proj.wf1'
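This `KeyError` comes out of accelerate's CPU-offload path: with `--auto-devices`/`--gpu-memory`, weights moved off the GPU are fetched by parameter name from an offload index, and the quantized `q_proj` layer carries an extra buffer (`wf1` here) that the index was never built with. A minimal illustration of the failing lookup, with hypothetical index contents:

```python
# Sketch of the lookup that fails in accelerate's offload machinery:
# offloaded weights are fetched by fully-qualified parameter name from
# an index dict. The entries below are hypothetical; the point is that
# a buffer name absent from the index raises KeyError, as in the log.
offload_index = {
    "model.layers.17.self_attn.q_proj.qweight": "weights_0.dat",
    "model.layers.17.self_attn.q_proj.scales": "weights_1.dat",
}

def fetch_offloaded(index, key):
    """Mimic the index lookup in accelerate's offload loader: a plain
    dict access, so unknown keys raise KeyError rather than returning None."""
    return index[key]

try:
    fetch_offloaded(offload_index, "model.layers.17.self_attn.q_proj.wf1")
except KeyError as err:
    print(f"missing from offload index: {err}")
```

This suggests the second failure is not a build problem but a mismatch between GPTQ's quantized-layer buffers and accelerate's offloading, i.e. the offload flags and 4-bit quantization do not combine cleanly here.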

@github-actions

This issue has been closed due to inactivity for 30 days. If you believe it is still relevant, please leave a comment below.
