RuntimeError: mat1 and mat2 shapes cannot be multiplied (1024x4096 and 1x8388608) #600
Comments
Same issue for me as well. I am running the same code in a Colab notebook and getting the error below:
It may be caused by PEFT. I installed PEFT with git+https://github.com/huggingface/peft.git; peft-0.3.0 does not work.
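If the comment above is right that the problem is specific to the peft 0.3.0 release (an assumption taken from this thread, not verified against the PEFT changelog), a hypothetical guard like the following can flag the broken version before training starts:

```python
# Hypothetical guard (assumption from this thread: peft releases up to and
# including 0.3.x hit the raw F.linear path on 4-bit weights).
def peft_needs_upgrade(installed: str) -> bool:
    """Return True for peft versions 0.3.x or older."""
    major, minor = (int(part) for part in installed.split(".")[:2])
    return (major, minor) <= (0, 3)

print(peft_needs_upgrade("0.3.0"))   # True  -> reinstall from git
print(peft_needs_upgrade("0.4.0"))   # False
```

In practice the workaround reported in this thread is simply reinstalling from source: pip install -U git+https://github.com/huggingface/peft.git.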
@CRyan2016 do you mean that by updating PEFT you were able to get it working?
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Training with load_in_4bit leads to RuntimeError: mat1 and mat2 shapes cannot be multiplied (1024x4096 and 1x8388608)
Version used
transformers: 4.30.2
bitsandbytes: 0.40.0
Error detail
RuntimeError: Caught RuntimeError in replica 0 on device 0.
Original Traceback (most recent call last):
  File "/home/srikapan/anaconda3/envs/llm_peft/lib/python3.9/site-packages/torch/nn/parallel/parallel_apply.py", line 64, in _worker
    output = module(*input, **kwargs)
  File "/home/srikapan/anaconda3/envs/llm_peft/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/srikapan/anaconda3/envs/llm_peft/lib/python3.9/site-packages/peft/peft_model.py", line 678, in forward
    return self.base_model(
  File "/home/srikapan/anaconda3/envs/llm_peft/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/srikapan/anaconda3/envs/llm_peft/lib/python3.9/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/home/srikapan/anaconda3/envs/llm_peft/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 688, in forward
    outputs = self.model(
  File "/home/srikapan/anaconda3/envs/llm_peft/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/srikapan/anaconda3/envs/llm_peft/lib/python3.9/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/home/srikapan/anaconda3/envs/llm_peft/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 578, in forward
    layer_outputs = decoder_layer(
  File "/home/srikapan/anaconda3/envs/llm_peft/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/srikapan/anaconda3/envs/llm_peft/lib/python3.9/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/home/srikapan/anaconda3/envs/llm_peft/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 292, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
  File "/home/srikapan/anaconda3/envs/llm_peft/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/srikapan/anaconda3/envs/llm_peft/lib/python3.9/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/home/srikapan/anaconda3/envs/llm_peft/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 194, in forward
    query_states = self.q_proj(hidden_states).view(bsz, q_len, self.num_heads, self.head_dim).transpose(1, 2)
  File "/home/srikapan/anaconda3/envs/llm_peft/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/srikapan/anaconda3/envs/llm_peft/lib/python3.9/site-packages/peft/tuners/lora.py", line 565, in forward
    result = F.linear(x, transpose(self.weight, self.fan_in_fan_out), bias=self.bias)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (1024x4096 and 1x8388608)
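The odd mat2 shape in the error appears consistent with 4-bit packing: bitsandbytes stores two 4-bit values per uint8 byte, so a 4096x4096 q_proj weight becomes a flat buffer of 4096*4096/2 = 8,388,608 bytes, and the traceback shows the LoRA layer calling plain F.linear on that packed storage instead of the quantized matmul. A sketch of the arithmetic (the 4096/1024 figures are assumptions based on a Llama-7B hidden size and the shapes in the error message):

```python
# Illustrative arithmetic (assumption: q_proj is a 4096x4096 Linear layer
# quantized to 4 bits, with two values packed into each uint8 byte).
hidden_size = 4096                      # in_features == out_features for q_proj
tokens = 1024                           # mat1 rows: batch * seq_len flattened

packed_bytes = hidden_size * hidden_size // 2
print(packed_bytes)                     # 8388608 -> the "1x8388608" mat2

# F.linear(x, W) computes x @ W.T, so x's last dim (4096) must match W.T's
# first dim. The packed buffer is stored as (8388608, 1), whose transpose
# is (1, 8388608), hence "shapes cannot be multiplied":
mat1_cols, mat2_rows = hidden_size, 1
print(mat1_cols == mat2_rows)           # False
```

This is why updating PEFT helps: newer releases route 4-bit base weights through the bitsandbytes forward rather than F.linear on the raw packed tensor.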