
Add GloRa Implementation #1880

Closed

Conversation

@viliamvolosv commented Jun 21, 2024

@viliamvolosv (Author)

I stopped at the original error:
File "/home/guest/.local/lib/python3.10/site-packages/peft/tuners/glora/layer.py", line 148, in forward result = F.linear(x, self.weight + self.weight*A + B, bias=E+torch.matmul(self.weight, C).squeeze()) RuntimeError: The size of tensor a (2048) must match the size of tensor b (64) at non-singleton dimension 0

This is the log from the run:
forward A.shape: torch.Size([64, 64]), B.shape: torch.Size([64, 64]), C.shape: torch.Size([64, 1]), D.shape: torch.Size([1]), E.shape: torch.Size([1])

and we need to decide how to deal with this, because as far as I understand this is normal behaviour:
(image attached in the original comment)

https://arxiv.org/pdf/2306.07967
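
To make the shape clash concrete, here is a tiny standalone repro. The weight size is illustrative, not the real layer's, and the expansion at the end is just one possible reading of the paper, not the code in this PR:

```python
import torch

# Illustrative sizes only: a frozen weight whose out_features (2048) differ from
# the size the GLoRA support tensor was built with (64).
weight = torch.randn(2048, 64)   # [out_features, in_features]
A = torch.randn(64, 64)          # support tensor left at its logged size

try:
    _ = weight + weight * A      # same element-wise ops as in glora/layer.py forward
except RuntimeError as e:
    # The size of tensor a (2048) must match the size of tensor b (64) at non-singleton dimension 0
    print(e)

# The ops broadcast once the support is materialized to the weight's shape,
# e.g. a per-output-row "vector" support repeated across in_features:
A_vec = torch.randn(2048, 1).expand_as(weight)
_ = weight + weight * A_vec      # no error
```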

@BenjaminBossan, this is your time to help.
Use glora_finetuning.py to start GLoRA.

@viliamvolosv (Author)

Hey @BenjaminBossan, can you look at the issue in this PR, or maybe someone else from PEFT can? I'm stuck.
prepare_path config: vector, X.shape: torch.Size([64, 1]), Xd.shape: torch.Size([64, 8]), Xu.shape: torch.Size([8, 64])
prepare_path config: vector, X.shape: torch.Size([64, 1]), Xd.shape: torch.Size([64, 8]), Xu.shape: torch.Size([8, 64])
prepare_path config: LoRA_8, X.shape: torch.Size([64, 1]), Xd.shape: torch.Size([64, 8]), Xu.shape: torch.Size([8, 1])
prepare_path config: none, X.shape: torch.Size([1]), Xd.shape: torch.Size([64, 1]), Xu.shape: None
prepare_path config: vector, X.shape: torch.Size([64, 1]), Xd.shape: torch.Size([64, 1]), Xu.shape: None
forward A.shape: torch.Size([64, 1]), B.shape: torch.Size([64, 1]), C.shape: torch.Size([64, 1]), D.shape: torch.Size([1]), E.shape: torch.Size([64, 1])

Traceback (most recent call last):
  File "/home/guest/peft/examples/glora_finetuning/glora_finetuning.py", line 117, in <module>
    training_function()
  File "/home/guest/peft/examples/glora_finetuning/glora_finetuning.py", line 105, in training_function
    trainer.train()
  File "/home/guest/.local/lib/python3.10/site-packages/trl/trainer/sft_trainer.py", line 440, in train
    output = super().train(*args, **kwargs)
  File "/home/guest/.local/lib/python3.10/site-packages/transformers/trainer.py", line 1885, in train
    return inner_training_loop(
  File "/home/guest/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2216, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "/home/guest/.local/lib/python3.10/site-packages/transformers/trainer.py", line 3238, in training_step
    loss = self.compute_loss(model, inputs)
  File "/home/guest/.local/lib/python3.10/site-packages/transformers/trainer.py", line 3264, in compute_loss
    outputs = model(**inputs)
  File "/home/guest/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1552, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/guest/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1561, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/guest/.local/lib/python3.10/site-packages/peft/peft_model.py", line 1505, in forward
    return self.base_model(
  File "/home/guest/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1552, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/guest/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1561, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/guest/.local/lib/python3.10/site-packages/peft/tuners/tuners_utils.py", line 180, in forward
    return self.model.forward(*args, **kwargs)
  File "/home/guest/.local/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 1164, in forward
    outputs = self.model(
  File "/home/guest/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1552, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/guest/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1561, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/guest/.local/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 957, in forward
    layer_outputs = self._gradient_checkpointing_func(
  File "/home/guest/.local/lib/python3.10/site-packages/torch/_compile.py", line 30, in inner
    return disable_fn(*args, **kwargs)
  File "/home/guest/.local/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 599, in _fn
    return fn(*args, **kwargs)
  File "/home/guest/.local/lib/python3.10/site-packages/torch/utils/checkpoint.py", line 480, in checkpoint
    return CheckpointFunction.apply(function, preserve, *args)
  File "/home/guest/.local/lib/python3.10/site-packages/torch/autograd/function.py", line 573, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
  File "/home/guest/.local/lib/python3.10/site-packages/torch/utils/checkpoint.py", line 254, in forward
    outputs = run_function(*args)
  File "/home/guest/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1552, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/guest/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1561, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/guest/.local/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 713, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
  File "/home/guest/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1552, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/guest/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1561, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/guest/.local/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 616, in forward
    key_states = self.k_proj(hidden_states)
  File "/home/guest/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1552, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/guest/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1561, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/guest/.local/lib/python3.10/site-packages/peft/tuners/glora/layer.py", line 148, in forward
    result = F.linear(x, self.weight + self.weight*A + B, bias=E+torch.matmul(self.weight, C).squeeze())
RuntimeError: The expanded size of the tensor (16000) must match the existing size (64) at non-singleton dimension 0. Target sizes: [16000, 64]. Tensor sizes: [64, 64]
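
As a side note on the bias term in that failing call: F.linear expects a 1-D bias of length out_features, so E + weight @ C only comes out right when C spans the weight's in_features. A minimal sketch with assumed shapes (not the real model's):

```python
import torch
import torch.nn.functional as F

out_features, in_features = 2048, 2048
weight = torch.randn(out_features, in_features)   # frozen base weight
C = torch.randn(in_features, 1)                   # must match in_features, not the adapter rank
E = torch.randn(1)                                # scalar support, broadcasts over all rows

bias = E + torch.matmul(weight, C).squeeze()      # -> shape [out_features]
x = torch.randn(4, in_features)
y = F.linear(x, weight, bias=bias)
print(bias.shape, y.shape)                        # torch.Size([2048]) torch.Size([4, 2048])
```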

@BenjaminBossan (Member)

Could you please tell me exactly what you did to get this error? Is it from a unit test? Also, in the future, please try to format the error messages correctly (not all on one line), or else they are very hard to read.

@viliamvolosv (Author) commented Jul 4, 2024

OK @BenjaminBossan, I'll try my best:
1 - I created peft/examples/glora_finetuning/glora_finetuning.py and run GLoRA from this script.
2 - The error is in the most important place, glora/layer.py (full stack trace in the attached file).
I wrote to you about this part earlier: #1880 (comment)
glora_error.txt

File "/home/guest/.local/lib/python3.10/site-packages/peft/tuners/glora/layer.py", line 148, in forward
result = F.linear(x, self.weight + self.weight*A + B, bias=E+torch.matmul(self.weight, C).squeeze())
RuntimeError: The expanded size of the tensor (16000) must match the existing size (64) at non-singleton dimension 0.  Target sizes: [16000, 64].  Tensor sizes: [64, 64]
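
For what it's worth, here is one way the support tensors could be materialized to the weight's shape before they are combined in forward. This is only a sketch based on the prepare_path configs visible in the log (vector, LoRA_8, none) and on the paper's formulation; materialize_support and its expansion rules are placeholder names and assumptions, not the code in this PR:

```python
import torch
import torch.nn.functional as F

def materialize_support(spec: str, target_shape, rank: int = 8) -> torch.Tensor:
    """Expand a GLoRA-style support spec to the full weight shape (sketch only)."""
    out_dim, in_dim = target_shape
    if spec == "none":
        return torch.zeros(out_dim, in_dim)
    if spec == "constant":
        return torch.zeros(1).expand(out_dim, in_dim)          # one scalar for the whole weight
    if spec == "vector":
        return torch.zeros(out_dim, 1).expand(out_dim, in_dim)  # one value per output row
    if spec.startswith("LoRA_"):
        Ad = torch.zeros(out_dim, rank)                          # low-rank factors
        Au = torch.zeros(rank, in_dim)
        return Ad @ Au                                           # already [out_dim, in_dim]
    raise ValueError(f"unknown support spec: {spec}")

weight = torch.randn(2048, 2048)                  # frozen base weight (example size)
A = materialize_support("vector", weight.shape)
B = materialize_support("LoRA_8", weight.shape)
x = torch.randn(4, 2048)
y = F.linear(x, weight + weight * A + B)          # broadcasts cleanly now
print(y.shape)                                    # torch.Size([4, 2048])
```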
