
Add GloRa Implementation #1880

Closed

Conversation

@viliamvolosv commented Jun 21, 2024

@viliamvolosv (Author)

I stopped at the original error:
File "/home/guest/.local/lib/python3.10/site-packages/peft/tuners/glora/layer.py", line 148, in forward result = F.linear(x, self.weight + self.weight*A + B, bias=E+torch.matmul(self.weight, C).squeeze()) RuntimeError: The size of tensor a (2048) must match the size of tensor b (64) at non-singleton dimension 0

This is the log from the run:
forward A.shape: torch.Size([64, 64]), B.shape: torch.Size([64, 64]), C.shape: torch.Size([64, 1]), D.shape: torch.Size([1]), E.shape: torch.Size([1])

and we need to decide how to deal with this, because as far as I understand this is normal behaviour:
(image attached in the original comment)

https://arxiv.org/pdf/2306.07967
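
To make the shape clash concrete, here is a tiny standalone repro. The weight size is illustrative, not the real layer's, and the expansion at the end is just one possible reading of the paper, not the code in this PR:

```python
import torch

# Illustrative sizes only: a frozen weight whose out_features (2048) differ from
# the size the GLoRA support tensor was built with (64).
weight = torch.randn(2048, 64)   # [out_features, in_features]
A = torch.randn(64, 64)          # support tensor left at its logged size

try:
    _ = weight + weight * A      # same element-wise ops as in glora/layer.py forward
except RuntimeError as e:
    # The size of tensor a (2048) must match the size of tensor b (64) at non-singleton dimension 0
    print(e)

# The ops broadcast once the support is materialized to the weight's shape,
# e.g. a per-output-row "vector" support repeated across in_features:
A_vec = torch.randn(2048, 1).expand_as(weight)
_ = weight + weight * A_vec      # no error
```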

@BenjaminBossan, this is your time to help.
Use glora_finetuning.py to start GLoRA.

@viliamvolosv (Author)

Hey @BenjaminBossan, can you look at the issue in this PR, or maybe someone else from PEFT can? I'm stuck.
prepare_path config: vector, X.shape: torch.Size([64, 1]), Xd.shape: torch.Size([64, 8]), Xu.shape: torch.Size([8, 64])
prepare_path config: vector, X.shape: torch.Size([64, 1]), Xd.shape: torch.Size([64, 8]), Xu.shape: torch.Size([8, 64])
prepare_path config: LoRA_8, X.shape: torch.Size([64, 1]), Xd.shape: torch.Size([64, 8]), Xu.shape: torch.Size([8, 1])
prepare_path config: none, X.shape: torch.Size([1]), Xd.shape: torch.Size([64, 1]), Xu.shape: None
prepare_path config: vector, X.shape: torch.Size([64, 1]), Xd.shape: torch.Size([64, 1]), Xu.shape: None
forward A.shape: torch.Size([64, 1]), B.shape: torch.Size([64, 1]), C.shape: torch.Size([64, 1]), D.shape: torch.Size([1]), E.shape: torch.Size([64, 1])

Traceback (most recent call last):
  File "/home/guest/peft/examples/glora_finetuning/glora_finetuning.py", line 117, in <module>
    training_function()
  File "/home/guest/peft/examples/glora_finetuning/glora_finetuning.py", line 105, in training_function
    trainer.train()
  File "/home/guest/.local/lib/python3.10/site-packages/trl/trainer/sft_trainer.py", line 440, in train
    output = super().train(*args, **kwargs)
  File "/home/guest/.local/lib/python3.10/site-packages/transformers/trainer.py", line 1885, in train
    return inner_training_loop(
  File "/home/guest/.local/lib/python3.10/site-packages/transformers/trainer.py", line 2216, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
  File "/home/guest/.local/lib/python3.10/site-packages/transformers/trainer.py", line 3238, in training_step
    loss = self.compute_loss(model, inputs)
  File "/home/guest/.local/lib/python3.10/site-packages/transformers/trainer.py", line 3264, in compute_loss
    outputs = model(**inputs)
  File "/home/guest/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1552, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/guest/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1561, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/guest/.local/lib/python3.10/site-packages/peft/peft_model.py", line 1505, in forward
    return self.base_model(
  File "/home/guest/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1552, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/guest/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1561, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/guest/.local/lib/python3.10/site-packages/peft/tuners/tuners_utils.py", line 180, in forward
    return self.model.forward(*args, **kwargs)
  File "/home/guest/.local/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 1164, in forward
    outputs = self.model(
  File "/home/guest/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1552, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/guest/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1561, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/guest/.local/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 957, in forward
    layer_outputs = self._gradient_checkpointing_func(
  File "/home/guest/.local/lib/python3.10/site-packages/torch/_compile.py", line 30, in inner
    return disable_fn(*args, **kwargs)
  File "/home/guest/.local/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 599, in _fn
    return fn(*args, **kwargs)
  File "/home/guest/.local/lib/python3.10/site-packages/torch/utils/checkpoint.py", line 480, in checkpoint
    return CheckpointFunction.apply(function, preserve, *args)
  File "/home/guest/.local/lib/python3.10/site-packages/torch/autograd/function.py", line 573, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
  File "/home/guest/.local/lib/python3.10/site-packages/torch/utils/checkpoint.py", line 254, in forward
    outputs = run_function(*args)
  File "/home/guest/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1552, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/guest/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1561, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/guest/.local/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 713, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
  File "/home/guest/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1552, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/guest/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1561, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/guest/.local/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 616, in forward
    key_states = self.k_proj(hidden_states)
  File "/home/guest/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1552, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/guest/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1561, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/guest/.local/lib/python3.10/site-packages/peft/tuners/glora/layer.py", line 148, in forward
    result = F.linear(x, self.weight + self.weight*A + B, bias=E+torch.matmul(self.weight, C).squeeze())
RuntimeError: The expanded size of the tensor (16000) must match the existing size (64) at non-singleton dimension 0. Target sizes: [16000, 64]. Tensor sizes: [64, 64]
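
As a side note on the bias term in that failing call: F.linear expects a 1-D bias of length out_features, so E + weight @ C only comes out right when C spans the weight's in_features. A minimal sketch with assumed shapes (not the real model's):

```python
import torch
import torch.nn.functional as F

out_features, in_features = 2048, 2048
weight = torch.randn(out_features, in_features)   # frozen base weight
C = torch.randn(in_features, 1)                   # must match in_features, not the adapter rank
E = torch.randn(1)                                # scalar support, broadcasts over all rows

bias = E + torch.matmul(weight, C).squeeze()      # -> shape [out_features]
x = torch.randn(4, in_features)
y = F.linear(x, weight, bias=bias)
print(bias.shape, y.shape)                        # torch.Size([2048]) torch.Size([4, 2048])
```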

@BenjaminBossan (Member)

Could you please tell me exactly what you did to get this error? Is it from a unit test? Also, in the future, please try to format the error messages correctly (not all on one line), or else they are very hard to read.

@viliamvolosv (Author) commented Jul 4, 2024

OK @BenjaminBossan, I'll try my best:
1 - I created peft/examples/glora_finetuning/glora_finetuning.py and run GLoRA from this script.
2 - The error is in the most important place, glora/layer.py (full stack trace in the attached file).
I wrote to you about this part earlier: #1880 (comment)
glora_error.txt

File "/home/guest/.local/lib/python3.10/site-packages/peft/tuners/glora/layer.py", line 148, in forward
result = F.linear(x, self.weight + self.weight*A + B, bias=E+torch.matmul(self.weight, C).squeeze())
RuntimeError: The expanded size of the tensor (16000) must match the existing size (64) at non-singleton dimension 0.  Target sizes: [16000, 64].  Tensor sizes: [64, 64]
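
For what it's worth, here is one way the support tensors could be materialized to the weight's shape before they are combined in forward. This is only a sketch based on the prepare_path configs visible in the log (vector, LoRA_8, none) and on the paper's formulation; materialize_support and its expansion rules are placeholder names and assumptions, not the code in this PR:

```python
import torch
import torch.nn.functional as F

def materialize_support(spec: str, target_shape, rank: int = 8) -> torch.Tensor:
    """Expand a GLoRA-style support spec to the full weight shape (sketch only)."""
    out_dim, in_dim = target_shape
    if spec == "none":
        return torch.zeros(out_dim, in_dim)
    if spec == "constant":
        return torch.zeros(1).expand(out_dim, in_dim)          # one scalar for the whole weight
    if spec == "vector":
        return torch.zeros(out_dim, 1).expand(out_dim, in_dim)  # one value per output row
    if spec.startswith("LoRA_"):
        Ad = torch.zeros(out_dim, rank)                          # low-rank factors
        Au = torch.zeros(rank, in_dim)
        return Ad @ Au                                           # already [out_dim, in_dim]
    raise ValueError(f"unknown support spec: {spec}")

weight = torch.randn(2048, 2048)                  # frozen base weight (example size)
A = materialize_support("vector", weight.shape)
B = materialize_support("LoRA_8", weight.shape)
x = torch.randn(4, 2048)
y = F.linear(x, weight + weight * A + B)          # broadcasts cleanly now
print(y.shape)                                    # torch.Size([4, 2048])
```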
