LoRA A and LoRA B dimensions mentioned in the paper are different from the implementation here. #983

Closed
1 of 4 tasks
s3pi opened this issue Sep 30, 2023 · 3 comments

Comments

@s3pi

s3pi commented Sep 30, 2023

System Info

PEFT library

Who can help?

@sayakpaul

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder
  • My own task or dataset (give details below)

Reproduction

(lora_A): ModuleDict(
  (default): Linear(in_features=20, out_features=8, bias=False)
)
(lora_B): ModuleDict(
  (default): Linear(in_features=8, out_features=2000, bias=False)
)

Expected behavior

The dimensions of the LoRA A and LoRA B layers should be the opposite of what is implemented here.

@s3pi
Author

s3pi commented Sep 30, 2023

Why does the paper say BA while in this implementation it is lora_A * lora_B?

@ChrisHayduk

ChrisHayduk commented Sep 30, 2023

The paper seems to be inconsistent about which matrix it names A and which it names B. Take the main image from the paper, for example:

[image: the main figure from the LoRA paper]

In the above image, A is a d x r matrix and B is an r x k matrix, where d is the input dimension, k is the output dimension, and r is the LoRA adapter rank. In this figure k = d. AB is therefore a d x k matrix, matching the dimensions of W (as desired). This is what PEFT has implemented.

However, later in the paper, the authors state that B is a d x r matrix and A is an r x k matrix, reversing their dimensionalities compared to the above image. This is why they use BA throughout the paper.
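For reference, the two orderings differ only by a transpose. The first line below is the paper's later definition, written for column vectors. The second line is the same update written for row vectors, which is one way to read the figure from input to output; that reading is an interpretation, not something the paper states explicitly.

$$
h = W_0 x + B A x, \qquad B \in \mathbb{R}^{d \times r},\; A \in \mathbb{R}^{r \times k}
$$

$$
h^\top = x^\top W_0^\top + x^\top A^\top B^\top, \qquad A^\top \in \mathbb{R}^{k \times r},\; B^\top \in \mathbb{R}^{r \times d}
$$

Since $(BA)^\top = A^\top B^\top$, applying A first and B second to a row vector gives the same result as multiplying a column vector by BA.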

Both formulations are equivalent; they just use different naming schemes.
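The equivalence can be checked numerically. The following is a minimal, dependency-free sketch, not PEFT's actual code: it assumes only that each layer behaves like `torch.nn.Linear`, which stores its weight with shape `(out_features, in_features)` and computes `x @ weight.T`. The helper names (`matmul`, `transpose`, `shape`) and the tiny sizes (3, 2, 4 standing in for 20, 8, 2000) are made up for illustration.

```python
def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


def transpose(M):
    """Swap rows and columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]


def shape(M):
    """Return (rows, columns) of a matrix stored as a list of rows."""
    return (len(M), len(M[0]))


# Small stand-ins for in_features=20, r=8, out_features=2000.
in_features, r, out_features = 3, 2, 4

# Weights laid out as (out_features, in_features), like nn.Linear.weight.
A = [[1, 2, 3],
     [4, 5, 6]]        # lora_A weight: (r, in_features)
B = [[1, 0],
     [0, 1],
     [2, 1],
     [1, 3]]           # lora_B weight: (out_features, r)

x = [[1, -1, 2]]       # one input row: (1, in_features)

# Layer by layer: lora_A first, then lora_B, each computing x @ weight.T.
stepwise = matmul(matmul(x, transpose(A)), transpose(B))

# Paper style: build delta_W = B A once, then apply it to the input.
delta_W = matmul(B, A)                    # (out_features, in_features)
merged = matmul(x, transpose(delta_W))

print(shape(A), shape(B), shape(delta_W))
print(stepwise == merged)
```

Under that storage assumption, the printed `Linear(in_features=20, out_features=8)` holds an 8 x 20 weight and `Linear(in_features=8, out_features=2000)` holds a 2000 x 8 weight, so the product of the second weight with the first has the 2000 x 20 shape of the base layer's weight.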

@github-actions

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
