Synchronize lora's merge, unmerge, etc. modifications to lora's tp_layer. #1919

Open
wants to merge 3 commits into base: main

Conversation

zhangsheng377
Contributor

A previous commit moved the merge and other such functions from LoraLayer into Linear and the other layer subclasses, but LoraParallelLinear in tp_layer was missed.
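
For readers unfamiliar with these methods, here is a conceptual sketch of what merge/unmerge do (illustrative only, not PEFT's actual implementation; the tensor names, shapes, and signatures are assumptions):

```python
# Conceptual sketch, not PEFT code: merge() folds the low-rank update
# delta_W = scaling * (B @ A) into the frozen base weight so inference needs
# no extra matmuls; unmerge() subtracts it again to restore the base weight.
import torch


def merge(base_weight: torch.Tensor, lora_A: torch.Tensor, lora_B: torch.Tensor, scaling: float) -> torch.Tensor:
    # base_weight: (out, in), lora_A: (r, in), lora_B: (out, r)
    return base_weight + scaling * (lora_B @ lora_A)


def unmerge(merged_weight: torch.Tensor, lora_A: torch.Tensor, lora_B: torch.Tensor, scaling: float) -> torch.Tensor:
    return merged_weight - scaling * (lora_B @ lora_A)
```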



Synchronize lora's merge, unmerge, etc. modifications to lora's tp_layer.
@zhangsheng377
Contributor Author

@BenjaminBossan

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@BenjaminBossan
Member

Thanks for this update. Could you please run make style?

@zhangsheng377
Contributor Author

> Thanks for this update. Could you please run make style?

Ok. Sorry, I forgot it.

@BenjaminBossan
Member


The PR is failing on Python 3.8 because of the type annotation syntax. Adding `from __future__ import annotations` should fix that. I assume you checked that the changes you made to the megatron layer work?
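
A minimal illustration of why the future import is needed on Python 3.8 (the signature below is hypothetical, not the PR's actual code):

```python
# Hypothetical example, not the PR's code: PEP 604 unions (`X | None`) and
# builtin generics (`list[str]`) in annotations are evaluated eagerly on
# Python 3.8 and raise TypeError at import time. With the future import,
# annotations are stored as strings (PEP 563), so 3.8 accepts them.
from __future__ import annotations


def merge(safe_merge: bool = False, adapter_names: list[str] | None = None) -> None:
    ...
```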

```diff
@@ -108,7 +116,7 @@ def update_layer(
         else:
             lora_dropout_layer = nn.Identity()
 
-        self.lora_dropout[adapter_name] = lora_dropout_layer
+        self.lora_dropout.update(nn.ModuleDict({adapter_name: lora_dropout_layer}))
```
@BenjaminBossan
Member


This should not be necessary, right?
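
As a standalone check (illustrative, not PEFT code), plain item assignment on an nn.ModuleDict registers the submodule just like update(), so the two lines in the diff above should behave identically:

```python
# Standalone check, not PEFT code: both ways of adding an entry to an
# nn.ModuleDict register the module as a child of the parent.
import torch.nn as nn

dropouts = nn.ModuleDict()
dropouts["adapter_a"] = nn.Dropout(p=0.1)                         # item assignment
dropouts.update(nn.ModuleDict({"adapter_b": nn.Dropout(p=0.1)}))  # update()

# Both entries appear as registered children.
print([name for name, _ in dropouts.named_children()])  # ['adapter_a', 'adapter_b']
```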

@zhangsheng377
Contributor Author

> The PR is failing on Python 3.8 because of the type annotation syntax. Adding `from __future__ import annotations` should fix that. I assume you checked that the changes you made to the megatron layer work?

The merge function is copied from Linear.merge.

But I forgot to add `from __future__ import annotations`.

@BenjaminBossan
Member

I don't really have experience with megatron, so I'm not sure if these methods would just work when copied 1:1. Just to be sure, did you test with your setup that nothing breaks with these changes? If yes, I think we can merge and in the future try to be more careful to keep tp_layer.py in sync.
