-
Notifications
You must be signed in to change notification settings - Fork 301
Add query_proj
, value_proj
to target names for enable_lora
#2107
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
enable_lora
query_proj
, value_proj
to target names for enable_lora
keras_hub/src/models/backbone.py
Outdated
"value", | ||
"query_proj", | ||
"value_proj", | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This approach works too but I was thinking of having a get_lora_target_names
which returns ["query_dense", "value_dense", "query", "value"]
in this base class and if any other model like PaliGemma
has lorafiable layers other than this, can override get_lora_target_names
and return their lorafiable layers, e.g. "query_proj", "value_proj",
in the case of PaliGemma
. This would be more scalable and flexible and prevent this list from getting longer. wdyt?
PS: The reason that I said we don't have to have it as an argument is that it's usually a constant list for a model and it's unlikely that users need to change it so hard-coding it should be fine and leaves us with a simpler API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I guess I misconstrued what you were saying during the call. Pushing the change. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small comments!
""" | ||
return ["query_dense", "value_dense", "query", "value"] | ||
|
||
def enable_lora(self, rank): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could consider exposing this. E.g. target_names=None
, if None
uses backbone.get_default_lora_targets()
or something like that. Would allow some customization.
target_names = super().get_lora_target_names() | ||
|
||
# Add these for `PaliGemmaVITAttention`. | ||
target_names += ["query_proj", "value_proj"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we just align these names without breaking checkpoint compat? Fine to keep the overridable method, but seems like it would just save us headache (and improve overall ux) if we just kept the same naming convention here. There's no reason for the divergence right?
Resolves #2104