add setter for EinsumDense.kernel #19469
Conversation
Codecov Report

All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

```
@@            Coverage Diff             @@
##           master   #19469      +/-   ##
==========================================
+ Coverage   76.08%   76.14%   +0.06%
==========================================
  Files         367      367
  Lines       41055    41051       -4
  Branches     8014     8010       -4
==========================================
+ Hits        31235    31259      +24
+ Misses       8103     8082      -21
+ Partials     1717     1710       -7
```

Flags with carried forward coverage won't be shown. View full report in Codecov by Sentry.
Thanks for the PR!
keras/layers/core/einsum_dense.py (Outdated)

```diff
@@ -197,6 +197,10 @@ def kernel(self):
             )
         return self._kernel
 
+    @kernel.setter
+    def kernel(self, value):
+        self._kernel.assign(value)
```
Is this the intended behavior? This is a value assignment, but would users expect to set the actual variable instead?
Also, should we disable this when LoRA is enabled?
I think this should be compatible with LoRA:

```python
@kernel.setter
def kernel(self, value):
    self._kernel.assign(value)
    if self.lora_enabled:
        self.lora_kernel_a.assign(ops.zeros(self.lora_kernel_a.shape))
        self.lora_kernel_b.assign(ops.zeros(self.lora_kernel_b.shape))
```

just like how `load_own_variables` does.
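For context, a small numpy sketch of why zeroing the deltas restores the expected value (assuming the usual LoRA decomposition, where the observable kernel is `_kernel + lora_kernel_a @ lora_kernel_b`; the shapes here are made up):

```python
import numpy as np

base = np.ones((4, 3), dtype="float32")                   # stands in for _kernel
lora_a = np.random.normal(size=(4, 2)).astype("float32")  # lora_kernel_a
lora_b = np.random.normal(size=(2, 3)).astype("float32")  # lora_kernel_b

value = np.full((4, 3), 0.5, dtype="float32")  # what the user assigns
base = value                                   # _kernel.assign(value)
lora_a = np.zeros_like(lora_a)                 # reset the LoRA deltas...
lora_b = np.zeros_like(lora_b)

# ...so the observable kernel equals the assigned value again.
assert np.allclose(base + lora_a @ lora_b, value)
```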
> This is a value assignment, but would users expect to set the actual variable instead?

Yes, you are right. I changed it to set the actual variable.

The use case I saw was users trying to normalize the kernel during training every time the layer is called. I think this use case is OK, but it is better to set the variable.

For LoRA, it would error out even before reaching this setter function. See the new test I added for the error.
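For reference, a hedged sketch of that use case (the subclass and the normalization scheme are illustrative, not taken from the PR): a layer that re-normalizes its kernel on every forward pass, which is only possible once `layer.kernel` has a setter.

```python
import keras
from keras import ops


class NormalizedEinsumDense(keras.layers.EinsumDense):
    """Illustrative only: re-normalize the kernel on each call."""

    def call(self, inputs):
        k = self.kernel
        # With the value-assignment setter this writes into the existing
        # Variable; with the rebinding setter the right-hand side would
        # need to be wrapped in a new `keras.Variable` to stay trainable.
        self.kernel = k / (ops.sqrt(ops.sum(ops.square(k))) + 1e-7)
        return super().call(inputs)
```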
It isn't clear to me what the use case is in the first place. Based on the use case, we can determine whether direct assignment or `assign()` is the right behavior (see the sketch after this list):

- What do you want to assign? A tensor? An existing Variable?
- What tracking behavior do you want? Should `layer.kernel` still be a trainable variable?
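To make the two options concrete, a minimal sketch (`_kernel` is from the diff above; everything else is illustrative):

```python
import numpy as np
import keras

layer = keras.layers.EinsumDense("ab,bc->ac", output_shape=3)
layer.build((None, 4))
new_value = np.zeros((4, 3), dtype="float32")

# Option 1: value assignment. The Variable object is unchanged; only its
# contents are overwritten. Tracking and trainability are unaffected.
layer._kernel.assign(new_value)

# Option 2: rebinding. `layer.kernel` would return a different object, so
# the old Variable must be untracked (or it still shows up in
# `layer.weights`), and the tracker may need to be unlocked post-build:
#     layer._kernel = keras.Variable(new_value)
```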
```diff
@@ -197,6 +197,10 @@ def kernel(self):
             )
         return self._kernel
 
+    @kernel.setter
+    def kernel(self, value):
+        self._kernel = value
```
If we're going to do this, should we typecheck `kernel` to make sure it's a Variable? Also, we should definitely untrack the previously tracked `_kernel` variable, otherwise it's still listed in `weights`.
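A hedged sketch of that shape of setter, using a toy stand-in class so it runs standalone (the untrack/retrack step is left as a comment because the exact internal tracking API is an assumption):

```python
import numpy as np
import keras


class KernelOwner:
    """Toy stand-in for the layer; illustrative, not Keras code."""

    def __init__(self):
        self._kernel = keras.Variable(np.zeros((4, 3), dtype="float32"))

    @property
    def kernel(self):
        return self._kernel

    @kernel.setter
    def kernel(self, value):
        if not isinstance(value, keras.Variable):
            raise TypeError(
                "Expected a `keras.Variable`. "
                f"Received: value={value} (of type {type(value)})"
            )
        # In the real layer, the previously tracked `_kernel` would also
        # be untracked here (so it disappears from `weights`) and the new
        # Variable tracked in its place.
        self._kernel = value


owner = KernelOwner()
owner.kernel = keras.Variable(np.ones((4, 3), dtype="float32"))  # ok
# owner.kernel = np.ones((4, 3))  # raises TypeError
```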
```python
        )
        layer.build(input_shape)
        kernel = layer.kernel
        layer.kernel = kernel + 1.0
```
This will set the kernel to a tensor (it won't be trainable anymore). Is that intended?
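Spelled out (a sketch of the concern, assuming the rebinding setter from the diff above and no typecheck):

```python
import keras

layer = keras.layers.EinsumDense("ab,bc->ac", output_shape=3)
layer.build((None, 4))

kernel = layer.kernel        # a keras.Variable, tracked and trainable
layer.kernel = kernel + 1.0  # right-hand side is a plain tensor

# `layer.kernel` is now a tensor the optimizer will never update, while
# the old Variable may still linger in `layer.weights`.
print(isinstance(layer.kernel, keras.Variable))  # False
```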
bias_axes = "de" | ||
input_shape = (2, 1, 2) | ||
output_shape = (3, 4) | ||
layer = layers.EinsumDense( |
Please add a check on `layer.trainable_variables`.
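A sketch of such a check (shapes and the standalone style are illustrative; in the PR it would live inside the existing test):

```python
import numpy as np
import keras
from keras import layers

layer = layers.EinsumDense("ab,bc->ac", output_shape=4)
layer.build((2, 3))

new_kernel = keras.Variable(np.ones((3, 4), dtype="float32"))
layer.kernel = new_kernel

# The replacement variable -- and only it -- should remain trainable.
assert len(layer.trainable_variables) == 1
assert layer.trainable_variables[0] is new_kernel
```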
Thinking more about this -- I think we should do it like: …

We should also extend the setter to …

Yes, that makes sense.

Cool, let's do that!

I found it is affecting a lot of things. We may just let the user extend the …

The naming/ordering issue is fixable. Another issue is that the layer's tracker would have to be unlocked, and we can't do that in the setter (because `__setattr__` gets called first). Let's not do that then! If …