add setter for EinsumDense.kernel #19469

Closed
wants to merge 3 commits

Conversation

haifeng-jin
Contributor

No description provided.

codecov-commenter commented Apr 10, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 76.14%. Comparing base (8961e3f) to head (879e132).
Report is 5 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #19469      +/-   ##
==========================================
+ Coverage   76.08%   76.14%   +0.06%     
==========================================
  Files         367      367              
  Lines       41055    41051       -4     
  Branches     8014     8010       -4     
==========================================
+ Hits        31235    31259      +24     
+ Misses       8103     8082      -21     
+ Partials     1717     1710       -7     
Flag Coverage Δ
keras 76.00% <100.00%> (+0.06%) ⬆️
keras-jax 60.27% <100.00%> (+0.07%) ⬆️
keras-numpy 54.23% <100.00%> (+0.11%) ⬆️
keras-tensorflow 61.47% <100.00%> (+0.06%) ⬆️
keras-torch 60.36% <100.00%> (+0.06%) ⬆️

Flags with carried forward coverage won't be shown.

☔ View full report in Codecov by Sentry.

Member

@fchollet left a comment

Thanks for the PR!

@@ -197,6 +197,10 @@ def kernel(self):
            )
        return self._kernel

    @kernel.setter
    def kernel(self, value):
        self._kernel.assign(value)
Member

Is this the intended behavior? This is a value assignment, but would users expect to set the actual variable instead?

Also, should we disable this when LoRA is enabled?

Contributor

I think this should be compatible with LoRA:

    @kernel.setter
    def kernel(self, value):
        self._kernel.assign(value)
        if self.lora_enabled:
            self.lora_kernel_a.assign(ops.zeros(self.lora_kernel_a.shape))
            self.lora_kernel_b.assign(ops.zeros(self.lora_kernel_b.shape))

just like load_own_variables does.

Contributor Author

> This is a value assignment, but would users expect to set the actual variable instead?

Yes, you are right. I changed it to set the actual variable.
The use case I saw was that users try to normalize the kernel during training every time the layer is called.
I think this use case is OK, but it would be better to set the variable.

For LoRA, it would error out even before reaching this setter function. See the new test I added for testing the error.
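For context, a minimal sketch of that kind of use case (hypothetical subclass, not part of this PR), assuming the assign-style setter from this revision and eager execution:

    from keras import layers, ops


    class NormalizedEinsumDense(layers.EinsumDense):
        """Hypothetical example: rescale the kernel to unit norm on every call."""

        def call(self, inputs):
            # With an assign-style setter, this writes the rescaled values
            # back into the existing kernel variable.
            norm = ops.sqrt(ops.sum(ops.square(self.kernel))) + 1e-7
            self.kernel = ops.divide(self.kernel, norm)
            return super().call(inputs)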

Member

@fchollet left a comment

It isn't clear to me what the use case is in the first place. Based on the use case, we can determine whether direct assignment or assign() is the right behavior.

  1. What do you want to assign? A tensor? An existing Variable?
  2. What tracking behavior do you want? Should layer.kernel still be a trainable variable?
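For reference, a small sketch of the difference those two questions point at (equation and shapes are arbitrary):

    from keras import layers, ops

    layer = layers.EinsumDense("ab,bc->ac", output_shape=4)
    layer.build((2, 3))

    # Option 1: value assignment. The existing variable keeps its identity,
    # stays tracked, and stays trainable; only its contents change.
    layer.kernel.assign(ops.ones(layer.kernel.shape))

    # Option 2: replacing the attribute. `layer.kernel + 1.0` is a plain
    # tensor, not a Variable, so a direct `layer.kernel = ...` would leave
    # the layer without a trainable kernel unless the setter wraps or
    # rejects such values.
    new_value = layer.kernel + 1.0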

@@ -197,6 +197,10 @@ def kernel(self):
            )
        return self._kernel

    @kernel.setter
    def kernel(self, value):
        self._kernel = value
Member

If we're going to do this, should we typecheck kernel to make sure it's a Variable? Also, we should definitely untrack the previously tracked _kernel variable; otherwise it will still be listed in weights.

        )
        layer.build(input_shape)
        kernel = layer.kernel
        layer.kernel = kernel + 1.0
Member

This will set the kernel to a tensor (it won't be trainable anymore). Is that intended?

        bias_axes = "de"
        input_shape = (2, 1, 2)
        output_shape = (3, 4)
        layer = layers.EinsumDense(
Member

Please add a check on layer.trainable_variables
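For example, a check along these lines (the count assumes the kernel plus the bias built in this test):

    # Hypothetical assertion: after using the setter, the kernel (and bias)
    # should still be tracked as trainable variables.
    self.assertEqual(len(layer.trainable_variables), 2)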

@fchollet
Member

Thinking more about this -- I think we should do it like:

  • Check that it's a KerasVariable
  • Untrack the old value
  • Set the new value directly

We should also extend the setter to Dense and Embedding.
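A rough sketch of that proposal on EinsumDense (and similarly on Dense and Embedding); keras.Variable and the tracker call are assumptions about Keras internals, not confirmed API:

    @kernel.setter
    def kernel(self, value):
        from keras import Variable  # assumed alias for KerasVariable

        if not isinstance(value, Variable):
            raise TypeError(
                "`kernel` must be a `keras.Variable`. "
                f"Received: {value} (of type {type(value)})"
            )
        # Untrack the old variable so it no longer shows up in `weights`;
        # the exact untracking mechanism is an assumption.
        self._tracker.untrack(self._kernel)
        self._kernel = value

As noted further down in the thread, the direct assignment at the end runs into the layer's tracker lock, which is part of why this approach was eventually dropped.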

@haifeng-jin
Contributor Author

Yes, that makes sense.
Basically, we just swap out the old kernel.
For my specific use case, either way works, but to be more general, we should do as you said.

@fchollet
Member

Cool, let's do that!

@haifeng-jin
Contributor Author

I found this affects a lot of things.
It changes the order of the variables, which in turn changes get_weights(), the saving and loading of weights, and the naming of the weights as well.
It is very prone to bugs.

We may just let users extend the EinsumDense class to do that instead of going through the setter.
WDYT? @fchollet
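For example, a user-side subclass along those lines might look like this (hypothetical, reusing the existing kernel property):

    from keras import layers


    class SwappableEinsumDense(layers.EinsumDense):
        """Hypothetical user-side extension instead of a built-in setter."""

        # Reuse the parent getter and add a value-assigning setter
        # (same LoRA caveat as discussed above).
        @layers.EinsumDense.kernel.setter
        def kernel(self, value):
            self._kernel.assign(value)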

@fchollet
Member

The naming/ordering issue is fixable. Another issue is that the layer's tracker would have to be unlocked, and we can't do that in the setter (because setattr gets called first). Let's not do that then!

If assign works for your use case, you can just do layer.kernel.assign().
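For example (layer already built; new_kernel is a hypothetical array matching the kernel shape):

    layer.kernel.assign(new_kernel)  # in-place value update; the variable stays trainable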
