Parameter Sharing breaks destructure #1767
The key line is https://github.com/FluxML/Flux.jl/blob/master/src/utils.jl#L649. Because Zygote is blissfully unaware of the tying, it will return a separate gradient for each layer. However, since the gradients for the biases are … Fixing this would require a few things. First, passing some additional metadata (e.g. offsets of each param) to ….[^1]
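To make the offsets idea concrete, here is a minimal sketch; both helper names (`flatten_with_offsets`, `accumulate_grads!`) are hypothetical, not Flux's actual internals:

```julia
# Hypothetical sketch, not Flux code: flatten parameter arrays while
# recording each array's offset, keyed by identity (===) so that tied
# arrays occupy a single slot range in the flat vector.
function flatten_with_offsets(arrays)
    flat = Float64[]
    offsets = IdDict{Any,Int}()
    for a in arrays
        get!(offsets, a) do      # a re-encountered (===) array reuses its range
            n = length(flat)
            append!(flat, vec(a))
            n
        end
    end
    return flat, offsets
end

# Sum the per-occurrence gradients into the flat gradient, so tied
# parameters accumulate instead of overwriting one another.
function accumulate_grads!(gflat, offsets, arrays, grads)
    for (a, g) in zip(arrays, grads)
        o = offsets[a]
        gflat[o+1:o+length(a)] .+= vec(g)
    end
    return gflat
end

w, b = rand(3), rand(3)
flat, offs = flatten_with_offsets([w, b, w])  # `w` is tied: length(flat) == 6
g = accumulate_grads!(zero(flat), offs, [w, b, w], [ones(3), ones(3), ones(3)])
g[1:3] == [2.0, 2.0, 2.0]                     # both uses of `w` contributed
```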
[^1]: Note that the problem is worse than this. Even with dense arrays, it's easy for two parameters to get the same gradient, e.g. if they enter as …. Conversely, with shared parameters in the model, the present structure of …
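To see the footnote's first point concretely, a small sketch; the exact gradient objects Zygote returns can vary by version:

```julia
using Zygote

x, y = rand(3), rand(3)                        # two independent parameters
gx, gy = gradient((a, b) -> sum(a + b), x, y)

# The pullback of `+` hands the same cotangent to both inputs, so the
# gradients are equal, and often the very same object:
gx == gy    # true
gx === gy   # typically also true, which defeats identity-based bookkeeping
```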
Exactly. There's no getting around either closing over the original structure or creating a new auxiliary one for use in co-iterating over the gradients and determining what goes where.
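As a sketch of the "closing over the original structure" option: Functors' `fmap` can walk an extra tree in parallel, so one could co-iterate the model and a Zygote-style gradient. The helper name `grads_to_vec` is made up, and this assumes the gradient mirrors the model with no missing subtrees:

```julia
using Flux, Functors

# Hypothetical helper: walk the original model and its gradient together,
# appending each array gradient in parameter order.
function grads_to_vec(model, grad)
    out = Float32[]
    fmap(model, grad) do x, dx
        x isa AbstractArray && dx !== nothing && append!(out, vec(dx))
        x
    end
    return out
end

m = Chain(Dense(2 => 2), Dense(2 => 2))
g = gradient(m -> sum(m(ones(Float32, 2))), m)[1]
v = grads_to_vec(m, g)

# Caveat, and the crux of this issue: fmap's IdDict cache visits a shared
# layer only once, so with tied parameters the second occurrence's gradient
# would be silently skipped; a real fix has to accumulate it instead.
```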
Maybe also worth noting that the notion of sharing in Functors is based on object identity: `fmap` keeps an `IdDict` cache of nodes it has already walked, so two arrays count as shared only when they are `===`.
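A quick illustration of that identity-based notion of sharing (`Tied` is a made-up container):

```julia
using Functors

struct Tied; a; b; end
@functor Tied

w = rand(2, 2)
shared   = Tied(w, w)                  # fields are ===, so Functors sees one array
unshared = Tied(copy(w), copy(w))      # equal values, but not shared

hits = Ref(0)
fmap(x -> (hits[] += 1; x), shared);   hits[]  # 1: the tied array is visited once
hits[] = 0
fmap(x -> (hits[] += 1; x), unshared); hits[]  # 2: value equality does not count
```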
MWE:
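The exact snippet is elided above; a sketch of the simplest form of tying that exercises the problem (sizes are illustrative):

```julia
using Flux

d = Dense(2 => 2)
m = Chain(d, d)                # the same layer twice: its parameters are tied

ps, re = Flux.destructure(m)   # `ps` holds the shared parameters once
# Differentiating through the reconstruction is where things go wrong:
# Zygote produces one gradient per use of `d`, and on the Flux version
# this issue was filed against they were not both accumulated back into
# the single slot in `ps` (and/or the call threw, per the stacktrace below).
g = gradient(p -> sum(re(p)(ones(Float32, 2))), ps)[1]
```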
Stacktrace: …