Add docs note about saving/loading models with anonymous functions #2263
Comments
Given that activation functions have always been handled weirdly, maybe it would not be a bad idea, even though it goes a little against the Flux style, to have an
This kind of solution is both cleaner and correct: just name the function (e.g.
I wonder if we could create a helper function which searches the model for these closures and warns the user if it finds them?
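As a rough illustration of that idea, here is a minimal sketch of such a helper. Nothing in it is part of Flux: `is_anonymous` and `find_anonymous` are hypothetical names, and the check relies on the assumption that Julia gives anonymous functions generated names starting with `#`, which is a heuristic rather than a documented guarantee. It walks struct fields and tuples by plain recursion and does not guard against cyclic structures.

```julia
using Flux

# Heuristic (assumption): anonymous functions get generated names such as `#3`.
is_anonymous(f) = f isa Function && startswith(string(nameof(f)), "#")

# Hypothetical helper: collect the paths of anonymous functions stored in a model.
function find_anonymous(x, path = "model", found = String[])
    if is_anonymous(x)
        push!(found, path)
    elseif x isa Union{Tuple, NamedTuple}
        for (k, v) in pairs(x)
            find_anonymous(v, "$path[$(repr(k))]", found)
        end
    elseif !(x isa Union{AbstractArray, Number, Function, Type, Module, Symbol, AbstractString})
        # Any other object is treated as a struct and searched field by field.
        for name in fieldnames(typeof(x))
            find_anonymous(getfield(x, name), "$path.$name", found)
        end
    end
    return found
end

# Emit one warning per anonymous function found, then return the paths.
function warn_anonymous(model)
    found = find_anonymous(model)
    for p in found
        @warn "Anonymous function at $p may not be saved/loaded reliably"
    end
    return found
end

m = Chain(Dense(2 => 3, x -> x^2), Dense(3 => 1, tanh))
found = warn_anonymous(m)
```

Named closures defined inside other functions would not be caught by this name-based check, so it should be read as a starting point only.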
Might my issue at #2339 be related to this? It contains anonymous functions that slice the input arrays, like
Just as mentioned up top: extract the parameters with
After checking the docs, this implies that the model definition must be available in the session, right? Is it necessary to create a custom struct and apply the
Thank you, I managed to make it work with the
The reason we don't mention that in the docs is the same reason PyTorch doesn't mention that you need to define all the layer types for a model before calling
The new save/load docs promote JLD2.jl, which does not support saving/loading anonymous functions reliably. This most commonly occurs for activation functions. The solution is to use `Flux.state` + `Flux.loadmodel!` and set the desired anonymous function in the destination of `loadmodel!`. This avoids needing the serialization library to correctly handle the anonymous function.

This will be problematic when the anonymous function contains data (state) that actually should be restored. A possible solution here is to make the closure an explicit struct. Maybe there are better solutions.
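A minimal sketch of the `Flux.state` + `Flux.loadmodel!` approach might look like the following. It assumes a Flux version that provides `Flux.state`; `build_model`, the file name, and the particular anonymous activation are made up for illustration. The point is that only arrays go into the JLD2 file, while the anonymous function is recreated by the code that builds the destination model.

```julia
using Flux, JLD2

# Hypothetical constructor: the anonymous activation lives in code, not in the file.
build_model() = Chain(Dense(2 => 3, x -> max(x, zero(x))), Dense(3 => 1))

model = build_model()

# Save only the state (nested named tuples of arrays), not the model object.
path = joinpath(mktempdir(), "model_state.jld2")
jldsave(path; model_state = Flux.state(model))

# Later, possibly in a new session: rebuild the structure, then copy the saved state in.
restored = build_model()
Flux.loadmodel!(restored, JLD2.load(path, "model_state"))

x = rand(Float32, 2, 4)
same_output = restored(x) ≈ model(x)
```

This does require the model-building code to be available when loading, since the file no longer describes the layers or their activations.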
Regardless, new users are unlikely to realize these edge cases. We should expand the saving/loading documentation to explain how to handle these cases with code examples.
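For the stateful case, one possible sketch is to replace a closure such as `x -> x .* scale` with a small callable struct, so that the captured data becomes an ordinary field that `Flux.state` can see. `ScaleBy` is a hypothetical name, not a Flux layer, and `Flux.@layer` is assumed to be available (older Flux versions would use `Flux.@functor` instead).

```julia
using Flux, JLD2

# Hypothetical replacement for the closure `x -> x .* scale`.
struct ScaleBy{T}
    scale::T
end
Flux.@layer ScaleBy              # assumption: recent Flux; older versions use Flux.@functor
(s::ScaleBy)(x) = x .* s.scale   # same behaviour as the closure it replaces

model = Chain(ScaleBy(Float32[2, 3]), Dense(2 => 1))

# The `scale` array is now part of the saved state.
path = joinpath(mktempdir(), "model_state.jld2")
jldsave(path; model_state = Flux.state(model))

# The destination starts with placeholder values that loadmodel! overwrites.
fresh = Chain(ScaleBy(zeros(Float32, 2)), Dense(2 => 1))
Flux.loadmodel!(fresh, JLD2.load(path, "model_state"))
```

After loading, `fresh[1].scale` holds the saved values, so the data that the closure used to hide is restored along with the other parameters.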