Frequently it's not enough to have an in-memory representation of a model or the optimization algorithm state. We want to serialize the structure, parameters, and state of the models and sub-learners that we're working with.
I'd like to take this one step further and define a generic "spec" for these stateful items: something we can serialize/deserialize, but which could also be loaded into another "backend" easily. I'm using the same terminology as Plots on purpose... I think there are a lot of similarities in the problems we're looking to solve.
Some examples of backends:
Transformations, etc
Optim
TensorFlow
Theano
BayesNet
The idea is that, where there is overlapping functionality, there is the opportunity to generalize. Suppose we have a general concept: "I want a 3-layer neural net with these node counts, relu activations, and these initial weight values". Lots of software implements this. If we build this information into a generic spec (similar to the Plot object in Plots), then we only need to connect the spec to a backend's constructor, and we have the ability to convert and transfer models between backends.
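To make that concrete, here's a rough sketch of what such a spec might look like in Julia. Every name here (`LayerSpec`, `NetSpec`, `build`, the backend types) is hypothetical; nothing like this exists yet:

```julia
# Hypothetical backend-agnostic description of a feedforward net.
# These types are illustrative only, not an existing API.
struct LayerSpec
    nodes::Int
    activation::Symbol        # e.g. :relu
    weights::Matrix{Float64}  # initial weight values
end

struct NetSpec
    layers::Vector{LayerSpec}
end

# Backends are singleton types (the same trick Plots.jl uses), and
# each one implements a constructor from the generic spec via dispatch.
abstract type Backend end
struct TensorFlowBackend <: Backend end
struct TransformationsBackend <: Backend end

function build(::TensorFlowBackend, spec::NetSpec)
    # ... translate each LayerSpec into the corresponding TF ops
end
```

A user would write `build(TensorFlowBackend(), spec)` and get back a backend-native model; converting a model between backends is then just calling `build` with a different first argument.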
The same goes for optimization routines... there is often a one-to-one mapping between, for example, an Adam optimizer in TensorFlow and one in Theano. But each backend reimplements the concept in a different way, reinventing the wheel completely. The Plots model applies here as well: define a generic Adam-updater concept, then map from that to a backend's implementation.
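Sketching the optimizer side under the same (hypothetical) `build`/backend-type scheme:

```julia
# Hypothetical generic Adam spec: just the hyperparameters, with no
# backend-specific state. Field names follow the usual Adam notation.
struct AdamSpec
    lr::Float64   # step size
    β1::Float64   # first-moment decay
    β2::Float64   # second-moment decay
    ϵ::Float64    # numerical-stability constant
end
AdamSpec(; lr=0.001, β1=0.9, β2=0.999, ϵ=1e-8) = AdamSpec(lr, β1, β2, ϵ)

# One small mapping per backend returns the backend-native optimizer
# (e.g. TensorFlow's Adam constructed from these same fields).
function build(::TensorFlowBackend, spec::AdamSpec)
    # ... construct the backend's Adam implementation from spec
end
```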
The end result is that I'd like models and learning algorithms to be built from specs, which then get mapped to sub-learners specific to a particular backend. This would allow us to serialize/deserialize a backend-agnostic spec rather than actual Julia objects.
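And since a spec like the ones sketched above is plain data (numbers, symbols, arrays), serialization is nearly free. For example, assuming the hypothetical `NetSpec`/`LayerSpec` types from before:

```julia
using JSON  # any plain-data format (JSON, BSON, JLD, ...) would do

# A spec round-trips cleanly because it holds no live backend state.
spec = NetSpec([LayerSpec(128, :relu, randn(128, 784)),
                LayerSpec(10, :identity, randn(10, 128))])

open("model_spec.json", "w") do io
    JSON.print(io, spec)  # JSON.jl serializes struct fields by default
end
```

Reading it back into a `NetSpec` would need a small reader, but crucially the file itself never references a backend.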
For example, it would allow us to experiment with structures/models/algos in something more flexible (JuliaML, I hope), and then convert to a TensorFlow graph for pounding the GPU or for sharing with stubborn researchers who aren't using Julia.
This is closely related to JuliaPlots/Plots.jl#390, and I think the design decisions can be shared between the two.