-
Notifications
You must be signed in to change notification settings - Fork 230
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
better update! interface #76
Comments
The docs indicate constructors like Momentum(), in actual code you require Momentum(w). Should we go back to Momentum()? |
I am not sure if the functional interface: (w,p)=update!(w,g,p) is worth supporting. The original motivation was scalar parameters. But you use axpy! so you don't support this anyway? |
There should be a update!(w,g;lr=0.1) which defaults to Sgd. This will come in handy if w,g are iterators, and we just want to apply a simple Sgd update to all without having to construct separate optimization objects. |
Since all current update! methods require a well typed prms argument, the untyped update!(w,g,p) can be used to implement the iterator method. In fact this will support iterators within iterators (embedded arrays). The only thing it won't support is w::Dict, which requires its own method. In each case we require w,g,p to have parallel structure. |
Do we have the Nesterov version of Momentum or the regular? Maybe we can make that a flag... |
I am working on this in the |
We use the interface (w,p)=update!(w,g,p) to allow further developments. We don't want everyone change their own code when we change our inner code. We don't have any nesterov... |
The new calls have been integrated. The problem with the (w,p)=update!(w,g,p) syntax is (1) either we are modifying in-place, in which case it is not necessary, (2) or we are constructing new w,p in which case we need to match their types to the originals, i.e. dictionary, tuple, array, struct etc. I don't think it is worth the trouble. |
I was thinking of the update! interface. The requirement of doing individual updates to each numeric weight array is counter-intuitive even if we better documented it. People will naturally want a single update! for the whole model:
The problem is we do not know the structure of the weights, params, or grads (Array, Tuple, Dict, other Type etc.) What if we require the minimum common interface for all three to be iterators i.e. that they only support
for ... in ...
? Then, update! can use zip to iterate over them:We probably need to think about what, if anything, to return. If we just guarantee that the returned values are also iterators, would that be sufficient?
The text was updated successfully, but these errors were encountered: