Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

better update! interface #76

Closed
denizyuret opened this issue Feb 12, 2017 · 8 comments
Closed

better update! interface #76

denizyuret opened this issue Feb 12, 2017 · 8 comments
Assignees

Comments

@denizyuret
Copy link
Owner

I was thinking of the update! interface. The requirement of doing individual updates to each numeric weight array is counter-intuitive even if we better documented it. People will naturally want a single update! for the whole model:

(weights, params) = update!(weights, params, grads)

The problem is we do not know the structure of the weights, params, or grads (Array, Tuple, Dict, other Type etc.) What if we require the minimum common interface for all three to be iterators i.e. that they only support for ... in ...? Then, update! can use zip to iterate over them:

for (w,p,g) in zip(weights, params, grads)
  ...
end

We probably need to think about what, if anything, to return. If we just guarantee that the returned values are also iterators, would that be sufficient?

@denizyuret
Copy link
Owner Author

The docs indicate constructors like Momentum(), in actual code you require Momentum(w). Should we go back to Momentum()?

@denizyuret
Copy link
Owner Author

I am not sure if the functional interface: (w,p)=update!(w,g,p) is worth supporting. The original motivation was scalar parameters. But you use axpy! so you don't support this anyway?

@denizyuret
Copy link
Owner Author

There should be a update!(w,g;lr=0.1) which defaults to Sgd. This will come in handy if w,g are iterators, and we just want to apply a simple Sgd update to all without having to construct separate optimization objects.

@denizyuret
Copy link
Owner Author

Since all current update! methods require a well typed prms argument, the untyped update!(w,g,p) can be used to implement the iterator method. In fact this will support iterators within iterators (embedded arrays). The only thing it won't support is w::Dict, which requires its own method. In each case we require w,g,p to have parallel structure.

@denizyuret
Copy link
Owner Author

Do we have the Nesterov version of Momentum or the regular? Maybe we can make that a flag...

@denizyuret
Copy link
Owner Author

I am working on this in the newupdate branch.

@ozanarkancan
Copy link
Collaborator

We use the interface (w,p)=update!(w,g,p) to allow further developments. We don't want everyone change their own code when we change our inner code.

We don't have any nesterov...

@denizyuret
Copy link
Owner Author

The new calls have been integrated. The problem with the (w,p)=update!(w,g,p) syntax is (1) either we are modifying in-place, in which case it is not necessary, (2) or we are constructing new w,p in which case we need to match their types to the originals, i.e. dictionary, tuple, array, struct etc. I don't think it is worth the trouble.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants