
Port over rule changes from Flux #38

Closed
2 of 3 tasks
ToucheSir opened this issue Jan 28, 2022 · 8 comments
Labels
help wanted Extra attention is needed

Comments

@ToucheSir
Member

ToucheSir commented Jan 28, 2022

ToucheSir added the help wanted label on Jan 28, 2022
@darsnack
Member

Also, to be clear, ExpDecay and InvDecay should not be ported. These are not optimizers, and I think it's a mistake to shoehorn them into the same interface.

@mcabbott
Member

This can be closed; the 3rd bullet point was closed by FluxML/Flux.jl#1868 instead of here.

@axsk

axsk commented Feb 23, 2024

> Also, to be clear, ExpDecay and InvDecay should not be ported. These are not optimizers, and I think it's a mistake to shoehorn them into the same interface.

How are these supposed to be used now? Trying the Flux built-in ExpDecay, I get errors saying Flux can't convert it to the new Optimisers.jl interface.

@ToucheSir
Member Author

Have you seen https://fluxml.ai/Flux.jl/stable/training/optimisers/#Scheduling-Optimisers? Basically, use ParameterSchedulers.jl. The Flux docs are actually just out of date, because we added functionality to ParameterSchedulers which makes this easier. http://fluxml.ai/ParameterSchedulers.jl/dev/tutorials/optimizers/ should cover everything.
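
Roughly, the idea is something like this (an untested sketch; `model`, `loss`, and `data` are placeholders):

using Flux, Optimisers
using ParameterSchedulers: Scheduler, Exp

# Scheduler rebuilds the rule each step with η taken from the schedule.
opt = Scheduler(Optimisers.AdamW, η = Exp(1e-3, 0.95))
opt_state = Flux.setup(opt, model)

for (x, y) in data
    grads = Flux.gradient(m -> loss(m(x), y), model)
    Flux.update!(opt_state, model, grads[1])
end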

@axsk

axsk commented Feb 24, 2024

Sorry for cross-posting, since this probably doesn't belong here, but I don't know exactly which package is responsible.
I had seen ParameterSchedulers and tried to get it to work with the stateful optimisers, along the lines of:

optstate = Flux.setup!(Optimisers.OptimisersChain(Optimisers.AdamW(), ParameterSchedulers.Exp(1,1)))
# so that I can use update!(optstate, model, grad[1])

I tried many combinations of Flux or Optimisers versions as well as with ParameterSchedulers.Stateful, but could not find a working version.

@ToucheSir
Member Author

ToucheSir commented Feb 24, 2024

That code doesn't look right in many ways: Flux.setup! doesn't exist, and ParameterSchedulers schedules should not be used in OptimisersChain. I'd recommend having a look through some of the examples in the ParameterSchedulers docs, specifically the second example on http://fluxml.ai/ParameterSchedulers.jl/dev/tutorials/complex-schedules/. If you try them out and they aren't working, please open an issue with a minimal working example (MWE) on ParameterSchedulers.jl.
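
The gist of that second example, as a rough sketch (assuming a recent Optimisers.jl; `model` and `nepochs` are placeholders):

using Flux, Optimisers
using ParameterSchedulers: Exp, Stateful, next!

sched = Stateful(Exp(1e-2, 0.8))                   # schedule that tracks its own iteration count
opt_state = Flux.setup(Optimisers.AdamW(), model)

for epoch in 1:nepochs
    Optimisers.adjust!(opt_state, next!(sched))    # set the learning rate for this epoch
    # ... run the usual Flux.update!(opt_state, model, grad) steps here ...
end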

@darsnack
Member

Optimisers.OptimisersChain(Optimisers.AdamW(), ParameterSchedulers.Exp(1,1)) looks like you're still treating schedules the old way (multiplying the learning rate). I suggest following the linked ParameterSchedulers documentation and posting a complete code example if it isn't working.

@axsk

axsk commented Feb 25, 2024

Thank you for pointing me to the complex schedules; I hadn't seen that page before.
It is working now with setup(Scheduler(AdamW, η=Exp(1,0.95)), model). The only thing I can't get working yet is setting the other AdamW hyperparameters to constants at initialisation: something like Scheduler(AdamW, η=Exp(1,0.95), λ=1e-3) still gives me errors (Floats are not callable).

Edit: Moved this problem to FluxML/ParameterSchedulers.jl#61
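
For anyone landing here before that issue is resolved, one possible workaround is to schedule only η and close over the constant λ in an anonymous constructor (an untested sketch; it assumes Scheduler forwards positional schedules to the constructor, and that Optimisers.AdamW takes η, β, λ positionally):

using Flux, Optimisers
using ParameterSchedulers: Scheduler, Exp

# Only η is scheduled; λ is fixed at construction time.
opt = Scheduler(η -> Optimisers.AdamW(η, (0.9, 0.999), 1e-3), Exp(1.0, 0.95))
opt_state = Flux.setup(opt, model)   # `model` is a placeholder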
