
Port over rule changes from Flux #38

Closed
2 of 3 tasks
ToucheSir opened this issue Jan 28, 2022 · 8 comments
Labels
help wanted Extra attention is needed

Comments

@ToucheSir
Member

ToucheSir commented Jan 28, 2022

ToucheSir added the help wanted label on Jan 28, 2022
@darsnack
Member

Also, to be clear, ExpDecay and InvDecay should not be ported. These are not optimizers, and I think it's a mistake to shoehorn them into the same interface.

@mcabbott
Member

This can be closed; the 3rd bullet point was closed by FluxML/Flux.jl#1868 instead of here.

@axsk

axsk commented Feb 23, 2024

> Also, to be clear, ExpDecay and InvDecay should not be ported. These are not optimizers, and I think it's a mistake to shoehorn them into the same interface.

How are these supposed to be used now? Trying the Flux built-in ExpDecay, I get errors saying Flux can't convert it to the new Optimisers.jl interface.

@ToucheSir
Member Author

Have you seen https://fluxml.ai/Flux.jl/stable/training/optimisers/#Scheduling-Optimisers? Basically, use ParameterSchedulers.jl. The Flux docs are actually just out of date, because we added functionality to ParameterSchedulers which makes this easier. http://fluxml.ai/ParameterSchedulers.jl/dev/tutorials/optimizers/ should cover everything.
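
Roughly, the idea is something like this (an untested sketch; `model`, `loss`, and `data` are placeholders):

using Flux, Optimisers
using ParameterSchedulers: Scheduler, Exp

# Scheduler rebuilds the rule each step with η taken from the schedule.
opt = Scheduler(Optimisers.AdamW, η = Exp(1e-3, 0.95))
opt_state = Flux.setup(opt, model)

for (x, y) in data
    grads = Flux.gradient(m -> loss(m(x), y), model)
    Flux.update!(opt_state, model, grads[1])
end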

@axsk

axsk commented Feb 24, 2024

Sorry for cross-posting, since this probably doesn't belong here, but I don't know exactly which package is responsible.
I had seen ParameterSchedulers and tried to get it to work with the stateful optimisers, along the lines of:

optstate = Flux.setup!(Optimisers.OptimisersChain(Optimisers.AdamW(), ParameterSchedulers.Exp(1,1)))
# so that I can use update!(optstate, model, grad[1])

I tried many combinations of Flux or Optimisers versions as well as with ParameterSchedulers.Stateful, but could not find a working version.

@ToucheSir
Member Author

ToucheSir commented Feb 24, 2024

That code doesn't look right in many ways: Flux.setup! doesn't exist, and ParameterSchedulers schedules should not be used in OptimisersChain. I'd recommend having a look through some of the examples in the ParameterSchedulers docs, specifically the second example on http://fluxml.ai/ParameterSchedulers.jl/dev/tutorials/complex-schedules/. If you try them out and they aren't working, please open an issue with a minimal working example (MWE) on ParameterSchedulers.jl.
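
The gist of that second example, as a rough sketch (assuming a recent Optimisers.jl; `model` and `nepochs` are placeholders):

using Flux, Optimisers
using ParameterSchedulers: Exp, Stateful, next!

sched = Stateful(Exp(1e-2, 0.8))                   # schedule that tracks its own iteration count
opt_state = Flux.setup(Optimisers.AdamW(), model)

for epoch in 1:nepochs
    Optimisers.adjust!(opt_state, next!(sched))    # set the learning rate for this epoch
    # ... run the usual Flux.update!(opt_state, model, grad) steps here ...
end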

@darsnack
Member

Optimisers.OptimisersChain(Optimisers.AdamW(), ParameterSchedulers.Exp(1,1)) looks like you're still treating schedules the old way (multiplying the learning rate). I suggest following the linked ParameterSchedulers documentation and posting a complete code example if it isn't working.

@axsk

axsk commented Feb 25, 2024

Thank you for pointing me to the complex schedules; I hadn't seen that page before.
It is working now with setup(Scheduler(AdamW, η=Exp(1,0.95)), model). The only thing I can't get working yet is setting the other AdamW hyperparameters to constants at initialisation: something like Scheduler(AdamW, η=Exp(1,0.95), λ=1e-3) still gives me errors (Floats are not callable).

Edit: Moved this problem to FluxML/ParameterSchedulers.jl#61
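
For anyone landing here before that issue is resolved, one possible workaround is to schedule only η and close over the constant λ in an anonymous constructor (an untested sketch; it assumes Scheduler forwards positional schedules to the constructor, and that Optimisers.AdamW takes η, β, λ positionally):

using Flux, Optimisers
using ParameterSchedulers: Scheduler, Exp

# Only η is scheduled; λ is fixed at construction time.
opt = Scheduler(η -> Optimisers.AdamW(η, (0.9, 0.999), 1e-3), Exp(1.0, 0.95))
opt_state = Flux.setup(opt, model)   # `model` is a placeholder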
