
learning rate logging #131

Closed · alexander-g wants to merge 3 commits
Conversation

@alexander-g (Contributor) commented Dec 21, 2020

Learning rates are now logged during training (#124). For this I have created a new module, elegy.optax, which is a wrapper around optax (see the sketch after the list). Specifically, it wraps:

  • scale_by_schedule, to intercept the step_fn schedule function
  • all functions that accept a learning_rate parameter, to intercept it
  • chain
  • all other optax functions are kept unchanged
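
A minimal sketch of how such a wrapper could look, assuming an extended GradientTransformation that additionally carries the learning rate (the helper name _wrap_lr_factory and the exact fields are illustrative, not the actual elegy.optax implementation):

```python
import typing as tp

import optax


class GradientTransformation(tp.NamedTuple):
    # Hypothetical extension of optax.GradientTransformation that also
    # carries the learning rate and any schedule functions for logging.
    init: tp.Callable
    update: tp.Callable
    lr: tp.Optional[float] = None
    step_fns: tp.Tuple[tp.Callable, ...] = ()


def _wrap_lr_factory(factory: tp.Callable) -> tp.Callable:
    # Wrap an optax factory (e.g. optax.adam, optax.sgd) so that the
    # learning_rate it receives is kept on the returned transformation.
    def wrapped(learning_rate, *args, **kwargs):
        inner = factory(learning_rate, *args, **kwargs)
        return GradientTransformation(inner.init, inner.update, lr=learning_rate)

    return wrapped


# elegy.optax.adam would then behave like optax.adam, plus an lr field
# that elegy can read and log during training:
adam = _wrap_lr_factory(optax.adam)
sgd = _wrap_lr_factory(optax.sgd)
```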

Extended elegy.Optimizer with the method get_effective_learning_rate, which calculates the learning rate scaled by the schedule(s).
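
In other words, the effective learning rate is the intercepted base lr multiplied by every intercepted scale_by_schedule step_fn evaluated at the current step. A rough standalone sketch (the real method lives on elegy.Optimizer; this free-function signature is an assumption):

```python
def get_effective_learning_rate(step, lr, step_fns):
    # lr: base learning rate intercepted from the wrapped optax factory.
    # step_fns: schedule functions intercepted from scale_by_schedule.
    if lr is None:
        return None
    effective = lr
    for step_fn in step_fns:
        effective = effective * step_fn(step)
    return effective
```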

The original optax module can still be used as before, but then the learning rates are not logged.

@codecov-io commented Dec 21, 2020

Codecov Report

Merging #131 (c8cffb7) into master (78777ff) will increase coverage by 0.34%.
The diff coverage is 98.92%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #131      +/-   ##
==========================================
+ Coverage   79.25%   79.60%   +0.34%     
==========================================
  Files         110      111       +1     
  Lines        5143     5236      +93     
==========================================
+ Hits         4076     4168      +92     
- Misses       1067     1068       +1     
| Impacted Files | Coverage Δ |
| --- | --- |
| elegy/model/model_base.py | 75.64% <94.44%> (+1.33%) ⬆️ |
| elegy/model/model_test.py | 100.00% <100.00%> (ø) |
| elegy/optax.py | 100.00% <100.00%> (ø) |

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 78777ff...c8cffb7.

@cgarciae (Collaborator)

Hey @alexander-g!

I am wondering if this is the right approach. An alternative is to just modify our Optimizer class, as we discussed in #124, and manually add chain + scale_by_schedule to the optax optimizer passed by the user, given an lr_schedule function that the user also passes (see the sketch below).
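
A rough illustration of that alternative (build_optimizer is a made-up name here, not a proposed API):

```python
import optax


def build_optimizer(optimizer: optax.GradientTransformation, lr_schedule):
    # Chain the user's optimizer with scale_by_schedule so the updates are
    # scaled by lr_schedule(step); since Elegy holds lr_schedule directly,
    # it can also evaluate it each step to log the current learning rate.
    return optax.chain(optimizer, optax.scale_by_schedule(lr_schedule))
```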

My main doubts are:

  • We have to add the lr and step_fns fields to all GradientTransformations even if we don't need them.
  • The use of meta-programming can make error messages confusing at times.

I'll try to implement the proposal in #124 so we can compare.

@cgarciae mentioned this pull request Dec 23, 2020