
Simplify training: use Adam and remove the need for manual tuning #58

Closed · 3 tasks done

daniel-j-h opened this issue Jun 30, 2018 · 0 comments

daniel-j-h (Collaborator) commented Jun 30, 2018

At the moment we are using stochastic gradient descent (SGD) with a multi-step learning-rate decay schedule.

In this setup the user has to set

  • the initial SGD learning rate
  • the SGD momentum
  • the learning-rate decay milestones
  • the learning-rate decay factor (gamma)
from torch.optim import SGD
from torch.optim.lr_scheduler import MultiStepLR

optimizer = SGD(net.parameters(), lr=model["opt"]["lr"], momentum=model["opt"]["momentum"])
scheduler = MultiStepLR(optimizer, milestones=model["opt"]["milestones"], gamma=model["opt"]["gamma"])

While this allows for great flexibility and control over details, it might be too complicated for our users. We should look into replacing our current setup, e.g. with the Adam optimizer, where only the initial learning rate and the weight decay need to be set.

We can then set these two values to reasonable defaults, so users can get started without thinking too much about hyperparameters and without having to run multiple experiments just to get the basics figured out.
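A minimal sketch of what the simplified setup could look like; the config keys and the default values below are placeholders for illustration, not decided defaults:

from torch.optim import Adam

# Hypothetical simplified optimizer config: two knobs only, placeholder values.
opt = {"lr": 1e-4, "weight_decay": 1e-4}

# Adam adapts per-parameter step sizes, so no momentum or milestone schedule is needed.
optimizer = Adam(net.parameters(), lr=opt["lr"], weight_decay=opt["weight_decay"])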

Tasks

  • Implement Adam optimizer with learning rate and weight decay
  • Benchmark and check results; if they look reasonable, go for it
  • Remove the SGD parameters from the config; use learning rate and weight decay only (see the config sketch below)
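Roughly, the optimizer section of the config would shrink as sketched here; the key names mirror the snippets above and all values are placeholders:

# Before: four SGD/schedule knobs (values are placeholders).
opt_before = {"lr": 0.01, "momentum": 0.9, "milestones": [60, 90], "gamma": 0.1}

# After: only the two Adam knobs remain.
opt_after = {"lr": 1e-4, "weight_decay": 1e-4}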