@yiheng yiheng commented Oct 7, 2016

In this PR, we add two additional learning rate adjustment strategies.

  1. Poly
    Adjusts the learning rate as:
    init_lr * (1 - iter / maxiter)^beta
    where init_lr is the initial learning rate, iter is the current iteration, maxiter is the maximum number of iterations, and beta is a hyperparameter.
  2. Step
    Multiplies the learning rate by a given factor gamma every N iterations, where gamma and N are hyperparameters.
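The two schedules above can be sketched as follows. This is a minimal Python illustration of the formulas in this PR, not the actual optim/SGD implementation (which is in Scala); the function and parameter names here are chosen for clarity.

```python
def poly_lr(init_lr, iteration, max_iter, beta):
    """Poly schedule: init_lr * (1 - iter / maxiter)^beta.

    Decays the learning rate polynomially from init_lr toward 0
    as the current iteration approaches max_iter.
    """
    return init_lr * (1.0 - iteration / max_iter) ** beta

def step_lr(init_lr, iteration, gamma, step_size):
    """Step schedule: multiply the rate by gamma every step_size iterations.

    Equivalent to init_lr * gamma^(floor(iter / N)) for step size N.
    """
    return init_lr * gamma ** (iteration // step_size)

# Example: with init_lr=0.1, gamma=0.5, N=10, the rate halves every 10 iterations:
# iteration 0-9 -> 0.1, iteration 10-19 -> 0.05, iteration 20-29 -> 0.025.
```

Note that Poly reaches exactly zero at iteration maxiter, while Step decays geometrically and never reaches zero.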

These strategies help the model converge to good accuracy when training with large batch sizes.

The implementation is in optim/SGD, and the old learning rate adjustment strategies have also been moved there.

@yiheng yiheng merged commit 0e9ca8b into intel:master Oct 7, 2016
Oscilloscope98 pushed a commit to Oscilloscope98/ipex-llm that referenced this pull request on Oct 18, 2022: "Fix mis-numbered list problems & other small fixes".
liu-shaojun pushed a commit that referenced this pull request on Mar 25, 2024.