RMSprop clean up and rebase #2867
shelhamer
added focus JD
labels
Aug 6, 2015
This was referenced Aug 6, 2015
shelhamer
added the
ready for review
label
Aug 6, 2015
thanks for handling this :)
jeffdonahue
commented on an outdated diff
Aug 7, 2015
| @@ -521,7 +531,7 @@ TYPED_TEST(NesterovSolverTest, TestNesterovLeastSquaresUpdateWithMomentum) { | ||
| const Dtype kMomentum = 0.5; | ||
| const int kNumIters = 1; | ||
| for (int i = 0; i <= kNumIters; ++i) { | ||
| - this->TestLeastSquaresUpdate(kLearningRate, kWeightDecay, kMomentum, i); | ||
| + this->TestLeastSquaresUpdate(kLearningRate, kWeightDecay, kMomentum, 0., i); |
jeffdonahue
Contributor
jeffdonahue
commented on an outdated diff
Aug 7, 2015
| @@ -128,6 +128,29 @@ class AdaGradSolver : public SGDSolver<Dtype> { | ||
| DISABLE_COPY_AND_ASSIGN(AdaGradSolver); | ||
| }; | ||
| + | ||
| +template <typename Dtype> | ||
| +class RMSpropSolver : public SGDSolver<Dtype> { |
jeffdonahue
Contributor
Thanks @erogol for the original work and thanks @ronghanghu for the rebase. This looks good except as noted above.
@jeffdonahue OK, I'll handle them. Thanks for the comments!
Cool, LGTM. @ronghanghu feel free to merge whenever it's easiest for you, before or after the other two PRs.
ronghanghu
added a commit
that referenced
this pull request
Aug 9, 2015
ronghanghu
698fc76
ronghanghu
merged commit 698fc76
into
BVLC:master
Aug 9, 2015
1 check passed
ronghanghu
deleted the
ronghanghu:rms-prop branch
Aug 9, 2015
shelhamer
commented on the diff
Aug 9, 2015
| @@ -867,10 +906,124 @@ TYPED_TEST(NesterovSolverTest, TestSnapshotShare) { | ||
| const Dtype kLearningRate = 0.01; | ||
| const Dtype kWeightDecay = 0.5; | ||
| const Dtype kMomentum = 0.9; | ||
| + const Dtype kRMSDecay = 0; | ||
| + const int kNumIters = 4; | ||
| + this->share_ = true; | ||
| + for (int i = 1; i <= kNumIters; ++i) { | ||
| + this->TestSnapshot(kLearningRate, kWeightDecay, kMomentum, kRMSDecay, i); | ||
| + } | ||
| +} | ||
| + | ||
| +template <typename TypeParam> | ||
| +class RMSPropSolverTest : public GradientBasedSolverTest<TypeParam> { | ||
| + typedef typename TypeParam::Dtype Dtype; | ||
| + | ||
| + protected: | ||
| + virtual void InitSolver(const SolverParameter& param) { | ||
| + this->solver_.reset(new RMSPropSolver<Dtype>(param)); |
shelhamer
Owner
shelhamer
commented on the diff
Aug 9, 2015
| @@ -173,6 +174,9 @@ class GradientBasedSolverTest : public MultiDeviceTest<TypeParam> { | ||
| if (momentum != 0) { | ||
| proto << "momentum: " << momentum << " "; | ||
| } | ||
| + if (rms_decay != 0) { |
shelhamer
Owner
@ronghanghu Sorry I didn't catch this earlier, but I have a suggestion for the RMS decay parameter in the tests. Instead of introducing another argument and setting it for every test, this param could be set by the RMSProp test class for encapsulation. Could you send a follow-up PR to make this change?
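To illustrate the suggestion, here is a minimal, self-contained sketch of one way a solver-specific test class could supply `rms_decay` itself instead of every test passing it as an argument. It does not use Caffe or gtest. The class names, the `AppendSolverSpecificParams` hook, and the value `0.95` are hypothetical, and the actual follow-up in #2888 may be structured differently.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical stand-in for the shared test fixture. It writes only the
// parameters that every solver shares into the solver prototxt text.
class GradientBasedSolverTestSketch {
 public:
  virtual ~GradientBasedSolverTestSketch() {}

  std::string BuildSolverProto(double learning_rate, double weight_decay,
                               double momentum) const {
    std::ostringstream proto;
    proto << "base_lr: " << learning_rate << " ";
    if (weight_decay != 0) {
      proto << "weight_decay: " << weight_decay << " ";
    }
    if (momentum != 0) {
      proto << "momentum: " << momentum << " ";
    }
    // Solver-specific settings come from the subclass, not from arguments.
    AppendSolverSpecificParams(&proto);
    return proto.str();
  }

 protected:
  // Hook assumed by this sketch: the default adds nothing.
  virtual void AppendSolverSpecificParams(std::ostringstream* /*proto*/) const {}
};

// Hypothetical RMSProp test class: it alone knows about rms_decay.
class RMSPropSolverTestSketch : public GradientBasedSolverTestSketch {
 protected:
  virtual void AppendSolverSpecificParams(std::ostringstream* proto) const {
    assert(proto != NULL);
    const double kRMSDecay = 0.95;  // illustrative value, not from the PR
    *proto << "rms_decay: " << kRMSDecay << " ";
  }
};
```

With this shape, tests for the other solvers keep their original signatures, and only the RMSProp fixture mentions the decay parameter.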
ronghanghu
restored the
ronghanghu:rms-prop branch
Aug 9, 2015
@shelhamer Yes, I can send another PR to do that. Adam solver is also going to introduce a ...
Addressed in #2888.
ronghanghu commented Aug 6, 2015
Rebased and adapted RMSprop implementation #1890 to the new solver interface #2518 and #1977. The original author is @erogol. Pulled against master instead of dev.
The RMSprop solver is based on G. Hinton's lecture (http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf). Each parameter gradient is divided by the root of a running mean of its squared values over recent batches. It can be seen as a mini-batch version of using only the sign of the gradients.
Update rule:
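The formula originally shown here is not preserved in this text. As a hedged reconstruction from the description above, the rule can be written as follows. The symbols are assumptions of this sketch: \rho for the RMS decay rate, \alpha for the learning rate, \epsilon for a small stabilizing constant, g_t for the gradient and w_t for the weights at iteration t.

```latex
\begin{aligned}
\mathrm{MS}_t &= \rho \,\mathrm{MS}_{t-1} + (1 - \rho)\, g_t^{2} \\
w_{t+1} &= w_t - \alpha \,\frac{g_t}{\sqrt{\mathrm{MS}_t} + \epsilon}
\end{aligned}
```

With \rho = 0 and \epsilon = 0 the step reduces to \alpha \,\mathrm{sign}(g_t), which matches the "sign of the gradients" reading.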
Momentum is not supported for RMSprop solver, as in #1890.
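The description above can be sketched as a standalone element-wise update. This is a minimal illustration under stated assumptions, not the code merged in this PR: the function name `RMSPropUpdate`, its parameter names, and the placement of `delta` outside the square root are choices made for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Applies one RMSprop step in place (no momentum, matching the note above).
// mean_square holds the running mean of squared gradients per parameter.
void RMSPropUpdate(const std::vector<double>& grad, double learning_rate,
                   double rms_decay, double delta,
                   std::vector<double>* mean_square,
                   std::vector<double>* weights) {
  assert(mean_square != NULL && weights != NULL);
  assert(grad.size() == weights->size());
  assert(grad.size() == mean_square->size());
  for (std::size_t i = 0; i < grad.size(); ++i) {
    // Running mean of squared gradients.
    (*mean_square)[i] = rms_decay * (*mean_square)[i] +
                        (1.0 - rms_decay) * grad[i] * grad[i];
    // Divide the gradient by its root mean square; delta guards against
    // division by zero when the running mean is tiny.
    (*weights)[i] -= learning_rate * grad[i] /
                     (std::sqrt((*mean_square)[i]) + delta);
  }
}
```

Calling it with `rms_decay = 0` and `delta = 0` moves every weight by exactly `learning_rate` in the direction opposite the gradient's sign, regardless of the gradient's magnitude.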