Adam solver #2918

Merged
merged 2 commits into from Aug 14, 2015

3 participants
Member

ronghanghu commented Aug 14, 2015

Carrying over the Adam solver (originally #2856) for merge.

I completed the tests and rebased it onto the latest master.

Authorship belongs to @PatWie and is preserved in the git commit.

Original message in #2856:

This commit implements the Adam solver by Kingma et al. for CPU and GPU. All solver parameters are defined in caffe.proto. This also adds an example for the MNIST dataset.

As you can see, both solver.cpp and test_gradient_based_solver.cpp have now grown to 1000+ lines. This problem will be addressed in #2890.

shelhamer referenced this pull request Aug 14, 2015

Closed

Adaptive Solvers: AdaDelta, RMSprop, and ADAM #2860

3 of 3 tasks complete
Member

ronghanghu commented Aug 14, 2015

I encountered an error larger than the margin in the solver tests. Perhaps this is simply a numerical issue. Looking into the details.
It was a mistake I made when implementing the test code ComputeLeastSquaresUpdate for Adam, where I forgot to raise beta1 and beta2 to the power t. Now everything is fine.
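For readers unfamiliar with the term being fixed here, the following is a minimal standalone sketch of one Adam step with bias correction. It is not Caffe's code and not the test helper from this PR: `AdamState`, `AdamStep`, and the `eps` parameter are hypothetical names introduced for illustration, and the epsilon is added to the uncorrected `sqrt(v)`, which is one of the formulations given in the Adam paper. The point it illustrates is that the correction factor depends on `beta1^t` and `beta2^t`, where `t` is the step count.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper types, not part of Caffe.
struct AdamState {
  std::vector<double> m;  // running first-moment estimate
  std::vector<double> v;  // running second-moment estimate
  int t = 0;              // number of steps taken so far
};

// Applies one Adam step to `params` in place.
void AdamStep(std::vector<double>& params, const std::vector<double>& grad,
              AdamState& state, double rate, double beta1, double beta2,
              double eps) {
  assert(params.size() == grad.size());
  if (state.m.empty()) {
    state.m.assign(params.size(), 0.0);
    state.v.assign(params.size(), 0.0);
  }
  ++state.t;
  // Bias correction: beta1 and beta2 are raised to the power t.
  // Using the plain betas here gives a wrong step size for every t > 1.
  const double correction = std::sqrt(1.0 - std::pow(beta2, state.t)) /
                            (1.0 - std::pow(beta1, state.t));
  for (std::size_t i = 0; i < params.size(); ++i) {
    state.m[i] = beta1 * state.m[i] + (1.0 - beta1) * grad[i];
    state.v[i] = beta2 * state.v[i] + (1.0 - beta2) * grad[i] * grad[i];
    params[i] -= rate * correction * state.m[i] /
                 (std::sqrt(state.v[i]) + eps);
  }
}
```

On the very first step (`t = 1`) the correction makes the update magnitude approximately equal to `rate`, regardless of the gradient's scale, which is a convenient sanity check for a test.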

ronghanghu referenced this pull request Aug 14, 2015

Closed

Adam solver #2856

Member

ronghanghu commented Aug 14, 2015

@shelhamer @jeffdonahue @philkr @PatWie Please take a look if you have time. I think this should be ready to merge.

This is the last piece of the solver trilogy in #2860. After merging this one, we can address #2890.

@jeffdonahue jeffdonahue commented on the diff Aug 14, 2015

include/caffe/solver.hpp
@@ -218,6 +218,21 @@ class AdaDeltaSolver : public SGDSolver<Dtype> {
};

jeffdonahue Aug 14, 2015

Contributor

We need to cite the ADAM paper somewhere*. I suggest putting a reference here, e.g. in a doxygen formatted comment like this. Eventually it would also be good to add sections to the solver tutorial on these new solvers, where the reference should also then be added.

*We probably also need to go back and add references for some of the other recently merged solvers.

Contributor

jeffdonahue commented Aug 14, 2015

Thanks for the rebase @ronghanghu and thanks @PatWie for the original implementation! See above comment; otherwise looks good.

Member

ronghanghu commented Aug 14, 2015

Citation added for Adam.

Eventually it would also be good to add sections to the solver tutorial on these new solvers, where the reference should also then be added.
*We probably also need to go back and add references for some of the other recently merged solvers.

Let's address that in #2890.

@jeffdonahue jeffdonahue and 1 other commented on an outdated diff Aug 14, 2015

src/caffe/solver.cpp
+ this->history_.push_back(
+ shared_ptr<Blob<Dtype> >(new Blob<Dtype>(shape)));
+ }
+}
+
+template <typename Dtype>
+void AdamSolver<Dtype>::ComputeUpdateValue(int param_id, Dtype rate) {
+ const vector<Blob<Dtype>*>& net_params = this->net_->learnable_params();
+ const vector<float>& net_params_lr = this->net_->params_lr();
+ Dtype local_rate = rate * net_params_lr[param_id];
+ const Dtype beta1 = this->param_.momentum();
+ const Dtype beta2 = this->param_.momentum2();
+
+ // we create aliases for convenience
+ size_t update_history_offset = net_params.size();
+ shared_ptr<Blob<Dtype> > val_m = this->history_[param_id];

jeffdonahue Aug 14, 2015

Contributor

No need to create shared_ptrs for these val_* variables, is there? (I suggest using the raw pointer, e.g. Blob<Dtype>* val_m = this->history_[param_id].get();.)


ronghanghu Aug 14, 2015

Member

Yes, you are right. I should use raw pointers.
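The difference discussed in this thread can be sketched in isolation. The example below assumes `std::shared_ptr` from the standard library as a stand-in for the `shared_ptr` used in the diff, and `Buffer` is a hypothetical replacement for `Blob<Dtype>`; both function names are made up for illustration. Copying a `shared_ptr` into a local alias increments the reference count, while `.get()` yields a non-owning raw pointer that leaves ownership untouched.

```cpp
#include <memory>
#include <vector>

// Hypothetical stand-in for Blob<Dtype>; not Caffe's class.
struct Buffer {
  std::vector<float> data;
};

// Local alias as a shared_ptr copy: the reference count is bumped for the
// lifetime of the alias. Returns the count observed while it is alive.
long UseCountWithSharedAlias(
    const std::vector<std::shared_ptr<Buffer>>& history) {
  std::shared_ptr<Buffer> alias = history[0];
  return alias.use_count();
}

// Local alias as a raw pointer: no ownership is taken, so the reference
// count is unchanged. The vector keeps the object alive for the call.
long UseCountWithRawAlias(
    const std::vector<std::shared_ptr<Buffer>>& history) {
  Buffer* alias = history[0].get();
  alias->data[0] += 1.0f;  // the alias is still fully usable
  return history[0].use_count();
}
```

A raw alias is safe in this situation because the owning container outlives the function call, so the local variable never needs to extend the object's lifetime.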

Contributor

jeffdonahue commented Aug 14, 2015

Thanks for adding the citation. After a final glance I noticed the one other thing I commented above; sorry about not noticing before. Feel free to merge after addressing that.

Contributor

PatWie commented Aug 14, 2015

Looks good.

PatWie and others added some commits Aug 3, 2015

@ronghanghu PatWie Adam solver
This commit implements the Adam solver by Kingma et al. for CPU and
GPU. All solver parameters are defined in the caffe.proto. This also
adds an example for the MNIST dataset.
4e4c89b
@ronghanghu ronghanghu Cite Adam paper in solver.hpp bf42e6e
Member

ronghanghu commented Aug 14, 2015

Changed from shared ptrs to raw ptrs in AdamSolver<Dtype>::ComputeUpdateValue.

@ronghanghu ronghanghu added a commit that referenced this pull request Aug 14, 2015

@ronghanghu ronghanghu Merge pull request #2918 from ronghanghu/adam
Adam solver
cbca8fe

@ronghanghu ronghanghu merged commit cbca8fe into BVLC:master Aug 14, 2015

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed

ronghanghu deleted the ronghanghu:adam branch Aug 14, 2015
