Why is Adam much slower than sgd? #1516
Comments
Also, Adam takes about 2 GB more memory than ccsgd. In my case, ccsgd only takes 2.5 GB, but Adam takes 4.5 GB, sometimes nearly 5 GB. Is there any way this can be optimised, or is it supposed to be like this?
ccsgd saves memory because it is implemented in C++ and doesn't allocate any temporary memory. You can implement Adam in C++ to get similar performance.
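For reference, here is a minimal sketch of what a fused, in-place Adam update could look like in C++. This is only an illustration under assumed names, signatures, and default hyperparameters, not the actual MXNet kernel; the point is that all work happens in-place over the weight, gradient, and state buffers, with no temporary allocations:

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical fused Adam update (illustrative only): updates weights and
// the two per-weight moment buffers in place, without temporary arrays —
// the same property that keeps ccsgd cheap.
void adam_update(float* weight, const float* grad,
                 float* m, float* v,            // per-weight moment buffers
                 std::size_t n, int t,          // number of weights, step count
                 float lr = 1e-3f, float beta1 = 0.9f,
                 float beta2 = 0.999f, float eps = 1e-8f) {
  const float bc1 = 1.0f - std::pow(beta1, static_cast<float>(t));
  const float bc2 = 1.0f - std::pow(beta2, static_cast<float>(t));
  for (std::size_t i = 0; i < n; ++i) {
    m[i] = beta1 * m[i] + (1.0f - beta1) * grad[i];                 // first moment
    v[i] = beta2 * v[i] + (1.0f - beta2) * grad[i] * grad[i];       // second moment
    const float m_hat = m[i] / bc1;                                 // bias correction
    const float v_hat = v[i] / bc2;
    weight[i] -= lr * m_hat / (std::sqrt(v_hat) + eps);
  }
}
```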
The Stanford CS class CS231n: Convolutional Neural Networks for Visual Recognition compared various optimization algorithms and concluded that "in practice Adam is currently recommended as the default algorithm to use, and often works slightly better than RMSProp". It is definitely worthwhile to implement Adam in C++.
AdamUpdate is now available in C++. Closing this for now.
Could you help me and point me to some examples of Adam implemented in C++? I'm looking at the one in MLPACK, but it really isn't clear to me.
Internally, Adam needs two extra state variables (the first- and second-moment estimates) for each weight. Therefore, each update requires more operations than plain gradient descent, and the training time per sample increases; see the sketch below.
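For comparison with the Adam sketch above, here is what a plain SGD step looks like under the same assumed layout (again just an illustration, not the actual ccsgd kernel). SGD touches only the weight and gradient, with one multiply-add per weight and no extra state, whereas Adam also reads and writes the two moment buffers and performs extra multiplies, a square root, and a divide per weight:

```cpp
#include <cstddef>

// Plain SGD update (illustrative only): no optimizer state beyond the
// weights themselves, and one multiply-add per weight.
void sgd_update(float* weight, const float* grad,
                std::size_t n, float lr = 0.01f) {
  for (std::size_t i = 0; i < n; ++i) {
    weight[i] -= lr * grad[i];
  }
}
```

Roughly speaking, Adam keeps two extra float buffers of the same size as the weights, which is consistent with it using a few gigabytes more memory than ccsgd for a large model, as reported above.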
I tried using Adam in place of SGD. I changed nothing but the optimiser, and the speed dropped dramatically, which was confusing. With SGD I get 0.6 samples per second (I use a large input for FCN); with Adam I get 0.4 samples per second. I don't know if this makes sense, since Adam's complexity doesn't seem high enough to cause such a large difference in speed. Any idea?