
Why is Adam much slower than sgd? #1516

Closed · horserma opened this issue Feb 22, 2016 · 7 comments

Comments

@horserma

I tried using Adam in place of sgd. I changed nothing but the optimizer, yet the speed dropped considerably, which was confusing. With sgd I get 0.6 samples per second (I use a large input for an FCN); with Adam I get 0.4 samples per second. I don't know if this makes sense, since Adam's extra complexity doesn't seem high enough to cause such a large difference in speed. Any ideas?

@horserma
Author

Also, Adam takes about 2 GB more memory than ccsgd. In my case, ccsgd takes only 2.5 GB, but Adam takes 4.5 GB, sometimes nearly 5 GB. Is there any way this can be optimized, or is it supposed to be like this?

@piiswrong
Contributor

ccsgd saves memory because it's implemented in C++ and doesn't allocate any temporary memory. You can implement Adam in C++ to get similar performance.
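
For anyone curious what that suggestion could look like, here is a minimal sketch of a fused, in-place Adam step in plain C++. It is illustrative only, not MXNet's actual kernel; the function name, buffer layout, and hyperparameter defaults are assumptions:

```cpp
#include <cmath>
#include <cstddef>

// Per-parameter optimizer state: both moment buffers are allocated once,
// outside the training loop, so the update itself never allocates.
struct AdamState {
    float* m;   // first moment (running mean of gradients), length n
    float* v;   // second moment (running mean of squared gradients), length n
    int t = 0;  // step counter, used for bias correction
};

// One fused, in-place Adam step over n weights. Hyperparameter defaults
// follow the Adam paper; they are assumptions here, not MXNet's.
void adam_update(float* w, const float* grad, AdamState& s, std::size_t n,
                 float lr = 1e-3f, float beta1 = 0.9f, float beta2 = 0.999f,
                 float eps = 1e-8f) {
    ++s.t;
    // Bias-correction factors depend only on t, so hoist them out of the loop.
    const float bc1 = 1.0f - static_cast<float>(std::pow(beta1, s.t));
    const float bc2 = 1.0f - static_cast<float>(std::pow(beta2, s.t));
    for (std::size_t i = 0; i < n; ++i) {
        const float g = grad[i];
        s.m[i] = beta1 * s.m[i] + (1.0f - beta1) * g;      // update 1st moment
        s.v[i] = beta2 * s.v[i] + (1.0f - beta2) * g * g;  // update 2nd moment
        const float m_hat = s.m[i] / bc1;                  // bias-corrected
        const float v_hat = s.v[i] / bc2;
        w[i] -= lr * m_hat / (std::sqrt(v_hat) + eps);     // in-place step
    }
}
```

Because `m` and `v` are allocated once up front and the weights are updated in place, the inner loop performs no allocations, which is the property attributed to ccsgd above.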

@futurely

The Stanford CS class CS231n: Convolutional Neural Networks for Visual Recognition compared various optimization algorithms and concluded that "in practice Adam is currently recommended as the default algorithm to use, and often works slightly better than RMSProp". It's definitely worth implementing Adam in C++.

@eric-haibin-lin
Member

AdamUpdate is now available in C++. Closing this for now.

@sabrinamehlal

Could you help me and give me some examples of an Adam implementation in C++? I'm looking for one in MLPACK, but it's really not clear.

@pai-plznw4me

Internally, Adam needs two extra state variables for each weight (the running averages of the gradient and of its square). Each step therefore requires more operations and more memory than plain gradient descent, so the training time per sample increases.
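
For reference, these are the per-weight update equations from the Adam paper (Kingma & Ba, 2015). The buffers $m_t$ and $v_t$ are the two extra state variables, and the bias corrections and square root are the extra arithmetic compared with plain SGD's single step $\theta_t = \theta_{t-1} - \alpha\, g_t$:

$$
\begin{aligned}
m_t &= \beta_1 m_{t-1} + (1-\beta_1)\, g_t \\
v_t &= \beta_2 v_{t-1} + (1-\beta_2)\, g_t^2 \\
\hat{m}_t &= m_t / (1-\beta_1^t), \qquad \hat{v}_t = v_t / (1-\beta_2^t) \\
\theta_t &= \theta_{t-1} - \alpha\, \hat{m}_t / (\sqrt{\hat{v}_t} + \epsilon)
\end{aligned}
$$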
