Big-batch Stochastic Gradient Descent (BBSGD) #1131

Merged: 11 commits into mlpack:master on May 3, 2018

zoq commented Sep 29, 2017

Implementation of Big-batch Stochastic Gradient Descent (BBSGD) as described in: "Big Batch SGD: Automated Inference using Adaptive Batch Sizes" by Soham De et al.

@rcurtin

Looks good to me, but I guess after #1137 is merged it will need some amount of refactoring. The technique seemed really cool in the paper, I am looking forward to playing with it and seeing how much of an improvement it gives. :)

{
vB += std::pow(arma::norm(funcGradients[j] - (gradient / batchSize),
2.0), 2.0);
}

@rcurtin

rcurtin Oct 22, 2017

Member

I think it might be possible to avoid holding the full vector funcGradients by using an incremental variance calculation, something like this:

https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Online_algorithm

However, I am not totally sure. For large batch sizes, I think the whole algorithm could be much faster if all the memory for funcGradients was not needed. With large batch sizes and large networks being optimized, holding every per-point gradient in funcGradients might actually cause the system to run out of RAM in some cases.
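To make the suggestion concrete, here is a rough sketch of the online (Welford) update applied to gradient vectors. It is only an illustration under assumptions: `GradientVarianceAccumulator` and its methods are hypothetical names, not part of mlpack, and it uses plain `std::vector<double>` instead of Armadillo types so it stands alone. It keeps a running mean and a single scalar, so the per-point gradients never have to be stored.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of mlpack): accumulates the sum of squared
// distances of per-point gradients from their running mean, one gradient at
// a time, using Welford's online update.
class GradientVarianceAccumulator
{
 public:
  explicit GradientVarianceAccumulator(const std::size_t dim) :
      count(0), mean(dim, 0.0), m2(0.0) { }

  // Fold one gradient into the running statistics.
  void Add(const std::vector<double>& gradient)
  {
    assert(gradient.size() == mean.size());
    ++count;
    for (std::size_t k = 0; k < mean.size(); ++k)
    {
      const double delta = gradient[k] - mean[k];  // Deviation from old mean.
      mean[k] += delta / count;                    // Update the running mean.
      m2 += delta * (gradient[k] - mean[k]);       // Old and new deviation.
    }
  }

  // Equals sum_j ||g_j - mean||^2 over all gradients added so far.
  double SumSquaredDeviations() const { return m2; }

  // Sample variance; returns 0 when fewer than two gradients were added.
  double SampleVariance() const
  {
    return (count > 1) ? m2 / (count - 1) : 0.0;
  }

  const std::vector<double>& Mean() const { return mean; }

 private:
  std::size_t count;
  std::vector<double> mean;
  double m2;
};
```

After all gradients of a batch have been added, `SumSquaredDeviations()` should match what the loop above accumulates into vB, since `gradient / batchSize` is the batch mean. I have not checked how this fits with the rest of the optimizer, so treat it as a starting point only.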

funcGradients.resize(batchSize + batchOffset);
// Generate new batch indices.
arma::Col<size_t> batchVisitationOrder;

@rcurtin

rcurtin Oct 22, 2017

Member

I guess once #1137 is merged, this part can be simplified greatly because we can just call Shuffle().

zoq added some commits May 1, 2017

@sourabhvarshney111

Contributor

sourabhvarshney111 commented Apr 4, 2018

@zoq I don't know whether this is complete, but it looks like it is ready to merge. Is that so?

@zoq

Member

zoq commented Apr 11, 2018

Kinda, looks like the only thing left is to use the latest EvaluateWithGradient function. I'll put this on my list.

@sourabhvarshney111

Contributor

sourabhvarshney111 commented Apr 12, 2018

Thanks @zoq. Looking forward to seeing this merged.

zoq added some commits May 1, 2018

@zoq zoq merged commit e37ceb3 into mlpack:master May 3, 2018

5 checks passed

Memory Checks
Static Code Analysis Checks: Build finished.
Style Checks: Build finished.
continuous-integration/appveyor/pr: AppVeyor build succeeded
continuous-integration/travis-ci/pr: The Travis CI build passed