
Theoretical slowdown using batch accumulation #1101

Closed
mrgloom opened this issue Sep 21, 2016 · 1 comment

Comments

mrgloom commented Sep 21, 2016

As I understand it, batch accumulation is just an alias for the iter_size parameter in the Caffe solver.

solver.iter_size = self.batch_accumulation

As I understand it, using for example a batch size of 16 should be numerically equivalent to using a batch size of 4 with iter_size 4? And other solver settings, such as the learning rate, should not affect the result?
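To illustrate what I mean by numerically equivalent, here is a toy NumPy sketch (not Caffe itself; `grad` is a hypothetical mean-squared-error gradient I made up for the example). If the accumulated gradients are averaged over the sub-batches, the result matches the full-batch gradient:

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=3)
X = rng.normal(size=(16, 3))
y = rng.normal(size=16)

def grad(w, X, y):
    # Gradient of the mean squared error 0.5 * mean((X @ w - y)**2)
    return X.T @ (X @ w - y) / len(y)

# One full batch of 16
g_full = grad(w, X, y)

# Four sub-batches of 4, gradients averaged (like iter_size = 4)
g_accum = np.mean(
    [grad(w, X[i:i + 4], y[i:i + 4]) for i in range(0, 16, 4)], axis=0
)

print(np.allclose(g_full, g_accum))  # True
```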

What is the theoretical slowdown when we use batch accumulation?

How does it work internally? Does it store the result of each forward pass in GPU memory (iter_size times in total), then average/merge those batches into one batch of batch_size, and then do the backward pass and update using that one batch of batch_size?
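My current mental model, as a rough NumPy sketch (hypothetical names and loss, not Caffe's actual C++ internals), is that only a gradient accumulator persists between passes, with a single averaged update at the end:

```python
import numpy as np

def grad(w, X, y):
    # Hypothetical per-sub-batch gradient (mean squared error)
    return X.T @ (X @ w - y) / len(y)

def step_with_accumulation(w, sub_batches, lr, iter_size):
    # Run iter_size forward/backward passes on small sub-batches,
    # summing each sub-batch gradient into an accumulator; only the
    # gradient buffer is kept between passes, not the forward results.
    accum = np.zeros_like(w)
    for X, y in sub_batches[:iter_size]:
        accum += grad(w, X, y)
    # Apply one parameter update, normalized by iter_size
    return w - lr * (accum / iter_size)
```

For the quadratic loss above this produces exactly the same update as a single pass over the concatenated batch, since the gradient is a mean over samples.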

@lukeyeager
Member

I'll refer you to the pull request that added the feature: BVLC/caffe#1977
