This repository was archived by the owner on Jun 4, 2025. It is now read-only.

Conversation

@eldarkurtic

When gradient accumulation is used, the effective batch size is `gradient_accumulation_steps` times larger.
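The point of the fix above is that with gradient accumulation, gradients from several micro-batches are summed before a single optimizer step, so each update effectively sees `micro_batch_size * gradient_accumulation_steps` examples. A minimal framework-free sketch (all names here are illustrative, not from this repository):

```python
# Illustrative sketch of gradient accumulation; `grad` stands in for a
# model's gradient from one micro-batch (plain numbers, no framework).

def train_steps(batches, gradient_accumulation_steps):
    """Accumulate gradients over N micro-batches, then take one optimizer step.

    Returns the number of optimizer steps taken."""
    accumulated = 0.0
    optimizer_steps = 0
    for i, grad in enumerate(batches, start=1):
        accumulated += grad  # sum gradients instead of stepping every batch
        if i % gradient_accumulation_steps == 0:
            optimizer_steps += 1  # one parameter update per accumulation window
            accumulated = 0.0     # reset for the next window
    return optimizer_steps

# 8 micro-batches accumulated in windows of 4 -> 2 optimizer steps,
# each update seeing an effective batch 4x the micro-batch size.
print(train_steps([1.0] * 8, 4))  # 2
```

This is why logging or scheduling logic keyed to "batch size" must multiply by the accumulation steps, which is the correction this PR makes.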

@bfineran bfineran left a comment


good catch

@markurtz markurtz merged commit c7b33f0 into neuralmagic:master Jan 24, 2022
KSGulin pushed a commit that referenced this pull request Mar 9, 2022
When gradient accumulation is used, the effective batch size is `gradient_accumulation_steps` times larger.

3 participants