
BF16_Optimizer: add support for bf16 grad acc #4713

Merged
Merged 3 commits into microsoft:master from nelyahu:bf16_grad_acc on Dec 8, 2023

Conversation

@nelyahu (Contributor) commented Nov 21, 2023

The default gradient accumulation data type is fp32. Adding the following to the DeepSpeed JSON config file:
"data_types" : {"grad_accum_dtype": "bf16"}
causes gradient accumulation to be performed in bf16.
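
For context, a minimal sketch of a DeepSpeed config that uses this option; the surrounding keys (train_batch_size, gradient_accumulation_steps, and the bf16 section) are illustrative assumptions and are not part of this PR:

    {
      "train_batch_size": 16,
      "gradient_accumulation_steps": 4,
      "bf16": { "enabled": true },
      "data_types": { "grad_accum_dtype": "bf16" }
    }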

@tjruwase tjruwase added this pull request to the merge queue Dec 8, 2023
Merged via the queue into microsoft:master with commit ce60708 Dec 8, 2023
13 checks passed
@nelyahu nelyahu deleted the bf16_grad_acc branch December 12, 2023 14:43
mauryaavinash95 pushed a commit to mauryaavinash95/DeepSpeed that referenced this pull request Feb 17, 2024
Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
2 participants