Remove model_state.use_fp8_ddp and optimizer.all_reduce_grads #145

Open · wants to merge 4 commits into main
Conversation

@wkcn (Contributor) commented Dec 14, 2023

Description
The argument model_state.use_fp8_ddp is deprecated.
In the MS-AMP examples, model_state.use_fp8_ddp is always set to True. In addition, the function optimizer.all_reduce_grads is no longer used.

Major changes

  • Remove model_state.use_fp8_ddp
  • Remove optimizer.all_reduce_grads
  • Remove the related unit tests
  • Update the unit test test_fp8linear_backward, since the weight gradient is a torch.Tensor when model_state.use_fp8_ddp is True.
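The last bullet can be illustrated with plain PyTorch: for an ordinary linear layer, the weight gradient produced by backward() is a plain torch.Tensor, which is the type the updated test now expects. This is a minimal sketch using torch.nn.Linear as a stand-in; it does not reproduce MS-AMP's FP8Linear or its ScalingTensor gradients.

```python
import torch

# Minimal sketch (not MS-AMP's FP8Linear): with a standard linear layer,
# the gradient attached to the weight after backward() is a plain
# torch.Tensor, matching what test_fp8linear_backward is updated to check.
linear = torch.nn.Linear(4, 2)
loss = linear(torch.randn(3, 4)).sum()
loss.backward()

assert type(linear.weight.grad) is torch.Tensor
assert linear.weight.grad.shape == linear.weight.shape
```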

@tocean (Contributor) commented Dec 18, 2023

In MS-AMP-Examples, we used optimizer.all_reduce_grads. We need to remove it from the examples as well.
