
Add BF16 momentum1/2 in PARTIAL_ROWWISE_ADAM TBE #2524

Closed
wants to merge 3 commits

Conversation

@sryap (Contributor) commented Apr 19, 2024

Summary:
This diff adds BF16 support for momentum1 and momentum2 in the
PARTIAL_ROWWISE_ADAM optimizer in TBE.

Backend: D56192739, D56192738, D56192736
Frontend: D56192737 (this diff)

Differential Revision: D56192737
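
To make the numerics concrete, below is a minimal CPU sketch of one row of the PARTIAL_ROWWISE_ADAM update with BF16-stored momentum1 (per element) and momentum2 (one scalar per row), assuming the usual pattern of BF16 storage with FP32 arithmetic. This is not FBGEMM's actual CUDA kernel; the helper names (`float_to_bf16`, `partial_rowwise_adam_row`) and the scalar loop are illustrative only.

```cpp
#include <cmath>
#include <cstdint>
#include <cstdio>
#include <cstring>

// BF16 <-> FP32 conversion helpers (round-to-nearest-even; NaN handling
// omitted for brevity). These stand in for the hardware conversions a
// real CUDA kernel would use.
static inline uint16_t float_to_bf16(float f) {
  uint32_t u;
  std::memcpy(&u, &f, sizeof(u));
  u += 0x7FFF + ((u >> 16) & 1);  // round to nearest even
  return static_cast<uint16_t>(u >> 16);
}

static inline float bf16_to_float(uint16_t h) {
  uint32_t u = static_cast<uint32_t>(h) << 16;
  float f;
  std::memcpy(&f, &u, sizeof(f));
  return f;
}

// One embedding row of the PARTIAL_ROWWISE_ADAM update: momentum1 is
// per-element, momentum2 is a single per-row scalar. The states are
// *stored* in BF16, but all arithmetic is carried out in FP32.
void partial_rowwise_adam_row(
    float* w, const float* g, uint16_t* momentum1, uint16_t* momentum2,
    int D, float lr, float beta1, float beta2, float eps, int iter) {
  // Row-wise second moment: EMA of the row's mean squared gradient.
  float g_sq_mean = 0.0f;
  for (int d = 0; d < D; ++d) g_sq_mean += g[d] * g[d];
  g_sq_mean /= static_cast<float>(D);

  float v = beta2 * bf16_to_float(*momentum2) + (1.0f - beta2) * g_sq_mean;
  *momentum2 = float_to_bf16(v);  // state written back in BF16
  const float v_hat = v / (1.0f - std::pow(beta2, static_cast<float>(iter)));
  const float denom = std::sqrt(v_hat) + eps;

  for (int d = 0; d < D; ++d) {
    float m = beta1 * bf16_to_float(momentum1[d]) + (1.0f - beta1) * g[d];
    momentum1[d] = float_to_bf16(m);  // state written back in BF16
    const float m_hat = m / (1.0f - std::pow(beta1, static_cast<float>(iter)));
    w[d] -= lr * m_hat / denom;
  }
}

int main() {
  float w[4] = {0.1f, -0.2f, 0.3f, -0.4f};
  float g[4] = {0.01f, 0.02f, -0.03f, 0.04f};
  uint16_t m1[4] = {0, 0, 0, 0};  // BF16 zeros
  uint16_t m2 = 0;                // per-row BF16 scalar
  partial_rowwise_adam_row(w, g, m1, &m2, 4, 0.01f, 0.9f, 0.999f, 1e-8f, 1);
  std::printf("w[0]=%f momentum2=%f\n", w[0], bf16_to_float(m2));
  return 0;
}
```

Storing the states in BF16 halves the optimizer-state memory relative to FP32, at the cost of the rounding applied on every state write.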

@facebook-github-bot (Contributor) commented:

This pull request was exported from Phabricator. Differential Revision: D56192737

netlify bot commented Apr 19, 2024

Deploy Preview for pytorch-fbgemm-docs ready!

🔨 Latest commit: ee316b8
🔍 Latest deploy log: https://app.netlify.com/sites/pytorch-fbgemm-docs/deploys/6622f4ccbbf9100008695182
😎 Deploy Preview: https://deploy-preview-2524--pytorch-fbgemm-docs.netlify.app

Summary:

This diff adds backend support for multiple optimizer state types in
TBE GPU training. Previously, TBE supported only FP32 optimizer
states.

The key changes for supporting multiple optimizer state types include:
- Adding the `ArgType.PLACEHOLDER_TENSOR` tensor type in the
  code generation script to indicate that a tensor can have multiple
  types.
- Updating the code generation scripts to generate appropriate template
  types for both GPU kernels and host functions.
- Modifying the TBE GPU kernel code templates to instantiate kernels
  for all supported types.
- Updating the TBE host code template to dispatch kernels based on the
  types of the optimizer states (see the sketch after this summary).

Note that this diff itself still supports only FP32 optimizer states;
support for additional optimizer state types is added in the
subsequent diffs.
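
The following standalone C++ sketch illustrates the instantiate-then-dispatch pattern described above: a kernel template is instantiated once per supported momentum type, and the host picks the instantiation that matches the runtime dtype. `StateDtype`, `adam_update_kernel`, and `dispatch_adam_update` are hypothetical names; the generated FBGEMM host code dispatches on the tensors' `at::ScalarType` through dispatch macros rather than a hand-written switch.

```cpp
#include <cstdint>
#include <cstdio>
#include <stdexcept>

// Hypothetical stand-ins for the runtime dtype tag and BF16 storage type.
enum class StateDtype { FP32, BF16 };
using bf16_storage = uint16_t;

// Kernel template parameterized on the momentum storage type. The code
// generator emits one instantiation per supported type.
template <typename momentum_t>
void adam_update_kernel(momentum_t* /*momentum1*/, int64_t /*numel*/) {
  std::printf("running kernel instantiated for a %zu-byte momentum type\n",
              sizeof(momentum_t));
}

// Host-side dispatch: select the kernel instantiation that matches the
// runtime dtype of the optimizer state, mirroring what the updated host
// template generates for a PLACEHOLDER_TENSOR argument.
void dispatch_adam_update(StateDtype dtype, void* momentum1, int64_t numel) {
  switch (dtype) {
    case StateDtype::FP32:
      adam_update_kernel(static_cast<float*>(momentum1), numel);
      break;
    case StateDtype::BF16:
      adam_update_kernel(static_cast<bf16_storage*>(momentum1), numel);
      break;
    default:
      throw std::runtime_error("unsupported optimizer state dtype");
  }
}

int main() {
  float m1_fp32[8] = {};
  bf16_storage m1_bf16[8] = {};
  dispatch_adam_update(StateDtype::FP32, m1_fp32, 8);
  dispatch_adam_update(StateDtype::BF16, m1_bf16, 8);
  return 0;
}
```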

Reviewed By: spcyppt

Differential Revision: D56192738

Summary:

This diff adds BF16 support for momentum1 and momentum2 in the
PARTIAL_ROWWISE_ADAM optimizer in TBE.

Backend: D56192739 (this diff), D56192738, D56192736
Frontend: D56192737

Reviewed By: spcyppt

Differential Revision: D56192739

Summary:

This diff adds BF16 support for momentum1 and momentum2 in the
PARTIAL_ROWWISE_ADAM optimizer in TBE.

Backend: D56192739, D56192738, D56192736
Frontend: D56192737 (this diff)

Differential Revision: D56192737
@facebook-github-bot (Contributor) commented:

This pull request was exported from Phabricator. Differential Revision: D56192737

@facebook-github-bot (Contributor) commented:

This pull request has been merged in 15b4fd3.
