
Conversation


@navsud navsud commented Sep 24, 2025

Summary: To save GPU memory, the bfloat16 dtype is commonly used for training LLMs. Currently, the quantizer skips nodes whose dtype is not float32. This change enables quantization of bf16 nodes as well.
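The change described above amounts to widening a dtype gate. A minimal sketch of the idea, with hypothetical names (this is not the actual ExecuTorch quantizer code):

```python
# Sketch of the dtype gate described in the PR summary; function and
# constant names are illustrative, not the real quantizer internals.
import torch

# Before this change, only float32 nodes were considered for quantization;
# after it, bfloat16 nodes are accepted as well.
QUANTIZABLE_DTYPES = (torch.float32, torch.bfloat16)

def is_quantizable(tensor: torch.Tensor) -> bool:
    """Return True if a node producing this tensor may be quantized."""
    return tensor.dtype in QUANTIZABLE_DTYPES

print(is_quantizable(torch.zeros(2, dtype=torch.bfloat16)))  # True
print(is_quantizable(torch.zeros(2, dtype=torch.int8)))      # False
```

With the old single-dtype check, models trained or exported in bf16 would silently fall through the quantizer unannotated; including `torch.bfloat16` in the accepted set lets those nodes be annotated and quantized too.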

Differential Revision: D82866443

@navsud navsud requested a review from cccclai as a code owner September 24, 2025 22:00

pytorch-bot bot commented Sep 24, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/14558

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

❌ 4 New Failures, 2 Cancelled Jobs

As of commit 9a886d2 with merge base c98079a:

NEW FAILURES - The following jobs have failed:

CANCELLED JOBS - The following jobs were cancelled. Please retry:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Sep 24, 2025
@facebook-github-bot
Contributor

@navsud has exported this pull request. If you are a Meta employee, you can view the originating diff in D82866443.

@navsud navsud added the release notes: none Do not include this in the release notes label Sep 24, 2025
navsud added a commit to navsud/executorch that referenced this pull request Sep 24, 2025
Summary:

To save GPU memory, the `bfloat16` dtype is commonly used for training LLMs. Currently, the quantizer skips nodes whose dtype is not float32. This change enables quantization of bf16 nodes as well.

Reviewed By: billmguo

Differential Revision: D82866443

@facebook-github-bot facebook-github-bot merged commit 2283294 into pytorch:main Sep 26, 2025
125 of 132 checks passed
