Conversation

liangel-02 (Contributor) commented Oct 9, 2025

Adding bf16 support for the `torch._fake_quantize_learnable_per_channel_affine()` op by relaxing the type check on `scale`.

TODO: add bf16 support to `per_tensor_affine_` as well, since `torch._fake_quantize_learnable_per_tensor_affine_backward` gets called in the backward pass.

**Test**
Modified unit test in `test_workflow_ops.py`
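
As a quick illustration of what the relaxed check is meant to allow (a minimal sketch, not the actual unit test; shapes, values, and the int8 quant range are assumptions), the op can now be called with a bf16 input and bf16 `scale`, while `zero_point` stays fp32. Only the forward call is shown here, since bf16 support for the backward pass landed in a follow-up:

```python
import torch

# Illustrative shapes/values only; quant_min/quant_max assume an int8 range.
x = torch.randn(2, 4, dtype=torch.bfloat16)
scale = torch.full((4,), 0.01, dtype=torch.bfloat16)  # bf16 scale now accepted
zero_point = torch.zeros(4, dtype=torch.float32)

# Positional args: input, scale, zero_point, axis, quant_min, quant_max, grad_factor
out = torch._fake_quantize_learnable_per_channel_affine(
    x, scale, zero_point, 1, -128, 127, 1.0
)
print(out.dtype)  # expected to match the input dtype (torch.bfloat16)
```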

pytorch-bot bot commented Oct 9, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/165098

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 9d89081 with merge base 34ac9b6:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

pytorch-bot bot added the release notes: quantization label Oct 9, 2025
liangel-02 marked this pull request as ready for review on October 9, 2025 at 20:16
liangel-02 requested a review from andrewor14 on October 9, 2025 at 20:16
meta-codesync bot commented Oct 9, 2025

@liangel-02 has imported this pull request. If you are a Meta employee, you can view this in D84286904.

liangel-02 force-pushed the bf16_support_per_channel branch from 98bdb26 to 366d198 on October 9, 2025 at 21:06
andrewor14 (Contributor) left a comment

Thanks, please add a TODO somewhere (PR description is fine) for fixing the per_tensor version

liangel-02 force-pushed the bf16_support_per_channel branch from 366d198 to 9d89081 on October 9, 2025 at 22:59
liangel-02 (Contributor, Author) commented

@pytorchbot merge

pytorch-bot bot added the ciflow/trunk label Oct 10, 2025
pytorchmergebot (Collaborator) commented

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

liangel-02 added a commit that referenced this pull request Oct 13, 2025

Follow-up to #165098: adding bf16 support for the backward pass. To avoid BC-breaking changes and precision loss, we upcast the parameters to fp32 after the op is called and downcast the gradients to bf16 before returning.

For testing, we upcast to fp32 before calling the reference function.
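
A hedged sketch of the upcast/downcast pattern this describes, not the actual PyTorch implementation: the forward runs on the bf16 tensors directly, while the backward upcasts the saved parameters to fp32, reuses the existing fp32 backward op, and downcasts the resulting gradients before returning. The call to `torch._fake_quantize_learnable_per_channel_affine_backward` and its argument order are assumptions inferred from the forward signature.

```python
import torch

class LearnableFakeQuantBf16Sketch(torch.autograd.Function):
    """Illustrative wrapper only; the private backward op's signature is assumed."""

    @staticmethod
    def forward(ctx, x, scale, zero_point, axis, quant_min, quant_max):
        ctx.save_for_backward(x, scale, zero_point)
        ctx.axis, ctx.quant_min, ctx.quant_max = axis, quant_min, quant_max
        return torch._fake_quantize_learnable_per_channel_affine(
            x, scale, zero_point, axis, quant_min, quant_max, 1.0
        )

    @staticmethod
    def backward(ctx, grad_out):
        x, scale, zero_point = ctx.saved_tensors
        # Upcast to fp32 so the existing fp32 backward kernel can be reused.
        dx, dscale, dzp = torch._fake_quantize_learnable_per_channel_affine_backward(
            grad_out.float(), x.float(), scale.float(), zero_point.float(),
            ctx.axis, ctx.quant_min, ctx.quant_max, 1.0,
        )
        # Downcast the gradients back to the callers' dtypes (bf16 for x/scale here).
        return (dx.to(x.dtype), dscale.to(scale.dtype), dzp.to(zero_point.dtype),
                None, None, None)
```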
pytorchmergebot pushed a commit that referenced this pull request Oct 14, 2025

Follow-up to #165098: adding bf16 support for the backward pass. To avoid BC-breaking changes and precision loss, we upcast the parameters to fp32 after the op is called and downcast the gradients to bf16 before returning.

For testing, we upcast to fp32 before calling the reference function. We increase the tolerance to 1e-2 for bf16 inputs because of a difference in casting behavior between Python's `x.to(torch.bfloat16)` and C++'s `x.to(at::kBFloat16)` (after comparing intermediate tensors, we found that the numerics diverge after the final cast). We don't explicitly cast in the C++ op but instead let autograd/the optimizer handle it.

Pull Request resolved: #165325
Approved by: https://github.com/andrewor14
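
A minimal sketch of the comparison pattern described above (not the actual test in `test_workflow_ops.py`; shapes, values, and the int8 range are placeholders): run the op on bf16 inputs, build a reference by upcasting the same inputs to fp32 first, and compare with the looser 1e-2 tolerance.

```python
import torch

x = torch.randn(2, 4, dtype=torch.bfloat16)
scale = torch.full((4,), 0.01, dtype=torch.bfloat16)
zero_point = torch.zeros(4)  # already fp32

out_bf16 = torch._fake_quantize_learnable_per_channel_affine(
    x, scale, zero_point, 1, -128, 127, 1.0
)
# Reference path: upcast to fp32 before calling the op.
out_ref = torch._fake_quantize_learnable_per_channel_affine(
    x.float(), scale.float(), zero_point, 1, -128, 127, 1.0
)
# Looser tolerance for bf16, mirroring the 1e-2 mentioned above.
torch.testing.assert_close(out_bf16.float(), out_ref, atol=1e-2, rtol=1e-2)
```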
Chao1Han pushed a commit to Chao1Han/pytorch that referenced this pull request Oct 21, 2025
…165098)

Pull Request resolved: pytorch#165098
Approved by: https://github.com/jerryzh168, https://github.com/andrewor14
Chao1Han pushed a commit to Chao1Han/pytorch that referenced this pull request Oct 21, 2025

Pull Request resolved: pytorch#165325
Approved by: https://github.com/andrewor14