pad dequantized paged fp8 kv with zeros #4780
Conversation
✅ Deploy Preview for pytorch-fbgemm-docs ready!
This pull request was exported from Phabricator. Differential Revision: D80977902
Summary: X-link: facebookresearch/FBGEMM#1803

Pad with zeros after the end of each used sequence to avoid NaNs in Flash Attention 3 when dequantizing the fp8 paged KV cache. This is analogous to the non-paged case, which was tackled in D69522001.

Differential Revision: D80977902
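The idea can be illustrated with a small sketch (not the PR's actual kernel, which operates on paged fp8 blocks): after dequantization, positions past each sequence's used length may hold garbage bytes that decode to NaN/Inf, so they are overwritten with zeros before attention reads them. The function name and tensor layout below are hypothetical.

```python
import torch

def pad_dequantized_kv(kv: torch.Tensor, seq_lens: torch.Tensor) -> torch.Tensor:
    """Zero out dequantized KV entries past each sequence's used length.

    kv:       [batch, max_seq_len, num_heads, head_dim] dequantized values;
              slots beyond seq_lens[b] may contain NaN/garbage.
    seq_lens: [batch] number of valid tokens per sequence.
    """
    max_seq_len = kv.shape[1]
    positions = torch.arange(max_seq_len, device=kv.device)      # [max_seq_len]
    valid = positions[None, :] < seq_lens[:, None]               # [batch, max_seq_len]
    # masked_fill (rather than multiplying by a 0/1 mask) is essential:
    # NaN * 0 is still NaN, but masked_fill overwrites unconditionally.
    return kv.masked_fill(~valid[:, :, None, None], 0.0)

# Two sequences of lengths 3 and 2 in a max_seq_len=4 cache;
# unused slots simulate garbage that dequantizes to NaN.
kv = torch.full((2, 4, 1, 2), float("nan"))
kv[0, :3] = 1.0
kv[1, :2] = 2.0
out = pad_dequantized_kv(kv, torch.tensor([3, 2]))
```

After padding, `out` contains no NaNs, so a subsequent softmax over attention scores cannot be poisoned by the unused tail of the cache.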
Force-pushed 5a83a60 to c353791
Force-pushed c353791 to b5046be
Force-pushed b5046be to 6b58c11
Force-pushed 6b58c11 to 3cb21bd
This pull request has been merged in 699954b.