
Conversation

@danielvegamyhre (Contributor) commented Sep 11, 2025

Fixes #2932

I think we need to explicitly add the nvcc flags for sm100a in the extension build itself. Right now we check whether the build_for_sm100a flag is true (it is set to true for CUDA 12.8+), but we never actually modify the nvcc args passed in to build the extension.
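The fix described above has roughly the following shape. This is a minimal sketch, not the exact diff: the flag name follows the description, while the source path and compile args are illustrative assumptions.

```python
from torch.utils.cpp_extension import CUDAExtension

# Assumed to be computed elsewhere in setup.py
# (true for CUDA toolkit 12.8+, per the description above).
build_for_sm100a = True

nvcc_args = ["-O3"]
if build_for_sm100a:
    # The missing piece: actually emit device code for sm100a (Blackwell).
    # Checking the flag alone does nothing unless it changes the nvcc args.
    nvcc_args.append("-gencode=arch=compute_100a,code=sm_100a")

mxfp8_ext = CUDAExtension(
    name="torchao.prototype.mxfp8_cuda",
    sources=["torchao/csrc/cuda/mx_kernels/mxfp8_quantize.cu"],  # illustrative path
    extra_compile_args={"cxx": ["-O3"], "nvcc": nvcc_args},
)
```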

Seems like building from source is accidentally working? Looking into this.

Test plan

  • The CI job building for CUDA 12.8 DOES contain "building 'torchao.prototype.mxfp8_cuda' extension" logs, but does NOT contain the warning "MXFP8 quantization requires SM90+ (Hopper) or SM100+ (Blackwell) architecture. Kernel will be disabled for this architecture.": https://github.com/pytorch/ao/actions/runs/17631942858/job/50100852741
    • (Previously, without this fix, the CI build logs indicated the extension was being built, but also contained the warning that the CUDA arch was not supported, so the kernel would not be built.)

pytorch-bot (bot) commented Sep 11, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2979

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 8514b11 with merge base 83e8e60:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

meta-cla bot added the CLA Signed label Sep 11, 2025
@danielvegamyhre added the ci, mx, and topic: bug fix labels Sep 11, 2025
@danielvegamyhre changed the title from "Add nvcc flags for building MXFP8 dim1 cast kernel for sm100a on CPU-only build runners" to "Add nvcc flags to explicitly build mxfp8 dim1 cast kernel for sm100a" Sep 11, 2025
@drisspg (Contributor) commented Sep 11, 2025

We have separate 100a modules if you look lower in the file.

@danielvegamyhre (Contributor, Author) commented Sep 11, 2025

> We have separate 100a modules if you look lower in the file.

Are you referring to this?

ao/setup.py, line 706 (at cc35151):

# Only build the cutlass_100a extension if sm100a is in the architecture flags
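For context, a minimal sketch of the guard that comment describes, with illustrative names (the real setup.py derives its architecture flags from the CUDA toolkit version and environment, not a hardcoded list):

```python
# Illustrative sketch only; variable names are assumptions, not the exact
# code at setup.py:706.
cuda_arch_flags = [
    "-gencode=arch=compute_90a,code=sm_90a",
    "-gencode=arch=compute_100a,code=sm_100a",
]

# Only build the cutlass_100a extension if sm100a is in the architecture flags.
if any("sm_100a" in flag for flag in cuda_arch_flags):
    # ...define the separate *_cutlass_100a CUDAExtension here...
    pass
```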

@danielvegamyhre (Contributor, Author) commented Sep 11, 2025

Confirmed the build succeeds and pytest test/prototype/mx_formats/test_mx_linear.py -k test_linear_eager_vs_hp still passes.

@danielvegamyhre merged commit f1e118b into main on Sep 11, 2025; all 34 checks passed.

Labels

ci · CLA Signed · mx · topic: bug fix


Development

Successfully merging this pull request may close these issues.

importing torchao.prototype.mxfp8_cuda extension does not work from torchao wheels built by CI
