@danielvegamyhre danielvegamyhre commented Sep 11, 2025

Summary

This PR integrates the recently landed grouped GEMMs and Triton kernels for per-group scale conversions into mxfp8 MoE training.

Test plan

  • pytest test/prototype/moe_training/test_scaled_grouped_mm.py -k test_mxfp8_grouped_gemm_with_dq_fwd_bwd -s

Next steps

  • Make compatible with torch.compile using custom ops
  • Microbenchmarks
  • torchtitan integration -> e2e benchmarks

pytorch-bot bot commented Sep 11, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2977

@danielvegamyhre danielvegamyhre marked this pull request as draft September 11, 2025 00:28
@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Sep 11, 2025
@danielvegamyhre danielvegamyhre added mx topic: not user facing Use this tag if you don't want this PR to show up in release notes moe labels Sep 11, 2025
@danielvegamyhre danielvegamyhre force-pushed the dvm-integrate branch 2 times, most recently from 7c8e965 to ecde9da Compare September 11, 2025 04:54
@danielvegamyhre danielvegamyhre marked this pull request as ready for review September 11, 2025 04:55
Returns:
- starting_row_after_padding: 1D integer tensor representing the starting row after padding each to blocked format.
- starting_col_after_padding: 1D integer tensor representing the starting row after padding each to blocked format.

nit row -> col
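
For readers without the diff open, here is a minimal sketch of what a "starting row after padding" offset could look like. It assumes each group's rows are padded up to a fixed multiple before conversion to blocked format. The helper name `padded_group_starts` and the default multiple of 128 are illustrative assumptions, not the code or constants from this PR.

```python
def padded_group_starts(group_sizes, multiple=128):
    """Return (starts, total) for groups padded up to `multiple` rows each.

    Hypothetical helper for illustration only; `multiple=128` is an assumed
    padding granularity, not a value taken from the PR.
    """
    starts = []
    offset = 0
    for size in group_sizes:
        # Each group begins where the previous padded group ended.
        starts.append(offset)
        # Round the group size up to the next multiple (ceiling division).
        offset += -(-size // multiple) * multiple
    return starts, offset


if __name__ == "__main__":
    starts, total = padded_group_starts([100, 128, 300])
    print(starts, total)
```

The same calculation applied along the other dimension would give the column offsets, which is why the second docstring line should say "col".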

out_dtype=out_dtype,
)

# Store what we need for backward before returning.

nit: not needed comment

@danielvegamyhre danielvegamyhre changed the base branch from contraction to main September 11, 2025 17:11
@danielvegamyhre danielvegamyhre changed the base branch from main to contraction September 11, 2025 17:11
@danielvegamyhre danielvegamyhre changed the base branch from contraction to main September 11, 2025 17:13
@danielvegamyhre danielvegamyhre merged commit 66384a9 into main Sep 11, 2025
3 checks passed