Conversation

@danielvegamyhre (Contributor)

Tests

  • pytest test/prototype/moe_training/test_scaled_grouped_mm.py -k mxfp8
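
For context, here is a rough sketch of the op the command above exercises, built from the shapes and recipe names in the benchmark output below. The import paths and the `_scaled_grouped_mm` signature are assumptions inferred from the test path and the `MoEScalingType.MXFP8` recipe column, not verified torchao API:

```python
import torch

# Assumed import locations -- the real modules under
# torchao.prototype.moe_training may be organized differently.
from torchao.prototype.moe_training.scaled_grouped_mm import _scaled_grouped_mm
from torchao.prototype.moe_training.conversion_utils import MoEScalingType

M, N, K, G = 16384, 8192, 5120, 4  # one of the benchmarked shapes

# A stacks all tokens across experts; B holds one (K, N) weight per expert group.
A = torch.randn(M, K, device="cuda", dtype=torch.bfloat16, requires_grad=True)
B = torch.randn(G, K, N, device="cuda", dtype=torch.bfloat16, requires_grad=True)

# offs marks the end row of each expert's token slice in A (equal splits here).
offs = torch.arange(M // G, M + 1, M // G, device="cuda", dtype=torch.int32)

out = _scaled_grouped_mm(A, B, offs=offs, scaling_type=MoEScalingType.MXFP8)
out.sum().backward()  # fwd + bwd, as measured in the *_fwd_bwd_us columns
```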

Benchmarks

All timings are in microseconds; M, N, K are the grouped GEMM problem dimensions and G is the number of expert groups.

M,N,K,G                  recipe                  bf16_fwd_bwd_us    scaled_fwd_bwd_us  scaled_fwd_bwd_speedup      bf16_fwd_us    scaled_fwd_us  scaled_fwd_speedup
-----------------------  --------------------  -----------------  -------------------  ------------------------  -------------  ---------------  --------------------
(16384, 8192, 5120, 1)   MoEScalingType.MXFP8           4081.7                3072.98  1.328x                         1264.82           801.792  1.577x
(16384, 8192, 5120, 2)   MoEScalingType.MXFP8           4086.75               3057.6   1.337x                         1169.42           893.952  1.308x
(16384, 8192, 5120, 4)   MoEScalingType.MXFP8           3984.38               3313.82  1.202x                         1119.22           969.76   1.154x
(16384, 8192, 5120, 8)   MoEScalingType.MXFP8           4064.19               3644.94  1.115x                         1160.26          1063.97   1.09x
(128000, 8192, 5120, 1)  MoEScalingType.MXFP8          51744.9               24832.5   2.084x                        15718.4           6212.45   2.53x
(128000, 8192, 5120, 2)  MoEScalingType.MXFP8          45454.4               23393.4   1.943x                        13415.9           6539.84   2.051x
(128000, 8192, 5120, 4)  MoEScalingType.MXFP8          63509.1               24352.3   2.608x                        12699.8           6603.71   1.923x
(128000, 8192, 5120, 8)  MoEScalingType.MXFP8          55598.1               25940     2.143x                        14660             6517.92   2.249x
(16384, 1536, 5120, 1)   MoEScalingType.MXFP8            808                  1061.9   0.761x                          245.472          300      0.818x
(16384, 1536, 5120, 2)   MoEScalingType.MXFP8            869.312              1035.58  0.839x                          226.304          298.048  0.759x
(16384, 1536, 5120, 4)   MoEScalingType.MXFP8            840.256              1022.94  0.821x                          220.224          306.368  0.719x
(16384, 1536, 5120, 8)   MoEScalingType.MXFP8            824.384              1055.25  0.781x                          222.4            332.816  0.668x
(128000, 1536, 5120, 1)  MoEScalingType.MXFP8           7127.15               7782.99  0.916x                         2089.04          2149.41   0.972x
(128000, 1536, 5120, 2)  MoEScalingType.MXFP8           7328.54               7860.22  0.932x                         1995.23          2212.77   0.902x
(128000, 1536, 5120, 4)  MoEScalingType.MXFP8           7060.54               7033.47  1.004x                         2050.53          1990.66   1.03x
(128000, 1536, 5120, 8)  MoEScalingType.MXFP8           6760.05               7056.42  0.958x                         1964.16          1849.44   1.062x
(16384, 2048, 7168, 1)   MoEScalingType.MXFP8           1471.04               1505.25  0.977x                          451.552          447.68   1.009x
(16384, 2048, 7168, 2)   MoEScalingType.MXFP8           1498.1                1701.9   0.88x                           465.92           459.936  1.013x
(16384, 2048, 7168, 4)   MoEScalingType.MXFP8           1583.15               1616.03  0.98x                           418.88           470.208  0.891x
(16384, 2048, 7168, 8)   MoEScalingType.MXFP8           1910.62               1986.62  0.962x                          433.088          521.056  0.831x
(128000, 2048, 7168, 1)  MoEScalingType.MXFP8          15687.9               11604.4   1.352x                         4461.5           3281.52   1.36x
(128000, 2048, 7168, 2)  MoEScalingType.MXFP8          12641.2               11413.1   1.108x                         3854.3           3110.98   1.239x
(128000, 2048, 7168, 4)  MoEScalingType.MXFP8          13843.6               10947.4   1.265x                         3650.66          3062.98   1.192x
(128000, 2048, 7168, 8)  MoEScalingType.MXFP8          12553.2                9969.89  1.259x                         3817.5           3178.42   1.201x
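
For orientation, the speedup columns are plain ratios of the bf16 timing to the scaled timing; e.g. for the (128000, 8192, 5120, 1) row:

```python
bf16_fwd_bwd_us = 51744.9
mxfp8_fwd_bwd_us = 24832.5
print(f"{bf16_fwd_bwd_us / mxfp8_fwd_bwd_us:.3f}x")  # 2.084x, matching the table
```

The pattern is the usual one for low-precision GEMMs: the large-N (8192) shapes gain up to ~2.6x fwd+bwd, while the small-N (1536) shapes regress below 1.0x, presumably because the quantization overhead is not amortized over enough compute.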

@pytorch-bot (bot) commented Oct 31, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/3271

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures

As of commit 14df3f3 with merge base 1e473ed:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

meta-cla bot added the CLA Signed label Oct 31, 2025
danielvegamyhre changed the title to "[mxfp8 moe training] make scaling mode configurable and make rceil default" Oct 31, 2025
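
For readers unfamiliar with the rceil mode named in the new title: it is the rounding mode used when computing the per-block E8M0 scale in MXFP8. Rounding the scale exponent up (toward +inf) guarantees that every element of a block fits in fp8 e4m3 after scaling, whereas floor rounding can leave the block amax above the fp8 max. A rough sketch of the idea, not torchao's actual implementation (block size, constants, and clamping are assumptions):

```python
import torch

FP8_E4M3_MAX = 448.0  # largest magnitude representable in float8_e4m3fn
BLOCK = 32            # MX block size along the reduction dimension

def mxfp8_scale_rceil(x: torch.Tensor) -> torch.Tensor:
    """Per-block power-of-two (E8M0) scales with round-up ("rceil") exponents.

    Assumes x.numel() is a multiple of BLOCK. Ceiling the exponent keeps
    amax / scale <= FP8_E4M3_MAX, so the fp8 cast never overflows.
    """
    amax = x.abs().reshape(-1, BLOCK).amax(dim=1)
    exp = torch.ceil(torch.log2(amax / FP8_E4M3_MAX))
    exp = exp.clamp(min=-127.0, max=127.0)  # E8M0 exponent range
    return torch.exp2(exp)
```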
danielvegamyhre added the topic: not user facing, mx, and moe labels Oct 31, 2025
danielvegamyhre merged commit f657903 into main Nov 3, 2025 with 19 of 21 checks passed