[Common, pyTorch] Grouped MXFP8 dequantize support by ptrendx · Pull Request #2722 · NVIDIA/TransformerEngine

ptrendx · 2026-03-02T19:13:32Z

Description

Adds dequantization support for grouped MXFP8 tensors.

Type of change

Documentation change (change only to the documentation, either a fix or a new content)
Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Infra/Build change
Code refactoring

Changes

Please list the changes introduced in this PR:

Grouped dequantization kernel for MXFP8
Exposed the functionality in PyTorch
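
For reviewers less familiar with the format, the following is a minimal pure-Python reference sketch of what dequantizing a group of MXFP8 tensors computes. It is not the kernel or the PyTorch API added in this PR. It assumes E4M3 elements, one E8M0 scale per block of 32 consecutive elements, and a flat layout per group member; the function names (`decode_e4m3`, `decode_e8m0`, `dequantize_group`, `grouped_dequantize`) are hypothetical and chosen only for illustration.

```python
import math

BLOCK = 32  # number of elements sharing one scale in the MX formats


def decode_e4m3(byte):
    """Decode one FP8 E4M3 byte (1 sign, 4 exponent, 3 mantissa bits, bias 7)."""
    sign = -1.0 if byte & 0x80 else 1.0
    exp = (byte >> 3) & 0xF
    mant = byte & 0x7
    if exp == 0xF and mant == 0x7:
        return math.nan  # E4M3 has no infinities; this pattern encodes NaN
    if exp == 0:
        return sign * (mant / 8.0) * 2.0 ** -6  # subnormal range
    return sign * (1.0 + mant / 8.0) * 2.0 ** (exp - 7)


def decode_e8m0(byte):
    """Decode one E8M0 scale byte: a pure power of two, 2**(byte - 127)."""
    if byte == 0xFF:
        return math.nan
    return 2.0 ** (byte - 127)


def dequantize_group(data, scales):
    """Dequantize one tensor: element i uses the scale of block i // BLOCK.

    Assumption for this sketch: a trailing partial block (fewer than 32
    elements) still has its own scale entry.
    """
    return [decode_e4m3(b) * decode_e8m0(scales[i // BLOCK])
            for i, b in enumerate(data)]


def grouped_dequantize(groups):
    """Dequantize every (data, scales) pair of a grouped tensor in turn."""
    return [dequantize_group(data, scales) for data, scales in groups]


# Two group members of different sizes, each with its own scales.
groups = [
    ([0x38] * 32 + [0x40] * 8, [128, 126]),  # 1.0 * 2, then 2.0 * 0.5
    ([0xC0] * 4, [127]),                     # -2.0 * 1
]
result = grouped_dequantize(groups)
```

A grouped kernel would process all members in a single launch instead of looping over them as above; the loop here only pins down the expected numerical result per member.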

Checklist:

I have read and followed the contributing guidelines
The functionality is complete
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes

Signed-off-by: Przemek Tredak <ptredak@nvidia.com>

ptrendx added 4 commits February 26, 2026 23:19

Grouped dequantize for MXFP8

95995ef

Signed-off-by: Przemek Tredak <ptredak@nvidia.com>

Merge branch 'main' into pr_grouped_dequantize

884a312

Pytorch extension

c56cbf6

Signed-off-by: Przemek Tredak <ptredak@nvidia.com>

Fix CUDA graphs compatibility

f502a24

Signed-off-by: Przemek Tredak <ptredak@nvidia.com>

ptrendx requested a review from Oleg-Goncharov March 2, 2026 19:13

pre-commit-ci bot and others added 3 commits March 2, 2026 19:19

[pre-commit.ci] auto fixes from pre-commit.com hooks

cafb9d5

for more information, see https://pre-commit.ci

Handling non-full tiles

de132e1

Signed-off-by: Przemek Tredak <ptredak@nvidia.com>

[pre-commit.ci] auto fixes from pre-commit.com hooks

3be5fcc

for more information, see https://pre-commit.ci

ptrendx linked an issue Mar 2, 2026 that may be closed by this pull request

Dequantization support for the grouped tensor - MXFP8 #2725

Open

Fix

7fafb6d

Signed-off-by: Przemek Tredak <ptredak@nvidia.com>

ptrendx wants to merge 8 commits into NVIDIA:main from ptrendx:pr_grouped_dequantize
