[JAX] Refactor + MXFP8 + GroupedGEMM #1627

Merged
phu0ngng merged 14 commits into NVIDIA:main from phu0ngng:branch_for_25.04
Apr 1, 2025

Conversation

@phu0ngng (Collaborator) commented Mar 31, 2025

Description

  • Introduced the ScaledTensor and Quantizer classes, plus code refactoring
  • Added MXFP8 support
  • Removed the old non-FFI custom calls
  • Added GroupedGEMM
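
For readers unfamiliar with MXFP8, the following is a minimal JAX sketch of block-scaled FP8 quantization. It assumes the usual MX layout: blocks of 32 elements along the last axis, each sharing one power-of-two (E8M0-style) scale, with elements stored as FP8 E4M3. The names `mxfp8_quantize` and `mxfp8_dequantize` are hypothetical and are not the ScaledTensor or Quantizer API added by this PR. The scale rounding rule (round the exponent up, then clip) is also an assumption made for illustration.

```python
import jax
import jax.numpy as jnp

BLOCK = 32        # elements sharing one scale (assumed MX block size)
E4M3_MAX = 448.0  # largest finite float8_e4m3fn value


def mxfp8_quantize(x):
    """Hypothetical reference: returns (fp8 data, per-block float32 scales)."""
    rows, cols = x.shape
    assert cols % BLOCK == 0, "last axis must be a multiple of the block size"
    blocks = x.astype(jnp.float32).reshape(rows, cols // BLOCK, BLOCK)
    amax = jnp.max(jnp.abs(blocks), axis=-1, keepdims=True)
    # Exponent of a power-of-two scale that maps amax into the E4M3 range.
    # The floor on amax keeps all-zero blocks from producing log2(0).
    exp = jnp.ceil(jnp.log2(jnp.maximum(amax, 1e-30) / E4M3_MAX)).astype(jnp.int32)
    # Build 2**exp exactly by writing the biased exponent into float32 bits;
    # the biased exponent (exp + 127) is what an E8M0 byte would store.
    scale = jax.lax.bitcast_convert_type((exp + 127) << 23, jnp.float32)
    # Clip guards against log2 rounding pushing a value just past E4M3_MAX.
    scaled = jnp.clip(blocks / scale, -E4M3_MAX, E4M3_MAX)
    data = scaled.astype(jnp.float8_e4m3fn).reshape(rows, cols)
    return data, scale.squeeze(-1)


def mxfp8_dequantize(data, scale):
    """Inverse of the sketch above: multiply each block by its scale."""
    rows, cols = data.shape
    blocks = data.astype(jnp.float32).reshape(rows, cols // BLOCK, BLOCK)
    return (blocks * scale[..., None]).reshape(rows, cols)
```

Calling `mxfp8_quantize` on a `(4, 64)` array returns FP8 data of the same shape and a `(4, 2)` array of scales, one per 32-element block. Keeping the scale a power of two means that applying or removing it only shifts exponents, so the rounding error comes from the cast to E4M3.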

Type of change

  • Documentation change (change only to the documentation, either a fix or a new content)
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Infra/Build change
  • Code refactoring

Checklist:

  • I have read and followed the contributing guidelines
  • The functionality is complete
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Signed-off-by: Phuong Nguyen <phuonguyen@nvidia.com>
Co-authored-by: Jeremy Berchtold <jberchtold@nvidia.com>
@phu0ngng (Collaborator, Author)

/te-ci jax L1

phu0ngng and others added 2 commits March 31, 2025 13:31
Signed-off-by: Phuong Nguyen <phuonguyen@nvidia.com>
Signed-off-by: Hua Huang <huah@nvidia.com>
Signed-off-by: Phuong Nguyen <phuonguyen@nvidia.com>
@phu0ngng (Collaborator, Author)

/te-ci jax L1

phu0ngng and others added 3 commits March 31, 2025 18:19
Signed-off-by: Phuong Nguyen <phuonguyen@nvidia.com>
Signed-off-by: Phuong Nguyen <phuonguyen@nvidia.com>
@phu0ngng (Collaborator, Author) commented Apr 1, 2025

/te-ci jax L1

Signed-off-by: Phuong Nguyen <phuonguyen@nvidia.com>
@phu0ngng phu0ngng merged commit cf9a7c2 into NVIDIA:main Apr 1, 2025
11 checks passed
@phu0ngng phu0ngng deleted the branch_for_25.04 branch April 1, 2025 02:49
KshitijLakhani pushed a commit that referenced this pull request Apr 1, 2025
* refactor + mxfp8

* added grouped gemm

* rename linear to dense

* added cublas init phase for groupedGemm

* relax the tol of test encoder multiprocessing mxfp8 by 0.001

Signed-off-by: Phuong Nguyen <phuonguyen@nvidia.com>

---------

Signed-off-by: Phuong Nguyen <phuonguyen@nvidia.com>
Co-authored-by: Hua Huang <huah@nvidia.com>
Co-authored-by: Jeremy Berchtold <jberchtold@nvidia.com>
KshitijLakhani added a commit that referenced this pull request Apr 1, 2025
@phu0ngng phu0ngng mentioned this pull request Apr 1, 2025
13 tasks
lhb8125 pushed a commit to lhb8125/TransformerEngine that referenced this pull request Apr 8, 2025
wdykas pushed a commit to wdykas/TransformerEngine that referenced this pull request Apr 14, 2025
Signed-off-by: Peter Dykas <wdykas@nvidia.com>
@phu0ngng phu0ngng mentioned this pull request Apr 17, 2025
13 tasks