-
Notifications
You must be signed in to change notification settings - Fork 110
Add TE's NVFP4 recipe to the test suite #2612
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
beverlylytle
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds NVFP4 recipe testing support to Thunder's transformer engine test suite. The implementation includes proper version checking and hardware capability validation to ensure the NVFP4 recipe is only used when supported.
- Adds NVFP4BlockScaling recipe to the test parameter list with disabled randomness for deterministic results
- Implements version and hardware capability checks for NVFP4 (requires TE v2.8+ and compute capability >= 10.0)
- Updates skip conditions across all test functions to handle NVFP4 recipe properly
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| thunder/tests/test_transformer_engine_executor.py | Adds NVFP4 recipe testing with proper availability checks and skip conditions |
| thunder/executors/transformer_engineex_impl.py | Updates GitHub URL in comment to point to official NVIDIA repository |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
kshitij12345
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, @riccardofelluga
Could you please confirm and document in the PR description that you have tested this on both supported and unsupported hardware? This is particularly important since nvFP4 will not run on GitHub CI.
Additionally, I'm wondering if you collected any performance benchmarks for this change on a representative model?
|
Thanks @kshitij12345!
Yess it has been tested on H100 and B200.
I haven't tested perf yet but it would be nice to add nvfp4 recipe at least to |
t-vi
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @riccardofelluga @beverlylytle @kshitij12345
What does this PR do?
Now that the PR was merged in TE we can enable NVFP4BlockScaling recipe testing in Thunder. This PR does just that by adding the recipe to the list. Note that in this case nvfp4 recipe is used for both forward and backward as it supports it.
This recipe is available only from TE v2.8 and for compute capability >= 10.0 . Therefore I've added two variables that do the both of the checks. Availability is important because for TE < 2.8 the
recipe.nvfp4()method is not available.I've also disabled all randomness from the recipe otherwise the results wouldn't be deterministic/precise.