Skip to content

Conversation

@riccardofelluga
Copy link
Collaborator

@riccardofelluga riccardofelluga commented Oct 7, 2025

What does this PR do?

Now that the PR was merged in TE we can enable NVFP4BlockScaling recipe testing in Thunder. This PR does just that by adding the recipe to the list. Note that in this case nvfp4 recipe is used for both forward and backward as it supports it.

This recipe is available only from TE v2.8 and for compute capability >= 10.0 . Therefore I've added two variables that do the both of the checks. Availability is important because for TE < 2.8 the recipe.nvfp4() method is not available.

I've also disabled all randomness from the recipe otherwise the results wouldn't be deterministic/precise.

@riccardofelluga riccardofelluga marked this pull request as ready for review October 7, 2025 16:11
Copy link
Collaborator

@beverlylytle beverlylytle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@crcrpar crcrpar requested a review from Copilot October 8, 2025 08:03
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds NVFP4 recipe testing support to Thunder's transformer engine test suite. The implementation includes proper version checking and hardware capability validation to ensure the NVFP4 recipe is only used when supported.

  • Adds NVFP4BlockScaling recipe to the test parameter list with disabled randomness for deterministic results
  • Implements version and hardware capability checks for NVFP4 (requires TE v2.8+ and compute capability >= 10.0)
  • Updates skip conditions across all test functions to handle NVFP4 recipe properly

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
thunder/tests/test_transformer_engine_executor.py Adds NVFP4 recipe testing with proper availability checks and skip conditions
thunder/executors/transformer_engineex_impl.py Updates GitHub URL in comment to point to official NVIDIA repository

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Copy link
Collaborator

@kshitij12345 kshitij12345 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, @riccardofelluga

Could you please confirm and document in the PR description that you have tested this on both supported and unsupported hardware? This is particularly important since nvFP4 will not run on GitHub CI.

Additionally, I'm wondering if you collected any performance benchmarks for this change on a representative model?

@riccardofelluga
Copy link
Collaborator Author

Thanks @kshitij12345!

Could you please confirm and document in the PR description that you have tested this on both supported and unsupported hardware? This is particularly important since nvFP4 will not run on GitHub CI.

Yess it has been tested on H100 and B200.

Additionally, I'm wondering if you collected any performance benchmarks for this change on a representative model?

I haven't tested perf yet but it would be nice to add nvfp4 recipe at least to benchmark_ligpt.py and/or benchmark_inference.py which could be a follow-up PR

Copy link
Collaborator

@t-vi t-vi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@t-vi t-vi merged commit bfcdd75 into main Oct 13, 2025
48 of 51 checks passed
@t-vi t-vi deleted the te-nvfp4-tests branch October 13, 2025 09:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants