[https://nvbugs/6211185][fix] Fix failed GSM8K accuracy tests for LagunaXS on B200/GB200/B300 by DomBrown · Pull Request #14580 · NVIDIA/TensorRT-LLM

DomBrown · 2026-05-26T15:02:31Z

Summary by CodeRabbit

Bug Fixes
- Optimized RoPE fusion performance for B200-class GPUs.
Tests
- Re-enabled Laguna model tests on B200 hardware previously marked as waived.

Description

Temporary workaround: for Laguna, Hopper fails unless RoPE is unfused, while Blackwell has issues when RoPE is unfused.
This PR unblocks Blackwell on main for Poolside while we find proper fixes.

Also adds B200 coverage to prevent regressions. Additional B300 coverage is unnecessary because the failure mode is the same. RTX Pro is unaffected.

Test Coverage

PR Checklist

Please review the following before submitting your PR:

- PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
- PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
- Test cases are provided for new code paths (see test instructions)
- If PR introduces API changes, an appropriate PR label is added - either api-compatible or api-breaking. For api-breaking, include BREAKING in the PR title.
- Any new dependencies have been scanned for license and vulnerabilities
- CODEOWNERS updated if ownership changes
- Documentation updated as needed
- Update tava architecture diagram if there is a significant design change in PR.
- The reviewers assigned automatically/manually are appropriate for the PR.
- Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

Signed-off-by: Dom Brown <3886319+DomBrown@users.noreply.github.com>

DomBrown · 2026-05-26T15:03:53Z

/bot run --disable-fail-fast

coderabbitai · 2026-05-26T15:09:27Z

Walkthrough

This PR adds hardware-specific RoPE fusion detection for Laguna attention on Blackwell GPUs (SM 100/103) and updates test configurations to validate the change. The implementation replaces a hardcoded rope_fusion=False with dynamic detection based on SM version, and removes test waivers to enable validation.
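
The selection logic described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in the PR: `select_rope_fusion` and `BLACKWELL_SM_VERSIONS` are hypothetical names, the SM version is passed in as a plain integer instead of being read through `get_sm_version`, and the sketch assumes that the computed flag is `True` on SM 100/103 and keeps the previous `False` everywhere else.

```python
# Hypothetical sketch of SM-version-based RoPE fusion selection.
# Assumption: fused RoPE is enabled only on the SM versions named in the
# walkthrough (100 and 103); every other architecture keeps the previously
# hardcoded rope_fusion=False.
BLACKWELL_SM_VERSIONS = frozenset({100, 103})


def select_rope_fusion(sm_version: int) -> bool:
    """Return the rope_fusion flag to pass to the attention constructor."""
    return sm_version in BLACKWELL_SM_VERSIONS


if __name__ == "__main__":
    # 90 is Hopper; 120 is assumed here to correspond to RTX Pro parts.
    for sm in (90, 100, 103, 120):
        print(f"SM {sm}: rope_fusion={select_rope_fusion(sm)}")
```

In the real model file the integer would presumably come from `get_sm_version()`, and the result would replace the hardcoded `False` passed to the base attention constructor.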

Changes

RoPE Fusion Hardware Workaround and Test Updates

| Layer / File(s) | Summary |
| --- | --- |
| RoPE Fusion Hardware Workaround Implementation: `tensorrt_llm/_torch/models/modeling_laguna.py` | Imported `get_sm_version`, computed `rope_fusion` dynamically for SM 100/103 with an inline workaround comment, and passed the computed flag to the base attention constructor instead of hardcoding `False`. |
| Test List and Waiver Updates: `tests/integration/test_lists/test-db/l0_b200.yml`, `tests/integration/test_lists/waives.txt` | Added `TestLagunaXS::test_nvfp4` to the B200 pre-merge PyTorch test selection and removed skip waivers for `test_bf16`, `test_fp8`, and `test_nvfp4` to validate the RoPE fusion change. |
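
To illustrate the waiver removal, here is a small sketch of filtering entries out of a waiver list. It assumes each waiver line starts with the test ID followed by whitespace and a skip marker; that layout, the helper name `drop_waivers`, and the sample lines are assumptions for illustration only, and the `test_bf16` and `test_fp8` IDs are inferred from the `test_nvfp4` path quoted elsewhere in this review.

```python
from typing import Iterable, List, Set

# Assumed common prefix for the three LagunaXS accuracy tests.
PREFIX = "accuracy/test_llm_api_pytorch.py::TestLagunaXS::"
UNWAIVED: Set[str] = {PREFIX + name for name in ("test_bf16", "test_fp8", "test_nvfp4")}


def drop_waivers(lines: Iterable[str], unwaived: Set[str]) -> List[str]:
    """Keep every line whose first token is not one of the un-waived test IDs."""
    kept = []
    for line in lines:
        tokens = line.split()
        if tokens and tokens[0] in unwaived:
            continue  # this waiver is being removed
        kept.append(line)
    return kept


# Hypothetical waiver lines; the real file contents are not shown in this PR page.
sample = [
    PREFIX + "test_bf16 SKIP (hypothetical reason)",
    PREFIX + "test_fp8 SKIP (hypothetical reason)",
    "some/other_test.py::TestOther::test_case SKIP (hypothetical reason)",
]
remaining = drop_waivers(sample, UNWAIVED)
print(remaining)
```

Matching on the first token only, instead of a substring search, avoids dropping unrelated waivers whose reason text happens to mention one of these tests.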

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related PRs

NVIDIA/TensorRT-LLM#14515: Both PRs modify TestLagunaXS test waiver handling in the same file, with this PR removing waivers that were previously added.

Suggested reviewers

LarryXFly
StanleySun639
ZhanruiSunCh

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Description check | ❓ Inconclusive | PR description explains the issue (RoPE fusion differences between Hopper and Blackwell) and the temporary workaround, but Test Coverage section is empty and lacks details about specific test cases. | Complete the Test Coverage section by specifying which tests (e.g., accuracy/test_llm_api_pytorch.py::TestLagunaXS::test_nvfp4) safeguard these changes and why they validate the RoPE fusion workaround. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately describes the main change: fixing GSM8K accuracy tests for LagunaXS on B200/GB200/B300 by implementing RoPE fusion logic based on SM version detection. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

Warning

Review ran into problems

🔥 Problems

Stopped waiting for pipeline failures after 30000ms. One of your pipelines takes longer than our 30000ms fetch window to run, so review may not consider pipeline-failure results for inline comments if any failures occurred after the fetch window. Increase the timeout if you want to wait longer or run a @coderabbit review after the pipeline has finished.

coderabbitai

🧹 Nitpick comments (1)

tests/integration/test_lists/test-db/l0_b200.yml (1)
24-24: 🏗️ Heavy lift

Add a perf guard for the RoPE fusion path switch.

Line 24 improves functional coverage, but this PR also changes an attention-kernel execution path. Please add a LagunaXS perf-sanity entry in tests/integration/test_lists/test-db/l0_perf.yml (and the matching tests/integration/test_lists/qa/llm_perf_*.yml if it should run on QA schedules) so latency/throughput regressions are caught, not just accuracy failures.

As per coding guidelines: “If the PR touches performance-sensitive paths… check whether a perf test entry is present or updated in… test-db and QA perf lists… [and] note if only functional correctness is tested where a performance regression would not be caught.”
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/integration/test_lists/test-db/l0_b200.yml` at line 24, Add a LagunaXS
performance guard entry corresponding to the functional test
accuracy/test_llm_api_pytorch.py::TestLagunaXS::test_nvfp4 by adding a
perf-sanity record in tests/integration/test_lists/test-db/l0_perf.yml (and
mirror it in the matching QA perf list
tests/integration/test_lists/qa/llm_perf_*.yml if this path should run on QA
schedules); the perf entry should reference the same test target
(TestLagunaXS::test_nvfp4 or an equivalent perf-target key), specify expected
latency/throughput thresholds, and include the same platform/fixture tags used
by the functional entry so any regression in the RoPE fusion attention-kernel
path is caught.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 657c3ac2-737f-44d7-81d6-f7ee666f1867

📥 Commits

Reviewing files that changed from the base of the PR and between 1f8312d and c1fd2d3.

📒 Files selected for processing (3)

tensorrt_llm/_torch/models/modeling_laguna.py
tests/integration/test_lists/test-db/l0_b200.yml
tests/integration/test_lists/waives.txt

💤 Files with no reviewable changes (1)

tests/integration/test_lists/waives.txt

tensorrt-cicd · 2026-05-26T15:09:38Z

PR_Github #50355 [ run ] triggered by Bot. Commit: c1fd2d3 Link to invocation

tensorrt-cicd · 2026-05-26T19:37:19Z

PR_Github #50355 [ run ] completed with state SUCCESS. Commit: c1fd2d3
/LLM/main/L0_MergeRequest_PR pipeline #39882 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

Please check the failed tests and fix your PR
If you cannot view the failures, ask the CI triggerer to share details
Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

DomBrown · 2026-05-26T19:51:04Z

/bot run --disable-fail-fast

tensorrt-cicd · 2026-05-26T19:57:05Z

PR_Github #50383 [ run ] triggered by Bot. Commit: c1fd2d3 Link to invocation

tensorrt-cicd · 2026-05-26T22:43:15Z

PR_Github #50383 [ run ] completed with state SUCCESS. Commit: c1fd2d3
/LLM/main/L0_MergeRequest_PR pipeline #39910 completed with status: 'SUCCESS'

CI Report

Link to invocation

…unaXS on B200/GB200/B300 (NVIDIA#14580) Signed-off-by: Dom Brown <3886319+DomBrown@users.noreply.github.com>

fix: WAR to unblock blackwell on main for LagunaXS

c1fd2d3

Signed-off-by: Dom Brown <3886319+DomBrown@users.noreply.github.com>

DomBrown self-assigned this May 26, 2026

DomBrown requested a review from a team as a code owner May 26, 2026 15:02

DomBrown requested a review from 2ez4bz May 26, 2026 15:02

coderabbitai Bot reviewed May 26, 2026

2ez4bz approved these changes May 26, 2026

DomBrown merged commit 92e601c into NVIDIA:main May 26, 2026
18 of 20 checks passed

DomBrown deleted the nvbugs/6211185 branch May 26, 2026 22:44

DomBrown merged 1 commit into NVIDIA:main from DomBrown:nvbugs/6211185


3 participants
