[None][fix] Raise clear error when GPT-OSS is used with non-TRTLLM attention backend#13166
Conversation
Signed-off-by: Samaresh Kumar Singh <ssam3003@gmail.com>
📝 WalkthroughWalkthroughTwo files are updated to enhance validation and error messaging for attention backend compatibility in GPT-OSS models. A runtime validation is added to prevent GPT-OSS initialization with unsupported attention backends, and an existing assertion error message is clarified to include the configured backend. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~8 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 3❌ Failed checks (1 warning, 2 inconclusive)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
tensorrt_llm/_torch/models/modeling_gpt_oss.py (1)
1-1:⚠️ Potential issue | 🟠 MajorMissing required NVIDIA copyright header/year update.
This modified Python source file should include the standard NVIDIA copyright header (with year updated to latest meaningful modification).
As per coding guidelines, "Add NVIDIA copyright header on ALL new files and update year on modified files" and "All TensorRT-LLM source files must contain an NVIDIA copyright header with the year of latest meaningful modification".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tensorrt_llm/_torch/models/modeling_gpt_oss.py` at line 1, This file (modeling_gpt_oss.py) is missing the required NVIDIA copyright header; add the standard NVIDIA copyright header block at the very top of the file (above the import statements) and update the year to the latest meaningful modification year, ensuring the header matches the project's canonical header text and formatting used across other TensorRT-LLM source files.tensorrt_llm/_torch/modules/attention.py (1)
1-1:⚠️ Potential issue | 🟠 MajorMissing required NVIDIA copyright header/year update.
This modified Python source file should include the standard NVIDIA copyright header (with year updated to latest meaningful modification).
As per coding guidelines, "Add NVIDIA copyright header on ALL new files and update year on modified files" and "All TensorRT-LLM source files must contain an NVIDIA copyright header with the year of latest meaningful modification".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tensorrt_llm/_torch/modules/attention.py` at line 1, Add the standard NVIDIA copyright header (with the year updated to the latest meaningful modification) at the very top of this source file so it precedes the existing "import functools" line; ensure the header matches the project's canonical NVIDIA header format and contains the correct year and copyright notice used across other TensorRT-LLM files (so the file tensorrt_llm/_torch/modules/attention.py conforms to licensing guidelines).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@tensorrt_llm/_torch/models/modeling_gpt_oss.py`:
- Line 1: This file (modeling_gpt_oss.py) is missing the required NVIDIA
copyright header; add the standard NVIDIA copyright header block at the very top
of the file (above the import statements) and update the year to the latest
meaningful modification year, ensuring the header matches the project's
canonical header text and formatting used across other TensorRT-LLM source
files.
In `@tensorrt_llm/_torch/modules/attention.py`:
- Line 1: Add the standard NVIDIA copyright header (with the year updated to the
latest meaningful modification) at the very top of this source file so it
precedes the existing "import functools" line; ensure the header matches the
project's canonical NVIDIA header format and contains the correct year and
copyright notice used across other TensorRT-LLM files (so the file
tensorrt_llm/_torch/modules/attention.py conforms to licensing guidelines).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: d6593c92-5d2d-41ad-bcac-18ce0ecb2178
📒 Files selected for processing (2)
tensorrt_llm/_torch/models/modeling_gpt_oss.pytensorrt_llm/_torch/modules/attention.py
Signed-off-by: Samaresh Kumar Singh <ssam3003@gmail.com>
|
/bot run --disable-fail-fast |
|
PR_Github #48119 [ run ] triggered by Bot. Commit: |
|
PR_Github #48119 [ run ] completed with state
|
@karljang Could you share which 6 tests failed in L0 #37946? The JUnit reports are redacted on my side. This PR only adds a ValueError in AttentionBlock.init and improves an assert message and it should not affect any test that wasn't already misconfigured. If the failures look unrelated, could you re-trigger CI? Thanks. |
|
The failed cases don't seem to be related to this PR~ I'll just rerun it. |
|
/bot run --disable-fail-fast |
|
PR_Github #48231 [ run ] triggered by Bot. Commit: |
|
PR_Github #48231 [ run ] completed with state |
GPT-OSS uses attention sinks which are only supported by the TRTLLM backend, but this was only enforced during the first forward pass deep inside executor warmup, giving a very confusing traceback. This moves the check into AttentionBlock.init so users get a clear, actionable ValueError the moment the model is constructed with an incompatible backend. Also tightened the fallback assert in attention.py to include the actual backend name in the message. Fixes #13156
Summary by CodeRabbit