[TRTLLM-12127][fix] VisualGen metadata updates#12862
Conversation
/bot run --disable-fail-fast

PR_Github #42403 [ run ] triggered by Bot. Commit:

PR_Github #42403 [ run ] completed with state
📝 Walkthrough

This pull request introduces a shared attention metadata state mechanism for the TRTLLM attention backend.
Sequence Diagram

```mermaid
sequenceDiagram
    participant Config as DiffusionModelConfig
    participant Module as attention.py<br/>(Attention Module)
    participant Factory as utils.py<br/>(create_attention)
    participant Backend as trtllm.py<br/>(TrtllmAttention)
    Config->>Config: Create attention_metadata_state<br/>{"metadata": None, "capacity": (0,0)}
    Module->>Module: Extract attention_metadata_state<br/>from config
    Module->>Factory: Call create_attention(...)<br/>attention_metadata_state=state
    Factory->>Factory: Validate: TRTLLM backend<br/>requires metadata_state
    alt TRTLLM Backend Valid
        Factory->>Backend: Instantiate with<br/>attention_metadata_state dict
        Backend->>Backend: Store reference to<br/>shared metadata state
        Backend-->>Factory: Return configured attention
    else Missing metadata_state
        Factory->>Factory: Raise ValueError
    end
    Factory-->>Module: Attention instance ready
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks: ✅ 1 passed | ❌ 2 failed (1 warning, 1 inconclusive)
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tensorrt_llm/_torch/visual_gen/config.py (1)
1-5: ⚠️ Potential issue | 🟡 Minor

Add NVIDIA copyright header with 2026 year.

The file is missing the required SPDX copyright header. Add the header at the top:

```python
# SPDX-FileCopyrightText: Copyright (c) 2025-2026, NVIDIA CORPORATION & AFFILIATES
# SPDX-License-Identifier: Apache-2.0
```

All TensorRT-LLM source files must include this header with the latest modification year.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tensorrt_llm/_torch/visual_gen/config.py` around lines 1 - 5, The file is missing the required SPDX copyright header; add the two-line header exactly as specified ("# SPDX-FileCopyrightText: Copyright (c) 2025–2026, NVIDIA CORPORATION & AFFILIATES" and "# SPDX-License-Identifier: Apache-2.0") at the very top of tensorrt_llm/_torch/visual_gen/config.py before any imports (above the existing imports such as json, Enum, Path) so every source file includes the latest modification year.
🧹 Nitpick comments (1)
tensorrt_llm/_torch/visual_gen/attention_backend/utils.py (1)
29-29: Unused `attention_config` parameter and import.

`AttentionConfig` is imported (line 29) and `attention_config` is accepted as a parameter (line 81), but neither is used in the function body. Only `attention_metadata_state` is actually consumed (lines 113-119). If this is a placeholder for future backend-specific configuration (e.g., FA4), consider either:
- Adding a TODO comment explaining planned usage
- Removing it until needed to avoid dead code
Also applies to: 81-81, 103-103
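The "keep it as a labeled placeholder" option from the nitpick above might look like the following. This is a hypothetical sketch: the signature and names (`create_attention`, `attention_config`) are illustrative, not the actual `utils.py` code.

```python
# Hypothetical sketch of keeping the unused parameter with an explicit TODO,
# so the placeholder intent is visible instead of reading as dead code.
from typing import Any, Optional

def create_attention(
    backend: str,
    # TODO: route backend-specific settings (e.g., a future FA4 config)
    # through attention_config once a backend actually consumes it.
    attention_config: Optional[Any] = None,
    attention_metadata_state: Optional[dict] = None,
):
    if backend == "TRTLLM" and attention_metadata_state is None:
        raise ValueError("TRTLLM backend requires attention_metadata_state")
    # Only attention_metadata_state is consumed today; attention_config is
    # a deliberate placeholder for future configuration.
    return {"backend": backend, "state": attention_metadata_state}
```

The alternative (removing the parameter until needed) keeps the signature minimal but means touching every call site again when FA4-style configuration lands.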
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tensorrt_llm/_torch/visual_gen/attention_backend/utils.py` at line 29, The import AttentionConfig and the parameter attention_config on the function that consumes attention_metadata_state are unused; either remove the import and the attention_config parameter from the function signature (and any call sites) to eliminate dead code, or keep them and add a clear TODO comment explaining the intended future use for backend-specific configuration (e.g., FA4) near the AttentionConfig import and inside the function where attention_metadata_state is used; update references to attention_config in this module accordingly (symbols: AttentionConfig, attention_config, attention_metadata_state).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@tensorrt_llm/_torch/visual_gen/config.py`:
- Around line 1-5: The file is missing the required SPDX copyright header; add
the two-line header exactly as specified ("# SPDX-FileCopyrightText: Copyright
(c) 2025–2026, NVIDIA CORPORATION & AFFILIATES" and "# SPDX-License-Identifier:
Apache-2.0") at the very top of tensorrt_llm/_torch/visual_gen/config.py before
any imports (above the existing imports such as json, Enum, Path) so every
source file includes the latest modification year.
---
Nitpick comments:
In `@tensorrt_llm/_torch/visual_gen/attention_backend/utils.py`:
- Line 29: The import AttentionConfig and the parameter attention_config on the
function that consumes attention_metadata_state are unused; either remove the
import and the attention_config parameter from the function signature (and any
call sites) to eliminate dead code, or keep them and add a clear TODO comment
explaining the intended future use for backend-specific configuration (e.g.,
FA4) near the AttentionConfig import and inside the function where
attention_metadata_state is used; update references to attention_config in this
module accordingly (symbols: AttentionConfig, attention_config,
attention_metadata_state).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 504b39da-72dc-46d1-880a-a4f54d1bcdc2
📒 Files selected for processing (12)
- examples/visual_gen/visual_gen_wan_i2v.py
- examples/visual_gen/visual_gen_wan_t2v.py
- tensorrt_llm/_torch/visual_gen/attention_backend/trtllm.py
- tensorrt_llm/_torch/visual_gen/attention_backend/utils.py
- tensorrt_llm/_torch/visual_gen/config.py
- tensorrt_llm/_torch/visual_gen/models/ltx2/transformer_ltx2.py
- tensorrt_llm/_torch/visual_gen/modules/attention.py
- tests/unittest/_torch/visual_gen/multi_gpu/test_flux_ulysses.py
- tests/unittest/_torch/visual_gen/test_attention_integration.py
- tests/unittest/_torch/visual_gen/test_attention_perf.py
- tests/unittest/_torch/visual_gen/test_flux_attention.py
- tests/unittest/_torch/visual_gen/test_ltx2_attention.py
/bot run --disable-fail-fast

PR_Github #42592 [ run ] triggered by Bot. Commit:

PR_Github #42592 [ run ] completed with state

/bot run --disable-fail-fast

2 similar comments

/bot run --disable-fail-fast

/bot run --disable-fail-fast

PR_Github #43827 [ run ] triggered by Bot. Commit:

/bot run --disable-fail-fast

PR_Github #43833 [ run ] triggered by Bot. Commit:

PR_Github #43827 [ run ] completed with state

PR_Github #43833 [ run ] completed with state

/bot run --disable-fail-fast

PR_Github #43948 [ run ] triggered by Bot. Commit:

PR_Github #43948 [ run ] completed with state
Signed-off-by: Olivia Stoner <245287810+o-stoner@users.noreply.github.com>

Force-pushed from 8390580 to b300b14
/bot run --disable-fail-fast

PR_Github #44067 [ run ] triggered by Bot. Commit:

PR_Github #44067 [ run ] completed with state

/bot run --disable-fail-fast

PR_Github #44511 [ run ] triggered by Bot. Commit:

PR_Github #44511 [ run ] completed with state
Summary by CodeRabbit
Refactor
Chores
Description
The reverted SageAttention PR #11718 by @xrq-phys included "pipeline-level metadata" updates (see example) which are not blocked by the new kernels or the JIT workflow. This PR adds that behavior back to VisualGen. It does not yet restore the test_attention_trtllm_sage.py AttentionOp unit test for SageAttention, since that test will only pass once SageAttention integration is complete. The remaining tests previously modified under unittest/_torch/visual_gen have all been updated to cover the attention_metadata_state behavior, just without SageAttentionConfig integration.
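The capacity-tracking side of the restored attention_metadata_state behavior can be sketched as follows. This is a hypothetical, test-style sketch: the dict layout (`"metadata"`, `"capacity"`) follows the PR walkthrough, while the helper name `ensure_capacity` and the growth policy are illustrative, not the actual implementation.

```python
# Hypothetical sketch of shared-metadata capacity tracking: metadata is
# rebuilt only when a request exceeds the recorded capacity, so repeated
# forward passes with smaller shapes reuse the same allocation.

def ensure_capacity(state: dict, num_tokens: int, num_seqs: int):
    cur_tokens, cur_seqs = state["capacity"]
    if num_tokens > cur_tokens or num_seqs > cur_seqs:
        # Grow: rebuild metadata and record the enlarged capacity.
        state["metadata"] = {"max_tokens": num_tokens, "max_seqs": num_seqs}
        state["capacity"] = (max(num_tokens, cur_tokens), max(num_seqs, cur_seqs))
    return state["metadata"]

state = {"metadata": None, "capacity": (0, 0)}
m1 = ensure_capacity(state, 1024, 2)   # first call allocates
m2 = ensure_capacity(state, 512, 1)    # smaller request reuses the metadata
assert m1 is m2
assert state["capacity"] == (1024, 2)
```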
Test Coverage
PR Checklist
Please review the following before submitting your PR:
PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
Test cases are provided for new code paths (see test instructions)
Any new dependencies have been scanned for license and vulnerabilities
CODEOWNERS updated if ownership changes
Documentation updated as needed
Update tava architecture diagram if there is a significant design change in PR.
The reviewers assigned automatically/manually are appropriate for the PR.
Please check this after reviewing the above items as appropriate for this PR.
GitHub Bot Help
To see a list of available CI bot commands, please comment `/bot help`.