[None][fix] fix Qwen-VL processor _defaults mutation at the source by longlee0622 · Pull Request #14617 · NVIDIA/TensorRT-LLM

longlee0622 · 2026-05-27T04:42:11Z

The previous workaround (bypass_processor_output_validation) patched validate_typed_dict at module level to filter processor output keys out of HF's per-modality TypedDict validation. That hid a real upstream bug in Qwen2/2.5/3-VL processors: their _get_num_multimodal_tokens does

`<ProcessorKwargs>._defaults.get("<modality>_kwargs", {}).update(kwargs)`

on the class-level default dict (instead of a copy). The first call that forwards processor output keys (e.g. video_grid_thw) bakes them into the per-modality default, and every subsequent processor call then merges those keys into output_kwargs[<modality>] and trips ProcessorMixin._merge_kwargs validation.
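The mutation pattern can be reproduced without `transformers`. The following is a minimal sketch, not the upstream code: `StubKwargs`, `buggy_merge`, `safe_merge`, and the `return_metadata` key are hypothetical stand-ins chosen only to show why `.get(...).update(...)` on a class-level dict leaks caller keys while a `dict(...)` copy does not.

```python
class StubKwargs:
    # Hypothetical stand-in for a <ProcessorKwargs> class.
    # The dict is class-level, so it is shared by every call in the process.
    _defaults = {"videos_kwargs": {"return_metadata": False}}


def buggy_merge(**kwargs):
    # .get() returns the stored dict itself, so .update() writes the
    # caller's keys straight into the class-level default.
    merged = StubKwargs._defaults.get("videos_kwargs", {})
    merged.update(kwargs)
    return merged


def safe_merge(**kwargs):
    # Copy first, then merge: the class-level default stays untouched.
    merged = dict(StubKwargs._defaults.get("videos_kwargs", {}))
    merged.update(kwargs)
    return merged


safe_merge(video_grid_thw=[[1, 2, 2]])
leaked_after_safe = "video_grid_thw" in StubKwargs._defaults["videos_kwargs"]

buggy_merge(video_grid_thw=[[1, 2, 2]])
leaked_after_buggy = "video_grid_thw" in StubKwargs._defaults["videos_kwargs"]

print(leaked_after_safe, leaked_after_buggy)  # False True
```

Once the buggy call has run, every later read of the default (even through the copying path) sees `video_grid_thw`, which is the "baked in" behavior described above.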

Fix the bug at the source: install_qwen_vl_processor_defaults_fix() re-classes a loaded Qwen-VL processor instance to a thin TRT-LLM subclass that overrides only _get_num_multimodal_tokens and takes a defensive dict(...) copy before merging caller kwargs. No global state is touched; the override runs entirely on instance methods and local copies, so it is naturally thread-safe with no locks or refcounting. The process-wide module-level patch (bypass_processor_output_validation) is removed, along with its call sites in modeling_qwen2vl.py / modeling_qwen3vl.py.
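For readers unfamiliar with re-classing, here is a rough sketch of the technique on stub classes. It is not the code in this PR: `StubKwargs`, `StubProcessor`, `install_defaults_fix`, and the marker attribute are hypothetical, and returning `False` for unsupported types is an assumption about a reasonable contract.

```python
class StubKwargs:
    # Hypothetical class-level defaults, standing in for <ProcessorKwargs>.
    _defaults = {"images_kwargs": {"merge_size": 2}}


class StubProcessor:
    """Hypothetical stand-in for an upstream processor with the bug."""

    def _get_num_multimodal_tokens(self, **kwargs):
        merged = StubKwargs._defaults.get("images_kwargs", {})
        merged.update(kwargs)  # mutates the class-level dict
        return merged


_SAFE_MARKER = "_defaults_fix_installed"  # hypothetical marker attribute


def install_defaults_fix(processor):
    """Re-class `processor` to a subclass whose override copies before merging.

    Returns True if the instance is patched, False if the type is unsupported.
    """
    cls = type(processor)
    if getattr(cls, _SAFE_MARKER, False):
        return True  # already re-classed, so repeated calls are no-ops
    if not isinstance(processor, StubProcessor):
        return False  # unknown processor type: leave it alone

    def _get_num_multimodal_tokens(self, **kwargs):
        # Defensive copy: only a local dict is mutated.
        merged = dict(StubKwargs._defaults.get("images_kwargs", {}))
        merged.update(kwargs)
        return merged

    safe_cls = type(
        f"Safe{cls.__name__}",
        (cls,),
        {"_get_num_multimodal_tokens": _get_num_multimodal_tokens,
         _SAFE_MARKER: True},
    )
    processor.__class__ = safe_cls  # affects this instance only
    return True


proc = StubProcessor()
first = install_defaults_fix(proc)
patched_cls = type(proc)
second = install_defaults_fix(proc)  # idempotent: class is not wrapped again
result = proc._get_num_multimodal_tokens(image_grid_thw=[[1, 4, 4]])
```

Because only `proc.__class__` changes, the original class and any other instances are left as they were, which is why no process-wide state or locking is involved.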

Adds a unit test module that verifies the fix installs (re-classes the instance), is idempotent, leaves class-level _defaults untouched after calls that pass output keys, and that the upstream bug still exists on the current transformers (so the regression test isn't vacuous).
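The shape of the regression checks can be sketched on stubs as follows. This is illustrative only: the class and test names below are hypothetical and are not the ones in the new test module, and the real tests exercise the actual Qwen-VL processor classes rather than these stand-ins.

```python
import copy


class StubKwargs:
    # Hypothetical class-level defaults.
    _defaults = {"videos_kwargs": {}}


class BuggyProcessor:
    """Stand-in for the unpatched upstream behavior."""

    def _get_num_multimodal_tokens(self, **kwargs):
        StubKwargs._defaults.get("videos_kwargs", {}).update(kwargs)


class SafeProcessor(BuggyProcessor):
    """Stand-in for the patched subclass: copies before merging."""

    def _get_num_multimodal_tokens(self, **kwargs):
        merged = dict(StubKwargs._defaults.get("videos_kwargs", {}))
        merged.update(kwargs)


def _with_clean_defaults(fn):
    """Run `fn`, then restore the class-level defaults so tests stay isolated."""
    saved = copy.deepcopy(StubKwargs._defaults)
    try:
        return fn()
    finally:
        StubKwargs._defaults = saved


def test_defaults_untouched_with_fix():
    def body():
        SafeProcessor()._get_num_multimodal_tokens(video_grid_thw=[[1, 2, 2]])
        assert StubKwargs._defaults == {"videos_kwargs": {}}

    _with_clean_defaults(body)


def test_unpatched_stub_still_pollutes():
    # Non-vacuity guard: if this starts failing, the bug is gone and the
    # first test no longer proves anything about the fix.
    def body():
        BuggyProcessor()._get_num_multimodal_tokens(video_grid_thw=[[1, 2, 2]])
        assert "video_grid_thw" in StubKwargs._defaults["videos_kwargs"]

    _with_clean_defaults(body)
```

Pairing the two tests is the design point: the second one documents the precondition under which the first one is meaningful.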

Summary by CodeRabbit

Bug Fixes
- Resolved multimodal processor validation failures in Qwen-VL models by preventing processor output keys from leaking into preprocessing kwargs, improving overall stability.
Tests
- Added comprehensive regression tests to validate processor fix behavior across known and unknown processor types.

Description

Test Coverage

PR Checklist

Please review the following before submitting your PR:

PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
Test cases are provided for new code paths (see test instructions)
If PR introduces API changes, an appropriate PR label is added - either api-compatible or api-breaking. For api-breaking, include BREAKING in the PR title.
Any new dependencies have been scanned for license and vulnerabilities
CODEOWNERS updated if ownership changes
Documentation updated as needed
Update tava architecture diagram if there is a significant design change in PR.
The reviewers assigned automatically/manually are appropriate for the PR.
Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

coderabbitai · 2026-05-27T04:48:33Z

Walkthrough

This PR replaces a context-manager validation bypass with a targeted processor re-classing mechanism to prevent Qwen-VL ProcessorMixin validation failures. The fix installs a TRT-LLM subclass that defensively copies per-modality defaults before merging kwargs, preventing stale output keys from leaking across processor calls. Integration spans Qwen2-VL, Qwen3-VL, test utilities, and comprehensive regression tests.

Changes

Qwen-VL Processor Defaults Mutation Fix

| Layer / File(s) | Summary |
| --- | --- |
| Core fix implementation: processor reclass and defaults scrubbing (`tensorrt_llm/_torch/models/modeling_multimodal_utils.py`) | Removes `contextlib` import and `bypass_processor_output_validation()` context manager. Adds Qwen-VL processor-to-kwargs mappings, identifies keys to scrub from class-level `_defaults`, and introduces `install_qwen_vl_processor_defaults_fix(processor)` which scrubs leaked keys, creates a safe TRT-LLM subclass with a defensive `_get_num_multimodal_tokens` override, and re-classes the processor instance. |
| Qwen2-VL multimodal preprocessing integration (`tensorrt_llm/_torch/models/modeling_qwen2vl.py`) | Updates imports to use `install_qwen_vl_processor_defaults_fix` instead of `bypass_processor_output_validation`. Applies the fix in `Qwen2VLInputProcessorBase.__init__` after loading the processor, and replaces wrapped processor calls in `_preprocess` with direct invocation. |
| Qwen3-VL multimodal preprocessing integration (`tensorrt_llm/_torch/models/modeling_qwen3vl.py`) | Mirrors Qwen2-VL changes: updates imports, installs the fix in `Qwen3VLInputProcessorBase.__init__`, and converts processor calls in `_preprocess` from context-manager-wrapped to direct, with comments explaining that the safe subclass prevents output-key leakage. |
| Existing test utilities updated and comprehensive regression test suite (`tests/unittest/_torch/modeling/test_modeling_multimodal.py`, `tests/unittest/_torch/modeling/test_qwen_vl_processor_defaults_fix.py`) | Updates `test_modeling_multimodal.py` to apply the fix and call processor directly. Adds new regression test module with stub image/video processors, import helpers, and seven test cases validating re-classing behavior, idempotency, pollution prevention, defaults scrubbing, preservation of remote-code subclasses, and unpatched baseline behavior. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

Shixiaowei02
StanleySun639
xinhe-nv
yechank-nvidia
Funatiq
LarryXFly

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 44.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Description check | ❓ Inconclusive | PR description clearly explains the issue, the solution, and the rationale. However, the PR title is missing (just shows '[None]') and the checklist sections are only partially filled. | Add a proper PR title following the template format (e.g., '[None][fix] Fix Qwen-VL processor defaults mutation at source'). Also ensure Description and Test Coverage sections are populated with the details already provided in the preamble. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly describes the main fix: addressing Qwen-VL processor _defaults mutation by applying a targeted subclass-based solution instead of a global workaround. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

tests/unittest/_torch/modeling/test_qwen_vl_processor_defaults_fix.py (1)
104-258: QA list updates look unnecessary.

These additions are unit-test scoped under tests/unittest, so I would not expect any update to tests/integration/test_lists/qa/* in this PR.

As per coding guidelines, "This PR’s change is unit-test scoped (under tests/unittest), so you generally do not need to update these QA lists unless adding new end-to-end functional coverage."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unittest/_torch/modeling/test_qwen_vl_processor_defaults_fix.py` around
lines 104 - 258, The PR added only unit tests under tests/unittest (e.g.,
functions like test_install_returns_true_for_known_processor,
test_install_is_idempotent,
test_defaults_not_polluted_after_call_with_output_keys) so the QA lists in
tests/integration/test_lists/qa/* should not be modified; revert any changes to
those QA list files and keep only the new unit test file changes, ensuring no
integration QA list entries were added for this unit-test-scoped change.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/unittest/_torch/modeling/test_qwen_vl_processor_defaults_fix.py`:
- Around line 239-248: The test currently swallows every Exception when calling
proc._get_num_multimodal_tokens (image_sizes/video_sizes/video_grid_thw), which
masks unrelated failures; narrow the except to only the concrete exceptions
expected from the unpatched upstream stub (e.g., replace "except Exception:"
with "except (ValueError, TypeError):" or the exact exception class the upstream
stub raises) so unrelated errors still surface—update the except clause around
proc._get_num_multimodal_tokens accordingly and add a brief comment noting to
adjust the caught types if the upstream raises a different specific exception.
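As an illustration of the narrowing suggested in this comment, a small sketch follows. The helper name is hypothetical, and `(ValueError, TypeError)` is only the reviewer's example pair; the concrete exception the upstream stub raises would need to be confirmed against the real test.

```python
def call_tolerating_expected_errors(fn, expected=(ValueError, TypeError)):
    """Call fn(); swallow only `expected` exceptions and let others propagate.

    Returns True if fn() completed, False if an expected exception was caught.
    """
    try:
        fn()
    except expected:
        # Adjust `expected` if upstream raises a different specific type.
        return False
    return True


def raises_value_error():
    raise ValueError("expected failure from the unpatched stub")


def raises_key_error():
    raise KeyError("unrelated failure that should surface")


tolerated = call_tolerating_expected_errors(raises_value_error)
```

With a bare `except Exception:` the `KeyError` case would be silently absorbed; with the tuple it reaches the test runner.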



ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: b93ba615-b5c7-401f-80a1-19436ad42573

📥 Commits

Reviewing files that changed from the base of the PR and between 37079f6 and 2e3b8a7.

📒 Files selected for processing (5)

tensorrt_llm/_torch/models/modeling_multimodal_utils.py
tensorrt_llm/_torch/models/modeling_qwen2vl.py
tensorrt_llm/_torch/models/modeling_qwen3vl.py
tests/unittest/_torch/modeling/test_modeling_multimodal.py
tests/unittest/_torch/modeling/test_qwen_vl_processor_defaults_fix.py

The previous workaround (`bypass_processor_output_validation`) patched `validate_typed_dict` at module level to filter processor *output* keys out of HF's per-modality `TypedDict` validation. That hid a real upstream bug in `Qwen2/2.5/3-VL` processors: their `_get_num_multimodal_tokens` does `<ProcessorKwargs>._defaults.get("<modality>_kwargs", {}).update(kwargs)` on the class-level default dict (instead of a copy). The first call that forwards processor output keys (e.g. `video_grid_thw`) bakes them into the per-modality default, and every subsequent processor call then merges those keys into `output_kwargs[<modality>]` and trips `ProcessorMixin._merge_kwargs` validation.

Fix the bug at the source: `install_qwen_vl_processor_defaults_fix()` re-classes a loaded Qwen-VL processor instance to a thin TRT-LLM subclass that overrides only `_get_num_multimodal_tokens` and takes a defensive `dict(...)` copy before merging caller kwargs. No global state is touched; the override runs entirely on instance methods and local copies, so it is naturally thread-safe with no locks or refcounting. The process-wide module-level patch (`bypass_processor_output_validation`) is removed, along with its call sites in `modeling_qwen2vl.py` / `modeling_qwen3vl.py`.

Adds a unit test module that verifies the fix installs (re-classes the instance), is idempotent, leaves class-level `_defaults` untouched after calls that pass output keys, and that the upstream bug still exists on the current `transformers` (so the regression test isn't vacuous).

Signed-off-by: Jonas Li <6110159+longlee0622@users.noreply.github.com>

Per review feedback, ``install_qwen_vl_processor_defaults_fix`` (and its helpers ``_make_safe_get_num_multimodal_tokens``, ``_QWEN_VL_KWARGS_CLASS_BY_PROCESSOR``, ``_QWEN_VL_PROCESSOR_OUTPUT_KEYS``) is Qwen-VL-specific and does not belong in the cross-model ``modeling_multimodal_utils`` surface. Move the workaround into ``modeling_qwen2vl.py`` (next to its only consumers) and re-import it from ``modeling_qwen3vl.py`` alongside the existing ``Qwen2_5_VLVisionAttention`` import. Tests follow the new import path. Behavior is unchanged.

Signed-off-by: Jonas Li <6110159+longlee0622@users.noreply.github.com>

longlee0622 · 2026-05-28T03:24:20Z

/bot run

tensorrt-cicd · 2026-05-28T03:32:08Z

PR_Github #50682 [ run ] triggered by Bot. Commit: 4b36a02 Link to invocation

tensorrt-cicd · 2026-05-28T07:51:05Z

PR_Github #50682 [ run ] completed with state SUCCESS. Commit: 4b36a02
/LLM/main/L0_MergeRequest_PR pipeline #40170 completed with status: 'SUCCESS'
Pipeline passed with automatic retried tests. Check the rerun report for details.


longlee0622 requested review from a team as code owners May 27, 2026 04:42

longlee0622 requested review from 2ez4bz, moraxu, rakib-hasan and tijyojwad May 27, 2026 04:42

github-actions Bot assigned longlee0622 May 27, 2026

longlee0622 marked this pull request as draft May 27, 2026 04:42

coderabbitai Bot reviewed May 27, 2026

Comment thread tests/unittest/_torch/modeling/test_qwen_vl_processor_defaults_fix.py

longlee0622 added 2 commits May 28, 2026 12:16

longlee0622 force-pushed the fix/qwen-vl-defaults-mutation-subclass branch from e39ac28 to 4b36a02 Compare May 28, 2026 03:16

longlee0622 marked this pull request as ready for review May 28, 2026 03:16

longlee0622 enabled auto-merge (squash) May 28, 2026 07:54

longlee0622 wants to merge 2 commits into NVIDIA:main from longlee0622:fix/qwen-vl-defaults-mutation-subclass
