
[None][fix] fix Qwen-VL processor _defaults mutation at the source #14617

Open
longlee0622 wants to merge 2 commits into NVIDIA:main from longlee0622:fix/qwen-vl-defaults-mutation-subclass

Conversation

@longlee0622
Collaborator

@longlee0622 longlee0622 commented May 27, 2026

The previous workaround (bypass_processor_output_validation) patched validate_typed_dict at module level to filter processor output keys out of HF's per-modality TypedDict validation. That hid a real upstream bug in Qwen2/2.5/3-VL processors: their _get_num_multimodal_tokens does

`<ProcessorKwargs>._defaults.get("<modality>_kwargs", {}).update(kwargs)`

on the class-level default dict (instead of a copy). The first call that forwards processor output keys (e.g. video_grid_thw) bakes them into the per-modality default, and every subsequent processor call then merges those keys into output_kwargs[<modality>] and trips ProcessorMixin._merge_kwargs validation.
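The mutation described above can be reproduced without `transformers` at all. The following is a minimal sketch; `FakeVideoKwargs` and `buggy_get_num_tokens` are hypothetical stand-ins, not the upstream classes, and the default keys are made up for illustration.

```python
class FakeVideoKwargs:
    """Hypothetical stand-in for a ProcessorKwargs class with class-level defaults."""

    _defaults = {"videos_kwargs": {"return_metadata": True}}


def buggy_get_num_tokens(**kwargs):
    # Same pattern as the quoted line: .get() returns the shared dict itself,
    # so .update() writes the caller's kwargs into the class-level default.
    videos_kwargs = FakeVideoKwargs._defaults.get("videos_kwargs", {})
    videos_kwargs.update(kwargs)
    return videos_kwargs


# One call that forwards a processor output key is enough to pollute the default.
buggy_get_num_tokens(video_grid_thw=[[1, 2, 2]])
leaked = "video_grid_thw" in FakeVideoKwargs._defaults["videos_kwargs"]
print(leaked)  # True
```

Every later reader of `FakeVideoKwargs._defaults` now sees `video_grid_thw`, which is the condition that trips validation in the real processor.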

Fix the bug at the source: install_qwen_vl_processor_defaults_fix() re-classes a loaded Qwen-VL processor instance to a thin TRT-LLM subclass that overrides only _get_num_multimodal_tokens and takes a defensive dict(...) copy before merging caller kwargs. No global state is touched; the override runs entirely on instance methods and local copies, so it is naturally thread-safe with no locks or refcounting. The process-wide module-level patch (bypass_processor_output_validation) is removed, along with its call sites in modeling_qwen2vl.py / modeling_qwen3vl.py.
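The re-classing approach can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `UpstreamKwargs`, `UpstreamProcessor`, `install_defaults_fix`, and the `_defaults_fix_installed` marker are all hypothetical names, and the `True` return value is only inferred from the test name `test_install_returns_true_for_known_processor`.

```python
class UpstreamKwargs:
    """Hypothetical kwargs class holding class-level defaults."""

    _defaults = {"videos_kwargs": {"return_metadata": True}}


class UpstreamProcessor:
    """Hypothetical stand-in for a processor with the buggy method."""

    def _get_num_multimodal_tokens(self, **kwargs):
        merged = UpstreamKwargs._defaults.get("videos_kwargs", {})
        merged.update(kwargs)  # mutates the shared class-level default
        return merged


def install_defaults_fix(processor):
    """Re-class `processor` to a subclass that copies defaults before merging."""
    base = type(processor)
    if getattr(base, "_defaults_fix_installed", False):
        return True  # already re-classed, so a second install is a no-op

    def _safe(self, **kwargs):
        # Defensive copy: caller kwargs land in a local dict only.
        merged = dict(UpstreamKwargs._defaults.get("videos_kwargs", {}))
        merged.update(kwargs)
        return merged

    namespace = {"_get_num_multimodal_tokens": _safe, "_defaults_fix_installed": True}
    processor.__class__ = type(f"Safe{base.__name__}", (base,), namespace)
    return True


proc = UpstreamProcessor()
install_defaults_fix(proc)
out = proc._get_num_multimodal_tokens(video_grid_thw=[[1, 2, 2]])
print("video_grid_thw" in out)  # True
print("video_grid_thw" in UpstreamKwargs._defaults["videos_kwargs"])  # False
```

Because only the instance's `__class__` changes, other instances of the original class and the class itself are left alone, and `isinstance(proc, UpstreamProcessor)` still holds.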

Adds a unit test module that verifies the fix installs (re-classes the instance), is idempotent, leaves class-level _defaults untouched after calls that pass output keys, and that the upstream bug still exists on the current transformers (so the regression test isn't vacuous).

Summary by CodeRabbit

  • Bug Fixes

    • Resolved multimodal processor validation failures in Qwen-VL models by preventing processor output keys from leaking into preprocessing kwargs, improving overall stability.
  • Tests

    • Added comprehensive regression tests to validate processor fix behavior across known and unknown processor types.


Description

Test Coverage

PR Checklist

Please review the following before submitting your PR:

  • PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.

  • PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.

  • Test cases are provided for new code paths (see test instructions)

  • If PR introduces API changes, an appropriate PR label is added - either api-compatible or api-breaking. For api-breaking, include BREAKING in the PR title.

  • Any new dependencies have been scanned for license and vulnerabilities

  • CODEOWNERS updated if ownership changes

  • Documentation updated as needed

  • Update tava architecture diagram if there is a significant design change in PR.

  • The reviewers assigned automatically/manually are appropriate for the PR.

  • Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

@longlee0622 longlee0622 requested review from a team as code owners May 27, 2026 04:42
@longlee0622 longlee0622 marked this pull request as draft May 27, 2026 04:42
@coderabbitai
Contributor

coderabbitai Bot commented May 27, 2026

Walkthrough

This PR replaces a context-manager validation bypass with a targeted processor re-classing mechanism to prevent Qwen-VL ProcessorMixin validation failures. The fix installs a TRT-LLM subclass that defensively copies per-modality defaults before merging kwargs, preventing stale output keys from leaking across processor calls. Integration spans Qwen2-VL, Qwen3-VL, test utilities, and comprehensive regression tests.

Changes

Qwen-VL Processor Defaults Mutation Fix

| Layer | File(s) | Summary |
|---|---|---|
| Core fix implementation: processor reclass and defaults scrubbing | tensorrt_llm/_torch/models/modeling_multimodal_utils.py | Removes contextlib import and bypass_processor_output_validation() context manager. Adds Qwen-VL processor-to-kwargs mappings, identifies keys to scrub from class-level _defaults, and introduces install_qwen_vl_processor_defaults_fix(processor) which scrubs leaked keys, creates a safe TRT-LLM subclass with a defensive _get_num_multimodal_tokens override, and re-classes the processor instance. |
| Qwen2-VL multimodal preprocessing integration | tensorrt_llm/_torch/models/modeling_qwen2vl.py | Updates imports to use install_qwen_vl_processor_defaults_fix instead of bypass_processor_output_validation. Applies the fix in Qwen2VLInputProcessorBase.__init__ after loading the processor, and replaces wrapped processor calls in _preprocess with direct invocation. |
| Qwen3-VL multimodal preprocessing integration | tensorrt_llm/_torch/models/modeling_qwen3vl.py | Mirrors Qwen2-VL changes: updates imports, installs the fix in Qwen3VLInputProcessorBase.__init__, and converts processor calls in _preprocess from context-manager-wrapped to direct, with comments explaining that the safe subclass prevents output-key leakage. |
| Existing test utilities updated and comprehensive regression test suite | tests/unittest/_torch/modeling/test_modeling_multimodal.py, tests/unittest/_torch/modeling/test_qwen_vl_processor_defaults_fix.py | Updates test_modeling_multimodal.py to apply the fix and call processor directly. Adds new regression test module with stub image/video processors, import helpers, and seven test cases validating re-classing behavior, idempotency, pollution prevention, defaults scrubbing, preservation of remote-code subclasses, and unpatched baseline behavior. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

  • Shixiaowei02
  • StanleySun639
  • xinhe-nv
  • yechank-nvidia
  • Funatiq
  • LarryXFly
🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 44.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Description check | ❓ Inconclusive | PR description clearly explains the issue, the solution, and the rationale. However, the PR title is missing (just shows '[None]') and the checklist sections are only partially filled. | Add a proper PR title following the template format (e.g., '[None][fix] Fix Qwen-VL processor defaults mutation at source'). Also ensure Description and Test Coverage sections are populated with the details already provided in the preamble. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly describes the main fix: addressing Qwen-VL processor _defaults mutation by applying a targeted subclass-based solution instead of a global workaround. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
tests/unittest/_torch/modeling/test_qwen_vl_processor_defaults_fix.py (1)

104-258: QA list updates look unnecessary.

These additions are unit-test scoped under tests/unittest, so I would not expect any update to tests/integration/test_lists/qa/* in this PR.

As per coding guidelines, "This PR’s change is unit-test scoped (under tests/unittest), so you generally do not need to update these QA lists unless adding new end-to-end functional coverage."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unittest/_torch/modeling/test_qwen_vl_processor_defaults_fix.py` around
lines 104 - 258, The PR added only unit tests under tests/unittest (e.g.,
functions like test_install_returns_true_for_known_processor,
test_install_is_idempotent,
test_defaults_not_polluted_after_call_with_output_keys) so the QA lists in
tests/integration/test_lists/qa/* should not be modified; revert any changes to
those QA list files and keep only the new unit test file changes, ensuring no
integration QA list entries were added for this unit-test-scoped change.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/unittest/_torch/modeling/test_qwen_vl_processor_defaults_fix.py`:
- Around line 239-248: The test currently swallows every Exception when calling
proc._get_num_multimodal_tokens (image_sizes/video_sizes/video_grid_thw), which
masks unrelated failures; narrow the except to only the concrete exceptions
expected from the unpatched upstream stub (e.g., replace "except Exception:"
with "except (ValueError, TypeError):" or the exact exception class the upstream
stub raises) so unrelated errors still surface—update the except clause around
proc._get_num_multimodal_tokens accordingly and add a brief comment noting to
adjust the caught types if the upstream raises a different specific exception.
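
The narrowing suggested in the inline comment could look like the sketch below. The helper `call_ignoring_expected_errors` and the two raiser functions are hypothetical, and `(ValueError, TypeError)` is only the reviewer's example; the exception the upstream stub actually raises is not stated here.

```python
def call_ignoring_expected_errors(func, expected=(ValueError, TypeError)):
    """Call `func`, swallowing only the exception types listed in `expected`."""
    try:
        func()
    except expected:
        # Adjust `expected` if the upstream stub raises a different exception.
        return False
    return True


def raises_value_error():
    raise ValueError("expected failure from the unpatched stub")


def raises_key_error():
    raise KeyError("unrelated failure that should surface")


print(call_ignoring_expected_errors(raises_value_error))  # False

try:
    call_ignoring_expected_errors(raises_key_error)
except KeyError:
    print("KeyError propagated")
```

With a bare `except Exception:`, the `KeyError` case would be swallowed too, which is the masking the comment warns about.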

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: b93ba615-b5c7-401f-80a1-19436ad42573

📥 Commits

Reviewing files that changed from the base of the PR and between 37079f6 and 2e3b8a7.

📒 Files selected for processing (5)
  • tensorrt_llm/_torch/models/modeling_multimodal_utils.py
  • tensorrt_llm/_torch/models/modeling_qwen2vl.py
  • tensorrt_llm/_torch/models/modeling_qwen3vl.py
  • tests/unittest/_torch/modeling/test_modeling_multimodal.py
  • tests/unittest/_torch/modeling/test_qwen_vl_processor_defaults_fix.py

The previous workaround (`bypass_processor_output_validation`) patched
`validate_typed_dict` at module level to filter processor *output* keys
out of HF's per-modality `TypedDict` validation. That hid a real upstream
bug in `Qwen2/2.5/3-VL` processors: their `_get_num_multimodal_tokens`
does

    `<ProcessorKwargs>._defaults.get("<modality>_kwargs", {}).update(kwargs)`

on the class-level default dict (instead of a copy). The first call that
forwards processor output keys (e.g. `video_grid_thw`) bakes them into
the per-modality default, and every subsequent processor call then
merges those keys into `output_kwargs[<modality>]` and trips
`ProcessorMixin._merge_kwargs` validation.

Fix the bug at the source: `install_qwen_vl_processor_defaults_fix()`
re-classes a loaded Qwen-VL processor instance to a thin TRT-LLM subclass
that overrides only `_get_num_multimodal_tokens` and takes a defensive
`dict(...)` copy before merging caller kwargs. No global state is
touched; the override runs entirely on instance methods and local copies,
so it is naturally thread-safe with no locks or refcounting. The
process-wide module-level patch (`bypass_processor_output_validation`)
is removed, along with its call sites in
`modeling_qwen2vl.py` / `modeling_qwen3vl.py`.

Adds a unit test module that verifies the fix installs (re-classes the
instance), is idempotent, leaves class-level `_defaults` untouched after
calls that pass output keys, and that the upstream bug still exists on
the current `transformers` (so the regression test isn't vacuous).

Signed-off-by: Jonas Li <6110159+longlee0622@users.noreply.github.com>

Per review feedback, ``install_qwen_vl_processor_defaults_fix`` (and its
helpers ``_make_safe_get_num_multimodal_tokens``,
``_QWEN_VL_KWARGS_CLASS_BY_PROCESSOR``,
``_QWEN_VL_PROCESSOR_OUTPUT_KEYS``) is Qwen-VL-specific and does not
belong in the cross-model ``modeling_multimodal_utils`` surface. Move
the workaround into ``modeling_qwen2vl.py`` (next to its only
consumers) and re-import it from ``modeling_qwen3vl.py`` alongside the
existing ``Qwen2_5_VLVisionAttention`` import. Tests follow the new
import path. Behavior is unchanged.

Signed-off-by: Jonas Li <6110159+longlee0622@users.noreply.github.com>
@longlee0622 longlee0622 force-pushed the fix/qwen-vl-defaults-mutation-subclass branch from e39ac28 to 4b36a02 Compare May 28, 2026 03:16
@longlee0622 longlee0622 marked this pull request as ready for review May 28, 2026 03:16
@longlee0622
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #50682 [ run ] triggered by Bot. Commit: 4b36a02 Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #50682 [ run ] completed with state SUCCESS. Commit: 4b36a02
/LLM/main/L0_MergeRequest_PR pipeline #40170 completed with status: 'SUCCESS'
Pipeline passed with automatic retried tests. Check the rerun report for details.

CI Report

Link to invocation

@longlee0622 longlee0622 enabled auto-merge (squash) May 28, 2026 07:54