[https://nvbugs/6058251][fix] Resolve top-level model_type for composite HF configs by JunyiXu-nv · Pull Request #14068 · NVIDIA/TensorRT-LLM

JunyiXu-nv · 2026-05-13T02:54:14Z

Composite HF configs (`Qwen2_5_VLConfig`, `Qwen3_VLConfig`, etc.) in newer transformers versions delegate the instance-level `model_type` attribute to their `text_config` sub-config, so `model_config.model_type` returns e.g. `qwen2_5_vl_text` rather than the top-level `qwen2_5_vl` that matches the `MULTIMODAL_PLACEHOLDER_REGISTRY` registration key. This caused `trtllm-serve` to reject multimodal chat requests with `TypeError: Unknown modality: image`.

Prefer the class-level `model_type` attribute, which is the canonical `AutoConfig` registration key and is unaffected by this delegation.
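The idea of preferring the class-level attribute can be sketched as below. This is a minimal illustration, not the code merged in this PR: `ToyTextConfig`, `ToyCompositeConfig`, and `resolve_top_level_model_type_sketch` are hypothetical names, and the `__getattribute__` override only imitates the delegation described above rather than reproducing what transformers actually does.

```python
from types import SimpleNamespace


class ToyTextConfig:
    """Hypothetical stand-in for a text sub-config."""

    model_type = "toy_vl_text"


class ToyCompositeConfig:
    """Hypothetical stand-in for a composite (vision + text) config."""

    # Class-level key, analogous to the name a config class is registered under.
    model_type = "toy_vl"

    def __init__(self):
        self.text_config = ToyTextConfig()

    def __getattribute__(self, name):
        # Imitate the delegation: instance-level reads of model_type are
        # forwarded to the text sub-config.
        if name == "model_type":
            return object.__getattribute__(self, "text_config").model_type
        return object.__getattribute__(self, name)


def resolve_top_level_model_type_sketch(config):
    """Prefer the class-level model_type; fall back to the instance value."""
    # type(config) bypasses the instance's __getattribute__ override.
    class_level = getattr(type(config), "model_type", None)
    if isinstance(class_level, str) and class_level:
        return class_level
    return getattr(config, "model_type", None)


composite = ToyCompositeConfig()
delegated = composite.model_type                           # value seen through the instance
resolved = resolve_top_level_model_type_sketch(composite)  # value seen through the class

# A config object whose class carries no model_type falls back to the instance value.
plain = SimpleNamespace(model_type="toy_text_only")
plain_resolved = resolve_top_level_model_type_sketch(plain)
```

Reading through `type(config)` works here because attribute lookup on the class does not go through the instance-level override. The fallback branch is an assumption added for configs that only carry an instance-level value.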

Summary by CodeRabbit

Bug Fixes
- Enhanced model type resolution to ensure accurate chat template selection in standard and multimodal chat completions.

Description

Test Coverage

PR Checklist

Please review the following before submitting your PR:

- PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
- PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
- Test cases are provided for new code paths (see test instructions)
- Any new dependencies have been scanned for license and vulnerabilities
- CODEOWNERS updated if ownership changes
- Documentation updated as needed
- Update tava architecture diagram if there is a significant design change in PR.
- The reviewers assigned automatically/manually are appropriate for the PR.
- Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

…onfigs

Composite HF configs (Qwen2_5_VLConfig, Qwen3_VLConfig, etc.) in newer transformers versions delegate the instance-level model_type attribute to their text_config sub-config, so model_config.model_type returns e.g. qwen2_5_vl_text rather than the top-level qwen2_5_vl that matches the MULTIMODAL_PLACEHOLDER_REGISTRY registration key. This caused trtllm-serve to reject multimodal chat requests with TypeError: Unknown modality: image.

Prefer the class-level model_type attribute, which is the canonical AutoConfig registration key and is unaffected by this delegation.

Signed-off-by: tensorrt-cicd <90828364+tensorrt-cicd@users.noreply.github.com>

JunyiXu-nv · 2026-05-13T02:55:29Z

Resolve conflict for PR #13855, since I can't push to trt bot's branch.

JunyiXu-nv · 2026-05-13T02:55:43Z

/bot run --disable-fail-fast

coderabbitai · 2026-05-13T02:56:47Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 84ed1c23-8ed6-4b06-aca2-805e0a55cad2

📥 Commits

Reviewing files that changed from the base of the PR and between 28da135 and bb4c4ad.

📒 Files selected for processing (3)

tensorrt_llm/serve/chat_utils.py
tensorrt_llm/serve/openai_server.py
tensorrt_llm/serve/responses_utils.py

📝 Walkthrough

This PR introduces a helper function to correctly resolve top-level model types from composite HuggingFace configurations (avoiding nested delegation), then propagates its use through apply_chat_template and MultimodalDataTracker calls across the serving stack: chat_utils, openai_server, and responses_utils.

Changes

Resolve top-level model type across serving stack

| Layer / File(s) | Summary |
| --- | --- |
| Helper definition and chat_utils integration `tensorrt_llm/serve/chat_utils.py` | New `resolve_top_level_model_type()` helper extracts top-level model_type from config classes. Used in `parse_chat_messages_coroutines` to initialize MultimodalDataTracker and select multimodal placeholder strategy with the resolved type. |
| OpenAI server chat completion routes `tensorrt_llm/serve/openai_server.py` | Both `openai_chat` and `openai_mm_encoder` handlers now use `resolve_top_level_model_type()` when calling `apply_chat_template` instead of directly accessing `model_config.model_type`. |
| Response token creation `tensorrt_llm/serve/responses_utils.py` | Non-harmony input token creation path now derives model_type via `resolve_top_level_model_type()` when calling `apply_chat_template`. |
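Why the resolved key matters at these call sites can be illustrated with a toy registry lookup. This is only a sketch under stated assumptions: `TOY_PLACEHOLDER_REGISTRY`, `toy_placeholder`, and `lookup_outcome` are hypothetical, and the real `MULTIMODAL_PLACEHOLDER_REGISTRY` may be structured quite differently. The only detail taken from the PR description is the error text.

```python
# Hypothetical registry: top-level model_type -> modality -> placeholder string.
TOY_PLACEHOLDER_REGISTRY = {
    "toy_vl": {"image": "<toy_image>"},
}


def toy_placeholder(model_type: str, modality: str) -> str:
    """Look up a placeholder, failing the way the PR description reports."""
    placeholders = TOY_PLACEHOLDER_REGISTRY.get(model_type, {})
    if modality not in placeholders:
        raise TypeError(f"Unknown modality: {modality}")
    return placeholders[modality]


def lookup_outcome(model_type: str, modality: str) -> str:
    """Return the placeholder, or the error text if the lookup fails."""
    try:
        return toy_placeholder(model_type, modality)
    except TypeError as exc:
        return f"TypeError: {exc}"


# Using the delegated sub-config name misses the registry entry ...
before_fix = lookup_outcome("toy_vl_text", "image")
# ... while the top-level name finds it.
after_fix = lookup_outcome("toy_vl", "image")
```

In this toy setup the modality itself is valid; the lookup fails only because the key is the sub-config name, which matches the symptom described for `trtllm-serve`.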

Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description check | ❓ Inconclusive | The PR description clearly explains the problem and solution, but Test Coverage and PR Checklist sections are incomplete with no specific tests listed despite the checkbox being marked. | Specify which test cases validate the fix for composite HF configs (e.g., Qwen2_5_VL, Qwen3_VL) and multimodal chat request handling. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and specifically describes the main change: resolving top-level model_type for composite HuggingFace configs, which directly addresses the core fix. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

tensorrt-cicd · 2026-05-13T03:02:05Z

PR_Github #48080 [ run ] triggered by Bot. Commit: bb4c4ad Link to invocation

tensorrt-cicd · 2026-05-13T09:35:28Z

PR_Github #48080 [ run ] completed with state SUCCESS. Commit: bb4c4ad
/LLM/main/L0_MergeRequest_PR pipeline #37912 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

Please check the failed tests and fix your PR
If you cannot view the failures, ask the CI triggerer to share details
Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

JunyiXu-nv · 2026-05-13T10:12:13Z

/bot run

tensorrt-cicd · 2026-05-13T10:18:27Z

PR_Github #48163 [ run ] triggered by Bot. Commit: bb4c4ad Link to invocation

tensorrt-cicd · 2026-05-13T11:04:19Z

PR_Github #48163 [ run ] completed with state SUCCESS. Commit: bb4c4ad
/LLM/main/L0_MergeRequest_PR pipeline #37985 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

Please check the failed tests and fix your PR
If you cannot view the failures, ask the CI triggerer to share details
Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

JunyiXu-nv · 2026-05-13T13:41:23Z

/bot run

tensorrt-cicd · 2026-05-13T13:49:17Z

PR_Github #48183 [ run ] triggered by Bot. Commit: bb4c4ad Link to invocation

tensorrt-cicd · 2026-05-13T14:32:28Z

PR_Github #48183 [ run ] completed with state SUCCESS. Commit: bb4c4ad
/LLM/main/L0_MergeRequest_PR pipeline #38003 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

Please check the failed tests and fix your PR
If you cannot view the failures, ask the CI triggerer to share details
Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

JunyiXu-nv · 2026-05-14T03:57:41Z

/bot run

tensorrt-cicd · 2026-05-14T04:03:40Z

PR_Github #48287 [ run ] triggered by Bot. Commit: bb4c4ad Link to invocation

tensorrt-cicd · 2026-05-14T05:58:16Z

PR_Github #48287 [ run ] completed with state SUCCESS. Commit: bb4c4ad
/LLM/main/L0_MergeRequest_PR pipeline #38099 completed with status: 'SUCCESS'

CI Report

Link to invocation

QiJune

LGTM

JunyiXu-nv requested a review from a team as a code owner May 13, 2026 02:54

JunyiXu-nv requested a review from schetlur-nv May 13, 2026 02:54

github-actions Bot assigned JunyiXu-nv May 13, 2026

JunyiXu-nv changed the title ~~[nvbugs/6058251][fix] Resolve top-level model_type for composite HF configs~~ [https://nvbugs/6058251][fix] Resolve top-level model_type for composite HF configs May 13, 2026

JunyiXu-nv requested review from QiJune and removed request for schetlur-nv May 14, 2026 06:19

QiJune approved these changes May 14, 2026

JunyiXu-nv merged commit 9a9d73d into NVIDIA:main May 14, 2026
10 of 13 checks passed

karljang mentioned this pull request May 14, 2026

[Bug]: trtllm-serve 1.3.0rc10 returns "Unknown modality: image" for Qwen2.5-VL-7B-Instruct-NVFP4 on DGX Spark #12824

Open

4 tasks
