[https://nvbugs/6058251][fix] Resolve top-level model_type for composite HF configs #14068

Merged
JunyiXu-nv merged 1 commit into NVIDIA:main from JunyiXu-nv:user/junyix/fix-nvbug-6058251
May 14, 2026

@JunyiXu-nv JunyiXu-nv commented May 13, 2026

Composite HF configs (Qwen2_5_VLConfig, Qwen3_VLConfig, etc.) in newer transformers versions delegate the instance-level model_type attribute to their text_config sub-config, so model_config.model_type returns e.g. qwen2_5_vl_text rather than the top-level qwen2_5_vl that matches the MULTIMODAL_PLACEHOLDER_REGISTRY registration key. This caused trtllm-serve to reject multimodal chat requests with TypeError: Unknown modality: image.

Prefer the class-level model_type attribute, which is the canonical AutoConfig registration key and is unaffected by this delegation.
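
A minimal sketch of the idea, not the code merged in this PR. FakeTextConfig and FakeCompositeConfig are hypothetical stand-ins that mimic the delegation without importing transformers, and the body of resolve_top_level_model_type is an assumption based on the description above.

```python
class FakeTextConfig:
    """Hypothetical stand-in for a text sub-config."""
    model_type = "qwen2_5_vl_text"


class FakeCompositeConfig:
    """Hypothetical stand-in for a composite config that delegates
    instance-level model_type to its text_config."""
    model_type = "qwen2_5_vl"  # class-level registration key

    def __init__(self):
        self.text_config = FakeTextConfig()

    def __getattribute__(self, name):
        # Mimic the delegation: instance lookups of model_type are
        # answered by the text sub-config.
        if name == "model_type":
            return object.__getattribute__(self, "text_config").model_type
        return object.__getattribute__(self, name)


def resolve_top_level_model_type(config):
    """Prefer the class-level model_type; fall back to the instance value.

    Assumed behavior, written from the PR description only.
    """
    # type(config) does not go through the instance's __getattribute__,
    # so the delegation above cannot affect this lookup.
    class_level = getattr(type(config), "model_type", None)
    if isinstance(class_level, str) and class_level:
        return class_level
    # Fallback for objects whose class has no (or an empty) model_type.
    return getattr(config, "model_type", None)


config = FakeCompositeConfig()
print(config.model_type)                     # qwen2_5_vl_text
print(resolve_top_level_model_type(config))  # qwen2_5_vl
```

The fallback keeps the helper usable for config objects that only set model_type on the instance.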

Summary by CodeRabbit

  • Bug Fixes
    • Enhanced model type resolution to ensure accurate chat template selection in standard and multimodal chat completions.

Description

Test Coverage

PR Checklist

Please review the following before submitting your PR:

  • PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.

  • PR follows TRT-LLM CODING GUIDELINES to the best of your knowledge.

  • Test cases are provided for new code paths (see test instructions)

  • Any new dependencies have been scanned for license and vulnerabilities

  • CODEOWNERS updated if ownership changes

  • Documentation updated as needed

  • Update tava architecture diagram if there is a significant design change in PR.

  • The reviewers assigned automatically/manually are appropriate for the PR.

  • Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.


Signed-off-by: tensorrt-cicd <90828364+tensorrt-cicd@users.noreply.github.com>
@JunyiXu-nv JunyiXu-nv requested a review from a team as a code owner May 13, 2026 02:54
@JunyiXu-nv JunyiXu-nv requested a review from schetlur-nv May 13, 2026 02:54
@JunyiXu-nv JunyiXu-nv changed the title from "[nvbugs/6058251][fix] Resolve top-level model_type for composite HF configs" to "[https://nvbugs/6058251][fix] Resolve top-level model_type for composite HF configs" May 13, 2026
@JunyiXu-nv
Collaborator Author

This resolves the conflict for PR #13855, since I can't push to the trt bot's branch.

@JunyiXu-nv
Collaborator Author

/bot run --disable-fail-fast

@coderabbitai
Contributor

coderabbitai Bot commented May 13, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 84ed1c23-8ed6-4b06-aca2-805e0a55cad2

📥 Commits

Reviewing files that changed from the base of the PR and between 28da135 and bb4c4ad.

📒 Files selected for processing (3)
  • tensorrt_llm/serve/chat_utils.py
  • tensorrt_llm/serve/openai_server.py
  • tensorrt_llm/serve/responses_utils.py

📝 Walkthrough


This PR introduces a helper function to correctly resolve top-level model types from composite HuggingFace configurations (avoiding nested delegation), then propagates its use through apply_chat_template and MultimodalDataTracker calls across the serving stack: chat_utils, openai_server, and responses_utils.

Changes

Resolve top-level model type across serving stack

| Layer / File(s) | Summary |
| --- | --- |
| Helper definition and chat_utils integration (tensorrt_llm/serve/chat_utils.py) | New resolve_top_level_model_type() helper extracts top-level model_type from config classes. Used in parse_chat_messages_coroutines to initialize MultimodalDataTracker and select multimodal placeholder strategy with the resolved type. |
| OpenAI server chat completion routes (tensorrt_llm/serve/openai_server.py) | Both openai_chat and openai_mm_encoder handlers now use resolve_top_level_model_type() when calling apply_chat_template instead of directly accessing model_config.model_type. |
| Response token creation (tensorrt_llm/serve/responses_utils.py) | Non-harmony input token creation path now derives model_type via resolve_top_level_model_type() when calling apply_chat_template. |
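
To illustrate why the call sites listed above need the top-level key, here is a minimal sketch of a placeholder lookup keyed by model_type. The PLACEHOLDERS dict, the "<image>" string, and placeholder_for are hypothetical stand-ins, not TensorRT-LLM's actual MULTIMODAL_PLACEHOLDER_REGISTRY; only the error message is taken from the PR description.

```python
# Hypothetical registry: entries are keyed by the top-level model_type.
PLACEHOLDERS = {
    "qwen2_5_vl": {"image": "<image>"},  # illustrative placeholder string
}


def placeholder_for(model_type, modality):
    """Return the placeholder for a modality, or raise if none is registered."""
    entry = PLACEHOLDERS.get(model_type)
    if entry is None or modality not in entry:
        raise TypeError(f"Unknown modality: {modality}")
    return entry[modality]


# With the top-level key, the lookup succeeds.
print(placeholder_for("qwen2_5_vl", "image"))

# With the delegated text sub-config key, nothing is registered,
# so an image request is rejected.
try:
    placeholder_for("qwen2_5_vl_text", "image")
except TypeError as err:
    print(err)
```

Resolving the type once and passing the same value to every lookup keeps the registry key and the chat template selection consistent.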

Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Description check | ❓ Inconclusive | The PR description clearly explains the problem and solution, but Test Coverage and PR Checklist sections are incomplete with no specific tests listed despite the checkbox being marked. | Specify which test cases validate the fix for composite HF configs (e.g., Qwen2_5_VL, Qwen3_VL) and multimodal chat request handling. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and specifically describes the main change: resolving top-level model_type for composite HuggingFace configs, which directly addresses the core fix. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@tensorrt-cicd
Collaborator

PR_Github #48080 [ run ] triggered by Bot. Commit: bb4c4ad Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #48080 [ run ] completed with state SUCCESS. Commit: bb4c4ad
/LLM/main/L0_MergeRequest_PR pipeline #37912 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

@JunyiXu-nv
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #48163 [ run ] triggered by Bot. Commit: bb4c4ad Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #48163 [ run ] completed with state SUCCESS. Commit: bb4c4ad
/LLM/main/L0_MergeRequest_PR pipeline #37985 completed with status: 'FAILURE'


@JunyiXu-nv
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #48183 [ run ] triggered by Bot. Commit: bb4c4ad Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #48183 [ run ] completed with state SUCCESS. Commit: bb4c4ad
/LLM/main/L0_MergeRequest_PR pipeline #38003 completed with status: 'FAILURE'


@JunyiXu-nv
Collaborator Author

/bot run

@tensorrt-cicd
Collaborator

PR_Github #48287 [ run ] triggered by Bot. Commit: bb4c4ad Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #48287 [ run ] completed with state SUCCESS. Commit: bb4c4ad
/LLM/main/L0_MergeRequest_PR pipeline #38099 completed with status: 'SUCCESS'


@JunyiXu-nv JunyiXu-nv requested review from QiJune and removed request for schetlur-nv May 14, 2026 06:19
Collaborator

@QiJune QiJune left a comment

LGTM

@JunyiXu-nv JunyiXu-nv merged commit 9a9d73d into NVIDIA:main May 14, 2026
10 of 13 checks passed

3 participants