[TRTLLM-11768][fix] Config updates to enable NVFP4 (#12776)
Conversation
📝 Walkthrough: Added quantization configuration normalization to the model config loading in `tensorrt_llm/_torch/models/modeling_nemotron_nano.py`.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~12 minutes
Pre-merge checks: ✅ 1 passed | ❌ 2 failed (1 warning, 1 inconclusive)
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `tensorrt_llm/_torch/models/modeling_nemotron_nano.py`:
- Around lines 1487-1492: The code unconditionally sets `llm_model_config._frozen = True` after remapping `quant_config_dict`, which can change caller-visible state and leave the config in the wrong frozen state if an exception occurs. To fix it: save the original value (`orig = llm_model_config._frozen`), set `llm_model_config._frozen = False` before modifying `llm_model_config.quant_config_dict`, perform the remap in a `try` block, and restore `llm_model_config._frozen = orig` in a `finally` block so the original frozen state is preserved even on exceptions (referencing `llm_model_config`, `_frozen`, `quant_config_dict`, and `_LM_PREFIX`).
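The suggested fix can be sketched as follows. This is a hypothetical illustration, not the actual TRT-LLM code: `SimpleNamespace` stands in for the real `llm_model_config` object, and the `_LM_PREFIX` value shown is an assumption for the example.

```python
# Hedged sketch of the review suggestion: preserve the config's original
# frozen state across the quant_config_dict remap, even on exceptions.
from types import SimpleNamespace

_LM_PREFIX = "language_model."  # assumed value, for illustration only


def remap_quant_config(llm_model_config):
    # Save the caller-visible frozen state before mutating the config.
    orig = llm_model_config._frozen
    llm_model_config._frozen = False
    try:
        # Remap checkpoint-style keys onto the language-model submodule names.
        llm_model_config.quant_config_dict = {
            _LM_PREFIX + key: value
            for key, value in llm_model_config.quant_config_dict.items()
        }
    finally:
        # Restore the original frozen state on success *and* on failure.
        llm_model_config._frozen = orig


cfg = SimpleNamespace(_frozen=True, quant_config_dict={"layers.0": "NVFP4"})
remap_quant_config(cfg)
```

With the `finally` block in place, the caller's frozen state survives both the happy path and any exception raised during the remap.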
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 49b54381-a07b-4e0b-b61e-e95a519c7cc0
📒 Files selected for processing (1)
tensorrt_llm/_torch/models/modeling_nemotron_nano.py
* Why? The Nemotron Nano VL model checkpoints for NVFP4 could not be loaded into TRT-LLM.
* What? Makes the necessary config parsing changes to fix this.

Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
/bot run --disable-fail-fast
PR_Github #41978 [ run ] triggered by Bot. Commit:
PR_Github #41978 [ run ] completed with state
Summary by CodeRabbit
Bug Fixes
Description
The Nemotron Nano VL model checkpoints for NVFP4 could not be loaded into TRT-LLM. This PR makes the necessary config parsing changes to fix this.
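As a side note on the frozen-state handling flagged in the review comment, an exception-safe mutation of a config object that exposes a `_frozen` flag can be packaged as a small context manager. This is a hypothetical helper sketch under that assumption, not a TRT-LLM API:

```python
# Hypothetical helper: temporarily unfreeze a config exposing a `_frozen`
# flag, restoring the original state even if the body raises.
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def unfrozen(config):
    orig = config._frozen
    config._frozen = False
    try:
        yield config
    finally:
        # The caller never observes a wrong frozen state, even on error.
        config._frozen = orig


cfg = SimpleNamespace(_frozen=True, quant_config_dict={})
with unfrozen(cfg):
    cfg.quant_config_dict = {"language_model.layers.0": "NVFP4"}
```

Packaging the save/restore in one place avoids repeating the `try`/`finally` dance at every call site that needs to edit a frozen config.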
Test Coverage
PR Checklist
Please review the following before submitting your PR:
PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
Test cases are provided for new code paths (see test instructions)
Any new dependencies have been scanned for license and vulnerabilities
CODEOWNERS updated if ownership changes
Documentation updated as needed
Update tava architecture diagram if there is a significant design change in PR.
The reviewers assigned automatically/manually are appropriate for the PR.
Please check this after reviewing the above items as appropriate for this PR.
GitHub Bot Help
To see a list of available CI bot commands, please comment `/bot help`.