[None][fix] Add GlmMoeDsaForCausalLM to EPLB supported model list #12607
qiaoxj07 merged 2 commits into NVIDIA:main
Conversation
GlmMoeDsaForCausalLM (GLM-5) uses the DeepSeekV3 MoE architecture but was missing from moe_model_arch_list. This caused setup() to never be called on the load-balancer config, so accessing num_local_slots during model init raised ValueError. Adding it to the list enables EPLB for GLM-5.
Signed-off-by: Xianjie <5410381+qiaoxj07@users.noreply.github.com>
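The failure mode described above can be sketched as a small gate. This is a minimal illustration, not the actual TensorRT-LLM internals: only moe_model_arch_list, setup(), num_local_slots, and the GlmMoeDsaForCausalLM entry mirror names from this PR; the classes and the helper are stand-ins.

```python
# Minimal sketch of the gating bug, using simplified stand-in classes.
# Only moe_model_arch_list, setup(), and num_local_slots mirror real names.
moe_model_arch_list = [
    "DeepseekV3ForCausalLM",   # illustrative entry
    "GlmMoeDsaForCausalLM",    # the fix: GLM-5 reuses the DeepSeekV3 MoE design
]


class LoadBalancerConfig:
    """Stand-in for the EPLB load-balancer config."""

    def __init__(self, num_slots: int):
        self.num_slots = num_slots
        self._num_local_slots = None  # only populated by setup()

    def setup(self, ep_size: int) -> None:
        # Split the global expert slots across expert-parallel ranks.
        self._num_local_slots = self.num_slots // ep_size

    @property
    def num_local_slots(self) -> int:
        if self._num_local_slots is None:
            # The error reported in the PR when setup() was skipped.
            raise ValueError("Cannot calculate num_local_slots")
        return self._num_local_slots


def maybe_setup_load_balancer(arch: str, config: LoadBalancerConfig,
                              ep_size: int) -> LoadBalancerConfig:
    # Before the fix, "GlmMoeDsaForCausalLM" was absent from the list, so
    # setup() never ran and any later num_local_slots access raised.
    if arch in moe_model_arch_list:
        config.setup(ep_size)
    return config
```

With the architecture in the list, `maybe_setup_load_balancer("GlmMoeDsaForCausalLM", LoadBalancerConfig(256), 4)` yields `num_local_slots == 64`; dropping the entry from the list reproduces the `ValueError`.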
/bot run --disable-fail-fast
⚠️ Outside diff range comments (1)
tensorrt_llm/_torch/modules/fused_moe/moe_load_balancer.py (1)
Line 1: ⚠️ Potential issue | 🟠 Major: Add NVIDIA copyright header in this modified Python file.
This file is modified but the provided content has no NVIDIA OSS copyright header at the top. As per coding guidelines, "All TensorRT-LLM Open Source Software code should contain an NVIDIA copyright header that includes the year of its latest meaningful modification."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tensorrt_llm/_torch/modules/fused_moe/moe_load_balancer.py` at line 1, This file is missing the required NVIDIA OSS copyright header; add the standard NVIDIA copyright header (with the year of the latest meaningful modification) at the very top of the file before the first statement (before the existing "import ctypes") so the module tensorrt_llm/_torch/modules/fused_moe/moe_load_balancer.py contains the correct license header.
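Mechanically, the fix the prompt asks for is prepending a header block before the first statement. A hedged sketch follows; the SPDX wording used here is a common NVIDIA form and is an assumption, since the repository's exact header template is not shown in this review.

```python
# Hedged sketch of applying the requested header. The exact NVIDIA header text
# in the repository may differ from this common SPDX form (an assumption here).
NVIDIA_HEADER = (
    "# SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & "
    "AFFILIATES. All rights reserved.\n"
    "# SPDX-License-Identifier: Apache-2.0\n"
)


def prepend_header(source: str, header: str = NVIDIA_HEADER) -> str:
    """Insert the header before the first statement, idempotently."""
    if source.startswith(header):
        return source  # already present; don't duplicate it
    return header + source
```

Applied to a file beginning with `import ctypes`, the header lands above the import; running it twice leaves the file unchanged.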
📒 Files selected for processing (1)
tensorrt_llm/_torch/modules/fused_moe/moe_load_balancer.py
PR_Github #40849 [ run ] triggered by Bot. Commit:
PR_Github #40849 [ run ] completed with state
/bot run --disable-fail-fast
PR_Github #41051 [ run ] triggered by Bot. Commit:
PR_Github #41051 [ run ] completed with state
/bot run --disable-fail-fast
PR_Github #41158 [ run ] triggered by Bot. Commit:
PR_Github #41158 [ run ] completed with state
/bot run --disable-fail-fast
PR_Github #41283 [ run ] triggered by Bot. Commit:
PR_Github #41283 [ run ] completed with state
/bot run --disable-fail-fast
PR_Github #41348 [ run ] triggered by Bot. Commit:
PR_Github #41348 [ run ] completed with state
…IDIA#12607) Signed-off-by: Xianjie <5410381+qiaoxj07@users.noreply.github.com>
Summary
- GLM-5 (GlmMoeDsaForCausalLM) uses the DeepSeekV3 MoE architecture but was missing from moe_model_arch_list in moe_load_balancer.py.
- When moe_config.load_balancer.num_slots is set, maybe_create_moe_load_balancer() skips setup() because the arch is not in the list, but interface.py still accesses num_local_slots (which requires setup()), causing ValueError: Cannot calculate num_local_slots.
- Add GlmMoeDsaForCausalLM to the supported architecture list.

Test plan
- Serving GLM-5 with moe_config.load_balancer.num_slots=256 no longer crashes during model init.

🤖 Generated with Claude Code
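The setting exercised in the test plan, written out as a config fragment: this is a hypothetical extra-options file whose nesting simply mirrors the PR's dotted path moe_config.load_balancer.num_slots=256, and may not match the exact TensorRT-LLM schema.

```yaml
# Hypothetical extra LLM-API options; nesting mirrors the PR's dotted path
# moe_config.load_balancer.num_slots and may differ from the real schema.
moe_config:
  load_balancer:
    num_slots: 256
```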