
[None][fix] Add GlmMoeDsaForCausalLM to EPLB supported model list#12607

Merged
qiaoxj07 merged 2 commits into NVIDIA:main from qiaoxj07:fix/eplb-glm5-support
Apr 2, 2026

Conversation

@qiaoxj07
Collaborator

@qiaoxj07 qiaoxj07 commented Mar 31, 2026

Summary

  • GLM-5 (GlmMoeDsaForCausalLM) uses the DeepSeekV3 MoE architecture but was missing from moe_model_arch_list in moe_load_balancer.py.
  • When moe_config.load_balancer.num_slots is set, maybe_create_moe_load_balancer() skips setup() because the arch is not in the list, but interface.py still accesses num_local_slots (which requires setup()), causing ValueError: Cannot calculate num_local_slots.
  • Fix: add GlmMoeDsaForCausalLM to the supported architecture list.

Test plan

  • Verify GLM-5 with moe_config.load_balancer.num_slots=256 no longer crashes during model init
  • Verify existing EPLB models (DeepSeek V3, Qwen3 MoE) are unaffected
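The gating behavior described above can be sketched as follows. This is an illustrative simplification, not the actual TensorRT-LLM source; the class and function names are hypothetical stand-ins for the logic in moe_load_balancer.py and interface.py:

```python
# Illustrative sketch of the failure mode this PR fixes (simplified;
# not the real moe_load_balancer.py). Architectures outside the list
# skip setup(), so a later read of num_local_slots raises ValueError.

moe_model_arch_list = [
    "DeepseekV3ForCausalLM",
    "Qwen3MoeForCausalLM",
    "GlmMoeDsaForCausalLM",  # the fix: GLM-5 reuses the DeepSeekV3 MoE path
]


class MoeLoadBalancerConfig:
    def __init__(self, num_slots: int):
        self.num_slots = num_slots
        self._num_local_slots = None

    def setup(self, ep_size: int) -> None:
        # Only called for architectures in moe_model_arch_list.
        self._num_local_slots = self.num_slots // ep_size

    @property
    def num_local_slots(self) -> int:
        if self._num_local_slots is None:
            raise ValueError("Cannot calculate num_local_slots")
        return self._num_local_slots


def maybe_setup(config: MoeLoadBalancerConfig, model_arch: str, ep_size: int) -> None:
    # Mirrors the arch check in maybe_create_moe_load_balancer().
    if model_arch in moe_model_arch_list:
        config.setup(ep_size)


cfg = MoeLoadBalancerConfig(num_slots=256)
maybe_setup(cfg, "GlmMoeDsaForCausalLM", ep_size=8)
print(cfg.num_local_slots)  # -> 32 once the arch is in the list
```

Before the fix, the same sequence with "GlmMoeDsaForCausalLM" absent from the list would skip `setup()` and hit the `ValueError` on the property access during model init.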

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features
    • Added support for GlmMoeDsaForCausalLM model architecture in the MOE load balancer.

GlmMoeDsaForCausalLM (GLM-5) uses the DeepSeekV3 MoE architecture but
was missing from moe_model_arch_list. This caused setup() to never be
called on the load balancer config, so accessing num_local_slots during
model init raised ValueError. Adding it to the list enables EPLB for
GLM-5.

Signed-off-by: Xianjie <5410381+qiaoxj07@users.noreply.github.com>
@qiaoxj07 qiaoxj07 requested a review from a team as a code owner March 31, 2026 03:55
@qiaoxj07 qiaoxj07 requested a review from yuxianq March 31, 2026 03:55
@qiaoxj07
Collaborator Author

/bot run --disable-fail-fast

@coderabbitai
Contributor

coderabbitai bot commented Mar 31, 2026

📝 Walkthrough

Walkthrough

This change extends the moe_model_arch_list to include support for a new model architecture, 'GlmMoeDsaForCausalLM', in the MOE load balancer module. No control flow or logic modifications were made.

Changes

Cohort / File(s) Summary
MOE Model Architecture Support
tensorrt_llm/_torch/modules/fused_moe/moe_load_balancer.py
Added 'GlmMoeDsaForCausalLM' to the supported model architectures list in moe_model_arch_list.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
  • Title check — ✅ Passed: The title clearly and concisely describes the main change: adding GlmMoeDsaForCausalLM to the supported model list. It follows the required format with the [None][fix] prefix and directly relates to the changeset.
  • Description check — ✅ Passed: The description provides a clear summary of the issue, explains the root cause, describes the fix, and includes a comprehensive test plan. All critical sections are addressed, making the PR's intent and changes understandable.
  • Docstring Coverage — ✅ Passed: No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor

@coderabbitai coderabbitai bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tensorrt_llm/_torch/modules/fused_moe/moe_load_balancer.py (1)

1-1: ⚠️ Potential issue | 🟠 Major

Add NVIDIA copyright header in this modified Python file.

This file is modified but the provided content has no NVIDIA OSS copyright header at the top.

As per coding guidelines, "All TensorRT-LLM Open Source Software code should contain an NVIDIA copyright header that includes the year of its latest meaningful modification."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tensorrt_llm/_torch/modules/fused_moe/moe_load_balancer.py` at line 1, This
file is missing the required NVIDIA OSS copyright header; add the standard
NVIDIA copyright header (with the year of the latest meaningful modification) at
the very top of the file before the first statement (before the existing "import
ctypes") so the module
tensorrt_llm/_torch/modules/fused_moe/moe_load_balancer.py contains the correct
license header.
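For illustration only: a header in the SPDX style that many NVIDIA open-source repositories use might look like the sketch below. The exact wording and license are repository-specific (TensorRT-LLM is Apache-2.0 licensed, but the canonical header text should be copied from the repo's existing files rather than from this sketch):

```python
# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES.
# All rights reserved.
# SPDX-License-Identifier: Apache-2.0

import ctypes  # the file's existing first statement follows the header
```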

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 61380d38-9552-495d-ace6-636249243410

📥 Commits

Reviewing files that changed from the base of the PR and between f6db7e3 and 8a114de.

📒 Files selected for processing (1)
  • tensorrt_llm/_torch/modules/fused_moe/moe_load_balancer.py

@qiaoxj07 qiaoxj07 requested a review from dc3671 March 31, 2026 03:59
@tensorrt-cicd
Collaborator

PR_Github #40849 [ run ] triggered by Bot. Commit: 8a114de Link to invocation

Collaborator

@dc3671 dc3671 left a comment


LGTM

@dc3671 dc3671 requested a review from xxi-nv March 31, 2026 04:17
@tensorrt-cicd
Collaborator

PR_Github #40849 [ run ] completed with state SUCCESS. Commit: 8a114de
/LLM/main/L0_MergeRequest_PR pipeline #31858 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@qiaoxj07
Collaborator Author

qiaoxj07 commented Apr 1, 2026

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #41051 [ run ] triggered by Bot. Commit: 8a114de Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #41051 [ run ] completed with state FAILURE. Commit: 8a114de
/LLM/main/L0_MergeRequest_PR pipeline #32027 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@qiaoxj07
Collaborator Author

qiaoxj07 commented Apr 1, 2026

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #41158 [ run ] triggered by Bot. Commit: 6c52a3f Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #41158 [ run ] completed with state SUCCESS. Commit: 6c52a3f
/LLM/main/L0_MergeRequest_PR pipeline #32127 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@qiaoxj07
Collaborator Author

qiaoxj07 commented Apr 1, 2026

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #41283 [ run ] triggered by Bot. Commit: 6c52a3f Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #41283 [ run ] completed with state FAILURE. Commit: 6c52a3f
/LLM/main/L0_MergeRequest_PR pipeline #32241 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

Link to invocation

@qiaoxj07
Collaborator Author

qiaoxj07 commented Apr 2, 2026

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #41348 [ run ] triggered by Bot. Commit: 6c52a3f Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #41348 [ run ] completed with state SUCCESS. Commit: 6c52a3f
/LLM/main/L0_MergeRequest_PR pipeline #32296 completed with status: 'SUCCESS'

CI Report

Link to invocation

@qiaoxj07 qiaoxj07 merged commit 5c1c1e2 into NVIDIA:main Apr 2, 2026
5 checks passed
karen-sy pushed a commit to karen-sy/TensorRT-LLM that referenced this pull request Apr 7, 2026
…IDIA#12607)

Signed-off-by: Xianjie <5410381+qiaoxj07@users.noreply.github.com>