fix: prefer moe_config for num_experts in apply_ac #1361

Merged
akoumpa merged 1 commit into main from claude/sharp-cray on Feb 24, 2026


@hemildesai
Contributor

Summary

  • In apply_ac, num_experts is looked up from the model config, but the field name varies across HuggingFace model configs (num_experts, moe_num_experts, n_routed_experts, etc.)
  • apply_ac now prioritizes moe_config.n_routed_experts (which has a stable, well-defined field name) and only then falls back to scanning model.config attributes
  • Added tests verifying moe_config derivation and its priority over config attributes
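
For readers who want the lookup order spelled out, here is a minimal sketch of the priority described above. It is not the code from this PR: the helper name `resolve_num_experts`, the assumption that `moe_config` hangs directly off the model object, and the exact fallback attribute list are all hypothetical and chosen only for illustration.

```python
from types import SimpleNamespace

# Hypothetical fallback names, taken from the examples listed in the summary.
_FALLBACK_ATTRS = ("num_experts", "moe_num_experts", "n_routed_experts")


def resolve_num_experts(model):
    """Illustrative helper (assumed name), not the actual apply_ac logic."""
    # 1. Prefer moe_config.n_routed_experts, whose field name is stable.
    #    Assumption: moe_config is an attribute of the model object.
    moe_config = getattr(model, "moe_config", None)
    n_experts = getattr(moe_config, "n_routed_experts", None)
    if n_experts is not None:
        return n_experts

    # 2. Otherwise scan model.config for any of the known field names.
    config = getattr(model, "config", None)
    for name in _FALLBACK_ATTRS:
        n_experts = getattr(config, name, None)
        if n_experts is not None:
            return n_experts

    # 3. Nothing found: treat the model as having no routed experts.
    return None


# Stand-in models built from SimpleNamespace, purely for demonstration.
hf_only = SimpleNamespace(config=SimpleNamespace(moe_num_experts=8))
both = SimpleNamespace(
    moe_config=SimpleNamespace(n_routed_experts=64),
    config=SimpleNamespace(num_experts=8),
)
dense = SimpleNamespace(config=SimpleNamespace(hidden_size=1024))
```

With these stand-ins, `resolve_num_experts(both)` returns the `moe_config` value even though `config.num_experts` is also set, which is the priority the two new tests are described as checking.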

Test plan

  • test_apply_ac_derives_num_experts_from_moe_config — verifies moe_config path works
  • test_apply_ac_prefers_moe_config_over_config_attrs — verifies moe_config takes priority
  • All 38 existing parallelizer tests continue to pass

🤖 Generated with Claude Code

The num_experts config field name varies across HuggingFace model configs.
Prioritize moe_config.n_routed_experts (which has a stable field name)
before falling back to model.config attribute scanning.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Hemil Desai <hemild@nvidia.com>
@copy-pr-bot

copy-pr-bot Bot commented Feb 24, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@hemildesai
Contributor Author

/ok to test 47e6fa4

@akoumpa added the r0.3.0 label (Add for cherry-pick into release branch r0.3.0) on Feb 24, 2026
@akoumpa merged commit d669eac into main on Feb 24, 2026
51 checks passed
@akoumpa deleted the claude/sharp-cray branch on February 24, 2026 13:13
svcnvidia-nemo-ci pushed a commit that referenced this pull request Feb 24, 2026
The num_experts config field name varies across HuggingFace model configs.
Prioritize moe_config.n_routed_experts (which has a stable field name)
before falling back to model.config attribute scanning.

Signed-off-by: Hemil Desai <hemild@nvidia.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: NeMo Bot <nemo-bot@nvidia.com>
thomasdhc pushed a commit that referenced this pull request Feb 24, 2026
…`r0.3.0` (#1362)

fix: prefer moe_config for num_experts in apply_ac (#1361)

The num_experts config field name varies across HuggingFace model configs.
Prioritize moe_config.n_routed_experts (which has a stable field name)
before falling back to model.config attribute scanning.

Signed-off-by: Hemil Desai <hemild@nvidia.com>
Signed-off-by: NeMo Bot <nemo-bot@nvidia.com>
Co-authored-by: Hemil Desai <hemild@nvidia.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
akoumpa pushed a commit that referenced this pull request Feb 25, 2026
The num_experts config field name varies across HuggingFace model configs.
Prioritize moe_config.n_routed_experts (which has a stable field name)
before falling back to model.config attribute scanning.

Signed-off-by: Hemil Desai <hemild@nvidia.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
linnanwang pushed a commit that referenced this pull request Apr 24, 2026
The num_experts config field name varies across HuggingFace model configs.
Prioritize moe_config.n_routed_experts (which has a stable field name)
before falling back to model.config attribute scanning.

Signed-off-by: Hemil Desai <hemild@nvidia.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Labels

r0.3.0 Add for cherry-pick into release branch r0.3.0

3 participants