
fix: update GLM 4.7 Flash TE DeepEP finetuning config #1401

Merged

HuiyingLi merged 1 commit into main from claude/elated-mccarthy on Feb 27, 2026
Conversation

hemildesai (Contributor) commented on Feb 27, 2026

Summary

  • Remove the standalone moe_config section (previously backed by MoEParallelizerConfig)
  • Add sequence_parallel, activation_checkpointing, and a moe subsection (reshard_after_forward, wrap_outer_model) to the distributed block (see the sketch below)
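
A minimal sketch of the resulting config shape, assuming a YAML recipe layout; the nesting and boolean values here are illustrative assumptions, and only the field names listed above come from this change:

```yaml
# Before (removed): standalone moe_config section backed by MoEParallelizerConfig.
# moe_config:
#   ...                              # exact contents not shown in this PR

# After: MoE settings live under the distributed block.
distributed:
  sequence_parallel: true            # new field
  activation_checkpointing: true     # new field
  moe:                               # new subsection
    reshard_after_forward: true
    wrap_outer_model: true
```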

🤖 Generated with Claude Code

fix: update GLM 4.7 Flash TE DeepEP config to use distributed MoE settings

Replace standalone moe_config section with moe settings under distributed
block and add sequence_parallel/activation_checkpointing fields.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Hemil Desai <hemild@nvidia.com>
copy-pr-bot (Bot) commented on Feb 27, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@hemildesai hemildesai added the docs-only With great power comes great responsibility. label Feb 27, 2026
hemildesai (Contributor, Author) commented:

/ok to test afaab4b

@HuiyingLi HuiyingLi merged commit 29977ff into main Feb 27, 2026
33 checks passed
@HuiyingLi HuiyingLi deleted the claude/elated-mccarthy branch February 27, 2026 03:55
linnanwang pushed a commit that referenced this pull request on Apr 24, 2026
fix: update GLM 4.7 Flash TE DeepEP config to use distributed MoE settings

Replace standalone moe_config section with moe settings under distributed
block and add sequence_parallel/activation_checkpointing fields.

Signed-off-by: Hemil Desai <hemild@nvidia.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
