
fix: propagate moe_token_dispatcher_type in bridge model provider#1737

Merged: zhuzilin merged 1 commit into THUDM:main from nanjiangwill:fix/bridge-moe-token-dispatcher-type on Mar 22, 2026
@nanjiangwill (Collaborator) commented Mar 18, 2026

Fixes #1725

  • Propagate moe_token_dispatcher_type from args to the bridge provider in model_provider.py. The assignment is guarded by hasattr for compatibility with newer Megatron versions, where the attribute may no longer exist on args.
  • Fixes the error "ValueError: Token dispatcher type: allgather does not support variable sequence length", which finalize() raises in bridge mode because slime always sets variable_seq_lengths=True.
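The guarded propagation described in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the actual diff: `propagate_dispatcher_type` is a hypothetical helper name, and `SimpleNamespace` objects stand in for the real args and bridge provider.

```python
from types import SimpleNamespace


def propagate_dispatcher_type(args, provider):
    """Copy moe_token_dispatcher_type from args to provider if args has it.

    Hypothetical helper for illustration; the real change lives in
    model_provider.py and may be structured differently.
    """
    if hasattr(args, "moe_token_dispatcher_type"):
        provider.moe_token_dispatcher_type = args.moe_token_dispatcher_type
    # If the attribute is absent from args, the provider is left untouched.
    return provider


# Stand-in objects (assumptions, not the real Megatron/slime types).
args = SimpleNamespace(moe_token_dispatcher_type="alltoall")
provider = SimpleNamespace(
    moe_token_dispatcher_type="allgather", variable_seq_lengths=True
)
propagate_dispatcher_type(args, provider)

# Args without the attribute: the provider keeps its own value.
newer_args = SimpleNamespace()
untouched = propagate_dispatcher_type(
    newer_args, SimpleNamespace(moe_token_dispatcher_type="allgather")
)
```

Using hasattr rather than a direct attribute access means the same code path works whether or not args still carries the field.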

Bridge provider was missing moe_token_dispatcher_type propagation.
Since slime always sets variable_seq_lengths=True and allgather
dispatcher is incompatible with it, finalize() raises a ValueError.

Propagate moe_token_dispatcher_type from args (already corrected to
alltoall by validate_args) when the attribute exists. For newer
Megatron versions where the attribute was removed from args, the
provider is left untouched.
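To illustrate the failure mode, here is a toy reproduction. `ToyProvider` and its `finalize()` check are assumptions that only mimic the error quoted in this PR; the real validation is performed by the provider's own finalize() and its exact condition is not shown here.

```python
from dataclasses import dataclass


@dataclass
class ToyProvider:
    # Assumed starting value on the provider before the fix (illustrative only).
    moe_token_dispatcher_type: str = "allgather"
    # slime always sets this to True, per the PR description.
    variable_seq_lengths: bool = True

    def finalize(self):
        # Toy stand-in for the real validation.
        if self.variable_seq_lengths and self.moe_token_dispatcher_type == "allgather":
            raise ValueError(
                f"Token dispatcher type: {self.moe_token_dispatcher_type} "
                "does not support variable sequence length"
            )


def try_finalize(provider):
    """Return the error message if finalize() fails, else None."""
    try:
        provider.finalize()
    except ValueError as exc:
        return str(exc)
    return None


# Without propagation, the provider keeps allgather and finalize() fails.
before = try_finalize(ToyProvider())
# With the value propagated from args (alltoall), finalize() passes.
after = try_finalize(ToyProvider(moe_token_dispatcher_type="alltoall"))
```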

Closes THUDM#1725

Made-with: Cursor
@nanjiangwill nanjiangwill requested a review from zhuzilin March 18, 2026 05:59
@zhuzilin zhuzilin merged commit 73a1f4d into THUDM:main Mar 22, 2026
2 checks passed
@nanjiangwill nanjiangwill deleted the fix/bridge-moe-token-dispatcher-type branch March 22, 2026 14:31
Development

Successfully merging this pull request may close these issues.

[bug] Bridge provider missing moe_token_dispatcher_type propagation with variable_seq_lengths

2 participants