
fix: disable rope_fusion when context parallelism (cp > 1) is enabled #1440

Merged
HuiyingLi merged 1 commit into main from worktree-disable-rope-fusion-cp on Mar 4, 2026

Conversation

hemildesai (Contributor) commented Mar 3, 2026

Summary

Related to #1439

  • Fused RoPE (rope_fusion) is broken when context parallelism (cp_size > 1) is enabled. This PR adds a guard in both the LLM (train_ft.py) and VLM (finetune.py) recipe setup() methods that automatically disables rope_fusion before model construction when cp_size > 1; see the sketch below.
  • Adds 6 unit tests (3 per recipe) covering three cases: cp > 1 disables rope_fusion, cp == 1 leaves it unchanged, and an already-disabled setting stays disabled.

🤖 Generated with Claude Code
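
Below is a minimal sketch of the guard described above, assuming a config object with `distributed.cp_size` and `model.rope_fusion` attributes; the actual field names and recipe structure in train_ft.py / finetune.py may differ.

```python
import logging

def setup(cfg):
    """Hypothetical recipe setup(); only the rope_fusion guard is shown."""
    cp_size = getattr(cfg.distributed, "cp_size", 1)
    if cp_size > 1 and getattr(cfg.model, "rope_fusion", False):
        # Fused RoPE kernels are incompatible with context parallelism,
        # so force the flag off before the model is constructed.
        logging.warning(
            "rope_fusion is not supported with context parallelism "
            "(cp_size=%d); disabling it.",
            cp_size,
        )
        cfg.model.rope_fusion = False
    # ... continue with normal model construction ...
```

The unit tests described above could look roughly like this, again with placeholder config names and pytest-style assertions:

```python
from types import SimpleNamespace

def _make_cfg(cp_size, rope_fusion):
    # Build a minimal stand-in for the recipe config.
    return SimpleNamespace(
        distributed=SimpleNamespace(cp_size=cp_size),
        model=SimpleNamespace(rope_fusion=rope_fusion),
    )

def test_cp_gt_1_disables_rope_fusion():
    cfg = _make_cfg(cp_size=2, rope_fusion=True)
    setup(cfg)
    assert cfg.model.rope_fusion is False

def test_cp_eq_1_leaves_rope_fusion_unchanged():
    cfg = _make_cfg(cp_size=1, rope_fusion=True)
    setup(cfg)
    assert cfg.model.rope_fusion is True

def test_already_disabled_rope_fusion_stays_disabled():
    cfg = _make_cfg(cp_size=2, rope_fusion=False)
    setup(cfg)
    assert cfg.model.rope_fusion is False
```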

copy-pr-bot (Bot) commented Mar 3, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

hemildesai force-pushed the worktree-disable-rope-fusion-cp branch from ba3e60b to f1edde7 on March 3, 2026 20:18
hemildesai (Contributor, Author) commented:

/ok to test f1edde7

Fused RoPE is incompatible with context parallelism (cp > 1). This adds
a guard in both the LLM and VLM recipe setup() methods to automatically
set rope_fusion=False before model construction when cp_size > 1.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Hemil Desai <hemild@nvidia.com>
hemildesai force-pushed the worktree-disable-rope-fusion-cp branch from f1edde7 to 24dc2ad on March 3, 2026 22:21
hemildesai (Contributor, Author) commented:

/ok to test 24dc2ad

HuiyingLi (Contributor) left a comment

LGTM thanks!

HuiyingLi merged commit d83a268 into main on Mar 4, 2026
53 checks passed
HuiyingLi deleted the worktree-disable-rope-fusion-cp branch on March 4, 2026 02:32
SwekeR-463 pushed a commit to SwekeR-463/Automodel that referenced this pull request Mar 11, 2026
…NVIDIA-NeMo#1440)

Fused RoPE is incompatible with context parallelism (cp > 1). This adds
a guard in both the LLM and VLM recipe setup() methods to automatically
set rope_fusion=False before model construction when cp_size > 1.

Signed-off-by: Hemil Desai <hemild@nvidia.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: SwekeR-463 <swekerswasti@gmail.com>
linnanwang pushed a commit that referenced this pull request Apr 24, 2026
…#1440)

Fused RoPE is incompatible with context parallelism (cp > 1). This adds
a guard in both the LLM and VLM recipe setup() methods to automatically
set rope_fusion=False before model construction when cp_size > 1.

Signed-off-by: Hemil Desai <hemild@nvidia.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
