Move TE cross entropy guard to training args by yaoyu-33 · Pull Request #5162 · NVIDIA/Megatron-LM

yaoyu-33 · 2026-06-04T16:42:55Z

Summary

Follow-up to #5115. This keeps the TE cross entropy fusion safety guard in validate_args, but removes it from ModelParallelConfig.__post_init__ so programmatic config construction is not blocked.

Allow ModelParallelConfig(cross_entropy_loss_fusion=True, cross_entropy_fusion_impl='te') to be constructed.
Keep the training CLI / args validation assertion for the unsafe combination.
Update unit coverage to reflect that split: core config can represent the setting, training args reject it.
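
A minimal sketch of the split described above, not the actual Megatron-LM code. `SketchModelParallelConfig` and `sketch_validate_args` are hypothetical stand-ins for `ModelParallelConfig` and `validate_args`. The `"native"` default and the exact guard condition are assumptions for illustration; the real check may involve more conditions.

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass
class SketchModelParallelConfig:
    """Hypothetical stand-in for ModelParallelConfig (only the two fields named in this PR)."""

    cross_entropy_loss_fusion: bool = False
    cross_entropy_fusion_impl: str = "native"  # assumed default, for illustration only
    # Deliberately no __post_init__ guard: the config can represent the TE setting.


def sketch_validate_args(args):
    """Hypothetical stand-in for the training-side guard kept in validate_args."""
    # Assumption: the unsafe combination is fusion enabled together with the 'te' impl.
    assert not (
        args.cross_entropy_loss_fusion and args.cross_entropy_fusion_impl == "te"
    ), "TE cross entropy fusion is rejected by training args validation"
    return args


# Programmatic construction is not blocked.
cfg = SketchModelParallelConfig(
    cross_entropy_loss_fusion=True, cross_entropy_fusion_impl="te"
)

# The same combination is still rejected on the training args path.
unsafe_args = SimpleNamespace(
    cross_entropy_loss_fusion=True, cross_entropy_fusion_impl="te"
)
try:
    sketch_validate_args(unsafe_args)
    rejected = False
except AssertionError:
    rejected = True
```

The point of the design is that library users who build configs directly decide for themselves, while the training entry point keeps the safety check.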

Test Plan

UV_CACHE_DIR=/home/yuya/Projects/Megatron-LM/.uv-cache uv run isort --check-only tests/unit_tests/test_model_parallel_config.py
UV_CACHE_DIR=/home/yuya/Projects/Megatron-LM/.uv-cache uv run black --check tests/unit_tests/test_model_parallel_config.py megatron/core/model_parallel_config.py
PYTHONPYCACHEPREFIX=/home/yuya/Projects/Megatron-LM/.pycache-check /home/yuya/mypython/bin/python -m py_compile megatron/core/model_parallel_config.py tests/unit_tests/test_model_parallel_config.py

Local focused pytest could not complete on this workstation. A direct run is blocked by the local nvidia-resiliency-ext dev version assertion (0.6.0.dev69 compares below the required 0.6.0). Masking NVRx as unavailable gets past that, but the local CUDA/PyTorch driver mismatch then segfaults during import.

copy-pr-bot · 2026-06-04T16:42:59Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

yaoyu-33 · 2026-06-04T21:17:57Z

/ok to test bf436d6

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

ko3n1g · 2026-06-05T12:19:23Z

/ok to test 288795f

svcnvidia-nemo-ci · 2026-06-05T12:55:25Z

🔄 Merge queue validation started!

You can track the progress here: https://github.com/NVIDIA/Megatron-LM/actions/runs/27016083461

yaoyu-33 added the Run tests label Jun 4, 2026

Move TE cross entropy guard to training args

bf436d6

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

yaoyu-33 force-pushed the yuya/allow-te-ce-config branch from 0446db6 to bf436d6 June 4, 2026 16:48

yaoyu-33 marked this pull request as ready for review June 4, 2026 16:53

yaoyu-33 requested review from a team as code owners June 4, 2026 16:53

svcnvidia-nemo-ci added the Final Review (PR is in the "final review" stage) and complexity: low labels Jun 4, 2026

cuichenx approved these changes Jun 4, 2026

deepakn94 approved these changes Jun 4, 2026

svcnvidia-nemo-ci added the Approved (All necessary approvals have been made) label and removed the Final Review (PR is in the "final review" stage) label Jun 4, 2026

cuichenx added the Final Review (PR is in the "final review" stage) and Run MBridge tests (Attach this for testing this PR against MBridge main) labels and removed the Approved (All necessary approvals have been made) label Jun 4, 2026

yaoyu-33 added the Approved (All necessary approvals have been made) label and removed the Final Review (PR is in the "final review" stage) label Jun 4, 2026

yaoyu-33 enabled auto-merge June 4, 2026 17:42

ko3n1g added the core_r0.18.0 label (Auto-cherrypick to release branch. Apply before merge; cherrypick happens after merge.) Jun 4, 2026

copy-pr-bot Bot temporarily deployed to public June 4, 2026 21:18 Inactive

copy-pr-bot Bot temporarily deployed to test June 4, 2026 21:18 Inactive

copy-pr-bot Bot temporarily deployed to public June 4, 2026 21:21 Inactive

copy-pr-bot Bot temporarily deployed to public June 4, 2026 21:22 Inactive

test: make TE cross entropy args validation test DP-safe

288795f

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

copy-pr-bot Bot temporarily deployed to public June 5, 2026 12:19 Inactive

copy-pr-bot Bot temporarily deployed to test June 5, 2026 12:20 Inactive

yaoyu-33 added this pull request to the merge queue Jun 5, 2026

Merged via the queue into NVIDIA:main with commit b574499 Jun 5, 2026
170 of 173 checks passed

yaoyu-33 deleted the yuya/allow-te-ce-config branch June 5, 2026 13:40

ko3n1g mentioned this pull request Jun 5, 2026

cp: Move TE cross entropy guard to training args (5162) into core_r0.18.0 #5185

Draft

yaoyu-33 merged 2 commits into NVIDIA:main from yaoyu-33:yuya/allow-te-ce-config

5 participants
