Feature/parallelism #72

Merged
philipp-fischer merged 6 commits into develop from feature/parallelism
Feb 18, 2025

Conversation

Collaborator

@philipp-fischer philipp-fischer commented Feb 17, 2025

Fix global state save/restore and improve the docs on save/restore in a distributed setting. Addresses #71.

Also remove the `src_rank` argument from `restore_state_global`, since it would not work with tensor parallelism.
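
For readers unfamiliar with the tensor-parallelism point, here is a minimal sketch of one plausible reading of it: ranks in the same tensor-parallel group consume the same data, so each of them needs the loader state of its data-parallel rank, and a single source rank is not a natural fit. This is plain Python and does not use the energon API. The helper names `dp_rank_of` and `assign_states`, and the layout in which tensor-parallel ranks are contiguous, are assumptions made for illustration only.

```python
def dp_rank_of(global_rank: int, tp_size: int) -> int:
    """Data-parallel rank of a global rank.

    Assumption: tensor-parallel ranks are contiguous, i.e. global ranks
    [0, tp_size) form the first tensor-parallel group, and so on.
    """
    return global_rank // tp_size


def assign_states(states_per_dp_rank: list, world_size: int, tp_size: int) -> dict:
    """Map every global rank to the loader state of its data-parallel rank."""
    assert world_size % tp_size == 0, "world size must be divisible by TP size"
    assert len(states_per_dp_rank) == world_size // tp_size, "one state per DP rank"
    return {
        rank: states_per_dp_rank[dp_rank_of(rank, tp_size)]
        for rank in range(world_size)
    }


# Hypothetical setup: 4 GPUs, tensor-parallel size 2, so 2 data-parallel ranks.
states = ["state-dp0", "state-dp1"]
assignment = assign_states(states, world_size=4, tp_size=2)
print(assignment)
# {0: 'state-dp0', 1: 'state-dp0', 2: 'state-dp1', 3: 'state-dp1'}
```

In this toy layout, global ranks 0 and 1 must both restore `state-dp0`, so the restore step has to be driven by data-parallel rank rather than by one global source rank.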

Comment thread src/megatron/energon/savable_loader.py
Comment thread src/megatron/energon/savable_loader.py
Comment thread docs/source/advanced/parallelism.md Outdated
Comment thread docs/source/advanced/parallelism.md
Comment thread src/megatron/energon/savable_loader.py
Co-authored-by: Lukas Voegtle <5764745+voegtlel@users.noreply.github.com>
@voegtlel voegtlel marked this pull request as ready for review February 18, 2025 10:50
@philipp-fischer philipp-fischer merged commit 9a4808a into develop Feb 18, 2025
@philipp-fischer philipp-fischer deleted the feature/parallelism branch February 18, 2025 10:50

2 participants