cp: fix: meta init with force_hf=True (1810) into r0.4.0 #1822

Merged
akoumpa merged 1 commit into r0.4.0 from cherry-pick-1810-r0.4.0
Apr 14, 2026

@svcnvidia-nemo-ci
Contributor

beep boop [🤖]: Hi @akoumpa 👋,

we've cherry picked #1810 into  for you! 🚀

Please review and approve this cherry-pick at your convenience!

* When using force_hf, meta device init can still fail; catch the failure and retry

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Remove tp_size from checkpoint robustness settings

* Apply suggestions from code review

Co-authored-by: Alexandros Koumparoulis <153118171+akoumpa@users.noreply.github.com>

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: NeMo Bot <nemo-bot@nvidia.com>
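
The "catch and retry" behavior described in the first commit bullet can be illustrated with a minimal, framework-free sketch. This is not the code from #1810: `init_with_meta_fallback`, `flaky_build`, and the `"cpu"` fallback device are hypothetical names and choices made up for illustration, and the real fix may catch narrower exceptions or retry differently.

```python
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def init_with_meta_fallback(
    build: Callable[[str], T], fallback_device: str = "cpu"
) -> Tuple[T, str]:
    """Try to build on the meta device first; if that raises, retry on a real device.

    `build` is a hypothetical constructor that takes a device name.
    Returns the built object and the device that actually worked.
    """
    try:
        return build("meta"), "meta"
    except Exception as exc:  # broad on purpose in this sketch
        print(f"meta init failed ({exc!r}); retrying on {fallback_device}")
        return build(fallback_device), fallback_device


def flaky_build(device: str) -> dict:
    # Hypothetical stand-in for a model constructor that cannot run on "meta".
    if device == "meta":
        raise RuntimeError("op not supported on meta device")
    return {"device": device}


model, device = init_with_meta_fallback(flaky_build)
```

In this sketch the caller receives both the object and the device that succeeded, so later steps (for example, weight loading) can branch on whether the meta path was taken.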
@svcnvidia-nemo-ci
Contributor Author

/ok to test a02059c

@copy-pr-bot

copy-pr-bot Bot commented Apr 14, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@akoumpa akoumpa merged commit 5941d5a into r0.4.0 Apr 14, 2026
54 checks passed
@akoumpa akoumpa deleted the cherry-pick-1810-r0.4.0 branch April 14, 2026 07:45

Labels

cherry-pick Run CICD Trigger Testing CICD

2 participants