fix(ci): use requirements_public_train-cu12.txt in multi-gpu-tests #40

Merged
ivanbasov merged 1 commit into NVIDIA:main from ivanbasov:fix/multi-gpu-requirements on Apr 2, 2026

@ivanbasov
Member

Summary

The multi-gpu-tests install step still referenced requirements_public_train.txt, which had been replaced by requirements_public_train-cu12.txt / -cu13.txt. As a result, the job failed immediately after checkout with:

ERROR: Could not open requirements file: No such file or directory: 'code/requirements_public_train.txt'

Switch to requirements_public_train-cu12.txt to match the runner's CUDA stack, consistent with mid-gpu-tests and gpu-coverage.
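
A stale reference like this could in principle be caught before the job runs. The following is a minimal sketch, not part of this PR: it scans workflow text for `pip install -r <file>` arguments and reports any that do not exist under the repository root. The function names, the regex, and the exact form of the install command are assumptions made for illustration; only the two requirements filenames come from the description above.

```python
import re
from pathlib import Path

# Matches the file argument of `-r <file>` or `--requirement <file>`.
# The lookbehind keeps `-r` from matching inside a longer flag.
REQ_ARG = re.compile(r"(?<![\w-])(?:-r|--requirement)(?:[ \t]+|=)([^\s'\"]+)")


def referenced_requirements(workflow_text: str) -> list[str]:
    """Return requirements paths named in install commands, in order, without repeats."""
    seen: list[str] = []
    for match in REQ_ARG.finditer(workflow_text):
        path = match.group(1)
        if path not in seen:
            seen.append(path)
    return seen


def missing_requirements(workflow_text: str, repo_root: Path) -> list[str]:
    """Return the referenced requirements paths that are not files under repo_root."""
    return [
        path
        for path in referenced_requirements(workflow_text)
        if not (repo_root / path).is_file()
    ]
```

Run against a workflow that still names code/requirements_public_train.txt in a checkout where only the -cu12 / -cu13 variants exist, `missing_requirements` would return that old path, which is the same condition pip reported in the error above.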

Note on mid-gpu-tests failures

The two mid-gpu-tests failures (test_he_reduces_error_weight and trainX should match for same seed) are pre-existing and unrelated to this change.

🤖 Generated with Claude Code

requirements_public_train.txt was replaced by cu12/cu13 variants;
multi-gpu-tests install step still referenced the old filename, causing
the job to fail immediately. Switch to cu12 to match the runner's CUDA
stack (consistent with mid-gpu-tests and gpu-coverage).

Signed-off-by: Ivan Basov <ibasov@nvidia.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@ivanbasov ivanbasov requested review from bmhowe23 and kvmto April 2, 2026 16:02
Collaborator

@bmhowe23 bmhowe23 left a comment

Thanks!

@ivanbasov ivanbasov merged commit b0e492d into NVIDIA:main Apr 2, 2026
14 checks passed
@ivanbasov ivanbasov deleted the fix/multi-gpu-requirements branch April 2, 2026 16:18
ivanbasov added a commit that referenced this pull request Apr 10, 2026
requirements_public_train.txt was replaced by cu12/cu13 variants;
multi-gpu-tests install step still referenced the old filename, causing
the job to fail immediately. Switch to cu12 to match the runner's CUDA
stack (consistent with mid-gpu-tests and gpu-coverage).

Signed-off-by: Ivan Basov <ibasov@nvidia.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>