feat(base): add slim pytorch image (cuda13.0, py3.13, torch2.11)#7

Merged
arsac merged 4 commits into main from feat/base-pytorch
Apr 28, 2026

@arsac arsac commented Apr 28, 2026

Summary

  • New base/pytorch/ slim PyTorch base image: CUDA 13.0.3 + Python 3.13 (uv-managed) + PyTorch 2.11.0+cu130 + xformers + triton + utility deps. The image is devel-only (the devel variant is the deployment target), so there is no Dockerfile.runtime: downstream apps need nvcc/NVRTC/CUPTI.
  • Gates release.yaml runtime-build matrix on Dockerfile.runtime existence so devel-only bases don't break the matrix.

Test plan

  • CI Build Base pytorch (devel) job succeeds and pushes ghcr.io/arsac/pytorch:cuda13.0-torch2.11-devel, :devel, plus semver tags.
  • CI Build Base pytorch (runtime) is skipped (no Dockerfile.runtime).
  • In-image build-time smoke tests pass: (1) import torch; assert torch.version.cuda is not None, and (2) import xformers.
  • /constraints.txt contains pinned versions for torch ecosystem and nvidia-*-cu13 transitive deps.
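
The /constraints.txt item above could be checked with something like the following. This is a minimal sketch, not the check used in this PR: the function name `check_constraints`, the assumption that every pin has the form `name==version`, and the sample versions in the usage note are all hypothetical.

```python
import re


def check_constraints(text, required=("torch",), pattern=r"^nvidia-.+-cu13$"):
    """Return (missing, nvidia): required names with no pin, and pinned
    package names that match the nvidia-*-cu13 pattern.

    Assumption: each pin is written as `name==version`, one per line.
    """
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if "==" not in line:
            continue  # skip blanks, comment-only lines, unpinned entries
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    missing = [name for name in required if name not in pins]
    nvidia = sorted(name for name in pins if re.match(pattern, name))
    return missing, nvidia
```

Inside the image this might be called as `check_constraints(open("/constraints.txt").read(), required=("torch", "torchvision", "torchaudio", "xformers", "triton"))`, expecting an empty `missing` list and a non-empty `nvidia` list.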

Spec & plan

  • Spec: docs/superpowers/specs/2026-04-28-pytorch-base-image-design.md
  • Plan: docs/superpowers/plans/2026-04-28-pytorch-base-image.md

🤖 Generated with Claude Code

arsac and others added 4 commits April 28, 2026 09:39
Adds the build scaffolding for the new pytorch base image as a parallel
to base/cuda-ml. Defines image-devel, image-devel-local, and
image-devel-all targets with VERSION cuda13.0-torch2.11. The Dockerfile
and CI integration follow in subsequent tasks.

Devel-only base image: CUDA 13.0.3 cuDNN devel, uv-managed Python 3.13,
venv at /opt/venv. Single uv resolve pins torch 2.11, torchvision 0.26,
torchaudio 2.11, xformers 0.0.35, triton 3.6 against the cu130 wheel
index, with constraints.txt emitted for downstream apps to inherit.
Build-time validation imports torch and xformers without requiring a GPU.
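
A GPU-free validation of this kind might look like the sketch below. It is an illustration rather than the code in the Dockerfile: the helper name `smoke_test` and its return shape are assumptions. It relies on `torch.version.cuda` being a build-time attribute of the wheel (it is `None` for CPU-only builds), so no device is needed to read it.

```python
import importlib


def smoke_test(modules=("torch", "xformers")):
    """Import each module and return {name: error message or None}.

    Nothing here touches a CUDA device, so it can run during `docker build`.
    """
    results = {}
    for name in modules:
        try:
            mod = importlib.import_module(name)
        except Exception as exc:  # a broken wheel can raise more than ImportError
            results[name] = f"{type(exc).__name__}: {exc}"
            continue
        # torch.version.cuda records the CUDA version the wheel was built against.
        if name == "torch" and getattr(mod.version, "cuda", None) is None:
            results[name] = "torch.version.cuda is None (CPU-only wheel?)"
        else:
            results[name] = None
    return results
```

A build step could fail when any value in the returned dict is not `None`.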

The uv binary is pulled from ghcr.io/astral-sh/uv via a named build stage so
that ${UV_VERSION} can be expanded: Dockerfile syntax does not allow ARG
expansion in COPY --from=<image>:<tag>, only in FROM.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The build-bases (runtime) matrix previously iterated over every changed base
directory unconditionally. Devel-only base images (no Dockerfile.runtime) like
the new base/pytorch/ would fail at the bake-target lookup step.

Add a changed-bases-runtime output to the prepare job that filters
changed-bases to those with a Dockerfile.runtime, and switch build-bases to
consume that filtered list. Devel-only bases are now skipped at the matrix
level rather than failing.
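
The filtering idea can be sketched as follows. This is a hedged illustration only: the actual workflow step is not shown in this PR page, and the function name `runtime_bases` and the `base/<name>/` layout passed as `root` are assumptions.

```python
from pathlib import Path


def runtime_bases(changed_bases, root="base"):
    """Keep only the changed bases that ship a Dockerfile.runtime.

    Devel-only bases are dropped before the matrix is built, so the
    runtime job never looks up a bake target that does not exist.
    """
    return [
        name
        for name in changed_bases
        if (Path(root) / name / "Dockerfile.runtime").is_file()
    ]
```

With a checkout where only base/cuda-ml/ has a Dockerfile.runtime, `runtime_bases(["cuda-ml", "pytorch"])` would keep `cuda-ml` and drop `pytorch`; the result could then be serialized with `json.dumps` for a matrix input.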

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Spec validates 9 requirements (Python 3.13, CUDA 13.0, PyTorch 2.11 cu130,
cuDNN, NVRTC, cuSPARSELt, NVSHMEM, uv, devel variant) against the actual
nvidia/cuda image and torch wheel METADATA. Plan covers scaffolding,
Dockerfile, CI workflow gating, PR submission.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@arsac arsac merged commit 65576a7 into main Apr 28, 2026
4 of 5 checks passed
@arsac arsac deleted the feat/base-pytorch branch April 28, 2026 16:57