feat(worker): native per-version GPU base images (AE-2827 parity) (#94)
Add Python 3.10 and 3.11 support to GPU worker images via side-by-side
torch install in the existing runpod/pytorch base. 3.12 keeps the fast
path (torch pre-installed) to avoid the ~7 GB reinstall cost on hot
deployments; 3.10/3.11 images pay that cost once per cold start per DC.
Sibling to flash#322, which landed the SDK-level plumbing. Tags follow
the same ``py${VERSION}-${TAG}`` scheme already in use for CPU images.
- Dockerfile / Dockerfile-lb (GPU): accept PYTHON_VERSION build arg;
install torch from download.pytorch.org/whl/cu128 and repoint
/usr/local/bin/python for non-3.12 targets; validate interpreter
matches the arg during build.
- Dockerfile-cpu / Dockerfile-lb-cpu (CPU): surface PYTHON_VERSION at
runtime via FLASH_PYTHON_VERSION env so the worker's startup check
can read it.
- src/version.py: new ``assert_python_version_matches_image`` — raises
PythonVersionMismatchError at handler boot when ``sys.version_info``
disagrees with the image's stamped FLASH_PYTHON_VERSION. Caught
before user code runs; skipped when the env var is unset (local dev).
- src/handler.py / src/lb_handler.py: call the assertion immediately
after logging setup, before ``maybe_unpack()`` and handler import.
- tests/unit/test_version.py: 4 new cases covering env-unset skip,
match, mismatch raise, and message contents.
- tests/unit/test_lb_handler.py: extend the mocked ``version`` module
with ``assert_python_version_matches_image`` so fresh-import tests
don't break.
- .github/workflows/ci.yml: expand CI to build GPU and LB images
across {3.10, 3.11, 3.12}; align prod CPU and LB-CPU default to
3.12 (matches flash's DEFAULT_PYTHON_VERSION).
Ubuntu 22.04's system python3.10 ships with ensurepip disabled by Debian policy, which broke the side-by-side torch install for 3.10 GPU images (CI: docker-test-gpu (3.10), docker-test-lb (3.10)). python3.11 is a separate interpreter without that patch, so only 3.10 was affected. Switch from ensurepip to urllib + get-pip.py: it works for any interpreter regardless of distro patching, and urllib is stdlib, so no curl dependency is needed. Also corrects the outdated deadsnakes comment in both Dockerfiles: the runpod/pytorch base image layers the alternate Pythons 3.11/3.12 on top of the system 3.10, not via deadsnakes.
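The distinction can be probed at build time before choosing a pip bootstrap path. This is an illustrative check, not the Dockerfile's actual logic; the fallback it points to is the urllib + get-pip.py route the commit describes:

```python
import importlib.util
import sys


def ensurepip_available() -> bool:
    """True when this interpreter can run `python -m ensurepip`.

    Debian/Ubuntu strip ensurepip from the system Python, so a stock
    Ubuntu 22.04 python3.10 returns False here, while deadsnakes and
    python.org builds return True.
    """
    return importlib.util.find_spec("ensurepip") is not None


if not ensurepip_available():
    # Fall back to the stdlib-only bootstrap: fetch get-pip.py with
    # urllib.request and run it with this same interpreter.
    print(f"{sys.executable}: bootstrap pip via get-pip.py", file=sys.stderr)
```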
Replace the runpod/pytorch + side-by-side install hack with a native per-version GPU base built directly on nvidia/cuda. Each image variant has exactly one Python interpreter at /usr/local/bin/python (3.10 from upstream jammy, 3.11/3.12/3.13 from deadsnakes), with torch installed natively for that interpreter from the cu128 wheel index. Eliminates the ~7 GB cold-start tax on non-3.12 images and decouples flash-worker from runpod/pytorch's Python release cadence. Adding 3.13 (or future 3.14/3.15) is now a CI matrix entry, not an upstream wait. Refs AE-2827.
Mirror the GPU worker rewrite for the load-balanced GPU image. Same nvidia/cuda + deadsnakes pattern, same native-per-version layout, just with EXPOSE 80 and the uvicorn entrypoint instead of the QB handler. Refs AE-2827.
Expands docker-test, docker-test-lb-cpu, docker-test-gpu, docker-test-lb, docker-prod-gpu, docker-prod-cpu, docker-prod-lb, and docker-prod-lb-cpu to include 3.13. `is-default` stays on 3.12 (drives the `:latest` aliases). Refs AE-2827.
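As a sketch, adding 3.13 to one of those jobs is a matrix-entry change of roughly this shape (job keys and the `is-default` spelling here are illustrative, not copied from the actual workflow file):

```yaml
strategy:
  matrix:
    python-version: ["3.10", "3.11", "3.12", "3.13"]
    include:
      - python-version: "3.12"
        is-default: true   # drives the :latest image aliases
```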
Force-pushed from 3c7632c to eec06e2.
Summary
Stacks on top of #89. Replaces the runpod/pytorch + side-by-side install hack with native per-version base images on nvidia/cuda:12.8.1-cudnn-runtime-ubuntu22.04. Each published image variant has exactly one Python interpreter at `/usr/local/bin/python` (3.10 from upstream jammy, 3.11/3.12/3.13 from deadsnakes), with torch installed natively from the cu128 wheel index. Eliminates the ~7 GB cold-start tax on non-3.12 images and decouples flash-worker from runpod/pytorch's Python release cadence. Adding 3.14/3.15 in the future is a CI matrix entry, not an upstream wait.
Phase
This is Phase 1 of the design at docs/superpowers/specs/2026-04-28-ae-2827-python-version-parity-design.md. Phase 2 lives in the stacked SDK PR on flash. Merge order: this PR after #89 → release-please publishes new image tags → flash SDK PR.
Changes
- Dockerfile: rewrite — nvidia/cuda base, deadsnakes Python, native torch install, get-pip.py bootstrap.
- Dockerfile-lb: rewrite — same shape with EXPOSE 80 and the uvicorn entrypoint; byte-identical build chain to `Dockerfile` for diff-clean parity.
- .github/workflows/ci.yml: add 3.13 to all docker test/release matrices (docker-test, docker-test-lb-cpu, docker-test-gpu, docker-test-lb, docker-prod-gpu, docker-prod-cpu, docker-prod-lb, docker-prod-lb-cpu). `is-default: true` stays on 3.12 (drives the `:latest` aliases).
Test plan
- `runpod/flash*:py3.X-{tag}` variants exist on Docker Hub
Known follow-ups (not blocking)
- `chore(dockerfile): pin numpy` PR.