
feat(docker): add BINARY_SOURCE selector for prebuilt Rust binaries #945

Merged
jtoelke2 merged 4 commits into main from jtoelke/os-128-use-prebuilt-binaries-flag
Apr 24, 2026

Conversation

@jtoelke2
Collaborator

Summary

Add a BINARY_SOURCE selector to deploy/docker/Dockerfile.images so the three final images (gateway, supervisor, cluster) can consume pre-built Rust binaries from the build context instead of compiling inside Docker. Default stays build (unchanged behavior); prebuilt is opt-in and inert until later Phase 4 PRs wire in the producer + workflow.
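
For readers unfamiliar with the pattern, the selector can be sketched roughly as below. This is an illustration, not the contents of Dockerfile.images: the script writes a throwaway Dockerfile whose `busybox` base images, stub builder stage, output file name, and final install path are all assumptions. Only the stage names, the `BINARY_SOURCE` ARG, and the prebuilt path come from this PR description.

```shell
#!/usr/bin/env sh
# Sketch only: writes a throwaway Dockerfile illustrating the selector pattern.
# Base images, the stub builder, and the final install path are assumptions;
# stage names, the ARG, and the prebuilt path follow the PR description.
set -eu

# Hypothetical output file name, overridable for experimentation.
sketch="${SKETCH_FILE:-Dockerfile.selector-sketch}"

cat > "$sketch" <<'EOF'
# Declared before the first FROM so it can be substituted in FROM lines.
ARG BINARY_SOURCE=build

# Stand-in for the real in-Docker Rust build stage (assumption).
FROM busybox AS gateway-builder
RUN mkdir -p /build/out && touch /build/out/openshell-gateway

# Source 1: plain alias of the in-Docker build.
FROM gateway-builder AS gateway-binary-build

# Source 2: packaging only, the binary comes from the build context.
FROM scratch AS gateway-binary-prebuilt
ARG TARGETARCH
COPY --chmod=755 deploy/docker/.build/prebuilt-binaries/${TARGETARCH}/openshell-gateway /build/out/openshell-gateway

# The selector: resolves to exactly one of the two stages above.
FROM gateway-binary-${BINARY_SOURCE} AS gateway-binary

# Final image copies from the merged stage, never from a builder directly.
FROM busybox AS gateway
COPY --from=gateway-binary /build/out/openshell-gateway /usr/local/bin/openshell-gateway
EOF

echo "wrote $sketch"
```

Because BuildKit only builds stages reachable from the requested target, building the `gateway` target of such a file with `--build-arg BINARY_SOURCE=prebuilt` should leave `gateway-builder` untouched, provided a binary is staged under the prebuilt path in the build context.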

Related Issue

OS-49 runner migration, Phase 4 / OS-128. This is PR 4b of three:

  • PR 4a (not yet opened) — per-arch native cargo build jobs publish openshell-gateway / openshell-sandbox artifacts. Data-blocked on Phase 2 dispatch results.
  • PR 4b (this PR) — Dockerfile + script gain an opt-in path; default unchanged.
  • PR 4c (later) — flip the default, delete the in-Docker Rust builder stages, add multi-arch manifest merge.

Full plan + gotchas live on OS-128 as a Linear comment.

Changes

  • deploy/docker/Dockerfile.images:
    • ARG BINARY_SOURCE=build declared at global scope (required for BuildKit FROM *-${ARG} substitution).
    • Two new dual-source stages per binary:
      • gateway-binary-build (alias for gateway-builder) / gateway-binary-prebuilt (scratch + COPY deploy/docker/.build/prebuilt-binaries/${TARGETARCH}/openshell-gateway /build/out/openshell-gateway) → merged into gateway-binary via FROM gateway-binary-${BINARY_SOURCE}.
      • Same shape for supervisor-binary.
    • Four COPY --from=*-builder lines re-pointed at *-binary: supervisor-output, gateway, supervisor, cluster.
    • Key property: when BINARY_SOURCE=prebuilt, BuildKit skips the rust-builder-* / rust-deps / *-builder / *-workspace stages entirely — Docker becomes packaging-only. PR 4c just flips the default and deletes the now-unreferenced stages.
  • tasks/scripts/docker-build-image.sh:
    • Reads the USE_PREBUILT_BINARIES env var. When it is true and the target is gateway/supervisor/cluster/supervisor-output, the script passes --build-arg BINARY_SOURCE=prebuilt and runs a fail-early sanity check that deploy/docker/.build/prebuilt-binaries/ exists.
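
The script-side behavior described above might look roughly like the following. It is a minimal sketch, not the code in docker-build-image.sh: the function name `binary_source_args` and the `PREBUILT_DIR` override are hypothetical, introduced here only to keep the example self-contained.

```shell
#!/usr/bin/env sh
# Sketch only: binary_source_args and PREBUILT_DIR are hypothetical names,
# not taken from tasks/scripts/docker-build-image.sh.
set -eu

# Prints the extra build arguments for a target (nothing on the default path).
# Returns non-zero when prebuilt binaries are requested but not staged.
binary_source_args() {
  local target="$1"
  local prebuilt_dir="${PREBUILT_DIR:-deploy/docker/.build/prebuilt-binaries}"

  # Default path: behavior unchanged, no extra arguments.
  [ "${USE_PREBUILT_BINARIES:-false}" = "true" ] || return 0

  case "$target" in
    gateway|supervisor|cluster|supervisor-output) ;;
    *) return 0 ;;  # other targets do not consume the Rust binaries
  esac

  # Fail early instead of letting the COPY fail deep inside the build.
  if [ ! -d "$prebuilt_dir" ]; then
    echo "error: USE_PREBUILT_BINARIES=true but $prebuilt_dir does not exist" >&2
    return 1
  fi

  printf '%s\n' "--build-arg BINARY_SOURCE=prebuilt"
}
```

A caller could capture the result with `args="$(binary_source_args "$target")"` so that, under `set -e`, a missing directory aborts the script before any `docker` invocation.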

Testing

  • mise run pre-commit passes
  • docker buildx build --call=check on gateway, supervisor, cluster targets — lint-clean, no warnings.
  • docker buildx build --call=check with BINARY_SOURCE=prebuilt override — lint-clean.
  • mise run build:docker:gateway (default BINARY_SOURCE=build) — completed end-to-end, produced openshell/gateway:dev (115MB). docker run openshell/gateway:dev --version returns openshell-gateway 0.0.0 (version-injection not exercised locally; CI path unchanged).
  • USE_PREBUILT_BINARIES=true mise run build:docker:gateway end-to-end — not exercised in this PR. The prebuilt path is inert until PR 4a lands and stages the binaries. First real end-to-end test ships in PR 4a.
  • E2E tests — N/A; Docker image shape unchanged on the default path.

Checklist

  • Follows Conventional Commits
  • Commits are signed off (DCO)
  • No caller behavior changes (default BINARY_SOURCE=build routes through the same gateway-builder / supervisor-builder stages as before)
  • Architecture docs updated (N/A; the plan lives on Linear OS-128 as a comment, because of a create_document MCP outage)

Signed-off-by: Jonas Toelke <jtoelke@nvidia.com>
@jtoelke2 jtoelke2 requested a review from a team as a code owner April 23, 2026 21:55
@jtoelke2 jtoelke2 self-assigned this Apr 23, 2026
@pimlock pimlock added test:e2e Requires end-to-end coverage and removed test:e2e Requires end-to-end coverage labels Apr 23, 2026
@pimlock
Collaborator

pimlock commented Apr 23, 2026

FYI: I triggered E2E to verify image builds work in the CI as well.

pimlock previously approved these changes Apr 23, 2026
Comment thread deploy/docker/Dockerfile.images Outdated
jtoelke2 and others added 2 commits April 23, 2026 22:58
Adds --chmod=755 to the COPY instructions in the scratch-based
prebuilt binary stages. Without this, binaries produced by PR 4a and
shuttled through actions/upload-artifact + download-artifact lose
their executable bit during the roundtrip, and the resulting image's
ENTRYPOINT fails at runtime.

Signed-off-by: Jonas Toelke <jtoelke@nvidia.com>
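
The failure mode in the commit message above can be reproduced locally without Docker. The sketch below only simulates the permission loss with `chmod 644`; it does not run upload-artifact, and the temp file is a stand-in for a real binary.

```shell
#!/usr/bin/env sh
# Sketch only: simulates a binary losing its executable bit, then restores it.
set -eu

bin="$(mktemp)"
printf '#!/bin/sh\necho ok\n' > "$bin"

# Simulate the artifact roundtrip stripping the executable bit.
chmod 644 "$bin"
if [ -x "$bin" ]; then before="executable"; else before="not executable"; fi

# Equivalent of what COPY --chmod=755 applies at image build time.
chmod 755 "$bin"
if [ -x "$bin" ]; then after="executable"; else after="not executable"; fi

echo "before: $before"
echo "after: $after"
```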
@jtoelke2
Collaborator Author

@pimlock your earlier approval was dismissed when I pushed commit 51bc7544 (the only delta since your review — adds --chmod=755 to both prebuilt COPY lines per your comment). Could you re-approve when you have a sec?

pimlock previously approved these changes Apr 24, 2026
…ebuilt-binaries-flag

# Conflicts:
#	tasks/scripts/docker-build-image.sh
@jtoelke2
Collaborator Author

@pimlock: rebased onto latest main to resolve a conflict from #904 (docker buildx build → ce_build in docker-build-image.sh). Merge commit 8d663d11; the only manual change was splicing my BINARY_SOURCE_ARGS line back into the ce_build invocation with matching tab indentation.

The rebase dismissed your approval automatically. When you have a sec, re-approve and squash-merge from the UI — no further changes expected from my side unless something else lands on main first. Sorry for the churn.

@pimlock
Collaborator

pimlock commented Apr 24, 2026

@jtoelke2 I saw the rebase and approved already, so this is good to go.

@jtoelke2 jtoelke2 merged commit 8a3c0b0 into main Apr 24, 2026
20 checks passed
@jtoelke2 jtoelke2 deleted the jtoelke/os-128-use-prebuilt-binaries-flag branch April 24, 2026 18:31

2 participants