fix(openclaw): drive Dockerfile FROM via UPSTREAM build-arg + drift guard #419
prez2307 wants to merge 1 commit
…ft guard

Two-source-of-truth bug: the Dockerfile FROM was a hardcoded literal (`alpine/openclaw:2026.4.25-slim` until #418, then `alpine/openclaw:2026.4.22`), while `openclaw-version.json#upstream` was used by build-openclaw-image.yml ONLY to compute the tag NAME (`${UPSTREAM}-${SHORT_SHA}`). When PR #415 rolled back upstream from 4.25 → 4.22 in the JSON file but didn't touch the Dockerfile, CI happily produced a mislabelled `:2026.4.22-dbf5da0` whose actual base layer was still `2026.4.25-slim`. Result: the dev container hung indefinitely on first provision because the upstream NFS+SQLite hang (#73517) was still inside.

Fix the design, not just the symptom:

1. Dockerfile declares `ARG UPSTREAM` (no default) before the first FROM and uses `FROM alpine/openclaw:${UPSTREAM}` for stage 2. The base image is now determined purely by what the workflow passes in. No default = `docker build` from a clean checkout fails fast with a clear error instead of silently picking up a stale base.
2. build-openclaw-image.yml emits `upstream` as a step output and passes `--build-arg UPSTREAM=${...}` to `docker build`. `openclaw-version.json#upstream` is now the only source of truth for both the tag name and the actual base.
3. New guard step greps the Dockerfile for any FROM line containing a literal `alpine/openclaw:<non-$>` tag. If someone re-introduces a hardcoded version, the workflow fails with a clear message before producing a mislabelled image. Verified locally: passes on the new Dockerfile, fails on a synthetic `FROM alpine/openclaw:2026.4.22` regression.

Stacks on top of #418 (the literal one-line rollback). Once this lands, the next bump only needs to edit `openclaw-version.json#upstream`; the Dockerfile will follow automatically.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
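The single-source-of-truth idea in step 2 can be sketched as a small shell helper. This is only an illustration, not the workflow's actual step: the function name `openclaw_build_cmd`, the local image name `openclaw`, and the build context path are assumptions made for the example. The point it shows is that the tag and the `--build-arg` are both derived from the same value, so they cannot disagree.

```shell
#!/usr/bin/env bash
# Sketch only: function name, image name and context path are hypothetical,
# not taken from build-openclaw-image.yml.

# Print the docker build command for a given upstream version and short SHA.
# The image tag and the --build-arg both come from the same $upstream value,
# so the tag name and the base layer cannot drift apart.
openclaw_build_cmd() {
  local upstream="$1" short_sha="$2"
  # Fail fast on a missing value instead of building an unlabelled image.
  if [ -z "$upstream" ] || [ -z "$short_sha" ]; then
    echo "error: upstream and short_sha are both required" >&2
    return 1
  fi
  printf 'docker build --build-arg UPSTREAM=%s -t openclaw:%s-%s apps/infra/openclaw\n' \
    "$upstream" "$upstream" "$short_sha"
}
```

Called as `openclaw_build_cmd 2026.4.22 dbf5da0`, the helper prints a command in which `2026.4.22` appears in both the build argument and the tag. It prints the command rather than running it so the sketch can be inspected without Docker installed.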
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 10c61de701
    # so openclaw-version.json#upstream is the only source of truth.
    # A literal tag in a FROM line means tag drift is back.
    run: |
      if grep -nE '^FROM[[:space:]].*alpine/openclaw:[^$]' apps/infra/openclaw/Dockerfile; then
Make drift guard match lowercase FROM instructions
The new guard in `build-openclaw-image.yml` only matches lines that start with uppercase `FROM`, but Dockerfile instructions are case-insensitive, so `from alpine/openclaw:2026.4.22` is valid and would bypass this check. In that scenario, a hardcoded base tag can slip back in and reintroduce the tag-drift issue this step is meant to prevent.
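One way to address this, sketched under assumptions: add `-i` to the `grep` call and tolerate leading whitespace before the instruction. The function name `has_literal_openclaw_tag` is hypothetical and this is not the change the PR author made; it only shows the pattern the suggestion implies.

```shell
#!/usr/bin/env bash
# Sketch only: the function name is hypothetical.

# Succeeds (exit 0) and prints the offending lines when a FROM instruction
# pins a literal alpine/openclaw tag; exits 1 when no such line is found.
has_literal_openclaw_tag() {
  local dockerfile="$1"
  # -i                 Dockerfile instructions are case-insensitive (FROM, from, From).
  # ^[[:space:]]*      allow indentation before the instruction.
  # [^$]               first character of the tag is not '$', i.e. not ${UPSTREAM}.
  grep -niE '^[[:space:]]*FROM[[:space:]].*alpine/openclaw:[^$]' "$dockerfile"
}
```

In a workflow step this could be used as `if has_literal_openclaw_tag apps/infra/openclaw/Dockerfile; then exit 1; fi`, mirroring the structure of the existing guard.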
Closing as obsolete. The OpenClaw Dockerfile was rewritten since this PR — the base is now |
Summary

- The Dockerfile declares `ARG UPSTREAM` (no default) and uses `FROM alpine/openclaw:${UPSTREAM}`. The workflow passes the value from `openclaw-version.json#upstream`. The version now lives in exactly one place.
- A new guard step greps the Dockerfile for a literal `alpine/openclaw:<version>` in a FROM line and fails the build if found. Stops a future #418-style regression at the source.
- No default on the `ARG`. A clean `docker build` without `--build-arg UPSTREAM=…` errors out fast instead of silently picking up a stale base.

Why

PR #415 rolled `openclaw-version.json#upstream` back from 4.25 → 4.22 to escape the upstream NFS+SQLite hang (#73517), but the Dockerfile FROM line was hardcoded: the workflow only used `upstream` to compute the tag NAME (`${UPSTREAM}-${SHORT_SHA}`), not the actual base. CI produced `:2026.4.22-dbf5da0` whose actual content was still `2026.4.25-slim`. Dev container hung indefinitely on first provision today; PR #418 patched the literal but didn't fix the design that allowed the drift.

After this lands, the bump-upstream workflow is one file: edit `openclaw-version.json#upstream` → the Dockerfile follows automatically and CI rebuilds with the matching base.

Test plan

- Guard passes on the new Dockerfile (`FROM alpine/openclaw:${UPSTREAM}`) and trips on synthetic `FROM alpine/openclaw:2026.4.22`.
- `build-openclaw-image` produces a fresh `2026.4.22-<newsha>` ECR tag whose base layer is actually `alpine/openclaw:2026.4.22` (verify by `docker inspect` of the new image).
- Set `openclaw-version.json#dev.tag` to the new SHA and re-provision the dev container; confirms the wedge no longer reproduces (no `ContainerRuntimeTimeoutError` on stop).

🤖 Generated with Claude Code