feat(tests): one-command dogfood retest wrapper + minimal Docker image (AL-36)#22
Merged
Replaces the eight-line manual recipe documented in
.planning/quick/260503-dtx-.../260503-dtx-SUMMARY.md (Path A) with a
single command per Ubuntu version. Same security envelope as the
existing tests/docker/Dockerfile.ubuntu-N.M files (systemd as PID 1
with the documented mask list, --privileged + --cgroupns=host
recipe at run time) but with a deliberately minimal package
preinstall — only curl + ca-certificates above the bare Ubuntu base.
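As a rough illustration of that envelope, the build and run steps might look like the sketch below. The image and container names (`agentlinux-dogfood:*`, `agentlinux-dogfood-*`) and the two helper functions are assumptions made for this example, not names taken from the wrapper; only the `UBUNTU_VERSION` build arg, the Dockerfile path, and the `--privileged` / `--cgroupns=host` flags come from the description above.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the build/run recipe. Image and container names
# are illustrative assumptions, not the wrapper's real ones.

# Print the docker build command for one Ubuntu version (e.g. 24.04).
dogfood_build_cmd() {
  local version="$1"
  printf '%s\n' "docker build --build-arg UBUNTU_VERSION=${version} -f tests/docker/Dockerfile.dogfood -t agentlinux-dogfood:${version} tests/docker"
}

# Print the docker run command; systemd (/sbin/init) is the image CMD,
# so no command is appended after the image name.
dogfood_run_cmd() {
  local version="$1"
  printf '%s\n' "docker run -d --privileged --cgroupns=host --name agentlinux-dogfood-${version} agentlinux-dogfood:${version}"
}

# Dry run: print both commands for Ubuntu 24.04.
dogfood_build_cmd 24.04
dogfood_run_cmd 24.04
```

Printing the commands instead of executing them keeps the sketch safe to run on a machine without Docker; piping the output to `bash` would execute them.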
Why minimal: anything narrower (no curl preinstalled) shifts the
"first apt-get install" onto the user, which is the bootstrap step
this wrapper is designed to skip. Anything broader (sudo, file, jq,
gnupg, locales preinstalled) silently bypasses the AL-30 dependency-
auto-install gates and lets the dogfood mask the very class of
bugs we built it to catch.
New files:
- tests/docker/Dockerfile.dogfood — ARG-driven Ubuntu base
(UBUNTU_VERSION=22.04|24.04|26.04), curl + ca-certificates +
systemd installed, the canonical systemd-in-Docker mask list
applied, /sbin/init as CMD. Apt cache deliberately retained to
work around AL-37 (the installer's missing apt-get update before
installing locales).
- tests/docker/dogfood.sh — wrapper. One-arg form:
`tests/docker/dogfood.sh ubuntu-24.04 v0.3.2-rc2` builds the
image, runs the systemd container, fires the curl-pipe-bash with
AGENTLINUX_VERSION pinned (sidesteps AL-31 against pre-AL-31
RCs), and on success runs the canonical AGT-02 self-update
probe end to end. Exit codes: 0 on full pass, 64 on bad arg,
>0 on installer / agent-install / claude-update failure.
Tag-shape regex matches packaging/curl-installer/install.sh's
AGENTLINUX_VERSION validator, so a bad tag fails fast before
any docker work.
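The fail-fast argument handling described for dogfood.sh could be sketched roughly as follows. The tag regex here is an assumption for illustration (a `vX.Y.Z` tag with an optional `-rcN` suffix); the real pattern is whatever packaging/curl-installer/install.sh validates. The function name `validate_args` is hypothetical as well.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: validate wrapper arguments before any docker work.
# ASSUMPTION: tag shape is vX.Y.Z with an optional -rcN suffix; the real
# regex lives in packaging/curl-installer/install.sh.
TAG_RE='^v[0-9]+\.[0-9]+\.[0-9]+(-rc[0-9]+)?$'
TARGET_RE='^ubuntu-(22\.04|24\.04|26\.04)$'

validate_args() {
  local target="${1:-}"
  local tag="${2:-v0.3.2-rc2}"   # default tag mentioned in the PR
  if [[ ! "$target" =~ $TARGET_RE ]]; then
    echo "FAIL: unsupported target '${target}'" >&2
    return 64                    # 64 on bad arg, as documented above
  fi
  if [[ ! "$tag" =~ $TAG_RE ]]; then
    echo "FAIL: malformed tag '${tag}'" >&2
    return 64
  fi
}
```

Rejecting the tag before `docker build` means a typo costs milliseconds rather than an image build and a container start.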
Verified end-to-end on the published v0.3.2-rc2 release across all
three Ubuntu versions:
  ubuntu-22.04   install OK   claude-code 2.1.98 -> 2.1.138   update OK
  ubuntu-24.04   install OK   claude-code 2.1.98 -> 2.1.138   update OK
  ubuntu-26.04   install OK   claude-code 2.1.98 -> 2.1.138   update OK
The claude binary is owned by agent:agent at /home/agent/.local/bin/claude
on every Ubuntu version. The AGT-02 self-update invariant is intact:
exactly the shape that tests/bats/51-agt02-release-gate.bats asserts
on, but now reachable in one command.
Out of scope (filed as separate tickets / follow-ups):
- Publishing the image to GHCR so contributors do not need to build
locally. Useful but not blocking; contributor-laptop builds are
fast (~30s on a warm Docker layer cache).
- Wiring the wrapper into the release-pipeline workflow as a smoke
step. Better as its own PR with the gate-staging logic.
- AL-37 — the locales install in 10-agent-user.sh runs without a
preceding apt-get update; this PR works around it tactically by
retaining the apt cache in Dockerfile.dogfood, but the strategic
fix lives in the installer itself.
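One possible shape for the strategic AL-37 fix is sketched below. This is not the installer's code: the function names and the `APT_LISTS_DIR` override are hypothetical, and the "only update when the lists are empty" check is an assumption added to keep the example cheap to re-run. The simplest fix would be an unconditional `apt-get update` before the install.

```shell
#!/usr/bin/env bash
# Hypothetical sketch for AL-37: refresh apt metadata before installing.
# APT_LISTS_DIR is overridable only so the check can be exercised in tests.
APT_LISTS_DIR="${APT_LISTS_DIR:-/var/lib/apt/lists}"

# Succeeds when no package index files are present, which is the state of
# a freshly pulled container or an image whose apt cache was wiped.
apt_lists_empty() {
  ! compgen -G "${APT_LISTS_DIR}/*_Packages*" >/dev/null
}

# Install locales, running apt-get update first when the index is missing.
ensure_locales() {
  if apt_lists_empty; then
    apt-get update
  fi
  DEBIAN_FRONTEND=noninteractive apt-get install -y locales
}
```

With a guard like this in the installer, Dockerfile.dogfood would no longer need to retain its apt cache.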
Refs: AL-36; AL-30 (the four-bug fix that made this image possible);
AL-31 (unpinned-resolution; this wrapper exports
AGENTLINUX_VERSION to sidestep it on retests against
pre-AL-31 published RCs); AL-37 (apt cache prereq).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…cs; review nits
Addresses qa-engineer review feedback on the AL-36 wrapper:
1. EACCES assertion on the claude-update transcript. The Anthropic
   updater can return exit 0 while still emitting permission-denied
   diagnostics on non-fatal recovery paths; without an explicit grep,
   that pattern would silently invalidate the AGT-02 release-gate
   invariant. The wrapper now tees `claude update` output to a tmpfile
   and runs grep -E -i over the captured transcript for both EACCES
   and "permission denied"; any match exits non-zero with the
   offending lines printed to stderr. Mirrors the assert_no_eacces
   pattern in tests/bats/51-agt02-release-gate.bats, which already
   protects the bats suite.
2. Explicit agent:agent ownership check via stat -c "%U:%G" instead of
   eyeballing ls -la. The bash-level string comparison rejects
   anything other than "agent:agent" with a clear FAIL diagnostic
   (AGT-02 invariant: the claude binary must be owned by the agent
   user so self-updates work without sudo or a recursive shim).
3. ERR trap that dumps three diagnostic streams BEFORE the cleanup
   trap removes the container: tail -n 80 of
   /var/log/agentlinux-install.log, the last 80 systemd journal
   entries, and the captured claude-update transcript if present.
   Without this, a failed installation step lost all log context the
   moment docker rm fired.
4. usage() heredoc now mentions the v0.3.2-rc2 default tag, the AL-31
   workaround it represents, and the AGENTLINUX_KEEP_CONTAINER and
   AGENTLINUX_DOGFOOD_TAG env-var overrides. Previous --help output
   left readers wondering which tag they were pinned to.
5. Dockerfile comment relocated. The "DO NOT add rm -rf
   /var/lib/apt/lists/" warning is now a sibling line immediately
   adjacent to the apt-get clean instruction, so a future contributor
   doing an apt-cache cleanup PR sees the AL-37 link before deleting
   the cache. Previously the warning sat above the RUN block; the new
   placement makes the proximity unambiguous.
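The transcript and ownership assertions from items 1 and 2 could be sketched along these lines. This is a minimal illustration, not the wrapper's code: `assert_owner` is a hypothetical name, and in the real wrapper these checks presumably run against files inside the container rather than on the host.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the two assertions; not the wrapper's actual code.

# Fail if the captured `claude update` transcript mentions a permission
# error, even when the updater itself exited 0. Matching lines go to stderr.
assert_no_eacces() {
  local transcript="$1"
  if grep -E -i 'EACCES|permission denied' "$transcript" >&2; then
    echo "FAIL: permission errors in claude-update transcript" >&2
    return 1
  fi
}

# Fail unless the file is owned by exactly the expected user:group.
# stat -c is the GNU coreutils form, which is what Ubuntu ships.
assert_owner() {
  local path="$1"
  local expected="${2:-agent:agent}"
  local actual
  actual="$(stat -c '%U:%G' "$path")"
  if [[ "$actual" != "$expected" ]]; then
    echo "FAIL: ${path} owned by ${actual}, expected ${expected}" >&2
    return 1
  fi
}
```

A call such as `assert_owner /home/agent/.local/bin/claude` would then replace the visual `ls -la` check with a hard pass/fail.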
Verified: shellcheck --severity=warning still clean; end-to-end
re-run on ubuntu-24.04 against published v0.3.2-rc2 PASSes with the
new assertions exercised: claude update 2.1.98 -> 2.1.138, transcript
contains zero EACCES or permission-denied lines, claude owner:group
reports exactly agent:agent.
Refs: AL-36 (this PR's anchor); qa-engineer review feedback.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Replaces the eight-line manual dogfood recipe (Path A in
`.planning/quick/260503-dtx-.../260503-dtx-SUMMARY.md`) with a one-command wrapper.

What's new
- `tests/docker/Dockerfile.dogfood`: ARG-driven Ubuntu base (`UBUNTU_VERSION=22.04|24.04|26.04`), curl + ca-certificates + systemd preinstalled, canonical systemd-in-Docker mask list applied, `/sbin/init` as CMD. Deliberately minimal: only the two packages a real user with curl would have. Anything narrower or broader either makes the test irrelevant or hides AL-30-class bugs.
- `tests/docker/dogfood.sh`: wrapper. Builds the image, runs the systemd container, fires `curl ... | bash` with `AGENTLINUX_VERSION` pinned (sidesteps AL-31 against pre-AL-31 RCs), runs the canonical AGT-02 self-update probe (`agentlinux install claude-code` → `claude --version` → `claude update` → `claude --version`), and reports a PASS/FAIL banner.

Verification (all on the published v0.3.2-rc2 release)
`claude` binary owned by `agent:agent` at `/home/agent/.local/bin/claude` on every Ubuntu version; AGT-02 invariant intact.

Bug surfaced (now filed as AL-37, NOT in this PR's scope)
First end-to-end run of the wrapper failed: `plugin/provisioner/10-agent-user.sh` runs `apt-get install locales` without a preceding `apt-get update`. On a freshly-pulled Ubuntu container with empty `/var/lib/apt/lists/`, that fails with "Package locales has no installation candidate". Earlier dogfood retests masked this because the manual recipe ran `apt-get update` first.
Tactical workaround in this PR: keep the apt cache populated in `Dockerfile.dogfood` (don't `rm -rf /var/lib/apt/lists/*` after install). Documented inline. Strategic fix lives in AL-37: the installer should run `apt-get update` itself before any auto-install branch (mirroring the pattern already in `30-nodejs.sh:33`).

Test plan
- `gate-1-precommit` (only adds two new files, no changes to existing source)
- `gate-2-docker × {22.04, 24.04, 26.04}` (the new files don't affect the bats CI suite; this only adds optional retest tooling)

Out of scope (filed as follow-ups if useful)
- `release.yml` as a publish-time smoke step

Refs: AL-36 (this), AL-30, AL-31, AL-37