Skip to content

feat(tests): one-command dogfood retest wrapper + minimal Docker image (AL-36)#22

Merged
Roo4L merged 2 commits into
masterfrom
feat/al-36-dogfood-image
May 9, 2026
Merged

feat(tests): one-command dogfood retest wrapper + minimal Docker image (AL-36)#22
Roo4L merged 2 commits into
masterfrom
feat/al-36-dogfood-image

Conversation

@Roo4L
Copy link
Copy Markdown
Owner

@Roo4L Roo4L commented May 9, 2026

Summary

Replaces the eight-line manual dogfood recipe (Path A in .planning/quick/260503-dtx-.../260503-dtx-SUMMARY.md) with a one-command wrapper. New shape:

./tests/docker/dogfood.sh                          # ubuntu-24.04, default RC
./tests/docker/dogfood.sh ubuntu-22.04 v0.3.2-rc2  # explicit
./tests/docker/dogfood.sh ubuntu-26.04             # 26.04, default RC

What's new

  • tests/docker/Dockerfile.dogfood — ARG-driven Ubuntu base (UBUNTU_VERSION=22.04|24.04|26.04), curl + ca-certificates + systemd preinstalled, canonical systemd-in-Docker mask list applied, /sbin/init as CMD. Deliberately minimal — only the two packages a real user with curl would have. Anything narrower or broader either makes the test irrelevant or hides AL-30-class bugs.
  • tests/docker/dogfood.sh — wrapper. Builds the image, runs the systemd container, fires curl ... | bash with AGENTLINUX_VERSION pinned (sidesteps AL-31 against pre-AL-31 RCs), runs the canonical AGT-02 self-update probe (agentlinux install claude-codeclaude --versionclaude updateclaude --version), reports PASS/FAIL banner.

Verification (all on the published v0.3.2-rc2 release)

Ubuntu install agentlinux install claude-code (2.1.98) claude update new version
22.04 2.1.138
24.04 2.1.138
26.04 2.1.138

claude binary owned by agent:agent at /home/agent/.local/bin/claude on every Ubuntu version — AGT-02 invariant intact.

Bug surfaced (now filed as AL-37, NOT in this PR's scope)

First end-to-end run of the wrapper failed: plugin/provisioner/10-agent-user.sh runs apt-get install locales without a preceding apt-get update. On a freshly-pulled Ubuntu container with empty /var/lib/apt/lists/, that fails with "Package locales has no installation candidate". Earlier dogfood retests masked this because the manual recipe ran apt-get update first.

Tactical workaround in this PR: keep the apt cache populated in Dockerfile.dogfood (don't rm -rf /var/lib/apt/lists/* after install). Documented inline. Strategic fix lives in AL-37 — the installer should run apt-get update itself before any auto-install branch (mirroring the pattern already in 30-nodejs.sh:33).

Test plan

  • gate-1-precommit (only adds two new files, no changes to existing source)
  • gate-2-docker × {22.04, 24.04, 26.04} (the new files don't affect the bats CI suite — only adds optional retest tooling)

Out of scope (filed as follow-ups if useful)

  • Publishing the image to GHCR so contributors don't build locally
  • Wiring the wrapper into release.yml as a publish-time smoke step
  • A parallel-matrix dogfood across all three Ubuntus from a single invocation

Refs: AL-36 (this), AL-30, AL-31, AL-37

Roo4L and others added 2 commits May 9, 2026 09:31
…e (AL-36)

Replaces the eight-line manual recipe documented in
.planning/quick/260503-dtx-.../260503-dtx-SUMMARY.md (Path A) with a
single command per Ubuntu version. Same security envelope as the
existing tests/docker/Dockerfile.ubuntu-N.M files (systemd as PID 1
with the documented mask list, --privileged + --cgroupns=host
recipe at run time) but with a deliberately minimal package
preinstall — only curl + ca-certificates above the bare Ubuntu base.

Why minimal: anything narrower (no curl preinstalled) shifts the
"first apt-get install" onto the user, which is the bootstrap step
this wrapper is designed to skip. Anything broader (sudo, file, jq,
gnupg, locales preinstalled) silently bypasses the AL-30 dependency-
auto-install gates and lets the dogfood mask the very class of
bugs we built it to catch.

New files:

- tests/docker/Dockerfile.dogfood — ARG-driven Ubuntu base
  (UBUNTU_VERSION=22.04|24.04|26.04), curl + ca-certificates +
  systemd installed, the canonical systemd-in-Docker mask list
  applied, /sbin/init as CMD. Apt cache deliberately retained to
  work around AL-37 (the installer's missing apt-get update before
  installing locales).

- tests/docker/dogfood.sh — wrapper. One-arg form:
  `tests/docker/dogfood.sh ubuntu-24.04 v0.3.2-rc2` builds the
  image, runs the systemd container, fires the curl-pipe-bash with
  AGENTLINUX_VERSION pinned (sidesteps AL-31 against pre-AL-31
  RCs), and on success runs the canonical AGT-02 self-update
  probe end to end. Exit codes: 0 on full pass, 64 on bad arg,
  >0 on installer / agent-install / claude-update failure.
  Tag-shape regex matches packaging/curl-installer/install.sh's
  AGENTLINUX_VERSION validator, so a bad tag fails fast before
  any docker work.

Verified end-to-end on the published v0.3.2-rc2 release across all
three Ubuntu versions:

  ubuntu-22.04  install OK  claude-code 2.1.98 -> 2.1.138 update OK
  ubuntu-24.04  install OK  claude-code 2.1.98 -> 2.1.138 update OK
  ubuntu-26.04  install OK  claude-code 2.1.98 -> 2.1.138 update OK

claude binary owned by agent:agent at /home/agent/.local/bin/claude
on every Ubuntu version. AGT-02 self-update invariant intact —
exactly the shape the bats tests/bats/51-agt02-release-gate.bats
asserts on but now reachable in one command.

Out of scope (filed as separate tickets / follow-ups):

- Publishing the image to GHCR so contributors do not need to build
  locally. Useful but not blocking; contributor-laptop builds are
  fast (~30s on a warm Docker layer cache).
- Wiring the wrapper into the release-pipeline workflow as a smoke
  step. Better as its own PR with the gate-staging logic.
- AL-37 — the locales install in 10-agent-user.sh runs without a
  preceding apt-get update; this PR works around it tactically by
  retaining the apt cache in Dockerfile.dogfood, but the strategic
  fix lives in the installer itself.

Refs: AL-36; AL-30 (the four-bug fix that made this image possible);
      AL-31 (unpinned-resolution; this wrapper exports
      AGENTLINUX_VERSION to sidestep it on retests against
      pre-AL-31 published RCs); AL-37 (apt cache prereq).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…cs; review nits

Addresses qa-engineer review feedback on the AL-36 wrapper:

1. EACCES assertion on the claude-update transcript. The Anthropic
   updater can return exit 0 while still emitting permission-denied
   diagnostics on non-fatal recovery paths — that pattern would
   silently invalidate the AGT-02 release-gate invariant without an
   explicit grep. The wrapper now tees `claude update` output to a
   tmpfile and grep -E -i 's the captured transcript for both EACCES
   and "permission denied"; any match exits non-zero with the
   offending lines printed to stderr. Mirrors the assert_no_eacces
   pattern in tests/bats/51-agt02-release-gate.bats which already
   protects the bats suite.

2. Explicit agent:agent ownership check via stat -c "%U:%G" instead
   of eyeballing ls -la. The bash-level string comparison rejects
   anything other than "agent:agent" with a clear FAIL diagnostic
   (AGT-02 invariant: claude binary must be owned by the agent user
   so self-updates work without sudo or recursive shim).

3. ERR trap that dumps three diagnostic streams BEFORE the cleanup
   trap removes the container: tail -n 80 of /var/log/agentlinux-
   install.log, last 80 systemd journal entries, and the captured
   claude-update transcript if present. Without this, a failed
   installation step lost all log context the moment docker rm fired.

4. usage() heredoc now mentions the v0.3.2-rc2 default tag, the
   AL-31 workaround it represents, and the AGENTLINUX_KEEP_CONTAINER
   plus AGENTLINUX_DOGFOOD_TAG env-var overrides. Previous --help
   output left readers wondering which tag they were pinned to.

5. Dockerfile comment relocated. The "DO NOT add rm -rf
   /var/lib/apt/lists/" warning is now a sibling line immediately
   adjacent to the apt-get clean instruction so a future contributor
   doing an apt-cache cleanup PR sees the AL-37 link before deleting
   the cache. Previously the warning sat above the RUN block; the new
   placement makes the proximity unambiguous.

Verified: shellcheck --severity=warning still clean; end-to-end re-run
on ubuntu-24.04 against published v0.3.2-rc2 PASSes with the new
assertions exercised — claude update 2.1.98 -> 2.1.138, transcript
contains zero EACCES or permission-denied lines, claude owner:group
reports exactly agent:agent.

Refs: AL-36 (this PR's anchor); qa-engineer review feedback.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@Roo4L Roo4L merged commit 77043fa into master May 9, 2026
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant