fix: three dogfood-discovered bugs in user-facing install path by Roo4L · Pull Request #7 · Roo4L/Agent-Linux

Roo4L · 2026-05-02T12:23:09Z

Summary

Fixes three bugs in the v0.3.0/v0.4.0 user-facing install path discovered while running the rc12 dogfood (Tier 1, fresh ubuntu:24.04 Docker container). All three reproduce against current master / v0.4.0; the bats suite passed without catching them because no test exercised the user-facing curl-pipe-bash path end-to-end on a fresh target.

| # | Bug | Fix |
|---|-----|-----|
| 1 | `install.sh` defaults `ORG=agentlinux` → `https://github.com/agentlinux/agent-linux/...` 404s. The v0.4.0 one-liner is broken. | Default to `ORG=Roo4L`. Override-via-env intact. |
| 2 | `--purge` leaves `/etc/sudoers.d/agentlinux` behind, an orphaned NOPASSWD grant after the agent user is removed (Phase 5.1 / ADR-012 / BHV-07 regression). | Add Step 3.5 to `run_purge`: `rm -f /etc/sudoers.d/agentlinux`. |
| 3 | `claude-code/uninstall.sh` trips its own `${AGENTLINUX_AGENT_HOME:?}` guard during `--purge` (cosmetic; `log_warn` caught it but output was noisy). | Export `AGENTLINUX_AGENT_HOME=/home/agent` when `run_purge` invokes per-agent `uninstall.sh`. |

Bats coverage extended: tests/bats/40-registry-cli.bats INST-04 @test now also asserts /etc/sudoers.d/agentlinux is gone after --purge so this regression cannot recur silently.
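For readers unfamiliar with the default-plus-override idiom behind fix 1, here is a minimal sketch. It is not the real `install.sh`: the function name `release_base_url` and the `REPO` variable are assumptions made for illustration; only the `ORG` default and the `releases/download` URL shape come from this PR.

```shell
# Sketch only: release_base_url and REPO are illustrative names, not the
# real install.sh internals. The ORG default and env override are from the PR.
release_base_url() {
  # ${ORG:-Roo4L} keeps a caller-supplied ORG, otherwise falls back to Roo4L.
  local org="${ORG:-Roo4L}"
  local repo="${REPO:-Agent-Linux}"
  printf 'https://github.com/%s/%s/releases/download\n' "$org" "$repo"
}
```

Calling `ORG=myfork release_base_url` would point the same code at a fork, which is the override path the fix leaves intact.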

Dogfood evidence (fresh `ubuntu:24.04` Docker, Tier 1)

=== STEP 1: pipe install.sh through bash ===
[INFO] agentlinux-install complete (transcript: /var/log/agentlinux-install.log)

=== STEP 2: agentlinux --version === → 0.3.0
=== STEP 3: agentlinux list === → 3 agents

=== STEP 4: agentlinux install claude-code ===
claude-code: installed, reports: 2.1.98 (Claude Code)
claude-code: install complete (AGT-02b version-lock satisfied)

=== STEP 5: claude --version === → 2.1.98 (Claude Code)  ✓ AGT-02b

=== STEP 6: claude update (AGT-02 — THE bug class) ===
Current version: 2.1.98
Successfully updated from 2.1.98 to version 2.1.126
OK_NO_PERMISSION_ERRORS                                  ✓ AGT-02

=== STEP 7: --purge ===
[INFO] running uninstall.sh for claude-code
claude-code: uninstall complete
[INFO] --purge complete

=== POST-PURGE FILE-SYSTEM CHECKS ===
OK_opt_gone
OK_user_gone
OK_sudoers_gone        ← the new fix
OK_home_gone
OK_node_kept

8 of 8 dogfood signals pass.
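The post-purge tokens above could be produced by a small helper along these lines. This is a hedged sketch, not the script used for the dogfood run: `check_gone` and the `FAIL_` token are hypothetical, and only the `OK_<label>_gone` token shape and the two paths in the usage note are taken from this PR.

```shell
# check_gone LABEL PATH: print OK_<LABEL>_gone when PATH no longer exists.
# The helper name and the FAIL_ token are assumptions for this sketch.
check_gone() {
  # -L also catches a dangling symlink, which -e alone would miss.
  if [ -e "$2" ] || [ -L "$2" ]; then
    echo "FAIL_${1}_still_present"
    return 1
  fi
  echo "OK_${1}_gone"
}
```

For example, `check_gone sudoers /etc/sudoers.d/agentlinux` and `check_gone home /home/agent` would cover two of the five signals.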

Test plan

Local pre-commit (12 hooks) green including gitleaks and Detect hardcoded secrets.

Local Tier 1 dogfood end-to-end in fresh docker run --rm ubuntu:24.04.
CI test workflow green on this branch (Docker bats matrix + gitleaks + cli-unit).
Reviewer to confirm shape: ship as v0.4.1 patch release on merge, or batch with other v0.4.x work?

🤖 Generated with Claude Code


…ll path

Discovered while running the rc12 dogfood (Tier 1, fresh ubuntu:24.04 docker container). All three reproduce against current master / v0.4.0; the bats suite passed without catching them because none of the existing tests exercised the user-facing curl-pipe-bash path end-to-end on a fresh target.

1. Wrong default ORG in curl-installer: install.sh defaulted ORG=agentlinux. The actual repo is github.com/Roo4L/Agent-Linux, so the GET against https://github.com/agentlinux/agent-linux/releases/download/... returned HTTP 404. Fixed: default to ORG=Roo4L.

2. --purge regression: sudoers drop-in left behind (INST-04 / BHV-07): Phase 5.1 (ADR-012) added /etc/sudoers.d/agentlinux via 20-sudoers.sh but run_purge was never updated. After --purge, the drop-in stayed, leaving an orphaned NOPASSWD grant after the agent user was removed. Fixed: rm -f /etc/sudoers.d/agentlinux as Step 3.5 of run_purge.

3. Per-agent uninstall.sh tripped AGENTLINUX_AGENT_HOME guard on --purge: run_purge invoked uninstall.sh recipes without setting AGENTLINUX_AGENT_HOME (runner.ts normally provides it). Cosmetic warning, purge continued, but noisy. Fixed: export the var when invoking the recipes.

Bats coverage extended: 40-registry-cli.bats INST-04 @test now also asserts /etc/sudoers.d/agentlinux is gone after --purge.

Verified: full local Tier 1 dogfood in fresh ubuntu:24.04 docker container with the fixed install.sh. SHA256 verify ✓, provisioner ✓, agentlinux install claude-code ✓, claude --version = 2.1.98 ✓, claude update exits 0 with zero EACCES (2.1.98 → 2.1.126) ✓, --purge leaves opt + agent user + /home/agent + sudoers gone, Node kept ✓.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
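The two purge-side fixes (items 2 and 3) can be sketched together as follows. This is an illustration under stated assumptions, not the real `run_purge`: the function name `purge_sketch`, the `root` prefix argument (added so the sketch can run against a scratch directory), and running the recipe with `sh` are all hypothetical.

```shell
# purge_sketch ROOT RECIPE: illustrative only; the real run_purge has more
# steps. ROOT is a path prefix so the sketch can be tried without root.
purge_sketch() {
  local root="$1" recipe="$2"
  # Fix 3: recipes guard on ${AGENTLINUX_AGENT_HOME:?}, so pass it in their env.
  if [ -f "$recipe" ]; then
    AGENTLINUX_AGENT_HOME=/home/agent sh "$recipe" \
      || echo "warn: $recipe exited non-zero" >&2
  fi
  # Fix 2 (Step 3.5): remove the sudoers drop-in; rm -f succeeds if it is absent.
  rm -f "${root}/etc/sudoers.d/agentlinux"
}
```

Because `rm -f` tolerates a missing file, running the sketch twice is harmless, which matches the idempotent behaviour the installer tests expect elsewhere.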


Discovered by user dogfood: `agentlinux install gsd` reported success but Claude Code showed zero /gsd-* commands.

Root cause: get-shit-done-cc is a BOOTSTRAPPER, not the GSD command set itself. npm install only puts the bootstrapper binary on PATH; the user (or our recipe) must then run `get-shit-done-cc --global --claude` to actually copy the GSD skill set (~79 skills + hooks + statusline) into ~/.claude/skills/.

Our recipe was technically correct (npm install succeeded, binary on PATH, banner matched pin) but the user-visible intent ("install GSD") was not satisfied. AGT-04 only checked the banner; it never checked that ~/.claude/skills/gsd-* was populated.

install.sh: add post-npm-install step `get-shit-done-cc --global --claude`.

uninstall.sh: add pre-npm-uninstall step `get-shit-done-cc --global --claude --uninstall` for symmetric removal. Also `hash -r` before the post-uninstall `command -v` check: bash hashes the binary path during the bootstrapper invocation; without hash -r the cached entry reports the now-deleted file as still-resolvable.

tests/bats/50-agents.bats AGT-04: new @test asserts ~/.claude/skills/gsd-* count >= 10 after `agentlinux install gsd`. Closes the bats coverage gap that allowed the regression.

Verified end-to-end in fresh ubuntu:24.04 docker container:
- install → 79 gsd-* skills wired ✓
- remove → 0 gsd-* skills, binary gone ✓

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
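The `hash -r` step can be reproduced in isolation. The sketch below uses a throwaway script named `demo-bootstrapper` (a stand-in, not the real bootstrapper) and runs in a subshell so the caller's PATH is left alone. It shows the order of operations only: run the command, delete it, clear the shell's command cache, then ask `command -v`.

```shell
# Demonstrates the order of operations in the uninstall recipe.
# demo-bootstrapper is a throwaway stand-in, not the real binary.
hash_demo() (
  bindir="$(mktemp -d)"
  printf '#!/bin/sh\nexit 0\n' > "$bindir/demo-bootstrapper"
  chmod +x "$bindir/demo-bootstrapper"
  PATH="$bindir:$PATH"
  demo-bootstrapper            # running it lets the shell cache its location
  rm -f "$bindir/demo-bootstrapper"
  hash -r                      # forget every cached command location
  if command -v demo-bootstrapper >/dev/null 2>&1; then
    echo "still resolvable"
  else
    echo "gone"
  fi
  rm -rf "$bindir"
)
```

With the `hash -r` line in place the function prints `gone`; removing that line is a quick way to check how a given shell treats the stale cache entry.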


…i (agent CLI)

The user's intent for the "playwright" catalog entry was Microsoft's @playwright/cli (the token-efficient command-line tool for coding agents that ships a Claude Code skill), NOT the playwright test framework. The previous recipe installed `playwright@1.59.1` (the test framework), downloaded ~281 MB of chromium browser, and called `--with-deps` to apt-install system libraries; none of which the user wanted. The user wanted: an agent-friendly CLI that surfaces /playwright skills inside Claude Code, the same shape as gsd.

Catalog: replace the `playwright` entry with `playwright-cli`:
- id = playwright-cli
- npm_package_name = @playwright/cli
- pinned_version = 0.1.11 (latest published, verified via npm registry)
- homepage = https://playwright.dev/agent-cli/installation

Recipe (mirrors the gsd skill-bootstrapper pattern from the prior commit):

install.sh
1. npm install -g @playwright/cli@<pin>
2. playwright-cli install --skills # wires ~/.claude/skills/playwright-cli/

uninstall.sh
1. playwright-cli install --skills --uninstall (best-effort) + defensive `find ... -iname '*playwright*' -exec rm -rf` to handle version drift in the bootstrapper's --uninstall flag coverage
2. npm uninstall -g @playwright/cli
3. hash -r before the post-uninstall command -v check

Bats coverage updated: tests/bats/50-agents.bats AGT-05: three @tests now cover playwright-cli instead of the test framework: --version matches pin, ~/.claude/skills/ has a *playwright* entry (catches recipe regressing to npm-only), and CLI-03 idempotency holds on re-install.

Verified end-to-end in fresh ubuntu:24.04 docker container:
- agentlinux install playwright-cli → ~/.claude/skills/playwright-cli/ ✓
- playwright-cli --version → 0.1.11 ✓
- agentlinux remove playwright-cli → skill gone, binary gone ✓

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
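A skill-presence check of the kind these tests rely on might look like the sketch below. The helper name `count_skill_dirs` is hypothetical and the actual bats assertions may be written differently; the `-maxdepth 1 -type d -name` shape follows the `find` invocations quoted in this PR.

```shell
# count_skill_dirs DIR PATTERN: count immediate subdirectories of DIR whose
# name matches the glob PATTERN. Hypothetical helper, not the bats code.
count_skill_dirs() {
  # A missing skills directory counts as zero rather than an error.
  [ -d "$1" ] || { echo 0; return 0; }
  find "$1" -mindepth 1 -maxdepth 1 -type d -name "$2" | wc -l | tr -d ' '
}
```

A test could then assert `[ "$(count_skill_dirs "$HOME/.claude/skills" 'gsd-*')" -ge 10 ]` after an install, and expect `0` after a remove.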


Findings consolidated from bash-engineer + security-engineer + qa-engineer + catalog-auditor + behavior-coverage-auditor.

BLOCKERS (would have failed CI):
- tests/bats/50-agents.bats AGT-05: typo `playwright-cli-cli` → `playwright-cli` on the @test name (line 251) and the `agentlinux install` invocation inside the idempotency @test (line 286). The previous string would have hit "no such agent" and made the @test vacuously red.
- tests/bats/40-registry-cli.bats CAT-01: `grep -qw 'playwright'` is not a reliable whole-token check once ids contain `-`: `-w` treats `-` as a non-word character, so it counts as a word boundary and `playwright` matches inside `playwright-cli`. Switched to `grep -qF ' playwright-cli '` against a space-padded id stream so the whole-token check is preserved without the word-boundary trap.
- tests/bats/40-registry-cli.bats CAT-04: stale spot-checks for id=="playwright" + 1.59.1. Updated to id=="playwright-cli" + 0.1.11.

IMPORTANT (would have shipped quality regressions):
- tests/bats/50-agents.bats teardown_file: orphan `remove --force playwright` → `remove --force playwright-cli` so teardown actually cleans the new id.
- tests/bats/50-agents.bats AGT-01 mode loop: parity with the claude variant: added `grep -Eq '[0-9]+\.[0-9]+\.[0-9]+'` semver-shape check so an exit-0 with empty output (e.g. an upstream regression in --version under non-TTY stdin) would not silently pass.
- tests/bats/50-agents.bats AGT-04/AGT-05 stale-state false-positive: added defensive `rm -rf ~/.claude/skills/{gsd-*,*playwright*}` in setup_file BEFORE the per-agent installs so the skill-wired assertions fail loud when the recipe regresses to "npm install only", the regression those @tests are supposed to catch.
- plugin/catalog/agents/gsd/uninstall.sh: bootstrapper `--uninstall` is best-effort (older versions don't ship it). Added defensive `find ~/.claude/skills -maxdepth 1 -type d -name 'gsd-*' -exec rm -rf {} +` after the bootstrapper call so 79+ skill dirs don't leak when --uninstall is broken/absent. Mirrors playwright-cli's pattern.
- plugin/catalog/agents/{gsd,playwright-cli}/install.sh: bootstrapper invocations were unguarded under `set -e`. A re-install on `--force` or a partial-state recovery would abort the recipe even though npm install + version check already succeeded. Wrap each with `|| echo ... >&2` and rely on the post-bootstrapper skill-presence assertion as the real truth check; this closes the idempotency hole.
- plugin/catalog/agents/playwright-cli/install.sh: tighten skill-presence match from broad `-iname '*playwright*'` to `-name 'playwright-cli*'` (-maxdepth 1, -type d). Matches the install side and avoids accidentally asserting against unrelated user skills.
- plugin/catalog/agents/playwright-cli/uninstall.sh: tighten the skill cleanup find from `-iname '*playwright*'` (could collateral-damage a hand-rolled `~/.claude/skills/playwright-notes/`) to `-name 'playwright-cli*'`. Drop `2>/dev/null` so legit rm errors surface in the installer transcript.

NITS:
- tests/bats/50-agents.bats AGT-04/05 __fail messages: dropped the leading `agent` token (copy-paste artifact); replaced `~/.claude/...` (would shellcheck-warn SC2088 inside single quotes) with literal `/home/agent/.claude/...`.
- CLI-02 stale comment: "three real agents (claude-code, gsd, playwright)" → `playwright-cli`. Diagnostic strings updated.

Verified end-to-end in fresh ubuntu:24.04 docker container:
- install gsd → 79 gsd-* skills wired ✓
- install playwright-cli → 1 playwright-cli* skill dir wired ✓
- re-install playwright-cli → "already installed; no-op" ✓
- remove gsd → 0 gsd-* skills, binary gone ✓
- remove playwright-cli → 0 playwright-cli* dirs, binary gone ✓

DEFERRED (flagged in PR comments, not blocking):
- env-via-`env` hardening at plugin/bin/agentlinux-install:250 (pre-existing pattern; matters only if a future stricter sudoers ships).
- REQUIREMENTS.md AGT-04 / AGT-05 spec text drift vs new test contract (architectural; cleaner as a separate spec-update PR, possibly introducing a cross-cutting AGT-06 for "agent install wires its skill set into Claude Code" so per-agent IDs don't duplicate the contract).
- Tag re-validation symmetry in packaging/curl-installer/install.sh:111-113 (pre-existing on master; not introduced by this PR).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
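The space-padded fixed-string check adopted for CAT-01 can be sketched as a small helper. `has_id` is a hypothetical name and the bats file may build its id stream differently; the point is only that padding both the stream and the needle with spaces makes `grep -F` match whole tokens, hyphens included.

```shell
# has_id ID IDS...: succeed only when ID equals one of IDS exactly.
# Illustrative helper, not the code in 40-registry-cli.bats.
has_id() {
  local want="$1"
  shift
  # "$*" joins the remaining ids with spaces; the format string pads both
  # ends so the first and last ids are also surrounded by spaces.
  printf ' %s \n' "$*" | grep -qF -- " ${want} "
}
```

With ids `claude-code gsd playwright-cli`, `has_id playwright-cli ...` succeeds while `has_id playwright ...` fails, which is the distinction the test needs.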

PR #7 review pass missed a runtime issue that the bats CI surfaced: `playwright-cli install --skills` calls initWorkspace() which mkdirs ./.playwright in the CURRENT directory. AgentLinux dispatches recipes from /opt/agentlinux-src/ (a read-only repo bind-mount in the Docker harness), so the bootstrapper crashes with EACCES on .playwright before it can write any skill into ~/.claude/skills/.

Local dogfood missed this because the local container had been manipulated (catalog overlays, etc.) leaving CWD writable. The CI test file (50-agents.bats) runs the recipe AFTER 40-registry-cli.bats's INST-04 --purge, the recovery installer kicks in, and the recipe dispatches with CWD anchored to /opt/agentlinux-src/.

Fix: wrap the bootstrapper invocation in a subshell with `cd "${AGENTLINUX_AGENT_HOME}"` so .playwright lands at /home/agent/.playwright (agent-owned, writable, cleaned by `userdel -r` on --purge).

Verified by reproducing the CI scenario locally:
- install (FRESH) before fix → EACCES, no skill wired ✗
- install (FRESH) after fix → 100% download, skill wired ✓
- install (POST-PURGE) after recovery + fix → skill wired ✓

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
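The subshell-plus-`cd` wrapper described in the fix, combined with the `|| echo ... >&2` guard from the earlier review commit, could be sketched as below. The helper name `run_in_agent_home` and the warning text are assumptions; the recipe itself may inline this rather than use a function.

```shell
# run_in_agent_home CMD...: run CMD with the agent home as working directory.
# The subshell keeps the caller's directory untouched. Hypothetical helper.
run_in_agent_home() {
  (
    cd "${AGENTLINUX_AGENT_HOME:?must be set}" || exit 1
    "$@"
  ) || echo "warn: '$*' exited non-zero; relying on the skill-presence check" >&2
}
```

A recipe could then call `run_in_agent_home playwright-cli install --skills`, so any `.playwright` directory the tool creates lands under the agent's home instead of the read-only source mount.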

…ler allowlist

Two blockers surfaced by the first 26.04 PR cycle (the third, Playwright's chromium-install rejecting ubuntu26.04, is moot after master's #7 dogfood fix swapped the catalog from `playwright` (full + chromium) to `playwright-cli` (@playwright/cli, JS-only, no per-OS browser recipe)).

1. INST-02 / BHV-07 byte-stable re-run failed with "install: No such file or directory" only on the second installer pass. Diagnosed via strace: Ubuntu 26.04 ships uutils-coreutils 0.7.0 (Rust rewrite). Its `install` recursively readlink-chases /dev/stdin → /proc/self/fd/0 → "pipe:[NNN]" and ENOENTs whenever the destination already exists. The first run creates the file (succeeds); the idempotent re-run tries to overwrite the existing file (fails). GNU coreutils opens fd 0 directly and never hits this path.

   Fix: add a portable `write_file_atomic <mode> <dest>` helper to plugin/lib/idempotency.sh with the same atomic-rename semantics as `install -m <mode> /dev/stdin <dest>` but via a same-directory tmpfile so it works on both GNU and uutils. Function-scoped RETURN trap mirrors ensure_marker_block's tmpfile cleanup pattern. Three call sites in plugin/provisioner/40-path-wiring.sh (profile.d, agentlinux.env, cron.d) migrated. The /dev/null source path is unaffected and stays.

2. INST-03 curl-installer fixture rejected 26.04 because packaging/curl-installer/install.sh has its own detect_ubuntu_version allowlist that the AGE-11 patch missed. Extended to 22.04|24.04|26.04 in lockstep with plugin/lib/distro_detect.sh; matching error message updated. Comment makes the lockstep invariant explicit.

Verified locally on the rebased branch: tests/docker/run.sh ubuntu-{22.04,24.04,26.04} all PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
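A same-directory tmpfile helper of the kind described in item 1 might look like the sketch below. It is not the code in plugin/lib/idempotency.sh: the temp-file naming is an assumption, and explicit cleanup on failure is used here in place of the function-scoped RETURN trap so the sketch also runs in plain `sh`.

```shell
# write_file_atomic MODE DEST: copy stdin to DEST through a temp file in
# DEST's own directory, then rename it into place. Sketch only.
write_file_atomic() {
  local mode="$1" dest="$2" tmp
  # Same directory as DEST, so the final mv is a rename on one filesystem.
  tmp="$(mktemp "$(dirname "$dest")/.tmp.XXXXXX")" || return 1
  if cat > "$tmp" && chmod "$mode" "$tmp" && mv -f "$tmp" "$dest"; then
    return 0
  fi
  # Explicit cleanup stands in for the RETURN trap mentioned above.
  rm -f "$tmp"
  return 1
}
```

Usage follows the old call shape: `printf 'content\n' | write_file_atomic 644 /path/to/dest`. Overwriting an existing destination takes the same path as creating it, which is the case the re-run exercised.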

… targets (#5)

* feat(matrix): add Ubuntu 26.04 (Resolute Raccoon) to v0.3.0 supported targets

  AGE-11. Wires 26.04 LTS (released 2026-04-23, codename `resolute`) into the v0.3.0 plugin matrix end-to-end:
  - tests/docker/Dockerfile.ubuntu-26.04 (mirrors 24.04 sibling)
  - tests/docker/run.sh accepts ubuntu-26.04
  - tests/qemu/cloud-images.txt + boot.sh codename map (resolute)
  - plugin/lib/distro_detect.sh + agentlinux-install --help
  - CI matrices: test.yml + nightly-qemu.yml + release.yml gates
  - README, PROJECT, REQUIREMENTS, CLAUDE.md, HARNESS.md copy refresh

  Empirical "installer green on 26.04" verification deferred to the next test.yml PR run + first nightly-qemu cycle.

* fix(26.04): unblock CI on Ubuntu 26.04 - uutils install + curl-installer allowlist

  Two blockers surfaced by the first 26.04 PR cycle (the third, Playwright's chromium-install rejecting ubuntu26.04, is moot after master's #7 dogfood fix swapped the catalog from `playwright` (full + chromium) to `playwright-cli` (@playwright/cli, JS-only, no per-OS browser recipe)).

  1. INST-02 / BHV-07 byte-stable re-run failed with "install: No such file or directory" only on the second installer pass. Diagnosed via strace: Ubuntu 26.04 ships uutils-coreutils 0.7.0 (Rust rewrite). Its `install` recursively readlink-chases /dev/stdin → /proc/self/fd/0 → "pipe:[NNN]" and ENOENTs whenever the destination already exists. The first run creates the file (succeeds); the idempotent re-run tries to overwrite the existing file (fails). GNU coreutils opens fd 0 directly and never hits this path.

     Fix: add a portable `write_file_atomic <mode> <dest>` helper to plugin/lib/idempotency.sh with the same atomic-rename semantics as `install -m <mode> /dev/stdin <dest>` but via a same-directory tmpfile so it works on both GNU and uutils. Function-scoped RETURN trap mirrors ensure_marker_block's tmpfile cleanup pattern. Three call sites in plugin/provisioner/40-path-wiring.sh (profile.d, agentlinux.env, cron.d) migrated. The /dev/null source path is unaffected and stays.

  2. INST-03 curl-installer fixture rejected 26.04 because packaging/curl-installer/install.sh has its own detect_ubuntu_version allowlist that the AGE-11 patch missed. Extended to 22.04|24.04|26.04 in lockstep with plugin/lib/distro_detect.sh; matching error message updated. Comment makes the lockstep invariant explicit.

  Verified locally on the rebased branch: tests/docker/run.sh ubuntu-{22.04,24.04,26.04} all PASS.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
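The version allowlist from item 2 is the kind of check a `case` statement expresses directly. The sketch below is an assumption about shape, not the real `detect_ubuntu_version`: the function names, the error wording, and reading `VERSION_ID` by sourcing an os-release style file are all illustrative. Only the `22.04|24.04|26.04` list comes from this PR.

```shell
# is_supported_ubuntu VERSION_ID: succeed for the releases named in this PR.
# Keep this list in lockstep with any other allowlist in the project.
is_supported_ubuntu() {
  case "$1" in
    22.04|24.04|26.04) return 0 ;;
    *)
      echo "unsupported Ubuntu version: $1 (supported: 22.04, 24.04, 26.04)" >&2
      return 1
      ;;
  esac
}

# read_version_id FILE: print VERSION_ID from an os-release style file.
# Sourced in a subshell so its variables do not leak into the caller.
read_version_id() {
  ( . "$1" && printf '%s\n' "${VERSION_ID:-}" )
}
```

On a real host this would be called as `is_supported_ubuntu "$(read_version_id /etc/os-release)"`.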

Patch on top of v0.3.1 carrying the master-merged follow-ups since the first dogfood failure (AL-18):
- PR #7: three dogfood-discovered installer-path bugs (curl-installer ORG default, --purge sudoers cleanup, GSD + Playwright CLI skill bootstrap wiring, AGENTLINUX_AGENT_HOME export during purge, playwright-cli cd to writable home).
- PR #5: Ubuntu 26.04 (Resolute Raccoon) added to v0.3.0 supported targets.
- PR #11: bump GitHub Actions to Node 24-ready versions.
- PR #13: review-reminder Stop hook + ADR-010 refinement (AL-23).
- PR #14: workspace-cleanup skill.
- PR #4 / #9 / #10: CI / website-deploy fixes.

scripts/build-release.sh enforces a three-way version lock: the tag's base version (after stripping any -rc suffix) must equal both plugin/cli/package.json.version and plugin/catalog/catalog.json.version. Bumping both files to 0.3.2 so v0.3.2-rc1 (and eventually v0.3.2 final) clear the gate.

Refs: AL-18

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
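The three-way lock can be illustrated with plain parameter expansion. This is a sketch, not scripts/build-release.sh: the function name is hypothetical, the two file versions are passed in as arguments rather than read from JSON, and stripping a leading `v` from the tag is an assumption based on the tag names quoted above.

```shell
# check_version_lock TAG CLI_VERSION CATALOG_VERSION
# Succeeds only when the tag's base version equals both file versions.
check_version_lock() {
  local base="${1#v}"        # assumption: tags carry a leading "v"
  base="${base%%-rc*}"       # drop any -rcN suffix
  if [ "$base" = "$2" ] && [ "$base" = "$3" ]; then
    return 0
  fi
  echo "version lock failed: tag=$base cli=$2 catalog=$3" >&2
  return 1
}
```

Under these assumptions `check_version_lock v0.3.2-rc1 0.3.2 0.3.2` passes, and bumping only one of the two files makes it fail.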

Roo4L and others added 5 commits May 2, 2026 13:46

Roo4L force-pushed the fix/curl-installer-org-default-and-purge-sudoers branch from a3eae67 to 7402fce Compare May 2, 2026 13:46

Roo4L merged commit 3ebbab0 into master May 2, 2026
10 of 11 checks passed

Roo4L deleted the fix/curl-installer-org-default-and-purge-sudoers branch May 2, 2026 13:59

Roo4L mentioned this pull request May 2, 2026

chore(release): bump version to 0.4.1 for rc1 #12

Closed

3 tasks

Roo4L mentioned this pull request May 3, 2026

chore(release): bump version to 0.3.2 for rc1 release #16

Merged

5 tasks
