feat: end-to-end install.sh with welcome banner, sudo escalation, setup + agent-setup orchestration#153
Conversation
uzyn
left a comment
There was a problem hiding this comment.
Review: PR #153 — end-to-end install.sh
Changes Overview
install.sh grows from a narrow download-and-place script into a full onboarding pipeline: welcome banner, up-front sudo escalation, download+install, sudo aimx setup with </dev/tty reattached, drop to $SUDO_USER for aimx agent-setup, closing message with the configured domain. Adds a POSIX-sh test harness (tests/install_sh.sh) that sources the script under INSTALL_SH_TEST=1 and exercises the helpers. The upgrade path is preserved and all previously-supported flags/env vars still work. .PLAN.md is correctly absent from the diff.
Scope Alignment
The PR meets every bullet on its own requirements list. No Rust code changes, no new subcommands, no unrelated refactors. Backup is config.toml only (not DKIM/TLS), per spec. --tag, --target, --to, --force, AIMX_VERSION, AIMX_PREFIX, AIMX_DRY_RUN, AIMX_VERBOSE, GITHUB_TOKEN all still honored. Local sh tests/install_sh.sh run: 34/34 pass.
Potential Bugs
1. sudo -u "${_invoker}" -- sh -c 'aimx agent-setup …' may write MCP config into /root, not the invoker's home — Blocker
File: install.sh line 591.
On distros where sudoers does NOT set env_reset (the non-default behavior — some custom-built minimal distros / containers), sudo -u alice preserves the invoking user's HOME=/root. aimx agent-setup then writes its agent plugin bundles (Claude Code, Codex CLI, etc.) under $HOME/.claude/…, $HOME/.config/… — i.e. into /root/.claude/ rather than /home/alice/.claude/. The invoker then cannot find the MCP config in their own shell.
Even on distros that do default to env_reset, sudo -u without -H or -i is documented as not guaranteed to reset HOME across all sudoers configurations. The robust form is sudo -u "${_invoker}" -H -- sh -c '…' or sudo -iu "${_invoker}" -- aimx agent-setup </dev/tty.
Since step 6's entire purpose is to wire MCP into the invoker's agent config, writing it to the wrong HOME silently defeats the feature. No test covers this.
Recommend: add -H (or switch to -i) and, while here, propagate a minimal env whitelist (TERM, LANG) so the TUI picker renders correctly.
2. backup_existing_config and upgrade-path commands unconditionally invoke sudo — Blocker for root-without-sudo installs
File: install.sh line 272, plus every sudo … invocation in the upgrade branch (789–828) and in extract_domain (616).
ensure_sudo short-circuits with return 0 when id -u is 0 — so on a plain root shell (common in containers, Alpine minimal, Docker USER root), the script never verifies sudo is on PATH before continuing. Then:
backup_existing_configrunssudo mv -f "${_cfg}" "${_bak}"— fails when sudo is absent.- The upgrade path runs
sudo sh -c '…'for stop/start service andsudo install -m 0755 …,sudo mv -f …— all fail under root-without-sudo. extract_domain's fallback branch callssudo awk.
On a root shell without sudo installed, the fresh-install path will err out of backup_existing_config even though root can trivially mv the file directly. The upgrade path will fail at stop_service (exit 127 wrapped in || echo unknown, so _init=""), then sudo install fails — partial state.
Simplest fix: early in main, pick the root of privileged commands once:
if [ "$(id -u)" -eq 0 ]; then
SUDO=""
else
SUDO="sudo"
fiand use ${SUDO} mv -f …, ${SUDO} install -m 0755 … everywhere. The ensure_sudo call can then be scoped to the non-root branch. Alternatively, add sudo to the need list inside a [ "$(id -u)" -ne 0 ] guard.
Prior version of the script used install -m 0755 without sudo inside a branch that first verified -w "${PREFIX}"; the writability check is now deleted, so the only remaining path to ${PREFIX} assumes sudo is present.
3. Upgrade-path stop_service inlined in sudo sh -c drops the "no init system" warning — Non-blocker
File: install.sh lines 789–803.
The outer stop_service function (lines 446–464) prints say "warning: no systemd or OpenRC detected; skipping service stop" as a safety message. The inlined version in the upgrade branch omits that warning — silently falls through to printf unknown. Operators on non-systemd/non-openrc hosts lose the diagnostic. Also, the stop_service "systemd service present but not active" branch still prints nothing, so _init="" is captured and the later case '${_init}' in systemd) does not match — meaning if the service was stopped out-of-band before upgrade, the install-side restart is silently skipped. This was latent in the previous code too; flagged here because the rewrite makes it easier to fix (the outer stop_service is still defined and reachable).
Cleaner: keep using the outer stop_service/start_service functions and simply prepend sudo to the systemctl/rc-service calls inside them — rather than re-inlining the whole state machine twice.
4. printf systemd without newline, followed by return 0 — Non-blocker
In both the inlined and outer stop_service, printf systemd emits no trailing newline, which is fine for $(…) capture. But the && return 0 chain means if printf itself fails (EPIPE), return 0 is skipped. Extremely unlikely to bite, but if you're rewriting this block anyway for bug #2, use printf 'systemd'; return 0 unconditionally.
5. ensure_sudo hard-exits on password-prompt failure — Non-blocker
Line 233: sudo -v </dev/tty runs under set -e. If the user mistypes the password three times, sudo -v returns non-zero and the script dies with a cryptic set -e abort rather than a user-visible message. Wrap in if ! sudo -v </dev/tty; then ui_error "…"; exit 1; fi.
6. _msg / _url / _dst / _tag scope leakage — Non-blocker (style)
POSIX sh has no local. All UI helpers use bare _msg="$1", which leaks into the caller's scope. This is consistent with existing style (say, err, need), but worth noting because the new helpers are called more often now (e.g. ui_info inside the upgrade loop clobbers any outer _msg). Not a functional bug unless a caller depends on _msg across the call — none currently do.
Security
SUDO_USER ingestion as sudo -u target — Low
detect_invoker consumes $SUDO_USER verbatim and passes it to sudo -u "${_invoker}" -- sh -c '…'. The command template is single-quoted so no expansion occurs, and sudo -u rejects non-existent users. A hostile actor could set SUDO_USER=$'\nrm -rf'in the env of the install-script pipe, but they're already running the script and have whatever privilege they invoked it with, so no privilege escalation. Still, belt-and-braces: validate${invoker}matches^[a-zA-Z][a-zA-Z0-9_.-]*$before passing tosudo -u`.
backup_existing_config backup naming uses UTC timestamp, no randomness — Low
Two install.sh runs within the same second (e.g. racing CI) would collide: config.toml.bak-YYYYMMDD-HHMMSS → the second sudo mv -f clobbers the first backup. Non-exploitable but loses data. Add $$ or $RANDOM (the latter is bash-only; use date -u +%Y%m%d-%H%M%S-%N where GNU date is available, or append $$).
Test Coverage
Strengths:
- Syntax validation under both
sh -nanddash -n. INSTALL_SH_TEST=1source guard is a clean pattern for testing without runningmain.- Helper-level coverage of
detect_invoker(three branches),backup_existing_config(rename + no-op-when-absent),ensure_sudo(passwordless, password-prompt-message, no-sudo-binary),parse_args. - Dry-run smoke asserts banner + all six step titles + "no filesystem changes."
Gaps:
- No test for
run_agent_setup_as_invoker— the function with the highest concentration of branching logic (invoker detection × TTY presence × euid). Bug #1 above would have been caught by a test that verifiessudo -uis called with-H(or that HOME is reset). - No test for
extract_domain— the domain pulled from/etc/aimx/config.tomldrives the closing message; regressions here would silently show<your-domain>in production. - No test for
print_closing_message— operators see this on every fresh install; worth asserting the domain gets substituted. - No test for
compare_tags— untouched by this PR but the upgrade decision depends on it; note for future. backup_existing_configtest uses asudopassthrough stub — so it doesn't exercise the "root without sudo" path (bug #2). A follow-up test should point PATH at an empty dir, setid -u=0 shim, and assert the helper succeeds (after fix).- No test for
ensure_sudoreturning 0 when euid is 0 — theid -umocking needed is awkward in POSIX sh, but doable with a wrapper script. - No test for
--tag=value/--target=value/--to=valueequals-form — only the space-separated forms are tested.
Code Quality
- Upgrade-branch
sudo sh -c '…long inlined function…'(lines 789–803, 810, 820, 824, 828) is significantly harder to read than the previousstop_service/start_servicecall sites. Prefer theSUDO=prefix pattern suggested in bug #2 and keep the helper functions. ui_*helpers write to stderr,saywrites to stderr. Consistent — good. But the closing-messageclaude -p "…"example is output viasay(stderr), which means an operator cannotcurl | sh 2>/dev/nulland still see the prompt-to-paste. Minor.- Banner text
"🐦⬛ AIMX installer"uses a ZWJ emoji sequence. Acceptable per author note; flagging for awareness.
Summary and Recommended Actions
Overall verdict: Needs minor fixes.
Blockers:
sudo -u "${_invoker}"without-H/-imay write MCP config to/rootinstead of the invoker's home depending on sudoersenv_resetsetting (line 591).backup_existing_config, upgrade-path commands, andextract_domainall callsudounconditionally; they will fail on a root shell that does not havesudoinstalled (lines 272, 616, 789–828).
Non-blockers:
- Upgrade-path
stop_servicere-inlined insidesudo sh -cdrops the "no init system" warning and duplicates existing logic; refactor to reuse the outerstop_service/start_servicewith aSUDO=prefix. ensure_sudopassword-prompt failure path relies onset -erather than emitting a user-visible error.- Missing tests for
run_agent_setup_as_invoker,extract_domain,print_closing_message, and the--flag=valueequals-form ofparse_args. - Backup timestamp has 1-second granularity; second run within the same second silently clobbers the first backup.
_msg/ other underscore-prefixed temporaries leak into the caller scope (POSIX limitation; consistent with existing style).
Nice-to-haves:
- Validate
${_invoker}shape before passing tosudo -u. - Assert on
sudo-calling behavior via test stubs rather than only asserting on user-visible messages.
|
Addressed the two blockers plus three of the non-blockers in 2bba4e6. Blockers fixed
Non-blockers fixed
Left as-is
Verification
|
uzyn
left a comment
There was a problem hiding this comment.
Re-Review: PR #153 — end-to-end install.sh (commit 2bba4e6)
Re-reviewed the diff between da4b778 and 2bba4e6, re-ran sh -n install.sh, dash -n install.sh, and sh tests/install_sh.sh against the PR head. 49/49 tests pass. All previously flagged blockers are fixed and four of the five non-blockers are addressed.
Previously Flagged Blockers — Resolved
B1: sudo -u could write MCP config into /root — FIXED
run_agent_setup_as_invoker (install.sh:608-646) now:
_inv_home="$(getent passwd "${_invoker}" 2>/dev/null | cut -d: -f6)"
if [ -z "${_inv_home}" ]; then
_inv_home="/home/${_invoker}"
fi
sudo -H -u "${_invoker}" env HOME="${_inv_home}" sh -c 'aimx agent-setup </dev/tty' || { … }Belt-and-braces: -H resets HOME via sudoers and env HOME=… pins it explicitly regardless of sudoers policy. The getent passwd lookup sources the authoritative home, with a sane /home/${_invoker} fallback when NSS returns nothing. The sudo stub-based test at tests/install_sh.sh:378-449 asserts [-H], [-u] [alice], and [HOME=/home/alice] all land in argv, and that the outer HOME=/tmp/bogus-outer-home does NOT leak through. The blocker is resolved.
B2: Unconditional sudo fatal on root-without-sudo — FIXED
resolve_sudo_prefix (install.sh:235-244) sets SUDO="" when euid is 0 or when sudo is absent, SUDO="sudo" otherwise. All previously-unconditional sudo invocations now go through ${SUDO}:
backup_existing_config→${SUDO} mv -f(line 305)stop_service→${SUDO} systemctl stop/${SUDO} rc-service … stop(lines 483, 491)start_service→${SUDO} systemctl start/${SUDO} rc-service … start(lines 504, 508)extract_domain→${SUDO} awk(line 659)- Upgrade path:
${SUDO} mkdir -p,${SUDO} mv -f,${SUDO} install -m 0755(lines 816, 840, 847, 850, 859) - Fresh-install path:
${SUDO} install -m 0755(line 880)
ensure_sudo is still defined and called before resolve_sudo_prefix in main (lines 812-813), so non-root-with-sudo still caches credentials up front. Section 7c of the tests (lines 452-516) exercises the root-without-sudo path with an empty PATH (no sudo binary), stubs id -u → 0, and confirms backup_existing_config succeeds and SUDO=[] was resolved. The blocker is resolved.
The one remaining unprefixed sudo outside ensure_sudo/error messages is the intentional sudo -H -u <invoker> de-escalation at line 633 — that one must stay as raw sudo (it's dropping privilege, not escalating), and is only reached on the euid=0 branch where sudo is by definition available.
Previously Flagged Non-Blockers
- NB1 (upgrade path re-inlined
stop_service) — FIXED. The upgrade branch now calls the outerstop_service/start_servicehelpers (install.sh:835, 842, 852, 856, 860). The "no systemd or OpenRC detected" diagnostic (line 495) is reachable again. Logic is no longer duplicated. - NB2 (
ensure_sudosilentset -eon wrong password) — FIXED.ensure_sudonow captures_sudo_rcfromsudo -vand, on failure, emits[error] failed to obtain sudo credentialsbeforeexit 1(lines 258-267). - NB3 (test coverage) — FIXED. Test count is now 49 (was 34). New sections: 7b
run_agent_setup_as_invoker(argv-recording sudo shim asserts-H/-u/HOME=…pinning), 7c root-without-sudo SUDO prefix, 7dextract_domain(happy path + missing-config fallback), 7eprint_closing_message(domain substitution), plus--tag=VAL/--to=VAL/--target=VALequals-form coverage in section 7. Ran locally — 49/49 pass. - NB4 (backup timestamp collision) — FIXED.
_bak="${_cfg}.bak-${_ts}-$$"(line 304) appends the PID; concurrent runs in the same second no longer clobber. - NB5 (underscore-prefixed temporary leakage) — intentionally deferred as consistent with existing POSIX-sh style. Agreed.
New Issues Found
None. No regressions introduced by the fixes.
Still Unresolved
Nothing from the prior review remains unresolved in the blocker/non-blocker tier. The nice-to-have from the first review (regex-validate ${_invoker} before passing to sudo -u) is still outstanding but was classified Low at the time and remains acceptable given sudo -u validates usernames itself.
Summary and Recommended Actions
Overall verdict: Ready to merge.
Both blockers are fixed with proper test coverage, four of five non-blockers are addressed, and the one deferred non-blocker was an explicit style choice. Local verification: sh -n install.sh ✓, dash -n install.sh ✓, sh tests/install_sh.sh ✓ (49/49).
Recommended merge commit message
feat: end-to-end install.sh with welcome banner, sudo escalation, and setup + agent-setup orchestration
Expand install.sh from a download-and-place script into the full onboarding
flow. Prints a branded welcome banner with a six-item checklist, escalates
to sudo once up front (with /dev/tty reattached so `curl | sh` still
prompts), downloads and installs the binary, runs `sudo aimx setup` for
preflight / DNS / TLS / trust / service install, drops back to the
invoking user (with HOME pinned via `sudo -H` + `env HOME=…` from
`getent passwd` so MCP config lands in the invoker's home regardless of
sudoers env_reset policy) for `aimx agent-setup`'s interactive MCP
picker, and closes with a domain-aware example prompt.
A `resolve_sudo_prefix` helper picks `SUDO=""` (already root) or
`SUDO="sudo"` (non-root) once in main() and is threaded through every
privileged call site, so root-shell installs in minimal containers
without sudo installed now succeed. The upgrade path reuses the outer
`stop_service` / `start_service` helpers (no inlined state machine) and
preserves the "no init system" diagnostic. `ensure_sudo` surfaces an
explicit error when `sudo -v` fails instead of a silent set-e abort.
Pre-existing `/etc/aimx/config.toml` is backed up to
`config.toml.bak-YYYYMMDD-HHMMSS-<pid>` (PID suffix prevents same-second
collisions); DKIM keys and TLS certs are left in place.
Add tests/install_sh.sh (49 tests: syntax, INSTALL_SH_TEST=1 source
guard, detect_invoker, backup_existing_config, ensure_sudo,
run_agent_setup_as_invoker with argv-recording sudo shim, root-without-
sudo SUDO="" path, extract_domain, print_closing_message, parse_args
including --flag=value equals-form, and AIMX_DRY_RUN=1 smoke). Script
stays POSIX sh and passes both `sh -n` and `dash -n`.
Update book/installation.md to reflect the new end-to-end flow and the
config.toml backup behavior.
uzyn
left a comment
There was a problem hiding this comment.
Re-Review (Cycle 3): PR #153 — setup wizard moved into the binary (commit 56831fa)
Re-reviewed the diff 2bba4e6..56831fa against the cycle-3 scope (move banner / 6-step checklist / agent-setup orchestration into the binary; thin install.sh to a downloader). Local verification:
cargo fmt -- --checkcleancargo clippy --all-targets -- -D warningsclean- Targeted Rust tests:
setup::tests307 pass,term::tests9 pass (coversstep_glyph+ the newChecklist/SETUP_STEPS/banner/argv/section-order/re-entrant-skip cases) sh -n install.sh,dash -n install.shcleansh tests/install_sh.sh40/40 pass (matches the implementer's claim)
Changes Overview
The cycle-3 commit reorganises aimx setup to (a) own the welcome banner, the six-step checklist, and the closing message, and (b) drive aimx agent-setup via Command::status() so control returns and the closing message prints after step 6. install.sh is thinned: removed print_welcome_banner / print_final_banner / print_step_list / set_step / _step_glyph / STEP{N}_* / run_aimx_setup / run_agent_setup_as_invoker / extract_domain / print_closing_message, replaced with a thin two-line print_install_banner. ensure_sudo now runs before the GitHub tag-resolution roundtrip. New term::StepState + term::step_glyph (Unicode ☐ ◐ ☑ ☒ ✗ on TTY, ASCII [ ] [~] [x] [-] [!] non-TTY). The DKIM keypair is generated earlier — in step 2 — so the DNS guidance table can render the actual p= value.
Resolved from prior cycles
- Section ordering (Preflight → Domain & DNS → TLS → Trust → Install → MCP) — the six
section_header(N, SETUP_STEPS[N-1])calls appear in spec order at unconditional positions inrun_setup.SETUP_STEPSmatches the spec wording exactly. - Step 6 double-invocation —
install.shno longer callsaimx agent-setup; only the binary'sdrop_through_to_agent_setupdoes. Confirmed by source-grep (run_agent_setup_as_invokeris gone; tests/install_sh.sh has a regression guard at lines 159–172). - TTY handoff —
drop_through_to_agent_setupnow usesCommand::status()(notCommandExt::exec) and therunuser -u $SUDO_USER -- /proc/self/exe agent-setupinvocation inherits the parent's stdin/stdout/stderr so the agent-setup TUI still gets the controlling terminal. Control returns after the TUI exits. ensure_sudoreordering —main()runsensure_sudo+resolve_sudo_prefixonly whenDRY_RUN != "1", before anydownloadcall. The new fail-fast test (lines 362–442) confirmscurlis never invoked beforeensure_sudoexits when sudo is missing.- CLAUDE.md compliance — no raw
.red()/.green()/.yellow()/.blue()/.bold()calls insetup.rsorterm.rsoutsideterm.rs's own helper bodies; noSprint/User Story/Acceptance criteriastrings in code orbook/.
Potential Bugs
1. Step-6 final state never visible to the operator — Non-blocker
run_setup (lines 2293–2315) prints print_final_banner(&pre_handoff) with step 6 forced to Running (◐) before the drop-through. After drop_through_to_agent_setup returns, the checklist is mutated with mcp_state (Done / Skipped) and print_closing_message(&domain) is called, but print_final_banner is not reprinted. Effect: the operator sees step 6 as ◐ during the TUI handoff, and then a closing-message paragraph — but never sees step 6 flip to ☑ or ☒. The PR description's "tick 6 → closing message" sequence does not match what the code actually emits.
This contradicts the PR's stated UX contract ("operators see ☐ flip to ☑ / ☒ as a clear summary"). Fix is small: call print_final_banner(&checklist) after checklist.set(6, mcp_state) and before print_closing_message. Non-blocker because the closing message itself is informative, but worth fixing.
2. Closing-message wording diverges from spec on the trailing word "emails" — Non-blocker
Spec text from the review request:
"Your agents now have access to set up, send and receive emails from @ emails."
Implementation (print_closing_message, line 2344):
println!(
"Your agents now have access to set up, send and receive emails from {}.",
term::highlight(&format!("@{domain}"))
);The trailing word "emails" is dropped, and book/setup.md matches the code rather than the spec. Meaning is unambiguous so this is non-blocking, but flagging since the user explicitly highlighted the trailing "emails" in the review brief.
3. Re-entrant gating uses two different criteria for steps 3 vs 4 — Non-blocker (informational)
Step 3 (TLS) skips when already_configured (= service_running && cert_exists && dkim_exists). Step 5 (Install) uses the same criterion. Step 4 (Trust) skips when config_path.exists() alone. On a partial install — config file exists but service hasn't been started — step 3 runs fully (TLS gen) and step 4 silently skips (trust preserved). The trust skip is actually correct (preserve operator's prior choice), but the criterion mismatch is worth a comment so a future maintainer doesn't accidentally collapse them. No functional bug.
4. Early DKIM keypair generation leaks state on Ctrl-C between step 2 and step 5 — Non-blocker
Lines 2147–2150 generate the DKIM keypair at step 2 (DNS section) so the p= value is available for the DNS table. generate_keypair(&dkim_root, false) is not internally idempotent (returns Err if private.key exists and force=false), but the call is guarded by an explicit !dkim_root.join("private.key").exists() check so re-runs are fine. finalize_setup (step 5) has the same guard at line 1496–1502 and prints "DKIM keypair already exists." Idempotent end-to-end.
The minor regression: if the operator aborts at the trust prompt (step 4) on a fresh install, /etc/aimx/dkim/{private,public}.key are now present on disk while config.toml is not. Re-running picks up where they left off (good). But "wipe and start over" now requires deleting both dkim/ and config.toml, where previously DKIM keys were only created in step 5. The implementer's own scope note marks this as an unplanned change; flagging as informational.
5. Source-grep section-order assertion only validates lexical order — Non-blocker
run_setup_emits_section_headers_in_spec_order (lines 5656–5698) finds the literal substrings "section_header(1," … "section_header(6," in setup.rs's source and asserts the byte offsets are monotonically increasing. The implementer's note acknowledges that per-thread stdout capture made dup2-based output assertion impractical. The source-grep is a reasonable proxy because the section_header calls are at unconditional top-level positions in run_setup and not inside a helper or branch — but if a future refactor moves any of these calls into a helper (or inside an early-return branch), the test would silently stop testing runtime order. Worth a comment in the test pinning the assumption ("call sites must remain top-level in run_setup for this source-grep to mean what it claims").
The complementary run_setup_reentrant_path_marks_tls_trust_install_skipped test (lines 5700–5730) is also pure source-grep on checklist.set(N, term::StepState::Skipped) and "Step N skipped" literals. Same caveat.
Security
Nothing new. The Command::status() invocation passes runuser -u $SUDO_USER -- /proc/self/exe agent-setup [--data-dir …]. $SUDO_USER flows into the runuser -u argument unvalidated; runuser itself rejects malformed usernames so injection isn't possible, but the empty-string case is handled separately (line 2411) and the code never feeds $SUDO_USER into a shell. build_agent_setup_argv deliberately omits --dangerously-allow-root (regression-tested at lines 5489–5502).
Test Coverage
Strong on the new pieces:
setup_steps_match_spec_wordinglocks in the exact six titleschecklist_defaults_all_pending_and_supports_per_step_mutationcovers theChecklistAPIprint_step_list_renders_six_titled_lines/welcome_banner_includes_title_and_step_titles/final_banner_signals_completion_and_renders_step_listlock in the banner structure (source-grep, see #5)build_agent_setup_argv_*(4 tests) cover the runuser argv shape, including the never-implicit---dangerously-allow-rootregression guardrun_setup_marks_step_six_skipped_when_sudo_user_unsetconfirms all threeAgentSetupOutcomearms are referenced inrun_setup(source-grep)run_setup_reentrant_path_marks_tls_trust_install_skippedsource-greps the skip-mutation contract for steps 3/4/5term::tests::step_glyphs_use_unicode_on_tty/step_glyphs_use_ascii_fallback_on_non_ttycover the new Unicode/ASCII switch
Gaps:
- No behavioural test that the closing message prints AFTER
drop_through_to_agent_setupreturns (only an argv-shape test plus a source-grep that the function exists). A future refactor that inadvertently moves the closing message before the handoff would not be caught. - No test that
print_final_banneris reprinted with step-6 ☑/☒ at the end — and indeed the code does not reprint it (see Bug #1). A test for this would have caught the gap. - No coverage of the
AgentSetupOutcome::Faileduser-facing message distinction. BothSkippedandFailedmap toterm::StepState::Skipped, but the operator-visible eprintln forFaileddiffers (warning + status code vs. guidance to run agent-setup). Worth at least a snapshot-style test.
Code Quality
AgentSetupOutcome::FailedandAgentSetupOutcome::Skippedboth render as ☒. The user-facing distinction is preserved via different eprintln wording (drop_through_to_agent_setuplines 2412–2421 vs 2449–2456). Acceptable, but I'd argue the operator would benefit fromAgentSetupOutcome::Failedrendering asStepState::Error(✗ red) instead ofSkipped(☒ cyan) so a TUI crash is visually distinct from a "no SUDO_USER, run it manually" advisory. Currently they share the same glyph. Non-blocker.- The double banner —
install.sh's "AIMX installer" two-liner immediately followed by the binary's "🐦⬛ AIMX setup" welcome — is documented as intentional in the PR description. Acceptable. print_install_bannerwrites to stderr (lines 115–119) consistent with the rest ofinstall.sh. The binary's banner / step list / closing message all write to stdout. Operators pipingcurl | sh > log.txt 2>&1see them interleaved correctly; pipingcurl | sh > log.txtonly would lose the install.sh banner but capture all the binary output. Acceptable for an installer.
Summary
Verdict: Ready to merge.
The cycle-3 refactor delivers what the PR description promises: the binary owns the wizard UX, install.sh is a thin downloader, the section order matches the spec, the agent-setup drop-through returns control, and the previously flagged blockers (B1: HOME pinning; B2: root-without-sudo) remain fixed. All 40 install-script tests pass; targeted Rust tests pass; no clippy warnings; no fmt drift.
The five non-blockers above are minor (closing-message phrasing trim, step-6 banner not reprinted, source-grep scope, early DKIM key write, gating-criterion drift). None are correctness or security issues; none block merge. The implementer should consider addressing #1 (reprint final banner with terminal step-6 state) and #2 (closing-message trailing "emails") in a follow-up since they are user-facing.
Recommended merge commit message
feat: end-to-end install.sh + binary-owned setup wizard
Move the operator-facing setup UX into the `aimx setup` binary so the
same flow runs whether the operator hit the installer one-liner or
invoked `aimx setup` directly. The binary now owns the welcome banner,
the six-step checklist (☐ → ◐ → ☑ / ☒ via `term::step_glyph`, with
ASCII fallback), the agent-setup drop-through, and the closing message.
Sections render in spec order: Preflight → Domain & DNS → TLS → Trust →
Install → MCP. Re-entry on a configured host marks steps 3, 4, 5 as
skipped (☒) instead of re-running. Drop-through uses `Command::status()`
so control returns after the agent-setup TUI exits and the closing
message prints after step 6.
`install.sh` is reduced to a thin downloader: parse args → two-line
"AIMX installer" banner → platform/need checks → AIMX_DRY_RUN early
exit → `ensure_sudo` (now BEFORE the GitHub tag-resolution call so a
non-root user without sudo bails before any network roundtrip) →
`resolve_sudo_prefix` (`SUDO=""` when already root or sudo absent;
`SUDO="sudo"` otherwise) → download + extract → install binary →
`backup_existing_config` → `exec ${SUDO} aimx setup </dev/tty`
(non-TTY fallback for CI). Upgrade path unchanged — non-interactive
binary swap + service restart, reusing the outer `stop_service` /
`start_service` helpers.
Tests: 40/40 in `tests/install_sh.sh` (syntax, source-guard, helper
removal regression, fail-fast-before-network, `parse_args` equals-form,
root-without-sudo SUDO=""); new Rust tests for `SETUP_STEPS` wording,
`Checklist` mutations, `term::step_glyph` (TTY + non-TTY), welcome /
final banner structure, section-order invariant, re-entrant skipped-
step contract, `AgentSetupOutcome` dispatch, and `build_agent_setup_argv`
shape (including never-implicit-`--dangerously-allow-root`). Updates
`book/installation.md` and `book/setup.md` to reflect the binary's
ownership of the wizard.
…rify re-entrant gating Five non-blocker items from the cycle-3 review on PR #153: setup.rs (run_setup): - Reprint print_final_banner AFTER drop_through_to_agent_setup returns, so the operator actually sees step 6 flip from ◐ to ☑ / ☒ before the closing message. Previously the banner was only printed BEFORE the drop-through with step 6 forced to Running, and the resolved step-6 state was never visible. - Restore the trailing word "emails" in the closing message so it reads "...emails from @<DOMAIN> emails." per the spec. - Add a comment above the trust-step skip-gate documenting why it uses `config_path.exists()` instead of the composite `already_configured` used by steps 3 and 5: the goal is to never overwrite an operator's prior trust decision, even on partial- install re-entries where `already_configured` would be false. - Add CAVEAT comments above the source-grep section-order and re-entrant-skip tests pinning the assumption that section_header and checklist.set calls remain literal top-level call sites in run_setup. Future helper extractions need to either inline the helper for the test or convert to stdout-capture invariants. - Add closing_message_preserves_trailing_emails_word test that locks in both the source template and the rendered substring. - Add run_setup_reprints_final_banner_after_drop_through test that asserts run_setup contains at least two print_final_banner calls and that the second one comes AFTER drop_through_to_agent_setup. book/setup.md: - Sync the closing-message reference with the trailing "emails" word. book/troubleshooting.md: - Add a "Restarting setup from scratch" section explaining that aborting at the trust prompt now leaves the DKIM keypair on disk (because step 2 generates it early so the DNS guidance table can print the actual `p=` value). A hard reset means clearing both /etc/aimx/config.toml and /etc/aimx/dkim/, plus optionally /etc/ssl/aimx/ for a fresh TLS cert. Verification: - cargo fmt clean - cargo clippy --all-targets -- -D warnings clean - cargo test: 1357 + 95 + ... all green; setup::tests 309 (was 307) - sh -n install.sh / dash -n install.sh clean - sh tests/install_sh.sh: 40/40 pass - AIMX_DRY_RUN=1 sh install.sh prints the thin banner and the binary handoff line as before.
|
Addressed the five non-blockers from the cycle-3 re-review (commit
Verification (all green):
|
uzyn
left a comment
There was a problem hiding this comment.
Re-Review: PR #153 — cycle 4 (3cac841)
Changes Overview
Cycle-4 commit (3cac841) addresses the 5 non-blockers from the cycle-3 review. Touches src/setup.rs (+96 lines, including 2 new tests), book/setup.md (1 line), and book/troubleshooting.md (+20 lines). No install.sh changes, so the previously-resolved cycle-2 blockers (HOME pinning for agent-setup, ${SUDO} prefix for root-without-sudo) cannot regress in this diff.
Resolution of Cycle-3 Non-Blockers
NB1 — Step 6 final state never visible — FIXED.
Verified runtime order in run_setup (src/setup.rs:2300-2326):
- line 2304:
print_final_banner(&pre_handoff)with step 6 forced toRunning(◐) — interim banner before handoff. - line 2313:
drop_through_to_agent_setup(...)— control returns when the TUI exits (becauseCommand::status()is used, notexec). - line 2319:
checklist.set(6, mcp_state)resolves step 6 toDone/Skippedbased on the agent-setup outcome. - line 2323:
print_final_banner(&checklist)reprints the banner with step 6 in its terminal state — operator now sees the ◐→☑/☒ flip. - line 2326:
print_closing_message(&domain)fires last.
The test run_setup_reprints_final_banner_after_drop_through (src/setup.rs:5462-5497) source-greps the run_setup body to assert (a) ≥2 print_final_banner( call sites, and (b) at least one of them comes AFTER drop_through_to_agent_setup(. Both assertions hold against the current source. The test passes (cargo test setup::tests::run_setup_reprints_final_banner_after_drop_through).
NB2 — Closing message dropped trailing "emails" — FIXED.
src/setup.rs:2354-2357 now reads:
println!(
"Your agents now have access to set up, send and receive emails from {} emails.",
term::highlight(&format!("@{domain}"))
);Test closing_message_preserves_trailing_emails_word (src/setup.rs:5435-5459) asserts the literal substring "emails from @{domain} emails." against both the source template and the rendered output (with domain = "agent.example.com" substituted). Not a fuzzy match — it's rendered.contains(&format!("emails from @{domain} emails.")), which is a literal-string String::contains call. book/setup.md:76 agrees with the code.
NB3 — Re-entrant gating uses two different criteria — DOCUMENTED.
src/setup.rs:2244-2250 adds a 7-line comment above the trust-step skip-gate explaining that trust uses config_path.exists() (broader, preserves operator decision) while steps 3 and 5 use the composite already_configured (stricter, only skips when full prior install exists). Code-clarity fix only; no behavior change. Reads naturally and is in the right place.
NB4 — Early DKIM keys leak on Ctrl-C — DOCUMENTED.
book/troubleshooting.md:40-58 adds a "Restarting setup from scratch" section that explains why the DKIM keypair persists after an aborted run (step 2 generates it early so the DNS guidance table can render the actual p= value), and gives a copy-pasteable hard-reset recipe (rm -rf /etc/aimx/config.toml /etc/aimx/dkim/, optional /etc/ssl/aimx/). Operator-facing guidance is sufficient; the underlying lifecycle quirk is intentional.
NB5 — Source-grep tests don't validate runtime order — DOCUMENTED.
CAVEAT comments added above run_setup_emits_section_headers_in_spec_order (src/setup.rs:5731-5739) and run_setup_reentrant_path_marks_tls_trust_install_skipped (src/setup.rs:5784-5790). Each comment names the assumption (literal top-level call sites in run_setup) and states the recovery path if the assumption ever breaks (inline the helper for the test, OR convert to a stdout-capture invariant). Documentation-only, no behavior change.
Verification
All checks clean against 3cac841:
cargo fmt -- --check— cleancargo clippy --all-targets -- -D warnings— cleancargo test— 1357 unit + 95 integration + smaller crates all pass; 0 failures- The two new named tests pass individually:
setup::tests::closing_message_preserves_trailing_emails_word— 1 passedsetup::tests::run_setup_reprints_final_banner_after_drop_through— 1 passed
sh -n install.sh— cleandash -n install.sh— cleansh tests/install_sh.sh— 40/40 passed
No Regressions
The cycle-4 commit only touches book/setup.md, book/troubleshooting.md, and src/setup.rs. The previously-resolved cycle-2 blockers (B1 HOME pinning via runuser -u <sudo_user>, which inherits HOME from the user's passwd entry — equivalent to the sudo -H flag; B2 ${SUDO} prefix for root-without-sudo) live in install.sh and drop_through_to_agent_setup's runuser invocation respectively, and neither is touched.
Summary and Recommended Actions
- Overall verdict: Ready to merge.
- Blockers: none.
- Non-blockers: none — all 5 from the cycle-3 review are resolved or properly documented.
- Nice-to-haves: none worth deferring.
Recommended merge commit message
feat: end-to-end install.sh + binary-owned setup wizard with branded onboarding
Expand the one-line installer from a narrow download-and-place script into a
full onboarding pipeline. install.sh becomes a thin downloader: ensure-sudo
fail-fast (before any GitHub roundtrip), download the tarball, install the
binary, exec sudo aimx setup. The binary owns the welcome banner, the six-step
checklist (☐/◐/☑/☒/✗ glyphs with ASCII fallback), the section ordering
(Preflight → Domain & DNS → TLS → Trust → Install → MCP), the runuser-based
drop-through to aimx agent-setup as $SUDO_USER, and the closing message with
@<DOMAIN> substitution.
Setup is idempotent on re-entry: TLS / DKIM / install steps skip when a prior
install is detected; trust is preserved whenever any config exists. Step 6
fires Command::status() so control returns to the wizard, the final banner
reprints with the resolved step-6 state, and the closing message lands last.
install.sh: HTTPS-only download, --tag/--target/--to/--force flags +
AIMX_VERSION/AIMX_PREFIX/AIMX_DRY_RUN/AIMX_VERBOSE/GITHUB_TOKEN env vars
preserved, atomic upgrade swap (stop_service → mv → start_service) intact,
config.toml backup with $$ collision suffix, ${SUDO} prefix for root-without-
sudo containers, sudo -v with /dev/tty reattached so curl|sh prompts work.
Tests: 1357 unit + 95 integration + 40 install.sh shell tests all green.
sh -n / dash -n clean; cargo fmt / clippy clean.
uzyn
left a comment
There was a problem hiding this comment.
Re-Review: PR #153 — cycle 5 (bcac18d)
Changes Overview
Cycle-5 commit (bcac18d) addresses four UX gaps:
prompt_domainandprompt_trusted_sendersare reworked into a block-prompt → echo →[Y/n]confirm → loop pattern via three new private helpers (prompt_confirm,prompt_confirm_list,read_yn_line). Invalid input no longer cancels — it loops; declining the echo re-prompts; empty trusted-senders surfaces an explicit "Leave hooks disabled? [Y/n]" instead of silently disabling them.- Each
Done/Skippedstep inrun_setupredraws the full six-row checklist via new helpersprint_step_complete(n, &checklist)andprint_step_skipped(n, &checklist, reason)that wrap a status line +print_step_listredraw + bracketing blanks. drop_through_to_agent_setupnow resolvesargv[0]viastd::env::current_exe()instead of the literal/proc/self/exesorunuserno longer re-execs/usr/sbin/runuserand dies withexe: user agent-setup does not exist. On the rarecurrent_exe()failure the wizard surfaces the error and lands step 6 asSkippedrather than hardcoding/usr/local/bin/aimx(which would trip theinstall_service_file_has_no_silent_fallbackregression guard).agent_tui::renderheader is reworded to "Setting up MCP integration for AI agents for<USER>." + sub-line "Select which AI agents you want to set up AIMX MCP for:". The "Space toggles..." hint moved below the rows, right above the cursor.interact'sclear_last_linesis bumped fromrows.len() + 2torows.len() + 4to match the new layout (3-line header band + N rows + 1-line trailing hint). Username flows in viaOption<&str>(resolved at therun_tuicallsite viaenv.caller_username()), with a literal<user>fallback when the lookup returns None. The doc-comment example inagent_tui.rsis updated to match.
14 new tests added (1357 → 1371 unit tests); install.sh tests grew 40 → 45 between cycle 4 and cycle 5 (the +5 are from ac36cf5's install.sh keyboard-hang fix, not from bcac18d).
Verification
All checks clean against bcac18d:
cargo fmt -- --check— cleancargo clippy --all-targets -- -D warnings— cleancargo test— 1371 unit + 95 integration + others all pass; 0 failures;setup::tests::prompt_*family runs green; newagent_tui::tests::render_*cases passsh tests/install_sh.sh— 45/45 pass (the +5 vs cycle 4 are fromac36cf5's[ -t 0 ]gating fix forcurl|sh, unrelated to cycle 5)
Resolution of Cycle-5 scope
1. Block-form prompt + echo + [Y/n] confirm — DONE
prompt_domain (src/setup.rs:1848-1883) wraps the read in a loop. The block form is:
Enter the domain you want to use for email
(e.g. agent.example.com).
>
Then the operator's input goes through validate_domain (failure prints ✗ <reason> and continues — no abort), through prompt_confirm(reader, "Domain", &domain) (echo + [Y/n]), and finally through the existing read_yn_line for the DNS-access sanity check. Sanity-check kept its existing inline question form deliberately, which I think is correct — the secondary "Do you control this domain?" gate is supplementary to the echo+confirm pattern, not redundant with it.
prompt_trusted_senders (src/setup.rs:1771-1816) adopts the same shape: block prompt → > cursor → on Ok(senders) non-empty, prompt_confirm_list echoes Trusted senders: + bullet list + Confirm? [Y/n]; on Ok(empty), prints the warning then read_yn_line(reader, "Leave hooks disabled?") so empty input is no longer a silent commit. On invalid sender, the existing MAX_TRUSTED_SENDERS_ATTEMPTS budget still applies; the cycle-5 reset of attempt = 0 on operator-decline is correct (only invalid-parse entries should burn the budget; "let me try a different list" is operator-driven and shouldn't trip the abort threshold).
The three helpers (prompt_confirm, prompt_confirm_list, read_yn_line) all use the same !buf.trim().chars().next().is_some_and(|c| c == 'n' || c == 'N') form, which means:
- blank Enter → YES (default)
y/yes/ any input not starting withn/N→ YESn/no/nope/N→ NO
Consistent with previous behavior; confirmed by the six dedicated prompt_confirm_yn_* tests at lines 4858-4906.
2. Per-step checklist redraw — DONE
print_step_complete and print_step_skipped (src/setup.rs:2112-2134) emit blank line + status line + blank + 6-row checklist + blank. Six call-sites in run_setup (steps 1, 2, 3, 4, 5; some Skipped) all route through these helpers. The print_step_complete_emits_full_checklist test (line 4894) source-greps to verify (a) the helper redraws the full list and (b) no straggling raw println!("Step N complete.") survived in run_setup.
This adds ~54 lines of output across the 6 steps (9 lines × 6) plus the welcome+final banner redraws (~12 lines). It's verbose but matches the implementer's stated "operators see ☐ flip to ☑/☒ as a clear summary after every section" UX promise. I flagged this in the brief as "potential redraw spam" — I think it's fine as-is for a one-shot wizard run; it would be different if it ran in a tight loop.
3. runuser /proc/self/exe bug fix — DONE, deviation is sound
drop_through_to_agent_setup (src/setup.rs:2547-2636) now resolves argv[0] in our process context via std::env::current_exe() (line 2575). The deviation from the plan's unwrap_or_else(_ -> /usr/local/bin/aimx) fallback is the right call:
- The codebase already has a
install_service_file_has_no_silent_fallbackregression guard (src/setup.rs:5658-5754) that AST-walkssetup.rsand asserts the literal/usr/local/bin/aimxdoes NOT appear outside#[cfg(test)]modules. The plan's hardcoded fallback would have tripped it. - The graceful-skip path (lines 2576-2590) is consistent with how
get_aimx_binary_path(line 419) already handlescurrent_exe()failures — that one propagates via?and aborts service-file install. The drop-through can't abort because step 6 is best-effort (operator can re-run agent-setup manually); skipping with a guidance message is the right operator-visible behavior. - The operator-visible message is actionable: prints a
warning:line with the underlyingio::Error, then a prompt-mark line "Runaimx agent-setupas<sudo_user>to wire aimx into Claude Code, Codex, etc." — covers the recovery path. - On a healthy box,
current_exe()essentially never fails (it's a thin wrapper around/proc/self/exereadlink on Linux; failure paths are chroot without/procmounted, or sandboxes that block the syscall). So the only behavior change vs. the spec'd fallback is in scenarios where the spec's fallback would have also been wrong (the literal/usr/local/bin/aimxmay not exist under customAIMX_PREFIX).
The renamed test build_agent_setup_argv_uses_resolved_exe_path (src/setup.rs:5887-5926) replaces the old ..._proc_self_exe_shape and asserts (a) std::env::current_exe() is called somewhere in setup.rs, and (b) the literal PathBuf::from("/proc/self/exe") does NOT appear inside the drop_through_to_agent_setup body. Both assertions are scoped tightly enough not to false-positive on doc comments elsewhere.
4. agent_tui::render header rewording — DONE, deviation is acceptable
render (src/agent_tui.rs:339-396) now takes username: Option<&str> directly rather than &dyn AgentEnv. The interact function threads it through from run_tui's call site (line 131-132), where env.caller_username() is invoked once. The deviation from the plan's "thread &dyn AgentEnv" is fine:
AgentEnvis a trait with many methods, only one of which (caller_username) is needed inrender. Threading the entire trait would couplerenderto the widerAgentEnvsurface for no benefit.- Resolving once at the
run_tuicall site is correct —caller_username()is fixed for the lifetime of the process (it'sgetpwuid(geteuid())), so re-resolving on every redraw would be wasteful. - The
Option<&str>shape signals "may be absent" at the type level. Theunwrap_or("<user>")fallback at line 345 producesSetting up MCP integration for AI agents for \`.` when the passwd lookup returns None.
The <user> fallback isn't ideal (the user brief asked for "your user" or omit-entirely). It's not wrong — the angle-bracket form is a recognizable placeholder convention — but it's not maximally graceful either. Non-blocker, flagged below.
The redraw-bounds bump in interact (src/agent_tui.rs:294: rows.len() + 2 → rows.len() + 4) is correct: the new layout is banner-line + sub-line + blank + N rows + hint = N + 4. Verified by reading render body.
The doc-comment example at lines 14-25 matches the live render output exactly (including the Setting up MCP integration for AI agents for \ubuntu`.` header, the sub-line, the blank, the rows, the blank, and the trailing hint with the arrow).
New issues found
NB1 — <user> fallback when caller_username() returns None — Non-blocker
render falls back to a literal <user> token (src/agent_tui.rs:345):
Setting up MCP integration for AI agents for `<user>`.
This is a recognizable placeholder convention but reads slightly awkwardly. Two readable alternatives:
term_handle.write_line(&term::header("Setting up MCP integration for AI agents.").to_string())?whenusername.is_none()— drop the trailingfor X.clause entirely.- Replace
<user>withyour userso the sentence reads naturally.
In practice this branch is essentially never taken (every container with passwd-resolvable user hits the Some(...) path), so this is purely cosmetic for the long tail of edge cases. Non-blocker.
NB2 — prompt_confirm first-character-only check accepts typos starting with y/anything-else — Non-blocker (informational, pre-existing)
The check !buf.trim().chars().next().is_some_and(|c| c == 'n' || c == 'N') means nope / no / n → NO, but anything else → YES, including typos like s (slip on the keyboard), q, 1, etc. The existing prompt_domain_* tests document this contract via prompt_confirm_yn_parses_no_word (covers no\n) but no test covers a deliberate typo. Pre-existing pattern in the codebase (the previous prompt_domain confirm logic was identical), so the cycle-5 commit is just inheriting it. Non-blocker.
NB3 — prompt_confirm_list and prompt_confirm have copy-pasted parser logic — Non-blocker (style)
Both helpers parse the response with the same first-char-of-trim is_some_and(|c| c == 'n' || c == 'N') chain. So does read_yn_line. Consider extracting a parse_yn_default_yes(buf: &str) -> bool helper. Three call sites is the threshold where extraction starts paying for itself. Non-blocker; readability tweak only.
NB4 — Output stream consistency: status messages on stdout, warnings on stderr — Non-blocker (informational)
print_step_complete / print_step_skipped use println! (stdout). The current_exe() failure path in drop_through_to_agent_setup uses eprintln! for the warning but println! for the recovery prompt. This is consistent with how the rest of run_setup handles things — happy-path narration on stdout, errors on stderr — so no change needed. Flagging only because the user brief asked me to spot-check.
Mergeable status
gh pr view 153 --json mergeable,headRefOid returns:
mergeable: "CONFLICTING"mergeStateStatus: "DIRTY"headRefOid: "bcac18d84601955a582b300c7232ab84638db7c0"
The PR currently conflicts with main. This is not a code-quality issue in the cycle-5 commit; it's because main advanced during the cycle. Operator action: rebase onto current main and force-push. No code change needed.
Things that remain fixed from earlier cycles
Spot-checked — all previously-resolved blockers from cycles 2-4 remain in place:
- HOME pinning (cycle-2 B1):
runuser -u $SUDO_USER -- <exe> agent-setupinherits HOME fromgetpwuid(SUDO_USER).pw_dirperrunuser's standard behavior, equivalent tosudo -H -u. No regression. - Root-without-sudo (cycle-2 B2):
${SUDO}prefix ininstall.shstill resolved viaresolve_sudo_prefix. No regression. - TTY-stdin-isatty gate (cycle-3 fix in
ac36cf5):[ -t 0 ]gating on</dev/ttyredirection still in place. No regression. - Step-6 final state visible (cycle-3 NB1):
print_final_banner(&checklist)reprint afterdrop_through_to_agent_setupreturns, at the same call-site (src/setup.rsaround line 2459). Still in place. - Closing-message trailing "emails" (cycle-3 NB2):
"...emails from @<DOMAIN> emails."still inprint_closing_message. - Section ordering (cycle-3): six
section_header(N, SETUP_STEPS[N-1])calls in spec order. No regression. build_agent_setup_argvno implicit--dangerously-allow-root(cycle-3): regression guard still asserted.
Summary and Recommended Actions
- Overall verdict: Ready to merge — pending rebase.
- Blockers: none.
- Non-blockers:
- NB1 (
<user>placeholder fallback) — cosmetic - NB3 (copy-paste in three confirm helpers) — readability tweak
- NB1 (
- Nice-to-haves: none worth deferring.
- Mergeable note: PR is currently
CONFLICTING/DIRTYagainstmain. Rebase and force-push before merging — no code changes required.
Recommended merge commit message
feat: end-to-end install.sh + binary-owned setup wizard with confirm-after-prompt UX
Expand the one-line installer from a narrow download-and-place script into
a full onboarding pipeline. install.sh becomes a thin downloader: ensure-
sudo fail-fast (before any GitHub roundtrip), download the tarball,
install the binary, exec sudo aimx setup. The binary owns the welcome
banner, the six-step checklist (☐/◐/☑/☒/✗ glyphs with ASCII fallback),
the section ordering (Preflight → Domain & DNS → TLS → Trust → Install →
MCP), the runuser-based drop-through to aimx agent-setup as $SUDO_USER,
and the closing message with @<DOMAIN> substitution.
Setup is idempotent on re-entry: TLS / DKIM / install steps skip when a
prior install is detected; trust is preserved whenever any config exists.
Step 6 fires Command::status() so control returns to the wizard, the
final banner reprints with the resolved step-6 state, and the closing
message lands last.
Operator UX:
- prompt_domain and prompt_trusted_senders use a block-prompt + echo +
[Y/n] confirm + loop pattern. Invalid input re-prompts (no abort);
declining the echo re-prompts; empty trusted-senders surfaces an
explicit "Leave hooks disabled?" rather than silently disabling them.
- Each Done / Skipped step in run_setup redraws the full six-row
checklist so progress is visible after every section.
- drop_through_to_agent_setup resolves argv[0] via std::env::current_exe()
instead of the literal /proc/self/exe (runuser would otherwise re-exec
/usr/sbin/runuser and parse "agent-setup" as a username).
- agent_tui header reworded to "Setting up MCP integration for AI agents
for `<user>`." with a "Select which AI agents you want to set up AIMX
MCP for:" sub-line; the Space/Enter/q hint moved below the rows.
install.sh: HTTPS-only download, --tag/--target/--to/--force flags +
AIMX_VERSION/AIMX_PREFIX/AIMX_DRY_RUN/AIMX_VERBOSE/GITHUB_TOKEN env vars
preserved, atomic upgrade swap (stop_service → mv → start_service)
intact, config.toml backup with $$ collision suffix, ${SUDO} prefix for
root-without-sudo containers, sudo -v with /dev/tty reattached so
curl|sh prompts work, [ -t 0 ] gating so direct `./install.sh` invocations
don't double-attach /dev/tty (which would deadlock under sudo use_pty).
Tests: 1371 unit + 95 integration + 45 install.sh shell tests all green.
sh -n / dash -n clean; cargo fmt / clippy clean.
… + agent-setup orchestration Expand the one-line installer from a narrow download-and-place script into the full onboarding flow. Prints a branded welcome banner with a six-item checklist, escalates to sudo once up front (with /dev/tty reattached so `curl | sh` still gets a password prompt), downloads and installs the binary, runs `sudo aimx setup` for preflight / DNS / TLS / trust / service install, drops back to the invoking user for `aimx agent-setup`'s interactive MCP picker, and closes with a domain-aware example prompt. Preserves the existing upgrade path (service stop -> atomic swap -> restart), all flags and env vars (--tag / --target / --to / --force, AIMX_VERSION / AIMX_PREFIX / AIMX_DRY_RUN / AIMX_VERBOSE / GITHUB_TOKEN), and HTTPS-only download policy. Backs up any pre-existing /etc/aimx/config.toml to config.toml.bak-YYYYMMDD-HHMMSS before setup runs; DKIM keys and TLS certs are left in place so deliverability survives re-runs. Add tests/install_sh.sh which sources install.sh under INSTALL_SH_TEST=1 (new guard that suppresses main auto-invocation) and unit-tests detect_invoker, backup_existing_config, ensure_sudo, parse_args, plus an AIMX_DRY_RUN=1 smoke that confirms the banner, step list, and absence of filesystem changes. Script stays POSIX sh (no bashisms) and passes both `sh -n` and `dash -n`. Update book/installation.md to reflect the new end-to-end flow and the config.toml backup behavior.
- Drop into invoker with `sudo -H -u <user> env HOME=<passwd-home>` so `aimx agent-setup` writes MCP config under the invoker's home instead of /root on distros that don't force env_reset in sudoers. - Resolve a $SUDO prefix once in main() and thread it through every privileged call site (backup_existing_config, run_aimx_setup, extract_domain, stop_service/start_service, upgrade mv/install). Scripts running as root on containers without sudo now succeed. - Refactor upgrade path to reuse stop_service/start_service helpers (removes the inlined sudo sh -c state machine and restores the "no init system" diagnostic). - ensure_sudo surfaces an explicit error when sudo -v fails instead of a silent set -e abort. - Append $$ to config.toml backup suffix so concurrent runs cannot collide within the same second. - Tests: cover run_agent_setup_as_invoker (asserts -H / -u / HOME pinning via a sudo argv-recording shim), SUDO="" on root-without-sudo, --flag=value equals-form of parse_args, extract_domain, and print_closing_message.
The welcome banner now uses ☐ / ◐ / ☑ / ☒ / ✗ glyphs (with [ ] / [~] /
[x] / [-] / [!] ASCII fallbacks for non-UTF-8 locales) and is reprinted
once at the end with the final per-step states, so operators see ☐ flip
to ☑ / ☒ as a clear summary. Each of run_aimx_setup, the upgrade path,
and run_agent_setup_as_invoker mutates STEP{1..6}_STATE at the right
boundaries (running → done / skipped / error) so both the closing banner
and the upgrade-path banner reflect what actually happened.
Locale detection caches the UTF-8 decision in _AIMX_UTF8 once at import
time so banner prints stay cheap. Color emission still goes through
_ui_color_enabled, so NO_COLOR=1 and non-TTY stderr keep producing
plain glyphs.
Tests grow from 49 → 70: glyph coverage in both UTF-8 and ASCII-fallback
locales, set_step / print_step_list state mutation reflected in output,
print_welcome_banner asserts only pending glyphs render initially, and
print_final_banner asserts the post-run states render with the
completion title.
The shell script previously owned the welcome banner, the six-step
checklist, the agent-setup invocation, and the closing message. That
left two real problems: install.sh had no way to flip individual boxes
☐→☑ as `aimx setup`'s sections completed, and step 6 was being
double-invoked because the binary already drops through to
`aimx agent-setup` on its own.
This commit moves all of that into `aimx setup` so the binary works the
same end-to-end whether you ran it via the installer or directly.
setup.rs:
- Adds Checklist + SETUP_STEPS plus print_welcome_banner /
print_final_banner / print_step_list. Sections render in spec order:
Preflight → Domain & DNS → TLS → Trust → Install → MCP, with each
section header reading the literal spec wording.
- Reorders run_setup so the trust prompt comes after TLS and the DNS
guidance + verify loop sit under "Set up domain and DNS".
- On re-entry (cert + DKIM + service all present) steps 3, 4, 5 mark
☒ skipped instead of re-running.
- Switches drop_through_to_agent_setup from CommandExt::exec to
Command::status() so control returns to `aimx setup` and the closing
message can fire AFTER the agent-setup TUI exits. Step 6 ticks ☑ on
zero exit, ☒ on non-zero or `$SUDO_USER` unset (warning, not fatal).
- Adds a closing message with `@<DOMAIN>` substitution and the
`claude -p "..."` example. Removes the old single-line success
banner.
term.rs:
- Adds StepState + step_glyph: ☐ pending / ◐ running / ☑ done /
☒ skipped / ✗ error on TTY, [ ] / [~] / [x] / [-] / [!] non-TTY.
install.sh:
- Drops print_welcome_banner / print_final_banner / print_step_list /
set_step / _step_glyph / _supports_unicode / STEP{N}_TITLE,STATE /
run_aimx_setup / run_agent_setup_as_invoker / extract_domain /
print_closing_message. The binary owns all of this now.
- Replaces them with a thin two-line "AIMX installer" banner.
- Moves ensure_sudo earlier in main() so a non-root invoker without
sudo bails BEFORE the GitHub tag-resolution roundtrip. Dry-run is
honored first so unprivileged auditors can still audit.
- Fresh-install tail is now a single `exec ${SUDO} aimx setup
</dev/tty` (with non-TTY fallback). Backup-existing-config still
runs immediately before the exec.
- Upgrade path stays non-interactive; binary swap + service restart
only.
tests/install_sh.sh:
- Drops tests for removed helpers; adds a fail-fast test that
confirms ensure_sudo bails before the GitHub network call when
sudo is missing on a non-root box.
- Updates the dry-run smoke to expect the thin banner (no checklist).
book/installation.md and book/setup.md updated to reflect the binary's
ownership of the wizard UX, the spec section ordering, and the closing
message.
…rify re-entrant gating Five non-blocker items from the cycle-3 review on PR #153: setup.rs (run_setup): - Reprint print_final_banner AFTER drop_through_to_agent_setup returns, so the operator actually sees step 6 flip from ◐ to ☑ / ☒ before the closing message. Previously the banner was only printed BEFORE the drop-through with step 6 forced to Running, and the resolved step-6 state was never visible. - Restore the trailing word "emails" in the closing message so it reads "...emails from @<DOMAIN> emails." per the spec. - Add a comment above the trust-step skip-gate documenting why it uses `config_path.exists()` instead of the composite `already_configured` used by steps 3 and 5: the goal is to never overwrite an operator's prior trust decision, even on partial- install re-entries where `already_configured` would be false. - Add CAVEAT comments above the source-grep section-order and re-entrant-skip tests pinning the assumption that section_header and checklist.set calls remain literal top-level call sites in run_setup. Future helper extractions need to either inline the helper for the test or convert to stdout-capture invariants. - Add closing_message_preserves_trailing_emails_word test that locks in both the source template and the rendered substring. - Add run_setup_reprints_final_banner_after_drop_through test that asserts run_setup contains at least two print_final_banner calls and that the second one comes AFTER drop_through_to_agent_setup. book/setup.md: - Sync the closing-message reference with the trailing "emails" word. book/troubleshooting.md: - Add a "Restarting setup from scratch" section explaining that aborting at the trust prompt now leaves the DKIM keypair on disk (because step 2 generates it early so the DNS guidance table can print the actual `p=` value). A hard reset means clearing both /etc/aimx/config.toml and /etc/aimx/dkim/, plus optionally /etc/ssl/aimx/ for a fresh TLS cert. Verification: - cargo fmt clean - cargo clippy --all-targets -- -D warnings clean - cargo test: 1357 + 95 + ... all green; setup::tests 309 (was 307) - sh -n install.sh / dash -n install.sh clean - sh tests/install_sh.sh: 40/40 pass - AIMX_DRY_RUN=1 sh install.sh prints the thin banner and the binary handoff line as before.
The post-install handoff was unconditionally re-pointing stdin at
/dev/tty when the controlling terminal existed:
if has_tty; then
exec ${SUDO} aimx setup </dev/tty
That breaks on modern Ubuntu/Debian where sudo ships with `Defaults
use_pty`. With use_pty, sudo allocates a pty and bridges its own
stdin↔pty-master to the child. When the operator runs `./install.sh`
directly (stdin already the terminal), redirecting `</dev/tty` opens a
separate fd to the same controlling terminal — sudo bridges from that
fd while keystrokes are queued on the original fd 0, the child blocks
on read forever, and the foreground process group disconnect prevents
SIGINT from reaching it. Operator sees a dead keyboard and a
non-responsive Ctrl+C.
Standard installer pattern (rustup, nvm, brew, OpenClaw): only
re-attach /dev/tty when stdin is NOT already a terminal — i.e., the
curl|sh case. Gate on `[ -t 0 ]` instead of /dev/tty existence. Apply
the same fix to ensure_sudo's `sudo -v` site since the latent defect
is identical (passwordless sudo on the OVH image masked it during
testing). Drop the now-unused has_tty helper.
Add five regression assertions in tests/install_sh.sh:
- `[ -t 0 ]` appears in 2+ sites
- unredirected `exec ${SUDO} aimx setup` form is present
- redirected `exec ${SUDO} aimx setup </dev/tty` appears exactly once
- redirected handoff is gated behind `elif [ -e /dev/tty ]`
- `sudo -v </dev/tty` is gated behind `elif [ -e /dev/tty ]`
Test count 40 → 45, all pass; sh -n / dash -n clean;
AIMX_DRY_RUN=1 smoke clean.
Four UX fixes for the `aimx setup` and `aimx agent-setup` flows: - prompt_domain and prompt_trusted_senders now use the standard block-prompt + echo + [Y/n] confirm + loop pattern. The operator sees the parsed value (or list) before committing, can decline to re-enter, and empty trusted-senders surfaces an explicit "leave hooks disabled?" confirm rather than silently disabling them. - Each `Done` / `Skipped` step in `run_setup` now redraws the full six-row checklist via new helpers `print_step_complete` and `print_step_skipped` so progress is visible after every section, not just at the welcome and final banners. - `drop_through_to_agent_setup` resolves its argv[0] via `std::env::current_exe()` instead of passing the literal string "/proc/self/exe". `runuser` was re-execing /proc/self/exe in its own context — which points at /usr/sbin/runuser — causing the child runuser to parse "agent-setup" as a username and die with "exe: user agent-setup does not exist". Step 6 now lands as the multi-select TUI drop-through it was always meant to be. - The agent-setup TUI header is reworded to "Setting up MCP integration for AI agents for `<user>`." with a sub-line "Select which AI agents you want to set up AIMX MCP for:". The username comes from the existing `AgentEnv::caller_username()` (getpwuid(geteuid())) helper. The "Space toggles, Enter confirms, q cancels." hint moved below the rows, right above the cursor.
- agent_tui::render: when caller_username() returns None, omit the trailing `for `<user>`` clause entirely instead of rendering the literal placeholder. Header becomes the clean "Setting up MCP integration for AI agents." Add a render_omits_for_clause_when_ username_is_none test. - setup::parse_yn: extract the !buf.trim().chars().next().is_some_and( |c| c == 'n' || c == 'N') idiom shared by prompt_confirm, prompt_confirm_list, and read_yn_line into a single private helper. Add a parse_yn_handles_default_yes_and_n_first_char test covering blank, y/Y/yes/random, n/N/no/NO/nope, and leading whitespace.
bcac18d to
d80c89e
Compare
|
Cycle 6 — review polish + rebase on Non-blockers addressed
Rebase Branch was rebased on top of current Verification
PR is now |
uzyn
left a comment
There was a problem hiding this comment.
Re-review (cycle 6)
This re-review verifies cycle 6's claims: NB1 (<user> placeholder fallback) and NB3 (parse_yn extraction) are addressed, the rebase on main did not regress earlier cycles, and CI is now green.
Changes Overview
Cycle 6 polishes two non-blocker items from cycle 5 and rebases the branch on top of main:
src/agent_tui.rs: whencaller_username()returnsNone, the header now readsSetting up MCP integration for AI agents.(no trailingfor ...clause, no<user>placeholder). New testrender_omits_for_clause_when_username_is_noneasserts the forbidden placeholder string is not present in therenderbody and smoke-renders withusername=None.src/setup.rs: extracted a singleparse_yn(&str) -> boolhelper;prompt_confirm,prompt_confirm_list, andread_yn_linenow all delegate to it. New testparse_yn_handles_default_yes_and_n_first_charcovers blank/Enter/whitespace/y/Y/n/N/leading-whitespace cases.- 40 commits from
main(mostly dependabot bumps +aed7c21"friendly aimx uninstall") came in via rebase. Thebook/setup.mdconflict was resolved in favor of the PR's wording, which accurately describes the cycle-5 wizard behavior (welcome banner, six-section ordering, re-entry skipping steps 3/4/5, step-6 drop-through viaCommand::status()).
Verification
NB1 (agent_tui placeholder) — fixed in src/agent_tui.rs:347-350. The match username produces a clean fallback string with no for ... clause when None. The new test (render_omits_for_clause_when_username_is_none) walks the render function body and asserts the literal <user> token is absent — a strong, source-grep-based regression guard. Verified.
NB3 (parse_yn extraction) — fixed in src/setup.rs:1719-1733 (helper definition) with all three callers (prompt_confirm, prompt_confirm_list, read_yn_line) now collapsed to Ok(parse_yn(&buf)). The new unit test exercises 18 input cases including blank, whitespace-only, y, Y, yes, Yeah, ok, 1, n, N, no, NO, nope\n, n, \tno\n, and N . The default-yes / first-char-n semantics are preserved exactly. Verified.
Rebase preservation spot-checks:
install.sh:776still has the cycle-4[ -t 0 ]keyboard-hang fix (lines 776–784:if [ -t 0 ]; then exec ${SUDO} aimx setup; elif [ -e /dev/tty ] && [ -r /dev/tty ]; ...).src/setup.rsstill hasSETUP_STEPS,print_welcome_banner,print_final_banner,print_step_list,print_step_complete, andprint_step_skipped. Each section usessection_header(N, SETUP_STEPS[N-1])thenprint_step_complete(N, &checklist)to redraw the full ticking checklist.src/setup.rs:2600usesstd::env::current_exe()with explicit error handling that surfaces a warning and returnsAgentSetupOutcome::Skippedon failure — no literal/proc/self/exeis hardcoded into the runtime path. The book documentation reference to/proc/self/exeis descriptive, not a runtime constant.src/agent_tui.rs:127-130uses the spec wordingSetting up MCP integration for AI agentsand now correctly handlesNoneper NB1.
book/setup.md conflict resolution — the PR's wording in lines 58–82 correctly describes the cycle-5 behavior (welcome banner + six-line checklist, six sections in spec order, ☒ skipped semantics on re-entry for steps 3/4/5, Command::status() drop-through, the closing message format with @<DOMAIN> substitution). Accurate.
Build / lint / test:
cargo fmt --check— clean.cargo clippy --all-targets -- -D warnings— clean.cargo test --bin aimx— 1373 passed, 0 failed, 3 ignored (matches claim).cargo test --test '*'— 95 passed, 0 failed, 5+1+1 ignored = 7 ignored across integration suites (matches claim of 95 integration).sh tests/install_sh.sh— 45/45 pass (the script reports 45, slightly higher than the 40 claimed in the PR body — additional cycle-5 tests landed; not a problem).
CI status (gh pr checks 153): all required checks PASS — Analyze (rust), Analyze (actions), Analyze (javascript-typescript), CodeQL, build, core-tests, docs-build, verifier-tests, dependabot. deploy is skipping (preview-only, expected on PRs). The earlier UNSTABLE state was simply checks-pending, not failures.
Mergeable state: MERGEABLE, mergeStateStatus: CLEAN (was CONFLICTING / UNSTABLE before this cycle).
Summary
- Overall verdict: Ready to merge
- Blockers: none
- Non-blockers: none
- Nice-to-haves: none
Both cycle-5 non-blockers are genuinely fixed (not just claimed) — NB1 has a new regression test that walks the render function body to forbid the placeholder string, and NB3's parse_yn is exercised by 18 input cases. The rebase preserved every cycle-1..5 change I spot-checked: the keyboard-hang [ -t 0 ] gate, the welcome banner / 6-step checklist / per-step print_step_complete redraw, the current_exe() fix in drop_through_to_agent_setup, and the agent_tui spec wording. The book/setup.md conflict resolution accurately describes the actual cycle-5 wizard behavior.
Recommended merge commit message
feat: end-to-end install.sh with welcome banner, sudo escalation, setup + agent-setup orchestration (#153)
Make `curl -fsSL https://aimx.email/install.sh | sh` drive the full
provisioning flow in one continuous wizard instead of leaving the
operator with two more manual commands. The binary owns the welcome
banner, the six-section checklist (Preflight → DNS → TLS → Trust →
Install → MCP), and the closing message. `install.sh` becomes a thin
downloader: fail fast on missing sudo before any network call,
download the tarball, install the binary, then `exec sudo aimx setup`
with stdin handled correctly for both interactive shells and
`curl|sh` pipelines. Re-entry preserves existing TLS / trust /
service config (steps 3-5 mark skipped) so `aimx setup` is also the
"wire another agent" path. Step 6 re-execs `aimx agent-setup` as
`$SUDO_USER` via `runuser` and `Command::status()` so control returns
to the wizard for the closing message; `current_exe()` resolves the
real binary path with explicit error handling instead of relying on a
`/proc/self/exe` constant. Trust prompt validates entries with up to
five re-prompts, defaults to empty list under `AIMX_NONINTERACTIVE=1`
with a logged warning. Adds `parse_yn` helper, Unicode/non-TTY step
glyphs, and 1373 unit + 95 integration + 45 install_sh tests.
When `aimx setup` drops through to `aimx agent-setup` via `runuser`, the child inherits sudoers' `secure_path` — typically just system bin dirs. Per-user installs of `claude`, `codex`, `gemini`, `goose`, etc. under `~/.local/bin`, `~/.npm-global/bin`, or nvm shims aren't on that PATH, so template registration fails even though the operator's interactive shell sees the binary fine. Add a `probe_login_shell` fallback that spawns `sh -lc 'command -v "$1"'` (POSIX, not bash — works on Alpine / busybox), passes the binary name positionally as `$1` to keep argv injection structurally impossible, and filters the result to absolute paths pointing at executable regular files. Wire it into `AgentEnv::probe_binary`'s default impl after the fast `probe_path` walk; the fast path still wins when the binary lives in `$PATH`. Refresh the operator-facing error message to reflect that we now search both PATH and the login shell.
uzyn
left a comment
There was a problem hiding this comment.
Cycle-7 re-review: PR #153
Cycle-7 scope (sole focus)
AgentEnv::probe_binary now falls back to a POSIX login-shell probe (sh -lc 'command -v "$1"' sh <name>) when probe_path returns None. All cycle-7 verification points check out:
Argv-injection invariant — confirmed safe
The spawn is Command::new("sh").arg("-lc").arg("command -v \"$1\"").arg("sh").arg(binary_name). With sh -lc <cmd>, the next positional arg becomes $0 and subsequent args become $1, $2, …. The binary name therefore reaches command -v as a single literal argv entry — never re-parsed by the shell. probe_login_shell_immune_to_argv_injection pins this by attempting a sentinel-deletion payload (foo; rm -f <sentinel>) and asserting the sentinel survives. The test would fail loudly if the implementation ever switched to string-interpolation, so the property is genuinely locked in.
Edge cases — all handled and (mostly) tested
- Empty
binary_name→ guard at line 788–790 returnsNone. Tested byprobe_login_shell_returns_none_for_empty_name. - Builtins (
cd) →command -vprints the bare namecd, which fails thepath.is_absolute()filter at line 809. Tested byprobe_login_shell_rejects_non_absolute_resolution. - Functions / aliases → same path as builtins (bare name, not absolute). Same coverage.
- Non-existent binary → exit status non-zero / empty stdout. Both branches covered (
output.status.success()at 798,raw.is_empty()at 803). Tested byprobe_login_shell_returns_none_for_missing. - Path resolves but isn't a regular file →
meta.is_file()filter at 813. (Untested directly but theis_file()predicate is unambiguous.) - Path is a regular file but not executable →
nix::unistd::access(..., X_OK)filter at 816–818 catches it. (Untested directly but the access call is the standard X_OK probe.)
Sequential probe ordering — fast path stays fast
probe_path(name).or_else(|| probe_login_shell(name)): when $PATH contains the binary, probe_path returns Some and or_else is never evaluated, so no sh is spawned. The fallback only fires when the inherited $PATH actually misses, which in practice is once per register_template call (≤7 agents).
Error message update — correctly threaded
src/agent_setup.rs:1377 now emits "Could not find '{bin}' in $PATH or your login shell. Install it, then re-run 'aimx agent-setup {}'." The pinning test at src/agent_setup.rs:2503 (msg.contains("Could not find 'claude' in $PATH or your login shell")) was updated to match. No stale assertions remain.
Integration-style fallback test — faithful runuser simulation
probe_binary_default_falls_back_when_path_misses is the right shape. It:
- Builds a tmp
home/bin/<sentinel>outside any system PATH. - Writes
~/.profilethat prependshome/binto PATH. - Sets the parent
$PATHto asecure_path-shaped string (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin) that letsshspawn but cannot resolve the sentinel. - Sets HOME to the tmp dir so
sh -lsources the test's~/.profile. - Calls
env.probe_binary(sentinel)and asserts the sentinel resolves.
If probe_login_shell were absent or unreachable from probe_binary, probe_path would miss against the secure-like PATH and expect() would panic — so this is a real positive signal, not a tautology. Env restoration before the assertions prevents leakage. serial_test::serial correctly serializes the env mutation.
Other cycle-7 details
sh(POSIX) notbash— works on Alpine/musl. ✓- Returns canonicalised
PathBufmatchingprobe_path's shape. ✓ - Doc comment on
probe_login_shelldescribes the use case (runuser secure_path) and the safety invariants (positional-arg passing). ✓
Prior cycles' work — spot check
install.sh:248,776still gates the</dev/ttyredirect on[ -t 0 ](keyboard-hang fix from cycle 5).src/setup.rsstill has welcome banner,print_step_completeper-step redraw,current_exe()for the runuser drop-through, and theparse_ynextraction.src/agent_tui.rs:348-349still rendersSetting up MCP integration for AI agents for \`.with theNonefallback toSetting up MCP integration for AI agents.`
No regressions to prior-cycle artifacts.
Verification (run locally)
| Check | Result |
|---|---|
cargo fmt --check |
clean |
cargo clippy --all-targets -- -D warnings |
clean |
cargo test --bins |
1379 passed (was 1373 → +6 cycle-7 tests) |
cargo test --tests |
95 passed |
sh -n install.sh |
clean |
dash -n install.sh |
clean |
sh tests/install_sh.sh |
45/45 pass |
| CI checks | All 9 pass (build, core-tests, verifier-tests, docs-build, Analyze rust/actions/javascript-typescript, CodeQL); deploy correctly skipped |
| Mergeable | MERGEABLE |
Verdict: Ready to merge
No blockers, no non-blockers, no nice-to-haves. The cycle-7 fallback is implemented correctly, tested at the right granularity, doesn't regress the fast path, and tightens the error message to reflect the new behavior.
Fall back to sh -lc 'command -v X' for agent CLI path discovery
When `aimx setup` drops through to `aimx agent-setup` via runuser, the
child inherits sudoers' secure_path. Per-user installs of `claude`,
`codex`, `gemini`, `goose`, etc. under ~/.local/bin, ~/.npm-global/bin,
or nvm shims aren't on that PATH, so template registration fails even
though the operator's interactive shell sees the binary fine.
AgentEnv::probe_binary now falls back to a POSIX login-shell probe
(sh -lc 'command -v "$1"' sh <name>) after probe_path misses. The
binary name is passed positionally as $1, structurally preventing argv
injection; results are filtered to absolute paths pointing at
executable regular files. Refresh the operator-facing error message to
reflect that we now search both PATH and the login shell.
…ude Code detect fix
Five fixes bundled around the agent-wiring surface:
- Reword the `aimx setup` root-required error message. The old text
pointed operators at `sudo aimx setup <domain>`, which both pushed
the (now-hidden) positional and missed the common case where
someone running aimx-already-up just wanted to wire an agent. The
new error explicitly names both paths: `sudo aimx setup` for the
server, `aimx agents setup` (no root needed) for the agent.
- CLI restructure: agent management moves to `aimx agents
setup|remove|list` (plural noun + verbs, matching `mailboxes` /
`hooks`). Singular `aimx agent` works as a clap alias, and the
historical hyphenated `aimx agent-setup` flat verb is preserved as
a hidden legacy alias so existing scripts and `install.sh`
invocations keep working. `aimx agents list` is a thin alias of
`aimx agents setup --list`.
- TUI polish: status suffix is now `(AIMX MCP wired)` for
consistency with the "wire" verb used elsewhere; agent name uses
`term::dim` and the status suffix uses a new one-shade-deeper
`term::very_dim` helper for two-shade visual hierarchy. After the
operator presses Enter on the picker the TUI renders a
confirmation screen listing each task with the right verb
("Install AIMX MCP for X" / "Re-install AIMX MCP for Y") and
reads `Confirm? [Y/n]` before any files are written. On `n` the
picker re-enters with previous selections preserved.
- Bug fix: `dest_contains_aimx_entry` only scanned top-level files
in the destination directory. Claude Code's plugin layout puts
everything under `.claude-plugin/` and `skills/` subdirectories,
so a fully-wired install always classified as `InstalledNotWired`.
Special-case `claude-code` to check for `.claude-plugin/plugin.json`
filename presence (mirroring goose's existing `aimx.yaml` case).
- New `aimx agents remove <agent>` command. Removes plugin files
under `$HOME`, drops the `invoke-<agent>-<username>` template over
UDS, and prints an agent-specific cleanup hint pointing at any
external command the operator still needs to run (for example
`claude mcp remove aimx`). Refuses root with the same
`--dangerously-allow-root` escape hatch as `agents setup`.
Drops `<domain>` from the setup positional in --help (kept hidden as
a back-compat input). Updates `book/setup.md`,
`book/installation.md`, and `book/agent-integration.md` for the new
command names and the renamed status label.
uzyn
left a comment
There was a problem hiding this comment.
Re-review: PR #153 cycle 8 — agent CLI restructure + bundled fixes
Cycle-8 scope verification
The five bundled changes are implemented well. Quick rundown of each, before getting to the blocker:
aimx setuproot error rewording —src/setup.rs:2215-2217now names both paths explicitly:sudo aimx setupfor the server,aimx agents setupfor the agent path. Old<domain>reference is gone everywhere I checked. Good.- CLI restructure —
src/cli.rs:163-216defines the newAgents(AgentsCommand)(withagentclap alias) plus hidden legacyAgentSetupandAgentCleanupcommands.src/main.rs:97-160dispatches all three.agents listis a thin alias ofagents setup --list(verified atmain.rs:149-159). The four invocation forms (agent-setup,agent setup,agents setup,agents setup claude-code --list) all parse and dispatch correctly — confirmed bytests/integration.rs:5557-5612. - TUI polish —
src/agent_tui.rs:366-394definesrender_confirmationwith the right Install / Re-install verb selection.src/agent_tui.rs:138-166runs the picker in aloopand re-enters with selections preserved onn. The(AIMX MCP wired)and(not detected)suffixes are wrapped interm::very_dim(agent_tui.rs:461,467); the agent display name usesterm::dimfor non-selectable rows (agent_tui.rs:456). - Claude Code wired-detection bug fix —
src/agent_setup.rs:935-948special-casesclaude-codeto check.claude-plugin/plugin.json, mirroring the gooseaimx.yamlcase. Pinned bydetect_install_state_installed_wired_for_claude_code_plugin_layoutatagent_setup.rs:2228. The fixture update ondetect_install_state_installed_wired_when_dest_exists(agent_setup.rs:2196-2220) is correct, not a regression cover-up — the old test wrote a top-levelplugin.jsonthat no real install ever produces. aimx agents remove—src/agent_remove.rsrefuses root with--dangerously-allow-rootescape hatch (agent_remove.rs:55-57), errors clearly on unknown agent (agent_remove.rs:59-64), unconditionally removes plugin files viaagent_cleanup::run_with_envin--full --yesmode (agent_remove.rs:95-109), unregisters the template via the same path, and prints an agent-specific cleanup hint viaremoval_hint(agent_remove.rs:131-153). The delegation toagent_cleanup::run_with_envis sound —MaskRootEnv(agent_remove.rs:158-199) cleanly skips the inner per-user refusal when--dangerously-allow-rootis set without doubly applying the root override.removal_hint_covers_every_registered_agentenforces no orphan agents.
term::very_dim (src/term.rs:94-105) uses truecolor(88, 88, 88) and degrades to bright black on non-truecolor terminals; tested by very_dim_is_dimmer_than_dim_when_color_forced_on and very_dim_degrades_to_named_color_without_truecolor. Fallback chain is sound.
cargo fmt --check, cargo clippy --all-targets -- -D warnings, and the local test suite all pass: 1393 unit + 101 integration + 45 install.sh tests, matching the implementer's reported counts.
No prior-cycle regressions — install.sh's [ -t 0 ] gate, print_step_complete, current_exe() runuser fix, parse_yn, <user> placeholder fallback, and sh -lc PATH fallback are all still in place.
Blocker
B1. CI docs-build job fails — book/ references the now-hidden legacy verbs
The docs-build job (scripts/check-docs.sh) fails on this PR. Reproduced locally:
check-docs: book/ has command-line `aimx <verb>` references where <verb> is not in `aimx --help`:
- aimx agent
- aimx agent-cleanup
- aimx agent-setup
Because AgentSetup, AgentCleanup, and the agent alias are all marked hide = true in cli.rs, they no longer appear in aimx --help. The lint script's ALLOWED_NON_VERBS (in scripts/check-docs.sh) currently allowlists only hook and mailbox aliases, not the agent legacy aliases, so every aimx agent-setup ..., aimx agent-cleanup ..., aimx agent setup ... reference inside book/ code blocks is flagged. The book has these references in many places — book/getting-started.md:70,72,102,105, book/setup.md:84, book/agent-integration.md (multiple), book/cli.md:205,220, book/mcp.md:23,183,185, book/faq.md:162, book/configuration.md:131, book/security.md:221, book/release-notes.md:24,25, book/README.md:52.
This is a Blocker because:
- CI is red on the PR. Mergeable state aside, the project's own merge-quality bar is
core-tests+docs-buildgreen. - The fix is mechanical: either add
agent,agent-setup,agent-cleanuptoALLOWED_NON_VERBS(since the implementer has explicitly committed to keeping these as hidden back-compat aliases) or rewrite the book references to useaimx agents setup/aimx agents remove. The PR's own commit message says these legacy aliases are retained "so existing scripts andinstall.shinvocations keep working" — adding them to the allowlist matches that intent, and is one-line.
The lint exists exactly to catch doc drift like this; it's working as intended. The PR needs to make a deliberate call: keep the hidden legacy verbs documented (extend the allowlist) or migrate the book over to the new names. Either is fine, but CI must go green before merge.
Summary
- Verdict: CHANGES_REQUESTED (posted as
--commentbecause GitHub blocks self---request-changes; treat as request-changes) - Resolved cycle-8 scope items: all five bundled changes verified end-to-end.
- Blockers: B1 —
docs-buildCI failure on the legacyaimx agent-setup/aimx agent-cleanup/aimx agentreferences inbook/. - Non-blockers: none.
- Nice-to-haves: none.
CI status: core-tests pass (3m10s), verifier-tests pass, build pass, Analyze (actions) and Analyze (javascript-typescript) pass; docs-build fail; Analyze (rust) pending.
Once the book references / allowlist are reconciled and docs-build goes green, this is ready to merge. The implementation work itself is solid — the cycle-8 scope is completely fulfilled and the testing is appropriately tight (the new detect_install_state_installed_wired_for_claude_code_plugin_layout regression test alone earns a lot of trust). The one missing piece is the doc-lint update.
…nonical form Extend `scripts/check-docs.sh`'s `ALLOWED_NON_VERBS` to allow `agent`, `agent-setup`, and `agent-cleanup` — the cycle-8 restructure marked these as `hide = true` clap aliases so they no longer appear in `aimx --help`, but the lint flagged every reference in `book/`. Audit the book and switch primary documentation to the canonical `aimx agents setup` / `aimx agents remove` / `aimx agents list` forms, keeping a single back-compat note per page where the legacy hyphenated form is intentionally surfaced. Headers, troubleshooting sections, and the cli.md reference now mirror the actual CLI shape post-cycle-8.
uzyn
left a comment
There was a problem hiding this comment.
Re-review (cycle 9): PR #153 — docs-build blocker fix (commit 64cb8ce)
Cycle-9 scope verification
The cycle-8 blocker (docs-build CI failing on hidden legacy aimx agent / aimx agent-setup / aimx agent-cleanup references) is resolved. The implementer took the hybrid path I sketched — extending ALLOWED_NON_VERBS AND rewriting the book's primary forms — which is the right call here:
scripts/check-docs.sh(lines 69–72): addsagent,agent-setup,agent-cleanuptoALLOWED_NON_VERBSwith a comment pointing at this PR. The diff is narrow — only the allowlist array changed, the matching logic and the rest of the lint are untouched, so there's no risk of new false negatives. I re-read the whole file and confirmed that.- Book audit:
book/agent-integration.md,book/cli.md,book/configuration.md,book/faq.md,book/getting-started.md,book/hook-recipes.md,book/hooks.md,book/mcp.md,book/README.md,book/release-notes.md,book/security.mdall switch primary documentation fromaimx agent-setup→aimx agents setup(andagent-cleanup→agents remove). The cli.md reference grew a### aimx agents listsection and rewrote the### aimx agents remove <agent>reference to match the new shape.
bash scripts/check-docs.sh passes locally. cargo fmt --check, cargo clippy --all-targets -- -D warnings, cargo test, and sh tests/install_sh.sh all green: 1393 unit + 101 integration + 1 uds_authz + 45 install.sh tests, matching the cycle-8 counts (no regressions). All earlier-cycle work (CLI restructure, TUI confirmation screen, Claude Code .claude-plugin/plugin.json detection, agent_remove module, aimx setup root error rewording) is still intact — I spot-checked the relevant lines.
CI is fully green: core-tests, verifier-tests, docs-build, build, Analyze (actions), Analyze (javascript-typescript), Analyze (rust), CodeQL all pass; deploy skipping (PR-only); no pending checks.
Non-blocker
NB1. book/cli.md:235 documents a -y, --yes flag on aimx agents remove that doesn't exist
The new "### aimx agents remove <agent>" section in book/cli.md lists this row:
| `-y`, `--yes` | Skip the interactive confirmation prompt. |
But AgentsCommand::Remove in src/cli.rs:430-437 only declares agent: String and dangerously_allow_root: bool — there is no --yes / -y flag. The actual binary rejects it:
$ aimx agents remove --yes claude-code
error: unexpected argument '--yes' found
The flag was carried over from the legacy agent-cleanup --yes (which exists on AgentCleanup at cli.rs:213-215 and gates the --full interactive prompt). The new aimx agents remove deliberately runs in unconditional full: true, yes: true mode (src/agent_remove.rs:96-99) — there's no prompt to skip, so the flag is both undocumented-on-the-binary and conceptually moot.
Two fixes are equivalent:
- Drop the row from
book/cli.md— match the implementation. The "Refuses to run as root" + "removes plugin files unconditionally" model is fine; users don't need an opt-out. - Add the flag to
AgentsCommand::Remove— make the doc match. But there's nothing to opt out of, so this would be a no-op flag. Option (1) is cleaner.
Why this is a non-blocker, not a blocker: the wrong doc row doesn't crash anything, doesn't cause data loss, and doesn't bypass any security. A user copying the flag from the doc gets a clean clap-level error and can resolve it by re-reading the help. But it's still a doc accuracy bug introduced by this commit and the PR's stated intent is to make book/ accurate — keeping it merits a one-line fix.
NB2 (informational, not requested as a fix). Binary error message still embeds the legacy verb
In src/agent_setup.rs:1403-1406 and src/agent_setup.rs:1444-1447, the aimx serve is not running daemon-down error message is built with the literal token aimx agent-setup {} .... The book troubleshooting heading at book/agent-integration.md:449 correctly mirrors that error string verbatim, so the heading is accurate. But the binary itself is now telling users to run a hidden legacy alias instead of the canonical form. The PR's commit message explicitly says it "switch[ed] primary documentation to the canonical aimx agents setup" — that's true for the narrative copy; the binary's own user-visible error string was not updated.
Not in scope for this commit and the lint already passes, so I'm flagging it as informational only — keep or fix at your discretion. If you do touch it, the troubleshooting heading needs to track whatever the binary emits.
Summary
- Verdict: CHANGES_REQUESTED (posted as
--commentbecause GitHub blocks self---request-changes; treat as request-changes) — one non-blocker doc accuracy fix, no marker emitted. - Resolved: B1 (docs-build CI failure) — fully fixed.
- Still unresolved: none from prior cycles.
- Blockers: none.
- Non-blockers: NB1 (
book/cli.mddocuments nonexistent-y, --yesflag onaimx agents remove). - Nice-to-haves: NB2 (binary's daemon-down error embeds the legacy
aimx agent-setupverb; the troubleshooting heading mirrors it correctly, so this is purely cosmetic — flagging only because the cycle-9 scope was "make the book canonical").
Once NB1 is reconciled (drop the row, or add the flag), this is ready to merge. The cycle-9 implementation work is otherwise correct, narrow, and well-justified — the allowlist comment cites this PR explicitly, which makes future audits trivial.
uzyn
left a comment
There was a problem hiding this comment.
Re-review verdict: Ready to merge
Commit 291c3af ("Drop nonexistent --yes flag from aimx agents remove docs") is a 1-line deletion of a single row in the aimx agents remove flag table in book/cli.md. It cleanly addresses the lone non-blocker (NB1) from the prior pass.
Resolved
- NB1 (
book/cli.md:235): the documented-y, --yesrow foraimx agents removeis gone. The remaining flags for that command —<agent>positional and--dangerously-allow-root— now exactly match theAgentsCommand::Removedefinition insrc/cli.rs(lines 430–437).
Verification (all green)
book/cli.mdaimx agents removetable is now consistent withsrc/cli.rsAgentsCommand::Remove.- Other
-y, --yesreferences inbook/cli.md(lines 46 / 129 / 193) belong touninstall,mailboxes delete, andhooks delete, each of which actually accepts--yes. None are spurious. - The
aimx agent-cleanup ... --fullreferences inbook/cli.md:237andbook/agent-integration.md:102correctly describe the hidden legacy alias (AgentCleanup { full, yes }insrc/cli.rs:205), so they are not drift. bash scripts/check-docs.sh→ okcargo fmt --check→ cleancargo clippy --all-targets -- -D warnings→ cleancargo test→ 1393 unit + 101 integration + 1 uds_authz pass, 0 fail (3 / 5 / 1 ignored — pre-existing)sh tests/install_sh.sh→ 45/45 passgh pr checks 153→ CodeQL pass, Analyze (actions / javascript-typescript / rust) pass, build pass, core-tests pass, docs-build pass, verifier-tests pass;deployskipping is expected for non-main.mergeable: MERGEABLE,state: OPEN.
Still unresolved
None.
New findings
None. The change is a pure doc-table deletion with zero logic surface; no regression risk introduced.
NB2 from the prior pass (legacy aimx agent-setup nudge in src/agent_setup.rs error text) was explicitly informational and out of scope for this cycle, per the re-review brief. Not flagged here.
Recommended merge commit message
feat: end-to-end install.sh with welcome banner, sudo escalation, setup + agent-setup orchestration
Make `curl -fsSL https://aimx.email/install.sh | sh` drive the full
onboarding flow end-to-end. The binary now owns the welcome banner, the
six-step checklist (Preflight -> Domain & DNS -> TLS -> Trust ->
Install -> MCP), the agent-setup drop-through, and the closing message;
install.sh is a thin downloader that fails fast on missing sudo,
backs up any pre-existing /etc/aimx/config.toml, installs the binary,
and execs `sudo aimx setup`.
Restructures agent management into `aimx agents setup | remove | list`
(plural noun + verbs, matching `mailboxes` / `hooks`). Singular
`aimx agent` and the historical hyphenated `aimx agent-setup` /
`aimx agent-cleanup` forms are preserved as hidden legacy aliases so
existing scripts and `install.sh` invocations keep working. Adds new
`aimx agents remove <agent>` that removes plugin files under `$HOME`,
drops the `invoke-<agent>-<username>` template over UDS, and prints an
agent-specific cleanup hint.
Other fixes bundled in:
- pin HOME for agent-setup drop-through (sudo -H -u $invoker env HOME=...)
- make sudo optional when already root (resolve_sudo_prefix)
- use `[ -t 0 ]` instead of /dev/tty existence to gate stdin redirect
(fixes keyboard-hang on distros where sudo defaults to use_pty)
- fix `runuser` exec path via std::env::current_exe()
- fix Claude Code wired-detection (special-case `.claude-plugin/plugin.json`)
- fall back to `sh -lc 'command -v X'` for agent CLI path discovery
- confirm-after-prompt UX for domain and trusted-senders
- per-step checklist redraw on every section transition
Test surface: 1393 unit + 101 integration tests pass; 45/45 install.sh
shell tests pass; cargo fmt clean; clippy clean (-D warnings); docs-build
green.
…nonical aimx agents setup Renames source files src/agent_*.rs → src/agents_*.rs (preserved via git mv), drops the hidden Command::AgentSetup / Command::AgentCleanup clap variants and the singular `agent` alias on the Agents subcommand, and migrates every literal occurrence (source code, tests, embedded LLM-facing primers shipped inside agent plugins, README, the book, hook-templates/defaults.toml, install.sh) to `aimx agents setup`. Internal renames: drop_through_to_agent_setup → drop_through_to_agents_setup, build_agent_setup_argv → build_agents_setup_argv (the constructed argv now pushes "agents" then "setup" instead of one "agent-setup" token), AgentSetupOutcome → AgentsSetupOutcome. Tightens scripts/check-docs.sh by dropping the agent-setup / agent / agent-cleanup back-compat allowlist entries so docs that reintroduce the legacy form fail CI. Replaces the legacy-alias-keeps-working integration test with two negative-confirmation tests asserting that `aimx agent-setup` and `aimx agent setup` now error with "unrecognized subcommand".
uzyn
left a comment
There was a problem hiding this comment.
Cycle 10 Re-review — APPROVED
All scope items from cycle 10 land cleanly. No blockers, no non-blockers, no nice-to-haves.
Cycle-10 scope verified
- Hard back-compat break — confirmed by running the binary locally:
aimx agent-setup --list→error: unrecognized subcommand 'agent-setup'aimx agent setup --list→error: unrecognized subcommand 'agent'aimx agent-cleanup→error: unrecognized subcommand 'agent-cleanup'aimx agents setup --list→ still works (canonical path)
- Renames detected as renames by git —
git diff --find-renames 61b74dd^..61b74dd --name-status | grep '^R'returns:R089 src/agent_cleanup.rs → src/agents_cleanup.rsR093 src/agent_remove.rs → src/agents_remove.rsR098 src/agent_setup.rs → src/agents_setup.rsR098 src/agent_tui.rs → src/agents_tui.rs
All four detected as renames with high similarity. Blame is preserved.
- Zero straggler legacy references. All four greps from the prompt return empty (the only matches for the legacy literals are inside
tests/integration.rs:5590,5606doc comments and assertion fixtures that pin the negative-confirmation behavior). scripts/check-docs.shallowlist tightened.agent,agent-setup, andagent-cleanupare dropped fromALLOWED_NON_VERBS; onlyhookandmailboxremain.bash scripts/check-docs.shpasses.- Embedded primers (
agents/common/aimx-primer.md,agents/common/references/{hooks,mcp-tools}.md). Allaimx agents …references use the canonical plural form. The unrelatedmailbox: "agent"mentions are mailbox-name examples, not CLI references — no ambiguity for downstream LLMs. build_agents_setup_argvargv shape correct.src/setup.rs:2540-2554pushes"agents"then"setup"as separate args. Test fixtures atsrc/setup.rs:5905-5927assertargv.len() == 3without data-dir andargv.len() == 5with--data-dir, with indices 1 ="agents", 2 ="setup".tests/e2e_agent_flow.rsmigrated and passing.cargo test --test e2e_agent_flow -- --ignoredruns and passes 1/1. Argv shape uses["agents", "setup"](line 362) and["agents", "remove"](line 595). The bareagents remove claude-codecorrectly relies ondo_remove's implicitfull: true, yes: true(verified atsrc/agents_remove.rs:97-98).agents_cleanup::runcorrectly removed. Module doc updated to "Internal cleanup core". Onlypub(crate) fn run_with_envremains, called byagents_remove::do_remove. NoCommand::AgentSetup/Command::AgentCleanupclap variants remain incli.rs/main.rs. TheAgents(AgentsCommand)variant carries no singularagentalias attribute.- Negative-confirmation tests are precise.
tests/integration.rs:5596-5604and5611-5619assert bothfailure()exit ANDpredicate::str::contains("unrecognized subcommand")— correctly catches the clap-error shape, not just any non-zero exit. - Prior-cycle work intact. Spot-checked:
parse_yn(setup.rs:1726),print_step_complete(setup.rs:2139),current_exe()use (setup.rs:419),term::very_dim(term.rs:94),sh -lcfallback (agents_setup.rs:526). All in place.
CI status (all green)
| Check | Status |
|---|---|
| CodeQL | pass |
| Analyze (actions) | pass |
| Analyze (javascript-typescript) | pass |
| Analyze (rust) | pass |
| build | pass |
| core-tests | pass |
| docs-build | pass |
| verifier-tests | pass |
Local verification
cargo fmt --check— cleancargo clippy --all-targets -- -D warnings(root + verifier) — cleancargo test— 101 integration tests pass; 1393 binary unit tests pass; 5 + 1 ignored (pre-existing)cargo test --test e2e_agent_flow -- --ignored— 1/1 passbash scripts/check-docs.sh—ok (no unknown 'aimx <verb>' command lines in book/)sh tests/install_sh.sh— 45/45 passgh pr view 153 --json mergeable,mergeStateStatus—MERGEABLE/CLEAN
Summary
The cycle 10 migration is complete and surgical. The hard back-compat break is correctly enforced at clap parse time, all four file renames are tracked as renames (so blame is intact), every operator-facing string in shipped artifacts (source, tests, embedded LLM primers, README, book, hook templates, install.sh) has been migrated to the canonical aimx agents setup form, the docs lint allowlist is tightened so future regressions are caught, and the e2e test exercises the new argv shape end-to-end.
Verdict: Ready to merge.
Recommended merge commit message
feat: end-to-end install.sh with welcome banner, sudo escalation, setup + agents setup orchestration
- Move setup wizard UX into the binary; thin install.sh to a downloader
- Six-step interactive checklist with per-step redraw and Unicode glyphs
- TUI-driven `aimx agents setup` with per-agent wired detection
- New `aimx agents remove <agent>` with template + plugin cleanup over UDS
- Drop `aimx agent-setup` / `aimx agent-cleanup` / `aimx agent <verb>` legacy aliases
- Rename source files src/agent_*.rs → src/agents_*.rs (blame preserved via git mv)
- Tighten scripts/check-docs.sh allowlist; CI catches future doc drift
- `current_exe()` runuser fix; `sh -lc 'command -v X'` fallback for per-user agent CLIs
- Claude Code wired-detection special case (.claude-plugin/plugin.json)
Summary
Today
install.shis a narrow, POSIX-sh download-and-place script. It fetches the release tarball, dropsaimxinto/usr/local/bin/, handles the upgrade-swap path, then exits with a "→ next: sudo aimx setup" hint. The operator has to run two more commands by hand (sudo aimx setup <domain>, then re-login as a user and runaimx agent-setup) before anything works.This PR makes
curl -fsSL https://aimx.email/install.sh | shdrive the whole thing in one go: a branded welcome banner, a six-item progress list, transparentsudoescalation, ordered execution, and a closing "ask your LLM to set up a mailbox" message.After the cycle-3 update, the binary owns the welcome banner, the six-step checklist, the agent-setup drop-through, and the closing message.
install.shbecomes a thin downloader: it fails fast on missing sudo (before any network call), downloads the tarball, installs the binary, andexecssudo aimx setup. The wizard inside the binary does the rest in a single, continuous flow with the right section order.Cycle 3 update
The reviewer-approved scope from the prior pass had three real gaps:
aimx setupdid not match the spec. Before: Trust ran before TLS, DNS verification ran at the end (after DKIM + catchall), and TLS / Trust had no labeled section header at all. Now: each section reads literallyPreflight checks on port 25→Set up domain and DNS→Set up TLS certificate→Set up trust policy→Install AIMX→Set up MCP for agent(s), in that order.aimx setupalready drops through toaimx agent-setupviarunuser, theninstall.shcalledrun_agent_setup_as_invokera second time — the operator saw the TUI twice. The shell-side function is gone now; the binary's drop-through is the single source of truth.install.shprinted the welcome banner + 6-line checklist + final banner, but had no way to flip individual boxes ☐ → ☑/☒ as the binary's sections completed. Now the binary owns the banner + ticking, andinstall.shonly prints a thin two-line "AIMX installer" line.What changed in the cycle-3 commit:
src/setup.rs: addedChecklist+SETUP_STEPS+print_welcome_banner/print_final_banner/print_step_list. Restructuredrun_setupbody to follow spec order with section headers viaterm::header. Each section ticks ☑ on success, ☒ on skip (with a "Step N skipped" line), and the final banner reprints the 1–5 ☑ + step-6 ◐ before the agent-setup TUI takes over. On re-entry (cert + DKIM + service all present), steps 3, 4, 5 mark ☒ skipped.src/setup.rs: switcheddrop_through_to_agent_setupfromCommandExt::exectoCommand::status(). Control returns toaimx setupafter agent-setup exits, so the closing message ("AIMX has been set up successfully…") fires AFTER step 6 — exactly as the spec mandates. Step 6 ticks ☑ on zero exit, ☒ on non-zero or$SUDO_USERunset (warning, not fatal).src/setup.rs: replaced the old single-lineaimx is running for <domain>.banner with the spec's closing message, including the@<DOMAIN>substitution and theclaude -p "..."example prompt.src/term.rs: addedStepState+step_glyph(☐/◐/☑/☒/✗ on TTY,[ ]/[~]/[x]/[-]/[!]non-TTY) — colors: pending=dim, running=yellow, done=green, skipped=cyan, error=red.install.sh: deletedprint_welcome_banner/print_final_banner/print_step_list/set_step/_step_glyph/_supports_unicode/STEP{N}_TITLE,STATE/run_aimx_setup/run_agent_setup_as_invoker/extract_domain/print_closing_message. Replaced the welcome banner with a thin two-line "AIMX installer" line. Movedensure_sudoearlier inmain()so a non-root invoker without sudo bails BEFORE the GitHub tag-resolution roundtrip — dry-run is honored first so unprivileged auditors can still audit. The fresh-install tail is now a singleexec ${SUDO} aimx setup </dev/tty(with non-TTY fallback). Backup-existing-config still runs immediately before the exec.tests/install_sh.sh: dropped tests for removed helpers; added a fail-fast test that confirmsensure_sudoexits with a sudo-named error BEFORE any GitHub network call, by stubbingcurl/wgetto fail loudly and asserting the marker file was never written. Updated the dry-run smoke to expect the thin banner (no checklist).SETUP_STEPSwording,Checklistmutations,term::step_glyph(TTY + non-TTY), the welcome / final banner content and structure, the section-order invariant via source-grep, the re-entrant skipped-step contract, and theAgentSetupOutcomedispatch.book/installation.mdandbook/setup.md: rewritten "What the installer does" / "First-time setup flow" sections to reflect the new flow (thin shell handoff into the binary's wizard, full 6-step checklist owned by the binary, closing message printed by the binary).Verification
cargo fmt -- --check→ cleancargo clippy --all-targets -- -D warnings→ cleancargo test→ 1355 binary unit tests + 95 integration tests pass (3 ignored, 5 ignored — pre-existing)sh -n install.shanddash -n install.sh→ cleansh tests/install_sh.sh→ 40/40 pass (was 34, +6 from new fail-fast + thin-banner tests)AIMX_DRY_RUN=1 sh install.sh→ prints the thin install banner, target, tarball, install path, "would exec aimx setup", no FS changesOut of scope
aimx setupinto discrete subcommands (single-command flow with internal sections is enough)Test plan
cargo fmt --checkcargo clippy --all-targets -- -D warningscargo test(1355 + 95 pass)sh -n install.sh/dash -n install.shsh tests/install_sh.sh(40/40)AIMX_DRY_RUN=1 sh install.shsmokeaimx setupwelcome banner with checklist → 6 sections in spec order → checklist ticks 1→2→3→4→5 → final banner with all five ☑ → drop into agent-setup TUI as$SUDO_USER→ tick 6 → closing message with@<domain>🤖 Generated with Claude Code
Cycle 8 update
Five more fixes around the agent-wiring surface:
aimx setuproot error rewording. The old text pointedoperators at
sudo aimx setup <domain>, which both pushed the(now-hidden) positional and missed the common case where someone
with aimx already running just wanted to wire an agent. The new
error names both paths explicitly:
sudo aimx setupfor theserver,
aimx agents setup(no root needed) for the agent.aimx agents setup|remove|list. Plural nounmailboxes/hooks. Singularaimx agentworks as a clap alias; the historical
aimx agent-setupflatverb is preserved as a hidden legacy alias so existing scripts
and
install.shinvocations keep working.aimx agents listisa thin alias of
aimx agents setup --list. The<domain>positional on
aimx setupis now hidden in--help(stillparses for back-compat).
(AIMX MCP wired)forconsistency with the "wire" verb used elsewhere; agent name uses
term::dimand the status suffix uses a new one-shade-deeperterm::very_dimhelper for two-shade visual hierarchy. After theoperator presses Enter on the picker, the TUI renders a
confirmation screen listing each task with the right verb
(
Install AIMX MCP for X/Re-install AIMX MCP for Y) andreads
Confirm? [Y/n]before any files are written. Onnthepicker re-enters with previous selections preserved.
dest_contains_aimx_entryonly scanned top-level files in the destination directory. Claude
Code's plugin layout puts everything under
.claude-plugin/andskills/subdirectories — top level has zero files in a realinstall — so a fully-wired Claude Code install always classified
as
InstalledNotWired. Now special-cased to check.claude-plugin/plugin.jsonfilename presence, mirroring Goose'sexisting
aimx.yamlcase.aimx agents remove <agent>. Removes plugin files under$HOME, drops theinvoke-<agent>-<username>template over UDS,and prints an agent-specific cleanup hint per agent (for example
Runclaude mcp remove aimxto also unregister the MCP server from Claude Code.). Refuses root with the same--dangerously-allow-rootescape hatch asagents setup.Daemon-down with the plugin files still removed exits
2soscripts can detect the partial cleanup.
Cycle-8 verification
cargo fmt -- --check→ cleancargo clippy --all-targets -- -D warnings→ cleancargo test→ 1393 binary unit tests + 101 integration tests + 1 uds_authz test pass (3 + 5 + 1 ignored — pre-existing)sh -n install.sh/dash -n install.sh→ cleansh tests/install_sh.sh→ 45/45 passAIMX_DRY_RUN=1 sh install.sh→ prints thin install banner, target, tarball, install path, "would exec aimx setup", no FS changesCycle-10 update
Drops every legacy alias and renames the source files to match the canonical
aimx agents …shape:aimx agent-setup/aimx agent-cleanup/aimx agent <verb>. All three legacy paths now error with clap's standard "unrecognized subcommand" message. Tests pin the new failure shape.git mv(blame preserved):src/agent_setup.rs→agents_setup.rs,src/agent_tui.rs→agents_tui.rs,src/agent_remove.rs→agents_remove.rs,src/agent_cleanup.rs→agents_cleanup.rs. Internal symbols renamed in lockstep (AgentSetupOutcome→AgentsSetupOutcome,drop_through_to_agent_setup→drop_through_to_agents_setup,build_agent_setup_argv→build_agents_setup_argv).build_agents_setup_argvargv shape. Now pushes"agents"then"setup"instead of one"agent-setup"token, so therunuserdrop-through insideaimx setupre-execs the canonical verb.aimx agent-setupliteral in source code, tests, embedded LLM primers (agents/common/aimx-primer.mdandagents/common/references/*.md, baked into plugins viainclude_dir!), book pages, the top-level README,CLAUDE.md,hook-templates/defaults.toml, andinstall.shnow readsaimx agents setup. Where the message points at the registry dump (e.g. unknown-agent error), the canonical form isaimx agents list.<agent>positional. The TUI is the documented onboarding path; the positional remains supported in clap and is still shown in scripted /--no-interactivesnippets and per-agent install tables.scripts/check-docs.shallowlist tightened.agent,agent-setup,agent-cleanupare dropped fromALLOWED_NON_VERBS, so any future doc that reintroduces the legacy form fails the docs lint.Verification re-run on this commit:
cargo fmt,cargo clippy --all-targets -- -D warnings,cargo test(1393 binary unit + 101 integration + 1 uds_authz all green; same 3 + 5 + 1 ignored as before),bash scripts/check-docs.sh,sh -n install.sh,dash -n install.sh,sh tests/install_sh.sh(45/45). Final grep over*.rs/*.md/*.sh/*.tomlforaimx agent-setup,aimx agent-cleanup, andaimx agent setupyields zero hits outside the negative-confirmation integration tests that intentionally pin the legacy spellings.