feat(doctor): add swamp doctor audit preflight diagnostic (swamp-club#156)#1227
feat(doctor): add swamp doctor audit preflight diagnostic (swamp-club#156)#1227
Conversation
…#156) Adds a new `swamp doctor` parent command with `doctor audit [--tool <name>]` as its first subcommand. Runs five independent preflight checks against the AI-tool audit integration configured in the repo and reports per-check pass/fail/skip with actionable hints in log and JSON modes. Exits non-zero on any check fail — safe to gate in CI. Checks: - binary-on-path: AI tool's own binary is resolvable on PATH - swamp-binary-on-path: swamp is on PATH (all four tools); for Kiro, also verifies the absolute swamp path baked into the hook file still resolves - agent-config-loadable: per-tool config parses and is well-formed (e.g. Kiro's `tools: ["*"]` regression) - default-agent-set: Kiro workspace default-agent points at `swamp` - recording-smoke-test: end-to-end — synthetic postToolUse payload round-trips through `swamp audit record --from-hook` into today's JSONL The smoke-test writes a real audit row tagged with the reserved command prefix `echo swamp-doctor-smoke-test`; the audit timeline filters that prefix out of the default view (`--include-diagnostic` opts back in). Also: - Extracts `todaysAuditFilePath` / `auditFilePathForTimestamp` / related helpers into `src/domain/audit/audit_path.ts`, consumed by both the JSONL writer and the doctor smoke-test reader — single source of truth for the `commands-YYYY-MM-DD.jsonl` filename format (previously duplicated twice inside jsonl_audit_repository.ts). - Introduces `parseAiToolOrThrow` in `src/cli/ai_tool_parser.ts` for centralised `--tool` validation. - Adds `design/audit.md` + `design/audit-doctor.md` and updates the swamp-repo skill. **Name override:** the issue body requests "please NOT `doctor`"; the team picked `doctor` anyway — the command namespace is meant to grow (future `doctor datastore`, `doctor vault`, etc.), and the name fits that shape. **Scope note:** issue 156 cites three concrete Kiro bugs as motivation (wildcard tools, missing default-agent, hard-coded `execute_bash` normalizer). The `execute_bash` one is **already fixed** upstream of this PR — `src/domain/audit/hook_input.ts:146-150` has `KIRO_SHELL_TOOL_NAMES = new Set(["execute_bash", "shell", "execute_cmd"])`. This PR adds a durable preflight for the whole class of upstream-drift bugs, not a fix for any specific one. The other two bugs are caught by the new checks but not fixed by this PR. **Load-bearing manual verification (ADV-2):** temporarily removing `"shell"` from `KIRO_SHELL_TOOL_NAMES` and recompiling produces a FAIL on `recording-smoke-test` against a freshly-init'd kiro repo, confirming the diagnostic does catch normalizer drift. Reverted before this commit. **UAT coverage:** filed systeminit/swamp-uat#160 (CLI scenarios) and systeminit/swamp-uat#161 (adversarial scenarios). swamp-club has issues disabled — docs gap (new `content/manual/reference/doctor.md` and "verify the integration" section in the four `use-swamp-with-*.md` how-tos) needs to be routed through the team's docs process. 4773 tests passing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Extracts tool-resolution logic into `resolveTargetTool` and covers: - flag tool wins over marker - marker fallback when no flag - NoToolConfiguredError when both absent - --tool validation rejects garbage - audit-skip tools (codex/copilot/none) accepted as overrides - command is registered as subcommand of doctorCommand - --tool and --repo-dir options present Closes the CLI-level integration gap flagged in the feature PR. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
CLI UX Review
Blocking
-
NoToolConfiguredErrorshows a stack trace instead of a clean error message (src/domain/audit/doctor/check.ts)NoToolConfiguredErrorextends plainError, notUserError. When a user runsswamp doctor auditin a repo with no tool configured (the most likely first-run scenario), the error propagates tomain.ts→renderError, which branches oninstanceof UserError. Since it's not aUserError, it falls into theelsebranch that logs the full error object including a stack trace:Error: No AI tool configured in .swamp.yaml and no --tool flag provided... at new NoToolConfiguredError (file:///.../check.ts:12:5) at resolveTargetTool (file:///.../doctor_audit.ts:54:11) ...The error message itself is excellent and actionable — the stack trace contradicts it. The
UserErrorclass exists precisely for this: user mistakes that should render cleanly without a stack trace.Fix: Change
class NoToolConfiguredError extends Error→class NoToolConfiguredError extends UserErrorinsrc/domain/audit/doctor/check.ts(and update the import).
Suggestions
-
check-startedevent doubles every check name in log output (src/presentation/renderers/audit_doctor.ts)The log renderer prints
… binary-on-pathoncheck-started, then immediately✓ binary-on-path <message>oncheck-completed. BecausewriteOutputhas no cursor movement, users see two lines per check. Since checks run fast (no long waits), the in-progress…line adds noise without helping readability. Dropping thecheck-startedhandler entirely (or making it a no-op like the JSON renderer) would produce the cleaner output users expect from a doctor-style tool. -
--toolflag inconsistency withrepo init(src/cli/commands/doctor_audit.ts)swamp repo initandswamp repo upgradeexpose-t, --tool(short + long).doctor auditexposes only--tool. Minor, sinceswamp audit recordalso omits-t, so this matches the audit-subcommand convention. Just flagging for awareness when thedoctornamespace grows. -
warnexit-0 may surprise CI users who pass--tool codex(src/cli/commands/doctor_audit.ts)overallStatus: "warn"(all checks skipped) exits 0. A CI gate likeswamp doctor audit --tool codex && echo okwill pass silently. TheOVERALL: WARNis visible in log mode andoverallStatusis in the JSON — but$?can't distinguish it. Documenting this behavior in the command description or adding an--warn-as-failflag would help. Low priority sincecodex/copilot/noneare explicitly no-hook tools.
Verdict
NEEDS CHANGES — one blocking issue: NoToolConfiguredError must extend UserError to avoid showing a stack trace on the most common first-run error.
- NoToolConfiguredError now extends UserError, so the first-run "no tool configured" path renders cleanly via renderError instead of dumping a stack trace. (CLI UX Review blocking finding.) - Drop check-started log handler — checks run fast enough that the in-progress line is visual noise. Users see one line per check (completion) instead of two. (CLI UX Review suggestion.) - Trim swamp-repo SKILL.md description from 1183 to 885 chars so tessl's 1024-char validation passes. Kept all load-bearing triggers (doctor audit, verify audit, preflight, audit log empty); dropped redundant aliases. (Skill Review blocking failure.) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
CLI UX Review
Blocking
None.
Suggestions
-
No progress indicator during smoke test (
src/presentation/renderers/audit_doctor.ts,check-startedhandler): The comment justifies the no-op as "checks run fast enough." The smoke test is the exception — it spawns a realswamp audit recordsubprocess, which requires binary resolution + file I/O. On cold starts or slow machines this could silently pause for 1–2s between lines and feel like a hang. Consider emitting a dim "running…" (or similar) specifically for the smoke test check, or at minimum revisit the blanket no-op if more slow checks are added underdoctor. -
"OVERALL: WARN" for unsupported tools (
src/domain/audit/doctor/doctor_service.ts:74, renderersummaryLine): Runningswamp doctor audit --tool codex(orcopilot) today results in all checks skipping →0 passed, 0 failed, N skipped — OVERALL: WARN. That status is technically correct but opaque — a user who doesn't know these tools aren't fully covered yet will be confused about what WARN means. A one-line note in the summary (e.g.,hint: all checks were skipped — no audit integration is defined for this tool yet) would clarify intent. Not blocking since the valid tool list in the--toolflag help text does include these values, but it's a discoverability gap. -
--include-diagnosticdescription (src/cli/commands/audit.ts): The description"Include rows written byswamp doctor audit's smoke test (filtered by default)"is accurate, but it's not immediately obvious to a user who hasn't read the doctor audit docs why smoke-test rows would appear in the audit timeline at all. Minor; a brief parenthetical like(sentinel rows prefixedecho swamp-doctor-smoke-test)would help power users correlating raw JSONL with CLI output.
Verdict
PASS — no blocking issues. Error messages are clear and actionable (NoToolConfiguredError is especially good), log/JSON dual-mode is correctly wired, exit codes are right, --tool validation is consistent, and the output format is readable and scannable.
…shold tessl review flagged swamp-extension-model at 86% (below 90% threshold), blocking the Skill Review CI gate on every PR that touches any skill. Pre-existing issue surfaced by PR #1227 triggering a Skill Review run. Changes: - Add explicit "## Development Workflow" with eight numbered steps and per-step validation checkpoints (author → verify registration → smoke test → unit tests → adversarial review → version → dry-run → publish). Addresses the "workflow_clarity" score (2/3 → 3/3). - Tighten "## When to Create a Custom Model" from a verbose multi- paragraph structure into a compact four-step decision list. Tessl score after: description 100%, content 85%, average 92.5%. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…eator Initial fix for the 86% tessl score was itself verbose (added 34 lines of prose that duplicated content already elsewhere in the file, violating "concise is key" and bloating the file further past the 500-line soft limit). Compact the Development Workflow to link-heavy one-line bullets that point at existing sections (Quick Start, smoke_testing, adversarial review). Net add over pre-PR state: +5 lines instead of +35. Content score holds at 85%, workflow_clarity stays 3/3, total average 92.5%. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
CLI UX Review
Blocking
None.
Suggestions
-
warnexit-code semantics in CI:swamp doctor audit --tool none(or codex/copilot) exits 0 withOVERALL: WARNin yellow. The PR description advertises this as "safe to gate in CI", but a user who accidentally passes a no-hooks tool gets a green CI step with a yellow banner — easy to miss. Consider a one-line note in the--tooldescription or thewarnsummary message explaining thatWARNmeans "this tool has no audit hooks; nothing was checked". Not blocking — the check names and messages already imply it. -
audit record --tooldescription stays vague: The siblingswamp audit record --toolstill says"AI tool providing hook input"with no valid-values hint, while the newdoctor audit --toollists them inline. Consistency win ifaudit recordgot the same treatment, but that's outside this PR's scope. -
--include-diagnosticdiscoverability: The new flag is added toswamp audit(hidden command), so its help text is only reachable viaswamp audit --help. Sinceauditis hidden, users who hit a polluted timeline from a smoke test will struggle to find this flag. A mention in therecording-smoke-testhint text ("runswamp audit --include-diagnosticto inspect") would close the loop.
Verdict
PASS — help text is complete, error messages are actionable (NoToolConfiguredError is particularly good), log/JSON output cover all fields, exit codes are correct.
There was a problem hiding this comment.
Adversarial Review
Critical / High
None found.
Medium
-
--include-diagnosticis silently defeated by the noise filter —src/domain/audit/audit_service.ts:148-176DIAGNOSTIC_COMMAND_PREFIXis"echo swamp-doctor-smoke-test". The diagnostic filter (line 148–154) correctly passes the row through whenincludeDiagnosticis true. However, the row then hits the noise filter at line 175:isNoiseCommandmatches because"echo"is inNOISE_COMMAND_PREFIXES(line 37), and the command starts with"echo ".Breaking input: Run
swamp audit --include-diagnostic(without--all). Diagnostic rows are passed through the first filter but caught by the second, so they never appear in the timeline. The--include-diagnosticflag is dead without--all.Suggested fix: Exempt diagnostic rows from the noise filter:
const isDiagnosticRow = trimmedCommand.startsWith(DIAGNOSTIC_COMMAND_PREFIX); if (!options.showAll && !isDiagnosticRow && isNoiseCommand(trimmedCommand)) { continue; }
Low
-
Day-boundary TOCTOU in recording smoke test —
src/domain/audit/doctor/checks/recording_smoke.ts:76-92The subprocess writes a row timestamped by
new Date()insideaudit record, and thenreadAuditRowsFromTodaycallstodaysAuditFilePath, which also usesnew Date(). If the twonew Date()calls straddle midnight UTC, the writer's row lands in tomorrow's file while the reader looks in today's file. The check would false-fail.Practically negligible (sub-second window), but worth noting for awareness — especially in CI environments running near midnight UTC.
-
No subprocess timeout in
makeSwampSpawnFn—src/cli/commands/doctor_audit.ts:69-103The spawned
swamp audit record --from-hookprocess has no timeout. If it hangs (e.g., waiting for network in a future code path, or a deadlocked pipe), the doctor command hangs indefinitely. TheAbortSignalis checked between checks (doctor_service.ts:110) but not during a check's subprocess call.Consider passing
signal: deps.abortSignaltoDeno.CommandorAbortSignal.timeout(N)as a safety net. -
Inconsistent error propagation in
checkOpenCode—src/domain/audit/doctor/checks/agent_config_loadable.ts:217-219checkOpenCodere-throws non-NotFounderrors (e.g., permission denied), whilecheckKiro,checkClaude, andcheckCursorall catch and return afailresult for the same category of errors. The outer try/catch indoctor_service.ts:123catches it safely, but the resulting error message is the generic "check threw" rather than a tool-specific hint.
Verdict
PASS — No critical or high severity issues. The core doctor logic is well-structured: checks are independent, errors don't propagate, the streaming protocol is correct, and test coverage is thorough. The noise-filter interaction (finding 1) is a real bug in the --include-diagnostic flag but only affects the audit timeline view, not the doctor command itself. Worth fixing before merge but not blocking.
There was a problem hiding this comment.
Code Review
Well-structured PR. Clean DDD layering with domain types/services in src/domain/audit/doctor/, application-layer orchestration via auditDoctor() exported through src/libswamp/mod.ts, CLI in src/cli/commands/, and presentation properly separated. The PreflightCheck interface, SpawnFn port, and ResolveBinary port enable clean testing without real subprocesses or binary resolution. Streaming event pattern is consistent with the rest of the codebase (datastoreStatus, etc.).
Every new source file has a companion _test.ts with thorough coverage — fake spawn functions, temp directories for filesystem checks, round-trip normalization tests for synthetic payloads. The compile-time contract in synthetic_payloads_test.ts (round-tripping fixtures through the normalizer) is a smart pattern that catches drift at CI time.
Blocking Issues
None.
Suggestions
-
CONFIG_FILESinagent_config_loadable.ts(line 30) — the file-path arrays in the values are never read programmatically; only the keys driveappliesToviatool in CONFIG_FILES. The actual paths are hardcoded again inside eachcheckKiro/checkClaude/etc. function. Consider either using the arrays to derive paths (single source of truth) or replacing with aSet<string>forappliesToto make intent clearer. -
No
resolve_binary_test.ts—binaryNameFor()is tested indirectly throughbinary_on_path_test.ts, but a direct unit test would strengthen coverage for the default branch and document thekiro→kiro-climapping explicitly. -
Minor:
AiToolimport inconsistency — domain code incheck.tsimportsAiToolfrom../../repo/repo_service.tswhile the CLI command indoctor_audit.tsimports it from../../infrastructure/persistence/repo_marker_repository.ts. ExportingAiToolthroughlibswamp/mod.tswould let the CLI import from the public API surface. (Low priority — the existingaudit.tscommand follows the same infrastructure import pattern.)
Summary
Adds
swamp doctor audit [--tool <name>]— a new preflight diagnostic that verifies the AI-tool audit integration configured in the repo is healthy end-to-end. Non-zero exit on any fail, safe to gate in CI.doctoris registered as a parent namespace so future diagnostics (doctor datastore,doctor vault, ...) plug in as siblings. Implements swamp-club#156.Five checks
binary-on-pathswamp-binary-on-pathbrew upgradeagent-config-loadabletools: ["*"])default-agent-set.kiro/settings/cli.jsonnot pointingchat.defaultAgentatswamprecording-smoke-testpostToolUsepayload piped throughswamp audit record --from-hookmust land as a row in today's audit JSONLThe headline verification
Temporarily removing
"shell"fromKIRO_SHELL_TOOL_NAMESinhook_input.ts, recompiling, and running doctor against a freshly-init'd kiro repo produces:Reverted before this commit. This is the load-bearing proof that the feature protects against future upstream drift instead of being theatre.
Name override
The issue body says "please NOT
doctor". The team pickeddoctoranyway — the command is a namespace and will grow; the name fits that shape. Acknowledging here for reviewer context.Scope note on the three cited Kiro bugs
Issue 156 cites three concrete bugs as motivation. Their status in this PR:
tools: ["*"]in kiro agent config →agent-config-loadableflags it with a hint; NOT fixed here.chat.defaultAgent: "swamp"→default-agent-setflags it; NOT fixed here.tool_name === "execute_bash"in Kiro normalizer → already fixed upstream of this PR.src/domain/audit/hook_input.ts:146-150hasKIRO_SHELL_TOOL_NAMES = new Set(["execute_bash", "shell", "execute_cmd"]). This PR's diagnostic catches the class of bug, it does not fix that specific one.Ancillary changes
todaysAuditFilePathhelper.src/infrastructure/persistence/jsonl_audit_repository.tshad two duplicate copies of thecommands-YYYY-MM-DD.jsonlformat string; moved tosrc/domain/audit/audit_path.tsand consumed by both the writer and the doctor smoke-test reader. Single source of truth for the filename format.echo swamp-doctor-smoke-testas a sentinel command prefix. The smoke-test writes a real audit row with this prefix; the audit timeline filters it out of the default view. Opt in withswamp audit --include-diagnostic.parseAiToolOrThrowinsrc/cli/ai_tool_parser.ts— central validator for--toolflags, so bogus values fail with a usage error listing valid names.design/audit.md,design/audit-doctor.md, and a short "verify the audit integration" block in theswamp-reposkill.Follow-ups
content/manual/reference/doctor.mdand a "verify the integration" section for eachcontent/manual/how-to/use-swamp-with-*.mdneed to be routed through the team's docs process.Test Plan
deno check— cleandeno lint— cleandeno fmt --check— cleandeno run test— 4922 passed, 0 failed, 1 ignoreddeno run compile— produces working binary"shell"fromKIRO_SHELL_TOOL_NAMES, recompile, rerun) — smoke-test correctly FAILS on kiro~/code/kiro-audit(existing audit history) — passes; sentinel filter confirmed withswamp audit --include-diagnostic/with--repo-dir /tmp/..., synthetic row landed in the target repo, not the CWD🤖 Generated with Claude Code