feat: npm run audit — read-only drift detector for state/dashboard divergence#27
Merged
Adds `npm run audit -- <org>`, a single-command audit for the state-vs-dashboard drift conditions that have been accumulating cruft in customer-fork repos.

Detects (read-only):
- orphan local YAML files (no state entry; Scenario B leftovers)
- state ghosts (state UUID missing on dashboard)
- state UUID collisions (cascade-duplicate fingerprint)
- content-identical resources (same lastPulledHash)
- sibling base-slug clusters (cascade-risk warning)
- dashboard orphans (UUID not in state; suppressed by .vapi-ignore)
- assistants with inline model.tools (suspected duplicate-spawn surface)

Exit code: 0 if clean, 1 if any findings.

Designed for DI: the state loader, local file lister, and remote fetcher are all injectable, making tests filesystem-free and network-free.

Promotes `listExistingResourceIds` in src/pull.ts from `function` to `export function` (one-word edit) to avoid duplicating the directory walker.

Tests in tests/audit.test.ts will be added in a follow-up commit on this branch.
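For readers who have not opened src/audit.ts, the injectable design described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `Finding` and `AuditDeps` shapes, the function name `checkOrphanYaml`, and the signatures of `stateLoader` and `listLocalIds` are all assumptions.

```typescript
// Hypothetical shapes; the real types in src/audit.ts may differ.
type ResourceType = string;

interface Finding {
  rule: string;
  resourceType: ResourceType;
  id: string;
  message: string;
}

interface AuditDeps {
  // Assumed signature: slugs recorded in state, keyed by resource type.
  stateLoader: () => Record<ResourceType, string[]>;
  // Assumed signature: slugs of local YAML files for one resource type.
  listLocalIds: (type: ResourceType) => string[];
}

// orphan-yaml: a local file exists but state has no entry for it.
function checkOrphanYaml(deps: AuditDeps, types: ResourceType[]): Finding[] {
  const state = deps.stateLoader();
  const findings: Finding[] = [];
  for (const type of types) {
    const known = new Set(state[type] ?? []);
    for (const id of deps.listLocalIds(type)) {
      if (!known.has(id)) {
        findings.push({
          rule: "orphan-yaml",
          resourceType: type,
          id,
          message: `${type}/${id} has no state entry`,
        });
      }
    }
  }
  return findings;
}

// In-memory fixtures stand in for the filesystem and the state file.
const orphanFindings = checkOrphanYaml(
  {
    stateLoader: () => ({ tools: ["end-call"] }),
    listLocalIds: (type) => (type === "tools" ? ["end-call", "end-call-copy"] : []),
  },
  ["tools", "assistants"],
);
console.log(orphanFindings.map((f) => f.id)); // [ 'end-call-copy' ]
```

Because the check only sees the injected functions, a test can drive it with plain objects and never touch disk or network.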
…ion)

Covers all 7 audit checks via DI fixtures (no filesystem, no network):
- orphan-yaml (3 cases)
- state-ghost (3 cases inc. fetchRemote=false short-circuit)
- state-uuid-collision (2 cases)
- content-identical (3 cases inc. missing-hash safety)
- sibling-base-slug (3 cases inc. cross-ref overlap)
- dashboard-orphan (4 cases inc. .vapi-ignore suppression)
- inline-tools (4 cases inc. async-Promise branch)

Plus: 1 integration test combining multiple checks, 1 exit-code mapping test, 3 formatter tests.
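As an illustration of what the "missing-hash safety" case guards against, here is a hedged sketch of a content-identical check. Only `lastPulledHash` is named in this PR; the `StateEntry` shape and the function name `findContentIdentical` are hypothetical, and the real check may be structured differently.

```typescript
// Hypothetical state entry shape; only `lastPulledHash` comes from the PR text.
interface StateEntry {
  id: string;
  uuid: string;
  lastPulledHash?: string;
}

// Groups entries by hash and returns only groups with two or more members.
// Entries without a hash are skipped so they never cluster with each other.
function findContentIdentical(entries: StateEntry[]): string[][] {
  const byHash = new Map<string, string[]>();
  for (const entry of entries) {
    if (!entry.lastPulledHash) continue; // missing-hash safety
    const ids = byHash.get(entry.lastPulledHash) ?? [];
    ids.push(entry.id);
    byHash.set(entry.lastPulledHash, ids);
  }
  const clusters: string[][] = [];
  byHash.forEach((ids) => {
    if (ids.length > 1) clusters.push(ids);
  });
  return clusters;
}

const hashClusters = findContentIdentical([
  { id: "a", uuid: "u1", lastPulledHash: "h1" },
  { id: "b", uuid: "u2", lastPulledHash: "h1" },
  { id: "c", uuid: "u3" }, // no hash yet
  { id: "d", uuid: "u4" }, // no hash yet
  { id: "e", uuid: "u5", lastPulledHash: "h2" },
]);
console.log(hashClusters); // [ [ 'a', 'b' ] ]
```

Without the early `continue`, the two hash-less entries would be grouped under the same `undefined` key and reported as duplicates of each other.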
Closes the gap surfaced by the test-writer phase: audit-cmd.ts had inlined `findings.length === 0 ? 0 : 1` at every exit-code call site, so the exit-code test in tests/audit.test.ts could only assert on a parallel re-derivation rather than the real CLI behavior. Extracts a tiny exported `exitCodeForFindings(findings)` helper and routes both exit sites through it. Test imports and pins to the helper, so future changes to the severity bar (e.g. a `--strict` flag in v2) will surface in the existing assertion instead of silently drifting. No behavior change. 155/155 tests pass.
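The extracted helper is small enough to sketch in full. The name `exitCodeForFindings` and the `findings.length === 0 ? 0 : 1` expression come from this commit message; the `Finding` shape below is an assumption, and the real helper is exported from audit-cmd.ts rather than declared inline.

```typescript
// Hypothetical finding shape; the real type may carry more fields.
interface Finding {
  rule: string;
  severity: string;
  message: string;
}

// Single place that maps findings to a process exit code:
// 0 if clean, 1 on any finding. (Exported in the real module.)
function exitCodeForFindings(findings: readonly Finding[]): 0 | 1 {
  return findings.length === 0 ? 0 : 1;
}

const cleanCode = exitCodeForFindings([]);
const dirtyCode = exitCodeForFindings([
  { rule: "state-ghost", severity: "warn", message: "state UUID missing on dashboard" },
]);
console.log(cleanCode, dirtyCode); // 0 1
```

A test that imports this function asserts on the same code path the CLI uses, rather than on a copy of the expression.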
Addresses two non-blocking code-review findings before opening the PR:

1. **Fail-fast → fail-graceful for dashboard fetches.** Switched the parallel per-type API calls from `Promise.all` to `Promise.allSettled`. A transient 500 / 429 / network blip on one resource type used to abort the entire audit, leaving the operator with zero findings instead of findings-for-the-types-that-succeeded. Now: each failed fetch emits a `fetch-failed` finding (severity: warn, message includes the underlying error). The per-type loop checks `remoteByType.has(type)` before running state-ghost and dashboard-orphan checks, preventing the would-be false positive where an empty-array fallback marks every state entry as a ghost. New rule: `AuditRule = ... | "fetch-failed"`.
2. **README command table missing `audit`.** Added a row under the `validate` entry so operators discover the command from the same surface that lists `pull`/`push`/`cleanup`/`rollback`/etc.

New test pinning the fail-graceful path: one type's `remoteFetcher` throws → exactly 1 `fetch-failed` finding for that type, 0 false-positive state-ghost findings for any state entry of that type, and other types' checks proceed normally.

Suite: 156/156 pass (+1 test).
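The fail-graceful fetch pattern described in item 1 might look roughly like the sketch below. `Promise.allSettled`, `remoteFetcher`, `remoteByType`, and the `fetch-failed` rule are named in this PR; the function name `fetchAllTypes`, the type shapes, and the message wording are assumptions.

```typescript
// Hypothetical shapes; the real types in src/audit.ts may differ.
type ResourceType = string;

interface RemoteResource {
  id: string;
}

interface Finding {
  rule: string;
  severity: string;
  resourceType: ResourceType;
  message: string;
}

interface FetchOutcome {
  remoteByType: Map<ResourceType, RemoteResource[]>;
  findings: Finding[];
}

async function fetchAllTypes(
  types: ResourceType[],
  remoteFetcher: (type: ResourceType) => Promise<RemoteResource[]>,
): Promise<FetchOutcome> {
  // allSettled never rejects, so one failing type cannot abort the others.
  const settled = await Promise.allSettled(types.map((type) => remoteFetcher(type)));
  const remoteByType = new Map<ResourceType, RemoteResource[]>();
  const findings: Finding[] = [];
  settled.forEach((result, index) => {
    const type = types[index];
    if (type === undefined) return;
    if (result.status === "fulfilled") {
      remoteByType.set(type, result.value);
    } else {
      // No empty-array fallback: the type is simply absent from the map.
      const reason = result.reason instanceof Error ? result.reason.message : String(result.reason);
      findings.push({
        rule: "fetch-failed",
        severity: "warn",
        resourceType: type,
        message: `fetch for ${type} failed: ${reason}`,
      });
    }
  });
  return { remoteByType, findings };
}

// Fixture: the "tools" fetch fails, the "assistants" fetch succeeds.
const outcome = fetchAllTypes(["tools", "assistants"], async (type) => {
  if (type === "tools") throw new Error("HTTP 500");
  return [{ id: "a1" }];
});
outcome.then(({ remoteByType, findings }) => {
  console.log(remoteByType.has("tools"), remoteByType.has("assistants"), findings.length);
});
```

A caller would then guard the remote-dependent checks with `remoteByType.has(type)`, so a failed type produces one `fetch-failed` finding instead of a ghost finding per state entry.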
dhruva-reddy added a commit that referenced this pull request May 13, 2026
Followup to #27. Agents discovering the engine surface read the command tables in AGENTS.md (lines 62-72 and 797-830); both need to mention `npm run audit` so downstream agents pick it up in normal workflow.

Added to:
- "Common commands" quick-reference table (after `validate`)
- "## Available Commands" bash block (after `validate`, before `sim`)

Docs-only PR: skip test-writer/code-reviewer per the always-apply rule for docs-only changes.
This was referenced May 13, 2026
Summary
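Adds `npm run audit -- <org>`, a single-command read-only diagnostic for the state-vs-dashboard drift conditions that have been silently accumulating in customer-fork repos (orphan tools, byte-identical assistant clusters, state ghosts, etc.).

7 checks:

| Rule | Detects |
|---|---|
| `orphan-yaml` | Local YAML files with no state entry |
| `state-ghost` | State UUIDs missing on the dashboard |
| `state-uuid-collision` | State UUID collisions (cascade-duplicate fingerprint) |
| `content-identical` | Resources with the same `lastPulledHash` (PR #19 enables this) |
| `sibling-base-slug` | Sibling base-slug clusters (cascade-risk warning) |
| `dashboard-orphan` | Dashboard UUIDs not in state (suppressed by `.vapi-ignore` patterns) |
| `inline-tools` | Assistants with inline `model.tools` blocks, a suspected duplicate-spawn surface |
| `fetch-failed` | A per-type dashboard fetch failed (severity: warn) |

Exit code: 0 if clean, 1 on any finding. Safe to wire into CI.

Design: DI surface for `stateLoader` / `listLocalIds` / `remoteFetcher` / `readAssistantTools`, so tests are filesystem-free and network-free. Mirrors the `validate-cmd.ts` / `validate.ts` structure.

Concurrency: per-type dashboard fetches use `Promise.allSettled`, so a transient API hiccup on one type emits a `fetch-failed` finding and skips the remote checks for that type instead of aborting the whole audit.

Why now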
Today's manual cleanup on a customer-fork repo (mudflap-prod) drained 13 orphan endCall tool UUIDs and 7 duplicate assistant UUIDs across two orgs — a ~30-minute cross-reference exercise involving Node scripts, dashboard API calls, and YAML grep. The same cleanup pattern is needed for other customer forks (gitops-amazon3p, gitops-notable). This command makes it a single read.
Files changed
- `src/audit.ts`
- `src/audit-cmd.ts`
- `tests/audit.test.ts`
- `src/pull.ts` (`function` → `export function listExistingResourceIds`)
- `package.json`
- `README.md`

Test plan
- `npm run build`: clean
- `npm test`: 156 / 156 pass (28 new tests for audit, 128 prior)
- `npx @biomejs/biome check --write`: clean

Residual risks / known follow-ups
Surfaced during the code-review phase (non-blocking, deferred):
- `credentials` state section is intentionally not audited: `VALID_RESOURCE_TYPES` omits it by design, so the audit skips it silently. If credentials ever drift in state, the audit won't detect it. (Documented at the top of `src/audit.ts`.)
- […] `inline-tools` findings. `validate` catches parse errors via a separate code path. Could be promoted to its own `unparseable-assistant` rule in a follow-up.
- `.vapi-ignore` patterns must use slug form to suppress dashboard-orphan findings; bare UUIDs in `.vapi-ignore` won't work. Real-world `.vapi-ignore` content uses slug form, so this is unlikely to bite.
- No `--summary` flag yet: on a 200-assistant fork with deep drift, a single rename can produce 3–4 findings for one logical issue. A future v2 could collapse findings per resource id.
- No `--fix` flag: v1 is read-only by design; deletion stays a manual `delete files → push → API-delete → pull` sequence for safety.
- Each `checkX` function could use a 1-line invariant doc. Mechanical follow-up.

Diagnostic context
This command is part of a broader investigation into duplicate-tool / duplicate-assistant drift in customer-fork repos. A separate vapi-core ticket is being filed against the dashboard UI's assistant-editor save handler, which is the leading suspect for the spawn-source (see the customer-fork `improvements.md` revision for the full diagnosis history).