feat: interactive CLI, slug-based orgs, evals support#7

Merged
dhruva-reddy merged 8 commits into main from feat/optimization-gitops-flow
Apr 21, 2026

Conversation

@vtkovapi
Collaborator

Summary

  • Interactive CLI for all commands: pull, push, apply, call, and cleanup now prompt for org selection and offer a searchable multi-select resource picker when run without arguments. Direct mode (npm run push -- <org>) still works for scripting/CI.
  • Slug-based orgs replace fixed dev/stg/prod — Resources are scoped by org name (e.g. my-org, production) instead of hardcoded environments. Each org gets its own .env.<org>, .vapi-state.<org>.json, and resources/<org>/ directory. An interactive npm run setup wizard handles first-time configuration.
  • Evals as a first-class resource type — Added evals throughout the pipeline: types, state, pull, push, delete, resource loading, and the eval runner (npm run eval).

Details

  • npm run setup — interactive wizard: API key validation with region auto-detection, org naming, searchable resource picker with dependency detection
  • searchableCheckbox — custom @inquirer/core prompt with type-to-search, space-to-toggle, Ctrl+A, grouped display, ESC-to-go-back
  • All -cmd.ts wrappers detect whether a slug arg is present — if yes, forward to core script; if no, enter interactive mode
  • shouldApplyResourceType in push.ts now skips resource types not relevant to the selected file paths (less noise)
  • Removed all 30+ env-specific npm scripts (push:dev, pull:stg:force, etc.) — replaced by 7 universal commands
  • README fully rewritten for the new workflow
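
The slug detection in the `-cmd.ts` wrappers could look roughly like the sketch below. This is a hypothetical illustration, not the code from this PR: `resolveMode`, the `Mode` type, and the slug pattern are assumptions made for the example.

```typescript
// Hypothetical sketch: choose between direct mode and interactive mode.
type Mode =
  | { kind: "direct"; slug: string; rest: string[] }
  | { kind: "interactive" };

// Assumption: an org slug is lowercase letters, digits, and hyphens.
const SLUG_PATTERN = /^[a-z0-9][a-z0-9-]*$/;

function resolveMode(args: string[]): Mode {
  const [first, ...rest] = args;
  // No positional argument, or a flag in first position: use prompts.
  if (first === undefined || first.startsWith("-") || !SLUG_PATTERN.test(first)) {
    return { kind: "interactive" };
  }
  // A slug is present: forward it and the remaining args to the core script.
  return { kind: "direct", slug: first, rest };
}
```

A wrapper would call `resolveMode(process.argv.slice(2))` and either spawn the core script with the slug or enter the prompt flow.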

Test plan

  • npm run setup — configure a new org end-to-end
  • npm run pull — interactive org selection, resource picker with ✔ local markers, ESC to go back
  • npm run push — interactive with git status indicators, selective push of individual files
  • npm run push -- <org> — direct mode still works
  • npm run push -- <org> resources/<org>/assistants/foo.md — single-file push only loads relevant resource type
  • npm run apply — interactive apply (pull → push)
  • npm run call — interactive assistant/squad picker from state file
  • npm run cleanup — interactive dry-run then confirm
  • npm run eval -- <org> -s <squad> — eval runner works with slug-based env
  • npm run build — clean compile (no new type errors)

Made with Cursor

@vtkovapi vtkovapi requested a review from dhruva-reddy April 10, 2026 18:23
@vtkovapi vtkovapi self-assigned this Apr 10, 2026
@dhruva-reddy
Contributor

Awesome stuff :D

Couple of quick pointers with longer summaries in the .md file below:

  1. UX flow is a bit messy to navigate in the terminal for npm run setup
  2. Documentation still references the old way to pull and push (i.e. commands haven't been updated on our README and AGENTS.md)
  3. This one's a bit trickier: the local websocket connection used to test calling a squad or assistant is broken, so I had Claude walk through a fix and documented what ended up working for me on local dev

requested improvements.md

Contributor

@dhruva-reddy dhruva-reddy left a comment

Requesting changes post review

vtkovapi and others added 5 commits April 15, 2026 16:36
- Enhanced `.env.example` with clearer API base URL instructions for US and EU regions.
- Updated `.gitignore` to simplify environment file exclusions and added a catch-all for `.env.*`.
- Removed obsolete `.vapi-state.dev.json` and `.vapi-state.prod.json` files.
- Added new scripts to `package.json` for setup, apply, push, pull, call, and cleanup operations.
- Introduced `searchableCheckbox.ts` for improved user input handling in CLI prompts.
- Cleaned up empty directories and `.gitkeep` files across various resource paths.
- Replaced existing push and pull scripts with new interactive versions (`push-cmd.ts` and `pull-cmd.ts`) that allow users to select organizations and resources interactively.
- Added a new `interactive.ts` file to handle organization detection and resource selection.
- Updated `package.json` scripts to point to the new command files.
- Enhanced `searchableCheckbox.ts` to support a back option in the interactive prompts.
- Refactored `setup.ts` to integrate the new interactive features.
- Introduced `apply-cmd.ts`, `call-cmd.ts`, and `cleanup-cmd.ts` as entry points for their respective commands, allowing for organization slug detection and interactive modes.
- Updated `apply.ts`, `call.ts`, and `cleanup.ts` to support new command structures and improved error handling for invalid org names.
- Enhanced `interactive.ts` to facilitate user interaction for selecting organizations and confirming actions.
- Added support for eval resources across various scripts, including updates to state management and resource handling in `push.ts`, `pull.ts`, and `delete.ts`.
- Refactored argument parsing and validation to ensure consistency across commands.
…ive setup

- Replaced existing command scripts with new command files (`apply-cmd.ts`, `call-cmd.ts`, `cleanup-cmd.ts`) for improved organization and functionality.
- Removed outdated scripts from `package.json` to streamline command usage.
- Enhanced the README to introduce an interactive setup process, detailing the steps for first-time users and clarifying command functionalities.
- Updated command descriptions to reflect the new interactive capabilities and improved user experience.
…ons, squad patterns

- assistants.md: Deepgram Nova-3 keyterm vs keywords, pronunciation dictionary provider comparison (Cartesia/ElevenLabs/Vapi), three-layer pronunciation approach
- tools.md: dead air during KB/API tool calls — request-start + request-response-delayed fix pattern
- squads.md: toolIds in assistantOverrides require UUIDs (not filenames), FAQ agent consolidation pattern
- simulations.md: running simulations against squads, A/B testing workflow, primitive-type evaluation constraint, filename renaming after push
- multilingual.md: updated best single-agent stack to Cartesia sonic-3, added keyterm example
- latency.md: added Cartesia pronunciation row to TTS selection table
@dhruva-reddy dhruva-reddy force-pushed the feat/optimization-gitops-flow branch from 2a68d2c to 0bcb530 Compare April 15, 2026 23:36
- Updated error handling in audio context and microphone initialization to provide more specific warnings based on the encountered issues.
- Modified command usage messages across various scripts to standardize the format and improve clarity, ensuring users understand the correct syntax for commands.
- Added support for new resource types in interactive prompts and improved the handling of locally modified files during resource pulls.
- Introduced a new grouping mechanism in the searchable checkbox for better organization of choices in interactive prompts.
…nization scope

- Revised `.env.example` to reflect changes from environment-specific to organization-specific configurations.
- Updated `AGENTS.md` to clarify resource management under org-scoped directories, including command usage and resource promotion.
- Enhanced instructions for setting up new organizations and managing resources accordingly.
@vtkovapi
Collaborator Author

@dhruva-reddy can you recheck please

@vtkovapi vtkovapi requested a review from dhruva-reddy April 16, 2026 02:25
Contributor

dhruva-reddy commented Apr 20, 2026

Resolves the conflicts on PR #7 that blocked it from merging into main.
main shipped `a1e3228 fix(gitops): scope credential walker, harden state
and retries` and `eb29042 feat(gitops): add .vapi-ignore for explicit
resource opt-out` while this branch was in flight. Most of `a1e3228`
undoes regressions this branch introduced — auto-merge handled the
non-overlapping changes correctly; two files needed hand resolution.

Conflict resolutions
--------------------

`.gitignore` — combined: keep `tmp/` alongside main's `.claude/`.

`src/pull.ts`
  - Kept this branch's mtime-based fallback block for the "locally
    edited, no git" case, with one syntactic cleanup: removed the
    inner `const folderPath = FOLDER_MAP[resourceType];` shadow since
    main's `.vapi-ignore` merge introduced an outer `folderPath` at
    the top of the loop body (line 676) that now covers this scope.
  - For the deleted-resource block, took main's expanded comment that
    references `.vapi-ignore` as the mechanism to stop tracking a
    resource entirely.
  - For the pull summary, took main's categorized legend
    (🚫 / ✏️ / 🗑️) — it genuinely surfaces more information than the
    single "preserved (locally changed)" line — but fixed its
    `pull:dev:force` example to the org-slug command form
    `npm run pull -- ${VAPI_ENV} --force`, which is what this branch
    actually exposes.

Additional stale-string fixes made in the same commit
-----------------------------------------------------

These are strings that were correct on main before the org-slug
refactor but become stale once this branch's renamed scripts land. If
left as-is they would actively mis-route users away from the commands
this branch defines. All three are single-line text changes:

  `src/cleanup.ts:108` — `npm run cleanup:${VAPI_ENV} -- --force ...`
    → `npm run cleanup -- ${VAPI_ENV} --force ...`.

  `src/cleanup.ts:135` — `npm run pull:${VAPI_ENV}:bootstrap`
    → `npm run pull -- ${VAPI_ENV} --bootstrap`.

  `docs/environment-scoped-resources.md` deleted entirely. The whole
    file documented `push:dev`/`push:stg`/`pull:prod` commands and
    `resources/dev/` workflows that this branch removed. Also cleaned
    up the surviving references from `README.md` and `AGENTS.md`
    project-structure trees.

Notable incoming work (auto-merged)
-----------------------------------

`src/credentials.ts` — main's scoped credential walker replaces this
branch's generic string-walk. Fixes the `provider: openai` enum
corruption bug.

`src/push.ts` — main's `try { ... } finally { saveState }` wrap around
the apply body. Ensures state is flushed on partial failure so retries
don't create duplicates.
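
A minimal sketch of the `try { ... } finally { saveState }` pattern described above, assuming a simplified in-memory state. The `applyAll` name and the `create` / `save` callbacks are hypothetical stand-ins, not the signatures in `src/push.ts`.

```typescript
type State = Record<string, string>; // resource name -> platform UUID

// Hypothetical apply loop: `create` stands in for the API call and
// `save` for the state writer.
function applyAll(
  names: string[],
  state: State,
  create: (name: string) => string,
  save: (state: State) => void,
): void {
  try {
    for (const name of names) {
      if (state[name] === undefined) {
        state[name] = create(name); // may throw on a mid-push 5xx
      }
    }
  } finally {
    // Runs on success and on failure, so UUIDs issued before the
    // error are recorded and a retry skips those resources.
    save(state);
  }
}
```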

`src/cleanup.ts` — main's `--confirm <slug>` double-gate and empty-
state refusal (both previously removed on this branch).
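
The double-gate can be sketched as a pure check like the one below. This is an assumption-laden illustration: `checkCleanupGates`, `CleanupArgs`, and the reason strings are hypothetical, and only the gating rules (dry-run allowed, `--force` requires a matching `--confirm <slug>` and non-empty state) come from the description above.

```typescript
interface CleanupArgs {
  slug: string;                 // org being cleaned
  force: boolean;               // --force was passed
  confirm?: string;             // value passed to --confirm, if any
  trackedResourceCount: number; // number of entries in the state file
}

type GateResult =
  | { allowed: true; dryRun: boolean }
  | { allowed: false; reason: string };

function checkCleanupGates(args: CleanupArgs): GateResult {
  // Without --force the run is a dry-run and needs no gates.
  if (!args.force) return { allowed: true, dryRun: true };
  // Gate 1: --confirm must name the same org slug.
  if (args.confirm !== args.slug) {
    return { allowed: false, reason: `pass --confirm ${args.slug} to delete` };
  }
  // Gate 2: empty state would make every remote resource look orphaned.
  if (args.trackedResourceCount === 0) {
    return { allowed: false, reason: "state is empty; refusing destructive cleanup" };
  }
  return { allowed: true, dryRun: false };
}
```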

`src/pull.ts cleanResource` — main's `null` preservation, so cleared
fields like `voicemailMessage: null` don't silently revert on the
next push.
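
A sketch of what `null` preservation means in practice, assuming JSON-like payloads. The field list in `EXCLUDED_FIELDS` and the `topLevel` parameter are illustrative assumptions, not the contents of `src/pull.ts`.

```typescript
// Assumption: a small illustrative set of server-managed fields.
const EXCLUDED_FIELDS = new Set(["id", "orgId", "createdAt", "updatedAt"]);

function cleanResource(value: unknown, topLevel = true): unknown {
  if (value === null) return null; // keep: null means "intentionally cleared"
  if (Array.isArray(value)) return value.map((v) => cleanResource(v, false));
  if (typeof value !== "object") return value; // primitives and undefined pass through
  const out: Record<string, unknown> = {};
  for (const [key, v] of Object.entries(value as Record<string, unknown>)) {
    if (v === undefined) continue; // drop undefined properties
    if (topLevel && EXCLUDED_FIELDS.has(key)) continue; // drop server-managed fields
    out[key] = cleanResource(v, false);
  }
  return out;
}
```

With this shape, `voicemailMessage: null` survives a pull and is sent back on the next push instead of being omitted.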

`src/api.ts` — main's retry loop now covers 5xx in addition to 429.
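
A rough sketch of a retry loop that treats 5xx like 429. The helper names, attempt count, and backoff numbers are assumptions for illustration and are not taken from `src/api.ts`.

```typescript
// Assumption: retry on 429 and on any 5xx status.
function isRetryableStatus(status: number): boolean {
  return status === 429 || (status >= 500 && status <= 599);
}

// Assumption: exponential backoff with a cap; the numbers are illustrative.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 8000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}

async function requestWithRetry(
  send: () => Promise<{ status: number }>,
  maxAttempts = 4,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<{ status: number }> {
  let response = await send();
  for (
    let attempt = 0;
    attempt < maxAttempts - 1 && isRetryableStatus(response.status);
    attempt++
  ) {
    await sleep(backoffDelayMs(attempt));
    response = await send();
  }
  return response; // the caller inspects the final status
}
```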

`src/state.ts` — main's atomic state write (tmp + rename) plus loud-
throw on JSON parse error.
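
The tmp + rename write and the loud parse failure can be sketched as follows. Function names and the temp-file naming are assumptions; the only claims taken from the description are "write to a temp file, then rename" and "throw on JSON parse error". The rename is atomic on POSIX filesystems when both paths are on the same volume.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

type State = Record<string, unknown>;

function saveStateAtomic(file: string, state: State): void {
  // Write to a sibling temp file, then rename over the target so a
  // crash mid-write never leaves a half-written state file.
  const tmp = `${file}.${process.pid}.tmp`;
  fs.writeFileSync(tmp, JSON.stringify(state, null, 2) + "\n", "utf8");
  fs.renameSync(tmp, file);
}

function loadState(file: string): State {
  if (!fs.existsSync(file)) return {};
  const raw = fs.readFileSync(file, "utf8");
  try {
    return JSON.parse(raw) as State;
  } catch (err) {
    // Fail loudly rather than treating a corrupt file as empty state.
    throw new Error(`Cannot parse state file ${file}: ${(err as Error).message}`);
  }
}

// Usage against a throwaway directory (hypothetical org slug "my-org").
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "vapi-state-"));
const demoFile = path.join(demoDir, ".vapi-state.my-org.json");
saveStateAtomic(demoFile, { assistants: { foo: "uuid-1" } });
```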

`src/resources.ts` — main sorts directory entries before iteration for
deterministic push order across filesystems.

`src/config.ts` — main's `loadIgnorePatterns()` / `matchesIgnore()`
for the new `.vapi-ignore` feature.

`resources/{dev,stg,prod}/.vapi-ignore.example` — new illustrative
example files added by main. Harmless as-is; a follow-up commit on the
child `fix/p0-gitops-regressions` branch consolidates them to a single
canonical `resources/.vapi-ignore.example`.

`AGENTS.md`, `docs/learnings/call-duration.md` — doc adds; no
conflict.

Verification
------------

  npm run build  → passes (tsc --noEmit, 0 errors)

(No test suite yet on this branch — it lives on the stacked child
branch `fix/p0-gitops-regressions`.)
Contributor

dhruva-reddy commented Apr 21, 2026

Merge activity

  • Apr 21, 6:14 AM UTC: A user started a stack merge that includes this pull request via Graphite.
  • Apr 21, 6:14 AM UTC: @dhruva-reddy merged this pull request with Graphite.

@dhruva-reddy dhruva-reddy merged commit 0782281 into main Apr 21, 2026
1 check passed
dhruva-reddy added a commit that referenced this pull request Apr 21, 2026
…itops-flow audit; add regression tests (#10)

## Describe your changes

Stacked on top of #7. Restores seven P0-class correctness/safety behaviors that `feat/optimization-gitops-flow` regressed, locks them in with 33 regression tests, sweeps docs and CLI text to the org-slug model, and integrates `origin/main`'s parallel fix-set (`a1e3228`) + `.vapi-ignore` feature (`eb29042`).

**P0 fixes** (see `requested improvements.md` audit for details; also mostly shipped in parallel on `main` as `a1e3228`):

- **Credential walker scoped** to `credentialId` / `credentialIds` — the `feat` branch's generic string-walk rewrote `provider: openai` / `11labs` / `langfuse` enums into UUIDs on push, which the Vapi API rejects. Restored the scoped walker with cycle-safe `WeakSet` and `isPlainObject` guard.
- **`saveState` wrapped in `try { ... } finally`** so a mid-push 5xx no longer leaves API-issued UUIDs unrecorded, which previously created duplicates on retry.
- **`cleanResource` preserves `null`** — the Vapi API uses `null` to mean "intentionally cleared". Stripping it caused pull→push round-trip drift that silently re-applied prior values.
- **Cleanup double-gate restored** — `--force` alone is no longer enough; `--confirm <slug>` is also required, plus empty-state refusal. Prevents a fresh clone or corrupted state from being misread as "everything is orphaned" and wiping the org. Interactive cleanup passes both flags since the `confirm()` prompt is the user's explicit consent.
- **Interactive pull is now local-first** — drops the unconditional `--force` in both the "All" and per-resource flows; adds an explicit `Overwrite locally modified files?` confirm (default `No`).
- **Fresh-clone preservation** — `git status --porcelain --untracked-files=all -z` (was just `-z`, causing untracked dirs to collapse). Mtime fallback now fires when `changedFiles` is an empty Set, not just `undefined`. `.yaml` extension now included alongside `.md` / `.yml` in the preservation check.
- **Short-form paths match for partial push** — `assistants/foo.yml` (the form documented in `AGENTS.md`) used to silently no-op. Extracted `pathMatchesFolder` helper and broadened the match. Bare resource ids (`npm run push -- <org> foo`) are now refused explicitly instead of triggering a full apply with orphan-deletion.
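
To illustrate the first fix in the list above, here is a minimal sketch of a walker that rewrites only `credentialId` / `credentialIds` values and leaves every other string (such as `provider: openai`) alone. The signature and the `Map`-based mapping are assumptions; the `WeakSet` here tracks the current traversal path, which is one possible way to get cycle safety.

```typescript
function isPlainObject(v: unknown): v is Record<string, unknown> {
  if (v === null || typeof v !== "object") return false;
  const proto = Object.getPrototypeOf(v);
  return proto === Object.prototype || proto === null;
}

// Hypothetical signature: `mapping` translates in one direction
// (name -> UUID on push, or UUID -> name on pull).
function replaceCredentialRefs(
  value: unknown,
  mapping: Map<string, string>,
  onPath: WeakSet<object> = new WeakSet(),
): unknown {
  if (Array.isArray(value)) {
    if (onPath.has(value)) return value; // already on the current path: a cycle
    onPath.add(value);
    const mapped = value.map((v) => replaceCredentialRefs(v, mapping, onPath));
    onPath.delete(value);
    return mapped;
  }
  if (!isPlainObject(value)) return value; // Date, Buffer, primitives pass through
  if (onPath.has(value)) return value; // cycle guard
  onPath.add(value);
  const out: Record<string, unknown> = {};
  for (const [key, v] of Object.entries(value)) {
    if (key === "credentialId" && typeof v === "string") {
      out[key] = mapping.get(v) ?? v;
    } else if (key === "credentialIds" && Array.isArray(v)) {
      out[key] = v.map((id) => (typeof id === "string" ? (mapping.get(id) ?? id) : id));
    } else {
      out[key] = replaceCredentialRefs(v, mapping, onPath);
    }
  }
  onPath.delete(value);
  return out;
}
```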

**Other changes in this PR:**

- 33 regression tests under `tests/` using Node's built-in `node:test` runner (no new dependencies); `npm test` added.
- README + AGENTS.md swept for the org-slug model (`.env.<org>`, `resources/<org>/...`, `npm run <cmd> -- <org>`); `docs/environment-scoped-resources.md` deleted as fully stale.
- Merged `origin/main`'s parallel fix (`a1e3228`) + `.vapi-ignore` feature (`eb29042`) cleanly — full conflict-resolution rationale in the merge commit message.
- Consolidated `.vapi-ignore.example` to a single canonical `resources/.vapi-ignore.example` (main's merge added three duplicates in dead env-named folders).
- Removed unused `scripts/mock-vapi-webhook-server.ts` and its references in README / AGENTS / `package.json`.
- Swept P1/P2 findings from a pre-push review subagent: evals properly wired through `cleanup.ts` orphan scan and `ALL_RESOURCE_TYPES` in push; `eval.ts` remediation text updated to org-slug paths; AGENTS.md `.vapi-ignore` section fixed.

**Deferred to follow-up PRs (flagged, not blocking):**

- `applyEval` doesn't yet use `upsertResourceWithStateRecovery` — will 404-crash on stale state mappings instead of recovering like the other resource types.
- Interactive pull masks 5xx/network errors as "no remote resources" — only 401/403 are currently classified correctly.
- Interactive flows (`src/interactive.ts`, `src/searchableCheckbox.ts`, `src/setup.ts`) have no automated coverage; the `--force` confirm prompt and cleanup's confirm spawn are only protected by code review.

## Relevant Context (linear ticket, slack link, etc)

Surfaced during a multi-agent audit of `feat/optimization-gitops-flow` (parent PR #7) before it shipped to customer-facing deployments. The audit's findings and a status-per-requested-improvement table live in `requested improvements.md` at the repo root (gitignored — local audit log only).

## API Changes

- **Is this changing the public API?**

  - [ ] Yes
  - [x] No

- **If yes, is it backward‐compatible?**
  - [ ] Yes
  - [ ] No

N/A. This repo is an internal CLI that consumes the Vapi API; it does not expose any public API surface. The only user-visible CLI changes are additive (`npm test`) or correctness-restoring (cleanup `--confirm <slug>` now required for destructive runs, bare-id positional args refused explicitly). Behaviors that were silent no-ops or silent overwrites now error or prompt loudly.

Non backward-compatible changes might break customers' agents. Please proceed with care and notify the team.

## How did you test this?

**Automated (33 tests, all passing under `npm test`):**
- `tests/credentials.test.ts` (8) — `replaceCredentialRefs` scoping, including the specific `provider: openai` regression that motivated the walker revert; cycle safety; Date/Buffer pass-through; symmetric round-trip.
- `tests/clean-resource.test.ts` (4) — `null` preservation, `undefined` stripping, `EXCLUDED_FIELDS` handling, nested structures.
- `tests/path-matching.test.ts` (11) — `pathMatchesFolder` across long-form, short-form, absolute, Windows-style, nested-folder, and segment-boundary cases.
- `tests/cleanup-safety.test.ts` (4) — spawn-based integration: `cleanup --force` refuses without `--confirm <slug>` / with wrong slug / with empty state; dry-run allowed without the gates.
- `tests/cli-arg-parsing.test.ts` (6) — spawn-based integration: bare-id refusal, misspelled-type refusal, valid positional types / file paths / `--confirm` slug consumption all accepted.
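
For readers unfamiliar with the path cases listed for `tests/path-matching.test.ts`, the sketch below shows one way a segment-based match could work. The two-argument signature and the implementation are assumptions, not the helper extracted in this PR.

```typescript
// Hypothetical sketch: true when `folder` appears as whole directory
// segments somewhere in `filePath`.
function pathMatchesFolder(filePath: string, folder: string): boolean {
  // Normalize Windows separators and drop empty segments (leading "/").
  const segments = filePath.replace(/\\/g, "/").split("/").filter((s) => s.length > 0);
  const wanted = folder.split("/").filter((s) => s.length > 0);
  if (wanted.length === 0) return false;
  // The last segment is the file name, so only directories are candidates.
  const dirs = segments.slice(0, -1);
  for (let start = 0; start + wanted.length <= dirs.length; start++) {
    if (wanted.every((seg, i) => dirs[start + i] === seg)) return true;
  }
  return false;
}
```

Comparing whole segments is what keeps a folder such as `my-assistants` from matching `assistants`, and a bare id such as `foo` has no directory segment to match at all.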

**Verified locally (no network):**
- `npm run build` (tsc --noEmit) clean at every commit.
- `speaker` and `mic` load under `createRequire(import.meta.url)` in ESM (per the `requested improvements.md` smoke test command).
- `git status --porcelain --untracked-files=all -z` expands untracked dirs to individual files on macOS/APFS (the mechanism the P0-6 fix relies on).

**Manual smoke test flagged for reviewer** (requires real terminal / real org):
- Interactive pull on a populated test org: confirm the `Overwrite locally modified files?` prompt defaults to No and only forwards `--force` when user explicitly answers Yes.
- Interactive cleanup on a populated test org: confirm the spawn passes `--force --confirm <slug>` so the destructive subprocess proceeds.
- Fresh-clone preservation: delete `resources/<org>/` and `.vapi-state.<org>.json`, run `npm run pull -- <org> --bootstrap`, then `npm run pull -- <org>`, edit one assistant, run pull again and confirm `✏️ <id> (locally modified, preserving)` fires (P0-6 end-to-end).
dhruva-reddy added a commit that referenced this pull request Apr 23, 2026
…, and rename gotchas (#12)

## Describe your changes

Adds eight battle-tested learnings entries to the gitops template, sourced from the `customers/amazon/gitops-amazon3p` deployment tracker (§7 "Known footguns"). All entries are tracker-validated against specific iterations (Iter13–Iter24), generalized to strip customer/business specifics, and concentrated in the existing learnings files most likely to be consulted before each gotcha bites.

**`docs/learnings/squads.md`** — three new sections, all gitops-relevant squad gotchas not previously covered:

- **Inline `model.messages` in `assistantOverrides` silently shadows the assistant `.md`.** Surfaced as an 8.6k-char inline prompt drifting from its source `.md` over multiple iterations. Recommends keeping the `.md` as single source of truth and using a second assistant file when a different prompt is genuinely needed.
- **`firstMessage` replays on every handoff re-entry.** Default `firstMessageMode` is `assistant-speaks-first`, which fires on every control transfer back to that assistant — not just call start. Recommends `firstMessage: ""` + `assistant-speaks-first-with-model-generated-message` for any non-terminal squad member, plus a "RE-ENTRY PROTOCOL" prompt block.
- **Two silence handlers fire at once when both are configured.** `messagePlan.idleMessages` (per-assistant) and `customer.speech.timeout` hooks (per-assistant or via `membersOverrides.hooks`) both fire on the same silence event. Recommends picking one — squad-level hooks usually preferable for `triggerMaxCount` / `triggerResetMode` support.

**`docs/learnings/assistants.md`** — refines an existing entry and adds a Cartesia subsection:

- **`numWords: 2` produces a 500–800ms TTS overlap window** that drops STT confidence on barge-in, often filtering out the customer's first sentence after interrupting. The existing entry only documented the "lower = more interruptible" tradeoff; the new entry explains why `numWords: 1` + Krisp denoising is the recommended pairing.
- **Cartesia-specific config gotchas** table covering rejected ElevenLabs-only fields (`enableSsmlParsing`, top-level `voice.speed`, `voice.stability`, `voice.similarityBoost`), plus the correct nested paths for `generationConfig.speed` and `generationConfig.experimental.accentLocalization`.
- **Cartesia Sonic-3 mangles em-dashes and SSML `<break>` tags** — pacing should use commas/periods/semicolons instead.

**`docs/learnings/multilingual.md`** — adds:

- **English-heavy `keyterm` arrays bias Deepgram `language: multi` toward English.** The language-ID step uses partial transcripts; an English-leaning `keyterm` tilts that signal, sending non-English audio through the English pipeline and producing low-confidence transcripts that get filtered. Recommends Gladia Solaria for code-switching customers, with a fallback recommendation for staying on Deepgram.

**`AGENTS.md`** — adds a "Renaming an existing resource" subsection under Naming Conventions:

- Documents that the engine's `name_mismatch` guard auto-bootstraps state from the dashboard before applying, so manually editing `.vapi-state.<org>.json` to repoint a renamed file at an existing UUID does not work. Lays out the two correct paths (rename locally + new UUID + cleanup orphan, or rename in dashboard + pull preserves UUID).

## Relevant Context (linear ticket, slack link, etc)

Surfaced during a sweep of the Amazon 3P deployment tracker's "Known footguns" section after several of these gotchas hit consecutive iterations on a single customer. Eight of the sixteen tracked footguns were already covered by existing learnings (`call-duration.md` for LLM wall-clock, `transfers.md` for routing-bias, etc.) or are out of scope for the gitops template (dashboard UX bugs, customer-specific call-flow design). The remaining eight were filed here.

**Engine-logic candidate flagged for follow-up, not included in this PR:** the inline-`model.messages` shadow in squad overrides (#7 in the tracker) is plausibly a `npm run push` lint-warn opportunity — but legitimate cases for partial `model` overrides exist (temperature, tool sets), so a precise rule would need to target only the `messages` field. The doc warning is the right first step; the engine warning can come once the rule is well-scoped.

## API Changes

- **Is this changing the public API?**

  - [ ] Yes
  - [x] No

- **If yes, is it backward‐compatible?**
  - [ ] Yes
  - [ ] No

N/A — docs-only change. No engine or API surface modifications.

Non backward-compatible changes might break customers' agents. Please proceed with care and notify the team.

## How did you test this?

- `npm run build` (tsc --noEmit) passes.
- Each new claim cross-checked against the tracker iteration that surfaced it (Iter13–Iter24, listed in the "First surfaced" column of `customers/amazon/gitops-amazon3p/TRACKER.md` §7).
- Generalization sweep run on the diff: `rg -i 'amazon|youssaf|mudflap|notable|zeals|3p|iform|FBA'` against the changed files returns only one false-positive hit (the literal string "duplication" in `squads.md`); no customer or business-specific terms leaked.
- Internal cross-references (`call-duration.md`, `assistants.md` interior links) verified to point at existing sections.
dhruva-reddy added a commit that referenced this pull request May 2, 2026
**Problem.** Today: you pull, your teammate edits the same assistant
on the dashboard during a live test, you push your unrelated branch,
and their dashboard edit disappears with no warning. Customer-success
reps update business hours via the dashboard; the next gitops push
silently reverts them. Even `git revert + push` rollbacks have the
same problem — they overwrite whatever's currently live, not just the
change being reverted. The engine had no way to detect this because
the state file only stored name→UUID, no record of the platform's
content at last pull.

**What this fix does.** Now that Stack F populates `lastPulledHash`,
drift detection becomes possible. Before each PATCH, the engine GETs
the current platform payload, hashes it, and compares to the
`lastPulledHash` in state.

  - Hashes match → continue silently.
  - Hashes differ + no flag → **refuse the push**, point at the
    drift, ask the operator to either pull-and-resolve or pass
    `--overwrite` to take ownership.
  - Hashes differ + `--overwrite` → log "overwriting drift" and
    proceed.
  - No baseline (legacy state, first push after Stack F) → log
    "drift unknown — proceeding" and don't block.

Also adds a specific helper for the **Cartesia voice picker**
footgun: if `pronunciationDictId` was set at last pull but isn't on
the platform now, surface that explicitly so the operator notices.

**Outcome you'll notice.** Concurrent dashboard edits no longer
disappear silently. If someone else touched a resource between your
pull and your push, you see the conflict at push time and have to
make an explicit call (overwrite, or pull and resolve). The engine
becomes a real safety rail rather than a blind PATCH machine.

---

Before each PATCH, GET the current platform payload, hash it, and
compare to the lastPulledHash recorded in state (Stack F). If the
hashes differ, the dashboard has drifted away from the version we last
pulled — refuse to push without --overwrite.

Behavior matrix:
- No lastPulledHash (legacy state, first push after Stack F): log
  "drift unknown — proceeding" and continue. Don't block.
- Hashes match: continue silently.
- Hashes differ + no --overwrite: refuse the push, return null.
- Hashes differ + --overwrite: log "overwriting drift" and continue.
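
The behavior matrix above can be sketched as a pure decision function. This is an illustration under assumptions: the real `checkDriftForUpdate` performs the GET itself, whereas `decideDrift` here takes the already-fetched payload, and the sorted-key canonical form, reason names, and messages are hypothetical.

```typescript
import { createHash } from "node:crypto";

type DriftReason = "no-baseline" | "match" | "drift-refused" | "drift-overwritten";
interface DriftCheckResult {
  ok: boolean;
  reason: DriftReason;
  message: string;
}

// Assumption: hash a canonical JSON form with sorted keys so that key
// order alone never registers as drift.
function canonicalize(value: unknown): string {
  if (value === undefined) return "null";
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) return `[${value.map((v) => canonicalize(v)).join(",")}]`;
  const obj = value as Record<string, unknown>;
  const body = Object.keys(obj)
    .filter((k) => obj[k] !== undefined)
    .sort()
    .map((k) => `${JSON.stringify(k)}:${canonicalize(obj[k])}`);
  return `{${body.join(",")}}`;
}

function hashPayload(payload: unknown): string {
  return createHash("sha256").update(canonicalize(payload)).digest("hex");
}

function decideDrift(
  lastPulledHash: string | undefined,
  platformPayload: unknown,
  overwrite: boolean,
): DriftCheckResult {
  if (lastPulledHash === undefined) {
    return { ok: true, reason: "no-baseline", message: "drift unknown, proceeding" };
  }
  if (hashPayload(platformPayload) === lastPulledHash) {
    return { ok: true, reason: "match", message: "" };
  }
  if (overwrite) {
    return { ok: true, reason: "drift-overwritten", message: "overwriting drift" };
  }
  return {
    ok: false,
    reason: "drift-refused",
    message: "platform changed since last pull; pull and resolve, or pass --overwrite",
  };
}
```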

Files:
- src/drift.ts (NEW): checkDriftForUpdate(endpoint, state, overwrite).
  GETs platform, strips server-managed fields (id/orgId/createdAt/etc)
  to align hash basis with cleanResource()'s output, sha256 compares.
  Returns DriftCheckResult with reason and message for caller logging.
- src/state-serialize.ts: checkPronunciationDictDrop helper for the
  Cartesia voice-picker case (improvements.md #7) — pure data, safe
  to import in tests.
- src/config.ts: --overwrite flag.
- src/push.ts: drift gate in upsertResourceWithStateRecovery before
  every PATCH. Skipped in dry-run (operator wants to see what would
  happen). Skipped if no baseline.
- tests/drift.test.ts: hash-match → ok, hash-differ-no-overwrite → ok=false,
  hash-differ-overwrite → ok=true, no-baseline → ok=true.

Closes improvements.md #1, #7. Partial #2 (push side caught; pull side
same-file conflict still requires manual resolution).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---

## Update — 11labs `pronunciationDictionaryLocators` array also covered

`checkPronunciationDictDrop` now detects drops in both pronunciation-
dictionary shapes Vapi exposes:

- **11labs** (the documented shape):
  `voice.pronunciationDictionaryLocators[]` — array of
  `{ pronunciationDictionaryId, versionId }`. We warn on N → M shrinks
  (M < N) including N → 0 and array-going-missing.
- **Cartesia** (passthrough — not in Vapi docs but observed):
  `voice.pronunciationDictId` — single string id. Existing 1 → 0
  detection unchanged.

Reference: https://docs.vapi.ai/assistants/pronunciation-dictionaries

Six new test cases pin the 11labs behavior: array clear (1 → 0), shrink
(2 → 1), array-going-missing entirely, no-op when unchanged, no-op when
locators are added (additive growth shouldn't warn), and the defensive
hybrid case where a payload carries both shapes.
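
A sketch of the two detection rules described in this update, written as a pure function. The `VoiceLike` type, the function signature, and the warning text are assumptions; only the two field names and the shrink rules come from the text above.

```typescript
interface VoiceLike {
  pronunciationDictId?: string | null; // Cartesia passthrough shape
  pronunciationDictionaryLocators?: unknown[] | null; // 11labs shape
}

function locatorCount(voice: VoiceLike | undefined): number {
  const locators = voice?.pronunciationDictionaryLocators;
  return Array.isArray(locators) ? locators.length : 0;
}

// Hypothetical signature. Returns warnings; an empty array means
// nothing was dropped between the last pull and the current payload.
function checkPronunciationDictDrop(
  lastPulled: VoiceLike | undefined,
  current: VoiceLike | undefined,
): string[] {
  const warnings: string[] = [];
  // Cartesia: a single id that was set and is now missing (1 -> 0).
  if (lastPulled?.pronunciationDictId && !current?.pronunciationDictId) {
    warnings.push("voice.pronunciationDictId was set at last pull but is missing now");
  }
  // 11labs: warn on any shrink, including the array going missing.
  const before = locatorCount(lastPulled);
  const after = locatorCount(current);
  if (after < before) {
    warnings.push(`voice.pronunciationDictionaryLocators shrank from ${before} to ${after}`);
  }
  return warnings;
}
```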
dhruva-reddy added a commit that referenced this pull request May 2, 2026
**Problem.** Today: you pull, your teammate edits the same assistant
on the dashboard during a live test, you push your unrelated branch,
and their dashboard edit disappears with no warning. Customer-success
reps update business hours via the dashboard; the next gitops push
silently reverts them. Even `git revert + push` rollbacks have the
same problem — they overwrite whatever's currently live, not just the
change being reverted. The engine had no way to detect this because
the state file only stored name→UUID, no record of the platform's
content at last pull.

**What this fix does.** Now that Stack F populates `lastPulledHash`,
drift detection becomes possible. Before each PATCH, the engine GETs
the current platform payload, hashes it, and compares to the
`lastPulledHash` in state.

  - Hashes match → continue silently.
  - Hashes differ + no flag → **refuse the push**, point at the
    drift, ask the operator to either pull-and-resolve or pass
    `--overwrite` to take ownership.
  - Hashes differ + `--overwrite` → log "overwriting drift" and
    proceed.
  - No baseline (legacy state, first push after Stack F) → log
    "drift unknown — proceeding" and don't block.

Also adds a specific helper for the **Cartesia voice picker**
footgun: if `pronunciationDictId` was set at last pull but isn't on
the platform now, surface that explicitly so the operator notices.

**Outcome you'll notice.** Concurrent dashboard edits no longer
disappear silently. If someone else touched a resource between your
pull and your push, you see the conflict at push time and have to
make an explicit call (overwrite, or pull and resolve). The engine
becomes a real safety rail rather than a blind PATCH machine.

---

Before each PATCH, GET the current platform payload, hash it, and
compare to the lastPulledHash recorded in state (Stack F). If the
hashes differ, the dashboard has drifted away from the version we last
pulled — refuse to push without --overwrite.

Behavior matrix:
- No lastPulledHash (legacy state, first push after Stack F): log
  "drift unknown — proceeding" and continue. Don't block.
- Hashes match: continue silently.
- Hashes differ + no --overwrite: refuse the push, return null.
- Hashes differ + --overwrite: log "overwriting drift" and continue.

Files:
- src/drift.ts (NEW): checkDriftForUpdate(endpoint, state, overwrite).
  GETs platform, strips server-managed fields (id/orgId/createdAt/etc)
  to align hash basis with cleanResource()'s output, sha256 compares.
  Returns DriftCheckResult with reason and message for caller logging.
- src/state-serialize.ts: checkPronunciationDictDrop helper for the
  Cartesia voice-picker case (improvements.md #7) — pure data, safe
  to import in tests.
- src/config.ts: --overwrite flag.
- src/push.ts: drift gate in upsertResourceWithStateRecovery before
  every PATCH. Skipped in dry-run (operator wants to see what would
  happen). Skipped if no baseline.
- tests/drift.test.ts: hash-match → ok, hash-differ-no-overwrite → ok=false,
  hash-differ-overwrite → ok=true, no-baseline → ok=true.

Closes improvements.md #1, #7. Partial #2 (push side caught; pull side
same-file conflict still requires manual resolution).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---

## Update — 11labs `pronunciationDictionaryLocators` array also covered

`checkPronunciationDictDrop` now detects drops in both pronunciation-
dictionary shapes Vapi exposes:

- **11labs** (the documented shape):
  `voice.pronunciationDictionaryLocators[]` — array of
  `{ pronunciationDictionaryId, versionId }`. We warn on N → M shrinks
  (M < N) including N → 0 and array-going-missing.
- **Cartesia** (passthrough — not in Vapi docs but observed):
  `voice.pronunciationDictId` — single string id. Existing 1 → 0
  detection unchanged.

Reference: https://docs.vapi.ai/assistants/pronunciation-dictionaries

Six new test cases pin the 11labs behavior: array clear (1 → 0), shrink
(2 → 1), array-going-missing entirely, no-op when unchanged, no-op when
locators are added (additive growth shouldn't warn), and the defensive
hybrid case where a payload carries both shapes.
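
The detection rules for both shapes can be sketched as follows. `detectPronunciationDictDrop` and `VoiceLike` are hypothetical stand-ins, since the PR does not show the real helper's signature or return type; the sketch compares locator counts only, as described above.

```typescript
// Minimal voice shape covering only the two fields discussed above.
interface VoiceLike {
  pronunciationDictionaryLocators?: { pronunciationDictionaryId: string; versionId: string }[];
  pronunciationDictId?: string;
}

// Returns one warning per detected drop, or an empty array when nothing shrank.
function detectPronunciationDictDrop(
  atLastPull: VoiceLike | undefined,
  onPlatform: VoiceLike | undefined,
): string[] {
  const warnings: string[] = [];

  // 11labs shape: warn on any N -> M shrink (M < N), treating a missing array as 0.
  const before = atLastPull?.pronunciationDictionaryLocators?.length ?? 0;
  const after = onPlatform?.pronunciationDictionaryLocators?.length ?? 0;
  if (after < before) {
    warnings.push(`pronunciationDictionaryLocators shrank: ${before} -> ${after}`);
  }

  // Cartesia shape: warn when a single id was set at last pull and is now gone.
  if (atLastPull?.pronunciationDictId && !onPlatform?.pronunciationDictId) {
    warnings.push("pronunciationDictId was set at last pull but is missing on the platform");
  }

  return warnings;
}
```

Returning a list rather than a boolean lets the hybrid case, where one payload carries both shapes, report each drop separately.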

dhruva-reddy added a commit that referenced this pull request May 2, 2026

**Problem.** Today: you pull, your teammate edits the same assistant
on the dashboard during a live test, you push your unrelated branch,
and their dashboard edit disappears with no warning. Customer-success
reps update business hours via the dashboard; the next gitops push
silently reverts them. Even `git revert + push` rollbacks have the
same problem — they overwrite whatever's currently live, not just the
change being reverted. The engine had no way to detect this because
the state file only stored name→UUID, no record of the platform's
content at last pull.

**What this fix does.** Now that Stack F populates `lastPulledHash`,
drift detection becomes possible. Before each PATCH, the engine GETs
the current platform payload, hashes it, and compares to the
`lastPulledHash` in state.

  - Hashes match → continue silently.
  - Hashes differ + no flag → **refuse the push**, point at the
    drift, ask the operator to either pull-and-resolve or pass
    `--overwrite` to take ownership.
  - Hashes differ + `--overwrite` → log "overwriting drift" and
    proceed.
  - No baseline (legacy state, first push after Stack F) → log
    "drift unknown — proceeding" and don't block.

Also adds a specific helper for the **Cartesia voice picker**
footgun: if `pronunciationDictId` was set at last pull but isn't on
the platform now, surface that explicitly so the operator notices.

**Outcome you'll notice.** Concurrent dashboard edits no longer
disappear silently. If someone else touched a resource between your
pull and your push, you see the conflict at push time and have to
make an explicit call (overwrite, or pull and resolve). The engine
becomes a real safety rail rather than a blind PATCH machine.
