fix(call): clear wrapped partial transcripts cleanly in npm run call #11

Merged
dhruva-reddy merged 1 commit into main from fix/call-transcript-duplicate-display
Apr 23, 2026

@dhruva-reddy
Contributor

Describe your changes

Fixes a long-standing display bug in npm run call where partial assistant transcripts visibly duplicate themselves in the terminal, leaving stacks of 🤖 Assistant: ... lines on screen instead of overwriting cleanly.

  • Root cause: handleControlMessage in src/call.ts was repainting partials with \r + N spaces + \r. \r only returns the cursor to column 0 of the current terminal row, so when a partial wraps to multiple rows (very common for assistant utterances on an 80-col terminal — the example that surfaced this was 110 cells wide), only the bottom row gets cleared. Every wrapped row above stays on screen, and the next partial paints another wrapping block on top.
  • Fix: Compute the previous partial's row count from process.stdout.columns and an estimated display width that treats emoji and CJK glyphs as 2 cells, then walk the cursor up with readline.moveCursor / readline.cursorTo / readline.clearLine to erase every row, not just the bottom one.
  • Apply the same clear before printing speech-update events, the call-ended message, the WebSocket onclose log, and the SIGINT/SIGTERM cleanup banner so they don't print on top of an in-flight partial either.
  • Non-TTY mode: When process.stdout.isTTY is false (piping to tee, CI logs, redirecting to a file), \r was producing literal carriage returns and concatenated lines. We now skip live partial overwrites entirely in non-TTY contexts and only emit finals — one transcript per line, clean logs.
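
The multi-row clear described above can be sketched roughly as follows. This is a minimal illustration, not the code in src/call.ts: `rowsFor` and `clearRows` are hypothetical names, and the sketch assumes the cursor is sitting on the last row of the previously written partial and that the partial's display width is already known.

```typescript
import * as readline from "node:readline";

// Hypothetical helper: how many terminal rows a string of `width` cells
// occupies on a terminal that is `columns` cells wide.
function rowsFor(width: number, columns: number): number {
  if (columns <= 0) return 1; // columns can be undefined/0 when not a TTY
  return Math.max(1, Math.ceil(width / columns));
}

// Hypothetical stand-in for a clearWrittenLine-style helper. Starting on the
// bottom row of the wrapped block, clear each row and step up one row at a
// time, ending at column 0 of the top row.
function clearRows(stream: NodeJS.WritableStream, rows: number): void {
  for (let i = 0; i < rows; i++) {
    readline.cursorTo(stream, 0); // column 0 of the current row
    readline.clearLine(stream, 0); // 0 = clear the entire row
    if (i < rows - 1) {
      readline.moveCursor(stream, 0, -1); // up one row
    }
  }
}
```

A caller would do something like `clearRows(process.stdout, rowsFor(previousWidth, process.stdout.columns ?? 80))` before writing the next partial. One caveat worth checking in a real terminal: when the width is an exact multiple of the column count, some terminals leave the cursor in a pending-wrap state, so the row estimate may be off by one in that case.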

No new dependencies. The change is confined to one existing function plus two small helpers (getDisplayWidth, clearWrittenLine) below connectWebSocket.

Relevant Context (linear ticket, slack link, etc)

Surfaced during a manual simulated call (npm run call -- <org> -s <squad>) — every assistant turn was producing 5–10 stacked copies of the same 🤖 Assistant: ... Seller C partial in the terminal, with long stretches of trailing whitespace between them. The duplication signature (identical content, growing whitespace gaps) traced back to \r only clearing one wrapped row instead of all of them.

API Changes

  • Is this changing the public API?

    • Yes
    • No
  • If yes, is it backward‐compatible?

    • Yes
    • No

N/A. This repo is an internal gitops CLI; the only user-visible change is that npm run call now renders partial transcripts cleanly in TTY and emits one line per final transcript in non-TTY contexts. No flags, no commands, no on-disk schema, no platform requests change.

Non backward-compatible changes might break customers' agents. Please proceed with care and notify the team.

How did you test this?

Automated:

  • npx tsc --noEmit — clean.
  • npm test — 33/33 passing (no new tests added; existing suite covers the credential walker, path matching, cleanup safety, CLI arg parsing, and clean-resource regression set).

Width math sanity check:

  • The exact partial that was duplicating in the reported call (🤖 Assistant: Understood. Let's pick up right where we left off. The last instruction was: first open Seller C) has a display width of 110 cells under the new getDisplayWidth, which correctly resolves to 2 rows on an 80-col terminal, 2 rows at 100 cols, and 1 row at 120+ cols. Short partials and 🎤 You: ... lines correctly resolve to 1 row at all standard widths.
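
The width arithmetic above can be reproduced with a small sketch. `estimateDisplayWidth` below is a hypothetical stand-in for getDisplayWidth, not the PR's code: it assumes a coarse set of code-point ranges for 2-cell glyphs (common CJK blocks and the main emoji blocks) and treats everything else as 1 cell, which is an approximation of how terminals actually measure text.

```typescript
// Rough test for code points that most terminals render 2 cells wide.
// The ranges are an assumption for illustration, not a full East Asian
// Width table.
function isWide(cp: number): boolean {
  return (
    (cp >= 0x1100 && cp <= 0x115f) || // Hangul Jamo
    (cp >= 0x2e80 && cp <= 0xa4cf) || // CJK radicals through Yi
    (cp >= 0xac00 && cp <= 0xd7a3) || // Hangul syllables
    (cp >= 0xf900 && cp <= 0xfaff) || // CJK compatibility ideographs
    (cp >= 0xfe30 && cp <= 0xfe4f) || // CJK compatibility forms
    (cp >= 0xff00 && cp <= 0xff60) || // fullwidth forms
    (cp >= 0xffe0 && cp <= 0xffe6) || // fullwidth signs
    (cp >= 0x1f300 && cp <= 0x1faff) || // main emoji blocks
    (cp >= 0x20000 && cp <= 0x3fffd) // CJK extension planes
  );
}

// Hypothetical stand-in for getDisplayWidth: sum of per-code-point widths.
function estimateDisplayWidth(text: string): number {
  let width = 0;
  for (const ch of text) {
    const cp = ch.codePointAt(0) ?? 0;
    // Zero-width joiner and variation selectors take no cells.
    if (cp === 0x200d || (cp >= 0xfe00 && cp <= 0xfe0f)) continue;
    // Control characters take no cells.
    if (cp < 0x20 || (cp >= 0x7f && cp < 0xa0)) continue;
    width += isWide(cp) ? 2 : 1;
  }
  return width;
}

const partial =
  "🤖 Assistant: Understood. Let's pick up right where we left off. " +
  "The last instruction was: first open Seller C";
const width = estimateDisplayWidth(partial);
const rowsAt = (columns: number): number => Math.max(1, Math.ceil(width / columns));
console.log(width, rowsAt(80), rowsAt(100), rowsAt(120));
```

Under these assumptions the reported partial measures 110 cells (the emoji counts as 2, the remaining 108 characters as 1 each), giving 2 rows at 80 and 100 columns and 1 row at 120.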

Manual smoke test flagged for reviewer (requires real terminal + a configured org with an assistant or squad):

  • npm run call -- <org> -a <name> (or -s <squad>) and have a multi-sentence exchange. Confirm that long assistant partials overwrite in place — no stack of duplicate 🤖 Assistant: ... lines, no leftover wrapped rows.
  • Resize the terminal mid-call to a narrow width (e.g. 50 cols) and continue the conversation; partials should still overwrite cleanly when they wrap to 3+ rows.
  • Pipe the output to a file: npm run call -- <org> -a <name> | tee /tmp/call.log. Confirm the log contains only finals (one per line), no \r artifacts, no duplicated partials.
  • Press Ctrl+C while a partial is on screen; confirm 👋 Ending call... prints cleanly without the partial visible above it.

The live partial-transcript overwrite in `handleControlMessage` used
`\r` + spaces + `\r` to repaint each partial in place. `\r` only moves
the cursor to column 0 of the *current* terminal row, so when a partial
is long enough to wrap (very common for assistant utterances and any
narrow terminal), only the bottom row of the wrapped block is cleared.
Every previous wrapped row stays on screen, and each subsequent partial
paints another wrapping block on top — producing the duplicated stack
of `🤖 Assistant: ... Seller C` lines reported during a recent
simulated call.

Replace the broken sequence with `readline.cursorTo` /
`readline.clearLine` / `readline.moveCursor` driven by an estimated
display width that accounts for emoji and CJK glyphs being 2 cells
wide. Apply the same clear before printing speech-update,
call-ended, ws.onclose, and SIGINT/SIGTERM cleanup messages so they
don't print over a half-rendered partial.

Also gate live partial overwrites on `process.stdout.isTTY`: when the
output is being piped (CI logs, `tee`, etc.), `\r` was producing
literal carriage returns and concatenated lines. In non-TTY mode we
now skip partials entirely and only print finals — one transcript per
line, clean logs.
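
As a rough illustration of the TTY gate described above, the decision can be isolated in a pure function. The `TranscriptEvent` shape and the name `formatForOutput` are assumptions made for this sketch; the real event fields in src/call.ts may differ.

```typescript
// Hypothetical event shape for illustration only.
interface TranscriptEvent {
  kind: "partial" | "final";
  speaker: string; // e.g. "🤖 Assistant" or "🎤 You"
  text: string;
}

// Returns the text to write, or null when nothing should be emitted.
// Finals always produce one newline-terminated line. Partials are only
// produced for a TTY, where the caller can overwrite them in place.
function formatForOutput(event: TranscriptEvent, isTTY: boolean): string | null {
  const line = `${event.speaker}: ${event.text}`;
  if (event.kind === "final") {
    return line + "\n";
  }
  return isTTY ? line : null;
}
```

A caller could pass `Boolean(process.stdout.isTTY)` as the second argument and skip the write when the result is null, so piped output contains only finals, one per line.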

No public API change. `tsc --noEmit` and `npm test` (33/33) pass.
Collaborator

@vtkovapi vtkovapi left a comment

looks good to merge

Contributor Author

dhruva-reddy commented Apr 23, 2026

Merge activity

  • Apr 23, 5:54 PM UTC: A user started a stack merge that includes this pull request via Graphite.
  • Apr 23, 5:54 PM UTC: @dhruva-reddy merged this pull request with Graphite.

@dhruva-reddy dhruva-reddy merged commit e210277 into main Apr 23, 2026
1 check passed
dhruva-reddy added a commit that referenced this pull request May 2, 2026
**Problem.** The Vapi API rejects bad configs at PATCH time with terse
400s ("property speed should not exist") — and by then the push has
already partially completed against other resources. We watched the
same five classes of mistake hit production over and over:

  1. Assistant names (or eval names) longer than 40 chars (silent cap).
  2. Structured-output ↔ assistant lockstep mismatch — one side declares
     the relationship, the other doesn't, dashboard ends up inconsistent.
  3. Prompts duplicated by paste-on-top dashboard edits (10kB prompt
     with two identical headers stacked, agent follows both).
  4. `maxTokens` set lower than the JSON-schema size of the attached
     tools' arguments — assistant looks fine on push, bricks on first
     tool-using call.
  5. Voice fields nested wrong for the provider (`voice.speed` on
     Cartesia, where it lives at `voice.generationConfig.speed`).

**What this fix does.** Five client-side validators, all running off
the same `LoadedResources` shape that `push.ts` would actually ship —
so the lint runs against exactly what would be pushed, no separate
parser to drift. Surfaces as warnings by default (one bad spec doesn't
block an otherwise-good push); promote to abort with `--strict`. Run
standalone via `npm run validate -- <org>`.

**Outcome you'll notice.** Most schema-class mistakes get caught
locally in seconds instead of mid-push 400s. Voice provider field
mismatch gets a specific message pointing at the right path. CI can
add `npm run push -- <env> --strict` as a gate before any deploy.

---

Catch the classes of errors that today only surface when the API returns
a 400 mid-push. The push pipeline runs validation in warn-only mode by
default; --strict promotes errors to a blocking abort before any API
call. Standalone runner via `npm run validate -- <org>`.

Validators implemented:

1. Name length cap (40 chars). Walks every assistant.name and every
   evaluations[].structuredOutput.name in scenarios. Closes #18.
2. SO ↔ assistant bidirectional lockstep. For every SO file's
   assistant_ids, checks the named assistant's structuredOutputIds
   mirrors it; reverse direction too. Closes #11.
3. Prompt duplication heuristics. Same H1 heading appearing twice,
   repeated CONTINUITY ON ENTRY / CLOSEOUT FLOW STRUCTURE blocks.
   Partial fix for #8 (paste-on-top dashboard duplications).
4. maxTokens floor for tool-using assistants. Computes
   floor ≈ 25 + sum(len(JSON.stringify(tool.function.parameters)))
   per attached tool. Warns under floor. Closes #19.
5. Per-provider voice schema. Cartesia rejects top-level speed /
   stability / similarityBoost / enableSsmlParsing (point at
   generationConfig.* / drop the field). 11labs rejects
   generationConfig (it's a Cartesia path). Closes #9 (engine half).
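
To make the maxTokens floor in item 4 concrete, here is a minimal sketch of the arithmetic. The `ToolSpec` shape and the function name are hypothetical, and the sketch assumes the constant 25 is added once per assistant rather than once per tool, which the formula above leaves open.

```typescript
// Hypothetical minimal shape of an attached tool, for illustration only.
interface ToolSpec {
  function?: {
    name?: string;
    parameters?: unknown; // JSON-schema object describing the arguments
  };
}

// floor ≈ 25 + sum of the serialized length of each tool's parameter schema.
// Assumption: the base of 25 is applied once, not per tool.
function estimateMaxTokensFloor(tools: ToolSpec[]): number {
  const BASE = 25;
  let total = BASE;
  for (const tool of tools) {
    const params = tool.function?.parameters;
    if (params !== undefined) {
      total += JSON.stringify(params).length;
    }
  }
  return total;
}

// A validator would warn when the configured maxTokens is under the floor.
function isUnderFloor(maxTokens: number, tools: ToolSpec[]): boolean {
  return tools.length > 0 && maxTokens < estimateMaxTokensFloor(tools);
}
```

Assistants with no tools are never flagged in this sketch, matching the description of the check as applying to tool-using assistants.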

- src/validate.ts (NEW): validateResources(loadedResources) returning
  ValidationFinding[] with severity / type / resourceId / rule / message
  / fieldPath. Pure data; safe to test directly.
- src/validate-cmd.ts (NEW): CLI entry. Loads same resource shape as
  push.ts so the lint runs against exactly what would ship. Exit non-zero
  on any error finding.
- src/config.ts: --strict flag.
- src/push.ts: validators run in default-warn mode; --strict aborts.
- package.json: validate script.
- AGENTS.md: document npm run validate and --strict.
- tests/validate.test.ts: per-rule fixtures (golden + bad inputs)
  covering all five checks.
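
For orientation, the finding shape and the warn-versus-strict behaviour listed above might look roughly like this. Only the field names (severity, type, resourceId, rule, message, fieldPath) come from the commit message; the field types, the severity values, the rule name, and the helpers `checkNameLength` and `shouldAbort` are assumptions made for this sketch.

```typescript
// Field names follow the commit message; the types are assumed.
interface ValidationFinding {
  severity: "error" | "warning";
  type: string; // resource kind, e.g. "assistant"
  resourceId: string;
  rule: string;
  message: string;
  fieldPath: string;
}

const MAX_NAME_LENGTH = 40;

// Hypothetical name-length rule over a map of resourceId -> assistant.
function checkNameLength(
  assistants: Record<string, { name?: string }>
): ValidationFinding[] {
  const findings: ValidationFinding[] = [];
  for (const [resourceId, assistant] of Object.entries(assistants)) {
    const name = assistant.name ?? "";
    if (name.length > MAX_NAME_LENGTH) {
      findings.push({
        severity: "error",
        type: "assistant",
        resourceId,
        rule: "name-length", // hypothetical rule identifier
        message: `name is ${name.length} chars; the cap is ${MAX_NAME_LENGTH}`,
        fieldPath: "name",
      });
    }
  }
  return findings;
}

// Warn-only by default; with --strict, any error finding blocks the push.
function shouldAbort(findings: ValidationFinding[], strict: boolean): boolean {
  return strict && findings.some((f) => f.severity === "error");
}
```

Keeping the rules as pure functions over loaded data, as the commit message describes, is what makes them easy to cover with fixture-based tests.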

Closes improvements.md #11, #18, #19. Resolves engine half of #9.
Partial #8, #20 (heuristic only).

🤖 Generated with [Claude Code](https://claude.com/claude-code)