
fix(cartesia): surface TTS websocket server errors#1534

Merged
davidzhao merged 8 commits into
livekit:mainfrom
bowdens:fix/cartesia-tts-server-errors
Jun 1, 2026

Conversation

@bowdens bowdens commented May 18, 2026

Description

Encountered this issue when Cartesia had a brief outage: our agents were going silent and were not throwing any errors.

The Cartesia TTS plugin's WebSocket receive loop silently swallowed server-returned error frames; it logged them and continued. The base SynthesizeStream never got a thrown error, so tts_error was never emitted, retries never ran, and ttsErrorCounts / maxUnrecoverableErrors escalation never kicked in.

Ports the Python SDK fix (livekit/agents#3028 + #3080).

Changes Made

  • Throw a retryable APIConnectionError on Cartesia error frames so the base SynthesizeStream retry path actually runs.
  • Restructure recvTask branch order to match Python: any frame with done: true — including the error Cartesia returns for empty input on function-call turns — is treated as completion once the sentence stream has closed. Only pure error frames raise. Mirroring #3080 up front avoids shipping the same regression Python hit.
  • Stop recvTask's catch from swallowing APIError, and stop the outer catch from re-wrapping it via toRetryableConnectionError. Without these the new throw never reaches the base class.
  • Patch changeset for @livekit/agents-plugin-cartesia - dunno if that's the right designation for this
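To make the first bullet concrete, here is a minimal sketch contrasting the old receive loop (log and continue) with the throwing variant. The `APIConnectionError` class, the `ServerFrame` type, and both function names are hypothetical stand-ins written for this illustration; they are not the plugin's actual code or the SDK's real class.

```typescript
// Hypothetical stand-in for the SDK's APIConnectionError (assumption, not the real class).
class APIConnectionError extends Error {
  readonly retryable: boolean;
  constructor(message: string, retryable: boolean = true) {
    super(message);
    this.name = 'APIConnectionError';
    this.retryable = retryable;
  }
}

// Simplified frame shapes, assumed for illustration only.
type ServerFrame =
  | { type: 'chunk'; data: string }
  | { type: 'done' }
  | { type: 'error'; error: string };

// Old behaviour: the error frame is logged and dropped, so the caller never sees a failure.
function recvSwallowing(frames: ServerFrame[], log: string[]): number {
  let chunks = 0;
  for (const msg of frames) {
    if (msg.type === 'error') {
      log.push(msg.error);
      continue;
    }
    if (msg.type === 'done') break;
    chunks++;
  }
  return chunks;
}

// New behaviour: the error frame throws, so a retry path in the caller can run.
function recvThrowing(frames: ServerFrame[]): number {
  let chunks = 0;
  for (const msg of frames) {
    if (msg.type === 'error') {
      throw new APIConnectionError(`Cartesia returned error: ${msg.error}`);
    }
    if (msg.type === 'done') break;
    chunks++;
  }
  return chunks;
}
```

With the swallowing variant, a stream containing an error frame still returns normally, which matches the "agents going silent" symptom described above.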

Pre-Review Checklist

  • Build passes: All builds (lint, typecheck, tests) pass locally.
  • AI-generated code reviewed.
  • Changes explained.
  • Scope appropriate.
  • Video demo. n/a

Testing

  • Automated tests added/updated (if applicable): n/a
  • All tests pass
  • Make sure both restaurant_agent.ts and realtime_agent.ts work properly (for major changes)

Additional Notes

n/a

bowdens added 2 commits May 18, 2026 15:01
Cartesia error frames received over the synthesis WebSocket were
logged and dropped, so the base SynthesizeStream never saw a thrown
error and tts_error was never emitted. Throw a retryable
APIConnectionError so _mainTaskImpl can retry up to
connOptions.maxRetry times and then emit tts_error with
recoverable: false once retries are exhausted.

Also stop the recvTask catch from swallowing APIError, and stop the
outer catch from double-wrapping it via toRetryableConnectionError.
Cartesia error frames received over the synthesis WebSocket were
logged and dropped, so the base SynthesizeStream never saw a thrown
error and tts_error was never emitted. Throw a retryable
APIConnectionError so _mainTaskImpl can retry up to
connOptions.maxRetry times and then emit tts_error with
recoverable: false once retries are exhausted.

Restructure the recv loop to mirror the Python plugin's branch order
(livekit/agents#3028 + #3080): any frame with `done: true` —
including the error returned for empty/whitespace input on
function-call turns — is treated as completion once the sentence
stream has been closed, instead of being raised and retried. Only a
"pure" error frame (no done:true) raises APIConnectionError.

Also stop the recvTask catch from swallowing APIError, and stop the
outer catch from double-wrapping it via toRetryableConnectionError.
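The retry behaviour these commit messages rely on can be sketched roughly as follows. This is not the base class's `_mainTaskImpl`; `runWithRetries`, `TtsErrorEvent`, and the `retryable` property check are assumptions made for illustration, and the loop is synchronous for brevity where the real one would be asynchronous.

```typescript
// Hypothetical event shape; only `recoverable` is taken from the commit message above.
type TtsErrorEvent = { type: 'tts_error'; recoverable: boolean; message: string };

// Runs `attempt` once, then retries up to `maxRetry` more times while the
// thrown error is marked retryable. Emits a single non-recoverable tts_error
// when retries are exhausted or the error is not retryable.
function runWithRetries(
  attempt: () => void,
  maxRetry: number,
  emit: (ev: TtsErrorEvent) => void,
): boolean {
  for (let i = 0; i <= maxRetry; i++) {
    try {
      attempt();
      return true;
    } catch (err) {
      const retryable = (err as { retryable?: boolean } | null)?.retryable === true;
      if (retryable && i < maxRetry) continue; // try again
      const message = err instanceof Error ? err.message : String(err);
      emit({ type: 'tts_error', recoverable: false, message });
      return false;
    }
  }
  return false;
}
```

The point of the sketch is that nothing in this loop runs unless the receive task actually throws, which is why swallowing the error frame disabled both the retries and the final `tts_error`.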
@changeset-bot

changeset-bot Bot commented May 18, 2026

🦋 Changeset detected

Latest commit: f6d32e2

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 33 packages
Name Type
@livekit/agents-plugin-cartesia Patch
@livekit/agents Patch
@livekit/agents-plugin-anam Patch
@livekit/agents-plugin-assemblyai Patch
@livekit/agents-plugin-baseten Patch
@livekit/agents-plugin-bey Patch
@livekit/agents-plugin-cerebras Patch
@livekit/agents-plugin-deepgram Patch
@livekit/agents-plugin-elevenlabs Patch
@livekit/agents-plugin-fishaudio Patch
@livekit/agents-plugin-google Patch
@livekit/agents-plugin-hedra Patch
@livekit/agents-plugin-hume Patch
@livekit/agents-plugin-inworld Patch
@livekit/agents-plugin-lemonslice Patch
@livekit/agents-plugin-liveavatar Patch
@livekit/agents-plugin-livekit Patch
@livekit/agents-plugin-minimax Patch
@livekit/agents-plugin-mistral Patch
@livekit/agents-plugin-mistralai Patch
@livekit/agents-plugin-neuphonic Patch
@livekit/agents-plugin-openai Patch
@livekit/agents-plugin-perplexity Patch
@livekit/agents-plugin-phonic Patch
@livekit/agents-plugin-resemble Patch
@livekit/agents-plugin-rime Patch
@livekit/agents-plugin-runway Patch
@livekit/agents-plugin-sarvam Patch
@livekit/agents-plugin-silero Patch
@livekit/agents-plugin-tavus Patch
@livekit/agents-plugin-trugen Patch
@livekit/agents-plugin-xai Patch
@livekit/agents-plugins-test Patch



@devin-ai-integration devin-ai-integration Bot left a comment


✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 4 additional findings.

Comment thread plugins/cartesia/src/tts.ts Outdated
… errors

Per @charlotte-zhuang's review: Cartesia error frames also carry
`done: true`, so the prior `serverMsg.done === true` branch swallowed
real mid-stream errors instead of letting them throw to retry.

Use isDoneMessage (type === 'done') for the completion branch so all
errors reach the error branch. To preserve the empty-transcript fix
that motivated mirroring Python's #3080 — Cartesia rejects empty /
whitespace-only input with an error frame on function-call turns
where the LLM emits no spoken text — track whether any non-empty
token was sent, and in the error branch treat that specific case as
benign completion instead of retrying.

This is strictly better than the Python plugin, which catches error
frames in its `data.get("done")` branch and silently swallows them
when the tokenizer is open.
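The client-side heuristic described in this commit message can be sketched as below. `TokenTracker` and `onErrorFrame` are hypothetical names chosen for illustration; the commit itself only states that it tracks whether any non-empty token was sent and treats an error frame as benign when none was.

```typescript
// Tracks whether any token with non-whitespace content was sent to the server.
class TokenTracker {
  sentNonEmptyToken = false;

  push(token: string): void {
    if (token.trim().length > 0) {
      this.sentNonEmptyToken = true;
    }
  }
}

// Decides what an error frame means under this heuristic:
// no real text was sent -> the server rejected empty input, treat as completion;
// real text was sent    -> a genuine failure, let the caller retry.
function onErrorFrame(tracker: TokenTracker): 'complete' | 'retry' {
  return tracker.sentNonEmptyToken ? 'retry' : 'complete';
}
```

The trade-off is that the client is inferring the server's reason for the error rather than reading it from the frame.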

@charlotte-zhuang charlotte-zhuang left a comment


I'm a bit concerned with the new change where the client tries to decipher server errors and think that the server errors could serve as a good source of truth.

I also think the new "reducer-like" pattern of branching on server message type is a bit confusing now since the "done" logic would have to run in 2 places.

LMK what you think!

I'm also happy to stack a PR on top of yours with what I had in mind if that's easier.

Comment on lines -437 to -440
          if (isErrorMessage(serverMsg)) {
            this.#logger.error({ error: serverMsg.error }, 'Cartesia returned error');
            continue;
          }

Rather than tracking non-fatal errors client-side, can you check the status_code from the error message?

I also think that it would be great to move error handling back to the top to avoid duplicating the "done" logic, but that's no big deal.

          // Handle error messages
          if (isErrorMessage(serverMsg)) {
            // Do not retry the connection on 4xx errors
            // since they can be safely ignored, e.g. empty transcripts
            if (400 <= serverMsg.status_code && serverMsg.status_code < 500) {
              this.#logger.debug({ error: serverMsg.error }, 'Cartesia sent a non-fatal error');
            } else {
              this.#logger.error({ error: serverMsg.error }, 'Cartesia returned error');
              throw new APIConnectionError({
                message: `Cartesia returned error: ${serverMsg.error}`,
                options: { retryable: true },
              });
            }
          }

Comment thread plugins/cartesia/src/tts.ts Outdated
Comment on lines +489 to +505
          } else if (this.#opts.wordTimestamps !== false && hasWordTimestamps(serverMsg)) {
            const wordTimestamps = serverMsg.word_timestamps;
            for (let i = 0; i < wordTimestamps.words.length; i++) {
              const word = wordTimestamps.words[i];
              const startTime = wordTimestamps.start[i];
              const endTime = wordTimestamps.end[i];
              if (word !== undefined && startTime !== undefined && endTime !== undefined) {
                pendingTimedTranscripts.push(
                  createTimedString({
                    text: word + ' ', // Add space after word for consistency
                    startTime,
                    endTime,
                  }),
                );
              }
            }
          } else if (isErrorMessage(serverMsg)) {

In combination with my previous comment, it could be nice to revert this change so the code looks like this:

  1. parse message
  2. if fatal error, throw to retry. otherwise, log and go to step 3
  3. if there are timestamps, emit them
  4. if there is audio, emit it
  5. if the message is "type": "done" or "type": "error" AND "done": true, send the last frame
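The five steps above can be sketched as a single handler. The `ServerMessage` shape, the `Outcome` accumulator, and `handleMessage` are assumptions for illustration rather than the plugin's code; in particular, treating a missing `status_code` as fatal is a choice made here, not something the thread specifies.

```typescript
// Assumed, simplified message shape; `data` is a plain string here.
type ServerMessage = {
  type: 'chunk' | 'timestamps' | 'done' | 'error';
  done?: boolean;
  status_code?: number;
  error?: string;
  data?: string;
  word_timestamps?: { words: string[]; start: number[]; end: number[] };
};

type Outcome = { words: string[]; audio: string[]; finished: boolean; log: string[] };

function handleMessage(raw: string, out: Outcome): void {
  // 1. parse message
  const msg = JSON.parse(raw) as ServerMessage;

  // 2. fatal error -> throw so the caller can retry; 4xx -> log and fall through
  if (msg.type === 'error') {
    const status = msg.status_code ?? 500; // assumption: no status means fatal
    if (status >= 400 && status < 500) {
      out.log.push(`non-fatal: ${msg.error}`);
    } else {
      throw new Error(`Cartesia returned error: ${msg.error}`);
    }
  }

  // 3. emit timestamps, if any
  if (msg.word_timestamps) {
    for (const word of msg.word_timestamps.words) out.words.push(word);
  }

  // 4. emit audio, if any
  if (msg.data) out.audio.push(msg.data);

  // 5. one closing branch for both "done" frames and error frames with done: true
  if (msg.type === 'done' || (msg.type === 'error' && msg.done === true)) {
    out.finished = true;
  }
}
```

Because step 2 either throws or falls through, the closing logic in step 5 lives in one place, which is the duplication concern raised in the review.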

bowdens added 3 commits May 20, 2026 10:41
…rors

Replace the client-side sentNonEmptyToken heuristic with a check against
the status_code on Cartesia's error frame. 4xx (e.g. empty-transcript on
function-call turns) is logged and finishes cleanly; 5xx bubbles up so the
base SynthesizeStream can retry.
…e frames

Restructure the recv-loop dispatch so error handling runs first (log on 4xx,
throw on 5xx) and a single branch handles closing for both `type:"done"` and
`type:"error"` frames carrying `done:true`. Removes the duplicated close
sequence that previously lived in both isDoneMessage and isErrorMessage.

bowdens commented May 21, 2026

I'm a bit concerned with the new change where the client tries to decipher server errors and think that the server errors could serve as a good source of truth.

I also think the new "reducer-like" pattern of branching on server message type is a bit confusing now since the "done" logic would have to run in 2 places.

LMK what you think!

I'm also happy to stack a PR on top of yours with what I had in mind if that's easier.

@charlotte-zhuang Thank you for your patience in reviewing this!

I've made some changes to address your comments, let me know if it lines up with your thinking.


@charlotte-zhuang charlotte-zhuang left a comment


thanks for the fix!

@charlotte-zhuang

FYI I couldn't actually test the changes since I'm not sure how to run agents-js now that Agents Playground is gone.

@charlotte-zhuang

I tested it and it seems to work well!

Comment thread plugins/cartesia/src/tts.ts Outdated
              this.#logger.debug({ error: serverMsg.error }, 'Cartesia sent a non-fatal error');
            } else {
              this.#logger.error({ error: serverMsg.error }, 'Cartesia returned error');
              throw new APIConnectionError({
Member

Since we have a status code here, we should throw an APIStatusError instead, which carries that detail.

Contributor Author

🙏 f6d32e2

@CLAassistant

CLAassistant commented May 31, 2026

CLA assistant check
All committers have signed the CLA.

@bowdens bowdens force-pushed the fix/cartesia-tts-server-errors branch from c908917 to 7c5147d Compare June 1, 2026 00:07
Per review feedback, the fatal (5xx) Cartesia error frame now raises an
APIStatusError carrying the server status_code instead of a generic
APIConnectionError, surfacing the code for diagnostics while keeping the
same retryable behaviour. The non-fatal 4xx fall-through is unchanged.
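A rough sketch of the final shape described in this commit message follows. The `APIStatusError` class here is a local stand-in: the real class comes from the SDK and its constructor signature is not shown in this thread, so the fields and arguments below are assumptions, as is the `errorFromFrame` helper.

```typescript
// Hypothetical stand-in for the SDK's APIStatusError (constructor shape assumed).
class APIStatusError extends Error {
  readonly statusCode: number;
  readonly retryable: boolean;
  constructor(message: string, statusCode: number, retryable: boolean) {
    super(message);
    this.name = 'APIStatusError';
    this.statusCode = statusCode;
    this.retryable = retryable;
  }
}

// Maps a Cartesia error frame to an error to throw, or null for the
// non-fatal 4xx fall-through (e.g. empty transcript on function-call turns).
function errorFromFrame(frame: { error: string; status_code: number }): APIStatusError | null {
  if (frame.status_code >= 400 && frame.status_code < 500) {
    return null;
  }
  return new APIStatusError(`Cartesia returned error: ${frame.error}`, frame.status_code, true);
}
```

A caller would throw the returned error when it is non-null, so the status code is available to whatever handles the retry or logs the failure.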
@bowdens bowdens force-pushed the fix/cartesia-tts-server-errors branch from 7c5147d to f6d32e2 Compare June 1, 2026 00:08
@davidzhao davidzhao merged commit bf7477a into livekit:main Jun 1, 2026
6 checks passed

4 participants