fix(cartesia): surface TTS websocket server errors by bowdens · Pull Request #1534 · livekit/agents-js

bowdens · 2026-05-18T06:51:32Z

Description

Encountered this issue when Cartesia had a brief outage: our agents were going silent and were not throwing any errors.

The Cartesia TTS plugin's WebSocket receive loop silently swallowed server-returned error frames; it logged them and continued. The base SynthesizeStream never got a thrown error, so tts_error was never emitted, retries never ran, and ttsErrorCounts / maxUnrecoverableErrors escalation never kicked in.

Ports the Python SDK fix (livekit/agents#3028 + #3080).

Changes Made

Throw a retryable APIConnectionError on Cartesia error frames so the base SynthesizeStream retry path actually runs.
Restructure recvTask branch order to match Python: any frame with done: true — including the error Cartesia returns for empty input on function-call turns — is treated as completion once the sentence stream has closed. Only pure error frames raise. Mirroring #3080 up front avoids shipping the same regression Python hit.
Stop recvTask's catch from swallowing APIError, and stop the outer catch from re-wrapping it via toRetryableConnectionError. Without these the new throw never reaches the base class.
Patch changeset for @livekit/agents-plugin-cartesia - dunno if that's the right designation for this
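
As a rough illustration of the branch order described in this list (not the plugin's actual code), the sketch below classifies a frame the way this first revision of the PR describes it. `ServerFrame`, `handleFrame`, and `makeRetryableError` are hypothetical names, and treating a `done: true` frame as a no-op while the sentence stream is still open is an assumption. The review thread further down replaces this approach with a `status_code` check.

```typescript
// Hypothetical frame shape for illustration; the plugin's real message types may differ.
interface ServerFrame {
  type: string;
  done?: boolean;
  error?: string;
}

type Outcome = 'complete' | 'continue';

// Stand-in for the SDK's retryable APIConnectionError (not the real class).
function makeRetryableError(message: string): Error & { retryable: boolean } {
  const err = new Error(message) as Error & { retryable: boolean };
  err.retryable = true;
  return err;
}

function handleFrame(frame: ServerFrame, sentenceStreamClosed: boolean): Outcome {
  // Any frame flagged done: true counts as completion once the sentence stream
  // has closed, including the error returned for empty input.
  if (frame.done === true) {
    return sentenceStreamClosed ? 'complete' : 'continue';
  }
  // A pure error frame (no done flag) throws instead of being logged and skipped,
  // so the caller's retry path can run.
  if (frame.type === 'error') {
    throw makeRetryableError('Cartesia returned error: ' + String(frame.error));
  }
  return 'continue';
}
```

The point of throwing rather than logging is that only a thrown error reaches the base class; a logged-and-skipped frame leaves the stream waiting with no signal to retry.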

Pre-Review Checklist

Build passes: All builds (lint, typecheck, tests) pass locally.
AI-generated code reviewed.
Changes explained.
Scope appropriate.
Video demo. n/a

Testing

Automated tests added/updated (if applicable): n/a
All tests pass
Make sure both restaurant_agent.ts and realtime_agent.ts work properly (for major changes)

Additional Notes

n/a

Cartesia error frames received over the synthesis WebSocket were logged and dropped, so the base SynthesizeStream never saw a thrown error and tts_error was never emitted. Throw a retryable APIConnectionError so _mainTaskImpl can retry up to connOptions.maxRetry times and then emit tts_error with recoverable: false once retries are exhausted. Also stop the recvTask catch from swallowing APIError, and stop the outer catch from double-wrapping it via toRetryableConnectionError.

Cartesia error frames received over the synthesis WebSocket were logged and dropped, so the base SynthesizeStream never saw a thrown error and tts_error was never emitted. Throw a retryable APIConnectionError so _mainTaskImpl can retry up to connOptions.maxRetry times and then emit tts_error with recoverable: false once retries are exhausted. Restructure the recv loop to mirror the Python plugin's branch order (livekit/agents#3028 + #3080): any frame with `done: true` — including the error returned for empty/whitespace input on function-call turns — is treated as completion once the sentence stream has been closed, instead of being raised and retried. Only a "pure" error frame (no done:true) raises APIConnectionError. Also stop the recvTask catch from swallowing APIError, and stop the outer catch from double-wrapping it via toRetryableConnectionError.
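
The retry behaviour these commit messages rely on can be pictured with a minimal sketch. This is not the base SynthesizeStream code: `runWithRetries`, `TtsErrorEvent`, and `isRetryable` are hypothetical names, the loop is synchronous for brevity, and it assumes one initial attempt plus up to `maxRetry` retries. Any delay between attempts and any intermediate events are left out.

```typescript
// Hypothetical event shape; only the recoverable flag is taken from the text above.
interface TtsErrorEvent {
  error: Error;
  recoverable: boolean;
}

// Assumption: retryable errors are marked with a boolean `retryable` property.
function isRetryable(err: unknown): boolean {
  return err instanceof Error && (err as Error & { retryable?: boolean }).retryable === true;
}

function runWithRetries(
  attempt: () => void,
  maxRetry: number,
  emit: (event: TtsErrorEvent) => void,
): boolean {
  for (let i = 0; i <= maxRetry; i++) {
    try {
      attempt();
      return true; // attempt finished without throwing
    } catch (err) {
      const error = err instanceof Error ? err : new Error(String(err));
      // Give up on non-retryable errors, or when the last allowed attempt failed.
      if (!isRetryable(err) || i === maxRetry) {
        emit({ error, recoverable: false });
        return false;
      }
      // Otherwise fall through and try again.
    }
  }
  return false;
}
```

If the receive loop only logs the error frame, `attempt` never throws and none of this runs, which matches the silent-agent symptom in the description.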

changeset-bot · 2026-05-18T06:51:38Z

🦋 Changeset detected

Latest commit: f6d32e2

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 33 packages

Name	Type
@livekit/agents-plugin-cartesia	Patch
@livekit/agents	Patch
@livekit/agents-plugin-anam	Patch
@livekit/agents-plugin-assemblyai	Patch
@livekit/agents-plugin-baseten	Patch
@livekit/agents-plugin-bey	Patch
@livekit/agents-plugin-cerebras	Patch
@livekit/agents-plugin-deepgram	Patch
@livekit/agents-plugin-elevenlabs	Patch
@livekit/agents-plugin-fishaudio	Patch
@livekit/agents-plugin-google	Patch
@livekit/agents-plugin-hedra	Patch
@livekit/agents-plugin-hume	Patch
@livekit/agents-plugin-inworld	Patch
@livekit/agents-plugin-lemonslice	Patch
@livekit/agents-plugin-liveavatar	Patch
@livekit/agents-plugin-livekit	Patch
@livekit/agents-plugin-minimax	Patch
@livekit/agents-plugin-mistral	Patch
@livekit/agents-plugin-mistralai	Patch
@livekit/agents-plugin-neuphonic	Patch
@livekit/agents-plugin-openai	Patch
@livekit/agents-plugin-perplexity	Patch
@livekit/agents-plugin-phonic	Patch
@livekit/agents-plugin-resemble	Patch
@livekit/agents-plugin-rime	Patch
@livekit/agents-plugin-runway	Patch
@livekit/agents-plugin-sarvam	Patch
@livekit/agents-plugin-silero	Patch
@livekit/agents-plugin-tavus	Patch
@livekit/agents-plugin-trugen	Patch
@livekit/agents-plugin-xai	Patch
@livekit/agents-plugins-test	Patch

devin-ai-integration

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 4 additional findings.

@charlotte-zhuang

… errors Per @charlotte-zhuang's review: Cartesia error frames also carry `done: true`, so the prior `serverMsg.done === true` branch swallowed real mid-stream errors instead of letting them throw to retry. Use isDoneMessage (type === 'done') for the completion branch so all errors reach the error branch. To preserve the empty-transcript fix that motivated mirroring Python's #3080 — Cartesia rejects empty / whitespace-only input with an error frame on function-call turns where the LLM emits no spoken text — track whether any non-empty token was sent, and in the error branch treat that specific case as benign completion instead of retrying. This is strictly better than the Python plugin, which catches error frames in its `data.get("done")` branch and silently swallows them when the tokenizer is open.

charlotte-zhuang

I'm a bit concerned with the new change where the client tries to decipher server errors and think that the server errors could serve as a good source of truth.

I also think the new "reducer-like" pattern of branching on server message type is a bit confusing now since the "done" logic would have to run in 2 places.

LMK what you think!

I'm also happy to stack a PR on top of yours with what I had in mind if that's easier.

charlotte-zhuang · 2026-05-19T16:40:42Z

-          if (isErrorMessage(serverMsg)) {
-            this.#logger.error({ error: serverMsg.error }, 'Cartesia returned error');
-            continue;
-          }


Rather than tracking non-fatal errors client-side, can you check the status_code from the error message?

I also think that it would be great to move error handling back to the top to avoid duplicating the "done" logic, but that's no big deal.

// Handle error messages
if (isErrorMessage(serverMsg)) {
  // Do not retry the connection on 4xx errors
  // since they can be safely ignored, e.g. empty transcripts
  if (400 <= serverMsg.status_code && serverMsg.status_code < 500) {
    this.#logger.debug({ error: serverMsg.error }, 'Cartesia sent a non-fatal error');
  } else {
    this.#logger.error({ error: serverMsg.error }, 'Cartesia returned error');
    throw new APIConnectionError({
      message: `Cartesia returned error: ${serverMsg.error}`,
      options: { retryable: true },
    });
  }
}

charlotte-zhuang · 2026-05-19T16:45:15Z

+          } else if (this.#opts.wordTimestamps !== false && hasWordTimestamps(serverMsg)) {
+            const wordTimestamps = serverMsg.word_timestamps;
+            for (let i = 0; i < wordTimestamps.words.length; i++) {
+              const word = wordTimestamps.words[i];
+              const startTime = wordTimestamps.start[i];
+              const endTime = wordTimestamps.end[i];
+              if (word !== undefined && startTime !== undefined && endTime !== undefined) {
+                pendingTimedTranscripts.push(
+                  createTimedString({
+                    text: word + ' ', // Add space after word for consistency
+                    startTime,
+                    endTime,
+                  }),
+                );
+              }
+            }
+          } else if (isErrorMessage(serverMsg)) {


In combination with my previous comment, it could be nice to revert this change so the code looks like this:

1. parse message
2. if fatal error, throw to retry. otherwise, log and go to step 3
3. if there are timestamps, emit them
4. if there is audio, emit it
5. if the message is "type": "done" or "type": "error" AND "done": true, send the last frame
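
The five steps above can be sketched as a single dispatch function. This is an illustration under assumptions, not the merged code: `ServerFrame`, `Sink`, and `dispatch` are hypothetical names, the `data` field for audio is assumed, a missing `status_code` is treated as fatal, and "send the last frame" is reduced to setting a flag.

```typescript
// Hypothetical frame shape; field names other than type, done, status_code,
// error, and word_timestamps are assumptions.
interface ServerFrame {
  type: string;
  done?: boolean;
  status_code?: number;
  error?: string;
  data?: string; // assumed audio payload field
  word_timestamps?: { words: string[]; start: number[]; end: number[] };
}

// Hypothetical collector standing in for the stream's queues and logger.
interface Sink {
  words: string[];
  audio: string[];
  logs: string[];
  closed: boolean;
}

function dispatch(raw: string, sink: Sink): void {
  // 1. Parse the message.
  const msg = JSON.parse(raw) as ServerFrame;

  // 2. Errors first: 4xx is logged and falls through, anything else throws to retry.
  if (msg.type === 'error') {
    const status = msg.status_code === undefined ? 500 : msg.status_code;
    if (status >= 400 && status < 500) {
      sink.logs.push('non-fatal: ' + String(msg.error));
    } else {
      const err = new Error('Cartesia returned error: ' + String(msg.error)) as Error & {
        retryable: boolean;
      };
      err.retryable = true;
      throw err;
    }
  }

  // 3. Emit word timestamps, if present.
  if (msg.word_timestamps !== undefined) {
    for (let i = 0; i < msg.word_timestamps.words.length; i++) {
      sink.words.push(msg.word_timestamps.words[i] + ' ');
    }
  }

  // 4. Emit audio, if present.
  if (msg.data !== undefined) {
    sink.audio.push(msg.data);
  }

  // 5. One close branch for both "done" frames and error frames carrying done: true.
  if (msg.type === 'done' || (msg.type === 'error' && msg.done === true)) {
    sink.closed = true;
  }
}
```

Keeping the close logic in one place is what removes the duplication the reviewer pointed out: a non-fatal error frame simply reaches step 5 like any other terminal frame.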

…rors Replace the client-side sentNonEmptyToken heuristic with a check against the status_code on Cartesia's error frame. 4xx (e.g. empty-transcript on function-call turns) is logged and finishes cleanly; 5xx bubbles up so the base SynthesizeStream can retry.

…e frames Restructure the recv-loop dispatch so error handling runs first (log on 4xx, throw on 5xx) and a single branch handles closing for both `type:"done"` and `type:"error"` frames carrying `done:true`. Removes the duplicated close sequence that previously lived in both isDoneMessage and isErrorMessage.

…handling

bowdens · 2026-05-21T22:45:24Z

> I'm a bit concerned with the new change where the client tries to decipher server errors and think that the server errors could serve as a good source of truth.
>
> I also think the new "reducer-like" pattern of branching on server message type is a bit confusing now since the "done" logic would have to run in 2 places.
>
> LMK what you think!
>
> I'm also happy to stack a PR on top of yours with what I had in mind if that's easier.

@charlotte-zhuang Thank you for your patience in reviewing this!

I've made some changes to address your comments, let me know if it lines up with your thinking.

charlotte-zhuang

thanks for the fix!

charlotte-zhuang · 2026-05-22T15:43:33Z

FYI I couldn't actually test the changes since I'm not sure how to run agents-js now that Agents Playground is gone.

charlotte-zhuang · 2026-05-30T14:34:05Z

I tested it and it seems to work well!

davidzhao · 2026-05-30T16:36:18Z

+              this.#logger.debug({ error: serverMsg.error }, 'Cartesia sent a non-fatal error');
+            } else {
+              this.#logger.error({ error: serverMsg.error }, 'Cartesia returned error');
+              throw new APIConnectionError({


since we have a status code here, we should throw an APIStatusError instead, which carries that detail.

🙏 f6d32e2

CLAassistant · 2026-05-31T23:29:16Z

All committers have signed the CLA.

Per review feedback, the fatal (5xx) Cartesia error frame now raises an APIStatusError carrying the server status_code instead of a generic APIConnectionError, surfacing the code for diagnostics while keeping the same retryable behaviour. The non-fatal 4xx fall-through is unchanged.
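
A small sketch of that final change, using a local stand-in rather than the SDK class: `StatusErrorLike` and `errorForFrame` are hypothetical names, and the real APIStatusError constructor may take different arguments.

```typescript
// Local stand-in for an error that carries the server status code.
interface StatusErrorLike extends Error {
  statusCode: number;
  retryable: boolean;
}

// Returns undefined for non-fatal 4xx frames (nothing to throw),
// otherwise an error that keeps the status code and stays retryable.
function errorForFrame(statusCode: number, message: string): StatusErrorLike | undefined {
  if (statusCode >= 400 && statusCode < 500) {
    return undefined;
  }
  const err = new Error('Cartesia returned error: ' + message) as StatusErrorLike;
  err.statusCode = statusCode;
  err.retryable = true;
  return err;
}
```

A caller would throw the returned error when it is defined, so whoever handles the failure can read the status code instead of parsing it out of the message.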

bowdens added 2 commits May 18, 2026 15:01

devin-ai-integration Bot reviewed May 18, 2026

charlotte-zhuang reviewed May 18, 2026

Comment thread plugins/cartesia/src/tts.ts Outdated

charlotte-zhuang reviewed May 19, 2026

bowdens added 3 commits May 20, 2026 10:41

docs(changeset): update cartesia TTS changeset for status_code-based …

62b79f0

…handling

charlotte-zhuang approved these changes May 22, 2026

davidzhao reviewed May 30, 2026

Merge branch 'livekit:main' into fix/cartesia-tts-server-errors

2531f59

bowdens force-pushed the fix/cartesia-tts-server-errors branch from c908917 to 7c5147d Compare June 1, 2026 00:07

bowdens force-pushed the fix/cartesia-tts-server-errors branch from 7c5147d to f6d32e2 Compare June 1, 2026 00:08

davidzhao approved these changes Jun 1, 2026

davidzhao merged commit bf7477a into livekit:main Jun 1, 2026
6 checks passed
