feat(fishaudio): use websocket API for faster inference #5629

Merged
davidzhao merged 4 commits into main from dz/fix-fish-ttfb
May 3, 2026

Conversation

@davidzhao
Member

reducing TTFB from ~400ms to ~230ms
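
For readers who want to reproduce a comparison like this, here is a minimal sketch of how time to first byte (TTFB) can be measured for any chunked audio stream. It does not use the fishaudio plugin or the Fish Audio API: `fake_tts_stream` and `measure_ttfb` are hypothetical names introduced for illustration, and the artificial delay only stands in for network and inference latency.

```python
import asyncio
import time
from typing import AsyncIterator


async def fake_tts_stream(first_chunk_delay: float, n_chunks: int = 3) -> AsyncIterator[bytes]:
    """Hypothetical stand-in for a TTS audio stream (not the plugin's API)."""
    # Simulate the latency before the first audio chunk is available.
    await asyncio.sleep(first_chunk_delay)
    for _ in range(n_chunks):
        yield b"\x00" * 320  # placeholder audio bytes
        await asyncio.sleep(0)


async def measure_ttfb(stream: AsyncIterator[bytes]) -> float:
    """Return the seconds elapsed until the first non-empty chunk arrives."""
    start = time.perf_counter()
    async for chunk in stream:
        if chunk:
            return time.perf_counter() - start
    raise RuntimeError("stream ended before any audio arrived")


if __name__ == "__main__":
    ttfb = asyncio.run(measure_ttfb(fake_tts_stream(0.05)))
    print(f"TTFB: {ttfb * 1000:.0f} ms")
```

Swapping `fake_tts_stream` for a real synthesis stream would give a number comparable to the ones quoted above, assuming the clock starts at the moment the request is issued.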

@chenghao-mou chenghao-mou requested a review from a team May 2, 2026 06:27
Contributor

@devin-ai-integration devin-ai-integration Bot left a comment


✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 5 additional findings.


Comment thread livekit-plugins/livekit-plugins-fishaudio/livekit/plugins/fishaudio/tts.py Outdated
devin-ai-integration[bot]

This comment was marked as resolved.

@davidzhao davidzhao merged commit 38ffc2c into main May 3, 2026
22 of 23 checks passed
@davidzhao davidzhao deleted the dz/fix-fish-ttfb branch May 3, 2026 20:37
Contributor

This is an automated Claude Code Routine created by @toubatbrian. It is currently in an experimental stage. The routine will begin porting this PR to agents-js automatically.

Note: this PR contains a mix of plugin-specific (fishaudio), Python-specific (replacing the fish-audio-sdk dependency), and core runtime changes. Since agents-js does not currently have a fishaudio plugin and the dependency swap is Python-specific, only the core runtime improvements will be ported:

  • tts/tts.py: don't tear the audio decoder down on a mid-stream FlushSegment (preserves stateful codec parser state for WAV/OGG/MP3).
  • utils/codecs/decoder.py: gracefully handle EOF before any audio could be decoded.
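
The two fixes listed above can be illustrated with a toy model. This is a sketch under stated assumptions, not the code in tts/tts.py or utils/codecs/decoder.py: `ToyHeaderDecoder` and `decode_segments` are hypothetical names, and the four-byte `HDR0` header merely stands in for the container header a WAV/OGG/MP3 parser has to see once per stream.

```python
from typing import Iterable, List


class ToyHeaderDecoder:
    """Toy stateful decoder: the stream starts with one header, then payload.

    Hypothetical stand-in for a real codec parser; not the library's decoder.
    """

    HEADER = b"HDR0"

    def __init__(self) -> None:
        self._buf = bytearray()
        self._header_seen = False

    def push(self, data: bytes) -> bytes:
        """Feed encoded bytes and return whatever payload can be decoded so far."""
        self._buf.extend(data)
        if not self._header_seen:
            if len(self._buf) < len(self.HEADER):
                return b""  # need more bytes before the header can be checked
            if bytes(self._buf[: len(self.HEADER)]) != self.HEADER:
                raise ValueError("stream does not start with the expected header")
            del self._buf[: len(self.HEADER)]
            self._header_seen = True
        out = bytes(self._buf)
        self._buf.clear()
        return out

    def end_input(self) -> bytes:
        """Signal EOF. If nothing was ever decoded, return no audio instead of raising."""
        self._buf.clear()
        return b""


def decode_segments(segments: Iterable[Iterable[bytes]], reset_on_flush: bool) -> List[bytes]:
    """Decode one continuous stream that is split into segments by flushes."""
    decoder = ToyHeaderDecoder()
    decoded: List[bytes] = []
    for segment in segments:
        decoded.append(b"".join(decoder.push(chunk) for chunk in segment))
        if reset_on_flush:
            # Teardown on flush: the header state is lost mid-stream.
            decoder = ToyHeaderDecoder()
    decoder.end_input()
    return decoded


if __name__ == "__main__":
    stream = [[b"HDR0", b"aa"], [b"bbbb"]]  # the header appears only once
    print(decode_segments(stream, reset_on_flush=False))
```

With `reset_on_flush=True` the second segment is handed to a fresh parser that never saw the header, so the toy decoder raises `ValueError`. Keeping the same decoder across the flush avoids that, which is the idea behind preserving parser state on a mid-stream FlushSegment.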

Generated by Claude Code

Contributor

Update from the routine: after inspecting agents-js, none of the changes in this PR have a direct port path, so no JS PR will be opened.

  • fishaudio plugin rewrite (HTTP+msgpack websocket): agents-js does not have a fishaudio plugin, so there is nothing to update.
  • fish-audio-sdk to msgpack dependency swap: Python-specific, skipped per the routine's rules.
  • tts/tts.py mid-stream FlushSegment decoder fix and utils/codecs/decoder.py EOFError handling: these patch the Python AudioEmitter._decode_task path, which wraps a stateful FFmpeg-based streaming decoder. agents-js has no equivalent. Every TTS plugin (cartesia, elevenlabs, deepgram, baseten, inworld, etc.) emits raw PCM directly via AudioByteStream, and FFmpeg is used only in agents/src/audio.ts for one-shot file playback and in recorder_io.ts for output encoding. Neither has a flush()-driven decoder-teardown path or the EOF-before-any-output failure mode that the Python fixes target.

cc @toubatbrian


Generated by Claude Code

5 participants