@yadavsahil197 yadavsahil197 commented Oct 30, 2025

Have you read the Contributing Guidelines?

Issue #

Describe your changes

Adds support for two new TTS models, orpheus and kokoro, along with streaming mode.


Note

Adds binary audio stream handling to the requestor (sync and async) and extends the TTS stream_to_file method to assemble and save WAV, MP3, and RAW outputs.

  • Request handling (src/together/abstract/api_requestor.py):
    • Handle streaming by content type: SSE (text/event-stream) vs binary audio (audio/wav, audio/mpeg, application/octet-stream).
    • Return chunked binary generators for audio streams (sync/async) and preserve bytes for non-streaming binary responses.
    • Refactor async response reading to avoid double reads and properly decode text vs binary.
  • Audio TTS (src/together/types/audio_speech.py):
    • AudioSpeechStreamResponse.stream_to_file(file_path, response_format) now:
      • Infers format from filename or parameter; supports wav, mp3, raw.
      • Aggregates streaming chunks (raw bytes or SSE base64) and writes output.
      • Adds WAV header for raw PCM; validates MP3 frames; writes RAW as-is.
    • New helper _write_wav_header to construct valid WAV files for PCM data.
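As a rough illustration of two ideas in the summary above (dispatching on content type, and wrapping raw PCM in a WAV header), here is a minimal sketch. It is not the code from this PR: `classify_stream` and `pcm_to_wav` are hypothetical names, the standard-library `wave` module stands in for the `_write_wav_header` helper, and the 24 kHz, mono, 16-bit defaults are assumptions rather than values confirmed by the change.

```python
import io
import wave

# Content types the summary lists as binary audio streams.
BINARY_AUDIO_TYPES = {"audio/wav", "audio/mpeg", "application/octet-stream"}


def classify_stream(content_type: str) -> str:
    """Return "sse", "binary", or "other" for a Content-Type header value.

    Hypothetical helper: the real requestor may structure this differently.
    """
    # Drop parameters such as "; charset=utf-8" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type == "text/event-stream":
        return "sse"
    if media_type in BINARY_AUDIO_TYPES:
        return "binary"
    return "other"


def pcm_to_wav(
    pcm: bytes,
    sample_rate: int = 24000,  # assumed default, not taken from the PR
    channels: int = 1,         # assumed mono
    sample_width: int = 2,     # assumed 16-bit samples (2 bytes each)
) -> bytes:
    """Prefix raw PCM bytes with a WAV header and return the full file bytes."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buf.getvalue()
```

Under these assumptions, aggregated stream chunks could be joined with `b"".join(chunks)` and passed to `pcm_to_wav` before being written to disk, while MP3 and RAW data would be written without any header.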

Written by Cursor Bugbot for commit 2f6b638. This will update automatically on new commits.

@zainhas zainhas merged commit 2b1338f into main Oct 31, 2025
12 checks passed
@zainhas zainhas deleted the tts_streaming branch October 31, 2025 02:25