v2.0.3

@lipku lipku released this 07 Jun 12:01
· 1 commit to main since this release

✨ New Features

OmniTTS — vLLM Omni Speech Adapter

  • Added tts/omnitts.py, a new TTS backend that calls the vLLM Omni OpenAI-compatible speech API (POST /v1/audio/speech). Supports all vLLM Omni models (Qwen3-TTS, Fish Speech S2, CosyVoice3, Voxtral, VoxCPM2, MOSS-TTS-Nano) with a configurable source sample rate, automatic resampling to 16 kHz, and per-message parameter overrides (voice, language, speed, instructions, task type).
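As a rough illustration of the two mechanisms described above, the following Python sketch builds a POST /v1/audio/speech request with per-message overrides and resamples mono audio to 16 kHz. It is not the code in tts/omnitts.py. The helper names (`build_speech_request`, `resample_linear`), the default values, and the `language` field name are assumptions for illustration; `model`, `input`, `voice`, `response_format`, and `speed` follow the OpenAI speech API. The resampler uses plain linear interpolation with no anti-aliasing filter, which may differ from what the adapter actually does.

```python
import json
import urllib.request

# Assumed defaults; the real adapter's option names and values may differ.
DEFAULTS = {"model": "your-tts-model", "voice": "default", "response_format": "wav"}


def build_speech_request(base_url, text, overrides=None):
    """Build (but do not send) a POST /v1/audio/speech request."""
    payload = {**DEFAULTS, "input": text}
    # Per-message overrides win over defaults; None means "not set".
    payload.update({k: v for k, v in (overrides or {}).items() if v is not None})
    return urllib.request.Request(
        base_url.rstrip("/") + "/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def resample_linear(samples, src_rate, dst_rate=16000):
    """Resample a mono sample sequence by linear interpolation (no filtering)."""
    if src_rate == dst_rate or len(samples) < 2:
        return list(samples)
    n_out = int(len(samples) * dst_rate / src_rate)
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate        # fractional index into the source
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)   # clamp at the last sample
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out
```

To send the request, pass the returned object to `urllib.request.urlopen` and read the audio bytes from the response. A model that outputs 24 kHz audio would then go through `resample_linear(samples, 24000)` before being handed to a 16 kHz pipeline.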

TTS Voice Manager Web UI

  • Added web/tts/index.html and web/tts/index-en.html — a full-featured voice management dashboard:
    • Voice list — browse preset and uploaded voices from the vLLM Omni server, with one-click selection and voice deletion.
    • Voice clone upload — upload audio samples to clone new voices with auto-generated consent ID, required reference text, and a browser-native speech recognition button for auto-transcription.
    • Speech synthesis test — select a voice, enter text, and synthesize speech with adjustable speed, language (11 languages), output format (WAV/MP3/FLAC/AAC/Opus/PCM), and task type. Results play in-page and are downloadable.
  • Added quick-link cards to the main index pages pointing to the TTS manager.

Environment Variable Template

  • Added .env.example with placeholder keys for Tencent, DashScope, and Doubao services. python-dotenv auto-loads .env on startup.
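A minimal sketch of how a startup check around .env loading might look, assuming python-dotenv is installed. The key names below are hypothetical placeholders (the real names are in .env.example), and `missing_keys` and `load_env` are illustrative helpers, not functions from this project.

```python
import os

try:
    from dotenv import load_dotenv  # provided by the python-dotenv package
except ImportError:
    load_dotenv = None  # fall back to whatever is already in the environment

# Hypothetical key names; check .env.example for the real ones.
REQUIRED_KEYS = ["DASHSCOPE_API_KEY", "TENCENT_SECRET_ID", "DOUBAO_API_KEY"]


def missing_keys(keys, environ=None):
    """Return the keys that are unset or blank in the given environment."""
    env = os.environ if environ is None else environ
    return [k for k in keys if not env.get(k, "").strip()]


def load_env(path=".env"):
    """Load `path` into os.environ (if possible) and report missing keys."""
    if load_dotenv is not None:
        # By default, load_dotenv does not override variables already set.
        load_dotenv(dotenv_path=path)
    return missing_keys(REQUIRED_KEYS)
```

Calling `load_env()` once at startup and logging the returned list makes it obvious which service credentials still need to be filled in after copying .env.example to .env.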