v2.0.3

Latest

lipku released this 07 Jun 12:01

· 1 commit to main since this release

860c6e0

✨ New Features

OmniTTS — vLLM Omni Speech Adapter

Added tts/omnitts.py, a new TTS backend that calls the vLLM Omni OpenAI-compatible speech API (POST /v1/audio/speech). It supports all vLLM Omni models (Qwen3-TTS, Fish Speech S2, CosyVoice3, Voxtral, VoxCPM2, MOSS-TTS-Nano), with a configurable source sample rate and automatic resampling to 16 kHz. Per-message overrides are available for voice, language, speed, instructions, and task type.
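As a rough illustration of the request flow described above, here is a minimal sketch using only the Python standard library. It is not the code in tts/omnitts.py. The default model and voice names, the helper names (`build_speech_payload`, `synthesize`, `resample_linear`), and the choice of plain linear interpolation for resampling are all assumptions made for the example. The fields `model`, `input`, `voice`, `response_format`, `speed`, and `instructions` follow the OpenAI speech API; any extra keys such as language or task type are passed through as-is, and their exact names on a vLLM Omni server should be checked against that server's documentation.

```python
import json
import urllib.request

# Hypothetical defaults; the model and voice names are placeholders.
DEFAULTS = {
    "model": "my-tts-model",
    "voice": "default",
    "response_format": "wav",
    "speed": 1.0,
}


def build_speech_payload(text, overrides=None, defaults=DEFAULTS):
    """Merge per-message overrides over the defaults, ignoring None values."""
    payload = dict(defaults)
    payload.update({k: v for k, v in (overrides or {}).items() if v is not None})
    payload["input"] = text
    return payload


def synthesize(base_url, text, overrides=None, timeout=30.0):
    """POST the payload to /v1/audio/speech and return the raw audio bytes."""
    body = json.dumps(build_speech_payload(text, overrides)).encode("utf-8")
    req = urllib.request.Request(
        base_url.rstrip("/") + "/v1/audio/speech",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()


def resample_linear(samples, src_rate, dst_rate=16000):
    """Resample mono float samples by linear interpolation.

    A simplification: no anti-aliasing filter is applied before downsampling.
    """
    if src_rate == dst_rate or not samples:
        return list(samples)
    n_out = max(1, round(len(samples) * dst_rate / src_rate))
    step = src_rate / dst_rate
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * step
        lo = min(int(pos), last)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out
```

Calling `synthesize("http://localhost:8000", "Hello", {"voice": "alice"})` would send one request with the voice overridden for that message only; the host and port here are placeholders. A production resampler would normally low-pass filter before decimating.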

TTS Voice Manager Web UI

Added web/tts/index.html and web/tts/index-en.html — a full-featured voice management dashboard:
- Voice list — browse preset and uploaded voices from the vLLM Omni server, with one-click selection and voice deletion.
- Voice clone upload — upload audio samples to clone new voices with auto-generated consent ID, required reference text, and a browser-native speech recognition button for auto-transcription.
- Speech synthesis test — select a voice, enter text, and synthesize speech with adjustable speed, language (11 languages), output format (WAV/MP3/FLAC/AAC/Opus/PCM), and task type. Results play in-page and are downloadable.
Added quick-link cards to the main index pages pointing to the TTS manager.

Environment Variable Template

Added .env.example with placeholder keys for Tencent, DashScope, and Doubao services. python-dotenv auto-loads .env on startup.
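A minimal sketch of how startup loading might look, assuming python-dotenv's `load_dotenv` is called before any service client is created. The key names below are hypothetical stand-ins; the actual names are whatever .env.example defines. The import is guarded so the check still runs when python-dotenv is not installed.

```python
import os

try:
    from dotenv import load_dotenv  # provided by the python-dotenv package
except ImportError:
    load_dotenv = None

# Hypothetical key names for illustration; see .env.example for the real ones.
REQUIRED_KEYS = ("TENCENT_SECRET_ID", "DASHSCOPE_API_KEY", "DOUBAO_API_KEY")


def missing_keys(required=REQUIRED_KEYS, environ=None):
    """Return the required keys that are unset or empty in the environment."""
    env = os.environ if environ is None else environ
    return [key for key in required if not env.get(key)]


def load_env(path=".env"):
    """Load variables from a .env file if possible, then report missing keys."""
    if load_dotenv is not None:
        # By default, existing environment variables are not overwritten.
        load_dotenv(path)
    return missing_keys()
```

Copying .env.example to .env and filling in the values, then calling `load_env()` at startup, gives an early list of any keys still left blank.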
