🗣️ Speech Server – TTS API with Chatterbox & Kokoro

This project provides a FastAPI-based HTTP server for generating speech audio using Chatterbox TTS or Kokoro ONNX.


🚀 Quick Start

1. Clone the Repo

git clone https://github.com/Ladvien/speech_server.git
cd speech_server

2. Install Dependencies

We use Poetry for managing dependencies.

poetry install

3. Run the Server

poetry run speech-server

Or:

poetry run uvicorn speech_server.server.app:app --host 0.0.0.0 --port 8000

Access the server at http://localhost:8000 (the port set in the command above).
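Model loading can take a while, so a script that starts the server may want to wait until the port accepts connections before sending requests. The following is a minimal sketch using only the standard library. The helper name `wait_for_server` is illustrative and not part of speech_server, and the defaults assume the host and port from the uvicorn command above.

```python
import socket
import time


def wait_for_server(host: str = "localhost", port: int = 8000, timeout: float = 30.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect only shows that the port is open,
            # not that the TTS model has finished loading.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.25)  # server not up yet; retry shortly
    return False
```

Calling `wait_for_server()` after launching the server blocks for up to 30 seconds and returns whether the port became reachable.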


🧪 Example Usage

Generate Audio from Text

curl -X POST http://localhost:8000/tts \
     -H "Content-Type: application/json" \
     -d '{"text": "Hello, world!", "voice": "default"}' \
     --output hello.wav
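The same request can be made from Python. This is a minimal sketch using only the standard library. It assumes, as the curl example suggests, that POST /tts accepts a JSON body with `text` and `voice` and returns audio bytes. The helper names `build_tts_request` and `synthesize_to_file` are illustrative and not part of speech_server.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # same address as the curl example; adjust as needed


def build_tts_request(text: str, voice: str = "default", base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST /tts request carrying the same JSON body as the curl example."""
    payload = json.dumps({"text": text, "voice": voice}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/tts",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize_to_file(text: str, path: str, voice: str = "default") -> int:
    """Send the request and write the response body to `path`; return bytes written."""
    with urllib.request.urlopen(build_tts_request(text, voice), timeout=60) as resp:
        audio = resp.read()
    with open(path, "wb") as fh:
        fh.write(audio)
    return len(audio)
```

With the server running, `synthesize_to_file("Hello, world!", "hello.wav")` should produce the same file as the curl command.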

List Voices

curl http://localhost:8000/voices
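A Python equivalent might look like the sketch below. The README does not document the response format, so this assumes the endpoint returns JSON and simply decodes whatever comes back. The function names are illustrative.

```python
import json
import urllib.request


def voices_url(base_url: str = "http://localhost:8000") -> str:
    """Join the base URL and the /voices path, tolerating a trailing slash."""
    return base_url.rstrip("/") + "/voices"


def list_voices(base_url: str = "http://localhost:8000"):
    """Fetch /voices and decode the body as JSON (assumed response format)."""
    with urllib.request.urlopen(voices_url(base_url), timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))
```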

⚙️ Configuration

Configure the server with config.yaml, environment variables, or Python config classes such as TTSServerConfig. Example config.yaml:

tts_service: chatterbox  # or 'kokoro'
voice: default
log_level: info
sample_rate: 24000
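To illustrate how these keys could map onto a Python config class with environment overrides, here is a rough sketch. `ExampleTTSConfig` is a hypothetical stand-in, not the project's real TTSServerConfig, and the `SPEECH_SERVER_` variable prefix is an assumption made for this example. Check the source for the actual field and variable names.

```python
import os
from dataclasses import dataclass, fields
from typing import Mapping

ALLOWED_SERVICES = ("chatterbox", "kokoro")  # the two engines named in this README


@dataclass(frozen=True)
class ExampleTTSConfig:
    """Hypothetical config mirroring the YAML keys above; not the real TTSServerConfig."""

    tts_service: str = "chatterbox"
    voice: str = "default"
    log_level: str = "info"
    sample_rate: int = 24000

    def __post_init__(self) -> None:
        if self.tts_service not in ALLOWED_SERVICES:
            raise ValueError(f"tts_service must be one of {ALLOWED_SERVICES}, got {self.tts_service!r}")
        if self.sample_rate <= 0:
            raise ValueError("sample_rate must be positive")


def config_from_env(env: Mapping[str, str] = os.environ, prefix: str = "SPEECH_SERVER_") -> ExampleTTSConfig:
    """Override defaults from variables such as SPEECH_SERVER_VOICE (hypothetical names)."""
    overrides = {}
    for f in fields(ExampleTTSConfig):
        raw = env.get(prefix + f.name.upper())
        if raw is not None:
            # Cast the string to the type of the field's default (str or int).
            overrides[f.name] = type(f.default)(raw)
    return ExampleTTSConfig(**overrides)
```

Validating in `__post_init__` means a typo in `tts_service` fails at startup rather than on the first request.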

🧠 Features

  • ✅ Chatterbox TTS (PyTorch)
  • ✅ Kokoro ONNX (lightweight, GPU-ready)
  • ✅ Voice cloning support
  • ✅ Streaming endpoint
  • ✅ /voices API
  • ✅ YAML config support
  • ✅ Ready for Docker or cloud deployment

🧩 Extend It

To add a new TTS engine, subclass:

speech_server.common.base_tts_service.TTSService

Then register it via your config loader.
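The README does not show the abstract methods of TTSService, so the sketch below uses a hypothetical stand-in base class with an assumed `synthesize(text, voice) -> bytes` method to illustrate the shape of a custom engine. `TTSServiceSketch`, `BeepTTS`, and the `ENGINES` registry are all invented for this example. Consult base_tts_service.py for the real interface.

```python
import io
import math
import struct
import wave
from abc import ABC, abstractmethod


class TTSServiceSketch(ABC):
    """Hypothetical stand-in for speech_server.common.base_tts_service.TTSService."""

    @abstractmethod
    def synthesize(self, text: str, voice: str = "default") -> bytes:
        """Return WAV-encoded audio for `text` (assumed method name and signature)."""


class BeepTTS(TTSServiceSketch):
    """Toy engine: emits a 440 Hz tone whose length scales with the text length."""

    def __init__(self, sample_rate: int = 24000) -> None:
        self.sample_rate = sample_rate

    def synthesize(self, text: str, voice: str = "default") -> bytes:
        # 1/20 of a second of audio per character, at least one character's worth.
        n_frames = self.sample_rate // 20 * max(len(text), 1)
        samples = (
            int(0.3 * 32767 * math.sin(2 * math.pi * 440 * i / self.sample_rate))
            for i in range(n_frames)
        )
        buf = io.BytesIO()
        with wave.open(buf, "wb") as wav:
            wav.setnchannels(1)  # mono
            wav.setsampwidth(2)  # 16-bit PCM
            wav.setframerate(self.sample_rate)
            wav.writeframes(b"".join(struct.pack("<h", s) for s in samples))
        return buf.getvalue()


# Hypothetical registry keyed by the `tts_service` config value.
ENGINES = {"beep": BeepTTS}
```

A config loader could then instantiate the engine with something like `ENGINES[config.tts_service]()`.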


🛠 Dev Tools

Format, Sort Imports, Test

poetry run black .
poetry run isort .
poetry run pytest

Type Check

poetry run mypy src/

📚 Documentation

Build local docs:

cd docs
make html

Docs live in /docs/source/ and are rendered via ReadTheDocs.


📄 License

MIT © C. Thomas Brittain
