Pocket TTS API

Lightweight local TTS server based on the very fast Pocket TTS model from Kyutai. It provides a simple OpenAI-compatible speech API (`/v1/audio/speech`) for generating audio from text.

On an old Haswell CPU, it runs at around 1.5x real-time speed with the nova voice.

The server works great with the OpenAI TTS Custom Component for Home Assistant.

Inspired by kyutai-tts-openai-api.

Build and run with Docker:

docker build -t pocket_tts_api .
docker run --restart=always --name pocket_tts_api -d -p 8008:8000 pocket_tts_api

or with docker-compose:

docker-compose up -d

Currently, the `model` and `speed` request parameters are accepted but ignored.

Test the server with `curl`:

curl http://localhost:8008/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tts-1",
    "input": "Hello! This is a test of the fully compatible local text to speech server.",
    "voice": "nova",
    "response_format": "wav",
    "speed": 1.1
  }' \
  --output test_audio.wav
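
The same request can be sent from Python. The following is a minimal sketch using only the standard library, not part of this repository. It assumes the server is reachable on port 8008 as in the `docker run` example above; the helper names `build_speech_request` and `synthesize` are hypothetical and chosen for illustration only.

```python
import json
import urllib.request

# Assumption: host port 8008 is mapped to the server, as in the docker run example.
BASE_URL = "http://localhost:8008"


def build_speech_request(text, voice="nova", response_format="wav", base_url=BASE_URL):
    """Build a POST request for the /v1/audio/speech endpoint (hypothetical helper)."""
    payload = {
        "model": "tts-1",  # currently ignored by the server
        "input": text,
        "voice": voice,
        "response_format": response_format,
    }
    return urllib.request.Request(
        f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, out_path="test_audio.wav", **kwargs):
    """Send the request and write the returned audio bytes to out_path."""
    request = build_speech_request(text, **kwargs)
    with urllib.request.urlopen(request) as response:
        audio = response.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return out_path


# Example usage (requires the server to be running):
# synthesize("Hello! This is a test.", out_path="hello.wav")
```

Building the request separately from sending it makes it possible to inspect the payload without a running server. The `speed` field is left out here because the server currently ignores it.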
