bozakov/pocket_tts_api

Pocket TTS API

A lightweight local TTS server based on the very fast Pocket TTS model from Kyutai. It provides a simple OpenAI-compatible speech API (v1/audio/speech) for generating audio from text.

On an old Haswell CPU, it runs at around 1.5x real-time speed with the nova voice.

The server works great with the OpenAI TTS Custom Component for Home Assistant.

Inspired by kyutai-tts-openai-api.

Build and run with Docker:

docker build -t pocket_tts_api .
docker run --restart=always --name pocket_tts_api -d -p 8008:8000 pocket_tts_api

or with docker-compose:

docker-compose up -d

The model and speed parameters are currently ignored.

Test the server with curl:

curl http://localhost:8008/v1/audio/speech \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tts-1",
    "input": "Hello! This is a test of the fully compatible local text to speech server.",
    "voice": "nova",
    "response_format": "wav",
    "speed": 1.1
  }' \
  --output test_audio.wav
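
The same request can also be made from Python. The following is a minimal sketch using only the standard library, not part of this repository. It assumes the server is reachable on host port 8008, as in the docker run command above. The helper names build_speech_request and synthesize are hypothetical and chosen for illustration. The speed field is left out because the server currently ignores it.

```python
import json
import urllib.request

# Assumption: the container is published on host port 8008 (see docker run above).
BASE_URL = "http://localhost:8008"


def build_speech_request(text, voice="nova", response_format="wav", base_url=BASE_URL):
    """Build (but do not send) a POST request for the v1/audio/speech endpoint."""
    payload = {
        "model": "tts-1",  # sent for OpenAI compatibility; currently ignored by the server
        "input": text,
        "voice": voice,
        "response_format": response_format,
    }
    return urllib.request.Request(
        f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, out_path="test_audio.wav", **kwargs):
    """Send the request and write the returned audio bytes to out_path."""
    request = build_speech_request(text, **kwargs)
    with urllib.request.urlopen(request, timeout=60) as response:
        audio = response.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return out_path
```

With the container running, calling synthesize("Hello from the local text to speech server.") should write the returned audio to test_audio.wav. Building the request separately from sending it makes it possible to inspect the payload without a live server.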
