Silero TTS Text-to-Speech Generator

Overview

A Python text-to-speech generator built on Silero TTS models. Features:

  • Multiple language support (Russian, English, German)
  • Advanced SSML text processing
  • Intelligent noise reduction
  • GPU/CPU compatibility
  • Flexible speaker selection
  • Tornado-based API server for remote TTS generation

Project Structure

  • __main__.py: Example script demonstrating local TTS usage
  • silero_tts_processor.py: Core TTS processor class
  • tts_server.py: Tornado-based API server for remote TTS generation
  • test_request.sh: Bash script for testing the TTS API
  • requirements.txt: Project dependencies

Prerequisites

  • Python 3.8+
  • CUDA (optional, for GPU acceleration)
  • curl for API testing (optional)
  • jq for JSON parsing (optional)

Installation

  1. Clone the repository:
     git clone https://github.com/semyenov/silero-tts-generator.git
     cd silero-tts-generator
  2. Create a virtual environment:
     python3 -m venv .venv
     source .venv/bin/activate
  3. Install dependencies:
     pip install -r requirements.txt

Local Usage

from silero_tts_processor import SileroTTSProcessor

# Create TTS processor
tts = SileroTTSProcessor(
    language_id="ru",
    model_id="v4_ru",
)

# Generate speech from SSML input ("Привет, мир" means "Hello, world")
audio = tts.generate_speech(
    "<speak>Привет, мир</speak>",
    speaker_id="xenia",
    enhance_noise=True,
    output_filename="output.wav",
)

# Play the generated audio
tts.play_audio(audio)
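
The example above passes text already wrapped in a <speak> element. For plain text that may contain characters such as & or <, a small helper can do the wrapping. This is a minimal sketch: the name to_ssml is hypothetical, and it assumes the processor parses its SSML input as XML, which this README does not confirm.

```python
from xml.sax.saxutils import escape


def to_ssml(text):
    """Wrap plain text in a <speak> root, escaping &, < and > for XML."""
    return f"<speak>{escape(text)}</speak>"
```

The result could then be passed as the first argument to tts.generate_speech in place of a hand-written SSML string.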

API Server Usage

Start the Tornado API server:

python tts_server.py

API Testing with Bash Script

The test_request.sh Bash script sends test requests to the TTS API:

# Basic usage
./test_request.sh -t "Привет, мир"

# Advanced usage with custom parameters
./test_request.sh \
    -t "<speak>Привет, мир</speak>" \
    -s xenia

Script options:

  • -t: Text to convert to speech (required)
  • -s: Speaker (default: xenia)
  • -h: Show help message
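
For readers who prefer Python to Bash, the same options can be mirrored with argparse. This is only a sketch of the option handling, not a port of test_request.sh, whose contents are not shown here. The function names are hypothetical, and the payload uses the text and speaker fields from the POST /tts request body documented under API Endpoints.

```python
import argparse
import json


def build_parser():
    """Mirror the script's options; argparse adds -h automatically."""
    parser = argparse.ArgumentParser(description="Build a TTS API request body")
    parser.add_argument("-t", dest="text", required=True,
                        help="Text to convert to speech (required)")
    parser.add_argument("-s", dest="speaker", default="xenia",
                        help="Speaker (default: xenia)")
    return parser


def payload_from_args(argv):
    """Parse the given argument list and return the JSON request body."""
    args = build_parser().parse_args(argv)
    # ensure_ascii=False keeps Cyrillic text readable in the output
    return json.dumps({"text": args.text, "speaker": args.speaker},
                      ensure_ascii=False)
```

Calling payload_from_args(["-t", "Привет, мир"]) returns a JSON string that uses the default speaker.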

API Endpoints

Generate TTS

POST /tts

Request Body:

{
  "text": "<speak>Текст для синтеза речи</speak>",
  "speaker": "xenia",
  "enhance_noise": true
}

Response:

{
  "success": true,
  "filename": "generated_audio_file.wav"
}
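
To illustrate the request and response shapes above, the following sketch builds the POST /tts request with the Python standard library and extracts the filename from a reply. The base URL http://localhost:8888 and the helper names are assumptions for illustration; this README does not state which host or port the server listens on.

```python
import json
import urllib.request

# Assumption: host and port are not documented in this README; adjust as needed.
BASE_URL = "http://localhost:8888"


def build_tts_request(text, speaker="xenia", enhance_noise=True, base_url=BASE_URL):
    """Build (but do not send) a POST /tts request with a JSON body."""
    payload = {"text": text, "speaker": speaker, "enhance_noise": enhance_noise}
    return urllib.request.Request(
        url=f"{base_url}/tts",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_tts_response(body):
    """Return the generated filename, or raise if the server reported failure."""
    reply = json.loads(body)
    if not reply.get("success"):
        raise RuntimeError(f"TTS request failed: {reply}")
    return reply["filename"]


def synthesize(text, **kwargs):
    """Send the request and return the filename reported by the server."""
    with urllib.request.urlopen(build_tts_request(text, **kwargs)) as resp:
        return parse_tts_response(resp.read())
```

Keeping request construction and response parsing separate from the network call makes both easy to check without a running server.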

Retrieve Audio File

GET /audio/{filename}

Returns the generated audio file. Use the filename value from the POST /tts response.
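
A minimal sketch of fetching the file with the Python standard library follows. The base URL http://localhost:8888 and the function names are assumptions, since the server's host and port are not stated here.

```python
import shutil
import urllib.parse
import urllib.request

# Assumption: host and port are not documented in this README; adjust as needed.
BASE_URL = "http://localhost:8888"


def audio_url(filename, base_url=BASE_URL):
    """Return the GET /audio/{filename} URL, percent-encoding the filename."""
    return f"{base_url}/audio/{urllib.parse.quote(filename, safe='')}"


def download_audio(filename, dest_path, base_url=BASE_URL):
    """Stream the audio file from the server into dest_path."""
    with urllib.request.urlopen(audio_url(filename, base_url)) as resp, \
            open(dest_path, "wb") as out:
        shutil.copyfileobj(resp, out)
    return dest_path
```

For example, download_audio("generated_audio_file.wav", "local_copy.wav") would save the file named in a POST /tts response, provided the server is reachable at the assumed address.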

Supported Languages

  • Russian
  • English
  • German
  • ...

Full list of supported languages and models can be found here.

Troubleshooting

  • Ensure you have the latest version of PyTorch
  • Check CUDA compatibility if using GPU
  • Verify audio device settings
  • Make sure curl and jq are installed for API testing
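
Several of the checks above can be scripted. The sketch below reports the Python version, whether curl and jq are on the PATH, and whether PyTorch sees a CUDA device. The function name check_environment is hypothetical, and the script does not verify audio device settings.

```python
import shutil
import sys


def check_environment():
    """Collect basic facts about the local setup as a dict."""
    report = {
        "python_ok": sys.version_info >= (3, 8),
        "curl": shutil.which("curl") is not None,
        "jq": shutil.which("jq") is not None,
    }
    try:
        import torch  # imported lazily so the check also runs without PyTorch
        report["torch_version"] = torch.__version__
        report["cuda"] = torch.cuda.is_available()
    except ImportError:
        report["torch_version"] = None
        report["cuda"] = False
    return report
```

Printing the returned dict shows at a glance which prerequisite is missing.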

License

MIT License

Contributing

Pull requests are welcome. For major changes, please open an issue first.
