tts-middleware

Middleware module for our speech synthesis systems.

Supported SSML tags

Many common tags are assumed implicitly. See the W3C SSML specification for an overview of SSML.

Sentence-level <prosody> with rate, pitch, and volume attributes.
<phoneme> with ipa attribute.
<voice> with gender and name attributes.

Installation

Install sox.
pip install tts-middleware

Usage

For full-featured inference, wrap your TTS function (text to audio) with the decorator like this:

from tts_middleware.core import tts_middleware, Audio
import numpy as np

@tts_middleware
def tts(text: str, language_code: str) -> Audio:
    # Do requests and return audio
    ...

# Now calls to `tts` will support SSML with all features enabled.
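
To illustrate the general idea of wrapping a TTS function, here is a minimal sketch of a decorator that accepts SSML input and passes only the plain text to the wrapped function. It is not the library's implementation: `ssml_aware` and `fake_tts` are hypothetical names, the audio is modeled as a plain list of floats rather than the library's `Audio` type, and all tag attributes are ignored.

```python
import functools
import xml.etree.ElementTree as ET
from typing import Callable, List


def ssml_aware(tts_fn: Callable[[str, str], List[float]]) -> Callable[[str, str], List[float]]:
    """Hypothetical stand-in for an SSML-handling decorator (assumption, not tts_middleware)."""

    @functools.wraps(tts_fn)
    def wrapper(text: str, language_code: str) -> List[float]:
        # Wrap the input in a root element so that both plain text and
        # SSML fragments parse as well-formed XML.
        root = ET.fromstring(f"<speak>{text}</speak>")
        # Keep only the text content; a real middleware would also act on the tags.
        plain = "".join(root.itertext())
        return tts_fn(plain, language_code)

    return wrapper


@ssml_aware
def fake_tts(text: str, language_code: str) -> List[float]:
    # Toy backend: one silent sample per input character.
    return [0.0] * len(text)


samples = fake_tts("<prosody rate='1.3'>hello world</prosody>", "en")
```

The toy backend receives only `hello world`, so the wrapped function never has to know about SSML.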

Attributes for SSML tags are described next:

<prosody rate='1.3'>hello world</prosody>.
<prosody pitch='2'>hello world</prosody>. Parameter is the pitch shift in semitones, as in pysox.
<prosody volume='10'>hello world</prosody>. Parameter is the gain in dB, as in pysox.
<voice gender="female" name="excited"> hello world! </voice>. The voice element supports two attributes: gender and name.
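
As a rough illustration of what these attribute values mean, the sketch below reads the three prosody attributes with the standard library and converts the semitone shift into a frequency ratio. `prosody_settings` is a hypothetical helper, not part of tts-middleware, and the defaults used for missing attributes (rate 1.0, pitch 0, volume 0) are assumptions.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def semitones_to_ratio(n_semitones: float) -> float:
    # One octave is 12 semitones and doubles the frequency.
    return 2.0 ** (n_semitones / 12.0)


def prosody_settings(ssml: str) -> Dict[str, float]:
    """Hypothetical helper: read rate, pitch, and volume from one <prosody> element."""
    elem = ET.fromstring(ssml)
    pitch_semitones = float(elem.get("pitch", "0"))
    return {
        "rate": float(elem.get("rate", "1.0")),        # taken as a plain number
        "pitch_semitones": pitch_semitones,            # shift in semitones
        "pitch_ratio": semitones_to_ratio(pitch_semitones),
        "volume_db": float(elem.get("volume", "0")),   # gain in dB
    }


settings = prosody_settings("<prosody rate='1.3' pitch='12' volume='10'>hello world</prosody>")
```

A shift of 12 semitones corresponds to a frequency ratio of 2, that is, one octave up.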

A Streamlit app is included for trying the API. Run it as follows:

# You need to install espeak for this.

poetry install
poetry run streamlit run ./examples/app.py

There are three major components here, all of which can be used in isolation.

`tts_middleware.normalizer`

For implicit normalization of SSML-marked text. No normalization-level tags are supported at the moment, so this only touches raw text.
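
A minimal sketch of what "only touches raw text" can look like: the input is split into tags and text, and a normalization function is applied to the text parts only. The names `normalize_text_only` and `collapse_spaces` are hypothetical, whitespace collapsing is just a placeholder for real normalization, and the regular expression assumes that attribute values never contain `>`.

```python
import re
from typing import Callable

# The capturing group makes re.split keep the tags in its output.
TAG_PATTERN = re.compile(r"(<[^>]+>)")


def normalize_text_only(ssml: str, normalize: Callable[[str], str]) -> str:
    """Apply `normalize` to text between tags and leave the tags untouched."""
    parts = TAG_PATTERN.split(ssml)
    # With a single capturing group, odd indices hold tags and even indices hold text.
    return "".join(
        part if index % 2 == 1 else normalize(part)
        for index, part in enumerate(parts)
    )


def collapse_spaces(text: str) -> str:
    # Placeholder normalizer: squeeze runs of whitespace into one space.
    return re.sub(r"\s+", " ", text)


result = normalize_text_only(
    "<prosody rate='1.3'>hello    world</prosody>  bye", collapse_spaces
)
```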

`tts_middleware.phonemizer`

For converting normalized and <phoneme>-marked text into phone symbols. This can be used independently for pre-processing training data too.
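
The sketch below shows one way to separate text that still needs phonemization from spans that already carry phone symbols. It is an illustration under assumptions, not the module's API: `split_phoneme_segments` is a hypothetical name, the phone string is read from the `ipa` attribute listed above, and only top-level elements are handled.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple


def split_phoneme_segments(ssml: str) -> List[Tuple[str, bool]]:
    """Return (content, is_phonemes) pairs in document order."""
    # Wrap in a root element so the fragment parses as well-formed XML.
    root = ET.fromstring(f"<speak>{ssml}</speak>")
    segments: List[Tuple[str, bool]] = []
    if root.text:
        segments.append((root.text, False))
    for child in root:
        if child.tag == "phoneme":
            # Already in phone symbols: take the attribute, skip the inner text.
            segments.append((child.get("ipa", ""), True))
        else:
            # Any other element: keep its text for later phonemization.
            segments.append(("".join(child.itertext()), False))
        if child.tail:
            segments.append((child.tail, False))
    return segments


segments = split_phoneme_segments("say <phoneme ipa='həˈloʊ'>hello</phoneme> now")
```

Segments flagged `False` would go through a grapheme-to-phoneme step, while segments flagged `True` can be passed through unchanged.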

`tts_middleware.audio`

For applying signal-level post-processing steps (mostly rate and volume attributes) on generated audio.
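
As an example of a signal-level step, a gain given in dB can be applied by scaling each sample by 10^(dB/20). The sketch below does this in pure Python on a list of floats in the range [-1, 1]. `apply_gain_db` is a hypothetical helper, and the hard clipping at the end is an assumption made for the example rather than what the module or sox does.

```python
from typing import List


def apply_gain_db(samples: List[float], gain_db: float) -> List[float]:
    """Scale samples by a gain in dB and hard-clip the result to [-1.0, 1.0]."""
    # Amplitude factor for a dB gain: +20 dB is a factor of 10.
    factor = 10.0 ** (gain_db / 20.0)
    return [max(-1.0, min(1.0, sample * factor)) for sample in samples]


louder = apply_gain_db([0.01, -0.02, 0.5], 20.0)
```

With a gain of 20 dB the first two samples are scaled by 10, and the third would exceed full scale, so it is clipped to 1.0.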
