HTTP microservice using Axum to generate TTS from an HTTP request.
- eSpeak - Local TTS, low quality. Returns WAV audio.
- gTTS - Cloud TTS, medium quality. Returns MP3 audio.
- gcloud - Google Cloud TTS, high quality. Returns OPUS audio. Requires a gCloud API key.
- Polly - Amazon Polly TTS, high quality. Returns OggVorbis audio. Requires Amazon Polly credentials.
- GET /tts?text={CONTENT}&lang={VOICE}&mode={MODE}&speaking_rate={SPEAKING_RATE}&max_length={MAX_LENGTH}&preferred_format={PREFERRED_AUDIO_FORMAT} - Returns the audio generated.
- GET /voices?mode={MODE}&raw={BOOL} - Returns the supported voices for the given mode, either as a JSON array of strings or, with the raw parameter set to true, in the raw format from the source.
- GET /modes - Returns the currently supported modes for TTS as a JSON array of strings.
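As a rough illustration of calling the /tts endpoint above, the following sketch builds a request URL using only the standard library. The helper names `percent_encode` and `build_tts_url` are hypothetical, and the sketch sends only text, lang, and mode; whether the remaining query parameters may be omitted is an assumption, not something this README states.

```rust
/// Percent-encode a query value, leaving only RFC 3986 unreserved
/// characters (letters, digits, '-', '_', '.', '~') untouched.
fn percent_encode(value: &str) -> String {
    let mut out = String::with_capacity(value.len());
    for byte in value.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'_' | b'.' | b'~' => {
                out.push(byte as char)
            }
            // Every other byte is written as %XX in uppercase hex.
            _ => out.push_str(&format!("%{:02X}", byte)),
        }
    }
    out
}

/// Build a GET /tts URL (hypothetical helper, not part of the service).
/// `base` is wherever the service is reachable, e.g. the BIND_ADDR host and port.
fn build_tts_url(base: &str, text: &str, voice: &str, mode: &str) -> String {
    format!(
        "{}/tts?text={}&lang={}&mode={}",
        base.trim_end_matches('/'),
        percent_encode(text),
        percent_encode(voice),
        percent_encode(mode)
    )
}
```

For example, `build_tts_url("http://localhost:3000", "Hello, world!", "en", "espeak")` encodes the comma, space, and exclamation mark in the text. The voice name `en` is a placeholder; the real names come from GET /voices.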
Non-200 responses will return a JSON object with the following keys:
- A numeric error code, one of:
  - 0 - Unknown error
  - 1 - Unknown voice
  - 2 - Max length exceeded
  - 3 - Speaking rate exceeded limits, see the display key for more information
  - 4 - AUTH_KEY has been set and the Authorization header doesn't match the key
- display - A human readable message describing the error
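A client might map the numeric error codes listed above onto an enum so they can be matched on explicitly. This is a minimal sketch: the type `TtsErrorCode`, its variant names, and `from_code` are hypothetical and do not come from the service's own source.

```rust
/// Client-side view of the documented error codes (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TtsErrorCode {
    Unknown,                 // 0 - Unknown error
    UnknownVoice,            // 1 - Unknown voice
    MaxLengthExceeded,       // 2 - Max length exceeded
    SpeakingRateOutOfLimits, // 3 - Speaking rate exceeded limits
    Unauthorized,            // 4 - Authorization header doesn't match AUTH_KEY
}

impl TtsErrorCode {
    /// Convert a numeric code from an error response.
    /// Returns None for codes this README does not document.
    fn from_code(code: u64) -> Option<Self> {
        match code {
            0 => Some(Self::Unknown),
            1 => Some(Self::UnknownVoice),
            2 => Some(Self::MaxLengthExceeded),
            3 => Some(Self::SpeakingRateOutOfLimits),
            4 => Some(Self::Unauthorized),
            _ => None,
        }
    }
}
```

Returning `Option` instead of falling back to `Unknown` keeps an undocumented code distinguishable from the documented "Unknown error" value.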
Environment variables, with defaults in parentheses:
- BIND_ADDR (0.0.0.0:3000) - The address to bind the web server to
- LOG_LEVEL (INFO) - The lowest log level to output to stdout
- AUTH_KEY - If set, this key must be sent in the Authorization header of each request
- IPV6_BLOCK - A block of IPv6 addresses, randomly selected for each gTTS request
- GOOGLE_APPLICATION_CREDENTIALS - The file path to the gCloud JSON
- AWS_REGION - The AWS region to use
- AWS_ACCESS_KEY_ID - The AWS access key ID
- AWS_SECRET_ACCESS_KEY - The AWS secret access key
- MODES (espeak) - A comma separated list of modes to support
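To illustrate how the BIND_ADDR and MODES settings and their documented defaults could be interpreted, here is a small standard-library sketch. The function names are hypothetical, and trimming whitespace and skipping empty entries in MODES is an assumption about lenient parsing, not documented behavior of the service.

```rust
use std::net::{AddrParseError, SocketAddr};

/// Split a MODES value into mode names, falling back to the documented
/// default "espeak" when the variable is unset.
/// Assumption: surrounding whitespace is trimmed and empty entries are dropped.
fn parse_modes(raw: Option<&str>) -> Vec<String> {
    raw.unwrap_or("espeak")
        .split(',')
        .map(str::trim)
        .filter(|mode| !mode.is_empty())
        .map(str::to_owned)
        .collect()
}

/// Parse a BIND_ADDR value, falling back to the documented default
/// "0.0.0.0:3000" when the variable is unset.
fn parse_bind_addr(raw: Option<&str>) -> Result<SocketAddr, AddrParseError> {
    raw.unwrap_or("0.0.0.0:3000").parse()
}
```

With the real environment, these could be called as `parse_modes(std::env::var("MODES").ok().as_deref())` and `parse_bind_addr(std::env::var("BIND_ADDR").ok().as_deref())`. Taking `Option<&str>` rather than reading the environment directly keeps both functions easy to test.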