

HTTP server front end to MaryTTS (text-to-speech engine).


The server sends WAV-formatted audio for the given text and voice inputs.

The following main query string parameters can be specified:

  • text: The text to turn into speech
  • locale (optional): The voice locale (e.g. "de" for German). It defaults to English ("en").
  • gender (optional): Allows you to select the voice for the locale based on its gender (i.e. 'male' or 'female').
  • voice (optional): If there are multiple MaryTTS voices installed for a language, you can choose the specific one with this.
  • style (optional): Allows you to specify a style for voices that support multiple styles.
  • effects (optional): Allows you to specify effects for the voice.
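
As a rough illustration of how these parameters combine into a request, the sketch below builds a query string in Python. The base URL `http://localhost:8080/` is a placeholder assumption (the real host, port, and path depend on your deployment), and `build_url` is a hypothetical helper, not part of marytts-server.

```python
from urllib.parse import urlencode

# Hypothetical base URL; replace it with wherever your server is deployed.
BASE_URL = "http://localhost:8080/"

OPTIONAL_PARAMS = ("locale", "gender", "voice", "style", "effects")


def build_url(text, **options):
    """Build a request URL from `text` plus any optional parameters."""
    params = {"text": text}
    # Only include the optional parameters that were actually given.
    for name in OPTIONAL_PARAMS:
        if options.get(name):
            params[name] = options[name]
    # urlencode percent-escapes the values so the text is safe in a URL.
    return BASE_URL + "?" + urlencode(params)


print(build_url("Guten Tag", locale="de", gender="female"))
```

Parameters that are left out are simply omitted from the query string, so the server falls back to its defaults (for example, the English locale).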

To better understand these options and how they are used by MaryTTS, you can install MaryTTS locally and experiment with it. You can also try the MaryTTS online demo, though as of this writing the demo is sometimes down.

To play the audio on a web page you can embed it with the <audio> tag, e.g.:

<audio autoplay>
<source src="" type="audio/wav">
</audio>

Securing requests

To prevent third parties from using your text-to-speech server for audio clips unrelated to your application, set the HMAC_SECRET environment variable. marytts-server will then require every request to be signed with that secret using HMAC-SHA256.

You can also specify the expires parameter (a Unix timestamp) with your requests so that a signed URL can only be played for a limited amount of time.

The string you need to sign is the concatenation of the various options (with blanks for the ones you didn't specify), i.e. text + locale + gender + voice + style + effects + expires.
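
The signing steps above can be sketched in Python as follows. This is a minimal sketch under two assumptions that the text here does not confirm: that the base64 secret is decoded to raw bytes before being used as the HMAC key, and that the signature is sent hex-encoded. Check both against the server code before relying on it. `sign_params` is a hypothetical helper name.

```python
import base64
import hashlib
import hmac
import time
from urllib.parse import urlencode

# Order in which the options are concatenated before signing.
FIELDS = ("text", "locale", "gender", "voice", "style", "effects", "expires")


def sign_params(params, secret_b64):
    """Return a copy of `params` with a `signature` entry added."""
    # Concatenate the options in order, using a blank for any that are absent.
    message = "".join(str(params.get(name, "")) for name in FIELDS)
    # Assumption: the base64 secret is decoded to raw bytes before use.
    key = base64.b64decode(secret_b64)
    digest = hmac.new(key, message.encode("utf-8"), hashlib.sha256)
    # Assumption: hex encoding; the server may expect a different encoding.
    return {**params, "signature": digest.hexdigest()}


# Example secret quoted in this README; never reuse it in a real deployment.
secret = "8Z2X4ZZyI0+2Ud35CaPk4bSe+rjjFiIQkMWjBYj2Q5M="
params = {
    "text": "Hello world",
    "locale": "en",
    "expires": int(time.time()) + 300,  # five minutes from now
}
signed = sign_params(params, secret)
print(urlencode(signed))
```

Because the signature covers every option plus the expiry, changing any parameter in the URL produces a different signature.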

The HMAC-SHA256 result should be sent in the signature parameter. So, for example, if your HMAC_SECRET were 8Z2X4ZZyI0+2Ud35CaPk4bSe+rjjFiIQkMWjBYj2Q5M= (it's stored in base64), then a signed URL would look like this:

Here's sample PHP code for signing a request and embedding it in a page:


MaryTTS server uses Jetty and it can easily be deployed to Heroku. Just clone
this repository and then push it to a Heroku remote. That will trigger the Maven
build which will fetch all of the necessary dependencies.

To build it locally with Maven, run `mvn install`, then start the server
with the following command:
java -cp target/classes:target/dependency/*:target/voices/lib/* maryserver.MaryServer

Potential enhancements

If you want to add more voices or languages, you can modify `pom.xml` to pull in 
more language jars and download more voices. 

A list of voices is shown in the 
file in the marytts project.

Another idea would be to support ogg/vorbis encoding using 

Utilizing caching (perhaps via Amazon S3 to store generated clips) could also be 


This marytts-http code itself is copyright David Raffensperger and MIT licensed.

MaryTTS is licensed under the GNU LGPL, but includes components with a variety
of licenses (see

The included German voice, `voice-bits3-hsmm`, is licensed under the
Creative Commons Attribution-NoDerivs 3.0 Unported license.




