Imusing/MarkovTTS

Novixx's MarkovTTS, a Markov Chain based Text-to-Speech synthesizer

This is a simple text-to-speech synthesizer that uses a Markov Chain to generate speech. It is written in Python and uses Gradio for the web interface. Use it to train a Markov Chain model, or to generate speech with an existing one.

Limitations

  • The model is not very good at generating speech when the training data contains multiple voices, so make sure your data has a single voice.
  • The model size increases exponentially, making it hard to train on large datasets.

Usage

  1. Install the requirements (gzip is part of the Python standard library and does not need to be installed with pip):
pip install gradio numpy librosa tqdm soundfile
  2. Run the script:
python tts.py

You can also let the Markov Chain speak on its own: the "Length of Extra Sequence" slider sets how much extra speech is generated. Each extra sequence is generated by the Markov Chain and concatenated to the previous sequence. The result is nonsense, but it's fun to play with.
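To give a feel for what "extra sequences" means, here is a minimal sketch of a first-order Markov chain extending a sequence of discrete symbols. It is not taken from tts.py: the function names `build_chain` and `extend` are hypothetical, and how the project actually turns audio into symbols is not described in this README.

```python
import random
from collections import defaultdict


def build_chain(sequence):
    """Map each symbol to the list of symbols observed right after it."""
    transitions = defaultdict(list)
    for current, nxt in zip(sequence, sequence[1:]):
        transitions[current].append(nxt)
    return transitions


def extend(chain, start, extra_length, rng=None):
    """Append up to `extra_length` symbols by sampling observed successors."""
    rng = rng or random.Random()
    out = [start]
    for _ in range(extra_length):
        choices = chain.get(out[-1])
        if not choices:
            # Dead end: this symbol was never followed by anything in training.
            break
        out.append(rng.choice(choices))
    return out


# Illustrative symbols only; a real model would use some encoding of audio.
chain = build_chain([1, 2, 3, 1, 2, 3])
print(extend(chain, start=1, extra_length=4))
```

Because each step depends only on the previous symbol, longer extra sequences drift away from any real word, which is consistent with the nonsense output described above.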

Dataset format

The dataset should be a directory containing WAV files and, optionally, TXT files with the same name as the corresponding WAV file. Each TXT file should contain the transcript of its WAV file. If a WAV file has no corresponding TXT file, the filename is used as the transcript. Each audio file must contain a single spoken word; the model will not work if a transcript is more than one word.
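The pairing rule above can be sketched as follows. This is an illustration under assumptions, not the project's loader: `load_dataset_index` is a hypothetical name, transcripts are assumed to be UTF-8, and only lowercase `.wav` extensions are matched.

```python
from pathlib import Path


def load_dataset_index(dataset_dir):
    """Return (wav_path, transcript) pairs for every WAV file in a directory."""
    pairs = []
    for wav_path in sorted(Path(dataset_dir).glob("*.wav")):
        txt_path = wav_path.with_suffix(".txt")
        if txt_path.is_file():
            # A TXT file with the same name holds the transcript.
            transcript = txt_path.read_text(encoding="utf-8").strip()
        else:
            # No TXT file: fall back to the filename without its extension.
            transcript = wav_path.stem
        if len(transcript.split()) != 1:
            raise ValueError(
                f"{wav_path.name}: transcript must be a single word, got {transcript!r}"
            )
        pairs.append((wav_path, transcript))
    return pairs
```

Running a check like this before training surfaces multi-word transcripts early, instead of letting them reach the model.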

Training

To train the model, run the script, navigate to the web interface, and click the "Train" tab. Select the dataset directory and click "Train Model".

License

This project is licensed under the GNU General Public License v3.0. See the LICENSE file for details.

Acknowledgements

This project was made possible by the following libraries:

Contributors

(Add your name here in your first PR, or not, if you don't want to be listed)
