Fast DeepPoniesTTS

This is a fork of the DeepPoniesFrontend project by dunky11 that greatly improves its performance by parallelizing the text-to-speech conversion process and adding GPU support.

The original project allows you to generate an audio file from text using a multispeaker TTS model. The available voices are: Adachi Tohru, Apple Bloom, Applejack, Barack Obama, Bart Simpson, Billie Eilish, Celestia, Chie Satonaka, Cozy Glow, Demoman, Discord, Donald Trump, Engineer, Fluttershy, Franklin, GLaDOS, Granny Smith, Heavy, Homer Simpson, Joe Biden, Joe Rogan, Kanji Tatsumi, Kanye West, Kim Kardashian, Kratos, Luna, Medic, Michael, Nameless Hero, Nanako Dojima, Naoto Shirogane, Pinkie Pie, Rainbow Dash, Rarity, Rise Kujikawa, Ryotaro Dojima, Scootaloo, Scout, Sniper, Soldier, Spike, SpongeBob, Spy, Starlight, Sunset Shimmer, Sweetie Belle, Teddie, Trevor, Trixie, Twilight Sparkle, and Yosuke Hanamura.

The fork speeds up text-to-speech conversion by splitting the work across threads. It also adds GPU support, which shortens processing times considerably for longer texts. For example:

Text Length	Type	Original	FastMod
1416	CPU	24 sec	14 sec
1416	GPU	Not supported	3.72 sec

To take advantage of these improvements, simply run the main.py file, which will convert your text into an audio file using the selected speaker and playback speed.
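
The thread-based parallelization mentioned above can be illustrated with a minimal sketch. It is not the code in main.py or api.py: the function names are hypothetical, the sentence-splitting rule is an assumption, and synthesize_chunk is a stand-in that returns dummy values instead of calling a real TTS model. The idea it shows is to cut the text into chunks, synthesize each chunk in a worker thread, and join the results in their original order.

```python
from concurrent.futures import ThreadPoolExecutor
import re


def split_sentences(text):
    """Split text into sentence-like chunks after '.', '!' or '?' (assumed rule)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def synthesize_chunk(chunk):
    """Hypothetical stand-in for a per-chunk TTS call.

    A real implementation would return audio samples; this placeholder
    returns a one-element list holding the chunk length.
    """
    return [float(len(chunk))]


def synthesize_parallel(text, max_workers=4):
    """Synthesize each chunk in a worker thread and join the pieces in order."""
    chunks = split_sentences(text)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in input order, so the audio pieces
        # line up with the original sentence order.
        pieces = list(pool.map(synthesize_chunk, chunks))
    audio = []
    for piece in pieces:
        audio.extend(piece)
    return audio


if __name__ == "__main__":
    print(synthesize_parallel("Hello there. How are you? Fine!"))
```

Threads help here only if the per-chunk work releases the interpreter lock, as native inference code typically does; for pure-Python work a process pool would be the usual alternative.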

Example Usage

To use this project, simply follow these steps:

Install the dependencies. On Linux, first run pip install pynini==2.1.4 manually. On Windows, a Conda environment is recommended, because Pynini may otherwise be impossible to install; in that case, run conda install -c conda-forge pynini manually. Then, on either system, install the remaining dependencies with pip install -r requirements.txt. Alternatively, install Conda and run conda env create --name ponyenv --file my_env.yml.
Edit and run the main.py file.
Wait for the audio file to be generated.
Enjoy your audio file!
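
Since Pynini is the dependency most likely to fail during installation, a quick check before running main.py can save time. The following is a small sketch using only the Python standard library; the helper name check_environment is hypothetical and is not part of this project.

```python
import importlib.util


def check_environment(modules=("pynini",)):
    """Return {module_name: bool} telling whether each top-level module is installed."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    for name, found in check_environment().items():
        print(f"{name}: {'found' if found else 'missing'}")
```

If pynini is reported as missing, repeat the installation step for your platform before editing and running main.py.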

Credits

This project is based on the DeepPoniesTTS project by dunky11. It is also based on this notebook. I would like to thank him for his hard work and for creating such amazing text-to-speech models.

License

This project is licensed under the MIT License. See the LICENSE file for more information. The license for the models, as specified in the original notebook, is: "If you use the synthesized voices of this notebook in your project, citing/naming is appreciated, but not required :)"

Nuked88/FastDeepPoniesTTS
