Add more VITS voices via piper #215

jpenguin · 2024-02-29T20:38:26Z

I know I previously mentioned edge-tts, which is cloud-based, fast and free, but under the GPL. I have recently been trying out https://github.com/rhasspy/piper/, which uses the VITS model and is under the MIT license.

danielw97 · 2024-02-29T20:53:59Z

epub2tts is currently extremely stable in features and functionality, at least for me, but just a +1 for this.
Piper might not sound quite as good as something like Coqui in most circumstances, but it is extremely fast and is under active development.
I'm not sure how easy it would be to implement, but it is something that would be nice to see in the future.

jpenguin · 2024-02-29T21:15:44Z

Yes, on my 24-thread 3900X, it takes around 10 hours to encode a 5-hour book with xtts (I wish it could run on OpenCL or OneAPI on my A770). VITS is decent (epub2tts already has 335 & 307 VITS models) and quick.
I don't know Python (just a little C), but there is a Python interface and it's under a compatible license.

aedocw · 2024-02-29T21:49:30Z

This seems like it would be worth adding. I will take a look at https://github.com/rhasspy/piper/?tab=readme-ov-file#running-in-python and see. It will probably be a while (a few weeks at least) before I've got time again for this, but it looks like it might not be very difficult.

aedocw · 2024-03-04T20:53:24Z

This is going to be a problem that needs to be resolved before incorporating piper: rhasspy/piper#395

I'm able to install on Linux without trouble, but my primary dev environment is macOS, and I would not want to introduce a dependency that means epub2tts cannot be installed on a Mac. I'll keep an eye on this, though, while I poke at a test integration branch.

aedocw · 2024-03-05T00:18:06Z

There's a first pass at this, still needs a lot of work around model name and speaker, and I have not tested anything with other languages, etc. BUT the branch https://github.com/aedocw/epub2tts/tree/add-piper has a very simple implementation that seems to work in a minimal sense. Adds --engine piper option, and in the future will support --model <piper model> and --speaker <piper speaker>. Might also have to support --language but that might be covered by your model choice, need to check.
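
For illustration, the flags described above could be parsed roughly like this. It is a minimal argparse sketch, not the code on the add-piper branch: the parser layout, the list of engine choices (limited to the engines named in this thread), and the default engine are all assumptions.

```python
import argparse


def build_parser():
    # Hypothetical parser; epub2tts's real argument handling may differ.
    parser = argparse.ArgumentParser(prog="epub2tts")
    parser.add_argument("sourcefile", help="path to the book to read")
    parser.add_argument(
        "--engine",
        choices=["xtts", "edge", "piper"],  # only the engines mentioned in this thread
        default="xtts",  # assumed default, for illustration only
    )
    # Model and speaker stay plain strings; how Piper interprets them
    # (name, path, numeric id) is still an open question in this thread.
    parser.add_argument("--model", help="engine-specific model name or path")
    parser.add_argument("--speaker", help="engine-specific speaker")
    return parser


args = build_parser().parse_args(
    ["book.epub", "--engine", "piper", "--model", "en_US-lessac-medium"]
)
print(args.engine, args.model, args.speaker)
```

Whether a separate --language flag is needed would depend on whether the chosen Piper model already fixes the language, as noted above.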

michaelachrisco · 2024-03-26T05:12:52Z

Awesome, this is great!

Just FYI, I got the following error using the branch with Ubuntu/PopOS:

Engine is Piper, model is /home/my_user/.local/piper/en_US-lessac-medium.onnx
Traceback (most recent call last):
  File "/home/my_user/venv/bin/epub2tts", line 33, in <module>
    sys.exit(load_entry_point('epub2tts==2.4.0', 'console_scripts', 'epub2tts')())
  File "/home/my_user/venv/lib/python3.10/site-packages/epub2tts.py", line 837, in main
    mybook.read_book(
  File "/home/my_user/venv/lib/python3.10/site-packages/epub2tts.py", line 486, in read_book
    self.voice = PiperVoice.load(self.model_name)
  File "/home/my_user/venv/lib/python3.10/site-packages/piper/voice.py", line 34, in load
    with open(config_path, "r", encoding="utf-8") as config_file:
FileNotFoundError: [Errno 2] No such file or directory: '/home/my_user/.local/piper/en_US-lessac-medium.onnx.json'

In order to fix, I manually pulled in the correct piper model via:

echo 'Welcome to the world of speech synthesis!' | piper   --model en_US-lessac-medium   --output_file welcome.wav

and then added the folder:

mkdir /home/my_user/.local/piper
cd /home/my_user/.local/piper
cp '/home/my_user/epub2tts/en_US-lessac-medium.onnx' .
cp '/home/my_user/epub2tts/en_US-lessac-medium.onnx.json' .

Obviously this was a quick fix and should be done differently, but it worked as a quick and dirty solution. Piper sounds great and I'll probably use this branch as a starting point with a couple of Creative Commons books (more of a proof of concept than anything, comparing them all). Piper is very quick compared to the others, but sounds a bit more robotic, which is fine to me.
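
One way the failure above could be made friendlier is to check for both voice files before loading and report what is missing. The following is only a sketch under assumptions: the helper name `locate_piper_voice` is hypothetical, and the `~/.local/piper` default and the `<model>.onnx.json` config name are taken from the traceback rather than from any documented behaviour.

```python
from pathlib import Path


def locate_piper_voice(model_name, voice_dir=None):
    """Return (model_path, config_path), or raise with a hint if either is missing."""
    # ~/.local/piper is the directory seen in the traceback; treat it as an assumption.
    base = Path(voice_dir) if voice_dir is not None else Path.home() / ".local" / "piper"
    model_path = base / f"{model_name}.onnx"
    # The traceback shows the config being opened at "<model path>.json".
    config_path = base / f"{model_name}.onnx.json"
    missing = [str(p) for p in (model_path, config_path) if not p.is_file()]
    if missing:
        raise FileNotFoundError(
            "Piper voice files not found: " + ", ".join(missing)
            + f". Copy {model_path.name} and {config_path.name} into {base}."
        )
    return model_path, config_path
```

A caller could run this check first and pass the returned model path on to the loader, so a missing config file produces an actionable message instead of a bare traceback.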

Here is the sample: https://github.com/michaelachrisco/epub2tts/blob/add-piper/sample-piper.m4b

jpenguin · 2024-03-27T01:37:39Z

Tested on a Debian testing VM with pipx. Same as Michael: it doesn't download the model, but it works.

ln -s ~/.local/share/pipx/venvs/epub2tts/bin/piper ~/.local/bin/
echo 'Downloading voice' | piper --model en_US-lessac-medium --output-raw | aplay -r 22050 -f S16_LE -t raw -
mkdir ~/.local/piper
cp './en_US-lessac-medium.onnx' ~/.local/piper
cp './en_US-lessac-medium.onnx.json' ~/.local/piper
epub2tts ./sample.txt --sayparts --engine piper works after that
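
The copy steps in the workaround above could also be scripted. This is a small sketch, assuming the voice files were downloaded into the current directory as in both reports; the function name `install_piper_voice` and its parameters are hypothetical, and `~/.local/piper` is simply the directory the branch looked in.

```python
import shutil
from pathlib import Path


def install_piper_voice(model_name, source_dir=".", target_dir=None):
    """Copy <model>.onnx and <model>.onnx.json from source_dir into target_dir."""
    src = Path(source_dir)
    # Default target mirrors the directory used in the workaround (an assumption).
    dest = Path(target_dir) if target_dir is not None else Path.home() / ".local" / "piper"
    dest.mkdir(parents=True, exist_ok=True)  # safe if the directory already exists
    copied = []
    for suffix in (".onnx", ".onnx.json"):
        source_file = src / f"{model_name}{suffix}"
        # shutil.copy2 raises FileNotFoundError if the source file is absent.
        copied.append(Path(shutil.copy2(source_file, dest)))
    return copied
```

For example, `install_piper_voice("en_US-lessac-medium")` run from the directory where piper saved the voice would place both files in `~/.local/piper`.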

aedocw · 2024-03-27T02:31:43Z

Thanks to both of you for sharing this. Once the issues with installing on Apple silicon are resolved, I'll do some more work to clean this up and make it usable.

In the meantime I suggest you check out the --engine edge option. It's not super fast, but it doesn't use local CPU so it's pretty painless to leave running in screen/tmux, and the quality is better than almost everything else. Arguably XTTS still sounds better, but the occasional repeats and gibberish get annoying (to me) after a while.

danielw97 · 2024-05-03T20:30:39Z

Hi again,
I'm just curious if there is any update on this at all?
Although it's somewhat robotic, Piper is extremely fast even on CPU, which is a plus for certain applications.
I picked up an M1 Mac recently, so I'm happy to test if that would help and come back with some feedback.
If you've moved on to epub2tts-edge I also understand, and really appreciate the work you've done on this utility.
It is really the best program I've found for converting books into good-quality TTS, as the rest use either OpenAI, Azure or another online text-to-speech service.
Thanks as always.

aedocw · 2024-05-04T21:12:03Z

It looks like there are still open issues around installing on Mac (rhasspy/piper#395). Feel free, though, to try it out and see whether the issues are now resolved. I don't think I'll have much time to play with this over the next few weeks (getting busy these days!), but update here if it is actually working OK on Mac; then I could try to clean up that piper branch and add it to main.

danielw97 · 2024-05-06T14:36:23Z

Hi,
I've done some testing, and unfortunately it looks as though Piper still doesn't install on ARM-based Macs.
I'll keep an eye on the open pull request and test it if and when it's merged.

aedocw self-assigned this Feb 29, 2024

aedocw added the enhancement (New feature or request) label Feb 29, 2024
