enhancement: add StyleTTS2 support #105

Open
danielw97 opened this issue Dec 11, 2023 · 7 comments

Comments

@danielw97

You may well be aware of this already, but there is a fairly recent project called StyleTTS2 that raises the bar even further for open-source, local TTS generation.
No pressure, of course, but it would be great to have this integrated at some point in the future.
I've tested the demo on a CPU and it runs fairly quickly.
As of now, there's an HTTP API and also Python integration at this repo:
https://github.com/NeuralVox/StyleTTS2

@aedocw
Owner

aedocw commented Dec 11, 2023

Ah, interesting. I had been watching StyleTTS2's progress a while back, but I haven't looked at it in the last month or so. I'll check it out and try to play with it some; it would be neat if it's even better than XTTSv2!

@danielw97
Author

danielw97 commented Dec 11, 2023

Great, I'd be interested in how you get on with that.
What stands out to me is not only the naturalness but also the speed: it is quite fast on the CPU I tested it with.

@danielw97
Author

I'm keeping a close eye on the StyleTTS2 project, and just wanted to pass along that a pip package has been released strictly for inference.
Of course, as this is new, things change quickly, but I wanted to let you know.
https://github.com/sidharthrajaram/StyleTTS2

@aedocw
Owner

aedocw commented Dec 16, 2023 via email

@rsxdalv

rsxdalv commented Jan 16, 2024

I just finished examining StyleTTS2. I think if we accumulate a bit more we might be able to solve the issue with the GPL phonemizer dependency, or at least feel bad together.
rsxdalv/tts-generation-webui#212
yl4579/StyleTTS2#91

@aedocw
Owner

aedocw commented Jan 16, 2024

I get it. I had not noticed the phonemizer/GPL issue when I poked around. Now, though, the GPL fork makes a whole lot more sense to me!

Honestly, I think if/when StyleTTS2 is sounding good and worth using, we can just use the NeuralVox fork and re-license this to GPL, assuming the few contributors we've had will agree to that. If they won't, I can pull out any directly contributed code, write it fresh, and make a GPL fork of epub2tts.

@aedocw
Owner

aedocw commented Jan 16, 2024


3 participants