enhancement: add StyleTTS2 support #105

danielw97 · 2023-12-11T17:25:53Z

You may well be aware of this already, but there is a fairly recent project called StyleTTS2 that raises the bar even further for open-source, local TTS generation.
No pressure of course, but it would be great to have this integrated at some point in the future.
I've tested the demo on a CPU and it runs fairly quickly.
As of now there's an HTTP API and also Python integration at this repo:
https://github.com/NeuralVox/StyleTTS2

aedocw · 2023-12-11T19:29:38Z

Ah interesting, I had been watching StyleTTS2 progress a while back but I haven't looked at it in the last month or so. I'll check it out and try to play with it some. It would be neat if it's even better than XTTSv2!

danielw97 · 2023-12-11T19:31:43Z

Great, I'd be interested to hear how you get on with that.
What stands out to me is not only the naturalness but also the speed: it ran quite fast on the CPU I tested it with.

danielw97 · 2023-12-16T01:32:28Z

I'm keeping a close eye on the StyleTTS2 project, and just wanted to pass along that a pip package has been released strictly for inference.
Of course, as this is new, things change quickly, but I wanted to let you know.
https://github.com/sidharthrajaram/StyleTTS2

aedocw · 2023-12-16T03:02:28Z

Thanks! I played with it yesterday and it’s impressive. Still a little noisy with some stuff. Looking forward to when fine tuning is easy too. Definitely one to watch closely though.


rsxdalv · 2024-01-16T20:05:32Z

I just finished examining StyleTTS2. I think if we accumulate a bit more we might be able to solve the issue with GPL phonemizer dependency, or at least feel bad together.
rsxdalv/tts-generation-webui#212
yl4579/StyleTTS2#91

aedocw · 2024-01-16T20:40:41Z

I get it, I had not noticed the phonemizer/GPL issue when I poked around. Now though the GPL fork makes a whole lot more sense to me!

Honestly I think if/when StyleTTS2 is sounding good and worth using, we can just use the NeuralVox fork and re-license this to GPL, assuming the few contributors we've had will agree to that. If they won't, I can pull out any directly contributed code and write it fresh, and make a GPL fork of epub2tts.

aedocw · 2024-01-16T21:16:02Z

Discussion for license change
