Hoping for your result #1

Closed
lucasjinreal opened this issue Sep 27, 2021 · 13 comments

Comments

@lucasjinreal

Looking forward to your results from training VITS on Chinese.

@MaxMax2016
Collaborator

I've updated it.

@lucasjinreal
Author

Thanks, I'll try training it. Have you compared the speed of VITS and Tacotron 2? Which do you think is better in terms of speed and quality?

@MaxMax2016
Collaborator

Of course, VITS is better. It is amazing.

@lucasjinreal
Author

How about inference speed?

@lucasjinreal
Author

Do you think it is worth deploying, or at least reasonable to deploy? If so, I can help port it to TensorRT and make a C++ inference demo. TVM is also an option if the speed is good.

@MaxMax2016
Collaborator

The vits_样本.wav sample is about 100 seconds long, and inference takes about 800 ms on a 1080 GPU. If you need it to be faster, you can change the decoder from HiFi-GAN to MB-MelGAN.
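
The figures above can be restated as a real-time factor (inference time divided by audio duration). The snippet below is only an illustrative sketch using the two numbers quoted in this comment; the function name `real_time_factor` is made up for this example and is not part of the repository.

```python
def real_time_factor(inference_seconds: float, audio_seconds: float) -> float:
    """Return inference time divided by audio duration (lower is faster)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return inference_seconds / audio_seconds


# Numbers quoted in the comment: about 0.8 s of inference for about 100 s of audio.
rtf = real_time_factor(0.8, 100.0)
print(f"RTF = {rtf:.3f}")
```

A value well below 1 means the model synthesizes audio faster than it takes to play it back.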

@MaxMax2016
Collaborator

The techniques in VITS: VAE, normalizing flow, GAN, MAS (monotonic alignment search), multi-task training, adaptability to long sentences, etc. I think it is a general framework for TTS in the future.

@lucasjinreal
Author

@dtx525942103 At 800 ms for 100 s of audio, a 20-30 s clip would need only about 200 ms, which is tolerable. With TensorRT acceleration it could be about 3x faster on average. It seems it could even run on low-end compute devices such as a Raspberry Pi.
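
The estimate above can be sketched as a simple linear extrapolation. This assumes latency scales proportionally with audio length, which is a simplification and was not measured in this thread; the `speedup` parameter is a hypothetical factor (the 3x TensorRT figure is the commenter's expectation, not a benchmark), and `estimate_latency_ms` is a name made up for this example.

```python
def estimate_latency_ms(
    audio_seconds: float,
    ref_audio_seconds: float = 100.0,  # reference clip length quoted above
    ref_latency_ms: float = 800.0,     # reference latency quoted above
    speedup: float = 1.0,              # hypothetical acceleration factor
) -> float:
    """Linearly scale the reference latency to another clip length."""
    if ref_audio_seconds <= 0 or speedup <= 0:
        raise ValueError("ref_audio_seconds and speedup must be positive")
    return ref_latency_ms * (audio_seconds / ref_audio_seconds) / speedup


for seconds in (20.0, 25.0, 30.0):
    baseline = estimate_latency_ms(seconds)
    accelerated = estimate_latency_ms(seconds, speedup=3.0)  # assumed 3x
    print(f"{seconds:.0f} s audio: ~{baseline:.0f} ms, ~{accelerated:.0f} ms at 3x")
```

Fixed overheads such as model loading or kernel launch are ignored here, so short clips would likely take longer than this estimate suggests.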

@MaxMax2016
Collaborator

@jinfagang Is your wav 16 kHz?
train.log
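
One way to answer the sample-rate question is to read the WAV header with Python's standard `wave` module. This is a minimal sketch, not code from the repository: `wav_sample_rate` and `write_silence` are hypothetical helper names, the demo file is generated on the fly, and `wave` only handles uncompressed PCM WAV files.

```python
import os
import tempfile
import wave


def wav_sample_rate(path: str) -> int:
    """Return the sample rate in Hz stored in a PCM WAV file's header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def write_silence(path: str, sample_rate: int, seconds: float = 1.0) -> None:
    """Write a mono 16-bit WAV of silence, used here only as demo input."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(b"\x00\x00" * int(sample_rate * seconds))


# Demo: create a 16 kHz file in a temporary directory and read its rate back.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.wav")
    write_silence(demo_path, 16000)
    demo_rate = wav_sample_rate(demo_path)

print(demo_rate)
```

In practice you would call `wav_sample_rate` on your own training files and compare the result with the sampling rate your training configuration expects.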

@MaxMax2016
Collaborator

You can first train on LJSpeech using the official VITS.

@HallidayReadyOne

OK, I will try. Thank you.

@HallidayReadyOne

I think I have solved the problem. It was caused by the compilation of monotonic align. Thanks for the help.

@MaxMax2016
Collaborator

Mm-hm, OK.


3 participants