
Is there really speed up? #50

Closed

C00reNUT opened this issue Mar 16, 2023 · 4 comments

Comments

@C00reNUT

Hello,
thank you for trying to optimize the tortoise library. I am trying to compare the speed between the two implementations, but so far I am getting very similar results in both quality and speed. I am using an NVIDIA RTX 3060 with 12 GB of VRAM.

Running the script below takes about 2m 14s.

python scripts/tortoise_tts.py -p ultra_fast -O results/best_short_15/ultra_fast -v best_short_15 <text_short.txt --sampler dpm++2m --diffusion_iterations 30 --vocoder Univnet


Using the same settings but with the --original_tortoise flag takes about 2m 2s.

python scripts/tortoise_tts.py -p ultra_fast -O results/best_short_15/ultra_fast_original -v best_short_15 <text_short.txt --original_tortoise


Am I missing something? Are there some flags I should add to speed up generation?

@darkconsole

darkconsole commented Mar 16, 2023

For me there was exactly 0% speed change, no matter how closely I followed the readme verbatim or how much I experimented on my own. Also, a lot of voices that were fine in the old one didn't work on this. The ones that did, though, came out better/clearer, so I just kind of shrugged that off and treated it as an alternate choice.

(RTX 3060 here)

@152334H
Owner

152334H commented Mar 17, 2023

with the --original_tortoise flag

The --original_tortoise flag still uses the kv_cache speedup features. Try the actual original tortoise repo; it will be much slower.
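For a like-for-like check of the kv_cache effect, here is a minimal timing sketch. It assumes the TextToSpeech constructor exposes a kv_cache flag (as in this fork; the exact keyword may differ by version) and that the best_short_15 voice from above sits in the voices directory:

import time
from tortoise.api import TextToSpeech
from tortoise.utils.audio import load_voice

text = "A short benchmark sentence for timing."
voice_samples, conditioning_latents = load_voice('best_short_15')  # voice used above; any installed voice works

for use_cache in (False, True):
    tts = TextToSpeech(kv_cache=use_cache)  # assumption: the constructor exposes a kv_cache flag
    start = time.time()
    tts.tts_with_preset(text, voice_samples=voice_samples,
                        conditioning_latents=conditioning_latents,
                        preset='ultra_fast')
    print(f"kv_cache={use_cache}: {time.time() - start:.1f}s")

The kv_cache=False run should be noticeably slower; that is the gap the --original_tortoise flag does not reproduce.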

152334H closed this as completed Mar 17, 2023
@rikabi89

Check that your GPU VRAM is being utilized. This could be a PyTorch issue.
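A quick way to verify that from Python with standard torch.cuda calls (or just watch nvidia-smi while a generation runs):

import torch

# confirm PyTorch can see the GPU at all
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
    # run this after (or during) a generation to see how much VRAM is actually in use
    print(f"Allocated: {torch.cuda.memory_allocated() / 1e9:.2f} GB")
    print(f"Peak allocated: {torch.cuda.max_memory_allocated() / 1e9:.2f} GB")

If is_available() returns False, generation is running on the CPU and any timing comparison between the two repos is meaningless.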

@C00reNUT
Author

Check that your GPU VRAM is being utilized. This could be a PyTorch issue.

Yeah, it could be part of the problem. But I tried the original repo, and the average generation time using the same sampling method and vocoder is about the same.

Anyway, I will try PyTorch 2.0 (https://pytorch.org/blog/accelerated-diffusers-pt-20/); there seem to be interesting speedup possibilities out of the box.
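For reference, the out-of-the-box pieces that blog post leans on are scaled_dot_product_attention and torch.compile. Whether they help here depends on how this repo's attention layers are written, so treat this as an illustration rather than a drop-in patch (requires a CUDA build of PyTorch 2.0):

import torch
import torch.nn.functional as F

# toy attention inputs with shape (batch, heads, sequence, head_dim)
q = torch.randn(1, 8, 128, 64, device='cuda', dtype=torch.float16)
k = torch.randn(1, 8, 128, 64, device='cuda', dtype=torch.float16)
v = torch.randn(1, 8, 128, 64, device='cuda', dtype=torch.float16)

# PyTorch 2.0 dispatches to a fused kernel (e.g. FlashAttention) when one is applicable
out = F.scaled_dot_product_attention(q, k, v)

# an existing module can also be wrapped for graph-level optimizations:
# model = torch.compile(model)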

Also, https://github.com/nebuly-ai/nebullvm looks promising to me.
