
Boost Inference speed #426

Open
arnavmehta7 opened this issue Dec 27, 2022 · 8 comments
Labels: question (Further information is requested)

Comments

@arnavmehta7

❓ Questions

Hi, I am currently using a GPU for inference on audio files, but it takes a long time for longer files of around 20 minutes. GPU VRAM is not an issue, so I was wondering whether there is any way to speed up inference on the GPU.

Thanks

@arnavmehta7 added the question (Further information is requested) label on Dec 27, 2022
@CarlGao4
Contributor

What is your GPU?

@arnavmehta7
Author

It's all in the cloud: an Nvidia A100.

@CarlGao4
Contributor

You can use the --segment argument. Increasing the segment length can make separation faster, at the cost of more memory.
Besides, you can compare the speed against a CPU-only run. For v3 models, processing takes about 0.8 times the audio length for each single model (or each model inside a BagOfModels). For v4 models, it is about 3 times the audio length. (These figures are from tests on my laptop.) On a GPU it should be roughly 20~50 times faster than that; if your speed is already in that range, you have reached the limit of the GPU.
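For reference, here is a minimal sketch of passing those flags from Python, assuming the demucs.separate.main entry point documented in the README; the file name and the segment value are placeholders, and some models cap how large the segment may be:

```python
# Minimal sketch; "my_track.wav" and the segment value are placeholders.
# Assumes the documented demucs.separate.main([...]) entry point.
import demucs.separate

demucs.separate.main([
    "-n", "mdx_extra",   # a v3 bag of models
    "-d", "cuda",        # run on the GPU
    "--segment", "60",   # larger segments are faster but use more memory
    "my_track.wav",
])
```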

@arnavmehta7
Author

I find the v4 models faster than v3; they were about 2-3 minutes quicker on the larger sample.

@arnavmehta7
Author

@CarlGao4 So what is the default value of --segment?

@CarlGao4
Contributor

I find the v4 models faster than v3; they were about 2-3 minutes quicker on the larger sample.

This is because the default v3 models (mdx, mdx_q, mdx_extra, mdx_extra_q) are all bags of models containing 4 single models each, which makes them about 4 times slower than a single model. The default model for v4 (htdemucs) is a single model, so it is faster. If you use htdemucs_ft instead, it will take about 2 times longer than the v3 models.
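To make the difference concrete, a hedged sketch (the file name is a placeholder) comparing a single-model run with the bag-of-models variants mentioned above:

```python
# Sketch comparing model choices; "song.wav" is a placeholder file name.
import demucs.separate

# v4 default: a single model, the fastest of these options
demucs.separate.main(["-n", "htdemucs", "-d", "cuda", "song.wav"])

# v4 fine-tuned: a bag of 4 models, so roughly 4x the work of htdemucs
demucs.separate.main(["-n", "htdemucs_ft", "-d", "cuda", "song.wav"])

# v3 bag (mdx_extra): also runs 4 sub-models per track
demucs.separate.main(["-n", "mdx_extra", "-d", "cuda", "song.wav"])
```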

@CarlGao4
Contributor

@CarlGao4 So what is the default value of --segment?

It depends on the model. For example, all v3 models use 44.
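If you would rather check the shipped value than assume it, here is a rough inspection sketch; the attribute names are an assumption and may differ between Demucs versions:

```python
# Rough sketch: print each sub-model's default segment length.
# Attribute names may differ between Demucs versions.
from demucs.pretrained import get_model
from demucs.apply import BagOfModels

model = get_model("mdx_extra")  # a v3 bag of 4 sub-models
sub_models = model.models if isinstance(model, BagOfModels) else [model]
for m in sub_models:
    print(type(m).__name__, getattr(m, "segment", "unknown"))
```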

@matiaszanolli

Hey there, what is the average inference speed you're getting with your A100?

I've been running some performance tests on an A100 (among other GPUs) and found the inference speed stays at roughly 48 seconds of audio per second with the htdemucs (v4) model (the htdemucs_ft model is about 4 times slower, since it runs a sequence of 4 models instead of a single one). Are you getting close to those speeds?
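One way to reproduce that kind of figure is to time a full run against a track of known duration; a rough sketch below, where the file name and its length are placeholders:

```python
# Rough throughput sketch: seconds of audio processed per wall-clock second.
# "long_track.wav" and its 20-minute duration are placeholders.
import time
import demucs.separate

audio_seconds = 20 * 60
start = time.perf_counter()
demucs.separate.main(["-n", "htdemucs", "-d", "cuda", "long_track.wav"])
elapsed = time.perf_counter() - start
print(f"~{audio_seconds / elapsed:.1f} seconds of audio per second")
```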
