Inference on long files #44

Closed
databill86 opened this issue Mar 15, 2023 · 8 comments

@databill86

Hello,

Thank you for this great library!
Is there any way to chunk the initial audio into shorter samples, say 50 seconds each, run inference on those, and then combine the results into a final transcription?
I came across this article and I wonder if it's possible to get it working here.
Any idea if this is possible?

@guillaumekln
Contributor

Hi,

The Whisper transcription loop already handles long files by sliding a 30-second window over the audio while carrying over context between windows, so you don't need to do anything special to transcribe long files.
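
For reference, a minimal sketch of transcribing a long file (the model size and file name below are just placeholders):

from faster_whisper import WhisperModel

# Any model size works; transcribe() walks the whole file with the sliding window internally.
model = WhisperModel("large-v2", device="cuda")

# transcribe() returns a lazy generator of segments plus metadata about the audio.
segments, info = model.transcribe("long_recording.wav")

for segment in segments:
    print("[%.2fs -> %.2fs] %s" % (segment.start, segment.end, segment.text))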

@databill86
Author

databill86 commented Mar 15, 2023

Thank you. So is it normal that transcription takes considerably longer for long files?

@guillaumekln
Contributor

Yes, the transcription time depends on the audio file duration. Long files will take longer.

@databill86
Author

Sorry, I closed and reopened the issue. I just have one last question about longer files:
if we use the GPU as the device, is there any way to avoid OOM for these longer files?

@databill86 databill86 reopened this Mar 16, 2023
@guillaumekln
Contributor

guillaumekln commented Mar 16, 2023

What is your GPU and what model size are you running?

@databill86
Author

It's an NVIDIA GeForce GTX 1070 Ti (8 GB). I was running the large-v2 model on an 18-minute file, but even with a 4-minute file I get an OOM error.

@guillaumekln
Contributor

Try running the model with 8-bit quantization:

model = WhisperModel(model_path, device="cuda", compute_type="int8")
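
For completeness, a minimal end-to-end sketch with the quantized model (the model name and file name are placeholders):

from faster_whisper import WhisperModel

# int8 quantization cuts GPU memory use substantially compared to the default
# float computation, which is what avoids the OOM on an 8 GB card.
model = WhisperModel("large-v2", device="cuda", compute_type="int8")

segments, info = model.transcribe("18min_audio.wav")
text = " ".join(segment.text.strip() for segment in segments)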

@databill86
Author

Wow, just like that! It's a lot faster, and no OOM!
Thank you!
I will close the issue now for good :)
