
wav2vec OOM while doing inference #3359

Closed
abhinavsp0730 opened this issue Mar 16, 2021 · 9 comments

abhinavsp0730 commented Mar 16, 2021

❓ Questions and Help

Before asking:

  1. search the issues. yes
  2. search the docs. yes

What is your question?

When I try to run inference on an audio clip of around 52 seconds, I get this error: RuntimeError: [enforce fail at CPUAllocator.cpp:65] . DefaultCPUAllocator: can't allocate memory: you tried to allocate 326730288 bytes. Error code 12 (Cannot allocate memory). So a single allocation during inference is requesting almost 326.7 MB. Yet when I run
free -h I have this much free memory:

sh-4.2$ free -h
             total       used       free     shared    buffers     cached
Mem:          7.7G       977M       6.7G         0B        90M       384M
-/+ buffers/cache:       502M       7.2G
Swap:         3.0G       290M       2.7G
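
For scale, a back-of-envelope sketch of the memory arithmetic (illustrative only, assuming float32 samples at 16 kHz):

seconds = 52
sample_rate = 16_000
bytes_per_float32 = 4
n_samples = seconds * sample_rate                 # 832,000 samples
print(n_samples * bytes_per_float32 / 1e6, "MB")  # ~3.3 MB for the raw input
# The failing 326,730,288-byte request is roughly 100x larger than the
# input tensor itself, which points at intermediate activations (and,
# with gradient tracking enabled, the autograd graph keeping them alive).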

Would you please help me with this issue, @patrickvonplaten?

Code:

import soundfile as sf
import librosa
import torch
from transformers import Wav2Vec2ForCTC, Wav2Vec2Tokenizer

tokenizer = Wav2Vec2Tokenizer.from_pretrained("facebook/wav2vec2-base-960h")
model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")

# filename is the path to the ~52 s sample WAV linked below
input_audio, _ = librosa.load(filename, sr=16000)
input_values = tokenizer(input_audio, return_tensors="pt").input_values
logits = model(input_values).logits
predicted_ids = torch.argmax(logits, dim=-1)
text = tokenizer.batch_decode(predicted_ids)[0]

Sample audio file in WAV format:

https://github.com/abhinavsp0730/video-to-text-ap/blob/main/sample_audio_1.wav

@olafthiele

Seems to be an issue with the Hugging Face implementation. I just ran your example with an XLSR model on a CPU server and didn't have any problems at all, except for the output, as it is a German model and not very good with Indian English :-)

@abhinavsp0730
Author

Hi, thanks for the help @olafthiele. Would you please tell me, did you use Hugging Face for the XLSR model?

@olafthiele

Ah, sorry, should have made that clearer. We don't use Huggingface's implementation. Just pure wav2vec 2.0. So I guess it is something within their code ...

@abhinavsp0730
Author

@olafthiele, would you please share the snippet for doing inference using fairseq's wav2vec 2.0? Thanks.

@olafthiele

Inference is built upon this great repo by @mailong25. Don't know whether that is compatible though.

@patrickvonplaten
Contributor

Hey @olafthiele - make sure to wrap your code in a with torch.no_grad(): block to save memory. This snippet should work:

import soundfile as sf
import librosa
import torch
from transformers import Wav2Vec2ForCTC, Wav2Vec2Tokenizer

tokenizer = Wav2Vec2Tokenizer.from_pretrained("facebook/wav2vec2-base-960h")
model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")

input_audio, _ = librosa.load(filename, sr=16000)
input_values = tokenizer(input_audio, return_tensors="pt").input_values

# no_grad disables autograd bookkeeping, so intermediate activations
# are freed as the forward pass runs instead of being kept for backprop
with torch.no_grad():
    logits = model(input_values).logits

predicted_ids = torch.argmax(logits, dim=-1)
text = tokenizer.batch_decode(predicted_ids)[0]
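
For even longer recordings, a further option (not part of the fix above, just a sketch; transcribe_chunked is a hypothetical helper, not a transformers API) is to bound peak memory by transcribing the audio in fixed-size chunks:

import torch

def transcribe_chunked(model, tokenizer, audio, sr=16000, chunk_s=20):
    # Process the audio chunk by chunk so only one chunk's activations are
    # alive at a time. Naive cuts can split words at chunk boundaries;
    # overlapping windows or silence-based splitting would be more robust.
    texts = []
    step = chunk_s * sr
    for start in range(0, len(audio), step):
        chunk = audio[start:start + step]
        input_values = tokenizer(chunk, return_tensors="pt").input_values
        with torch.no_grad():
            logits = model(input_values).logits
        predicted_ids = torch.argmax(logits, dim=-1)
        texts.append(tokenizer.batch_decode(predicted_ids)[0])
    return " ".join(texts)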

@patrickvonplaten
Contributor

Here's a Colab notebook to run it successfully: https://colab.research.google.com/drive/1m54QPo07ptp_GRdTLztuc0OdCHk7j28C?usp=sharing

Actually, I forgot to put the torch.no_grad() in the description -> I'll fix that!

@olafthiele

Thanks @patrickvonplaten , that should help @abhinavsp0730

@abhinavsp0730
Author

Thanks, @patrickvonplaten, for the help. I'm closing this issue as it has been resolved.
