wave2vec OOM while doing inference #3359
Comments
Seems to be an issue with the Hugging Face implementation. I just ran your example on an XLSR model on a CPU server and didn't have any problems at all. Except for the output, as it is a German model and not very good with Indian English :-)
Hi, thanks for the help @olafthiele. Would you please tell me, did you use Hugging Face for the XLSR model?
Ah, sorry, should have made that clearer. We don't use Hugging Face's implementation, just pure wav2vec 2.0. So I guess it is something within their code ...
@olafthiele would you please share the snippet for doing the inference using fairseq's wav2vec 2.0? Thanks.
Inference is built upon this great repo by @mailong25. Don't know whether that is compatible, though.
Hey @olafthiele - make sure to wrap your code into a `torch.no_grad()` context:

```python
import librosa
import torch
from transformers import Wav2Vec2ForCTC, Wav2Vec2Tokenizer

tokenizer = Wav2Vec2Tokenizer.from_pretrained("facebook/wav2vec2-base-960h")
model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")

# resample to the 16 kHz rate the model was trained on
input_audio, _ = librosa.load(filename, sr=16000)

input_values = tokenizer(input_audio, return_tensors="pt").input_values

# no_grad() stops PyTorch from keeping activations for backprop,
# which sharply reduces peak memory during inference
with torch.no_grad():
    logits = model(input_values).logits

predicted_ids = torch.argmax(logits, dim=-1)
text = tokenizer.batch_decode(predicted_ids)[0]
```
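For clips as long as the 52-second one in this issue, the encoder's attention matrices grow quadratically with length, so even inference under `no_grad()` can run out of CPU memory. A common workaround (a sketch of mine, not from this thread; `transcribe_chunked` is a hypothetical helper name) is to transcribe fixed-length chunks and join the results:

```python
import torch

def transcribe_chunked(model, tokenizer, audio, sr=16000, chunk_s=10):
    # Transcribe a long 1-D waveform in fixed-length chunks so the
    # encoder's attention matrices stay small. Chunks are decoded
    # independently, so a word straddling a boundary may be split.
    chunk_len = int(chunk_s * sr)
    texts = []
    for start in range(0, len(audio), chunk_len):
        piece = audio[start:start + chunk_len]
        input_values = tokenizer(piece, return_tensors="pt").input_values
        with torch.no_grad():  # no autograd buffers during inference
            logits = model(input_values).logits
        ids = torch.argmax(logits, dim=-1)
        texts.append(tokenizer.batch_decode(ids)[0])
    return " ".join(t for t in texts if t)
```

With the model and tokenizer from the snippet above, this would be called as `transcribe_chunked(model, tokenizer, input_audio)`; smaller `chunk_s` trades transcript continuity for lower peak memory.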
Here's a colab to run it successfully: https://colab.research.google.com/drive/1m54QPo07ptp_GRdTLztuc0OdCHk7j28C?usp=sharing Actually, I forgot to put the
Thanks @patrickvonplaten, that should help @abhinavsp0730
Thanks, @patrickvonplaten, for the help. I'm closing this issue as it has been resolved.
❓ Questions and Help
What is your question?
When I'm trying to do inference on an audio clip of around 52 seconds, I get this error:

```
RuntimeError: [enforce fail at CPUAllocator.cpp:65] . DefaultCPUAllocator: can't allocate memory: you tried to allocate 326730288 bytes. Error code 12 (Cannot allocate memory)
```

So the inference would need almost 326.730288 MB. When I ran `free -h`, I had this much free space. Would you please help me regarding this issue, @patrickvonplaten?
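As a back-of-envelope check (my assumption, not stated in this thread), the failed allocation matches the size of a single float32 self-attention score tensor of shape `(heads, frames, frames)` in the wav2vec2-base encoder, taking the usual wav2vec2-base figures of 12 attention heads and a 320-sample total stride (20 ms hop) in the conv feature extractor:

```python
SR = 16000     # input sample rate expected by the model
STRIDE = 320   # total stride of the conv feature extractor (20 ms hop)
HEADS = 12     # attention heads in wav2vec2-base
FP32 = 4       # bytes per float32 element

samples = int(52.18 * SR)   # "around 52 sec" of audio
frames = samples // STRIDE  # encoder sequence length after the conv stack
attn_bytes = HEADS * frames * frames * FP32
print(frames, attn_bytes)   # 2609 326730288
```

The result equals the 326730288 bytes in the error exactly, which supports the idea that peak memory here grows quadratically with clip length — and hence that splitting long recordings into chunks avoids the OOM.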
code
Sample audio file in WAV format: https://github.com/abhinavsp0730/video-to-text-ap/blob/main/sample_audio_1.wav