Whisper decoding returns exception about outputs.logits shape #21057
Comments
Hey! Could you provide a reproducing script with the dataset? The file might be corrupted.
To reproduce you can try this code with the attached file. This happens with fine-tuned models by the way, not original ones.
I have the same issue, and my model is not fine-tuned. Could you find a workaround, @nshmyrev?
In this case, using an original model works:

```python
from transformers import WhisperProcessor, WhisperForConditionalGeneration
import torchaudio
import torch

fn = "/home/arthur_huggingface_co/transformers/Arthur/test.wav"

processor = WhisperProcessor.from_pretrained("openai/whisper-large")
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-large")

# Load the audio and resample it to the 16 kHz Whisper expects
speech_array, sampling_rate = torchaudio.load(fn)
resampler = torchaudio.transforms.Resample(sampling_rate, 16_000)
sound = resampler(speech_array).squeeze().numpy()

input_features = processor(sound, return_tensors="pt", sampling_rate=16_000).input_features
with torch.no_grad():
    generated_ids = model.generate(inputs=input_features, max_length=1000)
transcription = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(transcription)
```

I get `Duh duh duh duh uh huh.` When running with your model however, it seems that the

@RuABraun can you share the audio and a reproduction script?
I fixed it by lowering max_length. Thanks
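A likely explanation for why lowering `max_length` helps (an assumption based on the `transformers` Whisper config defaults, not something stated in this thread): the Whisper decoder has learned positional embeddings for only `config.max_target_positions` tokens (448 by default), so asking `generate` for up to 1000 tokens lets a pathological file decode past the position table. A minimal sketch of capping the request at the model's limit:

```python
from transformers import WhisperConfig

# WhisperConfig defaults mirror the released checkpoints: the decoder only
# has positional embeddings for max_target_positions tokens.
config = WhisperConfig()
print(config.max_target_positions)

# Cap the requested max_length (the issue used 1000) at the decoder's
# positional limit before passing it to model.generate(...).
safe_max_length = min(1000, config.max_target_positions)
print(safe_max_length)
```

In practice this means calling `model.generate(inputs=input_features, max_length=model.config.max_target_positions)` (or any smaller value) instead of `max_length=1000`.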
System Info
`transformers` version: 4.26.0.dev0

Same error on CUDA servers.
Who can help?
No response
Information
Tasks
examples folder (such as GLUE/SQuAD, ...)

Reproduction
Run simple decoding with Whisper large:
Result is an exception:
The output on this problematic file is
This happens only with a single file in the dataset of 10k files.
Expected behavior
No exception