Model providing not an accurate transcription , mixing some other language . #23

Bharadwajsai-121 · 2023-04-22T15:03:04Z

I tried to get transcriptions for a video from David Silver's reinforcement learning playlist on YouTube.
The model generated very good transcriptions at some timestamps, but at many others it produced text in a language other than English. I haven't changed any settings; I just pasted the URL of the video and clicked on transcribe. The result came back in 23.4 seconds but wasn't accurate.

For more information, please have a look at the image I'm attaching below:

[Image: screenshot of the transcription output]

In the image, you can clearly see that the model is generating transcriptions in another language even though English was requested. Part of the output is in English and the rest is in some other language.

sanchit-gandhi · 2023-04-24T14:36:58Z

Hey @Bharadwajsai-121! Looks like it went into Welsh! This is because the Whisper model predicts the most likely language for each batch, and then transcribes that batch in the predicted language.
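To make the per-batch behaviour concrete, here is a toy sketch in plain Python. It is not Whisper's code: the function name `pick_languages` and the detection scores are made up for illustration. It only shows how one misdetected batch can switch language while its neighbours stay in English, and how fixing the language removes that choice.

```python
def pick_languages(batch_scores, forced_language=None):
    """Return the language used for each batch.

    batch_scores: list of dicts mapping a language name to a (hypothetical)
    detection score for that batch.
    """
    chosen = []
    for scores in batch_scores:
        if forced_language is not None:
            # A language supplied by the caller overrides detection entirely.
            chosen.append(forced_language)
        else:
            # Otherwise each batch independently takes its top-scoring language.
            chosen.append(max(scores, key=scores.get))
    return chosen


# Made-up scores: the second batch is noisy and tips towards Welsh.
scores = [
    {"english": 0.97, "welsh": 0.03},
    {"english": 0.41, "welsh": 0.59},
    {"english": 0.92, "welsh": 0.08},
]

auto = pick_languages(scores)
forced = pick_languages(scores, forced_language="english")
print(auto)    # ['english', 'welsh', 'english']
print(forced)  # ['english', 'english', 'english']
```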

We can actually pass the language to the Flax Whisper Pipeline:

pred_txt = pipeline("audio.mp3", task="transcribe", language="English", return_timestamps=True)

This will force the model to predict in the language that you provide it, which should resolve your issue.
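For a fuller picture, the call above could be wrapped roughly as follows. This is a sketch under assumptions not stated in this thread: the class name `FlaxWhisperPipline` (spelled that way) and the `openai/whisper-large-v2` checkpoint are taken from the whisper-jax README as I recall it, and the `"text"` / `"chunks"` output keys are what the pipeline is expected to return when `return_timestamps=True`. The helper names `load_pipeline` and `transcribe_in_language` are hypothetical.

```python
def load_pipeline(checkpoint="openai/whisper-large-v2"):
    """Build the whisper-jax pipeline (class name and checkpoint are assumptions)."""
    # Imported lazily so the helper below can be used without whisper-jax installed.
    from whisper_jax import FlaxWhisperPipline
    return FlaxWhisperPipline(checkpoint)


def transcribe_in_language(pipeline, audio_path, language="English"):
    """Call the pipeline with a fixed language, mirroring the call shown above."""
    outputs = pipeline(
        audio_path,
        task="transcribe",
        language=language,
        return_timestamps=True,
    )
    # Assumed output layout: full transcript plus a list of timestamped chunks.
    return outputs["text"], outputs["chunks"]


# Usage (needs whisper-jax installed and a local audio file):
# pipeline = load_pipeline()
# text, chunks = transcribe_in_language(pipeline, "audio.mp3")
```

Passing the pipeline in as an argument keeps the model load, which is the slow part, separate from the per-file call, so the same pipeline can be reused across several lectures.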

You can use the model for yourself using the Kaggle notebook: https://www.kaggle.com/code/sgandhi99/whisper-jax-tpu

If you pass the language argument as detailed above, you should be able to get perfect transcriptions for the David Silver lectures in nearly the same transcription time.

sanchit-gandhi closed this as completed Apr 26, 2023

realfolkcode mentioned this issue Sep 22, 2023

Whisper JAX incorrectly detects the language #142

Open
