[Bug] Duplicate text while using chrome extension #117

lightwastak3n · 2024-01-29T22:21:47Z

I've added a save-to-file feature to the Chrome extension, but there seems to be a bug in the whisper server: every now and then it returns repeated text.

I'm running this on cpu (Ryzen 2400g) using tiny or base models. Small isn't quite real time for me so I haven't done any testing there.
I've noticed that the transcription sometimes skips a few seconds right before this happens. I'm not sure if it's my CPU, but using faster whisper directly I get about 0.1 RTF (30 seconds for 5 min of audio) with tiny int8.
Could it just be hallucinations? I haven't encountered any like these while transcribing hours of audio using faster whisper.

This is the unmodified chrome extension from this repo

This is the modified extension. I've also added the whole output to the transcription div so it's easier to see

makaveli10 · 2024-01-31T10:39:49Z

@lightwastak3n Hello, can you check which model you are using? We have seen this behaviour when a multilingual model is used with English audio, so make sure you use an English-only model.
By default we use the multilingual model, so please change this to "small.en"

WhisperLive/whisper_live/server.py

Line 556 in 8c36768

model="small",

lightwastak3n · 2024-01-31T15:34:03Z

@makaveli10
It will get overridden by recv_audio with params from the extension.
I changed it directly where we call faster whisper

WhisperLive/whisper_live/server.py

Lines 596 to 597 in 2c8a25d

    
    self.transcriber = WhisperModel(
        self.model_size_or_path,

but I still get repetition.
Also, if Use Multilingual Model is unchecked in the extension, shouldn't that automatically switch to the .en model?
As far as I can see, this function just changes self.multilingual.
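
To illustrate the behaviour I would expect, here is a minimal sketch of how the checkbox state could be mapped to a model name. The helper `resolve_model_name` and the `ENGLISH_ONLY_SIZES` set are hypothetical and not part of WhisperLive; the only fact assumed is that Whisper ships English-only (.en) variants for tiny, base, small and medium but not for the large models.

```python
# Hypothetical helper, not WhisperLive code: pick the model name from the
# requested size and the extension's "Use Multilingual Model" checkbox.

# Whisper publishes English-only variants only for these sizes.
ENGLISH_ONLY_SIZES = {"tiny", "base", "small", "medium"}


def resolve_model_name(model_size: str, multilingual: bool) -> str:
    """Return the model name to load for the given size and checkbox state."""
    # Normalise the input so "small" and "small.en" are treated the same way.
    base = model_size[:-3] if model_size.endswith(".en") else model_size

    # Multilingual requested, or no English-only variant exists for this size.
    if multilingual or base not in ENGLISH_ONLY_SIZES:
        return base

    return base + ".en"
```

With something like this, unchecking the box while requesting "small" would load "small.en", while a large model would be left unchanged.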

makaveli10 · 2024-01-31T15:46:46Z

You're right, a few things have changed. We will have to change the extension interface: remove the multilingual option and instead offer a choice of models (tiny, tiny.en and so on). Thanks for pointing that out.

As for the repetition, could you log the segments

WhisperLive/whisper_live/server.py

Line 772 in 2c8a25d

for i, s in enumerate(segments[:-1]):

here and check the no_speech_prob of each segment? Filtering the segments based on no_speech_prob might help.
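
A rough sketch of what that logging and filtering could look like is below. It is not the server code: the `Segment` dataclass is a stand-in for the segment objects returned by faster whisper (which expose `text` and `no_speech_prob`), and both `filter_segments` and the 0.45 threshold are assumptions chosen for illustration, so the cutoff would need tuning against real logs.

```python
import logging
from dataclasses import dataclass
from typing import Iterable, List

logging.basicConfig(level=logging.INFO)


@dataclass
class Segment:
    """Stand-in for a transcription segment; only the fields used here."""
    text: str
    no_speech_prob: float


def filter_segments(segments: Iterable[Segment],
                    threshold: float = 0.45) -> List[Segment]:
    """Log every segment and keep those at or below the no-speech threshold.

    The threshold is a hypothetical value, not a WhisperLive default.
    """
    kept = []
    for i, s in enumerate(segments):
        logging.info("segment %d: no_speech_prob=%.3f text=%r",
                     i, s.no_speech_prob, s.text)
        if s.no_speech_prob > threshold:
            # Likely silence or noise; skip it so it cannot be repeated.
            continue
        kept.append(s)
    return kept
```

If the repeated text shows up mostly in segments with a high no_speech_prob, dropping them this way should reduce the duplicates; if the repeats have a low no_speech_prob, the cause is probably elsewhere.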

lightwastak3n mentioned this issue Feb 1, 2024

Chrome extension update #123

Merged

makaveli10 closed this as completed Feb 5, 2024
