Large-v3 model hallucinates, large-v2 doesn't #777
Comments
@Arche151, could you try again with `compute_type="default"` (or remove this option when initializing the Whisper model)?
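For context, a minimal sketch of what that suggestion looks like with the standard faster-whisper API (the model size, device, and audio file are illustrative):

```python
from faster_whisper import WhisperModel

# "default" lets CTranslate2 pick the best precision supported by the
# hardware instead of forcing a specific one such as float16 or int8.
model = WhisperModel("large-v3", device="cuda", compute_type="default")

segments, info = model.transcribe("audio.mp3")
for segment in segments:
    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")
```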
Thanks for the quick reply and suggestion! I'll try that and report back.
It's known that large-v3 hallucinates much more than large-v2; read here:
Damn, that sucks. In that case there's of course nothing faster-whisper can do about it. Then I guess I'll stay with large-v2. Thanks for linking the article!
Did you mean "large-v2"? On my Standalone Faster-Whisper I've added automatic offsets to Whisper's pseudo-VAD thresholds when "v3" is in use; you can try these parameters when using large-v3:
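(The exact parameter values did not survive in this copy of the thread. The sketch below only illustrates the kind of anti-hallucination thresholds that faster-whisper's `transcribe()` exposes; the specific numbers are assumptions, not the commenter's actual offsets.)

```python
from faster_whisper import WhisperModel

model = WhisperModel("large-v3", device="cuda", compute_type="default")

# All values below are illustrative, not the commenter's offsets.
segments, info = model.transcribe(
    "audio.mp3",
    no_speech_threshold=0.5,           # below the 0.6 default: treat segments as silence sooner
    log_prob_threshold=-0.8,           # above the -1.0 default: reject low-confidence output sooner
    compression_ratio_threshold=2.0,   # below the 2.4 default: reject repetitive output sooner
    condition_on_previous_text=False,  # keep hallucinations from snowballing across segments
)
```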
Does it yield better results than large-v2, using your parameters with large-v3?
I didn't try it for long enough to be able to say. I just went back to large-v2 after reading the Deepgram article.
You tell me, as I don't use large-v3. IMO large-v2 is better.
Original issue description:
I have two scripts where the large-v3 model hallucinates, for instance by making up things that weren't said or by spamming a single word like 50 times. When I replace `large-v3` in the script with `large-v2`, the transcription works fine.

Script 1:

Script 2:
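(Neither script survived in this copy of the issue. As a stand-in, a minimal faster-whisper transcription script of the kind being described might look like the sketch below; the file name, device, and options are assumptions.)

```python
from faster_whisper import WhisperModel

# Swapping "large-v3" for "large-v2" on this line is the change that,
# per the report above, makes the hallucinations disappear.
model = WhisperModel("large-v3", device="cuda", compute_type="float16")

segments, _ = model.transcribe("recording.wav", beam_size=5)
print(" ".join(segment.text.strip() for segment in segments))
```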
Does anyone know of a fix? Is something wrong with my scripts or with the faster-whisper large-v3 model?