Issue of duplicate word lines #28

Closed
MohammedMehdiTBER opened this issue Feb 6, 2023 · 4 comments

@MohammedMehdiTBER

The script works fine when I transcribe an audio file in Arabic, but once it reaches about 6 minutes of the 8-minute file, it starts duplicating one line. The timing is correct, but the text is not.

image

@ortep-llit

I have the same problem transcribing German audio. Most of the time it works great, but in some cases a few lines are repeated, as in the example above.

@Jeronymous
Member

I think this is the same as issue #16.

Check whether you get better results with the --accurate option in the CLI (or with beam_size=5, best_of=5, temperature=(0.0, 0.2, 0.4, 0.6, 0.8, 1.0) if you call the transcribe function).
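For the Python route, the options listed above could be passed roughly as follows. This is a minimal sketch, not a definitive recipe: the helper name transcribe_accurately, the "small" model size, the CPU device, and the audio path are illustrative assumptions, and the import is deferred so the options can be inspected without the library installed.

```python
# Decoding options quoted in this thread as the equivalent of --accurate.
ACCURATE_OPTIONS = {
    "beam_size": 5,
    "best_of": 5,
    "temperature": (0.0, 0.2, 0.4, 0.6, 0.8, 1.0),
}


def transcribe_accurately(audio_path, model_name="small", language=None):
    """Transcribe audio_path with the decoding options above.

    Hypothetical helper: model_name and the CPU device are assumptions.
    """
    # Imported here so the module loads even without the package installed.
    import whisper_timestamped as whisper

    audio = whisper.load_audio(audio_path)
    model = whisper.load_model(model_name, device="cpu")
    return whisper.transcribe(model, audio, language=language, **ACCURATE_OPTIONS)


# Example call (path and language are placeholders):
# result = transcribe_accurately("recording.wav", language="ar")
```

Beam search plus the temperature fallback schedule costs more compute than the default greedy decoding, so expect longer run times.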

If not, the problem is in whisper itself, which you can check by running whisper instead of whisper_timestamped.
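To make that comparison less manual, one option is to scan the "segments" list of each result for consecutive segments with identical text. The sketch below assumes the usual result layout (a dict with a "segments" list whose entries carry a "text" field); the function name find_repeated_segments and the min_run threshold are hypothetical.

```python
def find_repeated_segments(segments, min_run=2):
    """Return (first_index, last_index, text) for each run of at least
    min_run consecutive segments whose stripped, case-folded text matches."""

    def normalized(segment):
        return segment["text"].strip().casefold()

    runs = []
    run_start = 0
    # Iterate one step past the end so the final run is flushed too.
    for i in range(1, len(segments) + 1):
        same = i < len(segments) and normalized(segments[i]) == normalized(
            segments[run_start]
        )
        if not same:
            if i - run_start >= min_run:
                runs.append((run_start, i - 1, segments[run_start]["text"].strip()))
            run_start = i
    return runs


# Small hand-made example standing in for result["segments"].
sample = [
    {"text": " first line"},
    {"text": " repeated line"},
    {"text": " repeated line"},
    {"text": " repeated line"},
    {"text": " last line"},
]
print(find_repeated_segments(sample))
```

Running it on the output of both whisper and whisper_timestamped for the same file shows whether the repeated runs appear in both, which would point at the underlying model rather than the timestamping layer.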

@Jeronymous
Member

Closing for lack of feedback. Feel free to re-open.

@MohammedMehdiTBER
Author

> Closing for lack of feedback. Feel free to re-open

It works well now that I use --accurate.
