encoding errors on example program #23
Comments
Example file that can reproduce the above problem: https://drive.google.com/file/d/19WnNJLL1IThoVznUog6hyMQUwTZKU2T2/view?usp=share_link
I've noticed the same with Japanese: I get the same line repeated again and again. This happens with both the medium and large models.
Feeding audio at a higher sample rate than 16 kHz helped me get rid of "runFullImpl: failed to generate timestamp token - skipping one second". I had only been feeding 16 kHz audio because the CPU version of whisper.cpp wanted that, but Mr Const-me's version doesn't seem to have that limit. Would anyone like to confirm that higher-rate audio solves the "runFullImpl" error?
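For anyone who wants to test this, it helps to first confirm what sample rate the input file actually has. Below is a minimal sketch using Python's standard `wave` module. The helper name `wav_sample_rate` is made up for illustration, and the in-memory 48 kHz file only stands in for a real recording. It reads PCM WAV headers only and says nothing about how this project resamples audio internally.

```python
import io
import wave


def wav_sample_rate(source) -> int:
    """Return the sample rate (Hz) stored in a PCM WAV header.

    `source` may be a file path or a binary file-like object.
    """
    with wave.open(source, "rb") as wav:
        return wav.getframerate()


def make_silent_wav(rate: int, frames: int = 480) -> io.BytesIO:
    """Build a tiny mono 16-bit WAV in memory (stand-in for a real recording)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)      # mono
        wav.setsampwidth(2)      # 16-bit samples
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * frames)
    buf.seek(0)
    return buf


if __name__ == "__main__":
    rate = wav_sample_rate(make_silent_wav(48000))
    print(f"sample rate: {rate} Hz")
    if rate <= 16000:
        print("input is 16 kHz or lower; try the original higher-rate file")
```

Running it against the same recording at 16 kHz and at its original rate makes it easier to report whether the error really depends on the input rate.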
Thank you for your contribution. The DirectML version of Whisper is much faster than the pure CPU version of whisper.cpp, but I ran into some issues when using it. The first one is an encoding problem. In the debug output of the desktop version, the output sometimes lacks a few characters. I think this is caused by a conversion from UTF-8 to CP_ACP (windows936 gb2312-80 on my system). Similar encoding errors also happen in the CLI, where the transcribed text is almost unreadable and consists mostly of '?'.
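The suspected failure mode can be reproduced outside the program. This is a minimal sketch in Python, not the project's code. It assumes the reporter's guess is right, namely that text passes through the system ANSI code page (cp936 here) somewhere on the way to the console. The helper name `to_code_page` is made up for illustration. Characters that the code page cannot represent are replaced with '?', which is also the usual default substitution when Windows converts to an ANSI code page.

```python
def to_code_page(text: str, code_page: str = "cp936") -> str:
    """Round-trip text through a legacy code page.

    Characters the code page cannot represent are replaced with '?',
    imitating a lossy UTF-8 -> ANSI conversion.
    """
    return text.encode(code_page, errors="replace").decode(code_page)


if __name__ == "__main__":
    chinese = "你好"          # representable in cp936, survives unchanged
    mixed = "你好 안녕"       # Hangul is not in cp936 and gets replaced

    print(to_code_page(chinese))
    print(to_code_page(mixed))

    # A UTF-8 round trip is lossless for the same input.
    print(mixed.encode("utf-8").decode("utf-8") == mixed)
```

If the output in the CLI shows the same pattern (text in the system's own language survives, everything else becomes '?'), that points to a code page conversion rather than a problem in the model output itself.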
The second issue is that almost every audio file I transcribe reports the error “runFullImpl: failed to generate timestamp token - skipping one second”.
The third problem is similar to #18: after recognizing for a period of time, it always stops working and repeatedly outputs the last recognized sentence.
If you plan to track the last two problems, I can open separate issues for them.