I see the same problem, though very rarely. As far as I understand, Whisper may output a non-timestamp token right after the <|transcribe|> token even when it is not in without_timestamps mode. However, the transcribe function assumes that the first decoded token (sliced_tokens[0]) is always a timestamp token, so start_timestamp_position can become negative when that is not the case.
See around this line:

last_slice = 0
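
To make the failure mode concrete, here is a toy illustration. It assumes the start position is computed as the first token id of a slice minus the id of the first timestamp token; the helper name, the token ids, and `TIMESTAMP_BEGIN = 100` are all hypothetical and are not taken from the Whisper source.

```python
def start_timestamp_position(sliced_tokens, timestamp_begin):
    # Offset of the slice's first token from the first timestamp id.
    # This is only meaningful if sliced_tokens[0] is a timestamp token.
    return sliced_tokens[0] - timestamp_begin


TIMESTAMP_BEGIN = 100  # hypothetical: every id >= 100 is a timestamp token

ok_segment = [100, 7, 8, 105]   # starts with a timestamp token
bad_segment = [7, 8, 105, 105]  # starts with a text token instead

ok_pos = start_timestamp_position(ok_segment, TIMESTAMP_BEGIN)
bad_pos = start_timestamp_position(bad_segment, TIMESTAMP_BEGIN)

print(ok_pos)   # 0
print(bad_pos)  # -93, a negative start position
```

Under this assumption, any slice that begins with a text token yields a negative offset, which matches the symptom described above.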

A possible solution would be to suppress non-timestamp tokens at the first generated position, so that decoding always starts with a timestamp token.
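
The suggestion above could be sketched as a small logit filter. This is a minimal pure-Python sketch, not Whisper's actual implementation: the function name, the list-based logits, and the toy vocabulary are hypothetical, and it assumes that every token id greater than or equal to `timestamp_begin` is a timestamp token.

```python
import math


def suppress_non_timestamp_first_token(logits, generated, timestamp_begin):
    """Mask non-timestamp tokens while nothing has been generated yet.

    logits: list of floats, one per vocabulary id.
    generated: token ids sampled so far, excluding the prompt.
    timestamp_begin: assumed id of the first timestamp token.
    """
    if len(generated) > 0:
        # Only the first generated position is constrained.
        return list(logits)
    return [
        value if token_id >= timestamp_begin else -math.inf
        for token_id, value in enumerate(logits)
    ]


# Toy vocabulary of 6 ids; ids 4 and 5 play the role of timestamp tokens.
logits = [2.0, 0.5, 1.0, 0.1, -1.0, -0.5]

masked = suppress_non_timestamp_first_token(logits, [], timestamp_begin=4)
first_token = max(range(len(masked)), key=masked.__getitem__)

# After the first token, the logits pass through unchanged.
later = suppress_non_timestamp_first_token(logits, [first_token], timestamp_begin=4)
```

With the mask applied, greedy selection at the first position can only pick a timestamp id, so the first token of the first slice would no longer be a text token.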

Answer selected by jongwook