Timecode per word: a future option? #26

Answered by jongwook
samelie asked this question in Q&A
(Duplicate of #3) Getting word-level timestamps is not directly supported, but it could be possible using the predicted distribution over the timestamp tokens or the cross-attention weights.

Currently, the predicted timestamps tend to be biased towards integers, and there are some failure modes where the timestamps can be constantly shifted, making reliable word-level timestamp prediction difficult. Once this is solved by us or the community, I agree that it'd be a great addition to this repo.
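
To make the two ideas in this answer concrete, here is a minimal sketch on toy inputs. It does not call Whisper or any of its APIs. The function names, the 20 ms step size, and the input shapes are assumptions made for illustration: `expected_time` takes a probability vector over timestamp tokens and returns the expected time instead of the arg-max, and `dtw_token_spans` runs plain dynamic time warping over a token-by-frame attention matrix to get a monotonic token-to-frame alignment. Neither is presented as the method the repo uses or will use.

```python
from typing import List, Tuple

# Assumption: one timestamp token / one attention frame covers 20 ms of audio.
FRAME_SECONDS = 0.02


def expected_time(timestamp_probs: List[float]) -> float:
    """Expected time in seconds under a distribution over timestamp tokens.

    timestamp_probs[i] is the probability of the i-th timestamp token,
    which is assumed to stand for i * FRAME_SECONDS.
    """
    total = sum(timestamp_probs)
    weighted = sum(i * FRAME_SECONDS * p for i, p in enumerate(timestamp_probs))
    return weighted / total


def dtw_token_spans(attn: List[List[float]]) -> List[Tuple[int, int]]:
    """Align tokens (rows) to audio frames (columns) with dynamic time warping.

    attn[i][j] is a hypothetical attention weight in [0, 1] of token i on
    frame j. Returns, for each token, the (first_frame, last_frame) it covers
    on the lowest-cost monotonic path, using cost = 1 - attention.
    """
    n, m = len(attn), len(attn[0])
    inf = float("inf")
    # acc[i][j]: minimum accumulated cost of a monotonic path (0, 0) -> (i, j).
    acc = [[inf] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                best = 0.0
            else:
                best = min(
                    acc[i - 1][j - 1] if i > 0 and j > 0 else inf,
                    acc[i - 1][j] if i > 0 else inf,
                    acc[i][j - 1] if j > 0 else inf,
                )
            acc[i][j] = (1.0 - attn[i][j]) + best

    # Backtrack from the last cell, recording the frame range of each token.
    spans = [[m, -1] for _ in range(n)]
    i, j = n - 1, m - 1
    while True:
        spans[i][0] = min(spans[i][0], j)
        spans[i][1] = max(spans[i][1], j)
        if i == 0 and j == 0:
            break
        candidates = []
        if i > 0 and j > 0:
            candidates.append((acc[i - 1][j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((acc[i - 1][j], i - 1, j))
        if j > 0:
            candidates.append((acc[i][j - 1], i, j - 1))
        _, i, j = min(candidates)
    return [(first, last) for first, last in spans]


def spans_to_seconds(spans: List[Tuple[int, int]]) -> List[Tuple[float, float]]:
    """Convert inclusive frame spans to (start, end) times in seconds."""
    return [(a * FRAME_SECONDS, (b + 1) * FRAME_SECONDS) for a, b in spans]


if __name__ == "__main__":
    # Toy example: two tokens attending to four frames.
    toy_attn = [
        [0.9, 0.7, 0.2, 0.1],
        [0.1, 0.2, 0.8, 0.6],
    ]
    print(spans_to_seconds(dtw_token_spans(toy_attn)))
    print(expected_time([0.0, 0.0, 0.5, 0.5]))
```

Real attention weights are much noisier than this toy matrix, and the shift and integer-bias problems described above would carry through to either approach, so treat this as a starting point for experiments rather than a working solution.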

Replies: 5 comments 8 replies

Answer selected by jongwook