Improving Whisper per-word timestamps by fine-tuning #595

RaulKite · 2022-11-26T06:18:04Z

Hi,

I need accurate per-word timestamps in Whisper's transcriptions.

I have seen that fine-tuning Whisper with Hugging Face 🤗 seems easy for other languages, so I thought that getting better timestamp accuracy this way might be feasible.

It could be “easy” to build a dataset of long audio files aligned with their transcripts using a tool like Gentle (https://github.com/lowerquality/gentle). I have experience with it.
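As a rough sketch of that dataset step, the snippet below pulls per-word start and end times out of a Gentle alignment result. It assumes Gentle's JSON output has a top-level `words` list whose entries carry `word`, `case`, `start`, and `end`, with `case == "success"` marking aligned words; check this against the output of your Gentle version. The sample data and the function name `extract_word_timestamps` are made up for illustration.

```python
import json


def extract_word_timestamps(gentle_json: str) -> list[tuple[str, float, float]]:
    """Return (word, start_sec, end_sec) for every word the aligner placed."""
    result = json.loads(gentle_json)
    aligned = []
    for entry in result.get("words", []):
        # Assumption: words that could not be aligned have a case other
        # than "success" and carry no start/end fields, so skip them.
        if entry.get("case") != "success":
            continue
        aligned.append((entry["word"], float(entry["start"]), float(entry["end"])))
    return aligned


# Hypothetical sample in the assumed Gentle output shape.
SAMPLE = json.dumps({
    "transcript": "hello big world",
    "words": [
        {"word": "hello", "case": "success", "start": 0.12, "end": 0.48},
        {"word": "big", "case": "not-found-in-audio"},
        {"word": "world", "case": "success", "start": 0.61, "end": 1.05},
    ],
})

if __name__ == "__main__":
    for word, start, end in extract_word_timestamps(SAMPLE):
        print(f"{word}\t{start:.2f}\t{end:.2f}")
```

Dropping unaligned words keeps the labels clean, at the cost of gaps in the transcript; whether that trade-off is acceptable depends on how the training targets are built.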

Adding some layers on top of the model and training them to produce this new output also seems possible.
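One hypothetical way to frame the training target for such added layers is a per-frame label that says whether each frame falls inside a word. The sketch below converts word spans in seconds into that kind of label vector. It assumes one label per 20 ms frame and 1500 frames per 30-second window, which matches the time resolution of Whisper's encoder output as far as I know. The function name and the 0/1 labeling scheme are illustrative choices, not something defined by Whisper or by this thread.

```python
FRAME_SEC = 0.02   # assumed frame duration: 20 ms per frame
NUM_FRAMES = 1500  # assumed frames per 30-second window


def word_spans_to_frame_labels(spans, num_frames=NUM_FRAMES, frame_sec=FRAME_SEC):
    """Map (start_sec, end_sec) word spans to one 0/1 label per frame.

    A frame is labeled 1 if it lies inside some word span, else 0.
    Spans running past the window are clipped to it.
    """
    labels = [0] * num_frames
    for start, end in spans:
        # round() avoids off-by-one errors from floating-point division.
        first = max(0, round(start / frame_sec))
        last = min(num_frames, round(end / frame_sec))  # exclusive bound
        for i in range(first, last):
            labels[i] = 1
    return labels


if __name__ == "__main__":
    # A tiny 10-frame window (0.2 s) keeps the printed output readable.
    print(word_spans_to_frame_labels([(0.04, 0.10)], num_frames=10))
```

Other target designs are possible, for example labeling only word boundaries or regressing start and end times directly; this is just the simplest one to derive from aligned data.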

Is anyone working on this? Am I wrong?

If someone is working on this, please ping me. I will be exploring this path over the next few weeks...

P.S. The stable-ts project is nice, but it does not solve my problem. I have been testing it, but I need real accuracy, and stable-ts works from the raw Whisper output, which is sometimes not very accurate.

Thanks

wwbnjsace · 2022-11-28T13:08:36Z

hugging face

Where is the Whisper fine-tuning code on Hugging Face? Why can't I find it?

1 reply

RaulKite Nov 28, 2022
Author

https://huggingface.co/blog/fine-tune-whisper
