Replies: 1 comment
-
Hi, I don't know what the issue is, but it seems to get stuck like this quite randomly.
-
Today I tested the v2 model against v1 and found that in a lot of cases v2 tends to just output this text: " Feliratok az Amara.org közösségétől" (Hungarian for "Subtitles by the Amara.org community"). I suspect a lot of invalid data was scraped from this site: amara.org. I mostly tested the models with music; here are a few examples:
https://youtu.be/1BI54w6T_Uo
https://youtu.be/M5CwqYQNRcY
I think it would be beneficial to search the training data for this string and remove the affected samples.
Otherwise, I've seen a large improvement in Hungarian whenever Whisper was willing to produce real transcripts.
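To make the cleanup suggestion a bit more concrete, here is a rough sketch of what searching for the string could look like. I don't know how the training data is actually stored, so the `(sample_id, transcript)` pairs and the `split_samples` helper are purely hypothetical. The only thing taken from my tests is the credit line itself.

```python
import unicodedata

# The credit line reported above (leading space dropped).
CREDIT_LINE = "Feliratok az Amara.org közösségétől"


def normalize(text: str) -> str:
    """NFC-normalize, case-fold and collapse whitespace so small variants still match."""
    return " ".join(unicodedata.normalize("NFC", text).casefold().split())


def split_samples(samples, credit=CREDIT_LINE):
    """Split hypothetical (sample_id, transcript) pairs into (kept, flagged).

    A sample is flagged when its transcript contains the credit line anywhere.
    """
    needle = normalize(credit)
    kept, flagged = [], []
    for sample_id, transcript in samples:
        if needle in normalize(transcript):
            flagged.append((sample_id, transcript))
        else:
            kept.append((sample_id, transcript))
    return kept, flagged


if __name__ == "__main__":
    # Made-up records, for illustration only.
    demo = [
        ("a", " Feliratok az Amara.org közösségétől"),
        ("b", "Szia, világ!"),
    ]
    kept, flagged = split_samples(demo)
    print("kept:", [sid for sid, _ in kept])
    print("flagged:", [sid for sid, _ in flagged])
```

Flagged samples would probably still need a manual look, since some of them may contain a real transcript with the credit line only appended at the end.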