Removed English utterances in Aj-DXM5Zqms, Ba5Jl1_JKZY, and gffgHgnEhtA.
Kept Mc044I55SCY, as the part used in our data includes only Japanese utterances.
Kept ydhfjNRFzaM, BLElQZfR_2M, TWeYkdIQsk0, Ab-KZT06gR0, and kU9LcoHaFLo: although they contain singing voices (or rap), their transcriptions are actually correct.
The TEDxJP 10k corpus contains some data that is inappropriate for evaluating Japanese speech recognition.
In the following videos, Japanese speakers talk in English, but the corpus uses subtitles that were automatically translated into Japanese.
In addition, the following videos contain music (the corpus uses the music intervals).
It may be better to remove them from the corpus.
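
As a rough illustration of such a cleanup, here is a minimal sketch that drops every utterance belonging to a flagged video. The `Utterance` record and the `drop_excluded` helper are hypothetical and do not reflect the corpus's actual file format or tooling. The three IDs are the videos named in this thread as containing English utterances; note that the fix described here removed only the English utterances from them, whereas this sketch removes the videos entirely.

```python
from typing import Iterable, List, NamedTuple


class Utterance(NamedTuple):
    """Hypothetical in-memory record for one corpus segment."""
    video_id: str   # YouTube video ID the segment was cut from
    start: float    # segment start time in seconds
    end: float      # segment end time in seconds
    text: str       # reference transcription


# Video IDs named in this thread as containing English utterances.
EXCLUDED_VIDEO_IDS = frozenset({"Aj-DXM5Zqms", "Ba5Jl1_JKZY", "gffgHgnEhtA"})


def drop_excluded(
    utterances: Iterable[Utterance],
    excluded: frozenset = EXCLUDED_VIDEO_IDS,
) -> List[Utterance]:
    """Return the utterances whose source video is not in `excluded`."""
    return [u for u in utterances if u.video_id not in excluded]
```

Filtering on the video ID rather than on the text is deliberate: the problematic subtitles are already in Japanese, so the transcription alone cannot reveal that the speech is in English.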