Add KsponSpeech recipe #1353
Thanks, great work! I left 2 comments.
)


def normalize(
Can we make normalization optional (and enabled by default) via a parameter in prepare_ksponspeech? You can look at other recipes: some of them have a string option that makes it possible to choose different flavors of normalization (we can have "default" and "none" here).
Fixed in e7dc868.
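A minimal sketch of the kind of string option requested above, with "default" and "none" as the two flavors. The function name `normalize_transcript`, the parameter name `normalize_text`, and the specific rule (keeping the orthographic side of a dual transcription written as "(orthography)/(pronunciation)") are assumptions made for illustration; they are not taken from the recipe or from commit e7dc868.

```python
import re

# Hypothetical pattern: a dual transcription written as "(orthography)/(pronunciation)".
_DUAL_TRANSCRIPTION = re.compile(r"\(([^()]*)\)/\(([^()]*)\)")


def normalize_transcript(text: str, normalize_text: str = "default") -> str:
    """Normalize a transcript according to the chosen flavor (illustrative only)."""
    if normalize_text == "none":
        # Leave the transcript exactly as it is.
        return text
    if normalize_text == "default":
        # Keep the orthographic form of each dual transcription ...
        text = _DUAL_TRANSCRIPTION.sub(r"\1", text)
        # ... and collapse runs of whitespace into single spaces.
        return " ".join(text.split())
    raise ValueError(
        f"Unknown normalize_text={normalize_text!r}; expected 'default' or 'none'."
    )
```

With this shape, the preparation function could accept the same string and pass it through, so that normalization stays on unless the caller explicitly asks for "none".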
lhotse/recipes/ksponspeech.py
return manifests


def pcm_to_wav(
Could we convert to FLAC instead? That would give about 2x storage space savings.
Fixed in 36aa16d.
Thanks for your feedback! I'll fix the proposal quickly.
Thanks! LGTM
KsponSpeech is a large-scale spontaneous speech corpus of Korean.
This corpus contains 969 hours of open-domain dialog utterances,
spoken by about 2,000 native Korean speakers in a clean environment.
All data were constructed by recording the dialogue of two people
freely conversing on a variety of topics and manually transcribing the utterances.
Each transcription is dual, consisting of orthography and pronunciation,
and includes disfluency tags that mark the spontaneity of speech, such as filler words, repeated words, and word fragments.
The original audio files have a pcm extension.
During preprocessing, each file is converted to the wav format and saved as a new file.
KsponSpeech is publicly available on an open data hub site of the Korean government.
The dataset must be downloaded manually.
For more details, please visit:
Dataset: https://aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&aihubDataSe=realm&dataSetSn=123
Paper: https://www.mdpi.com/2076-3417/10/19/6936
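
The pcm-to-wav step mentioned in the description can be sketched with the Python standard library alone. This is an illustration, not the recipe's actual pcm_to_wav: the function name `pcm_to_wav_sketch` is hypothetical, and the defaults assume headerless 16-bit mono PCM at 16 kHz, which should be checked against the dataset documentation before use.

```python
import wave
from pathlib import Path


def pcm_to_wav_sketch(
    pcm_path,
    wav_path,
    sampling_rate: int = 16000,  # assumed sampling rate
    num_channels: int = 1,       # assumed mono
    sample_width: int = 2,       # assumed 16-bit samples (2 bytes)
) -> Path:
    """Wrap headerless PCM samples in a WAV container and save them as a new file."""
    pcm_bytes = Path(pcm_path).read_bytes()
    with wave.open(str(wav_path), "wb") as wav_file:
        wav_file.setnchannels(num_channels)
        wav_file.setsampwidth(sample_width)
        wav_file.setframerate(sampling_rate)
        # The samples are copied unchanged; only the WAV header is added.
        wav_file.writeframes(pcm_bytes)
    return Path(wav_path)
```

Because the raw samples are copied as-is, the output is only valid if the assumed parameters match how the pcm file was recorded. Writing FLAC, as suggested in the review, would need a third-party audio library and is not shown here.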