Create dataset loader for LibriVox-Indonesia #246

Closed
SamuelCahyawijaya opened this issue Sep 9, 2022 · 1 comment · Fixed by #267

Comments

@SamuelCahyawijaya
Member

NusaCatalogue: https://indonlp.github.io/nusa-catalogue/card.html?librivox

Dataset: librivox
Description: The LibriVox Indonesia dataset consists of MP3 audio and a corresponding text file that we generated from the public-domain LibriVox audiobooks. We collected only languages spoken in Indonesia for this dataset. The original LibriVox audiobooks and sound files vary in duration from a few minutes to a few hours. Each audio file in the speech dataset now lasts from a few seconds to at most 20 seconds.
License: CC0
@holylovenia added the "help wanted" label on Sep 10, 2022
@jensan-1
Contributor

#self-assign
