Model Training #2

Open
prothej227 opened this issue Aug 8, 2022 · 4 comments
Labels
documentation Improvements or additions to documentation

Comments

@prothej227

Hi! Can you add detailed steps on how to train your model using a custom dataset?

@danomatika
Member

If you need more info than what is in the README, @bytosaur can answer, but he is currently on vacation, so it may be a week or so until he can respond.

@danomatika danomatika added the documentation Improvements or additions to documentation label Aug 8, 2022
@prothej227
Author

Hi, thanks for your reply! I'm planning to train your model using a custom dataset which is different from the common voice dataset provided in the documentation. Can you elaborate or give specific beginner-friendly steps on how I can retrain your model using my collated dataset?

@bytosaur
Member

hey @prothej227,

What does your dataset look like? Maybe it is not that different from my setup. You can always try the setup with an incomplete Common Voice dataset, i.e. two languages that have very few samples.

Collecting noise data is optional. The first step is to process the downloaded Common Voice folders into a structure that the training script can read. I used a couple of more advanced tricks to clean the data (voice activity detection, debiasing through sampling). In the end, however, you want folders named by class (language), each containing mono samples of equal length, sampled at the same frequency, normalized, etc.; see this section.
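To make the target format more concrete, here is a minimal sketch of the per-clip steps (mono, normalized, equal length). It is not the project's actual preprocessing code: the function names `to_mono`, `peak_normalize`, `fix_length` and `preprocess` are made up for illustration, samples are assumed to be floats already, and resampling to a common frequency is left out.

```python
from typing import List, Sequence


def to_mono(frames: Sequence[Sequence[float]]) -> List[float]:
    """Average the channels of each frame into a single mono sample."""
    return [sum(frame) / len(frame) for frame in frames]


def peak_normalize(samples: Sequence[float], peak: float = 1.0) -> List[float]:
    """Scale samples so the largest absolute value equals `peak`."""
    max_abs = max((abs(s) for s in samples), default=0.0)
    if max_abs == 0.0:
        return list(samples)  # silent clip, nothing to scale
    return [s * peak / max_abs for s in samples]


def fix_length(samples: Sequence[float], target_len: int) -> List[float]:
    """Trim or zero-pad at the end so the clip has exactly target_len samples."""
    clipped = list(samples[:target_len])
    return clipped + [0.0] * (target_len - len(clipped))


def preprocess(frames: Sequence[Sequence[float]], target_len: int) -> List[float]:
    """Hypothetical pipeline: mono -> peak normalization -> fixed length."""
    return fix_length(peak_normalize(to_mono(frames)), target_len)


# Two stereo frames, padded to four mono samples.
print(preprocess([[0.5, 0.5], [-0.25, -0.25]], 4))
```

Each processed clip would then be written into a folder named after its class, e.g. one folder per language.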

Please let me know the sections of the README that are not understandable so I can improve them.

@prothej227
Author

I have a dataset that contains wav files that vary in length (max = 5 seconds, min = 3 seconds).
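For clips of varying length like these, one possible approach (an assumption, not something confirmed by the maintainers in this thread) is to measure each file and cut a centered window of a fixed size, for example the 3-second minimum. The sketch below uses only Python's standard `wave` module; `wav_duration_seconds` and `center_crop_bounds` are hypothetical helper names.

```python
import wave
from typing import Tuple


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a WAV file in seconds."""
    with wave.open(path, "rb") as wf:
        return wf.getnframes() / wf.getframerate()


def center_crop_bounds(num_samples: int, target_len: int) -> Tuple[int, int]:
    """Return (start, end) indices of a centered window of target_len samples.

    Clips that are already short enough are returned whole; they would
    need padding instead of cropping.
    """
    if num_samples <= target_len:
        return 0, num_samples
    start = (num_samples - target_len) // 2
    return start, start + target_len
```

With a 16 kHz sample rate, a 5-second clip has 80000 samples, and `center_crop_bounds(80000, 48000)` selects the middle 3 seconds.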


3 participants