.wav inputs specifics #49
Comments
VoxCeleb is a good one; could you be more specific about what issues you are having?
Does it work for you?
I have the same issue, caused by the feature-map values. Did you use the input_feature.py published with the project as-is? If you did, then the problem is in the input features coming out of input_feature.py. I think (correct me if I'm wrong) that's because it uses the log-energy, which most likely produces negative values. Please let me know once you solve this issue, since I'm stuck on it as well.
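The negative log-energy values mentioned above are easy to reproduce with a small NumPy sketch. This is not the project's input_feature.py, just a hypothetical single-band toy: whenever a frame's energy falls below 1.0, its logarithm is negative, which is normal for quiet frames in amplitude-normalized audio.

```python
import numpy as np

# Toy illustration: log-energy of a quiet 25 ms frame at 16 kHz.
rng = np.random.default_rng(0)
signal = 0.01 * rng.standard_normal(400)        # low-amplitude noise frame
spectrum = np.abs(np.fft.rfft(signal)) ** 2     # power spectrum of the frame
band_energy = spectrum.mean()                   # crude single-band energy
log_energy = np.log(band_energy + 1e-12)        # negative for energies < 1.0

# A common remedy is per-utterance mean-variance normalization of
# the log features, so the sign of individual values stops mattering:
feat = np.log(spectrum + 1e-12)
feat_norm = (feat - feat.mean()) / (feat.std() + 1e-12)
```

Negative values in themselves are not a bug; if the network misbehaves, normalizing the features (as in `feat_norm` above) is usually the first thing to try.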
Hi guys, if you read the paper, the author does do VAD; however, he stated it was done in MATLAB, if I'm not mistaken. You can find some VAD solutions in Python, but they do not produce good results. My advice is not to worry about the VAD: the models will work without it. Please try out the Keras implementation here -> https://github.com/imranparuk/speaker-recognition-3d-cnn and see if that works for you. It's a work in progress; if it works, it will help you understand what is being accomplished in this repository.
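If anyone still wants some form of VAD without the MATLAB step mentioned above, a crude energy-threshold version is easy to sketch in pure NumPy. This is a hypothetical stand-in, not the paper's VAD; the function name `energy_vad`, the frame/hop sizes (25 ms / 10 ms at 16 kHz), and the -40 dB threshold are all assumptions you would tune for your data.

```python
import numpy as np

def energy_vad(signal, frame_len=400, hop=160, threshold_db=-40.0):
    """Naive energy-threshold VAD sketch.

    Returns a boolean array, one entry per frame:
    True means the frame's RMS level (relative to the utterance
    peak) exceeds threshold_db, i.e. it is treated as speech.
    """
    peak = np.max(np.abs(signal)) + 1e-12
    decisions = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        rms = np.sqrt(np.mean(frame ** 2))
        level_db = 20.0 * np.log10(rms / peak + 1e-12)
        decisions.append(level_db > threshold_db)
    return np.array(decisions)

# Usage: a loud burst surrounded by near-silence.
sig = np.concatenate([np.zeros(1600),
                      0.5 * np.sin(np.linspace(0, 100, 1600)),
                      np.zeros(1600)])
mask = energy_vad(sig)   # silent edge frames False, middle frames True
```

Frames flagged False can then simply be dropped before feature extraction, which is roughly what the paper's preprocessing achieves.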
Yes, the input_feature.py file is the same as in the project. Thank you! (and happy new year!!)
Dear all, please refer to the PyTorch implementation, which uses the VoxCeleb dataset.
Hi guys,
I have a question regarding the input .wav files used for training.
What are the audio format specifications (sample rate, channels, bit depth)?
I used VoxCeleb ( http://www.robots.ox.ac.uk/~vgg/data/voxceleb/ ) as the dataset, but it is giving me some trouble.
Do you know of any other usable dataset?
Thank you ;)
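For checking whether downloaded files match the format a pipeline expects, Python's standard wave module can report the basics. The helper name `describe_wav` is made up for this sketch; 16 kHz, mono, 16-bit PCM is the usual assumption for speech corpora such as VoxCeleb, but verify against your own copy.

```python
import wave

def describe_wav(path):
    """Return the format details that typically matter for feature
    extraction: sample rate, channel count, sample width, duration."""
    with wave.open(path, "rb") as wf:
        return {
            "sample_rate": wf.getframerate(),
            "channels": wf.getnchannels(),
            "sample_width_bytes": wf.getsampwidth(),
            "duration_s": wf.getnframes() / wf.getframerate(),
        }
```

Usage: `describe_wav("some_utterance.wav")` returns a dict you can compare against the expected 16000 Hz / 1 channel / 2 bytes before feeding files to input_feature.py.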