Problem with proper data loading #5
Comments
+1, I went down the exact same rabbit hole. Has anyone successfully run the project, or is it only a paper with some unstructured chunks of code that can't be run?
Uh, the .mp3 files need to be processed and binarized. We are still cleaning up the code. But maybe refer to files as well as the data preparation process in that repo.
@MrZixi
Good question. I was reading through the code and got lost there too. @MrZixi, do you happen to know what these .csv files are?
The csv file is generated by the preprocessing code. See here, an example of a preprocessed LJSpeech dataset. The preprocessing basically records the wav file paths, normalizes the texts, extracts the phonemes, and aligns the phoneme sequences with the wav files using the MFA tools.
As far as I can see, the given csv example has different column names from the keys used in the NeuralSVB dataloaders, which prevents us from preprocessing the data properly. Are you going to upload some working examples for the NeuralSVB repository? Thank you.
The job demands columns like 'f0', 'uv', 'mel2ph', 'me', 'prof_f0', while the code you provided, @MrZixi, generates 'spk', 'txt', 'txt_raw', 'ph', 'wav_fn'. Can you please explain how to properly load the demo dataset and start training the model?
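To make the mismatch concrete, here is a minimal sketch that compares the two sets of column names quoted in this thread. The helper `missing_columns` is hypothetical and not part of NeuralSVB, 'mel2ph' is spelled as in the maintainer's reply, and the full set of keys the dataloaders actually read may well be larger than the few named here.

```python
# Column names quoted in this thread only; the real NeuralSVB dataloaders
# may require additional keys (assumption: this list is incomplete).
REQUIRED_COLUMNS = {"f0", "uv", "mel2ph", "prof_f0"}

# Columns produced by the example preprocessing code from the other repo.
EXAMPLE_COLUMNS = {"spk", "txt", "txt_raw", "ph", "wav_fn"}


def missing_columns(available, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from `available`, sorted."""
    return sorted(set(required) - set(available))


print(missing_columns(EXAMPLE_COLUMNS))
# → ['f0', 'mel2ph', 'prof_f0', 'uv']
```

Running a check like this against the header of a preprocessed csv would show up front which fields still have to be produced before training can start.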
@MrZixi ? |
The code I referred to is an example from one of our other repos. The point is that the columns NeuralSVB needs are produced by that kind of preprocessing code. For example, 'f0' and 'prof_f0' hold the f0 information of the amateur and professional recordings, respectively, from the paired pieces. 'mel2ph' is the alignment between the phonemes and the mel-spectrogram, provided by the MFA tool. We are still cleaning up the code, and it may take some time because we are currently occupied with other things.
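As a rough illustration of what a 'mel2ph' alignment could look like, the sketch below turns phoneme intervals in seconds (the kind of timing a forced aligner such as MFA reports) into one phoneme index per mel frame. This is not the NeuralSVB preprocessing code: the function name `intervals_to_mel2ph`, the 1-based indexing, and the `sample_rate` and `hop_size` values are all assumptions made for the example.

```python
def intervals_to_mel2ph(intervals, sample_rate, hop_size):
    """Map each mel frame to the index of the phoneme that covers it.

    intervals: list of (start_sec, end_sec) tuples, one per phoneme, in order.
    Returns a list with one entry per mel frame. Indices start at 1
    (assumption: 0 is left free, e.g. for padding).
    """
    frames_per_second = sample_rate / hop_size
    mel2ph = []
    for ph_index, (start, end) in enumerate(intervals, start=1):
        # Convert the interval boundaries from seconds to frame positions.
        start_frame = round(start * frames_per_second)
        end_frame = round(end * frames_per_second)
        mel2ph.extend([ph_index] * (end_frame - start_frame))
    return mel2ph


# Hypothetical example: two phonemes, 16 kHz audio, hop size of 160 samples
# (100 mel frames per second).
alignment = intervals_to_mel2ph([(0.0, 0.03), (0.03, 0.05)], 16000, 160)
print(alignment)
# → [1, 1, 1, 2, 2]
```

The 'f0' and 'prof_f0' columns would be per-frame pitch tracks from a separate pitch extractor, which this sketch does not cover.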
@MrZixi
I think you should probably close this repo, as it's impossible to run anything and there is no one who can help.
Try to learn some manners, will you? Be patient and ask about our schedule for releasing the code and data so you can plan accordingly, or just un-star. (Though that may be pointless, since it looks like you never starred this repo.)
Hi, I'd like to run your model myself, but I cannot find a proper way to load the dataset with the .mp3 files you provided. Is there a chance you could share the dataloader you used, or give some hints on how to process the .mp3 files into a valid dataset that could be used in your usage examples? I'll be very grateful!