Regarding input data #11
Comments
@abhishekkritarth, it looks like you have not read the corresponding paper closely. The first channel holds the actual MFEC features of the audio, while the second and third channels hold the first and second derivatives of the MFEC features, respectively. They can be extracted easily using the functions available in the speechpy library.
@abhishekkritarth For further information, please take a look at the associated paper. The 3 is the number of channels, which consist of the static, first-order derivative, and second-order derivative features. Please let me know if you have any other questions.
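For readers following along, the channel layout described above can be sketched as follows. This is only an illustration under stated assumptions: the input is random placeholder data standing in for real MFEC features, the helper name `stack_static_and_derivatives` is hypothetical, and `np.gradient` is used as a simple stand-in for the derivative computation. The speechpy functions and the paper may use a different delta window, so the values would not match theirs exactly.

```python
import numpy as np


def stack_static_and_derivatives(static):
    """Stack (frames, features) static features with time derivatives.

    Hypothetical helper: np.gradient is a stand-in for the derivative
    step and is not necessarily what speechpy or the paper uses.
    """
    static = np.asarray(static, dtype=np.float32)
    # Differentiate along axis 0, i.e. across consecutive frames.
    delta = np.gradient(static, axis=0)
    delta_delta = np.gradient(delta, axis=0)
    # Channel 0: static, channel 1: first derivative, channel 2: second derivative.
    return np.stack([static, delta, delta_delta], axis=-1)


# Placeholder input standing in for real MFEC features of shape (frames, 40).
rng = np.random.default_rng(0)
mfec = rng.standard_normal((420, 40)).astype(np.float32)

cube = stack_static_and_derivatives(mfec)
print(cube.shape)  # (420, 40, 3)
```

The resulting array has the same (frames, 40, 3) layout as the sample feature file discussed in this thread, with the static features preserved unchanged in the first channel.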
@astorfi Sorry to bother you. This might be a silly question, but I want to make sure I understand correctly.
@cp2923 No problem. There is no trimming. However, you should choose how many of them you need. In the case of my paper, only 80 consecutive frames are needed, and that is a must as well! Please let me know if you have any other questions.
@cp2923 I just opened this issue again, so please close it once you have the answer to your question.
@astorfi Thank you for your quick reply. So for the (3209,40,3) npy file, should I divide it into 40 npy files as the input of create_development.py?
@cp2923 I am a little bit confused about what you are trying to do. Let's say you want to feed some (80,40,3) cube to the network (discarding the batch dimension for now). So you get those 80 frames from the 3209, right? In any case, please do not hesitate to contact me if you have an issue.
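As a rough sketch of the slicing described above, the snippet below takes 80 consecutive frames from a (3209,40,3) array to form one (80,40,3) cube. The array here is a zero-filled placeholder rather than real features, the helper name `take_consecutive_frames` is hypothetical, and choosing the start index at random is an assumption; the thread only states that the 80 frames must be consecutive.

```python
import numpy as np

NUM_FRAMES = 80  # consecutive frames per input cube, as stated in this thread


def take_consecutive_frames(features, start, num_frames=NUM_FRAMES):
    """Return num_frames consecutive frames beginning at index start."""
    if start < 0 or start + num_frames > features.shape[0]:
        raise ValueError("window does not fit inside the utterance")
    return features[start:start + num_frames]


# Placeholder standing in for a real feature file of shape (3209, 40, 3).
features = np.zeros((3209, 40, 3), dtype=np.float32)

# Assumption: the start index is picked at random among valid positions.
rng = np.random.default_rng(0)
start = int(rng.integers(0, features.shape[0] - NUM_FRAMES + 1))

cube = take_consecutive_frames(features, start)
print(cube.shape)  # (80, 40, 3)
```

A batch for the network would then be several such cubes stacked along a new leading dimension.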
@astorfi I think I totally understand. May I ask one more question? Why did you create two sessions under sample_data/1? Do I have to create a folder for each npy file? Thanks!!
@cp2923 Sure. That was just arbitrary naming for different files. There is nothing more to it.
@astorfi Thank you for your patience!
@cp2923 My pleasure. I will close this.
@astorfi
@cp2923 Please ignore those naming conventions. They are database related. Please follow the paper.
@astorfi Thanks for your reply.
Hi @astorfi
I have gone through your code. When I extract MFCC features for a sample audio file, the result has shape (420,40), where 420 is the number of frames and 40 is the number of features. But the MFEC feature file in your sample data has shape (3209,40,3). As I understand it, 3209 is the number of frames, 40 is the number of features, and 3 is the number of channels. I don't understand what the channels are used for. Could you please suggest how to create a Feature_mfec.npy file in your format?