Regarding input data #11

abhishekkritarth · 2018-03-21T05:28:24Z

Hi @astorfi
I have gone through your code. When I extract MFCC features for a sample audio file, the result has shape (420, 40), where 420 is the number of frames and 40 is the number of features. But the MFEC feature file in the sample data of your code has shape (3209, 40, 3). As I understand it, 3209 is the number of frames, 40 is the number of features, and 3 is the number of channels. I don't understand what the channels are used for. Can you please suggest how to create a Feature_mfec.npy file in your format?

mmubeen-6 · 2018-03-21T12:34:55Z

@abhishekkritarth, it looks like you have not read the corresponding paper properly. The first channel holds the actual MFEC features of the audio, while the second and third channels hold the first and second derivatives of the MFEC features, respectively. They can be easily extracted using the functions available in the SpeechPy library.

astorfi · 2018-03-24T00:48:18Z

@abhishekkritarth For further information please take a look at the associated paper:
Text-Independent Speaker Verification Using 3D Convolutional Neural Networks.

The 3 is the number of channels, which consist of the static, first-order derivative, and second-order derivative features.
As @mmubeen-6 kindly mentioned, they can be extracted directly using SpeechPy.
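
To make the (frames, features, 3) layout concrete, here is a minimal NumPy-only sketch, not code from this repository. It assumes the static features are already available as a (num_frames, 40) array; a random array stands in for them here. `np.gradient` is used as a simple finite-difference stand-in for the derivative step, so the numbers will differ from what SpeechPy produces, and the function name and the file name are hypothetical.

```python
import os
import tempfile

import numpy as np


def stack_static_and_derivatives(static):
    """Stack static features with their first and second temporal derivatives.

    static: array of shape (num_frames, num_features).
    Returns an array of shape (num_frames, num_features, 3).
    """
    # Assumption: plain finite differences along the time axis. This is a
    # stand-in for SpeechPy's derivative extraction, not a reimplementation.
    delta = np.gradient(static, axis=0)
    delta_delta = np.gradient(delta, axis=0)
    # Channel 0: static, channel 1: first derivative, channel 2: second derivative.
    return np.stack([static, delta, delta_delta], axis=-1)


# Placeholder for the real features of one audio file: 420 frames x 40 features.
rng = np.random.default_rng(0)
static_features = rng.standard_normal((420, 40))

feature_cube = stack_static_and_derivatives(static_features)

# Save and reload to confirm the shape on disk (hypothetical file name).
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "features_mfec.npy")
    np.save(path, feature_cube)
    reloaded = np.load(path)
```

With real audio, the static array would come from a log mel-filterbank energy extractor rather than from random numbers, and the derivative step would be replaced by the library routine.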

Please let me know if you have any other questions.

cp2923 · 2018-04-07T22:51:23Z

@astorfi Sorry to bother you. This might be a silly question, but I want to make sure I understand correctly.
The 3209, which is the number of MFEC frames, should vary between samples. But I don't need to trim them to 80 (the input frame size you mention in the paper) before training. Am I correct?

astorfi · 2018-04-07T23:17:22Z

@cp2923 No problem.

There is no trimming. However, you should choose how many of them you need. In the case of my paper, only 80 consecutive frames are needed, and that is a must!
For example, 80 consecutive frames correspond to 0.8 s when a 10 ms stride is used for generating features.
The 3209 you mentioned is for the whole sound file.
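
As an illustration of picking 80 consecutive frames from a whole-file array, here is a minimal sketch. It is not taken from the repository: the function name is hypothetical, a zero array stands in for real features, and choosing the start position at random is an assumption, since the thread does not say how the window is selected.

```python
import numpy as np

NUM_FRAMES = 80           # consecutive frames per network input
FRAME_STRIDE_SEC = 0.010  # 10 ms stride between frames


def crop_consecutive_frames(features, num_frames=NUM_FRAMES, rng=None):
    """Return `num_frames` consecutive frames from a (frames, features, channels) array."""
    total = features.shape[0]
    if total < num_frames:
        raise ValueError(f"need at least {num_frames} frames, got {total}")
    rng = np.random.default_rng() if rng is None else rng
    # Assumption: the start index is drawn uniformly at random.
    start = int(rng.integers(0, total - num_frames + 1))
    return features[start:start + num_frames]


# Placeholder for the features of one whole sound file.
whole_file = np.zeros((3209, 40, 3), dtype=np.float32)

cube = crop_consecutive_frames(whole_file, rng=np.random.default_rng(0))

# 80 frames at a 10 ms stride span roughly 0.8 s of audio.
duration_sec = NUM_FRAMES * FRAME_STRIDE_SEC
```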

Please let me know if you have any other questions.

astorfi · 2018-04-07T23:17:56Z

@cp2923 I just reopened this issue, so please close it once your question has been answered.
Thanks

cp2923 · 2018-04-07T23:38:25Z

@astorfi Thank you for your quick reply. So for the (3209,40,3) npy file, should I divide it into 40 npy files as the input to create_development.py?

astorfi · 2018-04-07T23:59:52Z

@cp2923 I am a little confused about what you are trying to do. Let's say you want to feed an (80,40,3) cube to the network (ignoring the batch dimension for now). So you take those 80 frames from the 3209, right? The create_development.py file is specific to the dataset that I used, and it's just a sample. You do not need to worry about it. Please read the paper in detail to understand the input pipeline.
Please refer to this post as well.
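
For reference, the "40" in the question matches the arithmetic: 3209 frames contain 40 full non-overlapping windows of 80 frames, with 9 frames left over. The sketch below shows that split under stated assumptions. It is not the repository's pipeline, the function name is hypothetical, and whether windows should be non-overlapping, overlapping, or randomly placed is not specified in this thread.

```python
import numpy as np


def split_into_cubes(features, num_frames=80):
    """Split (frames, features, channels) into non-overlapping windows.

    Returns an array of shape (num_cubes, num_frames, features, channels);
    trailing frames that do not fill a whole window are dropped.
    """
    num_cubes = features.shape[0] // num_frames
    usable = features[:num_cubes * num_frames]
    return usable.reshape(num_cubes, num_frames, *features.shape[1:])


# Placeholder for the features of one whole sound file.
whole_file = np.zeros((3209, 40, 3), dtype=np.float32)

cubes = split_into_cubes(whole_file)
leftover_frames = whole_file.shape[0] - cubes.shape[0] * cubes.shape[1]
```

Each `cubes[i]` is then one (80,40,3) input; the separate npy files per window asked about above are not required for this.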

In any case, please do not hesitate to contact me if you have an issue.

cp2923 · 2018-04-08T01:34:39Z

@astorfi I think I totally understand now. May I ask one more question? Why did you create two sessions under sample_data/1? Do I have to create a folder for each npy file? Thanks!

astorfi · 2018-04-08T08:00:12Z

@cp2923 Sure. That was just arbitrary naming for the different files. There is nothing significant about it.

cp2923 · 2018-04-08T18:16:57Z

@astorfi Thank you for your patience!
Sorry, I can't find how to close the issue here.

astorfi · 2018-04-08T18:47:14Z

@cp2923 My pleasure. I will close this.

cp2923 · 2018-04-08T19:49:26Z

@astorfi
Hi, sorry to bother you again. I still feel a little confused when I read create_development.
Do folders 1 and 2 under sample_data denote two speakers?
What's the difference between train_files_subjects_list and train_files_subjects_ids?

astorfi · 2018-04-09T16:40:15Z

@cp2923 Please ignore those naming conventions. That's database related. Please follow the paper.

cp2923 · 2018-04-09T17:08:02Z

@astorfi Thanks for your reply.
These questions seem too detailed to be covered in your paper. I think I need to know what train_files_subjects_list and train_files_subjects_ids are before I can build the data input correctly.

astorfi closed this as completed Mar 29, 2018

astorfi reopened this Apr 7, 2018

astorfi closed this as completed Apr 8, 2018
