
Retrain on new and own dataset #7

Closed
Ahmed-Abouzeid opened this issue Dec 27, 2017 · 10 comments

@Ahmed-Abouzeid

Hi,

First, that is a great job and very well done :)
Now I am trying to use your source code and maybe contribute to it. I am working on a speaker recognition problem: detecting whether a teacher's tutorial was recorded in his own voice. I have about 10 hours of historical recordings for 6 teachers. First I used speechpy to get 3D .npy files from the wav files, then used your create_development.py to create the HDF5 files for train and eval. Is that correct? In particular, I got 13 instead of 40 for the feature vector length in the .npy files! When I ran the run.bash file it also gave me an error like this:

```
ValueError: Negative dimension size caused by subtracting 2 from 1 for 'MaxPool_7' (op: 'MaxPool') with input shapes: [?,1,112,128]
```
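For intuition, here is a rough sketch of why a 13-dimensional feature axis runs out of room under repeated pooling while a 40-dimensional one survives. This is not the repo's actual layer configuration, just an assumed stack of VALID 2x2 max-pools with stride 2:

```python
# Rough illustration (assumed architecture, not the repo's exact one):
# each VALID max-pool with kernel 2 and stride 2 roughly halves one axis.
def pooled_size(size, kernel=2, stride=2, stages=1):
    """Output size of one axis after `stages` VALID max-pool layers."""
    for _ in range(stages):
        if size < kernel:
            raise ValueError(
                f"Negative dimension: size {size} is smaller than kernel {kernel}")
        size = (size - kernel) // stride + 1
    return size

print(pooled_size(40, stages=3))  # 40 -> 20 -> 10 -> 5
print(pooled_size(13, stages=3))  # 13 -> 6 -> 3 -> 1
# One more stage on the 13-dim axis fails, which is the kind of
# "Negative dimension size" error TensorFlow reports.
```

So a feature dimension of 13 collapses to 1 after three stages and cannot be pooled again, while 40 still has room.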

@Ostnie

Ostnie commented Dec 28, 2017

@Ahmed-Abouzeid Hi, I have some problems too! I also want to know how to create my own data. I ran create_development.py directly, but it gave me:

```
train_files_subjects_list.append(file_name.split('/')[7])
IndexError: list index out of range
```

Could you help me? I don't know how to change the index.
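As a hypothetical workaround (the helper name and path layout below are my own, not from the repo), the subject name can be derived from the parent directory of each .npy file instead of a hard-coded split index, which breaks whenever the path has a different depth:

```python
import os

# Sketch: take the subject name from the immediate parent folder of the
# feature file, regardless of how many directories precede it.
def subject_from_path(file_path):
    return os.path.basename(os.path.dirname(file_path))

print(subject_from_path("../sample_data/nihalEssmat/rec01.npy"))
```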

@Ahmed-Abouzeid
Author

Ahmed-Abouzeid commented Jan 3, 2018

@Ostnie are you aware that the training files are not audio samples? They are .npy files which contain the features of a certain person's voice. So you first need to run the speechpy package on the wav files to create the .npy files, and then use those in this repository. I am copying the speechpy code I wrote, for clarification:

```python
import os
import sys

import numpy as np
import scipy.io.wavfile as wav
import speechpy

for f in os.listdir(sys.argv[1]):
    # Build the path of the current wav file relative to this script.
    file_name = os.path.join(os.path.dirname(os.path.abspath(__file__)),
                             sys.argv[1], f)
    fs, signal = wav.read(file_name)

    # Example of pre-emphasizing.
    signal_preemphasized = speechpy.processing.preemphasis(signal, cof=0.98)

    # Example of stacking frames.
    frames = speechpy.processing.stack_frames(signal, sampling_frequency=fs,
                                              frame_length=0.020,
                                              frame_stride=0.01,
                                              filter=lambda x: np.ones((x,)),
                                              zero_padding=True)

    # Example of extracting the power spectrum.
    power_spectrum = speechpy.processing.power_spectrum(frames, fft_points=512)
    # print('power spectrum shape=', power_spectrum.shape)

    # ############ Extract MFCC features ############
    mfcc = speechpy.feature.mfcc(signal, sampling_frequency=fs,
                                 frame_length=0.020, frame_stride=0.01,
                                 num_filters=40, fft_length=512,
                                 low_frequency=0, high_frequency=None)
    mfcc_cmvn = speechpy.processing.cmvnw(mfcc, win_size=301,
                                          variance_normalization=True)
    # print('mfcc(mean + variance normalized) feature shape=', mfcc_cmvn.shape)

    mfcc_feature_cube = speechpy.feature.extract_derivative_feature(mfcc)
    print('mfcc feature cube shape=', mfcc_feature_cube.shape)

    np.save(os.path.join(sys.argv[1], f), mfcc_feature_cube)
```

@Ahmed-Abouzeid
Author

@Ostnie The next step should be running create_development.py after preparing train_subjects_path.txt. In my case it contains:
../sample_data/nihalEssmat
../sample_data/samarBahaaeldin
because I placed these two folders (nihalEssmat, samarBahaaeldin) in sample_data, and they contain the .npy files generated for each subject by the speechpy code I provided above. The final result should be the HDF5 file to use during training.

Note (1): my output numpy arrays had a specific shape, and I changed the code in this repository accordingly. For example, in create_development.py I changed num_coefficient = 40 to num_coefficient = 13, since 13 was the feature dimension in my .npy files.

Note (2): I created eval_subjects_path.txt and put something similar to train_subjects_path.txt into it, but this time pointing at the .npy files I will use for evaluation. You will need that later when you run the bash script (run.sh).
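To make the layout concrete, here is a sketch of how the folders and the two path files might be set up (folder names follow the example above; treat the exact layout as an assumption, not the repo's requirement):

```shell
# Each subject folder holds the .npy feature files produced by the
# speechpy script; the two text files list one subject folder per line.
mkdir -p sample_data/nihalEssmat sample_data/samarBahaaeldin

printf '../sample_data/nihalEssmat\n../sample_data/samarBahaaeldin\n' \
    > train_subjects_path.txt
printf '../sample_data/nihalEssmat\n../sample_data/samarBahaaeldin\n' \
    > eval_subjects_path.txt

cat train_subjects_path.txt
```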

I hope what I am saying makes sense to you. I will let you know if I am able to fix the original issue I posted earlier.

@nkcsfight

@Ahmed-Abouzeid Hello, have you ever fixed your issue?

@Ahmed-Abouzeid
Author

@nkcsfight Hi, I am revisiting this problem and will work on it in the coming days. If I get anywhere I will post it here!

@marianadehon

Hi, I believe you are using the wrong method here. You should be using lmfe instead of mfcc.

@mmubeen-6

@Ahmed-Abouzeid I guess your problem is that you have decreased the number of features from 40 to 13. The network is structured in such a way that if the input is not 40x80, the size becomes negative during the convolution and pooling operations, which is exactly what is happening in your case. Changing it back to 40 and preparing your data using 40 filters should solve the problem.

@naeemrehmat65

Hi,
Your work is really interesting. I used your code and am trying to contribute to it. I could not find any code that creates these two data files, "development_sample_dataset_speaker.hdf5" and "enrollment-evaluation_sample_dataset.hdf5". Any help will be appreciated.

@astorfi
Owner

astorfi commented Feb 10, 2018

@naeemrehmat65 Thanks for your interest. Please refer to this part for creating development. Creating enrollment and evaluation is similar. This part is just related to how you create an HDF5 file for feeding the network. It must be customized and modified for your specific data.

@naeemrehmat65

naeemrehmat65 commented Feb 12, 2018 via email

@astorfi astorfi closed this as completed Feb 27, 2018