
Error when commenting out "--rir-set-parameters" #11

Open · deciding opened this issue Feb 29, 2020 · 10 comments

@deciding

It's strange that in run.sh the following lines are commented out:

#  # Make a version with reverberated speech
#  rvb_opts=()
#  rvb_opts+=(--rir-set-parameters "0.5, RIRS_NOISES/simulated_rirs/smallroom/rir_list")                                         
#  rvb_opts+=(--rir-set-parameters "0.5, RIRS_NOISES/simulated_rirs/mediumroom/rir_list") 

However, --rir-set-parameters is a required argument of steps/data/reverberate_data_dir.py, so commenting out these lines causes an error.
Could you explain why they are commented out, and whether your experiments include reverberation-augmented training data? I am having trouble reproducing your results, so I want to make sure our training data is the same. Thanks!
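
For reference, here is roughly what the uncommented block and the call that consumes it look like in the standard Kaldi recipe (following kaldi/egs/voxceleb/v2/run.sh; the data directory names below are illustrative, not necessarily this repo's exact paths):

  # Make a version with reverberated speech
  rvb_opts=()
  rvb_opts+=(--rir-set-parameters "0.5, RIRS_NOISES/simulated_rirs/smallroom/rir_list")
  rvb_opts+=(--rir-set-parameters "0.5, RIRS_NOISES/simulated_rirs/mediumroom/rir_list")

  # reverberate_data_dir.py requires at least one --rir-set-parameters entry,
  # which is why commenting out the lines above makes this call fail.
  python steps/data/reverberate_data_dir.py \
    "${rvb_opts[@]}" \
    --speech-rvb-probability 1 \
    --pointsource-noise-addition-probability 0 \
    --isotropic-noise-addition-probability 0 \
    --num-replications 1 \
    --source-sampling-rate 16000 \
    data/train data/train_reverb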

@mycrazycracy (Owner)

Hi

The lines were commented out accidentally. They are needed if you do the reverberation augmentation. This augmentation is standard in the Kaldi recipe (please see kaldi/egs/voxceleb/v2/run.sh for reference).
Generally speaking, we add reverberation, noise, music, and babble to the training data.

Have you tried the pre-trained models? I just want to confirm that the mismatch occurs in the training phase rather than in evaluation.
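
For reference, the noise/music/babble part of that standard augmentation looks roughly like this in kaldi/egs/voxceleb/v2/run.sh (a sketch; the MUSAN directory names are the Kaldi recipe's, not necessarily this repo's):

  # Augment with MUSAN noise, music, and babble, as in the Kaldi recipe.
  steps/data/augment_data_dir.py --utt-suffix "noise" --fg-interval 1 \
    --fg-snrs "15:10:5:0" --fg-noise-dir "data/musan_noise" \
    data/train data/train_noise
  steps/data/augment_data_dir.py --utt-suffix "music" --bg-snrs "15:10:8:5" \
    --num-bg-noises "1" --bg-noise-dir "data/musan_music" \
    data/train data/train_music
  steps/data/augment_data_dir.py --utt-suffix "babble" --bg-snrs "20:17:15:13" \
    --num-bg-noises "3:4:5:6:7" --bg-noise-dir "data/musan_speech" \
    data/train data/train_babble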

@deciding (Author)

@mycrazycracy Hi, thanks for the reply. I tested the pretrained model with the xvector_nnet_tdnn_arcsoftmax_m0.30_linear_bn_1e-2 config, and the result is roughly the same, about 2% EER. However, when I train from scratch on my side, the result is much worse (around 4%).

The mismatch in validation loss appears even at the beginning of training.

yours (epoch / loss / val err)      mine (epoch / loss / val err)
0 4.347158 0.121704     0 5.595568 0.153933                                                                                                           
1 4.409681 0.124732     1 6.142251 0.152045                                                                                                           
2 4.953392 0.164300     2 5.387073 0.145626                                                                                                           
3 4.171205 0.121704     3 5.853278 0.138200                                                                                                           
4 6.030734 0.141988     4 4.879803 0.145833                                                                                                           
5 3.373980 0.117647     5 3.483222 0.127753                                                                                                           
6 3.551348 0.114610     6 2.925486 0.126240                                                                                                           
7 3.499111 0.126222     7 2.618020 0.112649                                                                                                           
8 2.706735 0.106290     8 2.495200 0.108464                                                                                                           
9 2.505243 0.087221     9 2.797635 0.107489                                                                                                           
10 2.579709 0.105946    10 3.352109 0.108477                                                                                                          
11 2.386637 0.083784    11 2.149656 0.094273                                                                                                          
12 2.215215 0.078906    12 2.788486 0.107827                                                                                                          
13 2.093923 0.070809    13 2.154233 0.096413                                                                                                          
14 2.614316 0.095335    14 2.664895 0.099308                                                                                                          
15 2.402408 0.068661    15 2.585773 0.102958
16 2.269810 0.081136    16 2.037069 0.082545
17 2.459356 0.068966    17 2.123691 0.084204
18 1.660872 0.069354    18 1.782891 0.078462
19 1.641136 0.070407    19 1.666662 0.078414
20 1.789717 0.070994    20 1.884485 0.074890
21 1.732624 0.062880    21 1.991562 0.085085
22 1.685123 0.060852    22 1.975221 0.087411
23 1.925408 0.078476    23 1.887127 0.078248
24 1.225578 0.060852    24 1.545569 0.067716
25 1.237508 0.060852    25 1.556502 0.076639
26 1.168123 0.058824    26 1.478283 0.072876
27 1.178835 0.045722    27 1.438764 0.072354
28 1.394852 0.062880    28 1.483797 0.067086
29 1.245886 0.061878    29 1.449579 0.072498
30 1.333659 0.056795    30 1.443876 0.066027
31 0.997353 0.043031    31 1.379931 0.071366
32 0.920696 0.044625    32 1.365147 0.073002
33 1.091778 0.052738    33 1.386801 0.066677
34 1.029863 0.046250    34 1.350820 0.067212
35 0.953727 0.051637    35 1.251097 0.064168
36 0.956934 0.051722    36 1.254138 0.063059
37 0.675873 0.036846    37 1.358343 0.066842
38 0.815823 0.044149    38 1.531808 0.067086
39 0.705373 0.040568    39 1.218243 0.068974
40 0.714766 0.032454    40 1.435390 0.061162
41 0.763318 0.046653    41 1.284261 0.065576
42 0.574937 0.034483    42 1.291959 0.063184
43 0.592286 0.031686    43 1.354145 0.069352
44 0.581585 0.036511    44 1.020841 0.058024
45 0.690763 0.043893    45 1.134171 0.060919
46 0.698204 0.040568    46 0.971369 0.062687
47 0.545573 0.026369    47 1.049776 0.057784
48 0.564414 0.029049    48 1.095432 0.059408
49 0.577830 0.034483    49 1.046128 0.056462
50 0.578830 0.033454    50 1.139795 0.063059
51 0.570985 0.034483    51 0.849101 0.051990
52 0.534715 0.032454    52 0.845482 0.053041
53 0.577185 0.035037    53 0.867068 0.055129
54 0.524684 0.033184    54 0.847392 0.051038
55 0.507371 0.034354    55 0.837217 0.052612
56 0.503855 0.032454    56 0.879506 0.053876
57 0.535493 0.029426    57 0.881150 0.055632
58 0.536870 0.040568    58 0.906975 0.057269
59 0.505829 0.033793    59 0.884071 0.056639
60 0.567816 0.038540    60 0.730023 0.051479
61 0.556293 0.038540    61 0.737830 0.050850
62 0.473255 0.032454    62 0.740354 0.052108
63 0.533865 0.040224    63 0.753932 0.051596
64 0.514418 0.034483    64 0.727656 0.053619
65 0.487497 0.028398    65 0.774739 0.053870
66 0.513700 0.025216    66 0.832094 0.048081
67 0.447182 0.026369    67 0.799185 0.052486
68 0.464616 0.026369    68 0.785597 0.053367
69 0.433060 0.024231    69 0.660124 0.049570
70 0.463813 0.024233    70 0.699606 0.050598
71 0.475865 0.028398    71 0.673907 0.048836
72 0.460187 0.026679    72 0.647345 0.046193
73 0.473736 0.027065    73 0.678176 0.048212
74 0.515551 0.024341    74 0.708431 0.051314
75 0.481606 0.034483    75 0.680705 0.049200
76 0.487582 0.027921    76 0.700234 0.048962

One notable thing: in your log file the validation set has 29 batches, while in my case it has 32, so I suspect our preprocessed datasets differ.
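
A quick sanity check: the number of validation batches is roughly ceil(num_valid_utterances / batch_size), so 29 vs. 32 batches at the same batch size means the two validation splits contain different numbers of utterances. One way to compare them, assuming the split is written as a Kaldi-style data directory (the path below is a guess; adjust it to wherever your run writes the validation split):

  wc -l data/train/valid/utt2spk   # count of validation utterances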

@mycrazycracy (Owner) commented Feb 29, 2020 via email

@deciding (Author)

@mycrazycracy Yes, I agree that the validation set is random, but I have run the data preprocessing several times and the validation losses come out quite similar each time; I never get validation losses close to yours. This may indicate a problem.

I used PLDA; with cosine or LDA-cosine scoring the result is even worse.

One more point: even if the validation set is randomly generated, shouldn't its size be the same? I shouldn't end up with a 32-batch validation set while yours has 29 batches. Is that correct?

@mycrazycracy (Owner) commented Feb 29, 2020 via email

@shatealaboxiaowang

> @deciding: [quoting the comment above: the pretrained model gives ~2% EER, training from scratch gives ~4%, and the validation set has 29 vs. 32 batches]

Hi,
I have also tested the pretrained model with the xvector_nnet_tdnn_arcsoftmax_m0.30_linear_bn_1e-2 config, but my EER is 0.05. What distance measure do you use between two embeddings? Mine is cosine. Are there any points to note in the prediction process?
Thanks

@mycrazycracy (Owner)

> @shatealaboxiaowang: [quoting the question above about getting 0.05 EER with cosine scoring]

Please use PLDA. Just follow run.sh in the voxceleb egs.
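
For reference, the PLDA training and scoring stage in kaldi/egs/voxceleb/v2/run.sh looks roughly like this (a sketch only; the exp/ and data/ paths are illustrative, and here the vectors would be the embeddings extracted by this repo's model rather than Kaldi x-vectors):

  # Train PLDA on mean-subtracted, LDA-projected, length-normalized embeddings.
  ivector-compute-plda ark:data/train/spk2utt \
    "ark:ivector-subtract-global-mean scp:exp/xvectors_train/xvector.scp ark:- | transform-feats exp/xvectors_train/transform.mat ark:- ark:- | ivector-normalize-length ark:- ark:- |" \
    exp/xvectors_train/plda

  # Score the VoxCeleb1 trial pairs with the trained PLDA model.
  ivector-plda-scoring --normalize-length=true \
    "ivector-copy-plda --smoothing=0.0 exp/xvectors_train/plda - |" \
    "ark:ivector-subtract-global-mean exp/xvectors_train/mean.vec scp:exp/xvectors_test/xvector.scp ark:- | transform-feats exp/xvectors_train/transform.mat ark:- ark:- | ivector-normalize-length ark:- ark:- |" \
    "ark:ivector-subtract-global-mean exp/xvectors_train/mean.vec scp:exp/xvectors_test/xvector.scp ark:- | transform-feats exp/xvectors_train/transform.mat ark:- ark:- | ivector-normalize-length ark:- ark:- |" \
    "cat data/voxceleb1_test/trials | cut -d' ' -f1,2 |" exp/scores_voxceleb1_test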

@shatealaboxiaowang

> @mycrazycracy: [quoting the reply above: "Please use PLDA. Just follow run.sh in the voxceleb egs."]

Yes, it works as expected now. Thank you for sharing and replying.

@deciding (Author)

deciding commented Mar 9, 2020

@shatealaboxiaowang I hope you get the same training result. If you manage to reproduce it with your own training, could you update this thread? Thanks.

@shatealaboxiaowang

> @deciding: I hope you get the same training result. If you manage to reproduce it with your own training, could you update this thread?

OK, I will update this thread if I get the same training result.
