Poor performance of DeepClustering #138

Closed
hangtingchen opened this issue Jun 3, 2020 · 13 comments


@hangtingchen (Contributor)

Hi,
First, thanks a lot for such an excellent tool for speech separation. I have tried the deep clustering recipe on wsj0-mix:
https://github.com/mpariente/asteroid/tree/master/egs/wsj0-mix/DeepClustering
My performance was poor (SI-SDR = 3.5 dB, SDR = 4.5 dB after 35 epochs of training on 1 GPU). As reported here, the SDR is expected to be close to 10 dB. I am wondering what caused the failure. Are there any tricks for training, or are more epochs needed for improvement?

Thanks a lot.

@mpariente (Collaborator)

Hey, thanks for opening an issue.
Let me paste my training config here:

```yaml
data:
  n_src: 2
  sample_rate: 8000
  train_dir: data/2speakers/wav8k/min/tr
  valid_dir: data/2speakers/wav8k/min/cv
filterbank:
  kernel_size: 256
  n_filters: 256
  stride: 64
main_args:
  exp_dir: exp/train_chimera_dcalone_newlr/
  help: null
masknet:
  dropout: 0.3
  embedding_dim: 40
  hidden_size: 600
  n_layers: 4
  n_src: 2
  rnn_type: lstm
  take_log: true
optim:
  lr: 0.0001
  optimizer: adam
  weight_decay: 0.0
positional arguments: {}
training:
  batch_size: 32
  early_stop: true
  epochs: 200
  half_lr: true
  loss_alpha: 1.0
  num_workers: 8
```

Most importantly, the learning rate is 1e-4 instead of 1e-5 in the recipe. I'll change the default now.
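To make that concrete, here is a minimal sketch in plain PyTorch (not the actual Asteroid training code) of what the `optim` section plus `half_lr: true` amount to: Adam at lr 1e-4, with the learning rate halved when the validation loss plateaus. The stand-in model and the patience value are assumptions, not taken from the config.

```python
# Sketch only: plain PyTorch equivalent of the `optim` + `half_lr: true` settings above.
import torch

model = torch.nn.LSTM(input_size=256, hidden_size=600, num_layers=4)  # stand-in model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=0.0)
# `half_lr: true` -> halve the learning rate when the val loss stops improving.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.5, patience=5  # patience is an assumption
)

# Call once per epoch with the validation loss:
val_loss = 2930.0  # e.g. the best val loss mentioned below
scheduler.step(val_loss)
```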

Here are the metrics:

```json
{
  "si_sdr": 9.846718255569847,
  "si_sdr_imp": 9.84787033402749,
  "sdr": 10.364474047942608,
  "sdr_imp": 10.213430163640966,
  "sir": 19.018816263769104,
  "sir_imp": 18.867772082693932,
  "sar": 11.336185832787844,
  "sar_imp": -64.43806377592414,
  "stoi": 0.8787521784931114,
  "stoi_imp": 0.1407063969268085
}
```
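For context, `si_sdr` above is the scale-invariant SDR. Here is a minimal NumPy sketch of its usual definition (Le Roux et al., "SDR - half-baked or well done?"); the actual evaluation scripts may differ in detail:

```python
import numpy as np

def si_sdr(estimate, reference):
    """Scale-invariant SDR in dB between two 1-D signals."""
    estimate = estimate - estimate.mean()
    reference = reference - reference.mean()
    # Project the estimate onto the reference to get the target component.
    alpha = np.dot(estimate, reference) / np.dot(reference, reference)
    target = alpha * reference
    noise = estimate - target
    return 10 * np.log10(np.dot(target, target) / np.dot(noise, noise))
```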

My best val loss was around 2930 and I trained for 130 epochs.

If you come over to Slack, I can share the pretrained model folder with you (under a research-only license).

@hangtingchen (Contributor, Author)

Thanks a lot, I am going to try this config. The results should come out tomorrow. If they look fine, I will close this issue. One more question:
https://github.com/mpariente/asteroid/blob/d106902c9c1c939dc70f9ff1b963223c42d61547/asteroid/data/wsj0_mix.py#L33
The dataset uses 4-second segments. Did you adopt the same setting?
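As a quick sanity check of what that setting means, assuming the 8 kHz sample rate from the config above:

```python
# 4-second segments at the 8 kHz sample rate from the training config.
sample_rate = 8000   # from `sample_rate: 8000`
segment_seconds = 4  # segment length referenced in wsj0_mix.py
print(sample_rate * segment_seconds)  # 32000 samples per training example
```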

@mpariente (Collaborator)

> The dataset uses 4-second segments. Did you adopt the same setting?

Yes, I did.

@hangtingchen (Contributor, Author)

Sorry to bother you again. After the first epoch, my loss was around 6000. Is this normal?
Could you share the model and log if it is convenient for you? I will check the problem myself.

@mpariente (Collaborator)

Sounds about right.
Here is the log file.
The exp folder is quite heavy; I cannot attach it on GitHub.
train_dcalone_newlr.log

@hangtingchen (Contributor, Author)

It seems that I cannot reproduce the reported results. After adopting the same settings, the SI-SDR is 8.3 dB, still more than 1 dB lower than yours. I don't know the reason, since I used the newest code, but the software versions might differ. Please comment if you have any suggestions; otherwise, I will close the issue. Thanks a lot.

@mpariente (Collaborator)

> It seems that I cannot reproduce the reported results. After adopting the same settings, the SI-SDR is 8.3 dB, still more than 1 dB lower than yours. I don't know the reason, since I used the newest code, but the software versions might differ. Please comment if you have any suggestions; otherwise, I will close the issue. Thanks a lot.

Could you please use version 0.2.0 (git checkout v0.2.0) and try training with the same config?
We have had some issues with LSTMs and the new version of PyTorch Lightning in the past; we have upgraded Lightning since then.

Thanks for reporting your problems.

@mpariente (Collaborator)

@hangtingchen Could you try it please?

@hangtingchen (Contributor, Author)

hangtingchen commented Jun 12, 2020 via email

@hangtingchen (Contributor, Author)

hangtingchen commented Jul 5, 2020 via email

@mpariente (Collaborator)

Maybe you're using wsj1?
I don't know; it seems weird to me. Can anybody else reproduce the problem?

hangtingchen reopened this Jul 7, 2020
@hangtingchen (Contributor, Author)

Hi,
Sorry to disturb you again.
Could you please answer one more question about DeepClustering?
I have noticed that the wsj0-2mix dataset was created from the waveforms generated by
https://github.com/mpariente/asteroid/blob/0bdec2644f2d770d037ce804b7f70cb98bd5c9fa/egs/wsj0-mix/DeepClustering/local/convert_sphere2wav.sh#L31
That line converts both wv1 and wv2 files. However, because both channels share the same utterance name, the wav generated from wv2 overwrites the one from wv1. The wv1 recordings are noise-free, whereas wv2 contains more noise.
In summary, I effectively used wv2 to generate the wsj0-mix dataset, but I don't know whether this is the reason for my poor performance. Which one did you use to generate the wav files?

An example:
[screenshot: two lines with the same name in sph.list]
[image: wv1 spectrogram]
[image: wv2 spectrogram]
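For illustration, here is one hypothetical way to avoid the overwrite by keeping a single file per utterance and preferring wv1; the file names and list format here are made up, not taken from the recipe:

```python
# Hypothetical sketch: deduplicate a sphere file list so that a wv2 file
# can no longer overwrite the wv1 conversion of the same utterance.
from pathlib import Path

def prefer_wv1(sph_paths):
    """Keep one .wv* file per utterance stem, preferring wv1 over wv2."""
    chosen = {}
    for path in map(Path, sph_paths):
        stem = path.stem  # utterance id, e.g. '011a0101'
        if stem not in chosen or path.suffix == ".wv1":
            chosen[stem] = path
    return sorted(chosen.values())

sph_list = ["disc1/011a0101.wv1", "disc2/011a0101.wv2", "disc1/011a0102.wv2"]
# Keeps 011a0101.wv1 (wv1 preferred) and 011a0102.wv2 (only channel available).
print(prefer_wv1(sph_list))
```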

@mpariente (Collaborator)

Oh, that's a very good point. Could you please open it as a separate issue?
Thanks!
