Deep clustering updates #92
Conversation
Thanks a bunch for the PR!
I didn't spend much time reviewing the training script because it looked like WIP; let me know when I can review it.
Oh, and the test for the DC loss breaks because of the code change. The test will have to be updated before merging, but that's not the priority.
if log:
    #TODO: Use pytorch lightning logger here
    print('Using log spectrum as input')
It's actually fine not to print anything here; it will be printed in the conf dictionary anyway. I would name the argument differently though, maybe take_log or log_spec?
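A minimal sketch of the rename, assuming the ChimeraPP signature shown in the diff below; take_log is just one of the suggested names, and the epsilon is an illustrative choice, not code from this PR:

import torch
from torch import nn

class ChimeraPP(nn.Module):
    # Sketch: only the renamed flag is shown, the recurrent stack is elided.
    def __init__(self, in_chan, n_src, rnn_type='lstm', embedding_dim=20,
                 n_layers=2, hidden_size=600, dropout=0, bidirectional=True,
                 take_log=False):
        super().__init__()
        self.take_log = take_log  # no print needed, the value shows up in the conf dict

    def forward(self, spec):
        if self.take_log:
            # small epsilon avoids log(0) = -inf on silent bins (assumed stabilizer)
            spec = torch.log(spec + 1e-8)
        return spec  # the real model would run the BLSTM stack here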
@@ -433,7 +433,7 @@ class ChimeraPP(nn.Module):
     """
     def __init__(self, in_chan, n_src, rnn_type = 'lstm',
                  embedding_dim=20, n_layers=2, hidden_size=600,
-                 dropout=0, bidirectional=True):
+                 dropout=0, bidirectional=True, log=False):
Update docstring as well
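One way the new argument could be documented, following the Args style of the other asteroid models (the wording is a suggestion, the rest of the docstring is elided):

from torch import nn

class ChimeraPP(nn.Module):
    """Chimera++ mask network.

    Args:
        log (bool, optional): If True, take the log of the input magnitude
            spectra before feeding them to the network. Defaults to False.
    """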
n_filters: 256
kernel_size: 256
stride: 64
log: True  # Use log spectra as input
If the log is taken in the mask network, then it belongs in a mask network config
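A sketch of the corresponding conf.yml layout, moving the flag out of the filterbank section (the masknet section name mirrors other asteroid recipes and is illustrative):

filterbank:
  n_filters: 256
  kernel_size: 256
  stride: 64
masknet:
  log: True  # take the log of the spectra inside the mask network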
tr_wav_len_list: exp/tr.wavid.samples
cv_wav_len_list: exp/cv.wavid.samples
tt_wav_len_list: exp/tt.wavid.samples
wav_base_path: /srv/storage/talc3@talc-data.nancy/multispeech/calcul/users/ssivasankaran/experiments/data/speech_separation/wsj0-mix/2speakers/wav8k/min/
The config file shouldn't need modification; absolute paths have to be in the run.sh.
The text files containing info about the dataset have to be under ./data/
  masker = ChimeraPP(int(enc.filterbank.n_feats_out/2), 2,
                     embedding_dim=20, n_layers=2, hidden_size=600, \
-                    dropout=0, bidirectional=True)
+                    dropout=0.5, bidirectional=True, \
+                    log=conf['filterbank']['log'])
These constants should go in conf.yml, and the ones that we'll experiment with should also be exposed in the run.sh.
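For instance, the ChimeraPP constants could be grouped in conf.yml like this (the key names simply mirror the keyword arguments; illustrative, not the merged config):

masknet:
  embedding_dim: 20
  n_layers: 2
  hidden_size: 600
  dropout: 0.5
  bidirectional: True

The model construction then reduces to ChimeraPP(int(enc.filterbank.n_feats_out/2), 2, **conf['masknet']), and run.sh only needs to expose the keys that will actually be swept.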
# Removing additional saved info
checkpoint['state_dict'].pop('enc.filterbank._filters')
Why was this necessary? Did it crash?
@@ -33,8 +35,31 @@ def make_model_and_optimizer(conf):
     enc = fb.Encoder(fb.STFTFB(**conf['filterbank']))
     masker = ChimeraPP(int(enc.filterbank.n_feats_out/2), 2,
                        embedding_dim=20, n_layers=2, hidden_size=600, \
-                       dropout=0, bidirectional=True)
+                       dropout=0.5, bidirectional=True, \
+                       log=conf['filterbank']['log'])
     model = Model(enc, masker)
Even if the STFT is not inverted at train time, the goal will eventually be to go back to the time domain. You might as well attach the iSTFT to the model; it will make the rest easier IMO.
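A sketch of what attaching the decoder could look like, using the usual asteroid filterbank API; ChimeraPP and Model come from this recipe's model.py, and passing a third module to Model assumes a signature change that is not in this diff:

from asteroid import filterbanks as fb

# filterbank settings as in conf.yml above (log handled separately, see earlier comment)
fb_conf = dict(n_filters=256, kernel_size=256, stride=64)
enc = fb.Encoder(fb.STFTFB(**fb_conf))
dec = fb.Decoder(fb.STFTFB(**fb_conf))  # iSTFT back to the time domain
masker = ChimeraPP(int(enc.filterbank.n_feats_out / 2), 2,
                   embedding_dim=20, n_layers=2, hidden_size=600,
                   dropout=0.5, bidirectional=True, log=True)
# hypothetical: Model would have to store dec and apply it after masking
model = Model(enc, masker, dec)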
source: github.com/funcwj/deep-clustering.git
'''
# to dB
spectra_db = 20 * torch.log10(spectra)
Stabilize the log here, otherwise silent bins give -inf.
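One common way to stabilize it is a small additive constant; the epsilon value below is a typical choice, not taken from this PR:

import torch

EPS = 1e-8  # keeps log10 finite on silent time-frequency bins
spectra = torch.rand(1, 129, 100)  # dummy magnitude spectrogram for illustration
spectra_db = 20 * torch.log10(spectra + EPS)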
max_magnitude_db = torch.max(spectra_db)
threshold = 10**((max_magnitude_db - threshold_db) / 20)
mask = spectra > threshold
return mask.double()
Is float not sufficient?
@@ -0,0 +1,31 @@
import os
Could this follow the same format as in the WHAM recipes? It would make it much easier to switch from one to the other (especially for wsj0-3mix, which is not covered in WHAM).
Deep clustering updates
=> Create collate functions to ensure the seq lengths of batch elements are the same (a sketch follows this list)
=> Bucketing sampler to ensure the elements of a batch have approximately the same seq length; this avoids chopping off large parts of the dataset
=> Fix the bug where different random samples were taken for the mixture and the sources
=> Python code to create wav-id sample-count files
=> Introduce a VAD mask for deep clustering
=> Normalize the deep clustering loss
=> Fix a bug in the chimera++ model where the projection view mixes up the seq and source indexes
=> Take the log of the spectra as input to train the model
=> Simple eval script, which needs to be enhanced
=> Training script without pytorch lightning (to be made compatible with the core asteroid)
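As referenced in the first item above, a minimal sketch of such a padding collate function; the function name and the zero-padding policy are assumptions, not the PR's exact code:

import torch
import torch.nn.functional as F

def pad_collate(batch):
    # batch: list of (mixture, sources) pairs, mixture (time,), sources (n_src, time)
    mixtures, sources = zip(*batch)
    max_len = max(m.shape[-1] for m in mixtures)
    # zero-pad every element on the last (time) axis up to the longest one
    mix_out = torch.stack([F.pad(m, (0, max_len - m.shape[-1])) for m in mixtures])
    src_out = torch.stack([F.pad(s, (0, max_len - s.shape[-1])) for s in sources])
    return mix_out, src_out

# usage: torch.utils.data.DataLoader(dataset, batch_size=8, collate_fn=pad_collate)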