
Add a new separator for speech enhancement/separation tasks #4062

Merged
merged 11 commits from LiChenda:skim into espnet:master on Feb 28, 2022

Conversation

LiChenda
Contributor

This PR is an implementation of the Skipping Memory (SkiM) network.
It is a low-computational-cost model for long-sequence modeling.
In the original paper, it is used for low-latency, real-time Continuous Speech Separation (CSS).
Currently, the CSS task is not ready in the main ESPnet branch, so this PR mainly implements the SkiM model itself.
To test the model, I also add two related configuration files (causal and non-causal) to the WSJ0-2mix recipe. (With a small encoder kernel size, even utterance-level input generates a long feature sequence.)
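
For intuition on why utterance-level input becomes a long-sequence problem with a small encoder kernel, here is a minimal back-of-the-envelope sketch in Python; the 8 kHz sample rate, 4-second utterance, 2-sample kernel, and 50% stride are illustrative assumptions rather than the exact recipe settings:

# Rough frame-count arithmetic for a TasNet-style learned encoder.
# All numbers below are illustrative assumptions, not the recipe's values.
sample_rate = 8000          # WSJ0-2mix is commonly used at 8 kHz
utterance_seconds = 4.0
kernel_size = 2             # small encoder kernel, as described above
stride = kernel_size // 2   # typical 50% overlap

num_samples = int(sample_rate * utterance_seconds)
num_frames = (num_samples - kernel_size) // stride + 1
print(num_frames)           # 31999 frames -> a very long sequence for the separator

Sequences of tens of thousands of frames per utterance are exactly the long-sequence regime that SkiM is designed to handle with low computational cost.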

@mergify mergify bot added the ESPnet2 label Feb 11, 2022
@codecov

codecov bot commented Feb 11, 2022

Codecov Report

Merging #4062 (ce2dee8) into master (637d8c3) will increase coverage by 0.07%.
The diff coverage is 98.35%.


@@            Coverage Diff             @@
##           master    #4062      +/-   ##
==========================================
+ Coverage   80.94%   81.02%   +0.07%     
==========================================
  Files         435      437       +2     
  Lines       37651    37822     +171     
==========================================
+ Hits        30477    30645     +168     
- Misses       7174     7177       +3     
Flag                       Coverage Δ
test_integration_espnet1   67.13% <ø> (ø)
test_integration_espnet2   51.81% <19.23%> (-0.27%) ⬇️
test_python                66.83% <98.35%> (+0.14%) ⬆️
test_utils                 24.45% <ø> (ø)

Flags with carried forward coverage won't be shown.

Impacted Files                            Coverage Δ
espnet2/enh/layers/dprnn.py               100.00% <ø> (ø)
espnet2/enh/layers/tcn.py                 95.90% <96.15%> (-0.37%) ⬇️
espnet2/enh/separator/skim_separator.py   97.05% <97.05%> (ø)
espnet2/enh/layers/skim.py                99.17% <99.17%> (ø)
espnet2/tasks/enh.py                      99.06% <100.00%> (+<0.01%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 637d8c3...ce2dee8.

@Emrys365 Emrys365 added the SE Speech enhancement label Feb 11, 2022
@sw005320 sw005320 added this to the v.0.10.7 milestone Feb 11, 2022
Contributor

@sw005320 sw005320 left a comment


How about adding a result to wsj0_2mix/enh1/README.md with pre-trained models?
WSJ0_2mix is not a suitable example, but it is still useful.

espnet2/enh/layers/skim.py (resolved review thread)
@@ -253,20 +258,37 @@ def forward(self, y):
        Returns:
            cLN_y: [M, N, K]
        """
        dim = 3
        if y.dim() == 4:
Contributor

Can you add a test for this case?

Contributor Author

This if-branch is never used in the codebase, so I removed this case.
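
For context on the snippet under discussion: this forward appears to be the cumulative (causal) layer normalization in espnet2/enh/layers/tcn.py, which normalizes each frame with statistics accumulated over all frames seen so far. Below is a minimal sketch of the remaining 3-D case, not the exact ESPnet code; the gain/bias shapes and the epsilon value are assumptions:

import torch

def cumulative_layer_norm(y, gain, bias, eps=1e-8):
    # y: [M, N, K] (batch, channels, time); gain, bias: [1, N, 1] trainable parameters
    M, N, K = y.shape
    step_sum = y.sum(dim=1)                          # per-frame sum over channels, [M, K]
    step_pow_sum = y.pow(2).sum(dim=1)               # [M, K]
    cum_sum = torch.cumsum(step_sum, dim=1)          # running sums up to each frame
    cum_pow_sum = torch.cumsum(step_pow_sum, dim=1)
    # number of entries accumulated at each frame: N, 2N, ..., K*N
    entry_cnt = torch.arange(N, N * (K + 1), N, dtype=y.dtype, device=y.device)
    cum_mean = cum_sum / entry_cnt
    cum_var = cum_pow_sum / entry_cnt - cum_mean.pow(2)
    cum_std = (cum_var + eps).sqrt()
    cLN_y = (y - cum_mean.unsqueeze(1)) / cum_std.unsqueeze(1) * gain + bias
    return cLN_y                                     # [M, N, K]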

@sw005320
Contributor

LGTM.
@Emrys365, do you have any comments?
I think this PR is already in good shape.

@LiChenda
Contributor Author

LiChenda commented Feb 15, 2022

How about adding a result to wsj0_2mix/enh1/README.md with pre-trained models? WSJ0_2mix is not a suitable example, but it is still useful.

Yes, I have some pre-trained models on WSJ0_2mix. However, I rewrote the code for this open-source PR to avoid some potential IP issues, so the previously trained models can't be loaded directly because of the renaming. Modifying the parameter keys in the previous checkpoint file would be too hacky, so I am training a new model with this codebase. The progress looks good, and I'll upload a pre-trained model when the training is done.
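
For reference, the "too hacky" alternative mentioned above would amount to remapping the parameter keys of the old checkpoint to the new module names and saving it back. A rough sketch, assuming the checkpoint is a plain state_dict and using placeholder prefixes rather than the real module names:

import torch

# Hypothetical key remapping; "old_prefix." and "separator.skim." are placeholders,
# not the actual module names involved in this PR's renaming.
old_state = torch.load("old_skim_model.pth", map_location="cpu")
new_state = {
    key.replace("old_prefix.", "separator.skim."): value
    for key, value in old_state.items()
}
torch.save(new_state, "renamed_skim_model.pth")

Retraining with the new codebase, as done here, avoids maintaining such a mapping.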

Collaborator

@Emrys365 Emrys365 left a comment


LGTM. I just added some minor comments.

Outdated review threads (all resolved):
espnet2/enh/separator/skim_separator.py
espnet2/enh/layers/tcn.py (two threads)
egs2/wsj0_2mix/enh1/conf/tuning/train_enh_skim_tasnet.yaml (two threads)
espnet2/enh/layers/skim.py (two threads)
@sw005320
Contributor

Once @Emrys365's comments are addressed and a model is uploaded, we can merge this PR.

@mergify mergify bot added the README label Feb 23, 2022
@LiChenda
Contributor Author

The CI failure seems to be caused by a pytorch_complex-related issue. We can re-run the CI once that issue is resolved.

@Emrys365 Emrys365 merged commit 5edb478 into espnet:master Feb 28, 2022
@LiChenda LiChenda deleted the skim branch April 1, 2022 16:31