Add a new separator for speech enhancement/separation tasks #4062
Codecov Report
@@ Coverage Diff @@
## master #4062 +/- ##
==========================================
+ Coverage 80.94% 81.02% +0.07%
==========================================
Files 435 437 +2
Lines 37651 37822 +171
==========================================
+ Hits 30477 30645 +168
- Misses 7174 7177 +3
How about adding a result to wsj0_2mix/enh1/README.md with pre-trained models?
WSJ0_2mix is not a suitable example, but it is still useful.
espnet2/enh/layers/tcn.py
@@ -253,20 +258,37 @@ def forward(self, y):
Returns:
cLN_y: [M, N, K]
"""
dim = 3
if y.dim() == 4:
Can you add a test for this case?
This if-branch is never used in the codebase, so I removed this case.
LGTM.
Yes, I have some pre-trained models on WSJ0_2mix. However, I rewrote the code for this open-source PR to avoid potential IP issues, so the previously trained models can't be loaded directly because some parameters were renamed. Modifying the parameter keys in the old checkpoint files would be too hacky, so I am training a new model with this codebase. The progress looks good, and I'll upload a pre-trained model when training is done.
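For context, the key remapping dismissed above would look roughly like the following sketch. It is only an illustration of the idea, not code from this PR: the function name, the prefix names, and the dummy checkpoint are all hypothetical, and a plain dict stands in for a real checkpoint.

```python
def rename_state_dict_keys(state_dict, prefix_map):
    """Return a copy of state_dict with key prefixes renamed.

    prefix_map maps an old key prefix to its new prefix. All names used
    here are hypothetical; they are not the real parameter names.
    """
    renamed = {}
    for key, value in state_dict.items():
        new_key = key
        for old_prefix, new_prefix in prefix_map.items():
            if key.startswith(old_prefix):
                # Swap only the matching prefix and keep the rest of the key.
                new_key = new_prefix + key[len(old_prefix):]
                break
        if new_key in renamed:
            raise KeyError(f"renaming {key!r} collides with {new_key!r}")
        renamed[new_key] = value
    return renamed


# Dummy checkpoint with made-up keys (placeholders for tensors).
old_checkpoint = {
    "old_block.0.weight": [1.0, 2.0],
    "old_block.0.bias": [0.0],
    "encoder.weight": [3.0],
}
new_checkpoint = rename_state_dict_keys(old_checkpoint, {"old_block.": "new_block."})
print(sorted(new_checkpoint))
```

With PyTorch, the dict would come from `torch.load(path, map_location="cpu")` and be passed to `model.load_state_dict(...)`. The mapping has to be maintained by hand and silently breaks when the module layout changes, which is why retraining is the cleaner choice here.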
LGTM. I just added some minor comments.
After we fix @Emrys365's comments and upload a model, we can merge this PR.
The CI failure seems to be caused by a pytorch_complex-related issue. Maybe we can retry the CI test when the issue is solved.
This PR is an implementation of the Skipping Memory (SkiM) net.
It is a low-computational-cost model for long sequence modeling.
In the original paper, it is used for low-latency real-time Continuous Speech Separation (CSS).
Currently, the CSS task is not ready in the main ESPnet branch, so this PR mainly implements the SkiM model.
For testing the model itself, I also added two related configuration files (causal and non-causal) to the WSJ0-2mix recipe. (With a small encoder kernel size, even utterance-level input produces a long feature sequence.)
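
To make the point about kernel size concrete, here is a rough frame-count calculation for a 1-D convolutional encoder without padding. The sample rate, utterance length, kernel sizes, and strides below are illustrative assumptions, not values taken from the configuration files in this PR.

```python
def num_frames(num_samples: int, kernel_size: int, stride: int) -> int:
    """Number of output frames of an unpadded 1-D convolution."""
    if num_samples < kernel_size:
        return 0
    return (num_samples - kernel_size) // stride + 1


# Assumed example: a 4-second utterance sampled at 8 kHz.
samples = 4 * 8000

# A small kernel (assumed: kernel 2, stride 1) yields a very long sequence.
small_kernel_frames = num_frames(samples, kernel_size=2, stride=1)

# A larger kernel (assumed: kernel 16, stride 8) yields a much shorter one.
large_kernel_frames = num_frames(samples, kernel_size=16, stride=8)

print(small_kernel_frames, large_kernel_frames)  # 31999 3999
```

Under these assumptions a single utterance already gives tens of thousands of frames, which is the long-sequence regime the model is meant to handle.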