Add streaming speech enhancement inference. #5049

LiChenda · 2023-03-22T11:42:24Z

Hi,
I added espnet2/bin/enh_inference_streaming.py to support streaming inference for some causal enhancement models.

Separators can now implement a forward_stream method (a new abstract method in the Enh Separator), which is then called from espnet2/bin/enh_inference_streaming.py.
So far, I have implemented forward_stream for three separators:

TCN (consistency is OK; performance still needs to be optimized)
LSTM (consistency is OK)
SkiM (consistency is OK; currently only the no-segment-padding mode is supported, and segment padding support will be added soon)
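
The consistency mentioned above can be read as: frame-by-frame streaming output should match the output of ordinary full-sequence inference. Below is a minimal plain-Python sketch of such a check. The classes `StreamingSeparator` and `ToySeparator`, the smoothing recurrence, and the `forward_stream(frame, state)` signature are illustrative assumptions, not the actual ESPnet interface; only the method name `forward_stream` comes from this PR description.

```python
from abc import ABC, abstractmethod
from typing import List, Optional, Tuple


class StreamingSeparator(ABC):
    """Hypothetical base class; the real ESPnet separator interface may differ."""

    @abstractmethod
    def forward(self, frames: List[float]) -> List[float]:
        """Process a whole sequence at once (offline inference)."""

    @abstractmethod
    def forward_stream(
        self, frame: float, state: Optional[float] = None
    ) -> Tuple[float, float]:
        """Process one frame and return (output, updated state)."""


class ToySeparator(StreamingSeparator):
    """Causal toy model: an exponential moving average over the input frames."""

    def __init__(self, alpha: float = 0.9) -> None:
        self.alpha = alpha

    def forward(self, frames: List[float]) -> List[float]:
        # Offline path: evaluate the closed-form sum over all past frames.
        outputs = []
        for t in range(len(frames)):
            acc = 0.0
            for k in range(t + 1):
                acc += (1.0 - self.alpha) * self.alpha ** (t - k) * frames[k]
            outputs.append(acc)
        return outputs

    def forward_stream(
        self, frame: float, state: Optional[float] = None
    ) -> Tuple[float, float]:
        # Streaming path: one recurrence step, carrying the state between calls.
        prev = 0.0 if state is None else state
        new_state = self.alpha * prev + (1.0 - self.alpha) * frame
        return new_state, new_state


def stream_all(model: StreamingSeparator, frames: List[float]) -> List[float]:
    """Feed frames one by one, as a streaming inference script would."""
    outputs, state = [], None
    for frame in frames:
        out, state = model.forward_stream(frame, state)
        outputs.append(out)
    return outputs


frames = [0.5, -1.0, 0.25, 2.0, 0.0]
model = ToySeparator(alpha=0.9)
offline = model.forward(frames)
streamed = stream_all(model, frames)
# Consistency check: the two paths should agree up to floating-point error.
max_diff = max(abs(a - b) for a, b in zip(offline, streamed))
assert max_diff < 1e-9
```

The key design point in this sketch is that all history the model needs is carried in an explicit state object, so each call only sees the current frame.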

Some separated waveforms (upper: the existing full-sequence inference; lower: the streaming inference):

Future work:

Prepare a streaming demo in a Colab notebook
Optimize performance and support more separators.

Updated:

I prepared a streaming demo for speech separation. The notebook demo requires the new features of this PR, so I'll add it to the demo repo once this PR is merged.

Anyone with the following link can now comment on it. It would be great if you could leave comments and help improve it:
https://colab.research.google.com/drive/17vd1V78eJpp3PHBnbFE5aVY5uMxQFL6o?usp=sharing

for more information, see https://pre-commit.ci

LiChenda · 2023-03-28T15:00:45Z

Hi, I also prepared a streaming demo for speech separation. The notebook demo requires the new features of this PR, so I'll add it to the demo repo once this PR is merged.

Anyone with the following link can now comment on it. It would be great if you could leave comments and help improve it:
https://colab.research.google.com/drive/17vd1V78eJpp3PHBnbFE5aVY5uMxQFL6o?usp=sharing

sw005320 · 2023-03-28T15:09:47Z

Cool!
Would it be better to include the installation part?

LiChenda · 2023-03-28T15:11:09Z

> Cool! Would it be better to include the installation part?

Yes, I'm adding cells to install it from my git repo.

LiChenda · 2023-03-28T15:56:42Z

> Cool! Would it be better to include the installation part?

Done! @sw005320, you can now run it in the Colab notebook. However, the RTF is about 2 on the Colab runtime, whereas on my local machine it is below 1 😂. It looks like we need more optimization.
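
For reference, the real-time factor (RTF) is the processing time divided by the duration of the audio processed, so a value below 1 means the system keeps up with real time. Here is a minimal sketch of how it might be measured; `real_time_factor`, `measure_rtf`, and the `process_frame` callback are hypothetical names and not part of this PR.

```python
import time
from typing import Callable, Sequence


def real_time_factor(
    processing_seconds: float, num_samples: int, sample_rate: int
) -> float:
    """RTF = time spent processing / duration of the audio that was processed."""
    audio_seconds = num_samples / sample_rate
    return processing_seconds / audio_seconds


def measure_rtf(
    process_frame: Callable[[Sequence[float]], object],
    frames: Sequence[Sequence[float]],
    sample_rate: int,
) -> float:
    """Time a frame-by-frame loop and convert the elapsed time to an RTF."""
    start = time.perf_counter()
    num_samples = 0
    for frame in frames:
        process_frame(frame)
        num_samples += len(frame)
    elapsed = time.perf_counter() - start
    return real_time_factor(elapsed, num_samples, sample_rate)


# 2 s of compute for 1 s of 16 kHz audio gives an RTF of 2.0 (slower than real time).
example_rtf = real_time_factor(2.0, 16000, 16000)
```

Because RTF depends on the hardware, the same model can land above 1 on one machine and below 1 on another, as reported here.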

sw005320 · 2023-03-29T12:34:10Z

There are some issues in the tests. Please check https://github.com/espnet/espnet/actions/runs/4553841977/jobs/8030914388#step:8:17

codecov · 2023-03-31T06:21:56Z

Codecov Report

Merging #5049 (b799aa1) into master (b8aa7e0) will increase coverage by 0.00%.
The diff coverage is 76.43%.

@@           Coverage Diff            @@
##           master    #5049    +/-   ##
========================================
  Coverage   74.98%   74.99%            
========================================
  Files         617      618     +1     
  Lines       55176    55544   +368     
========================================
+ Hits        41376    41654   +278     
- Misses      13800    13890    +90

| Flag | Coverage Δ | |
|---|---|---|
| test_integration_espnet1 | `66.28% <ø> (ø)` | |
| test_integration_espnet2 | `47.58% <20.09%> (-0.17%)` | ⬇️ |
| test_python | `65.44% <75.91%> (+0.06%)` | ⬆️ |
| test_utils | `23.28% <ø> (ø)` | |

Flags with carried forward coverage won't be shown.

| Impacted Files | Coverage Δ | |
|---|---|---|
| espnet2/enh/layers/ifasnet.py | `90.14% <ø> (ø)` | |
| espnet2/bin/enh_inference_streaming.py | `55.95% <55.95%> (ø)` | |
| espnet2/enh/encoder/abs_encoder.py | `87.50% <66.66%> (-12.50%)` | ⬇️ |
| espnet2/enh/separator/rnn_separator.py | `94.82% <83.33%> (-5.18%)` | ⬇️ |
| espnet2/enh/layers/skim.py | `94.04% <83.63%> (-5.12%)` | ⬇️ |
| espnet2/enh/separator/skim_separator.py | `90.56% <86.66%> (-1.54%)` | ⬇️ |
| espnet2/enh/encoder/stft_encoder.py | `98.30% <97.14%> (-1.70%)` | ⬇️ |
| espnet2/bin/enh_inference.py | `91.20% <100.00%> (+0.03%)` | ⬆️ |
| espnet2/enh/decoder/abs_decoder.py | `100.00% <100.00%> (ø)` | |
| espnet2/enh/decoder/conv_decoder.py | `100.00% <100.00%> (ø)` | |

... and 6 more

espnet2/bin/enh_inference_streaming.py

Co-authored-by: Wangyou Zhang <C0me_On@163.com>

Co-authored-by: nateanl <nizhaoheng@gmail.com>

LiChenda · 2023-04-26T11:05:29Z

Thanks to @Emrys365 and @nateanl for the comments. I refactored the code slightly: the frame split and merge functions have been moved into AbsEncoder and AbsDecoder and are implemented by their subclasses, because the Conv- and STFT-based encoders/decoders behave differently. This should also make it easier to implement streaming functions for new encoders/decoders.
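
To illustrate what a frame split and merge pair can look like, here is a minimal plain-Python sketch of one possible scheme: overlapping frames on the encoder side, and overlap-add with count normalization on the decoder side. The function names `split_frames` and `merge_frames` and the rectangular-window assumption are illustrative only and are not the ESPnet implementation; an STFT-based pair would typically also apply analysis and synthesis windows, which is one reason the two kinds of encoder/decoder need different code.

```python
from typing import List


def split_frames(
    signal: List[float], frame_size: int, hop_size: int
) -> List[List[float]]:
    """Cut a signal into overlapping frames (a trailing partial frame is dropped)."""
    if not 0 < hop_size <= frame_size:
        raise ValueError("hop_size must be in the range 1..frame_size")
    if len(signal) < frame_size:
        return []
    num_frames = 1 + (len(signal) - frame_size) // hop_size
    return [
        signal[i * hop_size : i * hop_size + frame_size] for i in range(num_frames)
    ]


def merge_frames(
    frames: List[List[float]], frame_size: int, hop_size: int
) -> List[float]:
    """Overlap-add the frames, then divide by how many frames cover each sample."""
    if not 0 < hop_size <= frame_size:
        raise ValueError("hop_size must be in the range 1..frame_size")
    if not frames:
        return []
    length = (len(frames) - 1) * hop_size + frame_size
    total = [0.0] * length
    count = [0] * length
    for i, frame in enumerate(frames):
        start = i * hop_size
        for j, value in enumerate(frame):
            total[start + j] += value
            count[start + j] += 1
    return [t / c for t, c in zip(total, count)]


signal = [float(n) for n in range(10)]
chunks = split_frames(signal, frame_size=4, hop_size=2)
restored = merge_frames(chunks, frame_size=4, hop_size=2)
# With unmodified frames, split followed by merge reproduces the input.
assert restored == signal
```

In a streaming setting the merge step would run incrementally, emitting `hop_size` samples per incoming frame and keeping the overlapping tail as state.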

BTW, all the consistency checks pass now.

Emrys365

Thanks! I only have some minor comments.

espnet2/bin/enh_inference_streaming.py

espnet2/enh/decoder/stft_decoder.py

sw005320 · 2023-04-27T11:37:22Z

Is it OK to merge it?

Emrys365 · 2023-04-27T18:10:13Z

Yes, I think it is in good shape now.

sw005320 · 2023-04-27T18:13:19Z

OK, I merged!
This is super cool!
Thanks, @LiChenda, @Emrys365, @nateanl!

LiChenda added 2 commits March 15, 2023 15:51

skim streaming ready

8d393d7

update enh streaming inference

a0ff52b

mergify bot added the ESPnet2 label Mar 22, 2023

[pre-commit.ci] auto fixes from pre-commit.com hooks

6efe802

for more information, see https://pre-commit.ci

sw005320 added SE Speech enhancement Streaming labels Mar 22, 2023

sw005320 added this to the v.202303 milestone Mar 22, 2023

sw005320 added the New Features label Mar 22, 2023

LiChenda added 2 commits March 28, 2023 16:28

update test

6352027

Merge remote-tracking branch 'upstream/master' into streaming_enh

4ff376b

LiChenda added 3 commits March 28, 2023 23:03

add enh inference (forget to previously)

3722838

update

045a04c

Merge remote-tracking branch 'origin/streaming_enh' into streaming_enh

ebc30dc

Merge branch 'master' into streaming_enh

48437b1

sw005320 requested a review from Emrys365 March 29, 2023 12:14

LiChenda added 2 commits March 31, 2023 13:52

fix for pytest

c874902

Merge remote-tracking branch 'origin/streaming_enh' into streaming_enh

8e12752