Add streaming speech enhancement inference. #5049

Merged
sw005320 merged 44 commits into espnet:master from the streaming_enh branch on Apr 27, 2023

@LiChenda
Contributor

LiChenda commented Mar 22, 2023

Hi,
I added espnet2/bin/enh_inference_streaming.py to support streaming inference for some causal enhancement models.

Separators can now implement a forward_stream method (a new abstract method in Enh Separator), which is then called from espnet2/bin/enh_inference_streaming.py.
So far, I have implemented forward_stream for three separators:

  1. TCN (consistency is OK; performance still needs to be optimized)
  2. LSTM (consistency is OK)
  3. SkiM (consistency is OK; currently only the no-segment-padding mode is supported, and segment padding support will be added soon)
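
To illustrate what "consistency" means here, below is a minimal sketch, not the ESPnet implementation. It assumes a toy causal model (a one-pole smoothing recurrence) in place of a real separator; the class names, the `state` argument, and the helper `max_stream_mismatch` are all hypothetical. The point is only that feeding frames one at a time through a stateful `forward_stream` should reproduce the output of whole-sequence inference.

```python
from abc import ABC, abstractmethod


class ToySeparator(ABC):
    """Hypothetical stand-in for a separator base class (not the ESPnet API)."""

    @abstractmethod
    def forward(self, frames):
        """Process a whole sequence of frames at once."""

    @abstractmethod
    def forward_stream(self, frame, state=None):
        """Process one frame and return (output, new_state)."""


class SmoothingSeparator(ToySeparator):
    """Toy causal model: y[t] = a * y[t-1] + (1 - a) * x[t]."""

    def __init__(self, alpha=0.9):
        self.alpha = alpha

    def forward(self, frames):
        # Offline form: y[t] = (1 - a) * sum over k <= t of a**(t - k) * x[k].
        a = self.alpha
        return [
            (1.0 - a) * sum(a ** (t - k) * frames[k] for k in range(t + 1))
            for t in range(len(frames))
        ]

    def forward_stream(self, frame, state=None):
        # The state carries the previous output, so only past input is needed.
        prev = 0.0 if state is None else state
        out = self.alpha * prev + (1.0 - self.alpha) * frame
        return out, out


def max_stream_mismatch(separator, frames):
    """Largest absolute difference between offline and streaming outputs."""
    offline = separator.forward(frames)
    state, streamed = None, []
    for frame in frames:
        out, state = separator.forward_stream(frame, state)
        streamed.append(out)
    return max(abs(a - b) for a, b in zip(offline, streamed))


mismatch = max_stream_mismatch(SmoothingSeparator(0.9), [0.5, -1.0, 0.25, 2.0])
print(f"max mismatch: {mismatch:.2e}")
```

A check of this kind only works for causal models, since a streaming pass never sees future frames.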

Some separated waveforms (upper: the existing whole-sequence inference; lower: the streaming inference):

image

Future work:

  • Prepare a streaming demo as a Colab notebook
  • Optimize performance and support more separators

Updated:

I prepared a streaming demo for speech separation. The notebook demo requires the new features of this PR, so I'll update it to the demo repo when this PR gets merged.

Anyone with the following link can now comment on it. It would be great if you could leave comments and help improve it:
https://colab.research.google.com/drive/17vd1V78eJpp3PHBnbFE5aVY5uMxQFL6o?usp=sharing

@mergify mergify bot added the ESPnet2 label Mar 22, 2023
@sw005320 sw005320 added SE Speech enhancement Streaming labels Mar 22, 2023
@sw005320 sw005320 added this to the v.202303 milestone Mar 22, 2023
@LiChenda
Contributor Author

Hi, I also prepared a streaming demo for speech separation. The notebook demo requires the new features of this PR, so I'll update it to the demo repo when this PR gets merged.

Anyone with the following link can now comment on it. It would be great if you could leave comments and help improve it:
https://colab.research.google.com/drive/17vd1V78eJpp3PHBnbFE5aVY5uMxQFL6o?usp=sharing

@sw005320
Contributor

Cool!
Would it be better to include the installation part?

image

@LiChenda
Contributor Author

Yes, I'm adding cells to install it from my git repo.

@LiChenda
Contributor Author

LiChenda commented Mar 28, 2023

Done! @sw005320, you can now run it in the Colab notebook. However, the RTF (real-time factor) is about 2 on the Colab runtime, while on my local machine it is below 1 😂. It looks like we need more optimization.
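
For readers unfamiliar with the metric: the real-time factor is processing time divided by audio duration, so an RTF of 2 means one second of audio takes two seconds to process, and streaming in real time requires an RTF below 1. A minimal sketch of how it could be measured follows; the function names and the `process` callable are hypothetical and are not taken from the notebook.

```python
import time


def rtf_from_times(processing_seconds, audio_seconds):
    """Real-time factor: processing time divided by audio duration."""
    return processing_seconds / audio_seconds


def measure_rtf(process, samples, sample_rate):
    """Time one call of `process` on `samples` and return its RTF.

    `process` is any callable that takes the list of samples; it stands in
    for a model's inference function (hypothetical, for illustration).
    """
    audio_seconds = len(samples) / sample_rate
    start = time.perf_counter()
    process(samples)
    elapsed = time.perf_counter() - start
    return rtf_from_times(elapsed, audio_seconds)


# One second of silence at 16 kHz, "processed" by a trivial function.
rtf = measure_rtf(lambda x: [v * 0.5 for v in x], [0.0] * 16000, 16000)
print(f"RTF = {rtf:.4f}")
```

The measured value depends on the hardware, which is consistent with the difference between the Colab runtime and a local machine reported above.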

@sw005320 sw005320 requested a review from Emrys365 March 29, 2023 12:14
@sw005320
Contributor

There are some issues in the tests. Please check https://github.com/espnet/espnet/actions/runs/4553841977/jobs/8030914388#step:8:17

@codecov

codecov bot commented Mar 31, 2023

Codecov Report

Merging #5049 (b799aa1) into master (b8aa7e0) will increase coverage by 0.00%.
The diff coverage is 76.43%.

@@           Coverage Diff            @@
##           master    #5049    +/-   ##
========================================
  Coverage   74.98%   74.99%            
========================================
  Files         617      618     +1     
  Lines       55176    55544   +368     
========================================
+ Hits        41376    41654   +278     
- Misses      13800    13890    +90     
Flag                       Coverage Δ
test_integration_espnet1   66.28% <ø> (ø)
test_integration_espnet2   47.58% <20.09%> (-0.17%) ⬇️
test_python                65.44% <75.91%> (+0.06%) ⬆️
test_utils                 23.28% <ø> (ø)

Impacted Files                            Coverage Δ
espnet2/enh/layers/ifasnet.py             90.14% <ø> (ø)
espnet2/bin/enh_inference_streaming.py    55.95% <55.95%> (ø)
espnet2/enh/encoder/abs_encoder.py        87.50% <66.66%> (-12.50%) ⬇️
espnet2/enh/separator/rnn_separator.py    94.82% <83.33%> (-5.18%) ⬇️
espnet2/enh/layers/skim.py                94.04% <83.63%> (-5.12%) ⬇️
espnet2/enh/separator/skim_separator.py   90.56% <86.66%> (-1.54%) ⬇️
espnet2/enh/encoder/stft_encoder.py       98.30% <97.14%> (-1.70%) ⬇️
espnet2/bin/enh_inference.py              91.20% <100.00%> (+0.03%) ⬆️
espnet2/enh/decoder/abs_decoder.py        100.00% <100.00%> (ø)
espnet2/enh/decoder/conv_decoder.py       100.00% <100.00%> (ø)
... and 6 more
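
As a quick sanity check on the totals in the report, the coverage percentages can be recomputed from the Hits and Lines rows. The sketch below assumes the displayed value is truncated rather than rounded to two decimals; that is an inference from the numbers shown here, not a documented Codecov rule, and the function name is hypothetical.

```python
def coverage_percent(hits, lines):
    """Coverage in percent, truncated (not rounded) to two decimals."""
    return (hits * 10000 // lines) / 100


# Hits and Lines taken from the report above.
before = coverage_percent(41376, 55176)
after = coverage_percent(41654, 55544)
print(before, after)  # → 74.98 74.99

# Hits plus Misses should equal Lines on both sides of the diff.
assert 41376 + 13800 == 55176
assert 41654 + 13890 == 55544
```

The untruncated change is under 0.005 percentage points, which would explain why the summary line reports an increase of 0.00% even though the displayed totals differ by 0.01.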

@mergify mergify bot removed the conflicts label Apr 26, 2023
@LiChenda
Contributor Author

Thanks to @Emrys365 and @nateanl for the comments. I refactored the code slightly: the frame split and merge functions have been moved into AbsEncoder and AbsDecoder and are implemented by their subclasses, because the Conv- and STFT-based encoders/decoders behave differently. This should also make it easier to implement streaming functions for new encoders/decoders.

BTW, all the consistency checks pass now.
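
The frame split and merge steps mentioned above can be sketched roughly as follows. This is a simplified illustration using plain overlapping frames and overlap-add with averaging; the function names are hypothetical and it does not reproduce the AbsEncoder/AbsDecoder interfaces. An STFT-based pair would additionally apply analysis and synthesis windows, which is one reason the two encoder/decoder families need their own implementations.

```python
def split_frames(signal, frame_size, hop):
    """Cut a signal into overlapping frames, zero-padding the tail."""
    # Ceil division via negated floor division; at least one frame.
    n_frames = max(1, -(-(len(signal) - frame_size) // hop) + 1)
    padded_len = (n_frames - 1) * hop + frame_size
    padded = list(signal) + [0.0] * (padded_len - len(signal))
    return [padded[i * hop : i * hop + frame_size] for i in range(n_frames)]


def merge_frames(frames, hop, length):
    """Overlap-add frames and average overlapping samples.

    Assumes hop <= frame size, so every output sample is covered
    by at least one frame.
    """
    frame_size = len(frames[0])
    total = (len(frames) - 1) * hop + frame_size
    acc = [0.0] * total
    count = [0] * total
    for i, frame in enumerate(frames):
        for j, value in enumerate(frame):
            acc[i * hop + j] += value
            count[i * hop + j] += 1
    # Drop the zero padding added by split_frames.
    return [a / c for a, c in zip(acc, count)][:length]


signal = [float(i) for i in range(9)]
frames = split_frames(signal, frame_size=4, hop=2)
restored = merge_frames(frames, hop=2, length=len(signal))
print(len(frames), restored == signal)
```

In a streaming loop, each frame from the split step would be passed through the separator as it arrives, and the merge step would be applied incrementally to the processed frames.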

Collaborator

@Emrys365 Emrys365 left a comment

Thanks! I only have some minor comments.

Review comments (outdated, resolved):
espnet2/bin/enh_inference_streaming.py (2 comments)
espnet2/enh/decoder/stft_decoder.py (2 comments)
@LiChenda LiChenda changed the title from "Add streaming speech enhancement inference [WIP]." to "Add streaming speech enhancement inference." Apr 27, 2023
@sw005320
Contributor

Is it OK to merge it?

@Emrys365
Collaborator

Yes, I think it is in good shape now.

@sw005320 sw005320 merged commit ff45e65 into espnet:master Apr 27, 2023
25 checks passed
@sw005320
Contributor

OK, I merged!
This is super cool!
Thanks, @LiChenda, @Emrys365, @nateanl!

@LiChenda LiChenda deleted the streaming_enh branch May 19, 2023 06:23
4 participants