Add streaming speech enhancement inference. #5049
for more information, see https://pre-commit.ci
Hi, I also prepared a streaming demo for speech separation. The notebook demo requires the new features of this PR, so I'll add it to the demo repo once this PR is merged. For now, anyone with the notebook link can post comments on it; it would be great if you could leave your comments and help improve it.
Done! Now you (@sw005320) can run it in the Colab notebook. However, the RTF is about 2 on the Colab runtime... On my local machine, it's smaller than 1 😂. It looks like we need more optimizations.
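For readers unfamiliar with the term, the real-time factor (RTF) is commonly taken as processing time divided by audio duration, so a value below 1 means the model keeps up with the incoming audio. The snippet below is a minimal sketch of how such a number could be measured; `process_chunk` is a hypothetical stand-in for one streaming model step, and none of these names come from the PR.

```python
import time


def rtf_from(elapsed_s, n_samples, sample_rate):
    """Real-time factor: processing time divided by audio duration."""
    if n_samples <= 0 or sample_rate <= 0:
        raise ValueError("need a positive number of samples and sample rate")
    return elapsed_s / (n_samples / sample_rate)


def measure_rtf(process_chunk, chunks, sample_rate):
    """Time a chunk-by-chunk run and return its RTF.

    `process_chunk` is a hypothetical callable (an assumption for this
    sketch), standing in for whatever the streaming model does per chunk.
    """
    n_samples = 0
    start = time.perf_counter()
    for chunk in chunks:
        process_chunk(chunk)
        n_samples += len(chunk)
    elapsed = time.perf_counter() - start
    return rtf_from(elapsed, n_samples, sample_rate)
```

With this definition, spending 2 seconds on 1 second of 16 kHz audio gives `rtf_from(2.0, 16000, 16000) == 2.0`, which matches the situation reported for the Colab runtime.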
There are some issues in the test. Check https://github.com/espnet/espnet/actions/runs/4553841977/jobs/8030914388#step:8:17
Codecov Report
@@           Coverage Diff            @@
##           master    #5049    +/-   ##
========================================
  Coverage   74.98%   74.99%
========================================
  Files         617      618     +1
  Lines       55176    55544   +368
========================================
+ Hits        41376    41654   +278
- Misses      13800    13890    +90
Co-authored-by: Wangyou Zhang <C0me_On@163.com>
Co-authored-by: nateanl <nizhaoheng@gmail.com>
Thanks to @Emrys365 and @nateanl for the comments. I refactored the code slightly; the frame split and merge functions are now moved into […]. BTW, all the consistency checks have passed now.
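The comment does not show the frame split and merge helpers themselves, so the following is only an illustrative sketch of the general idea, not the PR's implementation. It assumes overlapping frames with zero-padding at the tail and a merge step that averages overlapped samples; the names `split_frames` and `merge_frames` and both of those choices are assumptions made for this example.

```python
import math


def split_frames(signal, frame_size, hop_size):
    """Split a 1-D sequence into overlapping frames, zero-padding the tail."""
    if frame_size <= 0 or not 0 < hop_size <= frame_size:
        raise ValueError("require 0 < hop_size <= frame_size")
    n = len(signal)
    # One frame covers the first frame_size samples; each extra hop adds one.
    n_frames = 1 if n <= frame_size else 1 + math.ceil((n - frame_size) / hop_size)
    padded_len = (n_frames - 1) * hop_size + frame_size
    padded = list(signal) + [0.0] * (padded_len - n)
    return [padded[i * hop_size:i * hop_size + frame_size] for i in range(n_frames)]


def merge_frames(frames, hop_size, length):
    """Overlap-add the frames, average overlapped samples, trim to `length`."""
    frame_size = len(frames[0])
    total = (len(frames) - 1) * hop_size + frame_size
    acc = [0.0] * total
    count = [0] * total
    for i, frame in enumerate(frames):
        for j, value in enumerate(frame):
            acc[i * hop_size + j] += value
            count[i * hop_size + j] += 1
    # hop_size <= frame_size guarantees every position is covered at least once.
    return [a / c for a, c in zip(acc, count)][:length]
```

Because unmodified frames agree wherever they overlap, `merge_frames(split_frames(x, 4, 2), 2, len(x))` returns `x` unchanged; a real enhancement pipeline would process each frame between the two calls and might use a tapered window instead of plain averaging.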
Thanks! I only have some minor comments.
Is it OK to merge it?
Yes, I think it is in good shape now.
Hi,
I added `espnet2/bin/enh_inference_streaming.py` to support streaming inference for some causal enh models. Separators can now implement a `forward_stream` method (a new abstract method in the Enh Separator), which is then called in `espnet2/bin/enh_inference_streaming.py`.
Currently, I have implemented the `forward_stream` function for three separators:
(`no segment padding` mode; `segment padding` will be updated soon)
Some separated waves (upper is from the old sequence inference, lower is from the streaming inference):
Future works:
Updated:
I prepared a streaming demo for speech separation. The notebook demo requires the new features of this PR, so I'll update it to the demo repo when this PR gets merged.
Now, anyone with the following link can post comments on it. It would be great if you could leave your comments and help improve it:
https://colab.research.google.com/drive/17vd1V78eJpp3PHBnbFE5aVY5uMxQFL6o?usp=sharing
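To illustrate the `forward_stream` idea described in this PR, here is a minimal, framework-free sketch of a separator that exposes both a whole-sequence method and a frame-by-frame streaming method with carried state. The class names, the `(frame, state)` signature, and the toy smoothing model are all assumptions made for illustration; they are not the actual ESPnet interface.

```python
from abc import ABC, abstractmethod


class StreamingSeparator(ABC):
    """Hypothetical stand-in for a separator with a streaming entry point."""

    @abstractmethod
    def forward(self, signal):
        """Process a whole sequence at once (offline inference)."""

    @abstractmethod
    def forward_stream(self, frame, state=None):
        """Process one frame and return (output_frame, new_state)."""


class CausalSmoother(StreamingSeparator):
    """Toy causal model: y[t] = 0.5 * (x[t] + x[t-1]), with x[-1] = 0."""

    def forward(self, signal):
        out, _ = self.forward_stream(signal, None)
        return out

    def forward_stream(self, frame, state=None):
        # The state is the last input sample seen in the previous frame.
        prev = 0.0 if state is None else state
        out = []
        for x in frame:
            out.append(0.5 * (x + prev))
            prev = x
        return out, prev


def run_streaming(model, signal, frame_size):
    """Feed the signal frame by frame, carrying the state between calls."""
    state = None
    out = []
    for start in range(0, len(signal), frame_size):
        y, state = model.forward_stream(signal[start:start + frame_size], state)
        out.extend(y)
    return out
```

Because the toy model is causal and its state is passed from frame to frame, `run_streaming(model, x, frame_size)` gives the same result as `model.forward(x)` for any frame size, which is the kind of agreement the wave comparison above is meant to show.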