Optimal parameters to apply on small chunks of data for streaming application #89

Closed
H-G-11 opened this issue Aug 11, 2023 · 4 comments


@H-G-11

H-G-11 commented Aug 11, 2023

Hi,

I am building an application that streams data from an audio input.
I am applying the noisereduce algorithm (torch version) to every 500 ms chunk of audio data, but will probably go down to 20 ms at some point.

It seems to me that this algorithm works great on one long audio file, but when I apply it to many small chunks one after another, the joined output is of poor quality (the noise is not removed consistently from one chunk to the next).

I am sure I can improve things by tuning the hyper-parameters. Would someone be so kind as to tell me which ones should be optimized?
I am quite new to audio data, so I am not sure how I can tackle this issue.

Thank you so much for your answers and this great repository.

@timsainb
Owner

Hi Hugues,

I assume you are using the stationary version of the algorithm? The stationary version should behave the same on long and short clips as long as you provide the same noise clip as input. Non-stationary noise reduction doesn't make sense here: when the timescale is this short, you are basically providing stationary input.
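
As a minimal sketch of that setup (one fixed noise clip, stationary mode, the same call for every chunk): this assumes the chunks are NumPy arrays and uses the `y_noise` and `stationary` arguments of `reduce_noise` from the README. The helper names `split_into_chunks`, `denoise_stream`, and `make_stationary_denoiser` are hypothetical, and whether very short chunks (e.g. 20 ms) work with the default FFT size is not verified here.

```python
from functools import partial


def split_into_chunks(samples, chunk_len):
    """Split a sequence of samples into consecutive chunks of chunk_len samples.

    The last chunk may be shorter than chunk_len.
    """
    if chunk_len <= 0:
        raise ValueError("chunk_len must be positive")
    return [samples[i:i + chunk_len] for i in range(0, len(samples), chunk_len)]


def denoise_stream(chunks, denoise_chunk):
    """Lazily apply the same denoise_chunk callable to every incoming chunk."""
    for chunk in chunks:
        yield denoise_chunk(chunk)


def make_stationary_denoiser(noise_clip, sr):
    """Build a per-chunk denoiser that always uses the same noise clip.

    Assumption: noisereduce is installed and chunks are NumPy arrays.
    """
    import noisereduce as nr  # imported lazily so the helpers above work without it

    # Fixing y_noise and stationary=True means every chunk is denoised
    # against the same noise reference.
    return partial(nr.reduce_noise, sr=sr, y_noise=noise_clip, stationary=True)


# Hypothetical usage, with `mic_chunks` an iterable of NumPy arrays:
#   denoiser = make_stationary_denoiser(noise_clip, sr=16000)
#   for clean_chunk in denoise_stream(mic_chunks, denoiser):
#       ...
```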

If you have some metric of quality, it is possible to search the parameter space that way, e.g. by training a prediction model on the output and seeing which set of parameters performs best.

All the parameters are in the main readme. I would focus on prop_decrease, time_constant_s, freq_mask_smooth_hz, time_mask_smooth_ms, sigmoid_slope_nonstationary, and n_std_thresh_stationary.

These all relate to how the mask is built.
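
A parameter search of this kind could be sketched roughly as follows. This is an illustration under assumptions, not a tested recipe: `grid_search`, `denoise`, and `quality` are hypothetical names, the quality metric has to come from your own application, and the value ranges in `PARAM_GRID` are arbitrary examples. Only parameters named above are used; time_constant_s and sigmoid_slope_nonstationary are left out on the assumption that they only affect the non-stationary mode.

```python
from itertools import product


def grid_search(chunks, denoise, quality, param_grid):
    """Return (best_params, best_score) over every combination in param_grid.

    denoise(chunk, **params) -> denoised chunk
    quality(list_of_denoised_chunks) -> float, higher is better
    """
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        denoised = [denoise(chunk, **params) for chunk in chunks]
        score = quality(denoised)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Illustrative grid only: the value ranges are assumptions, not recommendations.
PARAM_GRID = {
    "prop_decrease": [0.5, 0.8, 1.0],
    "n_std_thresh_stationary": [1.0, 1.5, 2.0],
    "freq_mask_smooth_hz": [250, 500],
    "time_mask_smooth_ms": [25, 50],
}
```

Here `denoise` would wrap the actual noise reduction call with a fixed noise clip, and `quality` would score the joined output, for example with a downstream speech-to-text accuracy measure.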

Best,
Tim

@H-G-11
Author

H-G-11 commented Aug 23, 2023

Thank you very much for your answer Tim.

Yes, I am using the stationary version. I will try to optimize the parameters you indicated!

Best

H-G-11 closed this as completed Aug 23, 2023

@DamienDeepgram

I'm also interested in how to get this working for streaming audio. Did you ever get something working, @HuguesGallier?

@H-G-11
Author

H-G-11 commented Sep 9, 2023

Hello @DamienDeepgram,

I couldn't find satisfactory parameters for small chunks of data (200 ms). When I process each chunk separately and then join the processed chunks, the quality of the resulting audio file is not satisfactory.

So I will probably just remove the noise when I really need to (for instance, before speech to text).
Otherwise, there is this other library if you want to remove the noise directly at the microphone with a LADSPA plugin (if you are on Linux).
