Optimal parameters to apply on small chunks of data for streaming application #89

H-G-11 · 2023-08-11T14:00:33Z

Hi,

I am building an application that streams data from an audio input.
I am applying the noisereduce algorithm (torch version) to each 500 ms chunk of audio data, and will probably go down to 20 ms chunks at some point.

It seems to me that this algorithm works great on one big audio file, but applying it to many small chunks one after another gives a poor-quality result once the filtered chunks are joined (the noise is not removed in the same way everywhere).

I am sure I can improve things by tuning the hyperparameters. Would someone be so kind as to tell me which ones should be optimized?
I am quite new to audio data, so I am not sure how I can tackle this issue.

Thank you so much for your answers and this great repository.

timsainb · 2023-08-15T14:45:55Z

Hi Hugues,

I assume you are using the stationary version of the algorithm? The stationary version should behave the same for long and short clips, provided you pass the same noise clip as input. Non-stationary noise reduction doesn't make sense here: if the timescale is too short, you are basically providing stationary input.

If you have some metric of quality, it is possible to search the parameter space that way, e.g. by training a prediction model on the output and seeing which set of parameters performs best.

All the parameters are in the main readme. I would focus on prop_decrease, time_constant_s, freq_mask_smooth_hz, time_mask_smooth_ms, sigmoid_slope_nonstationary, and n_std_thresh_stationary.

These all relate to how the mask is built.

Best,
Tim
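The parameter search suggested above could be sketched as a plain grid search. This is only a minimal illustration under stated assumptions: `grid_search`, `PARAM_GRID`, and `toy_quality` are hypothetical names, the candidate values are arbitrary placeholders rather than recommendations, and the scoring function is a toy stand-in. In practice `evaluate` would run the denoiser on each chunk (for instance `noisereduce.reduce_noise` with `stationary=True` and a fixed `y_noise`), join the chunks, and score the result with whatever quality metric is available.

```python
import itertools
from typing import Any, Callable, Dict, Iterable, Mapping, Tuple


def grid_search(
    param_grid: Mapping[str, Iterable[Any]],
    evaluate: Callable[..., float],
) -> Tuple[Dict[str, Any], float]:
    """Try every combination in `param_grid` and return the best one.

    `evaluate(**params)` must return a score where higher is better.
    """
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = evaluate(**params)
        if score > best_score:
            best_params, best_score = params, score
    if best_params is None:
        raise ValueError("param_grid produced no usable combinations")
    return best_params, best_score


# Parameter names come from the list above; the values are placeholders.
PARAM_GRID = {
    "prop_decrease": [0.5, 0.75, 1.0],
    "n_std_thresh_stationary": [1.0, 1.5, 2.0],
    "freq_mask_smooth_hz": [250, 500],
    "time_mask_smooth_ms": [25, 50],
}


def toy_quality(prop_decrease, n_std_thresh_stationary,
                freq_mask_smooth_hz, time_mask_smooth_ms):
    """Hypothetical metric that only exists to make the sketch runnable.

    A real metric would denoise the chunks with these parameters and
    score the joined audio.
    """
    return (
        -abs(prop_decrease - 0.75)
        - abs(n_std_thresh_stationary - 1.5)
        - abs(freq_mask_smooth_hz - 500) / 1000
        - abs(time_mask_smooth_ms - 25) / 100
    )


best_params, best_score = grid_search(PARAM_GRID, toy_quality)
print(best_params, best_score)
```

An exhaustive grid grows quickly with the number of parameters, so for more than a handful of values per parameter a random or coarse-to-fine search may be more practical.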

H-G-11 · 2023-08-23T08:37:20Z

Thank you very much for your answer Tim.

Yes, I am using the stationary version. I will try to optimize the parameters you indicated!

Best

DamienDeepgram · 2023-08-31T23:24:28Z

I'm also interested in how to get this working for streaming audio. Did you ever get something working, @HuguesGallier?

H-G-11 · 2023-09-09T10:00:38Z

Hello @DamienDeepgram,

I couldn't find satisfactory parameters for small chunks of data (200 ms). When I process each chunk separately and then join the treated chunks, the quality of the resulting audio file is not satisfactory.

So I will probably just remove the noise when I really need to (for instance, before speech to text).
Otherwise, there is this other library if you want to remove the noise directly at the microphone with a LADSPA plugin (if you are on Linux).
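One approach that might reduce the seams between independently processed chunks is to denoise each chunk together with a few samples of preceding audio and keep only the part that belongs to the new chunk. The sketch below is not something the thread confirms to work with noisereduce; it only illustrates the idea. `stream_with_context` and `causal_moving_average` are hypothetical names, and the moving average is a toy stand-in for a real denoiser. With noisereduce, `denoise` could wrap `noisereduce.reduce_noise` with `stationary=True` and the same `y_noise` clip for every chunk, so that the noise statistics do not change from chunk to chunk.

```python
from collections import deque
from typing import Callable, Iterable, Iterator, List, Sequence

Samples = List[float]


def stream_with_context(
    chunks: Iterable[Sequence[float]],
    denoise: Callable[[Samples], Samples],
    context_len: int,
) -> Iterator[Samples]:
    """Denoise each chunk with up to `context_len` preceding samples.

    Only the samples belonging to the new chunk are yielded, so the
    output has the same total length as the input stream.
    """
    history: deque = deque(maxlen=context_len)  # most recent past samples
    for chunk in chunks:
        chunk = list(chunk)
        if not chunk:
            continue
        window = list(history) + chunk
        cleaned = denoise(window)
        if len(cleaned) != len(window):
            raise ValueError("denoise must preserve the input length")
        yield cleaned[-len(chunk):]
        history.extend(chunk)


def causal_moving_average(x: Samples, width: int = 4) -> Samples:
    """Toy stand-in for a denoiser: mean of the last `width` samples."""
    out = []
    for i in range(len(x)):
        lo = max(0, i - width + 1)
        out.append(sum(x[lo:i + 1]) / (i + 1 - lo))
    return out


signal = [float(i % 7) for i in range(40)]
chunks = [signal[i:i + 5] for i in range(0, len(signal), 5)]

whole = causal_moving_average(signal)
with_context = [
    s for c in stream_with_context(chunks, causal_moving_average, 3) for s in c
]
no_context = [
    s for c in stream_with_context(chunks, causal_moving_average, 0) for s in c
]
print(with_context == whole, no_context == whole)
```

For this toy filter, three samples of context are enough to make the chunked output match the whole-signal output, while processing chunks in isolation does not. A real spectral denoiser would need context on the order of its FFT window and smoothing lengths, and the match would likely be approximate rather than exact.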

H-G-11 closed this as completed Aug 23, 2023

H-G-11 mentioned this issue Sep 20, 2023

Add ONNX export for Pytorch Model #93

Open
