Noise reduction / speech enhancement for python using spectral gating
Latest commit a2d94e5 Jun 11, 2019

README.md


Noise reduction in python using spectral gating

  • This algorithm is based on (but does not completely reproduce) the one outlined by Audacity for its noise reduction effect (Link to C++ code)
  • The algorithm requires two inputs:
    1. A noise audio clip containing prototypical noise of the audio clip
    2. A signal audio clip containing the signal and the noise intended to be removed

Steps of algorithm

  1. An FFT is calculated over the noise audio clip
  2. Statistics are calculated over the FFT of the noise (in frequency)
  3. A threshold is calculated based upon the statistics of the noise (and the desired sensitivity of the algorithm)
  4. An FFT is calculated over the signal
  5. A mask is determined by comparing the signal FFT to the threshold
  6. The mask is smoothed with a filter over frequency and time
  7. The mask is applied to the FFT of the signal, and the result is inverted back to the time domain
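
The steps above can be sketched in plain NumPy. This is a simplified illustration, not the package's implementation: the helper names (`stft`, `istft`, `to_db`, `spectral_gate`), the Hann window, the 512/128 frame and hop sizes, and the 3x3 box filter used for smoothing are all assumptions made for the example.

```python
import numpy as np


def stft(x, n_fft=512, hop=128):
    """Hann-windowed short-time FFT, shape (n_frames, n_fft // 2 + 1)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)


def istft(spec, n_fft=512, hop=128):
    """Inverse of `stft` by weighted overlap-add."""
    window = np.hanning(n_fft)
    frames = np.fft.irfft(spec, n=n_fft, axis=1) * window
    out = np.zeros(n_fft + hop * (len(frames) - 1))
    norm = np.zeros_like(out)
    for i, frame in enumerate(frames):
        out[i * hop:i * hop + n_fft] += frame
        norm[i * hop:i * hop + n_fft] += window ** 2
    return out / np.maximum(norm, 1e-8)


def to_db(spec):
    """Magnitude in decibels, with a small floor to avoid log(0)."""
    return 20 * np.log10(np.abs(spec) + 1e-10)


def spectral_gate(signal, noise, n_std_thresh=1.5, prop_decrease=1.0, n_fft=512, hop=128):
    # Steps 1-2: FFT of the noise clip and per-frequency statistics (in dB)
    noise_db = to_db(stft(noise, n_fft, hop))
    mean_db, std_db = noise_db.mean(axis=0), noise_db.std(axis=0)
    # Step 3: per-frequency threshold
    thresh_db = mean_db + n_std_thresh * std_db
    # Step 4: FFT of the signal (zero-padded so edge samples get full frames)
    sig_spec = stft(np.pad(signal, n_fft), n_fft, hop)
    # Step 5: mask is 1.0 where a bin falls below the threshold (treated as noise)
    mask = (to_db(sig_spec) < thresh_db).astype(float)
    # Step 6: smooth the mask over time and frequency (3x3 box filter, an assumption)
    padded = np.pad(mask, 1, mode="edge")
    rows, cols = mask.shape
    smooth = sum(padded[i:i + rows, j:j + cols] for i in range(3) for j in range(3)) / 9.0
    # Step 7: attenuate masked bins, then invert back to the time domain
    gated = sig_spec * (1.0 - prop_decrease * smooth)
    return istft(gated, n_fft, hop)[n_fft:n_fft + len(signal)]


# Synthetic demo: a 440 Hz tone in the second half of one second of white noise
rng = np.random.default_rng(0)
rate = 8000
t = np.arange(rate) / rate
tone = np.sin(2 * np.pi * 440 * t) * (t >= 0.5)
noisy = tone + 0.05 * rng.standard_normal(rate)
noise_clip = 0.05 * rng.standard_normal(4000)
cleaned = spectral_gate(noisy, noise_clip)
```

In this sketch the mask scales the complex spectrum directly, so `prop_decrease=0` returns the input essentially unchanged and `prop_decrease=1` removes fully masked bins.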

Installation

pip install noisereduce

noisereduce optionally uses TensorFlow as a backend to speed up the FFT and Gaussian convolution. It is not listed in requirements.txt because (1) it is optional and (2) both tensorflow-gpu and tensorflow (CPU) are compatible with this package. All TensorFlow operations require TensorFlow 2+.
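
Because TensorFlow is optional, a script may want to check for it before asking for the TensorFlow backend. The helper below is a hypothetical convenience, not part of noisereduce; it only assumes that both tensorflow and tensorflow-gpu install an importable `tensorflow` module.

```python
import importlib.util


def tensorflow2_available():
    """Return True if TensorFlow 2 or newer can be imported (hypothetical helper)."""
    if importlib.util.find_spec("tensorflow") is None:
        return False
    import tensorflow as tf  # imported lazily; only reached when the module exists
    return int(tf.__version__.split(".")[0]) >= 2


use_tf = tensorflow2_available()
```

The resulting flag could then be passed as the `use_tensorflow` argument described under the arguments list.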

Usage

See example notebook: Open In Colab

import noisereduce as nr
from scipy.io import wavfile
# load data
rate, data = wavfile.read("mywav.wav")
# select section of data that is noise
noisy_part = data[10000:15000]
# perform noise reduction
reduced_noise = nr.reduce_noise(audio_clip=data, noise_clip=noisy_part, verbose=True)
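
Two small preparation steps often come up around the example above. `wavfile.read` typically returns integer PCM samples, and the noise section is easier to pick in seconds than in sample indices. The helpers below are hypothetical (not part of noisereduce) and assume signed integer PCM; whether a float conversion is needed depends on the installed version.

```python
import numpy as np


def pcm_to_float(data):
    """Scale signed integer PCM samples to float32 in roughly [-1, 1].

    Floating-point input is passed through as float32. Unsigned 8-bit PCM
    is not handled here (assumption: signed samples such as int16).
    """
    data = np.asarray(data)
    if np.issubdtype(data.dtype, np.integer):
        return data.astype(np.float32) / np.iinfo(data.dtype).max
    return data.astype(np.float32)


def clip_by_time(data, rate, start_s, end_s):
    """Return the samples between start_s and end_s (in seconds)."""
    return data[int(start_s * rate):int(end_s * rate)]


# Example with synthetic data standing in for a loaded WAV file
rate = 44100
data = pcm_to_float(np.zeros(rate, dtype=np.int16))
noisy_part = clip_by_time(data, rate, 0.5, 1.0)
```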

Arguments to reduce_noise

n_grad_freq (int): how many frequency channels to smooth over with the mask.
n_grad_time (int): how many time channels to smooth over with the mask.
n_fft (int): length of each windowed frame after zero-padding, i.e. the FFT size.
win_length (int): Each frame of audio is windowed by `window()`. The window will be of length `win_length` and then padded with zeros to match `n_fft`.
hop_length (int): number of audio samples between STFT columns.
n_std_thresh (int): how many standard deviations louder than the mean dB of the noise (at each frequency level) a value must be to be considered signal.
prop_decrease (float): to what extent the noise should be decreased (1 = all, 0 = none).
pad_clipping (bool): pad the signals with zeros to ensure that the reconstructed data is the same length as the input data.
use_tensorflow (bool): use TensorFlow as a backend for convolution and FFT to speed up computation.
verbose (bool): Whether to plot the steps of the algorithm
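
To make the relationship between `n_fft`, `win_length` and `hop_length` concrete, the sketch below computes the STFT shape for a clip and builds a zero-padded window. It assumes librosa-style framing, a Hann window, and illustrative values of 2048 and 512; the function names are hypothetical and none of these choices are confirmed by this README.

```python
import numpy as np


def stft_shape(n_samples, n_fft=2048, hop_length=512, center=True):
    """(frequency bins, time frames) of an STFT under librosa-style framing (assumed)."""
    n_bins = 1 + n_fft // 2
    if center:
        # signal is padded so that frame k is centred on sample k * hop_length
        n_frames = 1 + n_samples // hop_length
    else:
        n_frames = 1 + (n_samples - n_fft) // hop_length
    return n_bins, n_frames


def padded_window(win_length, n_fft):
    """Hann window of win_length samples, zero-padded symmetrically to n_fft."""
    window = np.hanning(win_length)
    left = (n_fft - win_length) // 2
    return np.pad(window, (left, n_fft - win_length - left))


bins, frames = stft_shape(22050)          # one second of audio at 22050 Hz
window = padded_window(1024, 2048)        # win_length shorter than n_fft
```

A smaller `hop_length` gives more time frames (finer time resolution at higher cost), while a larger `n_fft` gives more frequency bins.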


Project based on the cookiecutter data science project template. #cookiecutterdatascience
