# Biodenoising - Animal vocalization denoising
This is a demo of animal vocalization denoising without access to clean data.
For more information, see the [associated page](https://mariusmiron.com/research/biodenoising/) and the code repository on [GitHub](https://github.com/earthspecies/biodenoising).

First, let's install the package from pip:

In [None]:
!pip install biodenoising

We import the libraries:

In [None]:
from IPython import display as disp
import os
import torch
import torchaudio
from biodenoising import pretrained
from biodenoising.denoiser.dsp import convert_audio

We download some noisy animal vocalizations from the biodenoising_validation dataset. These files, species, and noise conditions were not seen during training, so they test how well the model generalizes.

We set the device to the GPU if one is available, and to the CPU otherwise. A computing instance with a GPU gives faster processing.

In [None]:
if torch.cuda.is_available():
    device = torch.device('cuda')
else:
    device = torch.device('cpu')

Let's load the 16 kHz model. The first time you run this, the model is downloaded and stored locally.

In [None]:
model = pretrained.biodenoising16k_dns48().to(device)

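We use the model above to denoise the first demo sound. Recordings longer than 640,000 samples (40 s at 16 kHz) are processed in overlapping windows and recombined by overlap-add; shorter recordings are passed to the model in a single call.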

In [None]:
wav, sr = torchaudio.load('whale.wav')
wav = convert_audio(wav, sr, model.sample_rate, model.chin).to(device)

if wav.shape[-1] > 640000:
    # Import the submodule explicitly so the class is guaranteed to be available.
    from asteroid.dsp.overlap_add import LambdaOverlapAdd
    ola_model = LambdaOverlapAdd(
        nnet=model,  # function to apply to each segment
        n_src=1,  # number of sources in the output of nnet
        window_size=640000,  # size of the segmenting window, in samples
        hop_size=640000//4,  # segmentation hop size, in samples
        window="hann",  # type of window (see scipy.signal.get_window)
        reorder_chunks=False,  # whether to reorder each consecutive segment
        enable_grad=False,  # turn gradient calculation on or off (see torch.set_grad_enabled)
    )
    ola_model.window = ola_model.window.to(device)
    with torch.no_grad():
        denoised = ola_model(wav[None])[0]
else:
    with torch.no_grad():
        denoised = model(wav[None])[0]
disp.display(disp.Audio(wav.data.cpu().numpy(), rate=model.sample_rate))
disp.display(disp.Audio(denoised.data.cpu().numpy(), rate=model.sample_rate))
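
To make the windowed processing above more concrete, here is a minimal NumPy sketch of overlap-add with a Hann window. It is an illustration under stated assumptions, not asteroid's `LambdaOverlapAdd` implementation: the helper name `overlap_add`, the zero-padding at the edges, and the normalization by the summed window weights are choices made for this example. The arithmetic at the top assumes the 16 kHz sample rate of the model loaded earlier.

```python
import numpy as np

# Window and hop sizes used in the cell above, expressed in seconds.
SAMPLE_RATE = 16_000          # assumed: sample rate of the 16 kHz model
WINDOW_SIZE = 640_000         # samples per segment
HOP_SIZE = WINDOW_SIZE // 4   # samples between consecutive segment starts
window_seconds = WINDOW_SIZE / SAMPLE_RATE  # 40.0
hop_seconds = HOP_SIZE / SAMPLE_RATE        # 10.0


def overlap_add(signal, fn, window_size, hop_size):
    """Apply `fn` to overlapping Hann-weighted segments and recombine them.

    Hypothetical helper for illustration only. `fn` maps a 1-D array of
    length `window_size` to an array of the same length.
    """
    assert 0 < hop_size < window_size
    signal = np.asarray(signal, dtype=np.float64)
    n = signal.shape[0]

    # Zero-pad both ends so edge samples are covered by several windows.
    pad = window_size - hop_size
    n_frames = int(np.ceil((n + 2 * pad - window_size) / hop_size)) + 1
    total = (n_frames - 1) * hop_size + window_size
    padded = np.zeros(total)
    padded[pad:pad + n] = signal

    # Periodic Hann window: overlapping copies sum to a constant.
    window = np.hanning(window_size + 1)[:-1]

    out = np.zeros(total)
    weight = np.zeros(total)
    for i in range(n_frames):
        start = i * hop_size
        stop = start + window_size
        out[start:stop] += window * fn(padded[start:stop])
        weight[start:stop] += window

    # Remove the padding and normalize by the accumulated window weight.
    return out[pad:pad + n] / weight[pad:pad + n]


# With an identity function, overlap-add should return the input unchanged.
rng = np.random.default_rng(0)
x = rng.standard_normal(50)
y = overlap_add(x, lambda segment: segment, window_size=8, hop_size=2)
print(np.allclose(x, y))
```

In the notebook, the role of `fn` is played by the denoising model, and the 75% overlap (hop of a quarter window) means each sample is processed in several overlapping segments before the results are blended.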