# Separation of speakers using Lab41's model

This notebook shows how to load a pretrained version of Lab41's source separation model and how to use the loaded model to separate individual speakers from an example waveform.

In [None]:
# Generic imports
import sys
import time

import numpy as np
import tensorflow as tf

# Plotting imports
import IPython
from IPython.display import Audio
from matplotlib import pyplot as plt
fig_size = [8, 4]  # Figure width and height, in inches
plt.rcParams["figure.figsize"] = fig_size

# Import Lab41's separation model
from magnolia.dnnseparate.L41model import L41Model

# Import utilities for using the model
from magnolia.utils.clustering_utils import clustering_separate, get_cluster_masks, process_signal
from magnolia.features.mixer import FeatureMixer
from magnolia.features.supervised_iterator import SupervisedIterator, SupervisedMixer
from magnolia.features.hdf5_iterator import SplitsIterator
from magnolia.features.spectral_features import istft
from magnolia.features.data_preprocessing import undo_preemphasis

### Hyperparameters

    fft_size    : Number of samples in the FFT window
    overlap     : Amount of overlap between FFT windows
    sample_rate : Number of samples per second in the input signals

In [None]:
fft_size = 512
overlap = 0.0256
sample_rate = 1e4
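
As a quick sanity check on these values, the sketch below relates the three hyperparameters to each other. It assumes that `overlap` is a duration in seconds; the notebook does not state the unit, so treat that as an assumption rather than a documented fact. The names `overlap_samples`, `overlap_fraction`, and `bin_width_hz` are introduced here for illustration only.

```python
# Values copied from the hyperparameter cell above
fft_size = 512
overlap = 0.0256
sample_rate = 1e4

# ASSUMPTION: `overlap` is a duration in seconds (not confirmed by the notebook).
# Convert that duration to a whole number of samples.
overlap_samples = int(round(overlap * sample_rate))

# Express the same quantity as a fraction of one FFT window
overlap_fraction = overlap_samples / fft_size

# Width of one FFT frequency bin, in Hz
bin_width_hz = sample_rate / fft_size

print(overlap_samples, overlap_fraction, bin_width_hz)
```

Under that assumption, the overlap corresponds to 256 samples, which is exactly half of the 512-sample window.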

### Create and load a pretrained instance of Lab41's model

Here an untrained model instance is created, and then the pretrained weights are loaded into it.

In [None]:
model = L41Model(nonlinearity='tanh', normalize=False)
model.load("Path to model file")

### Example separation process

Samples can be generated from the dev set to qualitatively evaluate the performance of the model and to test the separation process. For this example, a sample will be generated, converted to a raw waveform, and then separated into two sources.

In [None]:
# Create a mixer for recordings from the dev set
libridev = "Path to dev set"
long_mixer = FeatureMixer([libridev, libridev], shape=(200, None))

Get an example from the mixer, convert it back into a waveform with the istft function, and undo the preemphasis.

In [None]:
data = next(long_mixer)
spec = data[0]
signal = istft(spec, sample_rate, None, overlap, two_sided=False, fft_size=fft_size)
signal = undo_preemphasis(signal)

Audio(signal,rate=sample_rate)
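
To make the preemphasis step more concrete, here is a minimal sketch of a first-order preemphasis filter and its inverse. It is not magnolia's implementation: the helper names `preemphasis` and `undo_preemphasis_sketch` and the coefficient 0.95 are assumptions chosen for illustration, and the library's `undo_preemphasis` may use a different coefficient or method.

```python
import numpy as np

def preemphasis(x, coeff=0.95):
    """Apply y[n] = x[n] - coeff * x[n-1]; the first sample passes through."""
    x = np.asarray(x, dtype=float)
    y = x.copy()
    y[1:] -= coeff * x[:-1]
    return y

def undo_preemphasis_sketch(y, coeff=0.95):
    """Invert the filter above with the recursion x[n] = y[n] + coeff * x[n-1]."""
    y = np.asarray(y, dtype=float)
    x = np.empty_like(y)
    prev = 0.0  # No sample precedes the first one
    for n, sample in enumerate(y):
        prev = sample + coeff * prev
        x[n] = prev
    return x

# Round trip on a random signal: the inverse should recover the input
rng = np.random.default_rng(0)
original = rng.standard_normal(1000)
recovered = undo_preemphasis_sketch(preemphasis(original))
print(np.allclose(original, recovered))
```

Preemphasis boosts high frequencies before the spectrogram is computed, so the inverse filter is needed for the reconstructed audio to sound natural.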

Use the model and the clustering_separate function to separate the signal waveform into sources.

In [115]:
sources = clustering_separate(signal,sample_rate,model,2)
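
The notebook does not show what clustering_separate does internally. As a rough illustration of how embedding-based separation is commonly done, the sketch below clusters one embedding vector per time-frequency bin with k-means and turns the cluster labels into binary masks. Everything here (the helper names `kmeans` and `masks_from_embeddings`, the toy embeddings, and the choice of plain k-means) is an assumption for illustration and may differ from magnolia's actual code.

```python
import numpy as np

def kmeans(points, k, n_iter=20, seed=0):
    """Plain k-means on an (N, D) array; returns one integer label per point."""
    rng = np.random.default_rng(seed)
    # Start from k distinct points chosen at random (fancy indexing copies them)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Distance from every point to every center, shape (N, k)
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
    return labels

def masks_from_embeddings(embeddings, num_sources):
    """Cluster (T, F, D) embeddings into a (num_sources, T, F) stack of binary masks."""
    T, F, D = embeddings.shape
    labels = kmeans(embeddings.reshape(T * F, D), num_sources).reshape(T, F)
    return np.stack([labels == j for j in range(num_sources)])

# Toy data standing in for model output: two well-separated embedding clusters
rng = np.random.default_rng(0)
T, F, D = 10, 8, 4
true_labels = rng.integers(0, 2, size=(T, F))
means = np.array([[5.0] * D, [-5.0] * D])
embeddings = means[true_labels] + 0.1 * rng.standard_normal((T, F, D))

masks = masks_from_embeddings(embeddings, 2)

# Applying each mask to a (toy) mixture spectrogram yields one spectrogram per source
spectrogram = rng.standard_normal((T, F))
separated = masks * spectrogram
```

Because each time-frequency bin is assigned to exactly one cluster, the masked spectrograms sum back to the original mixture. In a full pipeline, each masked spectrogram would then be inverted to a waveform.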

Listen to each of the separated sources.

In [116]:
Audio(sources[0], rate=sample_rate)

In [117]:
Audio(sources[1], rate=sample_rate)