 <pre>
 ____   ____    _    ____  _____                          _      _     
|  _ \ / ___|  / \  / ___|| ____|     _ __ ___   ___   __| | ___| |___ 
| | | | |     / _ \ \___ \|  _| _____| '_ ` _ \ / _ \ / _` |/ _ \ / __|
| |_| | |___ / ___ \ ___) | |__|_____| | | | | | (_) | (_| |  __/ \__ \
|____/ \____/_/   \_\____/|_____|    |_| |_| |_|\___/ \__,_|\___|_|___/
                                                                        
</pre>

# DCASE-models Notebooks
Python Notebooks for [DCASE-models](https://github.com/pzinemanas/DCASE-models)

---

### About 
This Notebook reproduces the results for **Sound Event Detection (SED)** presented in:
<ul>
<li><a href="http://www.justinsalamon.com/uploads/4/3/9/4/4394963/salamon_scaper_waspaa_2017.pdf"><strong>
    Scaper: A Library for Soundscape Synthesis and Augmentation</strong></a>
    J. Salamon, D. MacConnell, M. Cartwright, P. Li, and J. P. Bello.
    In IEEE Workshop on Applications of Signal Processing to
    Audio and Acoustics (WASPAA), New Paltz, NY, USA, Oct. 2017.
    <br>
   <a type="button" class="btn btn-default btn-xs" target="_blank" href="http://www.justinsalamon.com/uploads/4/3/9/4/4394963/salamon_scaper_waspaa_2017.pdf"> PDF </a>
   <a type="button" class="btn btn-default btn-xs" target="_blank" href="https://ieeexplore.ieee.org/document/8170052"> IEEE</a>
    </li>   
</ul>

### Overview

The paper introduces [Scaper](https://github.com/justinsalamon/scaper), an open-source library for soundscape synthesis and augmentation. To illustrate the potential of the library, the authors generate a dataset of 10,000 soundscapes, namely [URBAN-SED](http://urbansed.weebly.com/), and use it to compare the performance of two state-of-the-art algorithms for sound event detection:
- the Convolutional Recurrent Neural Network (CRNN) proposed by Cakir et al. [[C-CRNN]](https://ieeexplore.ieee.org/document/7933050)
- an adaptation of the Convolutional Neural Network (CNN) proposed by Salamon and Bello [[SB-CNN]](http://ieeexplore.ieee.org/document/7829341/)

### Organization

The Notebook is organized into the following sections.
* [1. Load parameters](#LoadParameters)
* [2. Extract features](#ExtractFeatures)
* [3. Load data](#LoadData)
* [4. Initialize model](#InitModel)
* [5. Train model](#TrainModel)
* [6. Evaluate model](#EvaluateModel)

In [None]:
%load_ext autoreload
%autoreload 2
rootdir_path = '../../'
import sys
import os
import json
import warnings
import glob
import numpy as np
import argparse

sys.path.append(rootdir_path)
from dcase_models.utils.files import load_json, mkdir_if_not_exists
from dcase_models.data.data_generator import DataGenerator
from dcase_models.data.datasets import URBAN_SED
from dcase_models.data.features import MelSpectrogram
from dcase_models.model.models import SB_CNN_SED
from dcase_models.data.scaler import Scaler
from dcase_models.utils.data import evaluation_setup

# Select the GPU to use; adjust the index to match your machine
os.environ["CUDA_VISIBLE_DEVICES"] = "1"

<a id="LoadParameters"></a>
## 1. Load parameters

Dataset, feature extraction, and training parameters are stored in a JSON file in the root directory.

In [None]:
# load all parameters from json file
params = load_json(os.path.join(rootdir_path, 'parameters.json'))
# set the dataset we are going to use
dataset = 'URBAN_SED'

# get dataset parameters
params_dataset = params["datasets"][dataset]

# get feature extraction parameters
params_features = params["features"]

# get training parameters
params_train = params["train"]
# replace the default training parameters with the values used in the paper
params_train["epochs"] = 300
params_train["early_stopping"] = 100

# get model parameters
params_model = params["models"]["SB_CNN_SED"]
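
The cell above reads only a handful of keys from `parameters.json`. As a rough guide, the sketch below shows a layout that would satisfy the lookups made in this notebook. The top-level key names are the ones accessed here; every value (paths, sizes, rates, the `n_fft` and `mel_bands` entries, the evaluation mode and the normalizer name) is a placeholder assumption, not the content of the real file.

```python
import json

# Hypothetical stand-in for parameters.json: key names mirror the lookups
# made in this notebook, but all values are illustrative placeholders.
example_json = """
{
  "datasets": {
    "URBAN_SED": {"dataset_path": "datasets/URBAN-SED",
                  "evaluation_mode": "train-validate-test"}
  },
  "features": {
    "sequence_time": 1.0, "sequence_hop_time": 1.0,
    "audio_win": 1024, "audio_hop": 512, "sr": 22050,
    "MelSpectrogram": {"n_fft": 1024, "mel_bands": 64}
  },
  "train": {"epochs": 50, "early_stopping": 10, "batch_size": 32},
  "models": {
    "SB_CNN_SED": {"normalizer": "minmax", "model_arguments": {}}
  }
}
"""

example_params = json.loads(example_json)

# Same override pattern as above: copy the defaults, then apply paper values.
example_train = dict(example_params["train"])
example_train.update({"epochs": 300, "early_stopping": 100})

print(json.dumps(example_train, indent=4, sort_keys=True))
```

Copying the `train` dictionary before updating it keeps the loaded defaults intact, which can be convenient when comparing several configurations in one session.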

Check that the values of the parameters are correct.

In [None]:
# print the dataset parameters
print("Dataset Parameters:\n", json.dumps(params_dataset, indent=4, sort_keys=True))
# print feature extraction parameters
print("Features' Parameters:\n", json.dumps(params_features, indent=4, sort_keys=True))
# print training parameters
print("Training Parameters:\n", json.dumps(params_train, indent=4, sort_keys=True))

<a id="ExtractFeatures"></a>
## 2. Extract features

Initialize the feature extractor and the dataset.

In [None]:
# Initialize Feature Extractor
features = MelSpectrogram(sequence_time=params_features['sequence_time'],
                          sequence_hop_time=params_features['sequence_hop_time'],
                          audio_win=params_features['audio_win'],
                          audio_hop=params_features['audio_hop'],
                          sr=params_features['sr'],
                          **params_features['MelSpectrogram'])

print(features.get_shape())
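
To make the printed shape easier to interpret, here is a back-of-the-envelope estimate of how many spectrogram frames fit in one input sequence. It assumes the frame count is roughly `sequence_time * sr / audio_hop`; all numbers are illustrative, and the library's own rounding or padding may give a slightly different value than this sketch. The shape returned by `get_shape()` may also include a leading batch dimension.

```python
# Illustrative values only; the real ones come from params_features.
sr = 22050            # sampling rate in Hz (assumed)
audio_hop = 512       # hop between analysis frames, in samples (assumed)
sequence_time = 1.0   # length of one input sequence, in seconds (assumed)
mel_bands = 64        # number of mel bands (assumed)

# Approximate number of spectrogram frames in one sequence.
frames_per_sequence = int(sequence_time * sr / audio_hop)

# Approximate shape of one input example: (time frames, mel bands).
approx_shape = (frames_per_sequence, mel_bands)
print(approx_shape)
```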

In [None]:
# Initialize the dataset as an instance of URBAN_SED
kwargs = {'sequence_hop_time': params_features['sequence_hop_time']}
dataset = URBAN_SED(os.path.join(rootdir_path, params_dataset["dataset_path"]), **kwargs)

Check whether the dataset exists, and download it if it does not.

In [None]:
dataset.download()

Extract the features (if they were not extracted before).

In [None]:
if not features.check_if_extracted(dataset):
    features.extract(dataset)
print('Done!')

<a id="LoadData"></a>
## 3. Load data

In [None]:
# Get train/validation/test folds
folds_train, folds_val, folds_test = evaluation_setup(
    'fold1', dataset.fold_list,
    params_dataset['evaluation_mode'],
    use_validate_set=True)
# Initialize the training data generator (the scaler is attached later)
data_gen_train = DataGenerator(dataset, features, folds=folds_train,
                               batch_size=params_train['batch_size'],
                               shuffle=True, train=True, scaler=None)

Fit a scaler on the training data and attach it to the training data generator.

In [None]:
scaler = Scaler(normalizer=params_model['normalizer'])
print('Fitting features ...')
scaler.fit(data_gen_train)
print('Done!')

data_gen_train.set_scaler(scaler)
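
The point of fitting on the training generator only is that validation and test data are then transformed with statistics they did not contribute to. The sketch below illustrates that idea with a small min-max scaler written in NumPy. `MinMaxSketch` is a hypothetical helper, not the `Scaler` class used above, and min-max scaling to [-1, 1] is an assumption; the normalizer actually applied is whatever `params_model['normalizer']` specifies.

```python
import numpy as np


class MinMaxSketch:
    """Hypothetical min-max scaler; not the dcase_models Scaler class."""

    def fit(self, batches):
        # Track the global minimum and maximum over all training batches.
        self.min_ = min(float(np.min(b)) for b in batches)
        self.max_ = max(float(np.max(b)) for b in batches)
        return self

    def transform(self, x):
        # Map the training range [min_, max_] linearly onto [-1, 1].
        span = max(self.max_ - self.min_, 1e-12)  # guard against zero range
        return 2.0 * (x - self.min_) / span - 1.0


rng = np.random.default_rng(0)
# Random stand-ins for batches of log-mel features: (batch, frames, bands).
train_batches = [rng.uniform(-80.0, 0.0, size=(4, 43, 64)) for _ in range(3)]
val_batch = rng.uniform(-80.0, 0.0, size=(4, 43, 64))

# Fit on training data only, then reuse the same statistics everywhere.
sketch_scaler = MinMaxSketch().fit(train_batches)
scaled_train = sketch_scaler.transform(np.concatenate(train_batches))
scaled_val = sketch_scaler.transform(val_batch)

print(scaled_train.min(), scaled_train.max())
```

Because the statistics come from the training batches alone, scaled validation values can fall slightly outside [-1, 1]; that is expected and harmless.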


Initialize the validation data generator, reusing the scaler fitted on the training data.

In [None]:
data_gen_val = DataGenerator(dataset, features, folds=folds_val,
                             batch_size=params_train['batch_size'],
                             shuffle=False, train=False, scaler=scaler)

In [None]:
print(f"X: {data_gen_train.get_data_batch(0)[0][0].shape}")
print(f"Y: {data_gen_train.get_data_batch(0)[1][0].shape}")

<a id="InitModel"></a>
## 4. Initialize model

In [None]:
X, y = data_gen_train.get_data_batch(0)
 
n_frames_cnn = X.shape[1]
n_freq_cnn = X.shape[2]
n_classes = y.shape[1]

metrics = ['sed']

model_container = SB_CNN_SED(model=None, model_path=None, n_classes=n_classes, 
                             n_frames_cnn=n_frames_cnn, n_freq_cnn=n_freq_cnn,
                             metrics=metrics, **params_model['model_arguments'])

model_container.model.summary()

<a id="TrainModel"></a>
## 5. Train model

In [None]:
# TESTING ONLY: train for a single epoch to check that the pipeline runs.
# Remove this override to train with the paper settings (300 epochs).
params_train["epochs"] = 1
exp_folder = './'
kwargs = {'label_list': dataset.label_list}
model_container.train(data_gen_train, data_gen_val, weights_path=exp_folder, **params_train, **kwargs)
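
The training parameters include both `epochs` and `early_stopping`. The sketch below shows how such a pair typically interacts, assuming `early_stopping` acts as a patience measured in epochs and that the monitored validation score is better when higher. Both points are assumptions about the library, and `train_with_early_stopping` is a hypothetical helper that replays a list of scores rather than training anything.

```python
def train_with_early_stopping(val_scores, max_epochs, patience):
    """Return (best_epoch, best_score, epochs_run) for per-epoch scores."""
    best_score = float("-inf")
    best_epoch = -1
    epochs_run = 0
    for epoch in range(min(max_epochs, len(val_scores))):
        epochs_run = epoch + 1
        if val_scores[epoch] > best_score:
            # New best score: a real loop would save the model weights here.
            best_score, best_epoch = val_scores[epoch], epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs: stop training.
            break
    return best_epoch, best_score, epochs_run


# Made-up validation scores, one per epoch.
scores = [0.30, 0.42, 0.45, 0.44, 0.43, 0.44, 0.41]
print(train_with_early_stopping(scores, max_epochs=300, patience=3))
```

With a patience of 100 and a budget of 300 epochs, as set earlier in this notebook, training under this scheme would stop only after 100 consecutive epochs without improvement.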

<a id="EvaluateModel"></a>
## 6. Evaluate model

In [None]:
# Load the best weights saved during training
model_container.load_model_weights(exp_folder)
data_gen_test = DataGenerator(dataset, features, folds=folds_test,
                              batch_size=params_train['batch_size'],
                              shuffle=False, train=False, scaler=scaler)

kwargs = {'sequence_time_sec': params_features['sequence_hop_time'],
          'metric_resolution_sec': 1.0}
results = model_container.evaluate(data_gen_test, label_list=dataset.label_list, **kwargs)

print(results[metrics[0]])
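
For readers unfamiliar with segment-based SED metrics, the sketch below computes the standard segment-based F1 score and error rate on a toy example, with one row per 1 s segment to match `metric_resolution_sec`. It is a minimal NumPy illustration of the usual definitions and does not reproduce what `evaluate` does internally; `segment_based_scores` and the toy arrays are hypothetical.

```python
import numpy as np


def segment_based_scores(reference, estimated):
    """Segment-based F1 and error rate for binary (segments, classes) arrays."""
    ref = np.asarray(reference, dtype=bool)
    est = np.asarray(estimated, dtype=bool)

    tp = int(np.sum(ref & est))
    fp_per_seg = np.sum(~ref & est, axis=1)
    fn_per_seg = np.sum(ref & ~est, axis=1)

    # F1 from totals accumulated over all segments and classes.
    denom = 2 * tp + int(fp_per_seg.sum()) + int(fn_per_seg.sum())
    f1 = 2 * tp / denom if denom > 0 else 0.0

    # Error rate: substitutions, deletions and insertions counted per segment.
    subs = int(np.minimum(fn_per_seg, fp_per_seg).sum())
    dels = int(np.maximum(0, fn_per_seg - fp_per_seg).sum())
    ins = int(np.maximum(0, fp_per_seg - fn_per_seg).sum())
    n_ref = int(ref.sum())
    er = (subs + dels + ins) / n_ref if n_ref > 0 else 0.0
    return f1, er


# Three 1 s segments (rows) and three sound classes (columns).
reference = [[1, 0, 0],
             [1, 1, 0],
             [0, 0, 1]]
estimated = [[1, 0, 0],
             [0, 1, 1],
             [0, 0, 0]]

f1, er = segment_based_scores(reference, estimated)
print(f"F1: {f1:.3f}, ER: {er:.3f}")
```

In the second segment one missed class and one spurious class pair up as a single substitution, which is why the error rate counts fewer errors than the raw sum of false positives and false negatives.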