<pre>
 ____   ____    _    ____  _____                          _      _     
|  _ \ / ___|  / \  / ___|| ____|     _ __ ___   ___   __| | ___| |___ 
| | | | |     / _ \ \___ \|  _| _____| '_ ` _ \ / _ \ / _` |/ _ \ / __|
| |_| | |___ / ___ \ ___) | |__|_____| | | | | | (_) | (_| |  __/ \__ \
|____/ \____/_/   \_\____/|_____|    |_| |_| |_|\___/ \__,_|\___|_|___/
                                                                        
</pre>

# DCASE-models Notebooks
Python Notebooks for [DCASE-models](https://github.com/pzinemanas/DCASE-models)

---
### About

This notebook shows how to train a model using [DCASE-models](https://github.com/pzinemanas/DCASE-models).

### Overview

This notebook trains and evaluates an adaptation of the Convolutional Neural Network (CNN) proposed by Salamon and Bello [[SB-CNN]](http://ieeexplore.ieee.org/document/7829341/) on the [Urban SED](http://urbansed.weebly.com/) dataset.

For details on how to download a dataset, extract features or perform data augmentation refer to the respective notebooks.


### Organization

The Notebook is organized into the following sections.
* [1. Prepare data](#Data)
* [2. Define model](#Model)
* [3. Train](#Train)
* [4. Evaluate model](#Evaluate)

In [None]:
%load_ext autoreload
%autoreload 2
rootdir_path = '../../'
import sys
import os
import json
import warnings
sys.path.append(rootdir_path)

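from dcase_models.data.datasets import get_available_datasets
from dcase_models.data.features import get_available_features
from dcase_models.model.models import get_available_models
# Classes used directly in the cells below
from dcase_models.data.datasets import URBAN_SED
from dcase_models.data.features import MelSpectrogram
from dcase_models.model.models import SB_CNN_SED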
from dcase_models.data.data_generator import DataGenerator
from dcase_models.data.data_augmentation import AugmentedDataset
from dcase_models.data.scaler import Scaler
from dcase_models.util.files import load_json
from dcase_models.util.files import mkdir_if_not_exists, save_pickle
from dcase_models.util.data import evaluation_setup

<a id="Data"></a>
## 1. Prepare data

Many of the following steps can be omitted if *feature_extraction* and *download_and_prepare_datasets* have already been run. They are included here to make the notebook self-contained.

In [None]:
# Load parameters from the JSON file
parameters_file = os.path.join(rootdir_path, 'parameters.json')
params = load_json(parameters_file)
params_dataset = params['datasets']['URBAN_SED']
params_features = params['features']
# Print dataset parameters
print(json.dumps(params_dataset, indent=4, sort_keys=True))
# Print feature extraction parameters
print(json.dumps(params_features, indent=4, sort_keys=True))

# Build the feature extractor from the parameters file
features = MelSpectrogram(sequence_time=params_features['sequence_time'],
                          sequence_hop_time=params_features['sequence_hop_time'],
                          audio_win=params_features['audio_win'],
                          audio_hop=params_features['audio_hop'],
                          sr=params_features['sr'],
                          **params_features['MelSpectrogram'])
# Point the dataset object at its folder
dataset = URBAN_SED(os.path.join(rootdir_path, params_dataset["dataset_path"]))

if dataset.check_if_downloaded():
    print('URBAN_SED dataset was already downloaded. ')
else:
    print('downloading URBAN_SED dataset')
    dataset.download()
# Extract features
if features.check_if_extracted(dataset):
    print('Features were already extracted for URBAN_SED dataset. ')
else:
    print('Extracting features ...')
    features.extract(dataset)
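
The `sequence_time` and `sequence_hop_time` parameters control how each spectrogram is cut into fixed-length, overlapping sequences. The sketch below illustrates that idea with plain NumPy. It is not the DCASE-models implementation: the helper `split_into_sequences` and all numeric values are hypothetical, chosen only for illustration.

```python
import numpy as np


def split_into_sequences(spectrogram, sequence_frames, hop_frames):
    """Split a (n_frames, n_bands) array into overlapping sequences."""
    n_frames = spectrogram.shape[0]
    starts = range(0, n_frames - sequence_frames + 1, hop_frames)
    return np.stack([spectrogram[s:s + sequence_frames] for s in starts])


# Hypothetical values; the real ones come from parameters.json.
sr = 22050               # sampling rate in Hz
audio_hop = 512          # hop between analysis frames, in samples
sequence_time = 1.0      # sequence length in seconds
sequence_hop_time = 0.5  # hop between sequences in seconds

# Convert durations in seconds to a number of spectrogram frames.
sequence_frames = int(round(sequence_time * sr / audio_hop))
hop_frames = int(round(sequence_hop_time * sr / audio_hop))

# Stand-in for a mel spectrogram with 430 frames and 64 bands.
spectrogram = np.random.rand(430, 64)
sequences = split_into_sequences(spectrogram, sequence_frames, hop_frames)
print(sequences.shape)  # (n_sequences, sequence_frames, n_bands)
```

With a hop shorter than the sequence length, consecutive sequences share frames, which is why a smaller `sequence_hop_time` yields more training examples per file.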


Initialise data generators and fit scaler.

In [None]:
# Training and model parameters used in this cell
params_train = params['train']
params_model = params['models']['SB_CNN_SED']

# Get train/validation/test folds
folds_train, folds_val, folds_test = evaluation_setup(
    'test', dataset.fold_list, params_dataset['evaluation_mode'],
    use_validate_set=True)

# Initialise the training data generator (no scaler yet)
data_gen_train = DataGenerator(dataset, features, folds=folds_train,
                               batch_size=params_train['batch_size'],
                               shuffle=True, train=True, scaler=None)

# Fit the scaler on the training data only
scaler = Scaler(normalizer=params_model['normalizer'])
print('Fitting scaler ...')
scaler.fit(data_gen_train)
print('Done!')

data_gen_train.set_scaler(scaler)

# Initialise the validation data generator with the fitted scaler
data_gen_val = DataGenerator(dataset, features, folds=folds_val,
                             batch_size=params_train['batch_size'],
                             shuffle=False, train=False, scaler=scaler)
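
The cell above fits the scaler on the training folds and then reuses it for validation, so no statistics leak from held-out data. The sketch below illustrates that pattern with a toy min-max scaler. `MinMaxSketch` is a hypothetical class, not the `Scaler` from DCASE-models, and the target range of [-1, 1] is an assumption.

```python
import numpy as np


class MinMaxSketch:
    """Illustrative min-max scaler; not the dcase_models Scaler."""

    def fit(self, X):
        # Statistics are computed once, from the training data only.
        self.min_ = X.min()
        self.max_ = X.max()
        return self

    def transform(self, X):
        # Map [min_, max_] to [-1, 1] (assumed target range).
        return 2 * (X - self.min_) / (self.max_ - self.min_) - 1


rng = np.random.default_rng(0)
# Stand-ins for log-mel features shaped (n_sequences, n_frames, n_bands).
X_train = rng.uniform(-80.0, 0.0, size=(100, 43, 64))
X_val = rng.uniform(-80.0, 0.0, size=(20, 43, 64))

scaler_sketch = MinMaxSketch().fit(X_train)
X_train_scaled = scaler_sketch.transform(X_train)
X_val_scaled = scaler_sketch.transform(X_val)  # reuses training statistics

print(X_train_scaled.min(), X_train_scaled.max())
print(X_val_scaled.min(), X_val_scaled.max())
```

Validation values can fall slightly outside the training range, since the minimum and maximum are never re-estimated on held-out data.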

<a id="Model"></a>
## 2. Define Model

In [None]:
params_model = params['models']['SB_CNN_SED']
metrics = ['sed']

# Infer the input and output dimensions from one batch
X, y = data_gen_train.get_data_batch(0)
n_frames_cnn = X.shape[1]
n_freq_cnn = X.shape[2]
n_classes = y.shape[1]

model_container = SB_CNN_SED(model=None, model_path=None, n_classes=n_classes,
                             n_frames_cnn=n_frames_cnn, n_freq_cnn=n_freq_cnn,
                             metrics=metrics, **params_model['model_arguments'])

model_container.model.summary()

<a id="Train"></a>
## 3. Train

In [None]:
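# Folder for the trained weights (assumed location; adjust as needed)
exp_folder = os.path.join(rootdir_path, 'experiments', 'SB_CNN_SED_URBAN_SED')
os.makedirs(exp_folder, exist_ok=True)

model_container.train(data_gen_train, data_gen_val,
                      label_list=dataset.label_list,
                      weights_path=exp_folder, **params['train'],
                      sequence_time_sec=params_features['sequence_hop_time'])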

<a id="Evaluate"></a>
## 4. Evaluate

In [None]:
# Load the best weights saved during training
model_container.load_model_weights(exp_folder)

# Initialise the test data generator with the scaler fitted on the training set
data_gen_test = DataGenerator(dataset, features, folds=folds_test,
                              batch_size=params['train']['batch_size'],
                              shuffle=False, train=False, scaler=scaler)

# Evaluate with sound event detection metrics at 1-second resolution
kwargs = {'sequence_time_sec': params_features['sequence_hop_time'],
          'metric_resolution_sec': 1.0}
results = model_container.evaluate(data_gen_test, label_list=dataset.label_list, **kwargs)

print(results[metrics[0]])
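
The `metric_resolution_sec` argument sets the length of the segments on which detections are scored. The sketch below shows one common way to compute a segment-based F1 score from binary activity matrices. It assumes the `'sed'` metric is segment-based, and the helpers `to_segments` and `segment_f1` are hypothetical; this is not the code that `evaluate` runs.

```python
import numpy as np


def to_segments(frame_activity, frames_per_segment):
    """Mark a class active in a segment if it is active in any of its frames."""
    n_frames, n_classes = frame_activity.shape
    n_segments = int(np.ceil(n_frames / frames_per_segment))
    # Pad with inactive frames so the last segment is complete.
    padded = np.zeros((n_segments * frames_per_segment, n_classes), dtype=bool)
    padded[:n_frames] = frame_activity.astype(bool)
    return padded.reshape(n_segments, frames_per_segment, n_classes).any(axis=1)


def segment_f1(reference, estimate, frames_per_segment):
    """Micro-averaged F1 over all (segment, class) pairs."""
    ref = to_segments(reference, frames_per_segment)
    est = to_segments(estimate, frames_per_segment)
    tp = np.sum(ref & est)
    fp = np.sum(~ref & est)
    fn = np.sum(ref & ~est)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom > 0 else 0.0


# Toy example: 4 predictions spaced 0.5 s apart, 2 classes, 1 s segments.
frames_per_segment = int(round(1.0 / 0.5))
reference = np.array([[1, 0], [1, 0], [0, 1], [0, 0]])
estimate = np.array([[1, 0], [0, 0], [0, 0], [0, 1]])

f1 = segment_f1(reference, estimate, frames_per_segment)
print(f1)
```

In this toy case the estimate misses individual frames but still matches the reference within every 1-second segment, so a coarser resolution is more forgiving of small timing errors.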