# 1. Experimental setup

## 1.1 Data download
### 1.1.1 UrbanSound8K
Download the dataset from https://serv.cusp.nyu.edu/projects/urbansounddataset/download-urbansound8k.html.

### 1.1.2 Extracted Features - long60 
Download the extracted features from https://dtudk-my.sharepoint.com/personal/s161162_win_dtu_dk/_layouts/15/guestaccess.aspx?folderid=06fa713b5cb82417da3b25ae4c5f77e91&authkey=AbYzk0UgqyHRyQT1r7rBmDs&e=68cd647d5e5f491986e2f921fc832940

### 1.1.3 Extracted Features - short60_9010
Download the extracted features from https://dtudk-my.sharepoint.com/personal/s161162_win_dtu_dk/_layouts/15/guestaccess.aspx?folderid=0a9f151ce43994f6394fe7894adf76c69&authkey=AZIM2-nYzd8uz4NQETl-638&e=978da6b2b80e4831844f275c50db185c

### 1.1.4 Trained Models - long60
Download the trained models from https://dtudk-my.sharepoint.com/personal/s161162_win_dtu_dk/_layouts/15/guestaccess.aspx?folderid=0cf9ad23411344bcc8bb22295fb1747a8&authkey=AU73y6XMKpwbsFP0cOQ4gA4&e=6e503350a63544a58cb03a1b7edec5a4

## 1.2 Feature extraction
Starting from the $\texttt{.wav}$ files of the UrbanSound8K dataset, we use the $\texttt{librosa}$ library to obtain a numerical representation of each clip. We save the extracted features and labels in two $\texttt{.npy}$ files that can be loaded into memory quickly. This matters for cross-validation, where every run would otherwise have to decode the audio files again with $\texttt{librosa}$.

In [None]:
from preprocessor import preprocessor

pp = preprocessor(parent_dir='data/UrbanSound8K')

# All ten UrbanSound8K folds: fold1 ... fold10
train_dirs = ["fold{}".format(i) for i in range(1, 11)]

# Only the long_60 variant is extracted here; uncomment a line to extract another
# variant (short or long segments, 60 or 200 bands).
#pp.save_fts_lbs(train_dirs=train_dirs, save_path='extracted/short_60', segment_size=20480, overlap=0.5, bands=60, frames=41)
pp.save_fts_lbs(train_dirs=train_dirs, save_path='extracted/long_60', segment_size=51200, overlap=0.9, bands=60, frames=101)
#pp.save_fts_lbs(train_dirs=train_dirs, save_path='extracted/short_200', segment_size=20480, overlap=0.5, bands=200, frames=41)
#pp.save_fts_lbs(train_dirs=train_dirs, save_path='extracted/long_200', segment_size=51200, overlap=0.9, bands=200, frames=101)

If you don't want to run the preprocessor, you can download the long60 extracted features, as shown in [Section 1.1.2](#1.1.2-Extracted-Features---long60).
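The $\texttt{segment\_size}$, $\texttt{overlap}$ and $\texttt{frames}$ arguments used in this section can be related with a little arithmetic. The sketch below is not the code of $\texttt{preprocessor}$; it assumes that clips are cut into fixed-size windows whose start positions advance by $\texttt{segment\_size} \cdot (1 - \texttt{overlap})$ samples, that a clip is 4 seconds long at 22050 Hz (the default sample rate of $\texttt{librosa}$), and that spectrogram frames are taken every 512 samples. The helper names $\texttt{segment\_bounds}$ and $\texttt{frames\_per\_segment}$ are hypothetical.

```python
def segment_bounds(n_samples, segment_size, overlap):
    """Return (start, end) sample indices of fixed-size windows.

    Consecutive windows share a fraction `overlap` of their samples;
    a trailing remainder shorter than `segment_size` is dropped.
    """
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in [0, 1)")
    # Distance between the starts of two consecutive windows.
    hop = max(1, round(segment_size * (1 - overlap)))
    return [(start, start + segment_size)
            for start in range(0, n_samples - segment_size + 1, hop)]


def frames_per_segment(segment_size, hop_length=512):
    """Frame count of a centered spectrogram over one segment.

    Assumption: one frame every `hop_length` samples, plus the first frame.
    """
    return 1 + segment_size // hop_length


# Assumption: a 4-second clip at 22050 Hz.
n_samples = 4 * 22050

long_windows = segment_bounds(n_samples, segment_size=51200, overlap=0.9)
short_windows = segment_bounds(n_samples, segment_size=20480, overlap=0.5)

print(len(long_windows), long_windows[:2])
print(len(short_windows), short_windows[:2])
print(frames_per_segment(51200), frames_per_segment(20480))
```

Under these assumptions, a segment of 51200 samples spans 101 frames and one of 20480 samples spans 41 frames, which agrees with the $\texttt{frames}$ values passed to $\texttt{save\_fts\_lbs}$ above.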

## 1.3 Overlapping of segments

If you don't want to run the script to overlay the segments, you can download the extracted features of the 90-10 dataset, as shown in [Section 1.1.3](#1.1.3-Extracted-Features---short60_9010).

# 2. Piczak CNN
## 2.1 Cross Validation
For each model variant (short60, long60, short200, long200) we ran 10-fold cross-validation. In each run, eight folds serve as training data, one fold as the validation set, and one fold as the test set.

In [None]:
from train_models import piczac_cross_validation

#piczac_cross_validation(epochs=300, load_path='extracted/short_60')
piczac_cross_validation(epochs=150, load_path='extracted/long_60')
#piczac_cross_validation(epochs=300, load_path='extracted/short_200')
#piczac_cross_validation(epochs=150, load_path='extracted/long_200')
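The 8/1/1 split described in this section can be sketched as a rotation over the ten folds. Which fold plays which role in each run is not stated here, so the pairing below (validation on fold $i$, test on fold $i+1$, wrapping around) is an assumption, and $\texttt{fold\_splits}$ is a hypothetical helper rather than part of $\texttt{train\_models}$.

```python
def fold_splits(n_folds=10):
    """Yield (train, validation, test) fold names for each run.

    Assumption: run i validates on fold i and tests on the next fold,
    wrapping around after the last one; the remaining folds are used
    for training.
    """
    folds = ["fold{}".format(i) for i in range(1, n_folds + 1)]
    for i in range(n_folds):
        val = folds[i]
        test = folds[(i + 1) % n_folds]
        train = [f for f in folds if f not in (val, test)]
        yield train, val, test


for run, (train, val, test) in enumerate(fold_splits(), start=1):
    print("run{}: train on {} folds, validate on {}, test on {}".format(
        run, len(train), val, test))
```

With this rotation every fold is used exactly once for validation and once for testing, so each of the ten runs reports accuracy on data the model never saw during training or model selection.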

To inspect the training process, point TensorBoard at the TensorBoard folder in this repo:

$\texttt{tensorboard --logdir='TensorBoard'}$

Below you can see our final results.

In [14]:
import pandas as pd

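# sheet_name=1 selects the second sheet (zero-based); older pandas versions spelled this `sheetname`
df = pd.read_excel('logs/piczak_cv_results.xlsx', sheet_name=1)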
df.head()

| Unnamed: 0 | run1 | run2 | run3 | run4 | run5 | run6 | run7 | run8 | run9 | run10 | Average | StDev |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| short_60 | 0.631746 | 0.524219 | 0.519109 | 0.614269 | 0.623094 | 0.550229 | 0.555519 | 0.587509 | 0.647792 | 0.67597 | 0.583721 | 0.048142 |
| long_60 | 0.690345 | 0.67187 | 0.518753 | 0.651414 | 0.683282 | 0.515301 | 0.623859 | 0.548439 | 0.703675 | 0.737907 | 0.634484 | 0.080128 |
| short_200 | 0.568919 | 0.584927 | 0.511141 | 0.590961 | 0.631396 | 0.576656 | 0.583974 | 0.660578 | 0.658915 | 0.646573 | 0.601404 | 0.047432 |
| long_200 | 0.639606 | 0.586811 | 0.563517 | 0.670743 | 0.634963 | 0.564297 | 0.60919 | 0.704243 | 0.675298 | 0.601156 | 0.624982 | 0.048254 |
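As a quick check on how the Average and StDev columns relate to the per-run values, the sketch below recomputes both for the long_60 row with Python's standard $\texttt{statistics}$ module. It assumes that StDev is the sample standard deviation (divisor $n - 1$); the values are copied from the long_60 row of the results above.

```python
import statistics

# Per-run values of the long_60 row, copied from the results table.
long_60 = [0.690345, 0.67187, 0.518753, 0.651414, 0.683282,
           0.515301, 0.623859, 0.548439, 0.703675, 0.737907]

mean = statistics.mean(long_60)
# statistics.stdev is the sample standard deviation (divides by n - 1).
stdev = statistics.stdev(long_60)

print("mean = {:.4f}, stdev = {:.4f}".format(mean, stdev))
```

For this row the recomputed values agree with the reported 0.634484 and 0.080128 to within rounding, which supports the sample standard deviation reading.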


## 2.2 Confusion Matrix

In [None]:
import confusion_matrix as cm

model_filename = 'models/long60/long60_150_(1, 2).h5'
load_path = 'extracted/long_60'

cm.plot_confusion_matrix(model_filename, load_path)

If you haven't run the cross-validation, you can download the trained models for the long60 variant, as shown in [Section 1.1.4](#1.1.4-Trained-Models---long60).

## 2.3 Boxplot

In [None]:
import boxplot as bxplt

file_name = 'logs/piczak_cv_results.xlsx'
colors = ['lightgreen', 'lightgreen', 'lightblue', 'lightblue']
labels = ['short60', 'long60', 'short200', 'long200']

bxplt.draw_boxplot(file_name, colors, labels)

In [None]:
file_name = 'logs/overlay_results.xlsx'
colors = ['lightblue', 'lightgreen', 'lightblue', 'lightgreen']
labels = [
    'Train: Single - Test: Overlay',
    'Train: Single - Test: Single',
    'Train: Single + Overlay - Test: Overlay',
    'Train: Single + Overlay - Test: Single',
]

bxplt.draw_boxplot(file_name, colors, labels)

# 3. Piczak CNN on overlapping sounds