# Classifying urban sounds using deep learning models.
## Model Training and Evaluation
### Load preprocessed data

The data is loaded from a pickle file that contains the preprocessed data created earlier. After loading, the data is split into training and testing sets.

In [4]:
# Load the preprocessed dataset from the pickle file stored in the data folder

import pickle

file_path = 'data/serial_dataset.pickle'
with open(file_path, "rb") as f:
    contents = pickle.load(f)

X = contents[0]
y = contents[1]

In [5]:
# Split the data into training (80%) and testing (20%) sets
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=632)

In [6]:
y.shape

(8732, 8732)

### Convolutional Neural Network model architecture
CNNs are some of the best tools for image classification due to their ability to capture spatial relationships.
Our architecture will be a relatively small `Sequential` model consisting of four `Conv2D` convolution layers. Each of the first three is followed by a `MaxPooling2D` pooling layer and a `Dropout` layer, and the fourth feeds a `GlobalAveragePooling2D` layer. Finally, the output comes from a `Dense` layer connected to the pooled convolutional features.
    
The convolutional layers perform the actual feature extraction. Each one slides a small window of weights (also called a kernel or filter) over the pixels of the image. Starting at the top-left corner, it multiplies the weights element-wise with the pixels under the window, sums the products, and saves the result as one entry of an activation map. This process is called convolution. The `filters` parameter specifies the number of filters applied in a layer, and each layer of a CNN can have many. Each filter learns something different about the problem and crosses the activation threshold for different inputs.
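
The sliding-window operation described above can be sketched in plain NumPy. This is only an illustration for a single filter with no padding and a stride of 1; the function name `conv2d_valid`, the toy image, and the kernel values are assumptions made for the example, not part of the notebook's model.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the activation map."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    activation_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            window = image[i:i + kh, j:j + kw]
            # Multiply the window element-wise by the weights and sum the products
            activation_map[i, j] = np.sum(window * kernel)
    return activation_map

# Toy 4x4 "image" and a hypothetical 2x2 kernel (same kernel size as the model)
image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.array([[1.0, 0.0],
                   [0.0, -1.0]])

activation_map = conv2d_valid(image, kernel)
# ReLU keeps only the positive responses, as activation='relu' does in Conv2D
relu_map = np.maximum(activation_map, 0.0)

print(activation_map.shape)
```

A 2x2 kernel on a 4x4 input yields a 3x3 map, which is the same "size minus one" shrinkage visible in each `Conv2D` row of the model summary.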

The pooling layers reduce dimensionality without much loss of information. This increases the robustness of the model and lowers its compute and storage requirements. Two types of pooling layers are used in the model. First, `MaxPooling2D` is used between convolutional layers to reduce dimensionality while preserving spatial information.
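
As a rough illustration of what a 2x2 max pooling step does to one feature map, the following NumPy sketch keeps the largest value in each non-overlapping 2x2 block. The helper `max_pool_2x2` and the sample values are hypothetical; the sketch assumes the "valid" behavior in which an odd trailing row or column is dropped.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """Non-overlapping 2x2 max pooling; an odd trailing row or column is dropped."""
    h, w = feature_map.shape
    h2, w2 = h // 2, w // 2
    trimmed = feature_map[:h2 * 2, :w2 * 2]
    # Group the map into 2x2 blocks, then take the maximum of each block
    blocks = trimmed.reshape(h2, 2, w2, 2)
    return blocks.max(axis=(1, 3))

# Hypothetical 4x5 feature map; the fifth column is dropped by the pooling
feature_map = np.array([[1, 3, 2, 0, 9],
                        [4, 2, 1, 5, 9],
                        [0, 1, 7, 8, 9],
                        [6, 2, 3, 4, 9]], dtype=float)

pooled = max_pool_2x2(feature_map)
print(pooled)
```

Each output cell still corresponds to a region of the input, which is the sense in which spatial information is preserved while the map shrinks to a quarter of its size.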

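After the features are extracted, they are fed into a `GlobalAveragePooling2D` layer, which reduces each feature map to its average and so turns the features into a vector. The result is a much shorter vector than the `Flatten` layer would produce, so the `Dense` layer that follows needs far fewer weights.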
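
To make the difference between the two options concrete, here is a small NumPy comparison. The batch shape `(2, 3, 20, 128)` is an assumption chosen for illustration, and the random values stand in for real feature maps.

```python
import numpy as np

# Hypothetical batch of feature maps: (samples, rows, cols, channels)
rng = np.random.default_rng(0)
features = rng.random((2, 3, 20, 128))

# Global average pooling: one mean per channel, so one value per feature map
gap = features.mean(axis=(1, 2))

# Flattening: every spatial position of every channel becomes its own entry
flat = features.reshape(features.shape[0], -1)

print(gap.shape)   # one 128-element vector per sample
print(flat.shape)  # one 3 * 20 * 128 = 7680-element vector per sample
```

With 10 output classes, a dense layer on the pooled vector would need 128 weights per class, while the flattened vector would need 7680 per class.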

The output layer has 10 nodes, which matches the number of target classes. The `softmax` activation function is used on the output layer, giving us a vector of probabilities.
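
A minimal sketch of how softmax turns ten raw output scores into probabilities is shown below. The `logits` values are made up for the example, and the `softmax` helper is a plain NumPy stand-in rather than the Keras implementation.

```python
import numpy as np

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    # Subtracting the maximum does not change the result but avoids overflow
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

# Hypothetical raw scores for the 10 target classes
logits = np.array([2.0, 1.0, 0.1, -1.0, 0.5, 0.0, -0.5, 1.5, 0.3, -2.0])

probs = softmax(logits)
predicted_class = int(np.argmax(probs))

print(probs.round(3))
print(predicted_class)
```

The predicted class is simply the index with the highest probability.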

In [4]:
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Conv2D, MaxPooling2D, GlobalAveragePooling2D 
from keras.callbacks import ModelCheckpoint, ReduceLROnPlateau
from keras.optimizers import Adam
from keras.utils import np_utils
from sklearn import metrics 

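# Dimensions of each input sample
width = 40
height = 175
channels = 1

# Reshape to the 4-D layout Conv2D expects: (samples, rows, cols, channels)
X_train = X_train.reshape(X_train.shape[0], width, height, channels)
X_test = X_test.reshape(X_test.shape[0], width, height, channels)

# Number of target classes, taken from the label matrix
label_count = y.shape[1]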

model = Sequential([
    Conv2D(filters=16,kernel_size=2,input_shape=(width,height,channels),activation='relu'),
    MaxPooling2D(pool_size=2),
    Dropout(0.2),
    
    Conv2D(filters=32,kernel_size=2,activation='relu'),
    MaxPooling2D(pool_size=2),
    Dropout(0.2),
    
    Conv2D(filters=64,kernel_size=2,activation='relu'),
    MaxPooling2D(pool_size=2),
    Dropout(0.2),
    
    Conv2D(filters=128,kernel_size=2,activation='relu'),
    GlobalAveragePooling2D(),
    
    Dense(label_count,activation='softmax')
    
])


Using TensorFlow backend.


### Model Compilation

In [5]:
model.compile(loss='categorical_crossentropy',metrics=['accuracy'], optimizer='adam')

In [6]:
# Model architecture
model.summary()

# Accuracy of the untrained model on the test set, as a baseline
score = model.evaluate(X_test, y_test, verbose=1)
accuracy = 100 * score[1]

print('Pretraining accuracy: %.4f%%' % accuracy)

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_1 (Conv2D)            (None, 39, 174, 16)       80        
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 19, 87, 16)        0         
_________________________________________________________________
dropout_1 (Dropout)          (None, 19, 87, 16)        0         
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 18, 86, 32)        2080      
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 9, 43, 32)         0         
_________________________________________________________________
dropout_2 (Dropout)          (None, 9, 43, 32)         0         
_________________________________________________________________
conv2d_3 (Conv2D)            (None, 8, 42, 64)        
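
The output shapes and parameter counts in the rows shown above can be checked by hand. The sketch below assumes no padding and a stride of 1 for the convolutions, and non-overlapping pooling that drops an odd trailing row or column; the helper names are hypothetical.

```python
def conv2d_output(rows, cols, kernel_size):
    """Spatial size after a convolution with no padding and stride 1."""
    return rows - kernel_size + 1, cols - kernel_size + 1

def pool_output(rows, cols, pool_size):
    """Spatial size after non-overlapping pooling (remainder is dropped)."""
    return rows // pool_size, cols // pool_size

def conv2d_params(kernel_size, in_channels, filters):
    """Weights per filter (kernel area * input channels) plus one bias, times the filter count."""
    return (kernel_size * kernel_size * in_channels + 1) * filters

conv1 = conv2d_output(40, 175, 2)   # (39, 174)
pool1 = pool_output(*conv1, 2)      # (19, 87)
conv2 = conv2d_output(*pool1, 2)    # (18, 86)
pool2 = pool_output(*conv2, 2)      # (9, 43)
conv3 = conv2d_output(*pool2, 2)    # (8, 42)

params1 = conv2d_params(2, 1, 16)   # 80
params2 = conv2d_params(2, 16, 32)  # 2080

print(conv1, pool1, conv2, pool2, conv3)
print(params1, params2)
```

These values agree with the `conv2d_1`, `conv2d_2`, and `conv2d_3` rows of the summary.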

### Training

Training now begins. CNNs are best trained on a GPU because their computations parallelize well. With a lower number of epochs, training on a CPU is also feasible.
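
The training cell below passes a `ReduceLROnPlateau` callback with a patience of 2, which is why the log further down shows the learning rate falling by a factor of ten every two epochs once progress stalls. The following is a simplified pure-Python simulation of that plateau rule, not the Keras implementation: it assumes a reduction factor of 0.1, ignores the cooldown option, and uses a made-up loss sequence.

```python
def simulate_plateau_schedule(losses, initial_lr=1e-3, factor=0.1,
                              patience=2, min_delta=1e-4, min_lr=0.0):
    """Return the learning rate in effect after each epoch of a plateau-based schedule."""
    lr = initial_lr
    best = float("inf")
    wait = 0
    history = []
    for loss in losses:
        if loss < best - min_delta:
            # Meaningful improvement: remember it and reset the counter
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                # No improvement for `patience` epochs: shrink the rate
                lr = max(lr * factor, min_lr)
                wait = 0
        history.append(lr)
    return history

# Hypothetical monitored losses that stop improving after the second epoch
stalled_losses = [1.0, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8]

lrs = simulate_plateau_schedule(stalled_losses)
lrs_with_floor = simulate_plateau_schedule(stalled_losses, min_lr=1e-5)

print(lrs)
print(lrs_with_floor)
```

Without a floor the rate keeps shrinking for as long as the plateau lasts, until weight updates become negligible. Setting the callback's `min_lr` argument is one way to stop the decay at a useful value.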

In [7]:
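%%timeit
# Note: %%timeit executes the cell several times, so training is repeated on the same model.
# Use %%time instead to time a single run.

EPOCHS = 50
BATCH_SIZE = 128
# Number of epochs without improvement before the learning rate is reduced
MAX_PATIENCE = 2

best_filepath = './best_model.hdf5'

# Callbacks: reduce the learning rate on a plateau and checkpoint the best model.
# ReduceLROnPlateau monitors val_loss by default; the checkpoint monitors the training loss.
callback = [
    ReduceLROnPlateau(patience=MAX_PATIENCE, verbose=1),
    ModelCheckpoint(filepath=best_filepath, monitor='loss', verbose=1, save_best_only=True),
]

# Compile
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Train
print('Starting...')
model.fit(x=X_train, y=y_train, epochs=EPOCHS, batch_size=BATCH_SIZE, verbose=0,
          validation_data=(X_test, y_test), callbacks=callback)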

Starting...

Epoch 00001: loss improved from inf to 2.77160, saving model to ./best_model.hdf5

Epoch 00002: loss improved from 2.77160 to 1.71385, saving model to ./best_model.hdf5

Epoch 00003: loss improved from 1.71385 to 1.44300, saving model to ./best_model.hdf5

Epoch 00004: loss improved from 1.44300 to 1.29350, saving model to ./best_model.hdf5

Epoch 00005: loss improved from 1.29350 to 1.20973, saving model to ./best_model.hdf5

Epoch 00006: loss improved from 1.20973 to 1.13178, saving model to ./best_model.hdf5

Epoch 00007: loss improved from 1.13178 to 1.08166, saving model to ./best_model.hdf5

Epoch 00008: loss improved from 1.08166 to 1.00517, saving model to ./best_model.hdf5

Epoch 00009: loss improved from 1.00517 to 0.96583, saving model to ./best_model.hdf5

Epoch 00010: loss improved from 0.96583 to 0.91061, saving model to ./best_model.hdf5

Epoch 00011: loss improved from 0.91061 to 0.84592, saving model to ./best_model.hdf5

Epoch 00012: loss improved from 0.


Epoch 00039: loss did not improve from 0.30892

Epoch 00040: ReduceLROnPlateau reducing learning rate to 1.000000032889008e-20.

Epoch 00040: loss did not improve from 0.30892

Epoch 00041: loss did not improve from 0.30892

Epoch 00042: ReduceLROnPlateau reducing learning rate to 1.0000000490448793e-21.

Epoch 00042: loss did not improve from 0.30892

Epoch 00043: loss did not improve from 0.30892

Epoch 00044: ReduceLROnPlateau reducing learning rate to 1.0000000692397185e-22.

Epoch 00044: loss did not improve from 0.30892

Epoch 00045: loss did not improve from 0.30892

Epoch 00046: ReduceLROnPlateau reducing learning rate to 1.0000000944832675e-23.

Epoch 00046: loss did not improve from 0.30892

Epoch 00047: loss did not improve from 0.30892

Epoch 00048: ReduceLROnPlateau reducing learning rate to 1.0000000787060494e-24.

Epoch 00048: loss did not improve from 0.30892

Epoch 00049: loss did not improve from 0.30892

Epoch 00050: ReduceLROnPlateau reducing learning rate to 1.000


Epoch 00026: ReduceLROnPlateau reducing learning rate to 1.0000001044244145e-13.

Epoch 00026: loss did not improve from 0.20375

Epoch 00027: loss did not improve from 0.20375

Epoch 00028: ReduceLROnPlateau reducing learning rate to 1.0000001179769417e-14.

Epoch 00028: loss did not improve from 0.20375

Epoch 00029: loss did not improve from 0.20375

Epoch 00030: ReduceLROnPlateau reducing learning rate to 1.0000001518582595e-15.

Epoch 00030: loss did not improve from 0.20375

Epoch 00031: loss did not improve from 0.20375

Epoch 00032: ReduceLROnPlateau reducing learning rate to 1.0000001095066122e-16.

Epoch 00032: loss did not improve from 0.20375

Epoch 00033: loss did not improve from 0.20375

Epoch 00034: ReduceLROnPlateau reducing learning rate to 1.0000000830368326e-17.

Epoch 00034: loss did not improve from 0.20375

Epoch 00035: loss did not improve from 0.20375

Epoch 00036: ReduceLROnPlateau reducing learning rate to 1.0000000664932204e-18.

Epoch 00036: loss did not i


Epoch 00015: loss did not improve from 0.15429

Epoch 00016: ReduceLROnPlateau reducing learning rate to 1.0000001111620805e-07.

Epoch 00016: loss did not improve from 0.15429

Epoch 00017: loss improved from 0.15429 to 0.15228, saving model to ./best_model.hdf5

Epoch 00018: ReduceLROnPlateau reducing learning rate to 1.000000082740371e-08.

Epoch 00018: loss improved from 0.15228 to 0.14789, saving model to ./best_model.hdf5

Epoch 00019: loss did not improve from 0.14789

Epoch 00020: ReduceLROnPlateau reducing learning rate to 1.000000082740371e-09.

Epoch 00020: loss did not improve from 0.14789

Epoch 00021: loss did not improve from 0.14789

Epoch 00022: ReduceLROnPlateau reducing learning rate to 1.000000082740371e-10.

Epoch 00022: loss did not improve from 0.14789

Epoch 00023: loss did not improve from 0.14789

Epoch 00024: ReduceLROnPlateau reducing learning rate to 1.000000082740371e-11.

Epoch 00024: loss did not improve from 0.14789

Epoch 00025: loss did not improve f


Epoch 00006: loss improved from 0.18109 to 0.17121, saving model to ./best_model.hdf5

Epoch 00007: ReduceLROnPlateau reducing learning rate to 0.00010000000474974513.

Epoch 00007: loss improved from 0.17121 to 0.16014, saving model to ./best_model.hdf5

Epoch 00008: loss improved from 0.16014 to 0.12707, saving model to ./best_model.hdf5

Epoch 00009: loss improved from 0.12707 to 0.11925, saving model to ./best_model.hdf5

Epoch 00010: loss improved from 0.11925 to 0.11279, saving model to ./best_model.hdf5

Epoch 00011: loss improved from 0.11279 to 0.11167, saving model to ./best_model.hdf5

Epoch 00012: ReduceLROnPlateau reducing learning rate to 1.0000000474974514e-05.

Epoch 00012: loss improved from 0.11167 to 0.11113, saving model to ./best_model.hdf5

Epoch 00013: loss did not improve from 0.11113

Epoch 00014: ReduceLROnPlateau reducing learning rate to 1.0000000656873453e-06.

Epoch 00014: loss improved from 0.11113 to 0.10713, saving model to ./best_model.hdf5

Epoch 000