### Deep Learning 

This notebook uses code from the Kaggle kernel [Deep_1: 0.91 acc](https://www.kaggle.com/potamitis/deep-1-0-91-acc), released under the Apache License 2.0, copyright Ilyas Potamitis. Here the code is applied to a smaller dataset, so the accuracy on the validation set is fairly low, even though the accuracy on the test set goes above 0.90.

The notebook uses the deep learning library [Keras](https://keras.io/), which wraps several deep learning frameworks. If you do not have it, you can install it by uncommenting the line in the cell below.

In [1]:
#!conda install keras

In [4]:
# Deep learning on time domain samples. Acc on 20% hold out test set is 91%
from __future__ import division
import numpy as np
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.layers import Conv1D, GlobalAveragePooling1D, MaxPooling1D
from keras.optimizers import SGD
from keras.layers import BatchNormalization  # importable from keras.layers in both older and newer Keras releases
from sklearn.model_selection import train_test_split
from keras.callbacks import ModelCheckpoint, EarlyStopping, ReduceLROnPlateau, CSVLogger

In [5]:
# reading the stored data
X = np.load('data/X.npy')
y = np.load('data/y.npy')

print(X.shape)

(5181, 5000)


In [6]:
fs = 8000
target_names = ['Ae. aegypti', 'Ae. albopictus', 'An. gambiae', 'An. arabiensis', 'C. pipiens', 'C. quinquefasciatus']


X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=2018)

# Convert label to onehot
y_train = keras.utils.to_categorical(y_train, num_classes=6)
y_test = keras.utils.to_categorical(y_test, num_classes=6)

X_train = np.expand_dims(X_train, axis=2)
X_test = np.expand_dims(X_test, axis=2)
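
The two preprocessing steps above change array shapes in ways that are easy to lose track of. The following is a minimal NumPy-only sketch of what happens, assuming four made-up labels and all-zero signals as stand-ins for the real data; `np.eye(6)[labels]` is used here only to mirror the one-hot layout that `keras.utils.to_categorical` produces.

```python
import numpy as np

# Hypothetical integer class labels for four recordings (not taken from the dataset)
labels = np.array([0, 3, 5, 1])

# One-hot encoding: row i has a single 1 in column labels[i]
one_hot = np.eye(6)[labels]

# Placeholder signals with the same length as the recordings in X
signals = np.zeros((4, 5000))

# Conv1D expects (samples, time steps, channels), so a channel axis is appended
signals_3d = np.expand_dims(signals, axis=2)

print(one_hot.shape)     # (4, 6)
print(signals_3d.shape)  # (4, 5000, 1)
```

The trailing axis of size 1 is the single audio channel, which is why the first layer of the network is declared with `input_shape=(5000, 1)`.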

In [7]:
# Build the Neural Network
model = Sequential()

model.add(Conv1D(16, 3, activation='relu', input_shape=(5000, 1)))
model.add(Conv1D(16, 3, activation='relu'))
model.add(BatchNormalization())

model.add(Conv1D(32, 3, activation='relu'))
model.add(Conv1D(32, 3, activation='relu'))
model.add(BatchNormalization())

model.add(MaxPooling1D(2))
model.add(Conv1D(64, 3, activation='relu'))
model.add(Conv1D(64, 3, activation='relu'))
model.add(BatchNormalization())

model.add(MaxPooling1D(2))
model.add(Conv1D(128, 3, activation='relu'))
model.add(Conv1D(128, 3, activation='relu'))
model.add(BatchNormalization())

model.add(MaxPooling1D(2))
model.add(Conv1D(256, 3, activation='relu'))
model.add(Conv1D(256, 3, activation='relu'))
model.add(BatchNormalization())
model.add(GlobalAveragePooling1D())

model.add(Dropout(0.5))
model.add(Dense(6, activation='softmax'))

model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])




model_name = 'deep_1'
top_weights_path = 'model_' + str(model_name) + '.h5'

# Note: the monitored key is 'val_acc' in older Keras releases; newer releases log it as 'val_accuracy'.
callbacks_list = [
    ModelCheckpoint(top_weights_path, monitor='val_acc', verbose=1, save_best_only=True, save_weights_only=True),
    EarlyStopping(monitor='val_acc', patience=6, verbose=1),
    ReduceLROnPlateau(monitor='val_acc', factor=0.1, patience=3, verbose=1),
    CSVLogger('model_' + str(model_name) + '.log')]
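
As a rough sanity check on the architecture defined in this cell, the length of the time axis can be traced by hand. The sketch below assumes the Keras defaults used above: `Conv1D` with `'valid'` padding and stride 1 (so a kernel of size 3 shortens the sequence by 2) and `MaxPooling1D(2)` halving the length with floor division. `BatchNormalization` leaves the shape unchanged, so it is omitted. The helper `output_length` is a hypothetical name introduced only for this illustration.

```python
# Order of the shape-changing layers in the model above
layers = ["conv", "conv",          # 16 filters
          "conv", "conv", "pool",  # 32 filters, then pooling
          "conv", "conv", "pool",  # 64 filters, then pooling
          "conv", "conv", "pool",  # 128 filters, then pooling
          "conv", "conv"]          # 256 filters


def output_length(n, layers, kernel=3, pool=2):
    """Trace the time-axis length through 'valid' convolutions and max pooling."""
    for layer in layers:
        if layer == "conv":
            n = n - kernel + 1   # 'valid' padding, stride 1
        else:
            n = n // pool        # non-overlapping pooling windows
    return n


final_length = output_length(5000, layers)
print(final_length)
```

Under these assumptions the last convolution outputs 617 time steps with 256 channels each, which `GlobalAveragePooling1D` then averages into a single 256-dimensional vector before the dropout and softmax layers.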

In [8]:
# Plot model

# install graphviz and pydot
#!conda install graphviz
#!conda install pydot

from keras.utils import plot_model
plot_model(model,to_file='model_plot.png') # requires pydot and graphviz

<img src="model_plot.png" alt="Drawing" style="width: 200px;"/>

In [9]:
%%time
# Fit the model (this can take a very long time)
model.fit(X_train, y_train, batch_size=128, epochs=100, validation_data=(X_test, y_test), callbacks=callbacks_list)


model.load_weights(top_weights_path)
loss, acc = model.evaluate(X_test, y_test, batch_size=16)

#print('loss', loss)
print('Test accuracy:', acc)

Train on 4144 samples, validate on 1037 samples
Epoch 1/100

KeyboardInterrupt: 

***Remark:*** 
The validation accuracy stops improving after epoch 6, while the test accuracy goes above 0.90.
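
To relate this to the `EarlyStopping(patience=6)` callback configured earlier, here is a simplified pure-Python sketch of the patience rule: training stops once the monitored value has failed to improve for `patience` consecutive epochs. The function name `early_stopping_epoch` and the accuracy values are invented for illustration and are not taken from an actual run; the real callback has further options that are ignored here.

```python
def early_stopping_epoch(val_acc_history, patience=6):
    """Return the 1-based epoch at which training would stop, or None if it never stops."""
    best = float("-inf")
    wait = 0
    for epoch, acc in enumerate(val_acc_history, start=1):
        if acc > best:
            best = acc   # new best value: reset the counter
            wait = 0
        else:
            wait += 1    # no improvement this epoch
            if wait >= patience:
                return epoch
    return None


# Made-up history: improvement through epoch 6, then a plateau
history = [0.30, 0.35, 0.40, 0.42, 0.45, 0.50] + [0.49] * 10
stop_epoch = early_stopping_epoch(history)
print(stop_epoch)
```

With a plateau starting right after epoch 6 and a patience of 6, this rule would end training at epoch 12, far short of the 100 epochs requested in `model.fit`.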

Further ideas:
* construct spectrograms
* do deep learning on the spectrograms using 2D convolutions
* model the time dimension using a Long Short-Term Memory (LSTM) network
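
The first idea in the list could be sketched as follows. This is a minimal NumPy-only short-time Fourier transform, not a tuned feature extractor: the function `spectrogram`, the window length `n_fft=256`, the hop size `hop=128`, and the synthetic 500 Hz tone are all assumptions chosen for illustration. Only the sampling rate `fs = 8000` and the recording length of 5000 samples come from the notebook.

```python
import numpy as np


def spectrogram(signal, n_fft=256, hop=128):
    """Power spectrogram of a 1-D signal, shape (n_frames, n_fft // 2 + 1)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    # Slice the signal into overlapping, windowed frames
    frames = np.stack([signal[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    # Squared magnitude of the one-sided FFT of each frame
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2


fs = 8000                              # sampling rate used in the notebook
t = np.arange(5000) / fs               # same length as one recording
tone = np.sin(2 * np.pi * 500 * t)     # synthetic 500 Hz test tone

S = spectrogram(tone)
freqs = np.fft.rfftfreq(256, d=1 / fs)
peak_hz = freqs[S.mean(axis=0).argmax()]

print(S.shape)
print(peak_hz)
```

Each recording becomes a 2-D array of time frames by frequency bins, which is the kind of input a network built from 2D convolutions would consume after adding a channel axis.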