# Mosquito Classifier Tutorial v1
## Author: Arvin Lin (林奕竹) Date: 2019-01-23
This notebook walks through the process of using transfer learning to train the mosquito classifier. To run this notebook, open a terminal and navigate to the directory where this notebook resides (/hdd/Train), then enter the following commands:

source ~/.bashrc
jupyter notebook
The Jupyter Notebook home page will open automatically in the web browser. On the home page, click Mosquito Classifier Tutorial.ipynb to open this notebook.

(Note: You can move this notebook, along with /hdd/Train/data, to the SSD to run the program faster.)



# Step 1: Load the Required Libraries
Load the required libraries by running the following cell:

#Easy tutorial on how to perform transfer learning on SqueezeNet.
#Author: Arvin Lin
#Please do not modify the following code unless you know what you are doing.
#Keras
from keras.models import Sequential, Model
from keras.layers.core import Dense, Dropout, Activation, Flatten
from keras.layers.convolutional import Conv2D, MaxPooling2D
from keras.optimizers import SGD,RMSprop,Adam
from keras.utils import np_utils
from keras.models import model_from_json
from keras.preprocessing.image import ImageDataGenerator
from keras.callbacks import EarlyStopping, ReduceLROnPlateau, CSVLogger, ModelCheckpoint
from keras import applications
from keras import regularizers

#Squeezenet
from keras_squeezenet import SqueezeNet
from keras.applications.imagenet_utils import preprocess_input, decode_predictions
from keras.preprocessing import image
#from keras.preprocessing.image.ImageDataGenerator import flow_from_directory

import numpy as np
import matplotlib.pyplot as plt
import matplotlib
import os
from PIL import Image
from numpy import *

from sklearn.utils import shuffle
#from sklearn.cross_validation import train_test_split
from sklearn.model_selection import train_test_split
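
Before running the cell above, it can help to confirm that every package it imports is installed. The following is a minimal sketch and not part of the original notebook; the helper name `missing_packages` is hypothetical, and the package list is taken from the import lines above.

```python
import importlib.util

# Top-level packages imported by the cell above
REQUIRED = ["keras", "keras_squeezenet", "numpy", "matplotlib", "PIL", "sklearn"]

def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

missing = missing_packages(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are available.")
```

If any package is reported as missing, install it before continuing with Step 2.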

# Step 2: Define the Hyper-Parameters for Training
Change the following variables to test which set of numbers gives the best training results.

# these values can be altered to compare training results
batch_size = 64                     #recommendation: 16, 32, 64, ...; larger values use more memory
nb_epoch = 3                        #number of epochs to train; the bigger the number, the longer it trains (e.g. 50)
minus_layers = 10                   #how many layers are removed from the end of squeezenet
freeze_layers = 10                  #how many of the first layers of squeezenet are non-trainable
dense_num = 1024                    #recommendation: 256, 512, 1024, ...
dropout_rate = 0.25                 #between 0 and 1
learning_rate = 0.00001             #recommendation: 0.00001, 0.00003, 0.0001, 0.0003, ...
ESP = 6                             #EarlyStopping patience; recommendation: don't set too high
RLROPP = 4                          #ReduceLROnPlateau patience; recommendation: don't set too high
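
As a rough guide to how `batch_size` and `nb_epoch` affect training length, the sketch below counts the batches processed per epoch and in total. It assumes a hypothetical training set of 1000 images (the real count depends on your data), and `batches_per_epoch` is an illustrative helper, not a Keras function. The per-epoch count corresponds to the value that `len(train_generator)` supplies as `steps_per_epoch` in Step 5.

```python
import math

def batches_per_epoch(num_images, batch_size):
    """Number of batches needed to see every image once (the last batch may be smaller)."""
    return math.ceil(num_images / batch_size)

num_images = 1000   # hypothetical dataset size, for illustration only
batch_size = 64
nb_epoch = 3

per_epoch = batches_per_epoch(num_images, batch_size)
print(per_epoch)              # 16
print(per_epoch * nb_epoch)   # 48
```

Halving `batch_size` roughly doubles the number of batches per epoch, while each batch needs less memory.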

# Step 3: Define the Training Input Shape and Define a Data Generator
A data generator can create additional training images by applying random changes to the existing images. In the following cell, follow the comments to modify the data generator. Do not change other lines of code unless you know what you are doing.

## Do Not Change These Lines
#Define the input shape of the images
img_rows, img_cols = 227, 227
img_channels = 3

## This Line Should be Changed Only When There Are More Types in the Image Dataset
# number of output classes
nb_classes = 3


# Modify the ImageDataGenerator to Test for Different Configurations
## Notice!!! Enabling too many of these options may result in a very LONG training time !!!
train_datagen = ImageDataGenerator(
	rescale = 1./255,              # Do Not Change This Line !!!
	#rotation_range = 20,          # set rotation_range's number or comment out this line 
	#zoom_range=0.1,               # Set zoom_range's number or comment out this line
	#width_shift_range=0.2,        # Set width_shift_range's number or comment out this line
	#height_shift_range=0.2,       # Set height_shift_range's number or comment out this line
	#shear_range=0.1,              # Set shear_range's number or comment out this line
	#horizontal_flip=True,         # Set horizontal_flip to True or False,  or comment out this line
	#vertical_flip=True            # Set vertical_flip to True or False,  or comment out this line
)

## Do Not Change The Following Lines
val_datagen = ImageDataGenerator(rescale=1./255)
test_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
	'data/train',
	target_size = (img_rows, img_cols),
	batch_size = batch_size,
	class_mode = 'categorical'
)

validation_generator = val_datagen.flow_from_directory(
	'data/val',
	target_size = (img_rows, img_cols),
	batch_size = batch_size,
	class_mode = 'categorical'
)

test_generator = test_datagen.flow_from_directory(
	'data/test',
	target_size = (img_rows, img_cols),
	batch_size = batch_size,
	class_mode = 'categorical'
)
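
`flow_from_directory` expects each of `data/train`, `data/val`, and `data/test` to contain one subfolder per class, and it assigns class indices in alphanumeric order of the subfolder names. The sketch below is not part of the original notebook. It counts the image files directly inside each class folder so that you can check the layout before training. The helper names and the extension list are assumptions.

```python
import os

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".bmp")  # assumed image types

def class_counts(split_dir):
    """Map each class subfolder of split_dir to its number of image files."""
    counts = {}
    for name in sorted(os.listdir(split_dir)):
        class_dir = os.path.join(split_dir, name)
        if os.path.isdir(class_dir):
            counts[name] = sum(
                1 for f in os.listdir(class_dir)
                if f.lower().endswith(IMAGE_EXTENSIONS)
            )
    return counts

def class_indices(split_dir):
    """Class-name-to-index mapping in sorted order of subfolder names."""
    return {name: i for i, name in enumerate(class_counts(split_dir))}

for split in ("data/train", "data/val", "data/test"):
    if os.path.isdir(split):
        print(split, class_counts(split))
    else:
        print(split, "not found")
```

The number of class subfolders in each split should equal `nb_classes`.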

# Step 4: Load and Modify the SqueezeNet Model
We use the SqueezeNet model because it is small enough to run on a Raspberry Pi and can still attain very good accuracy. Some of the following lines can be modified if you have experience with Keras.

model = SqueezeNet()
#Freeze some layers; the number of frozen layers is defined in Step 2.
for layer in model.layers[:freeze_layers]:
	layer.trainable = False
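#Pop out some layers; the number of popped layers is also defined in Step 2.
#Note: in Keras 2.x, popping from model.layers does not change model.output,
#so x below is still the output of the full SqueezeNet.
for i in range(minus_layers):
	model.layers.pop()
#Get the output of the squeezenet base model
x = model.output
#Add custom layers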
x = Dense(dense_num, activation="relu")(x)

## Modify The Following Lines
## Do So If You Know Keras

x = Dropout(dropout_rate)(x)
## Modify The Above Lines




# Do Not Modify the Following Lines
predictions = Dense(nb_classes, activation = "softmax")(x)
final_model = Model(inputs = model.input, outputs = predictions)
# Compile model
final_model.compile(loss='categorical_crossentropy', optimizer=Adam(lr=learning_rate), metrics=['accuracy'])
#Callbacks: early stopping, learning-rate reduction, CSV logging, and checkpointing
callback = [EarlyStopping(patience = ESP),
            ReduceLROnPlateau(patience = RLROPP, verbose = 1),
            CSVLogger(filename = 'log.csv'),
            ModelCheckpoint('checkpoint.check',
                            save_best_only = True,
                            save_weights_only = True)]
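
To see how `ESP` and `RLROPP` interact, here is a simplified pure-Python simulation of the two patience counters. It is a sketch, not Keras's implementation: it ignores `min_delta` and cooldown, assumes both callbacks monitor the validation loss, and assumes the learning rate is multiplied by 0.1 on each reduction (taken here to be the default `factor` of `ReduceLROnPlateau`). The validation losses are made up for illustration.

```python
def simulate(val_losses, lr, es_patience, rlrop_patience, factor=0.1):
    """Return a list of (epoch, learning_rate) for the epochs that would run."""
    best = float("inf")
    es_wait = 0      # epochs without improvement, for early stopping
    rlrop_wait = 0   # epochs without improvement, for learning-rate reduction
    history = []
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            es_wait = 0
            rlrop_wait = 0
        else:
            es_wait += 1
            rlrop_wait += 1
            if rlrop_wait >= rlrop_patience:
                lr *= factor      # learning rate is reduced at the end of this epoch
                rlrop_wait = 0
        history.append((epoch, lr))
        if es_wait >= es_patience:
            break                 # early stopping ends training here
    return history

# Made-up validation losses: improvement stops after epoch 2
losses = [1.00, 0.90, 0.95, 0.96, 0.97, 0.98, 0.99, 1.00, 1.10]
for epoch, lr in simulate(losses, lr=0.00001, es_patience=6, rlrop_patience=4):
    print(epoch, lr)
```

With these values the learning rate is reduced once, at the end of epoch 6, and training stops after epoch 8. If `ESP` were not larger than `RLROPP`, training would stop before a reduced learning rate could take effect.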

# Step 5: Train the Model
After this step, three files will be created:

model.json
model.h5
accuracy.txt

The two model files contain the model and can be placed on the mosquito trap Raspberry Pi to predict mosquitoes. The accuracy.txt file stores the test accuracy of the model. During training, the callbacks defined in Step 4 also write log.csv and checkpoint.check.

#start training
hist = final_model.fit_generator(
	train_generator,
	epochs = nb_epoch,
	steps_per_epoch = len(train_generator),
	verbose = 1,
	validation_data = validation_generator,
	validation_steps = len(validation_generator),
	callbacks = callback)

scores = final_model.evaluate_generator(test_generator, steps = len(test_generator), verbose=1)
print("%s: %.2f%%" % (final_model.metrics_names[1], scores[1]*100))

model_json = final_model.to_json()
with open("model.json","w") as json_file:
	json_file.write(model_json)

final_model.save_weights("model.h5")
print("Saved model")

with open('accuracy.txt','w') as f:
	f.write("%s: %.2f%%" % (final_model.metrics_names[1], scores[1]*100))
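
On the Raspberry Pi, the model's softmax output is a list of `nb_classes` probabilities, and the predicted class is the index with the highest value. The sketch below shows that lookup in plain Python. The class names are hypothetical; in practice the mapping would come from `train_generator.class_indices` and the probabilities from `final_model.predict`.

```python
def decode_prediction(probabilities, class_indices):
    """Return (class_name, probability) for the most likely class."""
    index_to_class = {index: name for name, index in class_indices.items()}
    best_index = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return index_to_class[best_index], probabilities[best_index]

# Hypothetical mapping and softmax output, for illustration only
class_indices = {"aedes": 0, "culex": 1, "other": 2}
probabilities = [0.10, 0.70, 0.20]

print(decode_prediction(probabilities, class_indices))  # ('culex', 0.7)
```

Saving the class mapping next to model.json and model.h5 is one way to keep the labels consistent between training and prediction.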