# Lab 4.1 Basic Convolutional Neural Network

In [0]:
'''
Copyright (c) 2019 Oscar PANG (oscarpang@vtc.edu.hk) All rights reserved.

The MIT License

Permission is hereby granted, free of charge, to any person obtaining a copy of 
this software and associated documentation files (the "Software"), to deal in 
the Software without restriction, including without limitation the rights to 
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
the Software, and to permit persons to whom the Software is furnished to do so,
subject to the following conditions:

The above copyright notice and this permission notice shall be included in all 
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR 
COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER 
IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN 
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
'''

## 4.1.1 Import necessary packages

In [0]:
import os, sys
import cv2
import time
import numpy as np
import matplotlib.pyplot as plt
from glob import glob

## 4.1.2 Upload dataset

In this exercise, we will use the "Plant Seedlings Dataset" from the ***Computer Vision and Biosystems Signal Processing Group, Aarhus University, Denmark***. More information on the dataset can be found at this link:

https://vision.eng.au.dk/plant-seedlings-dataset/

You should have uploaded the dataset into your Google Drive in advance.

The following code snippet authorizes access to your Google Drive so that the dataset file can be retrieved. Execute the cell below and follow the instructions to obtain and enter the access code for your Google Drive. On success, your Google Drive will be mounted in Colab at **/content/gdrive**.

In [0]:
from google.colab import drive
drive.mount('/content/gdrive')

You can execute some basic Linux terminal commands in the cells below.

Execute the first command to check that the dataset file is ready in your Google Drive.

Then run the unzip command to extract the dataset, which contains two folders named **train** and **test**. They store the ***labelled*** image data for training and the ***unlabelled*** image data for testing, respectively. This will take a while to run.

In [0]:
ls /content/gdrive/My\ Drive/public/plant_seedlings_dataset.zip 

In [0]:
!unzip /content/gdrive/My\ Drive/public/plant*.zip 

In [0]:
# Run the following command to browse the "plant_seedlings_dataset" directory. 
# You should see two folders "test" and "train"
ls plant_seedlings_dataset

## 4.1.3 Define Constants

The dataset should now be ready for use. Next, we define some path names and constants.

In [0]:
# define some constants and path names
label_path = "plant_seedlings_dataset/train"
train_path = "plant_seedlings_dataset/train/*/*.png"
test_path = "plant_seedlings_dataset/test/*.png"

IMG_SIZE = (100, 100)
CHANNEL = 3

## 4.1.4 Prepare and load the dataset

First let's load the filenames of the plant seedling images and get the class names of the labels. The names of the seedlings are saved and indexed in a dictionary *label_dict*.



In [0]:
# use glob to read the path of each image file
train_filenames = glob(train_path)
test_filenames = glob(test_path)

#print(train_filenames)
#print(test_filenames)

label_dict = {}
class_num = 0

# create a dictionary mapping each seedling class name (folder name) to an integer index
for subdir in sorted(os.listdir(label_path)):
    label_dict[subdir] = class_num
    class_num+=1

print(label_dict)
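The same kind of class-to-index mapping can be sketched more compactly with `enumerate`. The folder names below are hypothetical stand-ins for the result of `os.listdir(label_path)`, and the variable names are illustrative only. The reverse mapping is an optional extra that can be handy later for turning a predicted index back into a class name.

```python
# Hypothetical class folder names; in the lab these come from os.listdir(label_path)
class_names = ["Maize", "Charlock", "Cleavers"]

# Sorting first makes the index of each class independent of the listing order
example_label_dict = {name: index for index, name in enumerate(sorted(class_names))}
print(example_label_dict)  # {'Charlock': 0, 'Cleavers': 1, 'Maize': 2}

# The reverse mapping turns a predicted index back into a class name
index_to_name = {index: name for name, index in example_label_dict.items()}
print(index_to_name[2])  # Maize
```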

Then, we read the image data (train data and test data) and the respective labels into lists. This may take a while.

In [0]:
train_img = []
train_label = []

for filename in train_filenames:
    image = cv2.resize(cv2.imread(filename), IMG_SIZE)
    image = cv2.cvtColor(image,cv2.COLOR_BGR2RGB)
    # the class name is the name of the folder that contains the image
    seed_name = os.path.basename(os.path.dirname(filename))
    train_img.append(image)
    train_label.append(label_dict[seed_name])

# convert the lists into numpy array
train_img = np.asarray(train_img)
train_label = np.asarray(train_label)

# examine the datasets
print(train_img.shape)
print(train_label.shape)

In [0]:
test_img = []
test_label = []

for filename in test_filenames:
    image = cv2.resize(cv2.imread(filename), IMG_SIZE)
    image = cv2.cvtColor(image,cv2.COLOR_BGR2RGB)
    test_img.append(image)

# convert the image data list into numpy array
test_img = np.asarray(test_img)

# examine the datasets
print(test_img.shape)
#print(train_label.shape)

We can visualize the image data read from the train data lists using *matplotlib*.

In [0]:
plt.rcParams['figure.figsize'] = (10.0,8.0)
for i in range(20):
    plt.subplot(4,5,i+1)
    plt.imshow(train_img[i])

# print the labels of the 20 images shown above
print(train_label[:20])

## 4.1.5 Data and Feature Processing

A color image has pixel values in the range 0 - 255, but a neural network prefers input data in the range 0 - 1 (why?).

### Exercise: Complete the code below to rescale the image data in the numpy array from the range 0-255 to 0-1.

We also encode the labels of the images as **one-hot vectors**. One-hot encoding represents each label as a binary vector with a single 1 at the position of its class and 0 everywhere else.

Suppose you have 5 classes of data, namely A, B, C, D and E. The one-hot representation of label A is then '1 0 0 0 0', label C is '0 0 1 0 0', and so on.
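The five-class example above can be reproduced with a short NumPy sketch. The integer labels are hypothetical (A=0, ..., E=4), and indexing an identity matrix is just one simple way to build one-hot rows; it is not necessarily how Keras does it internally.

```python
import numpy as np

# Hypothetical integer labels for five classes A..E (A=0, ..., E=4): here A, C and E
labels = np.array([0, 2, 4])
num_classes = 5

# Row i of the identity matrix is the one-hot vector for class i
onehot = np.eye(num_classes, dtype=int)[labels]
print(onehot)

# argmax recovers the original integer label from each one-hot row
recovered = onehot.argmax(axis=1)
print(recovered)
```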


In [0]:
# to_categorical is imported directly from keras.utils, which also works in newer Keras versions
from keras.utils import to_categorical

# rescale the pixel value of the color images from 0-255 to 0-1

### TO DO: rescale the train image data and test image data by 255 (2 lines)
# ========================= ++Your code here ===========================#
train_img_scaled = None
test_img_scaled = None
# ======================================================================#

# perform one-hot encoding on the train labels
train_label_onehot = to_categorical(train_label)

print(train_label_onehot[:20])

The images in the train data folder are read into the numpy array sequentially. We have to shuffle them before feeding them into the CNN for training. This ensures that the data is randomly distributed, so that neither the training batches nor the validation split is biased towards a particular class.

During training, we regularly check the accuracy and loss of the model on a small sample of data by comparing the predictions of the model under training with the respective ground truths. This helps us keep track of the accuracy of the model as training progresses and check whether the model is overfitting or underfitting. Such a small subset taken from the training data is commonly known as the "validation data / set".

Hence, it is useful to further split the train data into two subsets: one for training (80% of the total train data) and the remainder (20%) for validation during training. Note that the neural network model is not trained on, and does not learn from, the validation data.

There are existing libraries that are useful for these tasks, e.g. **sklearn** (`sklearn.model_selection.train_test_split`). Here we demonstrate the technique using **numpy**.
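One detail worth seeing in isolation is that images and labels must be shuffled with the same permutation, otherwise the pairs get mixed up. The toy arrays below are hypothetical (each "image" is a single number and its label is ten times that number), so this is only a sketch of the idea.

```python
import numpy as np

# Hypothetical toy data: 6 "images" (single numbers) and their labels.
# Each label is 10x its image, so a mismatched pair would be easy to spot.
images = np.arange(6)
labels = images * 10

rng = np.random.RandomState(5)          # fixed seed for reproducibility
indices = rng.permutation(len(images))  # a random ordering of 0..5

# Indexing both arrays with the SAME permutation keeps the pairs aligned
images_shuffled = images[indices]
labels_shuffled = labels[indices]

print(indices)
print(images_shuffled)
print(labels_shuffled)
```

Shuffling the two arrays independently (for example with two separate calls to a shuffle function) would break the correspondence between an image and its label.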

---
### Exercise: Use numpy slicing to divide the train image dataset and its one-hot encoded labels into training and validation subsets

In [0]:
# shuffle the image dataset and split it into training dataset (80%) and validation dataset (20%)
np.random.seed(5)
indices = np.random.permutation(train_img_scaled.shape[0])

train_img_scaled = train_img_scaled[indices]
train_label_onehot = train_label_onehot[indices]

train_num = int(train_img_scaled.shape[0] * 0.8)

# split train and evaluation sets using numpy slicing
x_Train = train_img_scaled[:train_num]
y_Train = train_label_onehot[:train_num]


### TO DO: Use numpy slicing to slice the remaining data for x_Eval and y_Eval
# ========================= ++Your code here ===========================#
x_Eval = None 
y_Eval = None 
# ======================================================================#

print(x_Train.shape)
print(y_Train.shape)
print(x_Eval.shape)
print(y_Eval.shape)
print("Total number of samples in the train folder: ",train_img_scaled.shape[0])

### Exercise: Employ other image processing techniques to pre-process the image data

The following code cell serves as a placeholder for you to try different image processing techniques that enhance the image data or reduce its size without sacrificing image quality. The smaller the image data, the faster the processing. This part is left for you to experiment with afterwards.

In [0]:
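### TO DO: Free trial of image processing / feature engineering techniques
### You may consider using OpenCV and various photo enhancement techniques to boost the accuracy
### Remember to save the enhanced data into the respective numpy container.

# E.g. Try converting the color images into gray-scale by averaging the 3 color channels.
# The image arrays have shape (N, height, width, 3), so the channel is the last axis.
# Note: the result has shape (N, height, width); the model input shape must be changed to match.
#x_Train = (x_Train[...,0] + x_Train[...,1] + x_Train[...,2])/3
#x_Eval = (x_Eval[...,0] + x_Eval[...,1] + x_Eval[...,2])/3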
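As one possible starting point for this exercise, the sketch below converts a batch of RGB images to gray-scale by averaging over the channel axis. The random batch is hypothetical, and a plain average is only one of several ways to compute gray-scale values. Keeping a channel axis of size 1 is a design choice that lets the result still be fed to a 2D convolution layer, provided the channel constant is changed from 3 to 1.

```python
import numpy as np

# Hypothetical batch of 4 RGB images, 100x100 pixels, values already scaled to 0-1
rng = np.random.RandomState(0)
batch_rgb = rng.rand(4, 100, 100, 3)

# Average over the last (channel) axis; keepdims=True leaves a channel axis of size 1
batch_gray = batch_rgb.mean(axis=-1, keepdims=True)

print(batch_rgb.shape)   # (4, 100, 100, 3)
print(batch_gray.shape)  # (4, 100, 100, 1)
```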

## 4.1.6 Define Helper Function

We define a graph plotting function here which can visualize the accuracy of the model on the training dataset and validation dataset throughout the training process.

In [0]:
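# create a helper function that plots a training metric and its validation counterpart
def show_train_history(train_history, train, validation):
    plt.plot(train_history.history[train])
    plt.plot(train_history.history[validation])
    plt.title('Train History')
    plt.xlabel('Epochs')
    plt.ylabel(train)
    # label the two curves so they can be told apart
    plt.legend(['train', 'validation'], loc='upper left')
    plt.show()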

## 4.1.7 Build a CNN Model

Finally, we come to the neural network model building part!

Keras has already implemented some handy APIs for building various types of deep neural networks. It's like making a cake by stacking up different layers. You are encouraged to check out the online documentation of Keras for details of the different parameters.

https://keras.io/

### Step a: Import Keras Packages

In [0]:
# import keras packages

from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPooling2D
from keras import optimizers
from keras.preprocessing.image import ImageDataGenerator
from keras.callbacks import TensorBoard

### Step b: Define Model

We will build a Convolutional Neural Network with this architecture:

`Input -> [CONV -> RELU -> MAX_POOL] * 3 -> FC -> RELU -> FC -> SOFTMAX`

The model is [sequential](https://keras.io/getting-started/sequential-model-guide/) in nature. We can stack up different layers linearly using this syntax: `model.add()`

The image data will be fed to the first 2D convolution layer with the following specifications:

*Convolution Block 1*
* `Conv2D`: Input dimension: 100x100x3, No. of Filters: 32, Filter Size: 3x3, Activation Function: ReLU, Default Stride and Padding
* `MaxPooling2D`: Filter Size 2x2

---
**Exercise: Implement Convolution Block 2 and 3**

Continue to implement the model by adding convolution blocks 2 and 3 according to the requirements below.

*Convolution Block 2*
* `Conv2D`: No. of filters: 64, Filter Size: 3x3, Activation Function: ReLU, Default Stride and Padding
* `MaxPooling2D`: Filter Size 2x2

*Convolution Block 3*
* `Conv2D`: No. of filters: 128, Filter Size: 3x3, Activation Function: ReLU, Default Stride and Padding
* `MaxPooling2D`: Filter Size 2x2
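To get a feel for how the three blocks shrink the image, the sketch below traces the feature-map size through them. It assumes the Keras defaults for these layers (stride 1 and 'valid' padding for `Conv2D`, stride equal to the pool size for `MaxPooling2D`); the helper names `conv_out` and `pool_out` are made up for this illustration. The result can be checked against the output shapes reported by the model summary.

```python
def conv_out(size, kernel=3):
    # 'valid' padding with stride 1: the filter must fit entirely inside the input
    return size - kernel + 1

def pool_out(size, pool=2):
    # non-overlapping pooling windows; an incomplete window at the border is dropped
    return size // pool

size = 100  # input images are 100x100
for block, filters in enumerate([32, 64, 128], start=1):
    size = pool_out(conv_out(size))
    print("after block", block, ":", (size, size, filters))

# length of the 1D vector obtained if the final feature maps are rolled out
flattened = size * size * 128
print("flattened length:", flattened)  # 12800
```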

After Convolution Block 3, the multi-dimensional output of the convolution block is "rolled out" / **flattened** into a 1D vector, which is fed into a fully-connected network (called a **Dense** layer in Keras). Continue to implement and add the following layers to the model:

*Fully Connected Layer 1*
* `Dense`: No. of nodes (units): 256, Activation Function: ReLU

*Fully Connected Layer 2 (Output Layer)*
* `Dense`: No. of nodes (units): 12, Activation Function: softmax

The final output layer consists of 12 nodes, each of which outputs the predicted probability of one seedling class. These probabilities are produced by a special activation function called **softmax**, so they sum to 1.0 across all classes.
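The behaviour of softmax can be sketched in a few lines of plain Python. The 12 raw scores below are hypothetical values standing in for the outputs of the last layer before activation, and the function is a minimal illustration rather than the Keras implementation.

```python
import math

def softmax(scores):
    """Convert a list of raw scores into probabilities that sum to 1."""
    # Subtracting the maximum first avoids overflow in exp() without changing the result
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw outputs of the last layer for 12 seedling classes
scores = [0.5, 2.0, -1.0, 0.0, 1.2, 0.3, -0.7, 3.1, 0.9, -2.0, 0.1, 1.5]
probs = softmax(scores)

print(len(probs))               # 12
print(round(sum(probs), 6))     # 1.0
print(probs.index(max(probs)))  # 7, the class with the largest raw score
```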

---

Refer to the Keras API documentation for the following:

* [Conv2D](https://keras.io/layers/convolutional/)
* [MaxPooling2D](https://keras.io/layers/pooling/)
* [Fully Connected Layer (Dense)](https://keras.io/layers/core/)


You can come back to tune various hyper-parameters in the layers and re-run the training later on.

In [0]:
# create a sequential model
model = Sequential()

# start building convolution layer
model.add(Conv2D(32, (3, 3), activation = 'relu', input_shape = (IMG_SIZE[0], IMG_SIZE[1], CHANNEL)))
model.add(MaxPooling2D((2, 2)))

### TO DO: Implement Convolution Block 2 and 3 (~ 4 lines)
#================ Your code below ================#


#=================================================#
model.add(Flatten())

### TO DO: Implement the 2 Fully Connected Layer (~2 lines)
#================ Your code below ================#


#=================================================#


You can examine the architecture of the model using `model.summary()`

In [0]:
model.summary()

### Step c: Define Loss Function and Parameters of Training

After defining the model architecture, we have to compile the model to configure how it will be trained: the loss function, the optimizer and the metrics to report.

In [0]:
model.compile(loss='categorical_crossentropy', optimizer=optimizers.RMSprop(lr=1e-4), metrics=['accuracy'])
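For a single sample with one-hot label y and predicted probabilities p, categorical cross-entropy is -sum(y_i * log(p_i)), so only the probability assigned to the correct class contributes. The sketch below illustrates this with hypothetical 3-class predictions; the small constant `eps` used to avoid log(0) is an assumption of this sketch.

```python
import math

def categorical_crossentropy(y_true, y_pred):
    """Cross-entropy between a one-hot label and predicted probabilities."""
    eps = 1e-7  # small constant to avoid log(0); the exact value is an assumption
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

y_true = [0, 0, 1]              # hypothetical one-hot label: class 2 of 3
confident = [0.05, 0.05, 0.90]  # the correct class gets a high probability
unsure = [0.40, 0.30, 0.30]     # the correct class gets a low probability

loss_confident = categorical_crossentropy(y_true, confident)
loss_unsure = categorical_crossentropy(y_true, unsure)

print(loss_confident)  # small loss
print(loss_unsure)     # larger loss
```

The loss shrinks towards 0 as the probability of the correct class approaches 1, which is what the optimizer tries to achieve during training.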

## 4.1.8 Model Training

After building the model, we can start the training of the model.

During training, we can monitor the intermediate results to determine whether the loss is converging. If it is not, we can stop the training and change the parameters. This saves time, especially when the model is large and training takes a long time.

### Exercise: Implement the training
Implement the training using `model.fit()` with the following parameters:
* Input Data: x_Train
* Input Label: y_Train
* Epochs (no. of passes over the training data): 20
* Batch size: 128
* Validation data: (x_Eval, y_Eval)

You can refer to the Keras API for [model.fit()](https://keras.io/models/model/)

---

When you are ready, execute the cell and you will see the training progress, starting from the first epoch.

Monitor **val_acc**. The loss should decrease and then plateau, and the accuracy of the model should increase as the training continues. Also note the number of epochs after which the validation accuracy saturates.

**The training takes roughly 30 min for 20 epochs. Do not stop the Colab session during training or you risk starting all over again!**

In [0]:
# log the start time of the training
start_time = time.time()

### TO DO: Implement the training using model.fit() and save the train history return object in train_history (1 line) 
#================ Your code below ================#
train_history = None
#=================================================#

# log the end time of the training
end_time = time.time()

## 4.1.9 Model Evaluation

After the training is stopped, we can visualise the results of the training by plotting the graphs of the loss function and accuracy over time.

In [0]:
print("Total time elapsed = %d sec" % int(end_time - start_time))
print("="*70)
print()
# evaluate model accuracy
scores = model.evaluate(x_Eval, y_Eval)
print("model scores = ", scores)

# for accuracy
show_train_history(train_history, 'acc', 'val_acc')

# for loss
show_train_history(train_history, 'loss', 'val_loss')

With a simple CNN having 3 convolution layers, you have built a model that achieves about 70% accuracy. Not bad!

But this is not the end of this Lab! We should strive for a better accuracy. Now save the model for later use and move on to the next part.

## Save Trained Model

Save the trained model to a file in HDF5 format using Keras. The file stores the model architecture together with its weights, so the model can be reloaded later to continue training or to run neural network inference.

In [0]:
filename = str(time.strftime("%Y%m%d_%H%M%S"))
filename = 'seedling_classifier_' + filename + '.h5'
model.save(filename)
print("Model has been saved successfully!")

In [0]:
# Download the saved model to your local system
from google.colab import files

files.download(filename)

# Lab 4.2 Model Fine Tuning

## 4.2.1 Fine-tuning Model

You can return to the model architecture part and tune the hyper-parameters of the convolutional layers. Feel free to use trial and error to see how the hyper-parameters affect the training time and accuracy.


## 4.2.2 Building a Deeper Network

Usually a deeper (more layers), more complex neural network performs better than a simple, shallow one.

In the above example, you have implemented the model with 3 convolution blocks. You can try inserting more convolution blocks before the fully connected layers, or even combining two convolutional layers into a single building block like the following:

`Input -> [CONV -> RELU -> CONV -> RELU -> POOL] * 3 -> FC -> RELU -> FC -> SOFTMAX`

This may increase training time significantly, however, as more parameters have to be trained in the model.





# Lab 4.3 Data Augmentation

We have built a CNN using roughly 3000 images of 12 classes of seedlings with an accuracy of 70%. Each class of seedling has about 200 - 300 images, which is not quite enough for building a robust image classifier. In fact, this scenario is common in real life: you often have insufficient data for tackling an image recognition task with a machine learning approach.

In this section, we learn to use a useful trick called **Data Augmentation**, which boosts the quantity of data for training and helps increase the performance of the model.
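As a rough illustration of the idea, the sketch below produces two new images from one by flipping it. The tiny 2x3 single-channel array is hypothetical and chosen only so the result is easy to read; this is plain NumPy slicing, not the mechanism any particular library uses internally.

```python
import numpy as np

# Hypothetical tiny "image": 2x3 pixels, 1 channel, so the flips are easy to read
image = np.arange(6).reshape(2, 3, 1)

flipped_lr = image[:, ::-1, :]  # horizontal flip: reverse the width axis
flipped_ud = image[::-1, :, :]  # vertical flip: reverse the height axis

print(image[:, :, 0])       # [[0 1 2] [3 4 5]]
print(flipped_lr[:, :, 0])  # [[2 1 0] [5 4 3]]
print(flipped_ud[:, :, 0])  # [[3 4 5] [0 1 2]]
```

A flipped seedling is still a seedling of the same class, so each flipped copy can reuse the label of the original image.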

---




In [0]:
### TO DO: uncomment the parameters in the ImageDataGenerator for trying various combination of image augmentation methods
### We first try flipping horizontally and vertically the images to produce some new images

# define the way new training data is generated
train_datagen = ImageDataGenerator(#featurewise_center=True, 
                             #featurewise_std_normalization=True,
                             #rotation_range=90,
                             #zoom_range=0.2,
                             #shear_range=0.2,
                             #width_shift_range=0.2,
                             #height_shift_range=0.2,
                             horizontal_flip=True,
                             vertical_flip=True) 
                             #zca_whitening=True)
  
# create the data generator by feeding the original data from numpy array into the generator
train_generator = train_datagen.flow(x = x_Train, y = y_Train, batch_size = 128)

To train the model on the newly generated data, we use `model.fit_generator()`.

Consult the [API](https://keras.io/preprocessing/image/) for usage and examples.

You can also create a data generator for the validation dataset and feed it into the model training too.
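When training from a generator, the number of batches drawn per epoch has to be given explicitly. A common choice is the number of batches needed to cover the training set once, sketched below. The sample count is a hypothetical round figure, not the actual size of the training subset.

```python
import math

num_samples = 3000  # hypothetical number of training images
batch_size = 128    # batch size passed to the generator

# Number of batches needed to see every sample once; the last batch may be smaller
steps_per_epoch = math.ceil(num_samples / batch_size)
print(steps_per_epoch)  # 24
```

Using the number of samples itself as the step count would make every epoch draw that many batches, which is about 128 times more work per epoch with this batch size.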

In [0]:
start_time = time.time()

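# one epoch = one pass over the training data, i.e. ceil(samples / batch size) batches
# (128 is the batch size used when creating train_generator)
train_history = model.fit_generator(train_generator, 
                                    steps_per_epoch=int(np.ceil(len(x_Train) / 128)), epochs = 10, 
                                    validation_data=(x_Eval, y_Eval))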

end_time = time.time()

Likewise, evaluate the accuracy of the model.

In [0]:
print("Total time elapsed = %d sec" % int(end_time - start_time))
print("="*70)
print()
# evaluate model accuracy
scores = model.evaluate(x_Eval, y_Eval)
print("model scores = ", scores)

# for accuracy
show_train_history(train_history, 'acc', 'val_acc')

# for loss
show_train_history(train_history, 'loss', 'val_loss')

In [0]:
# save the trained model for future use
filename = str(time.strftime("%Y%m%d_%H%M%S"))
filename = 'seedling_classifier_data_augmentation_' + filename + '.h5'
model.save(filename)
print("Model has been saved successfully!")

In [0]:
# Download the saved model to your local system
from google.colab import files

files.download(filename)