<a href="https://colab.research.google.com/github/ferrari212/Machine_Learning_0213/blob/master/AI_workshop_%E2%80%93_from_hype_to_real_world_applications_Traffic_sign_classification.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Deep Learning for image classification
For background on the workshop and an introduction, see https://www.tekna.no/en/events/ai-workshop--from-hype-to-real-world-applications-39390/

The following tutorial covers how to set up a simple deep learning experiment for image classification. The approach is based on building blocks and modules from the machine learning framework TensorFlow.

This model will classify images into different categories. The prerequisite for setting up the model is access to labelled data. As an initial test case we have included some example images of various traffic signs. The task of the model is thus to predict what kind of sign it sees. To make the example more realistic, both image quality and amount of data are quite limited (as is often the case in practical applications of machine learning).

These images are of course only included as an example to get started, and could easily be replaced with other images as long as you follow the same folder structure as the current setup, as explained below. 

## Helper code for loading data and setting tensorflow version

In [0]:
#@title
# Use Tensorflow version 1.x
# This is a colab specific command that can be used to set a specific version of Tensorflow.
# Both Colab and Tensorflow are Google products, so expect some "shortcuts" and tricks that are Colab specific.
%tensorflow_version 1.x

In [0]:
# Access training, validation and test data from a google drive account. Here
# you will need to authenticate with your own google account.

# File url: "https://drive.google.com/open?id=1HQiZx5tugjmdK4wEHLGxgfaghBLAS6Mx"
GDRIVE_FILE_ID = "1HQiZx5tugjmdK4wEHLGxgfaghBLAS6Mx"

In [0]:
#@title
# This cell contains code needed to import data from Google Drive.
# You don't need to understand this code!

# Import PyDrive and associated libraries.
# This only needs to be done once per notebook.
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials
import os

# Authenticate and create the PyDrive client.
# This only needs to be done once per notebook.
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

# Download a file based on its file ID.
def gdrive_zip_to_working_dir(file_id, filename='data.zip'):
  downloaded = drive.CreateFile({'id':file_id})
  downloaded.GetContentFile(filename)

In [0]:
#@title
%%capture
os.chdir('/content/')
!rm data.zip
!rm -r sign_classifier
gdrive_zip_to_working_dir(GDRIVE_FILE_ID);
!unzip data.zip;
!rm data.zip
os.chdir('sign_classifier')

In [0]:
#@title
def print_dir(dir, level=0):
  print('|- ' * level, dir.split('/')[-1],sep='')
  [print_dir(os.path.join(dir, d), level+1) for d in os.listdir(dir) if os.path.isdir(os.path.join(dir, d))]

In [7]:
print('Current working directory is:',os.getcwd())
print('And the directory looks like this:')
print_dir(os.getcwd())

Current working directory is: /content/sign_classifier
And the directory looks like this:
sign_classifier
|- augm_images
|- data
|- |- test
|- |- |- Slippery_road
|- |- |- Intersection
|- |- |- Stop
|- |- |- Bikes
|- |- |- Forbidden_for_traffic
|- |- |- Yield
|- |- |- No_entry
|- |- |- Right_of_way
|- |- |- Pedestrians
|- |- |- Speed_60
|- |- test_subset
|- |- |- Slippery_road
|- |- |- Intersection
|- |- |- Stop
|- |- |- Bikes
|- |- |- Forbidden_for_traffic
|- |- |- Yield
|- |- |- No_entry
|- |- |- Right_of_way
|- |- |- Pedestrians
|- |- |- Speed_60
|- |- train
|- |- |- Slippery_road
|- |- |- Intersection
|- |- |- Stop
|- |- |- Bikes
|- |- |- Forbidden_for_traffic
|- |- |- Yield
|- |- |- No_entry
|- |- |- Right_of_way
|- |- |- Pedestrians
|- |- |- Speed_60
|- |- val
|- |- |- Slippery_road
|- |- |- Intersection
|- |- |- Stop
|- |- |- Bikes
|- |- |- Forbidden_for_traffic
|- |- |- Yield
|- |- |- No_entry
|- |- |- Right_of_way
|- |- |- Pedestrians
|- |- |- Speed_60


## Folder structure for replacing included images with your own data: 




Place your images in subfolders under the main folder "data/", with the name of the image category as the subfolder name, as in the example folder structure shown below.
You need to split the images between "training data", "validation data" and "test data". The model learns from the example images in the training folder. The validation data is used as an early indicator of accuracy when optimizing model parameters. The test data is used for the final assessment, to measure the accuracy of the model on a totally independent set of images. One way of splitting the images between "train", "val" and "test" is, for example, to use 80% of the images for training the model and 10% each for validation and testing.


For a brief introduction to the importance of separating "train", "validation" and "test" data, you can have a read here: https://en.wikipedia.org/wiki/Training,_validation,_and_test_sets



Example folder structure with included dataset:

Training data:

- data/train/category_1 : Images of signs from category 1

- data/train/category_2 : Images of signs from category 2

- .........................................

Validation data:

- data/val/category_1 : Images of signs from category 1

- data/val/category_2 : Images of signs from category 2

- .........................................


Test data: 

- data/test/category_1 : Images of signs from category 1

- data/test/category_2 : Images of signs from category 2

- .........................................
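
If you bring your own images, an 80/10/10 split like the one described above could be produced along the following lines. This is only a sketch: the helper name `split_filenames`, the fractions, the seed and the example file names are assumptions made for illustration, and copying the files into `data/train`, `data/val` and `data/test` is left out.

```python
import random

def split_filenames(filenames, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle the file names of ONE category and split them into train/val/test."""
    names = sorted(filenames)            # sort first so the result is reproducible
    random.Random(seed).shuffle(names)
    n_train = int(len(names) * train_frac)
    n_val = int(len(names) * val_frac)
    return {
        "train": names[:n_train],
        "val": names[n_train:n_train + n_val],
        "test": names[n_train + n_val:],   # whatever is left over
    }

# Hypothetical file names for a single category
example = ["stop_%03d.jpg" % i for i in range(50)]
split = split_filenames(example)
print({name: len(files) for name, files in split.items()})
```

Splitting each category separately keeps the class proportions roughly equal across the three folders.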

## Import various libraries and packages necessary for defining and running the models

These are some Python libraries/packages that make our life a lot simpler, as we do not have to write all the code and functionality from scratch. Building a deep learning model from scratch without any of these libraries/packages would actually be a tremendous task!

In [0]:
import tensorflow as tf
print(tf.__version__)

In [0]:
import numpy as np
import matplotlib.pyplot as plt
plt.rcParams.update({'figure.max_open_warning': 0})
import seaborn as sns

from numpy.random import seed
seed(1337)
from tensorflow import set_random_seed
set_random_seed(42)

from tensorflow.python.keras.applications import vgg16
from tensorflow.python.keras.applications.vgg16 import preprocess_input
from tensorflow.python.keras.preprocessing.image import ImageDataGenerator, load_img
from tensorflow.python.keras.callbacks import ModelCheckpoint
from tensorflow.python.keras import layers, models, Model, optimizers

from sklearn.metrics import classification_report, confusion_matrix, accuracy_score
from plot_conf_matr import plot_confusion_matrix

## Define train/test data and categories

Define folders for the location of train/val/test images and the names of all the different categories we want to classify. 
We then plot the number of images per category in the training set. 

In [0]:
train_data_dir = "data/train"
val_data_dir = "data/val"
test_data_dir = "data/test"

category_names = sorted(os.listdir('data/train'))
nb_categories = len(category_names)
img_pr_cat = []

for category in category_names:
    folder = 'data/train' + '/' + category
    img_pr_cat.append(len(os.listdir(folder)))

sns.barplot(y=category_names, x=img_pr_cat).set_title("Number of training images per category:")

As a start, let us also plot one example image from each of the sign categories, to visualize the typical image quality.

In [0]:
for subdir, dirs, files in os.walk('data/train'):
    for file in files:
        img_file = subdir + '/' + file
        image = load_img(img_file)
        plt.figure()
        plt.title(subdir)
        plt.imshow(image)
        break

As you can see from the example images, the resolution and quality are not great. However, both image quality and amount of data are often quite limited in practical applications of machine learning. As such, low-quality images, limited to a maximum of 200 training images per category, represent a more realistic example than "perfect" high-quality images.

## Loading a pre-trained Deep Learning model

 There is no need at this stage to understand the details of the various types of deep learning models, but a summary of some common ways of building models can be found here for those interested (https://towardsdatascience.com/transfer-learning-from-pre-trained-models-f2393f124751). 

In this case, we use an already pre-trained deep learning model (VGG16) as the basis for our image classifier model, and then retrain the model on our own data. 

In [0]:
img_height, img_width = 224, 224
# Keras expects input_shape as (height, width, channels)
conv_base = vgg16.VGG16(weights='imagenet', include_top=False, pooling='max', input_shape=(img_height, img_width, 3))

Having loaded the pre-trained model, we can choose to freeze the earlier layers of the model (those closest to the input) in the code block below, and only re-train the last layers on our own data. This is a common transfer learning strategy, as explained in the article linked above, and is often a good approach when the amount of data available for training is limited.

This option is currently commented out in the code (using the # symbol), and we are thus retraining all layers of the model. The number of layers to train is a parameter you can experiment with yourselves. How does the number of trainable layers affect model performance?

In [0]:
#for layer in conv_base.layers[:-13]:
#    layer.trainable = False
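
To see what the slice `[:-13]` would select, the sketch below applies it to a plain list of layer names. It assumes the layer list matches the standard Keras VGG16 without its top classifier (an input layer, 13 convolution layers, 5 pooling layers and the final global pooling layer). The names of the first and last entries are placeholders, as the actual names vary between sessions.

```python
# Layer names of VGG16 with include_top=False and pooling='max' (assumed layout).
# "input" and "global_max_pooling" are placeholder names.
vgg16_layers = (
    ["input"]
    + ["block1_conv1", "block1_conv2", "block1_pool"]
    + ["block2_conv1", "block2_conv2", "block2_pool"]
    + ["block3_conv1", "block3_conv2", "block3_conv3", "block3_pool"]
    + ["block4_conv1", "block4_conv2", "block4_conv3", "block4_pool"]
    + ["block5_conv1", "block5_conv2", "block5_conv3", "block5_pool"]
    + ["global_max_pooling"]
)

n_trainable = 13
frozen = vgg16_layers[:-n_trainable]      # everything except the last 13 layers
trainable = vgg16_layers[-n_trainable:]   # the last 13 layers stay trainable

print("Frozen:   ", frozen)
print("Trainable:", trainable)
```

Under this assumption, the first two convolution blocks would be frozen and blocks 3 to 5 would be retrained. Changing `n_trainable` shifts that boundary.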

As a check, we also print a list of all layers of the model, and whether they are trainable or not (True/False).

In [0]:
for layer in conv_base.layers:
    print(layer, layer.trainable)

Using the VGG16 model as a basis, we build a final classification layer on top to predict our defined classes. We then print a model summary, listing the number of parameters of the model. If you decide to "freeze" some of the layers, you will notice that the number of "Trainable params" below will be lower.

As you can see, the output shape corresponds to the number of categories, which in our case is 10. 

In [0]:
model = models.Sequential()
model.add(conv_base)
model.add(layers.Dense(nb_categories, activation='softmax'))
model.summary()
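
The parameter count of the added classification layer can be checked by hand. The sketch below assumes that the VGG16 base with `pooling='max'` outputs a 512-element feature vector per image (its last convolution block has 512 channels). A dense layer has one weight per input/output pair plus one bias per output. The helper `dense_param_count` is a hypothetical name used only for this illustration.

```python
def dense_param_count(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

n_features = 512     # assumed size of the pooled VGG16 feature vector
n_categories = 10    # number of sign categories in the included dataset

print(dense_param_count(n_features, n_categories))  # 5130
```

This number should match the "Param #" reported for the dense layer in the model summary.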

## Define generators that read images from our folders and feed them to the image classifier models for training/testing

We need to define some functions that read images from our folders and feed them to the image classifier models. As part of this, we also add some basic image preprocessing, where the input images are scaled so that all pixel values lie in the range [0,1] (from 0-255 in the original images).

In [0]:
#Number of images to load at each iteration
batch_size = 32

# only rescaling
train_datagen =  ImageDataGenerator(
    rescale=1./255
)
test_datagen =  ImageDataGenerator(
    rescale=1./255
)

# these are generators for train/test data that will read pictures found in
# the defined subfolders of 'data/'

print('Total number of images for "training":')
train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode="categorical")

print('Total number of images for "validation":')
val_generator = test_datagen.flow_from_directory(
    val_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode="categorical",
    shuffle=False)

print('Total number of images for "testing":')
test_generator = test_datagen.flow_from_directory(
    test_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode="categorical",
    shuffle=False)
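
To illustrate what batching and rescaling mean, here is a small sketch in plain Python. It is not how Keras implements its generators; the helpers `rescale` and `batches` and the image count are assumptions made for this example.

```python
import math

def rescale(pixels, factor=1.0 / 255):
    """Scale raw pixel values (0-255) to the range [0, 1]."""
    return [p * factor for p in pixels]

def batches(items, batch_size):
    """Yield consecutive slices of at most batch_size items (one pass = one epoch)."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

n_images = 100                      # hypothetical number of images in a folder
batch_size = 32
image_ids = list(range(n_images))

sizes = [len(b) for b in batches(image_ids, batch_size)]
print(sizes)                                 # the last batch is smaller
print(math.ceil(n_images / batch_size))      # number of batches per epoch
print(rescale([0, 128, 255]))
```

With a batch size of 32, the number of iterations per epoch is the number of images divided by 32, rounded up.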


## Define final model parameters and start training process

Here, we define some parameters that control the training process of the model. Important parameters are, for example, the learning rate, how many epochs to train the model, and which optimizer to use. You do not need to understand all these terms to follow the tutorial, but those interested can have a quick read here: https://towardsdatascience.com/epoch-vs-iterations-vs-batch-size-4dfb9c7ce9c9

We also define a checkpoint parameter, where we keep track of the validation accuracy after each epoch during training. Using this, we save the model that performs best during the training process. 

In [0]:
learning_rate = 5e-5
epochs = 10

checkpoint = ModelCheckpoint("sign_classifier.h5", monitor='val_acc', verbose=1, save_best_only=True, save_weights_only=False, mode='auto', period=1)
model.compile(loss="categorical_crossentropy", optimizer=optimizers.Adam(lr=learning_rate, clipnorm = 1.), metrics = ['acc'])
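
The idea behind `save_best_only=True` can be sketched in a few lines of plain Python: after each epoch, the monitored value is compared with the best value seen so far, and the model is only saved when it improves. The function `best_epoch` and the accuracy values below are made up for illustration and are not results from this notebook.

```python
def best_epoch(val_acc_per_epoch):
    """Return (epoch, value) of the highest validation accuracy; epochs count from 1."""
    best_value = float("-inf")
    best_index = None
    for epoch, value in enumerate(val_acc_per_epoch, start=1):
        if value > best_value:          # only a strict improvement triggers a new save
            best_value = value
            best_index = epoch
    return best_index, best_value

# Made-up validation accuracies, for illustration only
val_acc_example = [0.62, 0.81, 0.90, 0.88, 0.93, 0.91]
print(best_epoch(val_acc_example))
```

In this made-up run, the file on disk after training would hold the model from epoch 5, not the model from the last epoch.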

We are now ready to start training the model on our own data. For each epoch, some relevant information from the training process is printed. Importantly, we want the accuracy of our model to be as good as possible. The model accuracy, as measured on the training data, is given by "acc", and the accuracy on the images in the validation set is given by "val_acc". The "val_acc" is the most important quantity, as it tells us how accurate the model is on images it has not already seen during the training process.

Ideally the "val_acc" should increase as we keep training the model, and will eventually reach a steady value when our model is not able to learn any more useful information from our training data. 

In [0]:
history = model.fit_generator(train_generator, 
                              epochs=epochs, 
                              shuffle=True, 
                              validation_data=val_generator,
                              callbacks=[checkpoint]
                              )

After the training has completed, we load the checkpoint file which had the best validation accuracy during training:

In [0]:
model = models.load_model("sign_classifier.h5")

## Model accuracy

To evaluate the model accuracy, we can plot how the model performance changes during the training process. This gives us important information to evaluate what we can do to improve model performance. For a nice introduction to this topic, you can also have a look at this video: https://www.youtube.com/watch?v=yr_qzEzhwqM

In [0]:
acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']

# Use a separate name so the "epochs" hyperparameter is not overwritten
epoch_range = range(1, len(acc) + 1)

plt.figure()
plt.plot(epoch_range, acc, 'b', label = 'Training accuracy')
plt.plot(epoch_range, val_acc, 'r', label='Validation accuracy')
plt.title('Training and validation accuracy')
plt.legend()
#plt.savefig('Accuracy.jpg')

plt.figure()
plt.plot(epoch_range, loss, 'b', label = 'Training loss')
plt.plot(epoch_range, val_loss, 'r', label='Validation loss')
#plt.yscale('log')
plt.title('Training and validation loss')
plt.legend()
#plt.savefig('Loss.jpg')

What we can interpret from these curves is as follows: 

Starting with the upper figure of training/validation accuracy:

The blue line represents the model accuracy as measured on the training images, and we see that this quickly reaches a value of almost 1 (which represents classifying 100% of the training images correctly). The red line represents the validation accuracy, measured on the validation set, which is the accuracy we really care about. In this case, the accuracy leveled off at around 97-98%, meaning that we successfully classified almost all of the images in our validation set to the correct category.

To learn a bit more about the accuracy for each of the categories, we can calculate and plot the "confusion matrix", which is an easy way of visualizing the model performance (https://en.wikipedia.org/wiki/Confusion_matrix). The confusion matrix is plotted through a function called "plot_confusion_matrix" in the included script "plot_conf_matr.py", for those interested in having a look at the code. This matrix compares the "true" vs. "predicted" class for all images in the test set.

Note: do not worry if you do not get exactly the same numbers when re-running the code! There is some inherent randomness in model initialization etc. which makes the results differ slightly from time to time.

In [0]:
Y_pred = model.predict_generator(test_generator)
y_pred = np.argmax(Y_pred, axis=1)

cm = confusion_matrix(test_generator.classes, y_pred)
plot_confusion_matrix(cm, classes = category_names, title='Confusion Matrix', normalize=False, figname = 'Confusion_matrix_concrete.jpg')

As seen from the confusion matrix above, the main category the model misclassifies is "Intersection", which it confuses with "Yield" in 10 of the images.
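
For those curious how such a matrix is built, the sketch below does it by hand in plain Python: the predicted class is the index of the largest model output, and each (true, predicted) pair adds one count to the matrix. The probabilities and labels are made up for illustration, and the helper names are assumptions. The notebook itself uses `np.argmax` and `confusion_matrix` from scikit-learn.

```python
def argmax(values):
    """Index of the largest value (the predicted class for one image)."""
    return max(range(len(values)), key=lambda i: values[i])

def confusion_counts(true_labels, predicted_labels, n_classes):
    """matrix[t][p] = number of images of true class t predicted as class p."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1
    return matrix

# Made-up model outputs for 5 images and 3 classes
probabilities = [
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],
    [0.3, 0.4, 0.3],
    [0.1, 0.1, 0.8],
    [0.6, 0.3, 0.1],
]
true_labels = [0, 1, 0, 2, 0]
predicted = [argmax(p) for p in probabilities]

for row in confusion_counts(true_labels, predicted, n_classes=3):
    print(row)
```

Correct predictions end up on the diagonal. Every off-diagonal entry is a specific kind of mistake, such as class 0 being predicted as class 1.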

We can also calculate the overall accuracy, that is, the fraction of all test images that were classified correctly:

In [0]:
accuracy = accuracy_score(test_generator.classes, y_pred)
print("Accuracy in test set: %0.1f%% " % (accuracy * 100))
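
Note that `accuracy_score` reports the fraction of all test images classified correctly. When the categories contain different numbers of images, this can differ from the accuracy averaged per category. The sketch below illustrates the difference on a made-up confusion matrix; the helper names and all numbers are assumptions for this example only.

```python
def overall_accuracy(matrix):
    """Correct predictions (the diagonal) divided by all predictions."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

def per_class_accuracy(matrix):
    """For each true class (row): fraction of its images predicted correctly."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]

# Made-up 3-class confusion matrix (rows: true class, columns: predicted class)
cm_example = [
    [90, 10, 0],
    [0, 10, 0],
    [0, 5, 5],
]
per_class = per_class_accuracy(cm_example)
print(overall_accuracy(cm_example))         # 105 of 120 images
print(per_class)
print(sum(per_class) / len(per_class))      # unweighted mean over classes
```

Here the large first class dominates the overall accuracy, while the per-class view reveals that the third class is only recognized half of the time.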

# Model with image augmentation

As we have limited training data, there are some tricks we could try to get more out of it. In our case, the model already performs very well with an accuracy of 97-98%, and it is not certain we are able to improve that further. However, one common strategy when dealing with limited training data is "image augmentation". That is, we make a collection of copies of the existing images, but with some minor changes. Those changes could be transformations such as rotating the images slightly, zooming, or flipping the images horizontally.
Examples are also covered here: https://towardsdatascience.com/image-augmentation-for-deep-learning-histogram-equalization-a71387f609b2

In the following, we define the same model as above, but here we also incorporate image augmentation as a way of artificially increasing the amount of training data.

In [0]:
# Keras expects input_shape as (height, width, channels)
conv_base = vgg16.VGG16(weights='imagenet', include_top=False, pooling='max', input_shape=(img_height, img_width, 3))

#for layer in conv_base.layers[:-13]:
#    layer.trainable = False

In [0]:
model = models.Sequential()
model.add(conv_base)
model.add(layers.Dense(nb_categories, activation='softmax'))
model.summary()

## Augmentations: 

The only thing we need to change in our code is the definition of "train_datagen". Here we can define some data augmentation strategies, such as random rotation in the range [-10,10] degrees, a random zoom and width/height shift in the range +-10%, and changes in brightness in the range +-10%.

As an example of augmented images, we can save them to a specified folder "augm_images", as defined in "train_generator" below. This option is currently commented out using the # symbol in the code block, to avoid saving thousands of images, but some previous examples of augmented images are included in the folder.

In [0]:
train_datagen = ImageDataGenerator(
        rescale=1./255,
        rotation_range=10,
        zoom_range=0.1,
        width_shift_range=0.1,
        height_shift_range=0.1,
        horizontal_flip=False,
        brightness_range = (0.9,1.1),
        fill_mode='nearest'
        )

# this is a generator that will read pictures found in
# subfolders of 'data/train', and indefinitely generate
# batches of augmented image data

train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    # save_to_dir='augm_images',
    save_prefix='aug',
    save_format='jpg',
    class_mode="categorical")
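
To get a feeling for what "random augmentation" means, the sketch below draws a few random parameter sets in the same ranges as configured above. It is a simplified illustration in plain Python, not the Keras implementation: the function `sample_augmentation` is hypothetical, and it draws a single zoom factor where the real generator may treat the two image axes separately.

```python
import random

def sample_augmentation(rng):
    """Draw one random set of augmentation parameters in the ranges used above."""
    return {
        "rotation_deg": rng.uniform(-10, 10),     # rotation_range=10
        "zoom": rng.uniform(0.9, 1.1),            # zoom_range=0.1
        "width_shift": rng.uniform(-0.1, 0.1),    # fraction of image width
        "height_shift": rng.uniform(-0.1, 0.1),   # fraction of image height
        "brightness": rng.uniform(0.9, 1.1),      # brightness_range=(0.9, 1.1)
    }

rng = random.Random(0)
for _ in range(3):
    print(sample_augmentation(rng))
```

Each time an image is loaded during training, a new set of parameters is drawn, so the model rarely sees exactly the same picture twice.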

Some examples of augmented images: 

In [0]:
img_nr = 1
for subdir, dirs, files in os.walk('augm_images'):
    for file in files:
        img_file = subdir + '/' + file
        image = load_img(img_file,target_size=(img_height,img_width))
        plt.figure()
        plt.title('Augmented image nr: ' + str(img_nr))
        plt.imshow(image)
        img_nr = img_nr +1

## Train new model using augmented data

We are now ready to train the same model using additional augmented data, which should hopefully increase model performance. 

In [0]:
learning_rate = 5e-5
epochs = 20
checkpoint = ModelCheckpoint("sign_classifier_augm.h5", monitor='val_acc', verbose=1, save_best_only=True, save_weights_only=False, mode='auto', period=1)
model.compile(loss="categorical_crossentropy", optimizer=optimizers.Adam(lr=learning_rate, clipnorm=1.), metrics = ['acc'])


In [0]:
# Validate on the validation set; the test set is kept for the final evaluation
history = model.fit_generator(train_generator, 
                              epochs=epochs, 
                              shuffle=True, 
                              validation_data=val_generator,
                              callbacks=[checkpoint]
                              )

In [0]:
model = models.load_model("sign_classifier_augm.h5")

In [0]:
acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']

# Use a separate name so the "epochs" hyperparameter is not overwritten
epoch_range = range(1, len(acc) + 1)

plt.figure()
plt.plot(epoch_range, acc, 'b', label = 'Training accuracy')
plt.plot(epoch_range, val_acc, 'r', label='Validation accuracy')
plt.title('Training and validation accuracy')
plt.legend()
#plt.savefig('Accuracy_Augmented.jpg')

plt.figure()
plt.plot(epoch_range, loss, 'b', label = 'Training loss')
plt.plot(epoch_range, val_loss, 'r', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()
#plt.savefig('Loss_Augmented.jpg')

In [0]:
Y_pred = model.predict_generator(test_generator)
y_pred = np.argmax(Y_pred, axis=1)

cm_aug = confusion_matrix(test_generator.classes, y_pred)
plot_confusion_matrix(cm_aug, classes = category_names, title='Confusion Matrix', normalize=False, figname = 'Confusion_matrix_Augm.jpg')

In [0]:
accuracy = accuracy_score(test_generator.classes, y_pred)
print("Accuracy in test set: %0.1f%% " % (accuracy * 100))

## Evaluation of model accuracy

As seen from the above results for model accuracy, data augmentation indeed increased the accuracy of our model. In the current example, we obtained a final accuracy of approximately 99%. In addition to the total accuracy, by inspecting the confusion matrix above we can check which of the sign categories the model classifies incorrectly. Here, we notice that the model still misclassifies "Intersection" as "Yield" in a few cases, but significantly less often than the model without image augmentation.

Note: do not worry if you do not get exactly the same numbers when re-running the code! There is some inherent randomness in the model initialization etc. which could make the results differ slightly from time to time. 

## Plot a few images from the test set, and compare model prediction with ground truth

As a final visualization of model accuracy, we can plot a subset of the test images along with the corresponding model prediction. Do you agree with the classifications?

In [0]:
test_subset_data_dir = "data/test_subset"

test_subset_generator = test_datagen.flow_from_directory(
test_subset_data_dir,
batch_size = batch_size,
target_size = (img_height, img_width),
class_mode = "categorical",
shuffle=False)

In [0]:
Y_pred = model.predict_generator(test_subset_generator)
y_pred = np.argmax(Y_pred, axis=1)

# Iterate over the files in the generator's own order, so that each image is
# paired with its own prediction (os.walk does not guarantee the same order).
for img_nr, filename in enumerate(test_subset_generator.filenames):
    img_file = os.path.join(test_subset_data_dir, filename)
    image = load_img(img_file, target_size=(img_height, img_width))
    pred_label = category_names[y_pred[img_nr]]
    true_label = category_names[test_subset_generator.classes[img_nr]]
    plt.figure()
    plt.title('Predicted: ' + pred_label + '\n' + 'Actual:      ' + true_label)
    plt.imshow(image)

# Summary

If you managed to run through the entire tutorial using the included dataset, you have hopefully gotten a feeling for how deep learning and image recognition can be used to solve a real-world problem of traffic sign classification. 

Can you think of any interesting use cases where classifying images into different categories could be useful? Feel free to explore the model with either data from your company, or with images found through e.g. Google Image Search.

How many images do you need to reach a sufficient accuracy? And does it help to implement data augmentation? 