<a href="https://colab.research.google.com/github/robotics-upo/rva-course-material/blob/master/deeplearningbasics/learning_imageclassification.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Introduction

In this session we will learn to create and train a simple network for **image classification**.

We will be using the following tools:  TensorFlow and its Keras API. 

* **TensorFlow**: open source library for machine learning 
 * https://www.tensorflow.org/
 * https://www.tensorflow.org/guide

* **Keras**: high-level API for TensorFlow
 * https://keras.io/
 * https://www.tensorflow.org/api_docs/python/tf/keras 

TensorFlow comes preinstalled in the Colab environment.

# Loading data

A fundamental requirement for training networks for computer vision (CV) applications is having data to learn from.

The Keras API provides simple access to several publicly available datasets.

* https://keras.io/datasets/

To train this example, we will use data from the **CIFAR-10 dataset**:

* https://www.cs.toronto.edu/~kriz/cifar.html

With the next lines, we load the data and show one sample image from the dataset. OpenCV is available in Colab, but its GUI functions are not, so we will use the plotting facilities of **matplotlib**:

* https://matplotlib.org/

In [None]:
import tensorflow as tf
from tensorflow.keras.utils import to_categorical

#Load train dataset for CIFAR10
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()

#Check the shape of the data
print('Input shape for the train set', x_train.shape)
print('Label shape for the train set',y_train.shape)
print('Input shape for the test set',x_test.shape)
print('Label shape for the test set', y_test.shape)

#Show one sample image (image 100 from the test set)
#We can use OpenCV in Colab, but not its function imshow
#We use matplotlib instead

from matplotlib import pyplot as plt
#The images are RGB, so no colormap is needed
plt.imshow(x_test[100])


## Preparing the data
We will prepare the data for training. First, we normalize the pixel values from the range 0 to 255 to the range 0 to 1.

Next, the CIFAR-10 labels store the class of each image as a number between 0 and 9. We convert each label to a categorical (one-hot) vector with 10 elements, one per class, that has a 1 in the position of the image's class and 0 elsewhere.
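
As a minimal sketch of what this label conversion does, the NumPy snippet below builds the categorical vectors by hand. The helper name `one_hot` and the sample labels are illustrative assumptions; the notebook itself relies on `to_categorical` from Keras in the next cell.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (n_samples, num_classes) float32 matrix with a single 1 per row."""
    labels = np.asarray(labels).reshape(-1)         # CIFAR-10 labels have shape (n, 1)
    encoded = np.zeros((labels.size, num_classes), dtype="float32")
    encoded[np.arange(labels.size), labels] = 1.0   # set the column of each class
    return encoded

# Hypothetical labels with the same layout as y_train
sample_labels = np.array([[3], [0], [9]])
encoded = one_hot(sample_labels, 10)

print(encoded.shape)
print(encoded[0])   # the 1 is at index 3, the class of the first sample
```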

In [None]:
#Normalize data. Instead of pixel values between 0 and 255, we normalize them
#between 0 and 1
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255

#Print label for image 100 of the test set (the image shown above)
print('Label for Image 100:', y_test[100])

# Convert class vectors to binary class matrices.
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)
print('New shape for labels', y_train.shape)
print('Label for Image 100 as categorical vector', y_test[100])



# Creating our CNN

The next lines create a simple network with several convolutional and maxpool layers, followed by fully connected layers, stacked in a sequential way for classification.

* https://www.tensorflow.org/api_docs/python/tf/keras/Sequential
* https://www.tensorflow.org/guide/keras/sequential_model

This network is similar to the one described in the following link, but with fewer convolutional layers:

* https://poloclub.github.io/cnn-explainer/
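
To get a feeling for the size of such a network, the number of trainable parameters of each layer can be worked out by hand. The sketch below assumes the layer sizes used in the next cell (3x3 convolutions with 32 and 64 filters and `'same'` padding, 2x2 max pooling, a dense layer with 512 units and 10 outputs) applied to 32x32x3 CIFAR-10 images. The helper names are illustrative and not part of Keras.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    # One kernel_h x kernel_w x in_channels kernel plus one bias per filter
    return (kernel_h * kernel_w * in_channels + 1) * filters

def dense_params(inputs, units):
    # One weight per input plus one bias, for each unit
    return (inputs + 1) * units

height, width, channels = 32, 32, 3           # CIFAR-10 image size

conv1 = conv2d_params(3, 3, channels, 32)     # 'same' padding keeps 32x32
height, width = height // 2, width // 2       # 2x2 max pooling -> 16x16
conv2 = conv2d_params(3, 3, 32, 64)           # 'same' padding keeps 16x16
height, width = height // 2, width // 2       # 2x2 max pooling -> 8x8

flat = height * width * 64                    # Flatten -> 4096 values
dense1 = dense_params(flat, 512)
dense2 = dense_params(512, 10)

total = conv1 + conv2 + dense1 + dense2
print(conv1, conv2, dense1, dense2, total)
```

Pooling and flatten layers have no trainable parameters, so they only change the shape of the data. These figures can be compared with the `Param #` column printed by `model.summary()`.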


In [None]:
#Simple sequential model
model = tf.keras.Sequential()
#First layer is a convolutional layer with 32 channels (filters), followed
#by ReLU activations
model.add(tf.keras.layers.Conv2D(32, (3, 3), activation='relu', padding='same',
                 input_shape=x_train.shape[1:]))

#We add a MaxPooling layer, reducing the resolution
model.add(tf.keras.layers.MaxPooling2D(pool_size=(2, 2)))

#New convolutional layer with 64 channels (filters), followed by ReLU
model.add(tf.keras.layers.Conv2D(64, (3, 3), activation='relu', padding='same'))

#Again MaxPooling layer, reducing the resolution
model.add(tf.keras.layers.MaxPooling2D(pool_size=(2, 2)))

#Convert the output to a vector
model.add(tf.keras.layers.Flatten())

#Add a fully connected layer with ReLU activation
model.add(tf.keras.layers.Dense(512, activation='relu'))

#Output layer: fully connected. The number of neurons should match the number of
#classes 
#The output is converted to probabilities using softmax activation
model.add(tf.keras.layers.Dense(10, activation='softmax'))


#summary() prints the table itself and returns None, so no print() is needed
model.summary()


# Training

The next lines train the model using the CIFAR-10 data downloaded above.

The important aspects are the **loss function** employed and the **optimizer**. The function `compile` also accepts **metrics** to monitor the performance during training.


* https://keras.io/api/losses/
* https://keras.io/api/metrics/
* https://keras.io/api/optimizers/

More information:

* https://keras.io/api/models/model_training_apis/
* https://www.tensorflow.org/guide/keras/train_and_evaluate
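
As a rough illustration of what the loss measures, the NumPy sketch below computes the categorical cross-entropy for a single sample, which is the negative logarithm of the probability the network assigns to the true class. The function name and the example vectors are assumptions made for illustration; Keras computes this internally (averaged over the batch) when `loss="categorical_crossentropy"` is used.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # Clip to avoid log(0); y_true is a one-hot vector
    y_pred = np.clip(y_pred, eps, 1.0)
    return float(-np.sum(y_true * np.log(y_pred)))

y_true = np.array([0.0, 0.0, 1.0])            # the true class is index 2
confident = np.array([0.05, 0.05, 0.90])      # high probability on the true class
unsure = np.array([0.40, 0.40, 0.20])         # low probability on the true class

loss_confident = categorical_crossentropy(y_true, confident)
loss_unsure = categorical_crossentropy(y_true, unsure)
print(loss_confident, loss_unsure)
```

The loss is small when the true class receives a high probability and grows as that probability drops, which is what the optimizer tries to reduce.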


In [None]:
#Configure the training
model.compile(loss="categorical_crossentropy",
              optimizer="sgd",
              metrics = ['accuracy'])

#Train the model using the training set.
_ = model.fit(x_train, y_train, epochs=10, verbose = 1)

The previous process iterates over all the training data. In each epoch, several iterations of the optimizer are performed (in this case, SGD).
There are several parameters that can be controlled (if they are not set, they take their default values).

Two important ones are the following:

* **batch_size** (defaults to 32): In each epoch, the training data is divided into batches of this size. Each batch is used to estimate the gradient and perform one iteration (step) of the optimizer, and an epoch ends when all batches have been processed. In general, larger batches take longer per step, but there are fewer steps per epoch. Very small batches give noisy gradient estimates; very large batches give more accurate estimates but need more memory and may generalize worse.
 * https://machinelearningmastery.com/how-to-control-the-speed-and-stability-of-training-neural-networks-with-gradient-descent-batch-size/
* **validation_split** (defaults to 0): Fraction of the training data to be used as validation data. The model sets this fraction apart, does not train on it, and evaluates the loss and any model metrics on it at the end of each epoch. This is useful, for example, to detect overfitting.
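
The effect of both parameters on the number of optimizer steps per epoch can be estimated with a few lines of plain Python. This is a sketch with a hypothetical helper, `steps_per_epoch`, assuming the 50000 training images of CIFAR-10.

```python
import math

def steps_per_epoch(num_samples, batch_size=32, validation_split=0.0):
    # Samples held out for validation are not used for gradient updates
    num_train = int(num_samples * (1.0 - validation_split))
    # The last batch may be smaller than batch_size, hence the ceiling
    return math.ceil(num_train / batch_size)

default_steps = steps_per_epoch(50000)                                       # 1563
custom_steps = steps_per_epoch(50000, batch_size=64, validation_split=0.1)   # 704
print(default_steps, custom_steps)
```

These values correspond to the step counter shown in the progress bar of `fit` for each epoch.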

In [None]:
#Note: the model keeps the weights learned above, so this call continues training
_ = model.fit(x_train, y_train, batch_size=64, validation_split=0.1, epochs=10, verbose=1)

# Evaluation
Let's check the accuracy for the test set.
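
For reference, the accuracy metric is the fraction of samples whose highest-probability class matches the label. The NumPy sketch below reproduces that computation on a few made-up predictions; the function name and the data are illustrative assumptions, not the output of the model.

```python
import numpy as np

def accuracy(y_true_onehot, y_prob):
    # Predicted class: index of the largest probability in each row
    predicted = np.argmax(y_prob, axis=1)
    # True class: position of the 1 in each one-hot label
    expected = np.argmax(y_true_onehot, axis=1)
    return float(np.mean(predicted == expected))

y_true = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]])
y_prob = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1],
                   [0.3, 0.4, 0.3],    # wrong: predicts class 1, the label is class 2
                   [0.2, 0.5, 0.3]])

acc = accuracy(y_true, y_prob)
print(acc)   # 3 of 4 samples are correct: 0.75
```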

In [None]:

test_loss , test_acc = model.evaluate(x_test, y_test)

print('Loss:', test_loss)
print('Accuracy (%):', test_acc*100)

You can improve the accuracy by adapting the learning parameters (training the model for longer, etc.), adding more data (we are already using all the training images from CIFAR-10) and changing the network architecture (complexity, layers, etc.).

You can check the accuracy currently achieved on this benchmark at:

https://benchmarks.ai/cifar-10

# Homework #1

Modify the network (changing the architecture) and training parameters to achieve **at least an accuracy of 75%** in the **test set**.

Consider training for more epochs and adding layers. Read about **dropout** layers and include them to reduce overfitting:
 * https://keras.io/api/layers/regularization_layers/dropout/
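
To illustrate the idea behind dropout, the NumPy sketch below zeroes a random fraction of the activations and scales the remaining ones by `1 / (1 - rate)`, which is the behaviour documented for the Keras `Dropout` layer during training (at inference time the layer leaves its input unchanged). The function name, the seed and the input values are assumptions for illustration; this is not the Keras implementation, and where to place such layers in the network is left as part of the exercise.

```python
import numpy as np

def dropout(activations, rate, rng):
    # Keep each activation with probability (1 - rate)
    keep_mask = rng.random(activations.shape) >= rate
    # Scale the kept activations so the expected sum stays the same
    return activations * keep_mask / (1.0 - rate)

rng = np.random.default_rng(0)                   # fixed seed, only for repeatability
activations = np.ones((4, 8), dtype="float32")   # hypothetical layer output
dropped = dropout(activations, rate=0.5, rng=rng)

print(dropped)   # every entry is either 0 (dropped) or 2 (kept and scaled)
```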

# Inference

Once our network is trained, it can be used to infer the category of an image.
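
The output of the network for one image is a vector with 10 probabilities, one per class, and the predicted category is the one with the largest value. As a small sketch, the hypothetical helper `top_k` below ranks the most likely classes of such a vector. The probabilities are made up for illustration, and the class names follow the CIFAR-10 order used in the next cell.

```python
import numpy as np

cifar10_labels = ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog',
                  'horse', 'ship', 'truck']

def top_k(probabilities, labels, k=3):
    # Indices sorted from the most to the least probable class
    order = np.argsort(probabilities)[::-1][:k]
    return [(labels[i], float(probabilities[i])) for i in order]

# Made-up output vector, standing in for one row of model.predict(...)
fake_prediction = np.array([0.02, 0.01, 0.10, 0.55, 0.05,
                            0.20, 0.03, 0.02, 0.01, 0.01])

ranking = top_k(fake_prediction, cifar10_labels)
for name, probability in ranking:
    print(name, probability)
```

Looking at the runner-up classes can help to see which categories the network tends to confuse.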

In [None]:
import numpy as np

cifar10_labels = ['airplane', 'automobile', 'bird','cat','deer','dog','frog',
'horse','ship', 'truck']

prediction = model.predict(x_test)

plt.figure()
plt.imshow(x_test[100])

print(prediction[100])
print("Model prediction (class index): ", np.argmax(prediction[100]) )
print("Model prediction (class name): ", cifar10_labels[np.argmax(prediction[100])] )

plt.figure()
plt.imshow(x_test[350])

print(prediction[350])
print("Model prediction (class index): ", np.argmax(prediction[350]) )
print("Model prediction (class name): ", cifar10_labels[np.argmax(prediction[350])] )


# Storing the network

We will store the trained network for future use. In this example, we store it in Google Drive, but you can use the same functions to save the model locally.

Google Drive is mounted at /content/drive in the Colab virtual machine, and the contents of My Drive appear under /content/drive/My Drive.

In [None]:
from google.colab import drive
drive.mount('/content/drive')



We are going to save the whole model, that is, the architecture together with the trained weights. There are other options, such as saving only the weights:

https://www.tensorflow.org/guide/keras/save_and_serialize?hl=es



In [None]:
model.save('/content/drive/My Drive/colabfiles/classif.h5')

Finally, we list the saved files and unmount the drive.

As with mounting, these lines are not needed if you run this code locally.

We will learn how to use the stored model in the next sessions.

In [None]:
%ls '/content/drive/My Drive/colabfiles/'
drive.flush_and_unmount()

# Tutorials

* Classification: https://www.tensorflow.org/tutorials/keras/classification
