# **Convolutional neural networks tutorial**
In today's tutorial we will design and train two well-known Convolutional Neural Networks (*LeNet-5* and *AlexNet*) for image classification.

We will use the [**TensorFlow**](https://ekababisong.org/gcp-ml-seminar/tensorflow/) framework and the [**Keras**](https://keras.io/) open-source library to rapidly prototype Deep Neural Networks.

# **Useful modules import**
First of all, it is necessary to import useful modules used during the tutorial. 

In [None]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import matplotlib.pyplot as plt
import random
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix
import numpy as np
import math

# **Utility functions**
Execute the following code to define some utility functions used in the tutorial:
- **plot_history** plots the loss over the epochs on both the training and validation sets. If a metric is provided, its trend is drawn in the same graph;
- **show_confusion_matrix** visualizes a 2D confusion matrix as a color-coded image.

In [None]:
def plot_history(history,metric=None):
  fig, ax1 = plt.subplots(figsize=(10, 8))

  epoch_count=len(history.history['loss'])

  line1,=ax1.plot(range(1,epoch_count+1),history.history['loss'],label='train_loss',color='orange')
  ax1.plot(range(1,epoch_count+1),history.history['val_loss'],label='val_loss',color = line1.get_color(), linestyle = '--')
  ax1.set_xlim([1,epoch_count])
  ax1.set_ylim([0, max(max(history.history['loss']),max(history.history['val_loss']))])
  ax1.set_ylabel('loss',color = line1.get_color())
  ax1.tick_params(axis='y', labelcolor=line1.get_color())
  ax1.set_xlabel('Epochs')
  _=ax1.legend(loc='lower left')

  if metric is not None:
    ax2 = ax1.twinx()
    line2,=ax2.plot(range(1,epoch_count+1),history.history[metric],label='train_'+metric)
    ax2.plot(range(1,epoch_count+1),history.history['val_'+metric],label='val_'+metric,color = line2.get_color(), linestyle = '--')
    ax2.set_ylim([0, max(max(history.history[metric]),max(history.history['val_'+metric]))])
    ax2.set_ylabel(metric,color=line2.get_color())
    ax2.tick_params(axis='y', labelcolor=line2.get_color())
    _=ax2.legend(loc='upper right')

def show_confusion_matrix(conf_matrix,class_names,figsize=(10,10)):
  fig, ax = plt.subplots(figsize=figsize)
  img=ax.matshow(conf_matrix)
  tick_marks = np.arange(len(class_names))
  _=plt.xticks(tick_marks, class_names,rotation=45)
  _=plt.yticks(tick_marks, class_names)
  _=plt.ylabel('Real')
  _=plt.xlabel('Predicted')
  
  for i in range(len(class_names)):
    for j in range(len(class_names)):
        text = ax.text(j, i, '{0:.1%}'.format(conf_matrix[i, j]),
                       ha='center', va='center', color='w')

# **Datasets**
The [**tf.keras.datasets**](https://keras.io/api/datasets/) module includes four simple image datasets useful to test CNNs:
- [**digits MNIST**](http://yann.lecun.com/exdb/mnist/) - a dataset of 28x28 grayscale images of the 10 digits;
- [**fashion MNIST**](https://github.com/zalandoresearch/fashion-mnist) - a dataset of 28x28 grayscale images of 10 fashion categories;
- [**CIFAR10**](https://www.cs.toronto.edu/~kriz/cifar.html) - a dataset of 32x32 RGB images labeled over 10 categories;
- [**CIFAR100**](https://www.cs.toronto.edu/~kriz/cifar.html) - a dataset of 32x32 RGB images labeled over 100 classes.

The following code loads in memory the selected dataset.


In [None]:
dataset='mnist' #   'mnist' 'fashion_mnist' 'cifar10'   'cifar100'

if dataset=='mnist':
    (data_train_x,data_train_y), (data_test_x,data_test_y) = keras.datasets.mnist.load_data()
    class_names=range(10)
elif dataset=='fashion_mnist':
    (data_train_x,data_train_y), (data_test_x,data_test_y) = keras.datasets.fashion_mnist.load_data()
    class_names=('T-shirt/top','Trouser','Pullover','Dress','Coat','Sandal','Shirt','Sneaker','Bag','Ankle boot')
elif dataset=='cifar10':
    (data_train_x,data_train_y), (data_test_x,data_test_y) = keras.datasets.cifar10.load_data()
    class_names=('airplane','automobile','bird','cat','deer','dog','frog','horse','ship','truck')
    data_train_y=data_train_y.squeeze()
    data_test_y=data_test_y.squeeze()
elif dataset=='cifar100':
    (data_train_x,data_train_y), (data_test_x,data_test_y) = keras.datasets.cifar100.load_data()
    class_names=('apple', 'aquarium_fish', 'baby', 'bear', 'beaver', 'bed', 'bee', 'beetle','bicycle', 'bottle', 'bowl', 'boy', 'bridge', 'bus', 'butterfly', 'camel','can', 'castle', 'caterpillar', 'cattle', 'chair', 'chimpanzee', 'clock','cloud', 'cockroach', 'couch', 'crab', 'crocodile', 'cup', 'dinosaur','dolphin', 'elephant', 'flatfish', 'forest', 'fox', 'girl', 'hamster','house', 'kangaroo', 'keyboard', 'lamp', 'lawn_mower', 'leopard', 'lion','lizard', 'lobster', 'man', 'maple_tree', 'motorcycle', 'mountain', 'mouse','mushroom', 'oak_tree', 'orange', 'orchid', 'otter', 'palm_tree', 'pear','pickup_truck', 'pine_tree', 'plain', 'plate', 'poppy', 'porcupine','possum', 'rabbit', 'raccoon', 'ray', 'road', 'rocket', 'rose','sea', 'seal', 'shark', 'shrew', 'skunk', 'skyscraper', 'snail', 'snake','spider', 'squirrel', 'streetcar', 'sunflower', 'sweet_pepper', 'table','tank', 'telephone', 'television', 'tiger', 'tractor', 'train', 'trout','tulip', 'turtle', 'wardrobe', 'whale', 'willow_tree', 'wolf', 'woman','worm')
    data_train_y=data_train_y.squeeze()
    data_test_y=data_test_y.squeeze()

class_count=len(class_names)

print('Train image shape: ',data_train_x.shape)
print('Train label shape: ',data_train_y.shape)
print('Test image shape: ',data_test_x.shape)
print('Test label shape: ',data_test_y.shape)
print('Number of classes: ',class_count)

## **Visualization**
Some randomly selected images can be shown by executing the following code.

In [None]:
image_count=10

_, axs = plt.subplots(1, image_count,figsize=(15, 10))
for i in range(image_count):
  # random.randint includes both endpoints, so the upper bound must be the last valid index
  random_idx=random.randint(0,data_train_x.shape[0]-1)
  axs[i].imshow(data_train_x[random_idx],cmap='gray')
  axs[i].axis('off')
  axs[i].set_title(class_names[data_train_y[random_idx]])

## **Split data into training and validation sets**
In order to avoid overfitting during training, it is necessary to have a separate dataset (called validation set), in addition to the training and test datasets, to choose the optimal value for the hyperparameters. 

For this reason, *data_train_x* is divided into two subsets: training and validation sets. 

Scikit-learn library provides the function [**train_test_split**](http://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html) to separate a dataset into two parts.

The *val_size* variable represents the fraction (a float between 0 and 1) or the absolute number (an integer) of patterns to include in the validation set.

By default, **train_test_split** shuffles the patterns before splitting, which makes it unlikely that the returned datasets contain patterns belonging only to a subset of the classes.
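As a small illustration of the two ways of specifying the size, the sketch below splits a toy dataset (hypothetical data, not part of the tutorial). It also uses the *stratify* parameter, which the tutorial does not use, to show how the class proportions could be preserved explicitly rather than left to random shuffling.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 20 samples, 2 balanced classes
X = np.arange(20).reshape(20, 1)
y = np.array([0, 1] * 10)

# Float test_size: fraction of the samples placed in the second subset
_, X_frac, _, _ = train_test_split(X, y, test_size=0.25, random_state=42)

# Integer test_size: absolute number of samples in the second subset.
# stratify=y (an assumption, not used in the tutorial) preserves class proportions.
_, X_abs, _, y_abs = train_test_split(X, y, test_size=4, random_state=42, stratify=y)

print(len(X_frac))          # 5 samples (25% of 20)
print(len(X_abs))           # 4 samples
print(np.bincount(y_abs))   # class counts in the stratified subset
```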

In [None]:
val_size=10000

train_x, val_x, train_y, val_y = train_test_split(data_train_x, data_train_y, test_size=val_size, random_state=42,shuffle=True)
train_x=np.array(train_x)
val_x=np.array(val_x)

test_x=data_test_x
test_y=data_test_y

print('Train shape: ',train_x.shape)
print('Validation shape: ',val_x.shape)
print('Test shape: ',test_x.shape)

## **Preprocessing**
The acquired data are usually messy and come from different sources. To feed them into a ML model, they need to be standardized and cleaned up. For example, fully connected layers in CNNs require that all images are of the same size.

Image preprocessing consists of the steps taken to format images before they are used for model training and inference.

### **Image shape**
In case of grayscale images, it is necessary to add a new unit axis to explicitly represent single channel images.

By executing the following code, the shape of each image is updated from HxW to HxWx1.

In [None]:
if (len(train_x.shape)==3):
  train_x=np.expand_dims(train_x,axis=3)
  val_x=np.expand_dims(val_x,axis=3)
  test_x=np.expand_dims(test_x,axis=3)
  print('Train shape: ',train_x.shape)
  print('Validation shape: ',val_x.shape)
  print('Test shape: ',test_x.shape)

### **Intensity range normalization**
Pixel intensity is usually represented as discrete values in the range [0;255]. 

In [None]:
print('Min value: ',train_x.min())
print('Max value: ',train_x.max())

Such large values can saturate activation functions such as tanh or make training unstable. To overcome these issues, a simple normalization step can be applied by dividing all values by 255 to get continuous values in the range [0;1].
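One way to see the problem is to look at the derivative of tanh, the activation used in *LeNet-5*. The sketch below is a simplification that assumes a pixel value is fed directly to the activation with a unit weight: for raw intensities the derivative is practically zero, so almost no gradient would flow, while for normalized intensities it stays well above zero.

```python
import numpy as np

raw = np.array([0.0, 128.0, 255.0])   # pixel intensities in [0;255]
scaled = raw / 255                    # the same pixels in [0;1]

def tanh_grad(x):
    # Derivative of tanh: 1 - tanh(x)^2
    return 1.0 - np.tanh(x) ** 2

grad_raw = tanh_grad(raw)
grad_scaled = tanh_grad(scaled)
print('Gradient on raw values:   ', grad_raw)
print('Gradient on scaled values:', grad_scaled)
```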

In [None]:
train_x=train_x/255
val_x=val_x/255
test_x=test_x/255
print('Min value: ',train_x.min())
print('Max value: ',train_x.max())

### **Spatial size**
*LeNet-5* was originally designed to receive a 32x32 image as input. Although the input shape can be changed when creating the model, we prefer to keep the original input shape by adding a black (zero-valued) border to smaller images.

In [None]:
if train_x.shape[1]<32 or train_x.shape[2]<32:
  # Axis 1 is the image height and axis 2 is the image width
  pad_h=max(0,32-train_x.shape[1])//2
  pad_w=max(0,32-train_x.shape[2])//2
  padding=((0,0),(pad_h,pad_h),(pad_w,pad_w),(0,0))
  train_x=np.pad(train_x,padding,'constant',constant_values=0)
  val_x=np.pad(val_x,padding,'constant',constant_values=0)
  test_x=np.pad(test_x,padding,'constant',constant_values=0)
  print('Train shape: ',train_x.shape)
  print('Validation shape: ',val_x.shape)
  print('Test shape: ',test_x.shape)

# **LeNet-5**
*LeNet-5* is a CNN introduced to recognize handwritten digits in images.

It consists of:
- three **convolutional** layers (C1, C3 and C5);
- two **average pooling** layers (S2 and S4);
- two **fully-connected** layers (F6 and Output).

![alt text](https://biolab.csr.unibo.it/ferrara/Courses/DL/Tutorials/CNN/LeNet5.png)
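The spatial size of the feature maps can be followed through the network with a few lines of arithmetic. The sketch below assumes the classic configuration (5x5 convolutions with stride 1 and no padding, 2x2 pooling with stride 2) and a 32x32 input; the helper *output_size* is a hypothetical name introduced only for this illustration.

```python
def output_size(input_size, kernel_size, stride):
    # Output side of a convolution or pooling window applied without padding
    return (input_size - kernel_size) // stride + 1

# (layer name, kernel size, stride)
lenet5_layers = [('C1', 5, 1), ('S2', 2, 2), ('C3', 5, 1), ('S4', 2, 2), ('C5', 5, 1)]

size = 32
sizes = []
for name, kernel_size, stride in lenet5_layers:
    size = output_size(size, kernel_size, stride)
    sizes.append(size)
    print('{}: {}x{}'.format(name, size, size))
```

Under these assumptions C5 reduces each feature map to a single value, which is why it behaves like a fully-connected layer on a 32x32 input.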

## **Model definition**
The following function creates a *LeNet-5* model given:
- the shape of the input images (*input_shape*);
- the number of output classes (*output_class_count*).

In Keras, a Sequential model is a stack of layers where each layer has exactly one input and one output. It can be created by passing a list of layers to the [**keras.Sequential**](https://keras.io/guides/sequential_model/) constructor.

[**Keras layers API**](https://keras.io/api/layers/) offers a wide range of built-in layers ready for use, including:
- [**Input**](https://keras.io/api/layers/core_layers/input/) - the input of the model. Note that the Input layer can be omitted; in that case the model has no weights until the first call to a training/evaluation method (since it is not yet built);
- [**Conv2D**](https://keras.io/api/layers/convolution_layers/convolution2d/) - a 2D convolution layer;
- [**AvgPool2D**](https://keras.io/api/layers/pooling_layers/average_pooling2d/) - a 2D average pooling layer;
- [**Flatten**](https://keras.io/api/layers/reshaping_layers/flatten/) - a simple layer used to flatten the input;
- [**Dense**](https://keras.io/api/layers/core_layers/dense/) - a fully-connected layer.

Using these layers, we are able to define a LeNet-5 model.

In [None]:
def build_lenet5(input_shape=(32, 32, 1),output_class_count=10):
    model=keras.Sequential(
            [
                layers.Input(shape=input_shape,name='Input'),
                layers.Conv2D(filters=6, kernel_size=5, strides=1,activation='tanh',padding='valid',name='C1'),
                layers.AvgPool2D(pool_size=2, strides=2,name='S2'),
                layers.Conv2D(filters=16, kernel_size=5,strides=1,activation='tanh',padding='valid',name='C3'),
                layers.AvgPool2D(pool_size=2, strides=2,name='S4'),
                layers.Conv2D(filters=120, kernel_size=5,strides=1,activation='tanh',padding='valid',name='C5'),
                layers.Flatten(),
                layers.Dense(84, activation='tanh',name='F6'),
                layers.Dense(units=output_class_count,activation='softmax',name='Output')
            ]
        )
    return model
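As a sanity check on the definition above, the number of trainable parameters of each layer can be computed by hand. The sketch assumes the default arguments (a 32x32x1 input and 10 output classes); the two helper functions are hypothetical names introduced for this illustration. Pooling and flatten layers have no trainable parameters.

```python
def conv_params(kernel_size, in_channels, filters):
    # One kernel_size x kernel_size x in_channels kernel plus one bias per filter
    return (kernel_size * kernel_size * in_channels + 1) * filters

def dense_params(in_units, out_units):
    # One weight per input plus one bias for each output unit
    return (in_units + 1) * out_units

param_counts = {
    'C1': conv_params(5, 1, 6),
    'C3': conv_params(5, 6, 16),
    'C5': conv_params(5, 16, 120),
    'F6': dense_params(120, 84),       # C5 outputs 120 values on a 32x32 input
    'Output': dense_params(84, 10),
}
total_params = sum(param_counts.values())
print(param_counts)
print('Total:', total_params)
```

The per-layer values can be compared with the ones reported by the model summary.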

## **Model creation**
The following code creates a *LeNet-5* model by calling the **build_lenet5** function defined above.

In [None]:
model=build_lenet5(train_x[0].shape,class_count)

## **Model visualization**
A string summary of the network can be printed using the [**summary**](https://keras.io/api/models/model/#summary-method) method.

In [None]:
model.summary()

The summary is useful for simple models, but can be confusing for complex models.

Function [**keras.utils.plot_model**](https://keras.io/api/utils/model_plotting_utils/) creates a plot of the neural network graph that can make more complex models easier to understand.

In [None]:
keras.utils.plot_model(model,show_shapes=True)

## **Model compilation**
Compilation is the final step in configuring the model for training. Keras models provide the [**compile**](https://keras.io/api/models/model_training_apis/#compile-method) method for this purpose.
The important arguments are:
- the optimization algorithm (*optimizer*);
- the loss function (*loss*);
- the metrics used to evaluate the performance of the model (*metrics*).

The most common [optimization algorithms](https://keras.io/api/optimizers/#available-optimizers), [loss functions](https://keras.io/api/losses/#available-losses) and [metrics](https://keras.io/api/metrics/#available-metrics) are already available in Keras. You can either pass them to **compile** as an instance or by the corresponding string identifier. In the latter case, the default parameters will be used.
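The difference between the two options can be sketched as follows. The learning rate and momentum values are illustrative assumptions, not tuned settings.

```python
from tensorflow import keras

# String identifier: the optimizer is created with its default parameters
optimizer_by_name = keras.optimizers.get('sgd')

# Instance: parameters such as the learning rate can be chosen explicitly
# (the values below are illustrative, not tuned)
optimizer_instance = keras.optimizers.SGD(learning_rate=0.05, momentum=0.9)

print(type(optimizer_by_name).__name__)
print(optimizer_instance.get_config()['learning_rate'])
```

Passing an instance is the option to prefer when the optimizer parameters are among the hyperparameters being explored.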


In [None]:
optimizer=keras.optimizers.SGD()

model.compile(optimizer=optimizer, loss='sparse_categorical_crossentropy', metrics=['accuracy'])

## **Training**
Now we are ready to train our model by calling the [**fit**](https://keras.io/api/models/model_training_apis/#fit-method) method. It trains the model for a fixed number of epochs (*epoch_count*) using the training set (*train_x* and *train_y*) divided into mini-batches of *batch_size* elements. During the training process, the performance is evaluated on both the training and validation (*val_x* and *val_y*) sets.
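To get a feeling for what these two parameters imply, the number of weight updates can be computed directly. The sketch assumes one update per mini-batch and the 50000 training images left after reserving 10000 of the 60000 MNIST training images for validation.

```python
import math

train_sample_count = 50000   # 60000 MNIST training images minus 10000 for validation
batch_size = 250
epoch_count = 10

# The last mini-batch of an epoch may be smaller, hence the ceiling
steps_per_epoch = math.ceil(train_sample_count / batch_size)
total_updates = steps_per_epoch * epoch_count
print(steps_per_epoch)   # 200
print(total_updates)     # 2000
```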

In [None]:
batch_size=250
epoch_count=10

history=model.fit(train_x,train_y,batch_size=batch_size,epochs=epoch_count,validation_data=(val_x,val_y))

### **Visualize the training process**
We can learn a lot about our model by observing the graph of its performance over time during training.

The **fit** method returns an object containing loss and metrics values at successive epochs for both training and validation sets.

The following code draws in a graph the loss and accuracy trend over epochs on both training and validation sets.

In [None]:
plot_history(history,metric='accuracy')
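The values stored by **fit** can also be inspected directly, for example to find the epoch with the lowest validation loss. The sketch below uses a dictionary with made-up numbers that only mimics the structure of *history.history*; it does not report real results.

```python
import numpy as np

# Hypothetical values, shaped like the history.history dictionary
history_dict = {
    'loss':     [0.90, 0.50, 0.35, 0.28, 0.22],
    'val_loss': [0.80, 0.48, 0.40, 0.41, 0.45],
}

# Epochs are counted from 1, list indices from 0
best_epoch = int(np.argmin(history_dict['val_loss'])) + 1
print('Lowest validation loss at epoch', best_epoch)
```

In this made-up example the training loss keeps decreasing after epoch 3 while the validation loss grows, which is the typical sign of overfitting.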

## **Save and load the entire model**
Once the model has been trained, it is a good idea to save it for later use without repeating the entire training phase again.

The [**save**](https://keras.io/api/models/model_saving_apis/#save-method) method (or alternatively the [**save_model**](https://keras.io/api/models/model_saving_apis/#savemodel-function) function) saves the entire model given the output file path.

The resulting file will include:
- the model's architecture;
- the model's weights;
- the compilation information (if **compile** was called);
- the optimizer and its state, if any (this enables you to restart training where you left off).

In [None]:
model.save('LeNet5.h5')  

To load a model previously saved, the [**load_model**](https://keras.io/api/models/model_saving_apis/#loadmodel-function) function can be used.

In [None]:
model=keras.models.load_model('LeNet5.h5')

The model is re-instantiated in exactly the same state, without needing any of the code for model definition or compilation.

## **Save and load model weights**
Sometimes it can be useful to save only the model's weights:
- if you only need the model for inference;
- if you are doing transfer learning. 

In such cases, the [**save_weights**](https://keras.io/api/models/model_saving_apis/#saveweights-method) method can be used to save all layer weights to the given file path.

In [None]:
model.save_weights('LeNet5_weights.h5')  

To load weights previously saved, the [**load_weights**](https://keras.io/api/models/model_saving_apis/#loadweights-method) method can be used. 

<u>Note that:</u>
- a model with the same architecture needs to be created in advance;
- if you want to use the model for training, it must be compiled, because no information about the optimizer, the loss function and the metrics is stored in the weight file.

In [None]:
model=build_lenet5(train_x[0].shape,output_class_count=class_count)
model.load_weights('LeNet5_weights.h5')

## **Manual training**
To have more control over the training phase, it is possible to call the **fit** method with the *epochs* parameter set to 1.

In this case, the performance evaluation needs to be manually executed using the [**evaluate**](https://keras.io/api/models/model_training_apis/#evaluate-method) method. It returns the loss and metrics values on the dataset passed as input.

In this manner, we will be able to save the model not only at the end of the training phase but also during the training itself.

<u>Be sure to create and compile a new model before executing the following code otherwise the training will be performed on a model already trained.</u>

In [None]:
history_train_metrics=[]
history_valid_metrics=[]

for epoch in range(epoch_count):
    print('Epoch {}/{}'.format(epoch+1,epoch_count), end = '')
    
    model.fit(train_x, train_y, batch_size=batch_size, epochs=1, verbose = 0)

    train_metrics = model.evaluate(train_x, train_y, verbose = 0)
    valid_metrics = model.evaluate(val_x, val_y, verbose = 0)
    history_train_metrics.append(train_metrics)
    history_valid_metrics.append(valid_metrics)
    print('\tTRAIN', end = '')
    for i in range(len(model.metrics_names)):
      print(' {}={:.4f}'.format(model.metrics_names[i],train_metrics[i]), end = '')
    print(' VAL', end = '')
    for i in range(len(model.metrics_names)):
      print(' {}={:.4f}'.format(model.metrics_names[i],valid_metrics[i]), end = '')
    print()
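A possible use of this manual loop is to keep the weights of the epoch with the lowest validation loss. The sketch below is self-contained: it uses a tiny stand-in model and random data (both hypothetical) instead of *LeNet-5*, and keeps the best weights in memory with **get_weights** rather than writing them to disk.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Tiny stand-in model and random data, only to keep the sketch self-contained
rng = np.random.default_rng(0)
x_train, y_train = rng.random((64, 8)), rng.integers(0, 2, 64)
x_val, y_val = rng.random((32, 8)), rng.integers(0, 2, 32)

toy_model = keras.Sequential([layers.Input(shape=(8,)), layers.Dense(2, activation='softmax')])
toy_model.compile(optimizer='sgd', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

best_val_loss = float('inf')
best_weights = None
val_losses = []
for epoch in range(3):
    toy_model.fit(x_train, y_train, batch_size=16, epochs=1, verbose=0)
    val_loss = toy_model.evaluate(x_val, y_val, verbose=0)[0]
    val_losses.append(val_loss)
    if val_loss < best_val_loss:
        # Keep a copy of the weights of the best epoch so far
        best_val_loss = val_loss
        best_weights = toy_model.get_weights()

# Restore the best weights at the end of training
toy_model.set_weights(best_weights)
```

In the tutorial setting, the same condition could trigger a call to **save** instead, so that the best model is stored on disk.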

## **Performance evaluation on the test set**
The performance on the test set can be easily measured by calling the **evaluate** method.

In [None]:
results = model.evaluate(test_x, test_y, batch_size=batch_size,verbose=0)
print('Loss: {:.3f} Accuracy: {:.3f}'.format(results[0],results[1]))

### **Confusion matrix**
To analyze the classification results in more detail, it can be useful to compute the [confusion matrix](https://en.wikipedia.org/wiki/Confusion_matrix).

The following code calls the [**predict**](https://keras.io/api/models/model_training_apis/#predict-method) method to generate output predictions (*test_conf_pred*) for the test set (*test_x*). The output predictions contain, for each evaluated image, the probability values that the image belongs to each of the classes of the problem.

In [None]:
test_conf_pred=model.predict(test_x)
print('Output predictions shape: ',test_conf_pred.shape)

The predicted class of each evaluated image (*test_y_pred*) can be obtained by selecting the class with the highest probability.

In [None]:
test_y_pred=np.argmax(test_conf_pred,axis=1)
print('Class predictions shape: ',test_y_pred.shape)
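Selecting the most probable class is an arg-max over the class axis. The sketch below shows it on a small matrix of hypothetical probabilities, together with the selection of the two most probable classes per image.

```python
import numpy as np

# Hypothetical softmax outputs for 3 images and 4 classes
probs = np.array([[0.10, 0.70, 0.15, 0.05],
                  [0.30, 0.20, 0.40, 0.10],
                  [0.05, 0.05, 0.10, 0.80]])

# Most probable class of each image
pred = np.argmax(probs, axis=1)

# Two most probable classes of each image, best first
top2 = np.argsort(probs, axis=1)[:, ::-1][:, :2]

print(pred)   # [1 2 3]
print(top2)
```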

The evaluated images correctly classified can be identified by comparing the predicted classes (*test_y_pred*) with respect to the ground truth (*test_y*).

In [None]:
correct = np.equal(test_y_pred,test_y)
accuracy=correct.sum()/len(correct)
print('Test set accuracy: {:.3f}'.format(accuracy))

The Scikit-learn library provides the function [**confusion_matrix**](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.confusion_matrix.html) to compute the confusion matrix given the ground truth (*test_y*) and the predicted classes (*test_y_pred*) as input.

In [None]:
conf_matrix=confusion_matrix(test_y, test_y_pred, normalize='true')
print(conf_matrix)
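With *normalize='true'* each row of the matrix is divided by the number of samples of the corresponding real class, so every row sums to 1. The sketch below reproduces this computation with NumPy on a few hypothetical labels.

```python
import numpy as np

# Hypothetical ground truth and predictions for a 3-class problem
y_true = np.array([0, 0, 1, 1, 1, 2])
y_pred = np.array([0, 1, 1, 1, 2, 2])
n_classes = 3

# counts[i, j] = number of samples of real class i predicted as class j
counts = np.zeros((n_classes, n_classes))
np.add.at(counts, (y_true, y_pred), 1)

# Row normalization: each row is divided by the number of samples of that real class
row_normalized = counts / counts.sum(axis=1, keepdims=True)
print(row_normalized)
```

The diagonal of the normalized matrix contains the fraction of correctly classified samples of each class.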

The following code visualizes the 2D confusion matrix as a color-coded image.

In [None]:
show_confusion_matrix(conf_matrix,class_names)

### **Visualization of misclassified images**
To better understand a model's limits and to try to overcome them, it is important to analyze its errors.

The following code shows randomly selected images erroneously classified by the model.
The ground truth is reported on top of each image while on the right side the most probable classes returned by the model are shown.

In [None]:
images_to_show=12

error_indices = np.where(~correct)[0]

if error_indices.shape[0] > 0:
  image_per_row = 4
  top_class_count = 3

  # Pick distinct misclassified images; random.sample never returns an out-of-range or repeated index
  selected_count=min(images_to_show,error_indices.shape[0])
  selected_indices=random.sample(range(error_indices.shape[0]),selected_count)
  error_indices=error_indices[selected_indices]

  row_count=math.ceil(len(error_indices)/image_per_row)
  column_count=image_per_row
  plt.rcParams.update({'font.size': 12})
  _, axs = plt.subplots(row_count, column_count,figsize=(25, 4*row_count),squeeze=False)

  for i in range(row_count):
    for j in range(column_count):
      axs[i,j].axis('off')

  for i in range(len(error_indices)):
    q = i // image_per_row
    r = i % image_per_row
    idx = error_indices[i]

    axs[q,r].imshow(test_x[idx].squeeze(),cmap='gray')
    axs[q,r].set_title(class_names[test_y[idx]])

    sorted_conf_indices=np.argsort(test_conf_pred[idx])
    best_indices=sorted_conf_indices[-top_class_count:]
        
    text=''
    for j in range(len(best_indices)-1,-1,-1):
        text+='{}: {:.3f}\n'.format(class_names[best_indices[j]],test_conf_pred[idx][best_indices[j]])

    axs[q,r].text(35, 10, text, horizontalalignment='left', verticalalignment='center')
plt.show()

# **Exercise 1**
Train *LeNet-5* to classify **digits MNIST** images:
1. execute the training process multiple times to find the optimal hyperparameters;
2. save the best model;
3. compute the accuracy on the test set.

It is recommended to evaluate the following hyperparameters (listed in priority order):
1. the number of training epochs (*epoch_count*);
2. the optimization algorithm (*optimizer*);
3. the parameters of the optimization algorithm (e.g., learning rate);
4. the mini-batch size (*batch_size*).

# **Exercise 2**
Repeat *Exercise 1* on the other datasets:
- **fashion MNIST**;
- **CIFAR10**;
- **CIFAR100**.

# **Exercise 3**
Train *AlexNet* to classify **CIFAR10** and **CIFAR100** images:
1. load and prepare the dataset;
2. define the *AlexNet* model implementing the **build_alexnet** function;
3. execute the training process multiple times to find the optimal hyperparameters;
4. save the best model;
5. compute the accuracy on the test set. 

## **Dataset**
The following code loads in memory the selected dataset and creates the validation set by randomly selecting *val_size* patterns from the original training data.

In [None]:
dataset='cifar10' #   'cifar10'   'cifar100'
val_size=10000

if dataset=='cifar10':
    (data_train_x,data_train_y), (data_test_x,data_test_y) = keras.datasets.cifar10.load_data()
    class_names=('airplane','automobile','bird','cat','deer','dog','frog','horse','ship','truck')
    data_train_y=data_train_y.squeeze()
    data_test_y=data_test_y.squeeze()
elif dataset=='cifar100':
    (data_train_x,data_train_y), (data_test_x,data_test_y) = keras.datasets.cifar100.load_data()
    class_names=('apple', 'aquarium_fish', 'baby', 'bear', 'beaver', 'bed', 'bee', 'beetle','bicycle', 'bottle', 'bowl', 'boy', 'bridge', 'bus', 'butterfly', 'camel','can', 'castle', 'caterpillar', 'cattle', 'chair', 'chimpanzee', 'clock','cloud', 'cockroach', 'couch', 'crab', 'crocodile', 'cup', 'dinosaur','dolphin', 'elephant', 'flatfish', 'forest', 'fox', 'girl', 'hamster','house', 'kangaroo', 'keyboard', 'lamp', 'lawn_mower', 'leopard', 'lion','lizard', 'lobster', 'man', 'maple_tree', 'motorcycle', 'mountain', 'mouse','mushroom', 'oak_tree', 'orange', 'orchid', 'otter', 'palm_tree', 'pear','pickup_truck', 'pine_tree', 'plain', 'plate', 'poppy', 'porcupine','possum', 'rabbit', 'raccoon', 'ray', 'road', 'rocket', 'rose','sea', 'seal', 'shark', 'shrew', 'skunk', 'skyscraper', 'snail', 'snake','spider', 'squirrel', 'streetcar', 'sunflower', 'sweet_pepper', 'table','tank', 'telephone', 'television', 'tiger', 'tractor', 'train', 'trout','tulip', 'turtle', 'wardrobe', 'whale', 'willow_tree', 'wolf', 'woman','worm')
    data_train_y=data_train_y.squeeze()
    data_test_y=data_test_y.squeeze()

class_count=len(class_names)

train_x, val_x, train_y, val_y = train_test_split(data_train_x, data_train_y, test_size=val_size, random_state=42,shuffle=True)
train_x=np.array(train_x)
val_x=np.array(val_x)

test_x=data_test_x
test_y=data_test_y

print('Train shape: ',train_x.shape)
print('Validation shape: ',val_x.shape)
print('Test shape: ',test_x.shape)
print('Number of classes: ',class_count)

### **Preprocessing**
To normalize pixel intensity values in the range [0;1] all images are divided by 255.

In [None]:
train_x=train_x/255
val_x=val_x/255
test_x=test_x/255
print('Min value: ',train_x.min())
print('Max value: ',train_x.max())

*AlexNet* was originally designed to receive a 227x227x3 image as input. To maintain the original input shape, all images need to be resized from 32x32x3 to 227x227x3.

Unfortunately, while 60000 images of 32x32x3 can be kept in memory (about 700MB when stored as 32-bit floats), this is no longer possible after the resize (it would require about 35GB).
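These figures can be checked with a short calculation. The sketch assumes that every value is stored as a 32-bit float (4 bytes); the helper *array_size_bytes* is a hypothetical name introduced for this illustration.

```python
def array_size_bytes(image_count, height, width, channels, bytes_per_value=4):
    # bytes_per_value=4 assumes 32-bit floats
    return image_count * height * width * channels * bytes_per_value

small_mib = array_size_bytes(60000, 32, 32, 3) / 2**20
large_gib = array_size_bytes(60000, 227, 227, 3) / 2**30
print('32x32x3:   {:.0f} MiB'.format(small_mib))   # 703 MiB
print('227x227x3: {:.1f} GiB'.format(large_gib))   # 34.6 GiB
```

With 64-bit floats, which is what NumPy produces when an integer array is divided by 255, both figures double.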

To overcome this problem, TensorFlow provides functions and operations to easily manipulate and modify large datasets without the need to keep them entirely in memory. To use them, it is necessary to transform our dataset into an efficient TensorFlow data representation called [**Dataset**](https://www.tensorflow.org/api_docs/python/tf/data/Dataset).

In [None]:
train_dataset=tf.data.Dataset.from_tensor_slices((train_x,train_y))
val_dataset=tf.data.Dataset.from_tensor_slices((val_x,val_y))
test_dataset=tf.data.Dataset.from_tensor_slices((test_x,test_y))

Then the resize operation can be applied (using the [**resize_with_pad**](https://www.tensorflow.org/api_docs/python/tf/image/resize_with_pad) function) to prepare the data.

The operation is not directly applied to all images but it will be applied only when an image is requested.

A detailed guide on how to preprocess data using the **Dataset** class can be found [here](https://www.tensorflow.org/guide/data?hl=en#preprocessing_data).

In [None]:
resize_mapping=lambda X,y : (tf.image.resize_with_pad(X, 227, 227), y)
train_dataset=train_dataset.map(resize_mapping)
val_dataset=val_dataset.map(resize_mapping)
test_dataset=test_dataset.map(resize_mapping)

Finally mini-batches need to be created. 

In [None]:
batch_size=250

train_dataset=train_dataset.batch(batch_size)
val_dataset=val_dataset.batch(batch_size)
test_dataset=test_dataset.batch(batch_size)

Note that the elements contained in a **Dataset** can be accessed only sequentially; no direct access using indices is permitted. For this reason, if it is necessary to split the data into separate sets (e.g., training and validation sets) or to shuffle them, it is better to use Scikit-learn functionalities before creating a **Dataset** instance.
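The sequential access can be seen by iterating over a small pipeline built in the same way. The sketch below uses four random images and made-up labels (both hypothetical) so that it runs quickly; each batch is resized only when the loop requests it.

```python
import numpy as np
import tensorflow as tf

# Four random 32x32x3 "images" with hypothetical labels
images = np.random.rand(4, 32, 32, 3).astype('float32')
labels = np.array([0, 1, 2, 3])

toy_dataset = tf.data.Dataset.from_tensor_slices((images, labels))
toy_dataset = toy_dataset.map(lambda X, y: (tf.image.resize_with_pad(X, 227, 227), y))
toy_dataset = toy_dataset.batch(2)

# Elements are produced one batch at a time while iterating
batch_shapes = []
for batch_x, batch_y in toy_dataset:
    batch_shapes.append(tuple(batch_x.shape))
print(batch_shapes)
```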

## **AlexNet**
*AlexNet* consists of:
- five **convolutional** layers;
- three **max pooling** layers;
- three **fully-connected** layers.

![alt text](https://biolab.csr.unibo.it/ferrara/Courses/DL/Tutorials/CNN/AlexNet.png)
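When implementing the network, it helps to check the spatial size expected after each layer. The sketch below assumes the commonly cited *AlexNet* hyperparameters (kernel sizes, strides and paddings listed in the code); they are not stated in this tutorial, so verify them against the figure above before relying on them.

```python
def output_size(input_size, kernel_size, stride, padding=0):
    # Output side of a convolution or pooling window with symmetric zero padding
    return (input_size + 2 * padding - kernel_size) // stride + 1

# (name, kernel size, stride, padding): commonly cited values, treated here as assumptions
alexnet_layers = [
    ('conv1', 11, 4, 0), ('pool1', 3, 2, 0),
    ('conv2', 5, 1, 2), ('pool2', 3, 2, 0),
    ('conv3', 3, 1, 1), ('conv4', 3, 1, 1), ('conv5', 3, 1, 1),
    ('pool5', 3, 2, 0),
]

size = 227
sizes = []
for name, kernel_size, stride, padding in alexnet_layers:
    size = output_size(size, kernel_size, stride, padding)
    sizes.append(size)
    print('{}: {}x{}'.format(name, size, size))
```

The printed values can be compared with the output shapes reported by the model summary.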

### **Batch normalization**
In *AlexNet*, *local response normalization* is applied after the first two convolutional layers to help speed up convergence. Nowadays, it is usually replaced by [*batch normalization*](https://en.wikipedia.org/wiki/Batch_normalization), which generally performs better in practice.

Keras provides a [**BatchNormalization**](https://keras.io/api/layers/normalization_layers/batch_normalization/) layer that maintains the mean output close to 0 and the output standard deviation close to 1.
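The core of the training-time computation can be sketched with NumPy. This is a simplified illustration: the real layer also learns a scale (gamma) and a shift (beta), fixed here to 1 and 0, and uses moving averages of the mean and variance at inference time. The input data and the epsilon value are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
batch = rng.normal(loc=5.0, scale=3.0, size=(250, 4))   # 250 samples, 4 features

epsilon = 1e-3                     # small constant for numerical stability
mean = batch.mean(axis=0)          # per-feature mean over the mini-batch
variance = batch.var(axis=0)       # per-feature variance over the mini-batch

# Normalization with gamma=1 and beta=0
normalized = (batch - mean) / np.sqrt(variance + epsilon)

print(normalized.mean(axis=0))
print(normalized.std(axis=0))
```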

In the **build_alexnet** function, add this layer after the first two convolutional layers.

Note that *batch normalization* is usually added between the output of a layer and its activation. To do this:
- set the *activation* parameter of the layer before the **BatchNormalization** layer to *None*. In this way no activation is applied;
- add the **BatchNormalization** layer;
- add an [**Activation**](https://keras.io/api/layers/core_layers/activation/) layer to apply the desired activation function (*ReLU* for *AlexNet*) to the output of the **BatchNormalization** layer.
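The three steps above can be sketched on a generic block. The input shape, the number of filters and the kernel size are placeholders chosen for the example and are not the values used by *AlexNet*.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical block: filter count and kernel size are placeholders, not AlexNet's
block = keras.Sequential([
    layers.Input(shape=(32, 32, 3)),
    layers.Conv2D(filters=8, kernel_size=3, activation=None),  # no activation here
    layers.BatchNormalization(),                               # normalize the raw outputs
    layers.Activation('relu'),                                 # activation applied last
])

layer_types = [type(layer).__name__ for layer in block.layers]
print(layer_types)
```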

### **Dropout**
In the first two fully-connected layers, [*dropout*](https://en.wikipedia.org/wiki/Dilution_(neural_networks)#Dropout) is applied during training. It consists of setting the output of each neuron in the layer to zero with probability *p* (equal to 0.5 in *AlexNet*).

![alt text](https://biolab.csr.unibo.it/ferrara/Courses/DL/Tutorials/CNN/Dropout.png)

Keras provides a [**Dropout**](https://keras.io/api/layers/regularization_layers/dropout/) layer that randomly sets input units to 0 at each step during training time.
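The behavior of the layer during training can be sketched with NumPy. The Keras layer also scales the units that are kept by 1/(1 - rate), so that the expected sum of the inputs is unchanged; the input vector of ones below is a hypothetical example.

```python
import numpy as np

rng = np.random.default_rng(0)
rate = 0.5
inputs = np.ones(10000)   # hypothetical layer outputs

# Each unit is kept with probability 1 - rate
keep_mask = rng.random(inputs.shape) >= rate

# Kept units are scaled by 1 / (1 - rate), dropped units become 0
outputs = inputs * keep_mask / (1 - rate)

print(np.unique(outputs))
print(outputs.mean())
```

At inference time the layer leaves its input unchanged.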

In the **build_alexnet** function, add this layer after the first two fully-connected layers with a *rate* parameter equal to 0.5.

### **Model definition**
Implement the following function to create an *AlexNet* model given:
- the shape of the input images (*input_shape*);
- the number of output classes (*output_class_count*).

To create 2D max pooling layers, the [**MaxPooling2D**](https://keras.io/api/layers/pooling_layers/max_pooling2d/) class provided by Keras can be used.

In [None]:
def build_alexnet(input_shape=(227, 227, 3),output_class_count=1000):
    #...
    raise NotImplementedError('Implement the AlexNet model here')

### **Model creation**
Call the **build_alexnet** function to create an *AlexNet* model with the default input shape.

In [None]:
model=build_alexnet(output_class_count=class_count)

### **Model visualization**
Visualize the created model to verify its correctness.

In [None]:
model.summary()

In [None]:
keras.utils.plot_model(model,show_shapes=True, show_layer_names=True)

### **Model compilation**
Compile the model for training by setting up:
- the optimization algorithm;
- the loss function;
- the metrics used to evaluate the performance of the model.

In [None]:
optimizer=keras.optimizers.SGD()

model.compile(optimizer=optimizer, loss='sparse_categorical_crossentropy', metrics=['accuracy'])