## Exercises
1. **ResNet**: Winner of ILSVRC 2015, the ResNet (Residual Network) managed to go much deeper than previous networks. The key ingredient in ResNet is the residual block:
![residual_block](https://www.oreilly.com/library/view/advanced-deep-learning/9781788629416/graphics/B08956_02_10.jpg)
In this exercise you should implement a `Residual_Block` class by subclassing `Layer`. Then create a toy ResNet to train on CIFAR-10. The design of the network is up to you! My suggestion: start with a couple of convolutional layers and a max pooling layer, then add 2 residual blocks, and finish by flattening the tensor and adding a couple of dense layers.


2. **Custom training**: Define a simple model to classify Fashion-MNIST and write the training loop explicitly. At the end of each epoch, compute the accuracy of the model on the validation set (you should split it into batches, run the predictions on each batch, and collect the results). Display the collected statistics on TensorBoard.


3. **Play around**: Check out the [TensorFlow playground](http://playground.tensorflow.org/); it is a nice tool for visualizing what is going on in a neural network.

## Exercise 1

In [1]:
# Define the residual block
from tensorflow.keras import layers
from tensorflow.keras import Sequential

class MyResBlock(layers.Layer):    
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        
    def build(self, input_size):
        self.conv1 = layers.Conv2D(input_size[-1], 3, padding='same', use_bias=False)
        self.bn1   = layers.BatchNormalization()
        self.relu1 = layers.ReLU()
        self.conv2 = layers.Conv2D(input_size[-1], 3, padding='same', use_bias=False)
        self.bn2   = layers.BatchNormalization()
        self.add   = layers.Add()
        self.relu2 = layers.ReLU()
        super().build(input_size)
        
    def call(self, inputs):
        x = self.conv1(inputs)
        x = self.bn1(x)
        x = self.relu1(x)
        x = self.conv2(x)
        x = self.bn2(x)
        y = self.add([x, inputs])
        return self.relu2(y)
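
The block above keeps the number of channels (`input_size[-1]`) and uses `padding='same'`, so the output of the second batch normalization has the same shape as the input and the two can be added. Below is a minimal NumPy sketch of just that skip connection. The names `relu`, `residual` and `branch` are hypothetical; `branch` stands in for the conv/BN/ReLU/conv/BN stack and is not part of the solution.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual(x, branch):
    """Return relu(branch(x) + x); branch must preserve the shape of x."""
    fx = branch(x)
    if fx.shape != x.shape:
        raise ValueError("branch must return the same shape as its input")
    return relu(fx + x)

# Toy input: a batch of 2 feature maps, 4x4 pixels, 3 channels
x = np.random.default_rng(0).normal(size=(2, 4, 4, 3))

# A branch that outputs zeros leaves only the skip path: relu(x)
y = residual(x, lambda t: np.zeros_like(t))
print(y.shape)  # (2, 4, 4, 3)
```

With a zero branch the block reduces to `relu(x)`, so the branch only has to learn a correction on top of the identity path.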

In [2]:
# Define the model
from tensorflow.keras.layers import Conv2D, ReLU, BatchNormalization, Add, Layer
from tensorflow.keras.layers import MaxPool2D, GlobalAveragePooling2D, Dense, Dropout

def myResNet():
    my_res = Sequential()
    my_res.add(Conv2D(8, 3, activation='relu'))
    my_res.add(Conv2D(16, 3, use_bias=False))
    my_res.add(Conv2D(32, 3, use_bias=False))
    my_res.add(BatchNormalization())
    my_res.add(ReLU())
    my_res.add(MaxPool2D(2))
    my_res.add(MyResBlock())
    my_res.add(Conv2D(64, 3, use_bias=False))
    my_res.add(Conv2D(128, 3, use_bias=False))
    my_res.add(BatchNormalization())
    my_res.add(ReLU())
    my_res.add(MaxPool2D(2))
    my_res.add(MyResBlock())
    my_res.add(Conv2D(256, 3, use_bias=False))
    my_res.add(BatchNormalization())
    my_res.add(ReLU())
    my_res.add(GlobalAveragePooling2D())
    my_res.add(Dropout(0.7))
    my_res.add(Dense(64, activation='relu'))
    my_res.add(Dropout(0.7))
    my_res.add(Dense(10, activation='softmax'))
    return my_res

In [3]:
def get_compiled_ResNet():
    my_res = myResNet()
    my_res.compile(loss='sparse_categorical_crossentropy',
                    optimizer="RMSProp",
                    metrics=["accuracy"])
    return my_res

## Test time

In [4]:
# Get the CIFAR-10 dataset
import tensorflow as tf 
(x_train_full, y_train_full), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
x_train_full, x_test = x_train_full/255., x_test/255.

In [5]:
# Split
from sklearn.model_selection import train_test_split
x_train, x_val,  y_train, y_val  = train_test_split(x_train_full, y_train_full)

In [6]:
# Get the resnet
model = myResNet()
model.build(x_train.shape)
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d (Conv2D)              multiple                  224       
_________________________________________________________________
conv2d_1 (Conv2D)            multiple                  1152      
_________________________________________________________________
conv2d_2 (Conv2D)            multiple                  4608      
_________________________________________________________________
batch_normalization (BatchNo multiple                  128       
_________________________________________________________________
re_lu (ReLU)                 multiple                  0         
_________________________________________________________________
max_pooling2d (MaxPooling2D) multiple                  0         
_________________________________________________________________
my_res_block (MyResBlock)    multiple                  1
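
The parameter counts in the summary above can be checked by hand. The sketch below uses two hypothetical helper functions and assumes 3x3 kernels, 3 input channels (CIFAR-10 is RGB), and that the summary counts all four per-channel batch normalization variables (gamma, beta, moving mean, moving variance).

```python
def conv2d_params(kernel, in_channels, filters, use_bias=True):
    """Number of parameters of a square-kernel 2D convolution."""
    weights = kernel * kernel * in_channels * filters
    return weights + (filters if use_bias else 0)

def batchnorm_params(channels):
    """gamma, beta, moving mean and moving variance: 4 values per channel."""
    return 4 * channels

counts = [
    conv2d_params(3, 3, 8),                    # conv2d (with bias)
    conv2d_params(3, 8, 16, use_bias=False),   # conv2d_1
    conv2d_params(3, 16, 32, use_bias=False),  # conv2d_2
    batchnorm_params(32),                      # batch_normalization
]
print(counts)  # [224, 1152, 4608, 128]
```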

## Training on CIFAR-10

In [7]:
epochs     = 10
batch_size = 16
loss_fn    = tf.keras.losses.sparse_categorical_crossentropy
optimizer  = tf.keras.optimizers.SGD()
acc_metric = tf.keras.metrics.SparseCategoricalAccuracy()

In [8]:
from time import time
model = myResNet()

model.compile(loss=loss_fn, optimizer=optimizer, metrics=[acc_metric])
start = time()
history = model.fit(x=x_train, y=y_train,
                    batch_size=batch_size,
                    epochs=epochs, verbose=1,
                    validation_data=(x_val, y_val))
stop = time()
print("%d epochs in %.2fs"%(epochs, stop-start))

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
10 epochs in 124.01s


## Exercise 2

In [24]:
# Load the Fashion-MNIST dataset
import numpy as np
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.fashion_mnist.load_data()
x_train = np.float32(x_train/255.)
x_test  = np.float32(x_test/255.)
print(x_train.shape, y_train.shape)

(60000, 28, 28) (60000,)


In [36]:
# TensorBoard
from datetime import datetime
import os
logdir = "logs/"
model_log_dir  = os.path.join(logdir, datetime.now().strftime('%Y%m%d_%H%M%S'))
cb_tensorboard = tf.keras.callbacks.TensorBoard(log_dir=model_log_dir)

# Create a file writer for this run's timestamped log directory,
# so that each run shows up separately in TensorBoard.
file_writer = tf.summary.create_file_writer(model_log_dir)

In [37]:
# Helper functions for the custom training loop
def get_batch(batch_size):
    idx = np.random.randint(low=0, high=len(x_train), size=batch_size)
    return x_train[idx], y_train[idx]

@tf.function
def train_step(model, loss_fn, optimizer, x_batch, y_batch):
    with tf.GradientTape() as tape:
        # Forward
        y_pred   = model(x_batch, training=True)
        out_loss = tf.reduce_mean(loss_fn(y_batch, y_pred))
        tot_loss = tf.add_n([out_loss] + model.losses)
    # Backward    
    gradients = tape.gradient(tot_loss, model.trainable_variables)
    # Update
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))
    return y_pred, tot_loss

def predict(model, loss_fn, x_batch, y_batch):
    y_pred   = model(x_batch, training=False)
    out_loss = tf.reduce_mean(loss_fn(y_batch, y_pred))
    tot_loss = tf.add_n([out_loss] + model.losses)
    return y_pred, tot_loss
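
Note that `get_batch` draws indices with replacement, so one "epoch" of the custom loop does not necessarily visit every training sample exactly once. As an alternative (not what the solution uses), here is a minimal sketch of a shuffled iterator that covers the data once per epoch. The function `iterate_epoch` and the toy arrays are hypothetical stand-ins.

```python
import numpy as np

def iterate_epoch(x, y, batch_size, rng):
    """Yield shuffled mini-batches that cover the data exactly once."""
    order = rng.permutation(len(x))
    for start in range(0, len(x), batch_size):
        idx = order[start:start + batch_size]
        yield x[idx], y[idx]

# Toy stand-ins for x_train / y_train
x_toy = np.arange(10, dtype=np.float32).reshape(10, 1)
y_toy = np.arange(10)
rng = np.random.default_rng(0)

seen = []
batch_sizes = []
for x_batch, y_batch in iterate_epoch(x_toy, y_toy, batch_size=4, rng=rng):
    batch_sizes.append(len(x_batch))
    seen.extend(y_batch.tolist())

print(batch_sizes)  # [4, 4, 2]
```

The last batch is smaller when the dataset size is not a multiple of `batch_size`, which is also the case in the validation loop that slices `x_test`.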

In [38]:
# Define the model
def myModel2():
    my_res = Sequential()
    my_res.add(layers.Reshape([28, 28, 1], input_shape=[28, 28]))
    my_res.add(Conv2D(8, 3, activation='relu'))
    my_res.add(Conv2D(16, 3, use_bias=False))
    my_res.add(BatchNormalization())
    my_res.add(ReLU())
    my_res.add(MaxPool2D(2))
    my_res.add(Dropout(0.7))
    my_res.add(Dense(64, activation='relu'))
    my_res.add(Dropout(0.7))
    my_res.add(layers.Flatten())
    my_res.add(Dense(10, activation='softmax'))
    return my_res

model = myModel2()
model.build(x_train.shape)
model.summary()

Model: "sequential_11"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
reshape_5 (Reshape)          (None, 28, 28, 1)         0         
_________________________________________________________________
conv2d_35 (Conv2D)           (None, 26, 26, 8)         80        
_________________________________________________________________
conv2d_36 (Conv2D)           (None, 24, 24, 16)        1152      
_________________________________________________________________
batch_normalization_14 (Batc (None, 24, 24, 16)        64        
_________________________________________________________________
re_lu_14 (ReLU)              (None, 24, 24, 16)        0         
_________________________________________________________________
max_pooling2d_12 (MaxPooling (None, 12, 12, 16)        0         
_________________________________________________________________
dropout_20 (Dropout)         (None, 12, 12, 16)      
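
The spatial sizes in the summary above follow from simple arithmetic: with the default `'valid'` padding and stride 1, a 3x3 convolution shrinks each side by 2, and `MaxPool2D(2)` halves it. A small sketch with two hypothetical helpers:

```python
def conv_valid(size, kernel=3):
    """Output size of a stride-1 convolution with 'valid' padding."""
    return size - kernel + 1

def max_pool(size, pool=2):
    """Output size of non-overlapping max pooling (stride equal to pool size)."""
    return size // pool

sizes = [28]                          # Fashion-MNIST images are 28x28
sizes.append(conv_valid(sizes[-1]))   # Conv2D(8, 3)  -> 26
sizes.append(conv_valid(sizes[-1]))   # Conv2D(16, 3) -> 24
sizes.append(max_pool(sizes[-1]))     # MaxPool2D(2)  -> 12
print(sizes)  # [28, 26, 24, 12]
```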

In [39]:
# Metrics
mean_loss_metric = tf.keras.metrics.Mean()
train_acc_metric = tf.keras.metrics.SparseCategoricalAccuracy()

start_t = time()
for epoch in range(epochs):
    start = time()
    # Train
    for i in range(0,len(x_train),batch_size):
        x_batch, y_batch = get_batch(batch_size)
        y_pred, loss     = train_step(model, loss_fn, optimizer, x_batch, y_batch)
        train_acc_metric.update_state(y_batch, y_pred)
        mean_loss_metric.update_state(loss)
    train_acc = train_acc_metric.result()
    mean_loss = mean_loss_metric.result()
    stop = time()
    
    # Register with TensorBoard
    with file_writer.as_default():
        tf.summary.scalar("mean_loss", mean_loss, step=epoch)
        tf.summary.scalar("train_accuracy", train_acc, step=epoch)
        
    train_acc_metric.reset_states()
    mean_loss_metric.reset_states()
    
    # Validate (x_test serves as the validation set; the metric objects are reused after the reset above)
    for i in range(0,len(x_test),batch_size):
        x_batch, y_batch = x_test[i:i+batch_size], y_test[i:i+batch_size]
        y_pred, loss     = predict(model, loss_fn, x_batch, y_batch)
        train_acc_metric.update_state(y_batch, y_pred)
        mean_loss_metric.update_state(loss)
    val_acc  = train_acc_metric.result()
    val_loss = mean_loss_metric.result()
    train_acc_metric.reset_states()    
    mean_loss_metric.reset_states()    
    # Note: `loss` was overwritten in the validation loop, so "last loss" below
    # is the loss of the last validation batch, not of the last training batch.
    print("Epoch %d/%d: %.4fs\tTrain: accuracy: %.3f - last loss: %.3f\tValidation: accuracy %.3f - mean loss %.3f"%
          (epoch, epochs, stop - start, train_acc, loss, val_acc, val_loss))
stop_t = time()    
print("%d epochs in %.2fs"%(epochs, stop_t-start_t))

Epoch 0/10: 12.3501s	Train: accuracy: 0.756 - last loss: 0.428	Validation: accuracy 0.836 - mean loss 0.495
Epoch 1/10: 12.1639s	Train: accuracy: 0.810 - last loss: 0.317	Validation: accuracy 0.853 - mean loss 0.446
Epoch 2/10: 12.1977s	Train: accuracy: 0.827 - last loss: 0.331	Validation: accuracy 0.862 - mean loss 0.416
Epoch 3/10: 12.1523s	Train: accuracy: 0.836 - last loss: 0.259	Validation: accuracy 0.855 - mean loss 0.410
Epoch 4/10: 12.1784s	Train: accuracy: 0.839 - last loss: 0.262	Validation: accuracy 0.876 - mean loss 0.389
Epoch 5/10: 12.1881s	Train: accuracy: 0.844 - last loss: 0.269	Validation: accuracy 0.878 - mean loss 0.379
Epoch 6/10: 12.1623s	Train: accuracy: 0.848 - last loss: 0.213	Validation: accuracy 0.876 - mean loss 0.365
Epoch 7/10: 12.1606s	Train: accuracy: 0.851 - last loss: 0.248	Validation: accuracy 0.871 - mean loss 0.377
Epoch 8/10: 12.1512s	Train: accuracy: 0.854 - last loss: 0.223	Validation: accuracy 0.875 - mean loss 0.364
Epoch 9/10: 12.1804s	Train: 

About the results:

![](img/day1_tb.png)