# Xception
This notebook implements the best-performing transfer learning architecture we obtained. Images are resized to 299x299, and data augmentation is applied to the training set using the best hyperparameters found in a parameter tuning study. The following table summarizes the hyperparameters of the network:

| Hyperparameter | Value |
|:-|:-:|
| batch_size | 16 |
| img_h | 299 |
| img_w | 299 |
| learning_rate | 0.0001 |

The resulting model is compiled with the __Adam__ optimizer and the __accuracy__ metric. The techniques used to counter __overfitting__ are:
* __Early Stopping__: called with ```(monitor='val_loss', patience=5)```
* __Weight Decay__: L2 kernel regularization on the layers of the architecture, both Xception's and the dense ones added on top

In [None]:
import os
import tensorflow as tf
import numpy as np

## Directories Initialization
The notebook can be run on its own, working from the __base__ folder created in the following cells. In Colab, once Drive is mounted and the dataset is unzipped, the whole experiment runs inside the base directory. Remember to assign the path to your MaskDataset.zip to the ```zip_path``` variable.

In [None]:
# DIRECTORIES
zip_path = 'INSERT_HERE_ZIP_PATH'
base_dir = '/content/base'
if not os.path.exists(base_dir):
  os.makedirs(base_dir)

os.chdir(base_dir)
cwd = os.getcwd()

# checkpointing and results directories
exps_dir = os.path.join(cwd, 'Checkpoints')
if not os.path.exists(exps_dir):
  os.makedirs(exps_dir)

res_dir = os.path.join(cwd, 'Results')
if not os.path.exists(res_dir):
  os.makedirs(res_dir)

# dataset directories
dataset_dir = os.path.join(cwd, 'MaskDataset')
training_dir = os.path.join(dataset_dir, 'train')
validation_dir = os.path.join(dataset_dir, 'val')
test_dir = os.path.join(dataset_dir, 'test')

### Directory Structure

Next, we mount Drive and unzip the dataset. Afterwards, the directories will be structured as follows:
    
    - base/
      - Checkpoints/
      - Results/
      - MaskDataset/
          - test/
              - img1, img2, …, imgN
          - train/
              - 0_none/
                  - img1, img2, …, imgN
              - 1_all/
                  - img1, img2, …, imgN
              - 2_some/ 
                  - img1, img2, ... , imgN
          - val/
              - 0_none/
                  - img1, img2, …, imgN
              - 1_all/
                  - img1, img2, …, imgN
              - 2_some/ 
                  - img1, img2, ... , imgN
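
Once the dataset has been unzipped, a quick check of this layout can catch a wrong `zip_path` or a missing class folder early. The helper below is a minimal sketch, not part of the original notebook: `count_images` is a hypothetical name, and it assumes each split folder contains one sub-folder per class, as in the tree above.

```python
import os

def count_images(split_dir):
    """Return {class_folder: number_of_files} for one split (e.g. train/ or val/)."""
    counts = {}
    for class_name in sorted(os.listdir(split_dir)):
        class_dir = os.path.join(split_dir, class_name)
        if os.path.isdir(class_dir):
            # count only regular files inside the class folder
            counts[class_name] = sum(
                1 for f in os.listdir(class_dir)
                if os.path.isfile(os.path.join(class_dir, f))
            )
    return counts

# Example usage (assumes the dataset is already unzipped under base/):
# print(count_images('/content/base/MaskDataset/train'))
```

An empty dictionary, or a class folder with zero files, suggests the archive was not extracted where the notebook expects it.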

In [None]:
from google.colab import drive
drive.mount('/content/drive')

In [None]:
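# Unzip MaskDataset into the current (base) directory; set zip_path to the location of MaskDataset.zip
# The path is quoted so that locations containing spaces (e.g. inside Drive) still work
!unzip "{zip_path}"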

## Network implementation
We now build the network. First we set the hyperparameters, apply data preprocessing, and create the data generators used for training. Then we define the architecture, compile the model, and finally train it.

### Hyperparameters

In [None]:
# SEED setting
SEED = 1234
tf.random.set_seed(SEED)

# hyperparameters
bs = 16
img_h = 299
img_w = 299
lr = 1e-4

# model features
num_classes=3
model_name = 'XC_REG'

### Data Augmentation

In [None]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

prepr_func = tf.keras.applications.xception.preprocess_input

apply_data_augmentation = True

# Create training ImageDataGenerator object
if apply_data_augmentation:
    train_data_gen = ImageDataGenerator(rotation_range=10,
                                        width_shift_range=10,
                                        height_shift_range=10,
                                        zoom_range=0.3,
                                        shear_range=0.3,
                                        horizontal_flip=True,
                                        vertical_flip=False,
                                        fill_mode='constant',
                                        cval=0.0,
                                        preprocessing_function=prepr_func)  # apply Xception normalization
else:
    train_data_gen = ImageDataGenerator(preprocessing_function=prepr_func)

# Create validation ImageDataGenerator object
valid_data_gen = ImageDataGenerator(preprocessing_function=prepr_func)

### Data Management

In [None]:
# GENERATORS

# Training
training_dir = os.path.join(dataset_dir, 'train')
train_gen = train_data_gen.flow_from_directory(training_dir,
                                               batch_size=bs, 
                                               class_mode='categorical',
                                               shuffle=True,
                                               target_size=(img_h, img_w),
                                               seed=SEED)

# Validation
validation_dir = os.path.join(dataset_dir, 'val')
valid_gen = valid_data_gen.flow_from_directory(validation_dir,
                                               batch_size=bs, 
                                               class_mode='categorical',
                                               shuffle=False,
                                               target_size=(img_h, img_w),
                                               seed=SEED)

In [None]:
# DATASET OBJECTS

# Training
train_dataset = tf.data.Dataset.from_generator(lambda: train_gen,
                                               output_types=(tf.float32, tf.float32),
                                               output_shapes=([None, img_h, img_w, 3], [None, num_classes]))

train_dataset = train_dataset.repeat()

# Validation
valid_dataset = tf.data.Dataset.from_generator(lambda: valid_gen, 
                                               output_types=(tf.float32, tf.float32),
                                               output_shapes=([None, img_h, img_w, 3], [None, num_classes]))

valid_dataset = valid_dataset.repeat()

### Model Architecture
The transfer learning model loads the Xception network provided by Keras and stacks three 512-unit dense layers on top of it. Regularization of the kernel weights is applied both to the layers of Xception and to these dense layers, using the L2 regularizer with a factor of 0.0001. A final dense layer, with as many units as there are classes and a softmax activation, performs the classification.
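
To make the effect of the L2 regularizer concrete: for each regularized kernel, the term added to the loss is the factor multiplied by the sum of the squared weights. The snippet below is a plain-Python sketch of that formula; `l2_penalty` and the example weights are hypothetical and only illustrate the arithmetic.

```python
def l2_penalty(weights, factor=0.0001):
    """L2 penalty added to the loss: factor * sum of squared weights."""
    return factor * sum(w * w for w in weights)

# Hypothetical kernel values, for illustration only
example_kernel = [0.5, -1.0, 2.0]

# 0.0001 * (0.25 + 1.0 + 4.0)
penalty = l2_penalty(example_kernel)
print(penalty)
```

With a factor this small the penalty stays far below the cross-entropy loss unless the weights grow large, which is the behavior weight decay is meant to discourage.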

In [None]:
# Xception Loading
xc = tf.keras.applications.Xception(weights='imagenet', include_top=False, input_shape=(img_h, img_w,3))
xc.summary()

### Regularization
We tested regularizers on kernels, activities (layer outputs), and biases. Kernel regularization (__kernel_reg__) proved the most effective.

In [None]:
# REGULARIZERS
def _set_regularizer(model, attr, regularizer):
    # assign the regularizer to every layer that exposes the given attribute
    for layer in model.layers:
        if hasattr(layer, attr):
            setattr(layer, attr, regularizer)

def kernel_reg(model, regularizer=tf.keras.regularizers.l2(0.0001)):
    _set_regularizer(model, 'kernel_regularizer', regularizer)

def activity_reg(model, regularizer=tf.keras.regularizers.l2(0.0001)):
    _set_regularizer(model, 'activity_regularizer', regularizer)

def bias_reg(model, regularizer=tf.keras.regularizers.l2(0.0001)):
    _set_regularizer(model, 'bias_regularizer', regularizer)

### Architecture

In [None]:
import tempfile

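# REGULARIZE
kernel_reg(xc)

# Setting the attribute on already-built layers only updates their config.
# Rebuild the model from that config so the regularization losses are created,
# then restore the ImageNet weights saved beforehand.
model_json = xc.to_json()
tmp_weights_path = os.path.join(tempfile.gettempdir(), 'tmp_weights.h5')
xc.save_weights(tmp_weights_path)
xc = tf.keras.models.model_from_json(model_json)
xc.load_weights(tmp_weights_path, by_name=True)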

# MODEL
model = tf.keras.Sequential()

# add first Xception and the missing GAP
model.add(xc)
model.add(tf.keras.layers.GlobalAveragePooling2D())

# finally the dense layers
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(units=512, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.0001)))
model.add(tf.keras.layers.Dense(units=512, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.0001)))
model.add(tf.keras.layers.Dense(units=512, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.0001)))

model.add(tf.keras.layers.Dense(units=num_classes, activation='softmax'))

# Visualize created model as a table
model.summary()

### Model Compile

In [None]:
# loss function
loss = tf.keras.losses.CategoricalCrossentropy()

# optimizer
optimizer = tf.keras.optimizers.Adam(learning_rate=lr)

# metrics
metrics = ['accuracy']

# compile model
model.compile(optimizer=optimizer, loss=loss, metrics=metrics)

### Training with Callbacks
The following callbacks are used during training for model checkpointing, data visualization and early stopping. 

In [None]:
from datetime import datetime

# Experiments dir
now = datetime.now().strftime('%b%d_%H-%M-%S')

exp_dir = os.path.join(exps_dir, model_name + '_' + str(now))

if not os.path.exists(exp_dir):
    os.makedirs(exp_dir)

callbacks = []

# Visualize Learning on Tensorboard
# ---------------------------------
tb_dir = os.path.join(exp_dir, 'tb_logs')
if not os.path.exists(tb_dir):
    os.makedirs(tb_dir)
    
# By default shows losses and metrics for both training and validation
tb_callback = tf.keras.callbacks.TensorBoard(log_dir=tb_dir,
                                             profile_batch=0,
                                             histogram_freq=1)
callbacks.append(tb_callback)

# Model checkpoint
# ---------------
ckpt_dir = os.path.join(exp_dir, 'ckpts')

if not os.path.exists(ckpt_dir):
    os.makedirs(ckpt_dir)

ckpt_callback = tf.keras.callbacks.ModelCheckpoint(filepath=os.path.join(ckpt_dir, 'cp.ckpt'), 
                                                   save_weights_only=False,
                                                   save_best_only=True)

callbacks.append(ckpt_callback)

# Early Stopping
# --------------
early_stop = True
if early_stop:
    es_callback = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=5)
    callbacks.append(es_callback)
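
The stopping rule configured with `monitor='val_loss', patience=5` can be pictured as follows. This is a simplified plain-Python illustration of the patience logic, not Keras's implementation; `early_stopping_epoch` and the loss values are hypothetical.

```python
def early_stopping_epoch(val_losses, patience=5):
    """Return the 0-based epoch at which training would stop, or None if it never stops."""
    best = float('inf')
    wait = 0  # epochs elapsed since the last improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

# Hypothetical validation losses: the best value (0.7) occurs at epoch 2,
# followed by five epochs without improvement
losses = [1.0, 0.8, 0.7, 0.75, 0.72, 0.71, 0.9, 0.8]
stop_epoch = early_stopping_epoch(losses)
print(stop_epoch)
```

Because the checkpoint callback is created with `save_best_only=True`, the saved model corresponds to the best monitored epoch rather than to the last one run before stopping.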

### Model Training

In [None]:
# TRAINING
model.fit(x=train_dataset,
          epochs=100,  # upper bound; the datasets repeat, so steps_per_epoch delimits each epoch
          steps_per_epoch=len(train_gen),
          validation_data=valid_dataset,
          validation_steps=len(valid_gen), 
          callbacks=callbacks)
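
Since both datasets repeat indefinitely, `steps_per_epoch` and `validation_steps` determine where an epoch ends. The length of a directory generator is the number of batches needed to cover all samples once, with the last batch possibly smaller. A minimal sketch of that arithmetic, using a hypothetical sample count:

```python
import math

def steps_per_epoch(num_samples, batch_size):
    """Number of batches needed to see every sample once (the last batch may be partial)."""
    return math.ceil(num_samples / batch_size)

# Hypothetical dataset size, for illustration only
print(steps_per_epoch(100, 16))
```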

## Model Testing
Once the model is trained, we test it by calling the predict method on the images in the test set. We then build the .csv file used for submission.

In [None]:
def create_csv(results, model_name):
    csv_fname = model_name + '_results_'
    csv_fname += datetime.now().strftime('%b%d_%H-%M-%S') + '.csv'

    with open(os.path.join(res_dir, csv_fname), 'w') as f:

        f.write('Id,Category\n')

        for key, value in results.items():
            f.write(key + ',' + str(value) + '\n')

In [None]:
from PIL import Image

test_img_dir = os.path.join(test_dir, 'test')
image_filenames = next(os.walk(test_img_dir))[2]

results = {}
for image_name in image_filenames:
  img = Image.open(os.path.join(test_img_dir, image_name)).convert('RGB')
  img = img.resize((img_w, img_h))  # PIL expects (width, height)

  # apply the same Xception preprocessing used by the training and validation generators
  img_array = np.array(img, dtype=np.float32)
  img_array = prepr_func(img_array)
  img_array = np.expand_dims(img_array, 0)

  predictions = model.predict(img_array)
  predicted_class = np.argmax(predictions, axis=-1)

  predicted_class = predicted_class[0]
  results[image_name] = predicted_class

create_csv(results, model_name)
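
The training and validation generators pass every image through `tf.keras.applications.xception.preprocess_input`, which rescales pixel values from [0, 255] to [-1, 1]. Test images need the same range for the predictions to be meaningful. The sketch below compares that mapping with a plain division by 255 on single pixel values; both function names are hypothetical and the code only illustrates the arithmetic.

```python
def xception_scale(pixel):
    """Map a pixel value in [0, 255] to [-1, 1], as pixel / 127.5 - 1."""
    return pixel / 127.5 - 1.0

def unit_scale(pixel):
    """Map a pixel value in [0, 255] to [0, 1]."""
    return pixel / 255.0

# Compare the two scalings on the darkest, middle, and brightest values
for p in (0, 127.5, 255):
    print(p, xception_scale(p), unit_scale(p))
```

A model trained on inputs in [-1, 1] sees only the upper half of its expected range if it is fed images scaled to [0, 1].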