# Convolutional Neural Network
This notebook implements the best-performing convolutional neural network we obtained. Images are rescaled to 256x256 and data augmentation is applied to the training set. The hyperparameters, selected through a tuning study, are summarized in the following table:

| Hyperparameter <img width=100/> | Value <img width=50/>|
|:-|:-:|
| batch_size <img width=100/> | 16 <img width=50/> |
| img_h <img width=100/> | 256  <img width=50/> |
| img_w <img width=100/> | 256 <img width=50/> |
| start_f <img width=100/> | 64 <img width=50/> |
| depth <img width=100/> | 5 <img width=50/> |
| learning_rate  <img width=100/> | 0.0001 <img width=50/> |

The model is compiled with the __Adam__ optimizer and the __accuracy__ metric. The techniques used to counter __overfitting__ are:
* __Early Stopping__: called with ```(monitor='val_loss', patience=6)```
* __Dropout__: applied only after the last hidden dense layer, with a rate of 0.1

In [None]:
import os
import tensorflow as tf
import numpy as np

## Directories Initialization
The notebook can be run on its own: it only relies on the __base__ folder created in the following cells. In Colab, once Drive is mounted and the dataset is unzipped, the whole experiment runs inside the base directory. Remember to assign the path of your MaskDataset.zip to the ```zip_path``` variable.

In [None]:
# DIRECTORIES
zip_path = 'INSERT_HERE_ZIP_PATH' 
base_dir = '/content/base'
os.makedirs(base_dir, exist_ok=True)

os.chdir(base_dir)
cwd = os.getcwd()

# checkpointing and results directories
exps_dir = os.path.join(cwd, 'Checkpoints')
os.makedirs(exps_dir, exist_ok=True)

res_dir = os.path.join(cwd, 'Results')
os.makedirs(res_dir, exist_ok=True)

# dataset directories
dataset_dir = os.path.join(cwd, 'MaskDataset')
training_dir = os.path.join(dataset_dir, 'train')
validation_dir = os.path.join(dataset_dir, 'val')
test_dir = os.path.join(dataset_dir, 'test')

### Directory Structure

Next, we mount Drive and unzip the dataset. Afterwards, the directories are structured as follows:
    
    - base/
      - Checkpoints/
      - Results/
      - MaskDataset/
          - test/
              - img1, img2, …, imgN
          - train/
              - 0_none/
                  - img1, img2, …, imgN
              - 1_all/
                  - img1, img2, …, imgN
              - 2_some/ 
                  - img1, img2, …, imgN
          - val/
              - 0_none/
                  - img1, img2, …, imgN
              - 1_all/
                  - img1, img2, …, imgN
              - 2_some/ 
                  - img1, img2, …, imgN
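
Once the archive has been extracted, the layout above can be sanity-checked by counting the files in each class folder. The following is a minimal sketch, not part of the original notebook: the helper name `count_images_per_class` is hypothetical, and it assumes that every regular file inside a class folder is an image.

```python
import os

def count_images_per_class(split_dir, classes):
    """Return {class_name: number of files} for one dataset split.

    Hypothetical helper: assumes `split_dir` holds one sub-folder per class
    and that every regular file in a class folder is an image.
    """
    counts = {}
    for class_name in classes:
        class_dir = os.path.join(split_dir, class_name)
        if not os.path.isdir(class_dir):
            # a missing folder is reported as 0 instead of raising
            counts[class_name] = 0
            continue
        counts[class_name] = sum(
            1 for entry in os.scandir(class_dir) if entry.is_file()
        )
    return counts
```

For example, `count_images_per_class(training_dir, ['0_none', '1_all', '2_some'])` would show how balanced the training classes are.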
            

In [None]:
from google.colab import drive
drive.mount('/content/drive')

In [None]:
# MaskDataset unzipping, provide the path of MaskDataset.zip in zip_path
!unzip {zip_path}
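
Where the `!unzip` shell escape is not available, the same extraction can be done from Python. This is a minimal sketch using the standard-library `zipfile` module; the helper name `extract_dataset` is hypothetical, and it assumes the archive unpacks into a top-level `MaskDataset/` folder as described above.

```python
import os
import zipfile

def extract_dataset(zip_path, dest_dir):
    """Extract the archive at `zip_path` into `dest_dir` (hypothetical helper).

    Returns the sorted names of the top-level entries as a quick sanity check.
    """
    if not zipfile.is_zipfile(zip_path):
        raise ValueError(f'{zip_path!r} is not a valid zip archive')
    os.makedirs(dest_dir, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        return sorted({name.split('/')[0] for name in archive.namelist()})
```

Calling `extract_dataset(zip_path, base_dir)` would mirror the cell above, since the notebook has already changed into `base_dir`.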

## Network implementation
We now build the network. First we set the hyperparameters, define the data preprocessing and build the data generators used for training. Then we define the architecture, compile the model and finally train it.

### Hyperparameters

In [None]:
# SEED setting
SEED = 1234
tf.random.set_seed(SEED)

# hyperparameters
bs = 16
img_h = 256
img_w = 256
start_f = 64
depth = 5
lr = 1e-4

# model features
num_classes = 3
model_name = 'CNN_5'

### Data Augmentation

In [None]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# training data is augmented
train_data_gen = ImageDataGenerator(rotation_range=10,
                                    width_shift_range=10,
                                    height_shift_range=10,
                                    zoom_range=0.3,
                                    shear_range=0.3,
                                    horizontal_flip=True,
                                    vertical_flip=False,
                                    fill_mode='constant',
                                    cval=0,
                                    rescale=1./255)

# validation data is only rescaled
valid_data_gen = ImageDataGenerator(rescale=1./255)

### Data Management

In [None]:
# GENERATORS
classes = [
     '0_none',
     '1_all',
     '2_some'      
]

# Training
# target_size is given as (height, width), so the generators do not rely on the default size
train_gen = train_data_gen.flow_from_directory(training_dir,
                                               target_size=(img_h, img_w),
                                               batch_size=bs,
                                               classes=classes,
                                               class_mode='categorical',
                                               shuffle=True,
                                               seed=SEED)

# Validation
valid_gen = valid_data_gen.flow_from_directory(validation_dir,
                                               target_size=(img_h, img_w),
                                               batch_size=bs,
                                               classes=classes,
                                               class_mode='categorical',
                                               shuffle=False,
                                               seed=SEED)

In [None]:
# DATASET OBJECTS

# Training
train_dataset = tf.data.Dataset.from_generator(lambda: train_gen,
                                               output_types=(tf.float32, tf.float32),
                                               output_shapes=([None, img_h, img_w, 3], [None, num_classes]))

train_dataset = train_dataset.repeat()

# Validation
valid_dataset = tf.data.Dataset.from_generator(lambda: valid_gen, 
                                               output_types=(tf.float32, tf.float32),
                                               output_shapes=([None, img_h, img_w, 3], [None, num_classes]))
valid_dataset = valid_dataset.repeat()

### Model Architecture
The network consists of a convolutional part of depth 5 followed by two dense layers, of which only the second has dropout applied to its output. A final dense layer, with one unit per class and a softmax activation, performs the classification.

In [None]:
# ARCHITECTURE
model = tf.keras.Sequential()

# Convolutional Part
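for i in range(depth):
  # only the first layer needs to be told the input shape
  conv_kwargs = {'input_shape': (img_h, img_w, 3)} if i == 0 else {}

  # the number of filters doubles at every block, starting from start_f
  # (start_f itself is left unchanged so the cell can be re-run safely)
  model.add(tf.keras.layers.Conv2D(filters=start_f * 2 ** i,
                                   kernel_size=(3, 3),
                                   strides=(1, 1),
                                   padding='same',
                                   **conv_kwargs))
  model.add(tf.keras.layers.ReLU())
  model.add(tf.keras.layers.MaxPool2D(pool_size=(2, 2)))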

# Bottom dense Part
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(units=512, activation='relu'))
model.add(tf.keras.layers.Dense(units=512, activation='relu'))
model.add(tf.keras.layers.Dropout(0.1))
model.add(tf.keras.layers.Dense(units=num_classes, activation='softmax'))
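
To see how large the flattened vector becomes, the spatial size and channel count can be traced block by block. The sketch below is plain arithmetic, not a Keras call. It assumes `'same'` padding with stride 1, so the convolution keeps height and width, and a 2x2 max pooling with its default stride, which halves both. `trace_shapes` is a hypothetical helper.

```python
def trace_shapes(img_h, img_w, start_f, depth):
    """Return the (height, width, channels) after each conv + pool block."""
    shapes = []
    h, w = img_h, img_w
    for i in range(depth):
        filters = start_f * 2 ** i   # filters double at every block
        h, w = h // 2, w // 2        # 2x2 max pooling halves height and width
        shapes.append((h, w, filters))
    return shapes

shapes = trace_shapes(img_h=256, img_w=256, start_f=64, depth=5)
for block, shape in enumerate(shapes, start=1):
    print(f'block {block}: {shape}')

# size of the vector produced by Flatten()
h, w, c = shapes[-1]
print('flattened units:', h * w * c)
```

With the values used in this notebook the last block outputs an 8x8x1024 tensor, so `Flatten()` feeds 65536 values into the first dense layer.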

### Model Compile

In [None]:
# loss function
loss = tf.keras.losses.CategoricalCrossentropy()

# optimizer
optimizer = tf.keras.optimizers.Adam(learning_rate=lr)

# metrics
metrics = ['accuracy']

# compile model
model.compile(optimizer=optimizer, loss=loss, metrics=metrics)
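
For a one-hot label, the categorical cross-entropy reduces to the negative log of the probability assigned to the true class. A small worked example in plain Python follows. The function is a hypothetical illustration and, unlike the Keras loss, it does no clipping or batch averaging.

```python
import math

def categorical_crossentropy(y_true, y_pred):
    """Cross-entropy between a one-hot label and a predicted distribution."""
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred) if t > 0)

# the true class is '1_all' (index 1) and the model assigns it probability 0.7
loss = categorical_crossentropy([0, 1, 0], [0.1, 0.7, 0.2])
print(round(loss, 4))  # 0.3567, i.e. -ln(0.7)
```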

### Training with Callbacks
The following callbacks are used during training for model checkpointing, data visualization and early stopping. 

In [None]:
from datetime import datetime

now = datetime.now().strftime('%b%d_%H-%M-%S')

# Experiments dir
exp_dir = os.path.join(exps_dir, model_name + '_' + str(now))
if not os.path.exists(exp_dir):
    os.makedirs(exp_dir)
    
callbacks = []

# Model checkpoint
# ----------------
ckpt_dir = os.path.join(exp_dir, 'ckpts')
if not os.path.exists(ckpt_dir):
    os.makedirs(ckpt_dir)

ckpt_callback = tf.keras.callbacks.ModelCheckpoint(filepath=os.path.join(ckpt_dir, 'cp.ckpt'), 
                                                   save_weights_only=True)
callbacks.append(ckpt_callback)

# Visualize Learning on Tensorboard
# ---------------------------------
tb_dir = os.path.join(exp_dir, 'tb_logs')
if not os.path.exists(tb_dir):
    os.makedirs(tb_dir)
    
# By default shows losses and metrics for both training and validation
tb_callback = tf.keras.callbacks.TensorBoard(log_dir=tb_dir,
                                             profile_batch=0,
                                             histogram_freq=1)  # if 1 shows weights histograms
callbacks.append(tb_callback)

# Early Stopping
# --------------
early_stop = True
if early_stop:
    es_callback = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=6)
    callbacks.append(es_callback)
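
The effect of `patience` can be illustrated without Keras. The sketch below is a simplified model of the idea, not the callback's actual implementation: training stops once the monitored loss has failed to improve on its best value for `patience` consecutive epochs. The function name and the loss values are made up for illustration.

```python
def early_stopping_epoch(val_losses, patience):
    """Return the 0-based epoch at which training would stop, or None."""
    best = float('inf')
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

# hypothetical validation losses: the best value so far is reached at epoch 2
losses = [1.0, 0.8, 0.7, 0.75, 0.72, 0.71, 0.69]
print(early_stopping_epoch(losses, patience=3))  # 5
print(early_stopping_epoch(losses, patience=4))  # None
```

With the larger patience the run survives long enough to reach the improvement at epoch 6, which is the trade-off this parameter controls.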

### Model Training
We can now visualize the model by inspecting its summary and finally call the fit method in order to start the training.

In [None]:
# Model Summary
model.summary()

In [None]:
# TRAINING
model.fit(x=train_dataset,
          epochs=100,
          steps_per_epoch=len(train_gen),
          validation_data=valid_dataset,
          validation_steps=len(valid_gen), 
          callbacks=callbacks)
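
`len(train_gen)` is the number of batches needed to cover the training set once, that is, the sample count divided by the batch size, rounded up. A sketch of that arithmetic follows; the sample count is made up for illustration and is not the size of the actual dataset.

```python
import math

def steps_per_epoch(num_samples, batch_size):
    """Number of batches needed to see every sample once."""
    return math.ceil(num_samples / batch_size)

# 1000 is a hypothetical sample count, not the real dataset size
print(steps_per_epoch(1000, 16))  # 63: 62 full batches plus one of 8 samples
```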

## Model Testing
Once the model is trained, we test it by calling the predict method on each image of the test set. We then build the .csv file used for the submission.

In [None]:
def create_csv(results, model_name):
    csv_fname = model_name + '_results_'
    csv_fname += datetime.now().strftime('%b%d_%H-%M-%S') + '.csv'

    with open(os.path.join(res_dir, csv_fname), 'w') as f:

        f.write('Id,Category\n')

        for key, value in results.items():
            f.write(key + ',' + str(value) + '\n')
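
A similar writer can be built on the standard-library `csv` module, which also takes care of quoting should a file name ever contain a comma. This is a sketch rather than the notebook's function: `write_results_csv` and its `out_dir` parameter are hypothetical.

```python
import csv
import os
from datetime import datetime

def write_results_csv(results, model_name, out_dir):
    """Write {image_name: class_id} pairs to an Id,Category csv file.

    Returns the path of the file that was written.
    """
    timestamp = datetime.now().strftime('%b%d_%H-%M-%S')
    csv_path = os.path.join(out_dir, f'{model_name}_results_{timestamp}.csv')
    with open(csv_path, 'w', newline='') as f:
        # '\n' line endings match the hand-written version above
        writer = csv.writer(f, lineterminator='\n')
        writer.writerow(['Id', 'Category'])
        for image_name, category in results.items():
            writer.writerow([image_name, category])
    return csv_path
```

Returning the path makes it easy to print or download the file right after it has been written.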

In [None]:
from PIL import Image

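# test images are read from the 'test' sub-folder of test_dir
test_img_dir = os.path.join(test_dir, 'test')
image_filenames = next(os.walk(test_img_dir))[2]

results = {}
for image_name in image_filenames:
  img = Image.open(os.path.join(test_img_dir, image_name)).convert('RGB')
  # PIL expects the size as (width, height)
  img = img.resize((img_w, img_h))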

  img_array = np.array(img)
  img_array = img_array * 1. / 255
  img_array = np.expand_dims(img_array, 0)

  predictions = model.predict(img_array)
  predicted_class = np.argmax(predictions, axis=-1)

  predicted_class = predicted_class[0]
  results[image_name] = predicted_class

create_csv(results, model_name)
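
The integer written to the csv is the index of the largest softmax output, and it follows the order of the `classes` list passed to the generators. A plain-Python sketch of that mapping is shown below; `predicted_label` is a hypothetical helper and the probabilities are made up.

```python
CLASSES = ['0_none', '1_all', '2_some']  # same order as in the generators

def predicted_label(probabilities, classes=CLASSES):
    """Return (index, class name) of the most probable class."""
    index = max(range(len(probabilities)), key=probabilities.__getitem__)
    return index, classes[index]

print(predicted_label([0.05, 0.15, 0.80]))  # (2, '2_some')
```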