<a href="https://colab.research.google.com/github/nicologhielmetti/AN2DL-challenges/blob/master/challenge1/challenge_1_submission.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Data Preprocessing
The dataset provided for this challenge consists of 6065 images of varying height and width. To bring all images to a common shape we used `tf.keras.preprocessing.image.smart_resize()`.

This function resizes images to a specified target size without distorting the aspect ratio: it takes the largest centered crop that has the target aspect ratio and resizes that crop. This step is needed because the images are later batched, and batching requires a common shape.
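
To make the resizing step concrete, here is a minimal sketch of the centered-crop computation that an aspect-ratio-preserving resize performs before scaling. The helper `centered_crop_box` is a hypothetical name introduced for illustration; it is not the Keras implementation, and it assumes integer pixel sizes given as (height, width).

```python
def centered_crop_box(height, width, target_height, target_width):
    """Return (top, left, crop_h, crop_w) of the largest centered window
    whose aspect ratio matches target_height:target_width."""
    # Largest crop height and width that keep the target ratio and fit in the image.
    crop_h = min(height, width * target_height // target_width)
    crop_w = min(width, height * target_width // target_height)
    # Center the window inside the original image.
    top = (height - crop_h) // 2
    left = (width - crop_w) // 2
    return top, left, crop_h, crop_w


# A 400x600 (height x width) image resized to a square target.
box = centered_crop_box(400, 600, 256, 256)
print(box)
```

For this landscape image and a square target, the sketch keeps a 400x400 window and discards 100 columns on each side. A target shape close to the images' own aspect ratio therefore loses less content.
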
To limit the information lost to cropping, we computed the maximum width and height over the images with `getMaxImageSize(dataset_dir)`.

This function returns the maximum (width, height) over all images in the given dataset directory.
We also considered other shapes, ranging from the mean image size to a standard 256x256.
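
As an illustration of how such candidate shapes can be derived, the sketch below computes the per-dimension maximum, minimum and mean of a list of (width, height) pairs. The function name `size_statistics` and the sizes are hypothetical and only show the computation.

```python
def size_statistics(sizes):
    """Per-dimension max, min and mean of a list of (width, height) pairs."""
    widths = [w for w, _ in sizes]
    heights = [h for _, h in sizes]
    return {
        "max": (max(widths), max(heights)),
        "min": (min(widths), min(heights)),
        "mean": (round(sum(widths) / len(widths)),
                 round(sum(heights) / len(heights))),
    }


# Hypothetical sizes, chosen only to illustrate the computation.
example_sizes = [(612, 408), (400, 500), (320, 240)]
stats = size_statistics(example_sizes)
print(stats)
```

Each dimension is handled separately on purpose. Calling `max()` directly on the tuples would compare them lexicographically and return `(612, 408)`, missing the tallest image (height 500).
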
For data augmentation and for splitting the data into training and validation sets, we used the following script, which assigns 30% of the images to the validation set and the rest to the training set.

```
train_data_gen = ImageDataGenerator(rotation_range=10,
                                    width_shift_range=10,
                                    height_shift_range=10,
                                    zoom_range=0.3,
                                    horizontal_flip=True,
                                    fill_mode='reflect',
                                    rescale=1. / 255,
                                    validation_split=0.3,
                                    # smart_resize bound to the chosen target size
                                    preprocessing_function=partial(smart_resize, size=(img_w, img_h))
                                    )
```
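
The sketch below illustrates one way a 30% validation split over class folders can be realized. It assumes the split is taken per class on the sorted file list, with the first 30% of each class going to validation; whether Keras orders files exactly this way is an assumption. The name `split_per_class` and the file names are hypothetical.

```python
def split_per_class(files_by_class, validation_split):
    """Split each class's sorted file list into (training, validation)."""
    training, validation = {}, {}
    for label, files in files_by_class.items():
        ordered = sorted(files)
        # Number of files of this class reserved for validation.
        n_val = int(validation_split * len(ordered))
        validation[label] = ordered[:n_val]
        training[label] = ordered[n_val:]
    return training, validation


# Hypothetical file names: 10 images of class "0" and 20 of class "1".
files_by_class = {
    "0": [f"a{i:02d}.jpg" for i in range(10)],
    "1": [f"b{i:02d}.jpg" for i in range(20)],
}
training, validation = split_per_class(files_by_class, validation_split=0.3)
print({k: len(v) for k, v in training.items()})
print({k: len(v) for k, v in validation.items()})
```

Splitting within each class keeps the class proportions of the two subsets close to those of the full dataset.
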
The functions that create the augmented images require the images to be divided into subdirectories, one per target class. We therefore implemented `divideDatasetInTargetFolders(json_definition, dataset_path)`, which takes a `json_definition` mapping each image to its class and the path where the images are located, and builds the subdirectories.
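
A minimal, self-contained sketch of this reorganization is shown here on a temporary directory. The file names, labels and the helper name `move_into_class_folders` are hypothetical and only illustrate the resulting layout.

```python
import os
import shutil
import tempfile


def move_into_class_folders(labels, dataset_path):
    """Move every file listed in `labels` into a subfolder named after its class."""
    for filename, label in labels.items():
        class_dir = os.path.join(dataset_path, str(label))
        os.makedirs(class_dir, exist_ok=True)
        source = os.path.join(dataset_path, filename)
        # Skip entries whose file is missing instead of failing.
        if os.path.isfile(source):
            shutil.move(source, os.path.join(class_dir, filename))


# Hypothetical file names and labels, used only to show the resulting layout.
labels = {"img_a.jpg": 0, "img_b.jpg": 2, "img_c.jpg": 0}
dataset_path = tempfile.mkdtemp()
for name in labels:
    open(os.path.join(dataset_path, name), "w").close()

move_into_class_folders(labels, dataset_path)
layout = {d: sorted(os.listdir(os.path.join(dataset_path, d)))
          for d in sorted(os.listdir(dataset_path))}
print(layout)
shutil.rmtree(dataset_path)
```

After the call, each class has its own folder, which is the layout `flow_from_directory` expects.
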


# Model Design

We followed two main approaches to this challenge: building custom models from scratch, and reusing existing models through transfer learning and fine-tuning. For the former we started with a very simple network composed only of a sequence of convolutional layers with ReLU activations. Since the results were poor, we increased the complexity of the model and added regularization such as dropout and L2 regularization. Despite these efforts we could not reach a satisfactory score.

We therefore moved to transfer learning and fine-tuning. We tried several architectures, with validation accuracy improving across them in the range from 80% to 89%. Larger classifiers on top of the backbone network gave better validation accuracy, but the training and validation losses also diverged; this is a sign of overfitting and should be taken into account during model selection. We also explored a wide range of learning rates to reduce the risk of the optimization getting stuck in poor local minima.
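
To show what the L2 regularization mentioned above adds to the objective, the following sketch computes a categorical cross-entropy for one sample and adds a penalty equal to a factor times the sum of squared weights. The predictions, weights and function names are hypothetical; this is an illustration of the formula, not the training code.

```python
import math


def categorical_cross_entropy(y_true, y_pred):
    """Cross-entropy of one sample: -sum(t * log(p)) over the true classes."""
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred) if t > 0)


def l2_penalty(weights, factor):
    """L2 penalty: factor * sum of squared weights."""
    return factor * sum(w * w for w in weights)


y_true = [0.0, 1.0, 0.0]     # one-hot label
y_pred = [0.2, 0.7, 0.1]     # hypothetical softmax output
weights = [0.5, -1.0, 2.0]   # hypothetical layer weights

data_loss = categorical_cross_entropy(y_true, y_pred)
total_loss = data_loss + l2_penalty(weights, factor=0.001)
print(data_loss, total_loss)
```

The penalty grows with the magnitude of the weights, so larger factors push the classifier toward smaller weights at the cost of a higher data loss.
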


# Model Selection

To select the best model among those we designed, we first considered the score on the validation set. This metric alone is generally not enough to choose the model that generalizes best to the test set, so we also considered the difference between the validation and training losses. Using both criteria we chose five candidates for submission on Kaggle, and for the final submission we picked the one with the highest public score.
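
The two criteria can be combined as in the sketch below, which discards models whose loss gap exceeds a threshold and ranks the rest by validation accuracy. The threshold, the function name `shortlist` and all numbers are hypothetical and are not results from this challenge.

```python
def shortlist(candidates, max_gap, top_k):
    """Keep models whose (val_loss - train_loss) is at most max_gap,
    then return the names of the top_k by validation accuracy."""
    admissible = [c for c in candidates
                  if c["val_loss"] - c["train_loss"] <= max_gap]
    ranked = sorted(admissible, key=lambda c: c["val_acc"], reverse=True)
    return [c["name"] for c in ranked[:top_k]]


# Hypothetical numbers for illustration only.
candidates = [
    {"name": "model_a", "val_acc": 0.89, "train_loss": 0.05, "val_loss": 0.60},
    {"name": "model_b", "val_acc": 0.87, "train_loss": 0.25, "val_loss": 0.35},
    {"name": "model_c", "val_acc": 0.84, "train_loss": 0.30, "val_loss": 0.38},
]
selected = shortlist(candidates, max_gap=0.2, top_k=2)
print(selected)
```

Here `model_a` has the best validation accuracy but is excluded because its validation loss is far above its training loss, which suggests overfitting.
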

# Results
The best model turned out to be ____ with a score of ____

### The following part shows the implementation of our two best models

In [None]:
!pip install gdown
!gdown https://drive.google.com/uc?id=1Mv7vKoI-QL6kV-1TIDE7N67_L0LXvJAg
!unzip /content/ANDL2.zip

In [None]:
from google.colab import drive
drive.mount('/content/drive')

In [None]:
import json
import os
import shutil
from datetime import datetime
from functools import partial

from PIL import Image

import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorboard import program

SEED = 1996

In [None]:
def divideDatasetInTargetFolders(json_definition, dataset_path):
    for elem in json_definition:
        dest_dir = os.path.join(dataset_path, str(json_definition[elem]))
        if not os.path.isdir(dest_dir):
            os.mkdir(dest_dir)
        try:
            shutil.move(os.path.join(dataset_path, elem),
                        os.path.join(dest_dir, elem)
                        )
        except FileNotFoundError as e:
            print("File not found: " + str(e))
            continue
    # Create the output folders for augmented images (safe to re-run).
    os.makedirs(os.path.join(dataset_path, "augmented", "training"), exist_ok=True)
    os.makedirs(os.path.join(dataset_path, "augmented", "validation"), exist_ok=True)


def getMaxImageSize(dataset_dir):
    max_w = 0
    max_h = 0
    path = os.path.join(os.getcwd(), dataset_dir)
    for filename in os.listdir(path):
        if filename.endswith(".jpg"):
            image = Image.open(os.path.join(path, filename))
            width, height = image.size
            max_w = width if width > max_w else max_w
            max_h = height if height > max_h else max_h
        else:
            print("This file -> " + filename + " is not .jpg")
    return max_w, max_h


def getMinImageSize(dataset_dir, max_w, max_h):
    min_w = max_w
    min_h = max_h
    for filename in os.listdir(dataset_dir):
        if filename.endswith(".jpg"):
            image = Image.open(os.path.join(dataset_dir, filename))
            width, height = image.size
            min_w = width if width < min_w else min_w
            min_h = height if height < min_h else min_h
        else:
            print("This file -> " + filename + " is not .jpg")
    return min_w, min_h

In [None]:
train_path = os.path.join(os.getcwd(), 'MaskDataset/training')
test_path  = os.path.join(os.getcwd(), 'MaskDataset/test')

In [None]:
# Load the filename -> class mapping and close the file afterwards.
with open(os.path.join(os.getcwd(), 'MaskDataset/train_gt.json')) as f:
    division_dict = json.load(f)

divideDatasetInTargetFolders(division_dict, train_path)

In [None]:
# remember to check both train and test datasets to be sure of max dimensions
# max() on (w, h) tuples compares lexicographically, so take each dimension separately
max_sizes = [getMaxImageSize(os.path.join(train_path, c)) for c in ('0', '1', '2')]
max_w = max(w for w, _ in max_sizes)
max_h = max(h for _, h in max_sizes)
print("Maximum width and height: " + str((max_w, max_h)))

# min() on (w, h) tuples compares lexicographically, so take each dimension separately
min_sizes = [getMinImageSize(os.path.join(train_path, c), max_w, max_h) for c in ('0', '1', '2')]
min_w = min(w for w, _ in min_sizes)
min_h = min(h for _, h in min_sizes)
print("Minimum width and height:  " + str((min_w, min_h)))
print("Maximum width  expansion:  " + str(max_w - min_w) + ", increase ratio: " +
      str(float(min_w) / float(max_w - min_w)))
print("Maximum height expansion:  " + str(max_h - min_h) + ", increase ratio: " +
      str(float(min_h) / float(max_h - min_h)))
img_w = int(min_w + (max_w - min_w) / 2)
img_h = int(min_h + (max_h - min_h) / 2)

In [None]:
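# Keras expresses image sizes as (height, width). The (img_w, img_h) order is kept here
# because the generators, dataset shapes and model input below all use the same order.
# ImageDataGenerator applies preprocessing_function after resizing to target_size.
preproc_fun_fixed = partial(tf.keras.preprocessing.image.smart_resize, size=(img_w, img_h))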

train_data_gen = ImageDataGenerator(rotation_range=10,
                                    width_shift_range=10,
                                    height_shift_range=10,
                                    zoom_range=0.3,
                                    horizontal_flip=True,
                                    fill_mode='reflect',
                                    rescale=1. / 255,
                                    validation_split=0.3,
                                    preprocessing_function=preproc_fun_fixed
                                    )

# No random augmentation at test time: only rescaling and resizing are applied.
test_data_gen = ImageDataGenerator(rescale=1. / 255,
                                   preprocessing_function=preproc_fun_fixed
                                   )

classes = ['0', '1', '2']
save_dir = os.path.join(train_path, 'augmented')

import pandas as pd
# One row per test image; 'class' is a placeholder label column (the default y_col).
images = pd.DataFrame({'filename': os.listdir(test_path)})
images["class"] = 'test'

bs = 32

train_gen = train_data_gen.flow_from_directory(train_path,
                                               target_size=(img_w, img_h),
                                               seed=SEED,
                                               classes=classes,
                                               #save_prefix='training_aug',
                                               #save_to_dir=os.path.join(save_dir, 'training'),
                                               subset='training',
                                               shuffle=True,
                                               batch_size=bs
                                               )

valid_gen = train_data_gen.flow_from_directory(train_path,
                                               target_size=(img_w, img_h),
                                               seed=SEED,
                                               classes=classes,
                                               #save_prefix='validation',
                                               #save_to_dir=os.path.join(save_dir, 'validation'),
                                               subset='validation',
                                               shuffle=False,
                                               batch_size=bs
                                               )

test_gen = test_data_gen.flow_from_dataframe(images,
                                             test_path,
                                             batch_size=bs,
                                             target_size=(img_w, img_h),
                                             class_mode='categorical',
                                             shuffle=False,
                                             seed=SEED
                                            )

# set the right order for predictions
test_gen.reset()

train_set = tf.data.Dataset.from_generator(lambda: train_gen,
                                           output_types=(tf.float32, tf.float32),
                                           output_shapes=(
                                               [None, img_w, img_h, 3],
                                               [None, len(classes)]
                                           ))

validation_set = tf.data.Dataset.from_generator(lambda: valid_gen,
                                                output_types=(tf.float32, tf.float32),
                                                output_shapes=(
                                                    [None, img_w, img_h, 3],
                                                    [None, len(classes)]
                                                ))

test_set = tf.data.Dataset.from_generator(lambda: test_gen,
                                          output_types=(tf.float32, tf.float32),
                                          output_shapes=(
                                              [None, img_w, img_h, 3],
                                              [None, len(classes)]
                                          ))

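# Dataset.repeat() returns a new dataset, so the result must be assigned.
train_set = train_set.repeat()
validation_set = validation_set.repeat()
test_set = test_set.repeat()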

In [None]:
resnet = tf.keras.applications.ResNet50V2(weights='imagenet', include_top=False, input_shape=(img_w, img_h, 3))
#resnet.summary()
#freeze_until = 50

#for layer in resnet.layers[:freeze_until]:
#  layer.trainable = False

model_resnet = tf.keras.Sequential()

model_resnet.add(resnet)
model_resnet.add(tf.keras.layers.Flatten())
#model_resnet.add(tf.keras.layers.Dense(units=128, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.001)))
model_resnet.add(tf.keras.layers.Dense(units=64, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.0001)))
model_resnet.add(tf.keras.layers.Dense(units=32, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.0001)))
model_resnet.add(tf.keras.layers.Dense(units=len(classes), activation='softmax'))

# Visualize created model as a table
model_resnet.summary()

In [None]:
callbacks = []
early_stop = True
weights_path = os.getcwd() + '/drive/My Drive/models_ANN/weights_resnet_50_v2_ft_avg_img_size_fc3_smaller.h5'
if early_stop:
    es_callback = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=10, verbose=1, restore_best_weights=True)
    callbacks.append(es_callback)
    cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath = weights_path,
      verbose=1, save_best_only=True, save_weights_only=True)
    callbacks.append(cp_callback)

In [None]:
loss = tf.keras.losses.CategoricalCrossentropy()
# maybe explore learning rate solutions
lr = 5e-5
optimizer = tf.keras.optimizers.Adam(learning_rate=lr)
metrics = ['accuracy']
model_resnet.compile(optimizer=optimizer, loss=loss, metrics=metrics)

In [None]:
train = True
retrain = False
if train:
  if retrain:
    model_resnet.load_weights(weights_path)
  model_resnet.fit(x=train_set,
            epochs=100,  #### set repeat in training dataset
            steps_per_epoch=len(train_gen),
            validation_data=validation_set,
            validation_steps=len(valid_gen),
            callbacks=callbacks)
else:
  model_resnet.load_weights(weights_path)

In [None]:
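finetuning = True

# The backbone definition is missing from the original cell. It is assumed here to
# mirror the ResNet50V2 setup above (ImageNet weights, no classification top).
InceptionV3 = tf.keras.applications.InceptionV3(weights='imagenet', include_top=False, input_shape=(img_w, img_h, 3))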

if finetuning:
    freeze_until = 1 # layer from which we want to fine-tune

    for layer in InceptionV3.layers[:freeze_until]:
        layer.trainable = False
else:
    InceptionV3.trainable = False

model_inception = tf.keras.Sequential()
model_inception.add(InceptionV3)
model_inception.add(tf.keras.layers.Flatten())
model_inception.add(tf.keras.layers.Dense(units=1024, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.001)))
model_inception.add(tf.keras.layers.Dropout(0.2))
model_inception.add(tf.keras.layers.Dense(units=512, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.001)))
model_inception.add(tf.keras.layers.Dropout(0.2))
model_inception.add(tf.keras.layers.Dense(units=256, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.001)))
model_inception.add(tf.keras.layers.Dropout(0.2))
model_inception.add(tf.keras.layers.Dense(units=len(classes), activation='softmax'))

# Visualize created model as a table
model_inception.summary()

# Visualize initialized weights
model_inception.weights
loss = tf.keras.losses.CategoricalCrossentropy()
# maybe explore learning rate solutions
lr = 1e-4
optimizer = tf.keras.optimizers.Adam(learning_rate=lr)
metrics = ['accuracy']
model_inception.compile(optimizer=optimizer, loss=loss, metrics=metrics)


train = True
retrain = False
if train:
  if retrain:
    model_inception.load_weights('/content/drive/My Drive/Inception_v3.h5')
  model_inception.fit(x=train_set,
            epochs=10,  #### set repeat in training dataset
            steps_per_epoch=len(train_gen),
            validation_data=validation_set,
            validation_steps=len(valid_gen),
            callbacks=callbacks)
else:
  model_inception.load_weights('/content/drive/My Drive/Inception_v3.h5')