# Artificial Neural Networks & Deep Learning
# Homework 1 - Image Classification

**Development Team:**
- Acquati Marco - 10583134 
- Brugali Giorgio - 10794550
- Puoti Francesco - 10595640 


# *1. Data acquisition and augmentation*

> This part is documented through comments placed alongside the code below, so that each explanation sits next to the snippet it describes.


# *2. Model overview*


> ***2.1. Features' Extraction***

>> The feature extractor consists of *5 blocks*, each made of:
  - one convolutional part
  - one MaxPooling layer.
  
>>The convolutional part has a variable number of convolutional layers: 2 Conv2D layers for the first two blocks and 3 Conv2D layers for the remaining ones.
This choice was made to reduce the number of parameters in the network. Our idea was to start with small 3x3 filters, because fine-grained feature extraction is needed at the beginning. In the deeper blocks, instead of enlarging the filters, which we feared could yield distorted features with too much influence on the learning process, we increased the number of convolutional layers.
As activation function we chose ReLU, a common choice for deep convolutional classifiers because it is cheap to compute and does not saturate for positive inputs.
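
The parameter argument above can be checked with a little arithmetic. The sketch below compares two stacked 3x3 convolutions with a single 5x5 convolution covering the same 5x5 receptive field. The helper `conv_params` and the choice of 64 channels on both sides are assumptions made for illustration only.

```python
def conv_params(kernel_size, in_channels, out_channels):
    """Weights plus biases of one 2D convolution with a square kernel."""
    return kernel_size * kernel_size * in_channels * out_channels + out_channels

channels = 64  # illustrative assumption: same number of input and output channels

# Two stacked 3x3 convolutions see a 5x5 region of their input...
two_3x3 = 2 * conv_params(3, channels, channels)
# ...which is the same receptive field as one 5x5 convolution.
one_5x5 = conv_params(5, channels, channels)

print("two 3x3 layers:", two_3x3)
print("one 5x5 layer: ", one_5x5)
```

For 64 channels a single 3x3 layer has 36928 parameters, which agrees with the second Conv2D row of the model summary printed later in the notebook.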

>>BatchNormalization is used in the feature extractor both to improve the stability of the network and to reduce internal covariate shift, which in turn speeds up training.

>>The pool size is 3x3 with stride 2x2, so that neighbouring pooling windows overlap: overlapping pooling has been reported to reduce the tendency of the network to overfit.
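
As a rough check of what this pooling setup does to the feature-map size, the sketch below applies the usual output-size formula for pooling without padding (the Keras default, `padding='valid'`) five times to a 256-pixel side. The helper name `pool_output_size` is an assumption made for this example.

```python
def pool_output_size(size, pool=3, stride=2):
    """Side length after a pooling layer without padding ('valid')."""
    return (size - pool) // stride + 1

size = 256  # input side length used in this notebook
sizes = []
for _ in range(5):  # one pooling layer per block
    size = pool_output_size(size)
    sizes.append(size)

print(sizes)  # [127, 63, 31, 15, 7]
```

The first value, 127, matches the output shape of the first MaxPooling layer in the model summary.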

> ***2.2. BottleNeck Layer***
>> To reduce the computational load of the network, a 1x1 convolutional layer reduces the number of output channels of the feature extractor before its output is fed to the classifier.
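
To get a feeling for the saving, the sketch below counts the parameters of the first Dense layer with and without the bottleneck. It assumes a 7x7 feature map at the end of the extractor (five unpadded 3x3 poolings with stride 2 on a 256x256 input), 512 channels before the bottleneck and 256 after it; BatchNormalization parameters are ignored. The helper `dense_params` is hypothetical.

```python
def dense_params(inputs, units):
    """Weights plus biases of a fully connected layer."""
    return inputs * units + units

side = 7            # assumed spatial size after the five pooling layers
dense_units = 2048  # size of the first hidden Dense layer

# Without the bottleneck, the flattened 7x7x512 tensor feeds the Dense layer.
without_bottleneck = dense_params(side * side * 512, dense_units)

# With the bottleneck, a 1x1 convolution first maps 512 channels to 256.
bottleneck_conv = 1 * 1 * 512 * 256 + 256
with_bottleneck = bottleneck_conv + dense_params(side * side * 256, dense_units)

print("without bottleneck:", without_bottleneck)
print("with bottleneck:   ", with_bottleneck)
```

Under these assumptions the bottleneck roughly halves the size of the largest layer in the network.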

> ***2.3. Classifier***
>> - one flatten layer
>> - two dense hidden layers with 2048 neurons each
>> - the output layer with the softmax activation function and three classes
>>
>> It is worth highlighting the weight initialization (HeNormal distribution), which aims at speeding up training by avoiding too many near-zero values in the kernels at the beginning of the learning process.
>> Moreover, weight decay is applied to the Dense layers to reduce overfitting.
>> BatchNormalization and ReLU are used for the same purpose as in the feature extractor.
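
The two ingredients mentioned above can be sketched in plain Python. He normal initialization draws weights with standard deviation sqrt(2 / fan_in), and an L2 regularizer with factor 0.001 adds 0.001 times the sum of squared weights to the loss. This is a simplified illustration: it samples from a plain Gaussian, whereas Keras' `he_normal` uses a truncated normal, and `fan_in = 2048` is just an example value.

```python
import math
import random

def he_normal_std(fan_in):
    """Standard deviation used by He normal initialization."""
    return math.sqrt(2.0 / fan_in)

def l2_penalty(weights, factor=0.001):
    """L2 weight-decay term: factor * sum of squared weights."""
    return factor * sum(w * w for w in weights)

fan_in = 2048               # example: inputs of the second hidden Dense layer
rng = random.Random(1234)   # fixed seed so the sketch is repeatable
std = he_normal_std(fan_in)

# Plain Gaussian samples (simplification of the truncated normal used by Keras).
weights = [rng.gauss(0.0, std) for _ in range(10000)]
sample_std = math.sqrt(sum(w * w for w in weights) / len(weights))

print("target std:", std)
print("sample std:", sample_std)
print("L2 penalty:", l2_penalty(weights))
```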

> ***2.4. Optimizer & LossFunction***
>> - Adam, with a starting learning rate of 1e-3 and amsgrad = True (the AMSGrad variant of the adaptive update), to help prevent the network from getting stuck in a suboptimal solution.
>> - Loss function: Categorical Crossentropy.
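
For intuition, here is a scalar sketch of what the AMSGrad option changes with respect to plain Adam: the denominator uses the running maximum of the second-moment estimate, so the effective step size cannot grow back after a large gradient. This is a simplified illustration without bias correction and is not the exact Keras implementation; the function name, the `state` dictionary and the gradient values are assumptions.

```python
import math

def amsgrad_step(theta, grad, state, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-7):
    """One simplified AMSGrad update on a scalar parameter (no bias correction)."""
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad * grad
    # Key difference from Adam: keep the largest second moment seen so far.
    state["v_hat"] = max(state["v_hat"], state["v"])
    return theta - lr * state["m"] / (math.sqrt(state["v_hat"]) + eps)

state = {"m": 0.0, "v": 0.0, "v_hat": 0.0}
theta = 1.0
v_hats = []
for grad in [10.0, 0.1, 0.1, 0.1]:  # one large gradient followed by small ones
    theta = amsgrad_step(theta, grad, state)
    v_hats.append(state["v_hat"])

print("parameter:", theta)
print("v:", state["v"], "v_hat:", state["v_hat"])
```

After the large first gradient, `v` starts to decay while `v_hat` stays at its maximum.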

> ***2.5. Further information about the implementation process***
>> No EarlyStopping has been used in the final model: in our trials it halted training even though further epochs would have led to noteworthy improvements.
>> Model checkpoints could not be implemented due to memory shortage.
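
The early-stopping behaviour described above can be illustrated with a small patience loop. This is a simplified sketch of the idea, not the Keras callback itself, and the loss values are invented toy numbers rather than results from this notebook.

```python
def early_stop_epoch(val_losses, patience):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

# Toy validation losses: a plateau followed by a late improvement at epoch 6.
toy_losses = [1.0, 0.9, 0.95, 0.96, 0.97, 0.5]

print(early_stop_epoch(toy_losses, patience=3))  # 5: stops before the improvement
print(early_stop_epoch(toy_losses, patience=4))  # None: training reaches epoch 6
```

With too small a patience, training ends on the plateau and the later improvement is never reached.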


In [1]:
import numpy as np 
import pandas as pd 
import tensorflow as tf

SEED = 1234
tf.random.set_seed(SEED)

In [2]:
import os
import json
import operator

# Since the training images were not divided into subfolders,
#     we had to manage the data acquisition by means of a data frame.
# First, we acquired the images' paths and sorted them in order to create
#     a correspondence between each image and its label stored in the json file.
#-------------------------------------------------------------------------------

X = [] #list of images' paths
for dirname, _, filenames in os.walk('../input/artificial-neural-networks-and-deep-learning-2020/MaskDataset/training'):
    filenames.sort()
    for filename in filenames:
        X.append(os.path.join(dirname, filename))

with open('/kaggle/input/artificial-neural-networks-and-deep-learning-2020/MaskDataset/train_gt.json') as f:
    data = json.load(f)

data = sorted(data.items(), key=operator.itemgetter(0))

y = [] #list of target labels
for i in range(len(data)):
    y.append(str(data[i][1])) # the dataframe requires string labels
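
The cell above relies on both lists being sorted in the same order. A possible alternative, sketched below, looks each label up by file name so that a missing or extra file is detected instead of silently shifting the labels. It assumes the JSON maps file names to integer class ids; the function name and the toy paths are hypothetical.

```python
import os

def pair_paths_with_labels(paths, labels_by_name):
    """Pair every image path with the label stored under its file name."""
    pairs = []
    for path in sorted(paths):
        name = os.path.basename(path)
        if name not in labels_by_name:
            raise KeyError("no label found for " + name)
        # Labels are converted to strings, as required by flow_from_dataframe.
        pairs.append((path, str(labels_by_name[name])))
    return pairs

# Toy data standing in for the real paths and for the content of train_gt.json.
toy_paths = ["training/b.jpg", "training/a.jpg"]
toy_labels = {"a.jpg": 0, "b.jpg": 2}

pairs = pair_paths_with_labels(toy_paths, toy_labels)
print(pairs)
```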

In [3]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# We decided to apply data augmentation on the training set not because of a data shortage
# but to make our model more flexible at recognizing objects in different positions and sizes.
#------------------------------------------------------------------------------------------------------------

train_data_gen = ImageDataGenerator(rotation_range=10,
                                    width_shift_range=10,
                                    height_shift_range=10,
                                    zoom_range=0.3,
                                    horizontal_flip=True,
                                    vertical_flip=True,
                                    fill_mode='nearest',
                                    rescale=1./255)


# No data augmentation has been applied to the validation set, since we want the validation images
# to be similar to the test images and thus give a faithful estimate of the model's performance
#--------------------------------------------------------------------------------------------------------------
valid_data_gen = ImageDataGenerator(rescale=1./255)

In [4]:
from sklearn.model_selection import train_test_split

# Data set split in training and validation sets
# with the latter having a size equal to 25% of the entire data set (test_size=0.25).
#-----------------------------------------------------------------------
X_train, X_valid, y_train, y_valid = train_test_split(X, y, test_size=0.25)


# Generation of training and validation dataframes using pandas' library
#------------------------------------------------------------------------
dataframe_train = pd.DataFrame({"input": X_train, "target": y_train})
dataframe_valid = pd.DataFrame({"input": X_valid, "target": y_valid})
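
A quick arithmetic check of the split sizes. The sketch assumes 5614 labelled images in total (the sum of the two counts reported by the iterators further down) and reproduces the rounding rule `train_test_split` applies to a fractional `test_size`: the held-out part is rounded up and the rest goes to training. The helper `split_sizes` is hypothetical.

```python
import math

def split_sizes(num_samples, test_size):
    """Training and validation sizes for a fractional test_size."""
    n_valid = math.ceil(test_size * num_samples)
    return num_samples - n_valid, n_valid

total_images = 4210 + 1404  # counts reported by the two iterators

print(split_sizes(total_images, 0.25))  # (4210, 1404)
print(split_sizes(total_images, 0.20))  # (4491, 1123)
```

The sizes obtained with 0.25 are the ones that match the notebook output.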


In [5]:
# Batch size
bs = 32

# img shape
img_h = 256
img_w = 256

num_classes = 3

classes = ['0', '1', '2']
clmode = "rgb"

# Creation of the DataFrameIterators yielding tuples of (x, y) 
# where x is a numpy array containing a batch of images with shape (batch_size, *target_size, channels) 
# and y is a numpy array of corresponding labels.
#------------------------------------------------------------------------------------------------------

train_datagen = train_data_gen.flow_from_dataframe(
      dataframe = dataframe_train,
      directory = './',
      x_col = "input",
      y_col = "target",
      target_size = (img_h, img_w),
      color_mode = clmode,
      classes = classes,
      class_mode = "categorical",
      batch_size = bs,
      shuffle = True,
      seed = SEED
)

valid_datagen = valid_data_gen.flow_from_dataframe(
      dataframe = dataframe_valid,
      directory = './',
      x_col = "input",
      y_col = "target",
      target_size = (img_h, img_w),
      color_mode = clmode,
      classes = classes,
      class_mode = "categorical",
      batch_size = bs,
      shuffle = True,
      seed = SEED
)

Found 4210 validated image filenames belonging to 3 classes.
Found 1404 validated image filenames belonging to 3 classes.
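
The length of these iterators is what is later passed to `model.fit` as `steps_per_epoch` and `validation_steps`. The sketch below recomputes it as the number of batches needed to cover every image once, with a smaller last batch; the helper name is an assumption.

```python
import math

def steps_per_epoch(num_samples, batch_size):
    """Number of batches needed to see every sample once."""
    return math.ceil(num_samples / batch_size)

train_steps = steps_per_epoch(4210, 32)
valid_steps = steps_per_epoch(1404, 32)

print(train_steps, valid_steps)  # 132 44
```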


In [8]:
# Create Dataset objects
# ----------------------

# Training
train_dataset = tf.data.Dataset.from_generator(lambda: train_datagen,
                                               output_types=(tf.float32, tf.float32),
                                               output_shapes=([None, img_h, img_w, 3], [None, num_classes])) # None allows a dynamic batch size
train_dataset = train_dataset.repeat()

# Validation
# ----------
valid_dataset = tf.data.Dataset.from_generator(lambda: valid_datagen, 
                                               output_types=(tf.float32, tf.float32),
                                               output_shapes=([None, img_h, img_w, 3], [None, num_classes]))
valid_dataset = valid_dataset.repeat()

In [18]:
#----------CALLBACKS' SELECTION--------
#--------------------------------------

from datetime import datetime
callbacks = []
now = datetime.now().strftime('%b%d_%H-%M-%S')

exp_dir = './logs'
if not os.path.exists(exp_dir):
    os.makedirs(exp_dir)

exp_dir = os.path.join(exp_dir, '_' + str(now))
if not os.path.exists(exp_dir):
    os.makedirs(exp_dir)

# Model checkpoint
# ----------------
ckpt_dir = os.path.join(exp_dir, 'checkpoint')
if not os.path.exists(ckpt_dir):
    os.makedirs(ckpt_dir)    
ckpt_callback = tf.keras.callbacks.ModelCheckpoint(filepath=(ckpt_dir + '/model_' + 'cp_{epoch:02d}.h5'), 
                                            monitor='val_accuracy', verbose=1,
                                            save_best_only=True, mode='max') 
callbacks.append(ckpt_callback)


# Early Stopping
# --------------
es = False
if es:
    es_callback = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=10)
    callbacks.append(es_callback)

In [19]:
import operator
    
model = tf.keras.Sequential()

depth = 5
start_f = 64
ker_sz = (3,3)
ker_str = (1,1)
pool_sz = (3,3)

initializer = tf.keras.initializers.he_normal(seed=SEED)
regularizer = tf.keras.regularizers.l2(0.001) #weight decay
dpout = 0.5

convNumb = 2

# Features extraction
# -------------------
for i in range(depth):

    if i == 0:
        input_shape = [img_h, img_w, 3]
    else:
        input_shape=[None]
        
    if i == 2:
        convNumb += 1 #increasing number of Conv2D layer in a block
        
    for j in range(convNumb):
        model.add(tf.keras.layers.Conv2D(filters=start_f, 
                                 kernel_size=ker_sz,
                                 strides=ker_str,
                                 padding='same',
                                 input_shape=input_shape))        
        model.add(tf.keras.layers.BatchNormalization())
        model.add(tf.keras.layers.ReLU())
    model.add(tf.keras.layers.MaxPool2D(pool_sz, strides = 2))
    if start_f < 512:
        start_f *= 2

#Bottle neck layer
#-----------------
model.add(tf.keras.layers.Conv2D(filters=start_f // 2, # integer division: the number of filters must be an int
                              kernel_size=(1,1),
                              strides=ker_str,
                              padding='same',
                              input_shape=input_shape))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.ReLU())     

#Classifier
#----------
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(units=2048, 
                                kernel_regularizer = regularizer,
                                kernel_initializer = initializer))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.ReLU()) 
model.add(tf.keras.layers.Dropout(dpout))
model.add(tf.keras.layers.Dense(units=2048,
                                kernel_regularizer=regularizer, 
                                kernel_initializer = initializer))
model.add(tf.keras.layers.ReLU())
model.add(tf.keras.layers.Dense(units=num_classes, activation='softmax'))

In [20]:
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3, amsgrad=True), 
              loss=tf.keras.losses.CategoricalCrossentropy(), 
              metrics=['accuracy'])

model.summary()

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_14 (Conv2D)           (None, 256, 256, 64)      1792      
_________________________________________________________________
batch_normalization_15 (Batc (None, 256, 256, 64)      256       
_________________________________________________________________
re_lu_16 (ReLU)              (None, 256, 256, 64)      0         
_________________________________________________________________
conv2d_15 (Conv2D)           (None, 256, 256, 64)      36928     
_________________________________________________________________
batch_normalization_16 (Batc (None, 256, 256, 64)      256       
_________________________________________________________________
re_lu_17 (ReLU)              (None, 256, 256, 64)      0         
_________________________________________________________________
max_pooling2d_5 (MaxPooling2 (None, 127, 127, 64)     

In [None]:
model.fit(x=train_dataset,
          epochs=200,
          steps_per_epoch=len(train_datagen),
          validation_data=valid_dataset,
          validation_steps=len(valid_datagen),
          callbacks=callbacks
         )

Epoch 1/200
Epoch 00001: val_accuracy improved from -inf to 0.33120, saving model to ./logs/_Nov22_10-44-52/checkpoint/model_cp_01.h5
Epoch 2/200
Epoch 00002: val_accuracy improved from 0.33120 to 0.34544, saving model to ./logs/_Nov22_10-44-52/checkpoint/model_cp_02.h5
Epoch 3/200
Epoch 00003: val_accuracy did not improve from 0.34544
Epoch 4/200
Epoch 00004: val_accuracy did not improve from 0.34544
Epoch 5/200
Epoch 00005: val_accuracy improved from 0.34544 to 0.35256, saving model to ./logs/_Nov22_10-44-52/checkpoint/model_cp_05.h5
Epoch 6/200
Epoch 00006: val_accuracy did not improve from 0.35256
Epoch 7/200
Epoch 00007: val_accuracy did not improve from 0.35256
Epoch 8/200
Epoch 00008: val_accuracy improved from 0.35256 to 0.65242, saving model to ./logs/_Nov22_10-44-52/checkpoint/model_cp_08.h5
Epoch 9/200
Epoch 00009: val_accuracy did not improve from 0.65242
Epoch 10/200
Epoch 00010: val_accuracy improved from 0.65242 to 0.66239, saving model to ./logs/_Nov22_10-44-52/checkpoi

Epoch 00027: val_accuracy did not improve from 0.81695
Epoch 28/200
Epoch 00028: val_accuracy did not improve from 0.81695
Epoch 29/200
Epoch 00029: val_accuracy improved from 0.81695 to 0.82906, saving model to ./logs/_Nov22_10-44-52/checkpoint/model_cp_29.h5
Epoch 30/200
Epoch 00030: val_accuracy did not improve from 0.82906
Epoch 31/200
Epoch 00031: val_accuracy did not improve from 0.82906
Epoch 32/200
Epoch 00032: val_accuracy improved from 0.82906 to 0.83048, saving model to ./logs/_Nov22_10-44-52/checkpoint/model_cp_32.h5
Epoch 33/200
Epoch 00033: val_accuracy did not improve from 0.83048
Epoch 34/200
Epoch 00034: val_accuracy did not improve from 0.83048
Epoch 35/200
Epoch 00035: val_accuracy did not improve from 0.83048
Epoch 36/200
Epoch 00036: val_accuracy did not improve from 0.83048
Epoch 37/200
Epoch 00037: val_accuracy did not improve from 0.83048
Epoch 38/200
Epoch 00038: val_accuracy did not improve from 0.83048
Epoch 39/200
Epoch 00039: val_accuracy improved from 0.83

Epoch 55/200
Epoch 00055: val_accuracy did not improve from 0.87749
Epoch 56/200
Epoch 00056: val_accuracy did not improve from 0.87749
Epoch 57/200
Epoch 00057: val_accuracy did not improve from 0.87749
Epoch 58/200
Epoch 00058: val_accuracy did not improve from 0.87749
Epoch 59/200
Epoch 00059: val_accuracy improved from 0.87749 to 0.88177, saving model to ./logs/_Nov22_10-44-52/checkpoint/model_cp_59.h5
Epoch 60/200
Epoch 00060: val_accuracy did not improve from 0.88177
Epoch 61/200
Epoch 00061: val_accuracy did not improve from 0.88177
Epoch 62/200
Epoch 00062: val_accuracy did not improve from 0.88177
Epoch 63/200
Epoch 00063: val_accuracy did not improve from 0.88177
Epoch 64/200
Epoch 00064: val_accuracy did not improve from 0.88177
Epoch 65/200
Epoch 00065: val_accuracy did not improve from 0.88177
Epoch 66/200
Epoch 00066: val_accuracy did not improve from 0.88177
Epoch 67/200
Epoch 00067: val_accuracy did not improve from 0.88177
Epoch 68/200
Epoch 00068: val_accuracy did not

Epoch 00083: val_accuracy did not improve from 0.88818
Epoch 84/200
Epoch 00084: val_accuracy did not improve from 0.88818
Epoch 85/200
Epoch 00085: val_accuracy did not improve from 0.88818
Epoch 86/200
Epoch 00086: val_accuracy did not improve from 0.88818
Epoch 87/200
Epoch 00087: val_accuracy did not improve from 0.88818
Epoch 88/200
Epoch 00088: val_accuracy did not improve from 0.88818
Epoch 89/200
Epoch 00089: val_accuracy did not improve from 0.88818
Epoch 90/200
Epoch 00090: val_accuracy did not improve from 0.88818
Epoch 91/200
Epoch 00091: val_accuracy did not improve from 0.88818
Epoch 92/200
Epoch 00092: val_accuracy did not improve from 0.88818
Epoch 93/200
Epoch 00093: val_accuracy did not improve from 0.88818
Epoch 94/200
Epoch 00094: val_accuracy did not improve from 0.88818
Epoch 95/200
Epoch 00095: val_accuracy did not improve from 0.88818
Epoch 96/200
Epoch 00096: val_accuracy did not improve from 0.88818
Epoch 97/200
Epoch 00097: val_accuracy did not improve from 0

Epoch 112/200
Epoch 00112: val_accuracy did not improve from 0.89316
Epoch 113/200
Epoch 00113: val_accuracy improved from 0.89316 to 0.89530, saving model to ./logs/_Nov22_10-44-52/checkpoint/model_cp_113.h5
Epoch 114/200
Epoch 00114: val_accuracy did not improve from 0.89530
Epoch 115/200
Epoch 00115: val_accuracy did not improve from 0.89530
Epoch 116/200
Epoch 00116: val_accuracy did not improve from 0.89530
Epoch 117/200
Epoch 00117: val_accuracy did not improve from 0.89530
Epoch 118/200
Epoch 00118: val_accuracy did not improve from 0.89530
Epoch 119/200
Epoch 00119: val_accuracy did not improve from 0.89530
Epoch 120/200
Epoch 00120: val_accuracy did not improve from 0.89530
Epoch 121/200
Epoch 00121: val_accuracy did not improve from 0.89530
Epoch 122/200
Epoch 00122: val_accuracy did not improve from 0.89530
Epoch 123/200
Epoch 00123: val_accuracy did not improve from 0.89530
Epoch 124/200
Epoch 00124: val_accuracy did not improve from 0.89530
Epoch 125/200
Epoch 00125: val_a

Epoch 140/200
Epoch 00140: val_accuracy improved from 0.90100 to 0.90670, saving model to ./logs/_Nov22_10-44-52/checkpoint/model_cp_140.h5
Epoch 141/200
Epoch 00141: val_accuracy did not improve from 0.90670
Epoch 142/200
Epoch 00142: val_accuracy did not improve from 0.90670
Epoch 143/200
Epoch 00143: val_accuracy did not improve from 0.90670
Epoch 144/200
Epoch 00144: val_accuracy did not improve from 0.90670
Epoch 145/200
Epoch 00145: val_accuracy did not improve from 0.90670
Epoch 146/200
Epoch 00146: val_accuracy did not improve from 0.90670
Epoch 147/200
Epoch 00147: val_accuracy did not improve from 0.90670
Epoch 148/200
Epoch 00148: val_accuracy did not improve from 0.90670
Epoch 149/200
Epoch 00149: val_accuracy did not improve from 0.90670
Epoch 150/200
Epoch 00150: val_accuracy did not improve from 0.90670
Epoch 151/200
Epoch 00151: val_accuracy did not improve from 0.90670
Epoch 152/200
Epoch 00152: val_accuracy did not improve from 0.90670
Epoch 153/200
Epoch 00153: val_a

Epoch 169/200
Epoch 00169: val_accuracy did not improve from 0.90670
Epoch 170/200
Epoch 00170: val_accuracy did not improve from 0.90670
Epoch 171/200
Epoch 00171: val_accuracy did not improve from 0.90670
Epoch 172/200
Epoch 00172: val_accuracy did not improve from 0.90670
Epoch 173/200
Epoch 00173: val_accuracy did not improve from 0.90670
Epoch 174/200
Epoch 00174: val_accuracy did not improve from 0.90670
Epoch 175/200
Epoch 00175: val_accuracy did not improve from 0.90670
Epoch 176/200
Epoch 00176: val_accuracy did not improve from 0.90670
Epoch 177/200
Epoch 00177: val_accuracy did not improve from 0.90670
Epoch 178/200
Epoch 00178: val_accuracy did not improve from 0.90670
Epoch 179/200
Epoch 00179: val_accuracy did not improve from 0.90670
Epoch 180/200
Epoch 00180: val_accuracy did not improve from 0.90670
Epoch 181/200
Epoch 00181: val_accuracy did not improve from 0.90670
Epoch 182/200
Epoch 00182: val_accuracy did not improve from 0.90670
Epoch 183/200
Epoch 00183: val_acc

In [None]:
#--------------Saving the model---------------
#---------------------------------------------


from datetime import datetime

savedir = './savedModel' + datetime.now().strftime('%b%d_%H-%M-%S')

if not os.path.exists(savedir):
    os.makedirs(savedir) 

model.save(savedir)

In [6]:
clmode = "rgb"
source = '../input/artificial-neural-networks-and-deep-learning-2020/MaskDataset'

test_data_gen = ImageDataGenerator(rescale = 1./255)

test_datagen = test_data_gen.flow_from_directory(
    source,
    target_size = (256, 256),
    color_mode = clmode,
    classes =  ["test"],
    class_mode = "categorical",
    batch_size = 1,
    shuffle = False
)

test_datagen.reset()

Found 450 images belonging to 1 classes.


In [None]:
# predict_generator is deprecated in TensorFlow 2.x; predict accepts the iterator directly
predictions = model.predict(test_datagen, steps=len(test_datagen), verbose=1)
result = {}

In [None]:
import ntpath

# Map each test image's file name to its predicted class
# (the index of the largest softmax output).
for image_path, p in zip(test_datagen.filenames, predictions):
    result[ntpath.basename(image_path)] = str(np.argmax(p))


In [None]:
def create_csv(results):

    csv_fname = 'results_'
    csv_fname += datetime.now().strftime('%b%d_%H-%M-%S') + '.csv'

    with open(os.path.join('./', csv_fname), 'w') as f:

        f.write('Id,Category\n')

        for key, value in results.items():
            f.write(key + ',' + str(value) + '\n')

In [None]:
create_csv(result)
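
As a possible alternative to the manual string concatenation in `create_csv`, the standard `csv` module can write the same two-column layout and takes care of quoting. The sketch below writes to an in-memory buffer; the function name and the example file names are hypothetical.

```python
import csv
import io

def write_results(results, handle):
    """Write an 'Id,Category' table to an open text handle."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["Id", "Category"])
    for image_name, category in results.items():
        writer.writerow([image_name, category])

# Toy predictions standing in for the real `result` dictionary.
toy_result = {"10001.jpg": "0", "10002.jpg": "2"}

buffer = io.StringIO()
write_results(toy_result, buffer)
print(buffer.getvalue())
```

To write a real file, the same function can be given a handle opened with `open(csv_fname, 'w', newline='')`.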