# Image Semantic Segmentation using Pretrained model - FCN8-VGG16

Semantic segmentation models have two phases in their architecture: a feature extraction phase and a dense prediction phase.
The feature extraction phase is generally composed of convolution and pooling layers. Through these, we learn the weights of many filters across numerous layers to extract as many features as possible. We end up with a very deep network and a good representation of the features in a smaller space. Then, by adding the dense prediction phase (here FCN-8, a Fully Convolutional Network; the 8 refers to the stride of its final upsampling step, not to a number of layers), we upsample those shrunken feature maps back to the input resolution. Training this upsampling phase end to end also helps adapt and update the feature extraction phase. We end up with a model that represents the image well at the pixel level.
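
To make the "smaller space" concrete, the sketch below traces the spatial size of a 224x224 input through the five 2x2 pooling stages of VGG16 and back up through the usual FCN-8 upsampling steps (2x, 2x, then 8x). It assumes the standard FCN-8 layout; the actual `segmentation_model` in `utils_fcn_vgg_model` is not shown in this notebook and may differ. The helper name `trace_fcn8_sizes` is made up for illustration.

```python
def trace_fcn8_sizes(input_size=224):
    """Return the spatial sizes seen by a standard FCN-8 on a square input."""
    # Encoder: each of the five VGG16 pooling layers halves the spatial size.
    encoder = [input_size]
    for _ in range(5):
        encoder.append(encoder[-1] // 2)
    pool3, pool4, pool5 = encoder[3], encoder[4], encoder[5]

    # Decoder: upsample 2x and fuse with pool4, upsample 2x and fuse with pool3,
    # then upsample 8x to recover the input resolution.
    up1 = pool5 * 2
    assert up1 == pool4, "first upsampling must match pool4 for the skip connection"
    up2 = up1 * 2
    assert up2 == pool3, "second upsampling must match pool3 for the skip connection"
    output = up2 * 8
    return encoder, [up1, up2, output]


encoder_sizes, decoder_sizes = trace_fcn8_sizes(224)
print("encoder:", encoder_sizes)  # [224, 112, 56, 28, 14, 7]
print("decoder:", decoder_sizes)  # [14, 28, 224]
```

The two skip connections are what let the decoder reuse finer spatial detail from the earlier pooling stages instead of upsampling the 7x7 map in a single step.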

Here are the steps we followed to train our model:

- **READING DATA**: we uploaded our data using a Colab Pro account with 25 GB of memory. The Colab Pro account is really worth it: we encountered fewer disconnections and had more memory available. 25 GB is still not enough, but we managed to deal with it by re-assigning our data variables, so that the previous arrays can be freed. It is a good trick. We split our data into 4000 samples for training, 500 for validation and 500 for testing.

- **SEGMENT GENERATION**: we were advised to use only 8 categories. For that we built a function that generates the new segments by merging some of the original ones. Note that this is done only on the mask images, not on the input images.

- **DATA AUGMENTATION**: as big fans of TensorFlow, we wanted to use TensorFlow's `ImageDataGenerator`, but it does not make it easy to inspect the generated files (we could save them to disk, but that is not efficient), and we often ended up with predictions that were not usable. For that reason we ended up using Albumentations, a very efficient library. By keeping the originals alongside one transformed copy of each image, we doubled our data. We used a horizontal flip, a small brightness change, and a 10-degree rotation (just a slight tilt of the image).

- **TRAINING**: training for 50 epochs using the Adam optimizer, with early stopping based on the validation loss and a patience of 10 epochs. The 5th epoch was selected as the best fit.

- **PREDICTION**: prediction came naturally by applying the same batching process to the test set as to the training set. We ended up using a batch size of 1 in training, since the compute can handle it. We did that after having problems returning prediction values through the service's API gateway: with a batch size of 32, 32 images have to be scored at once, and the results all have to pass through the API gateway. We would therefore have had to increase the API gateway's memory, since 2 GB was not enough. Instead, we simply decreased the batch size to 1 and let the model pass through the whole dataset at each epoch.
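
The effect of the batch size on the size of a response can be estimated with a little arithmetic. The sketch below assumes the service returns the raw per-class probabilities as float32 values for 224x224 images and 8 classes. The real payload format is not described here, and serialization (for example to JSON) would typically make it larger. `prediction_bytes` is a hypothetical helper.

```python
def prediction_bytes(batch_size, height=224, width=224, n_classes=8, bytes_per_value=4):
    """Raw size in bytes of one batch of per-pixel class probabilities."""
    return batch_size * height * width * n_classes * bytes_per_value


for batch_size in (1, 32):
    size_mib = prediction_bytes(batch_size) / 2**20
    print(f"batch_size={batch_size:>2}: {size_mib:.1f} MiB of raw predictions")
```

The raw output grows linearly with the batch size, so scoring one image at a time keeps each response 32 times smaller than scoring a batch of 32.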

In [82]:
# LIBS
import tensorflow as tf
import numpy as np

from tensorflow.keras.layers import Input
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.layers import MaxPooling2D
from tensorflow.keras.layers import Dropout 
from tensorflow.keras.layers import Conv2DTranspose
from tensorflow.keras.layers import concatenate

from tensorflow.keras.preprocessing import image
from tensorflow.keras.preprocessing.image import ImageDataGenerator

from tensorflow.keras.callbacks import  EarlyStopping

import imageio
import os
import seaborn as sns
import shutil

In [None]:
from google.colab import drive
drive.mount('/content/drive')

In [None]:
import os
import zipfile

local_zip = 'drive/MyDrive/AI_LAB/gtFine.zip'
# Extract the dataset archive into the current working directory
with zipfile.ZipFile(local_zip, 'r') as zip_ref:
    zip_ref.extractall()

In [None]:
local_zip = 'drive/MyDrive/AI_LAB/utils.zip'
# Extract the helper modules into the current working directory
with zipfile.ZipFile(local_zip, 'r') as zip_ref:
    zip_ref.extractall()

In [2]:
# UTILS 

from utils_fcn_vgg_model import *
from utils_funct_vgg import *

# READING THE DATA

In [66]:
main_path = './gtFine/'
train_path_image='train_images/'
train_path_mask='train_masks/'

val_path_image='val_images/'
val_path_mask='val_masks/'

test_path_image = 'test_images/'
test_path_mask = 'test_masks/'

In [67]:
train_image_list, train_mask_list  = getFilesPathsList(main_path, train_path_image, train_path_mask )


val_image_list, val_mask_list = getFilesPathsList(main_path, val_path_image, val_path_mask )


test_image_list , test_mask_list  = getFilesPathsList(main_path, test_path_image, test_path_mask)


# Move the first 1025 test files into the training set; the rest stay in the test set
train_image_list, test_image_list = train_image_list + test_image_list[:1025], test_image_list[1025:]
train_mask_list, test_mask_list = train_mask_list + test_mask_list[:1025], test_mask_list[1025:]

In [69]:
len(train_image_list) , len(train_mask_list)

(2975, 2975)

In [None]:
# https://cs230.stanford.edu/blog/datapipeline/

In [None]:
# READ FROM RAW
val_image, val_mask = readDataImages(val_image_list, 
                             val_mask_list)

test_image, test_mask = readDataImages(test_image_list, 
                             test_mask_list)

train_image, train_mask = readDataImages(train_image_list, 
                             train_mask_list)

In [88]:
# SEG GEN
val_image, val_mask = stackDataSets(val_image, val_mask)

train_image, train_mask= stackDataSets(train_image, train_mask)
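
The body of `stackDataSets` lives in the `utils` modules and is not shown in this notebook. As a rough illustration of how fine-grained labels can be merged into the 8 categories, the sketch below builds a lookup table and applies it to a mask with NumPy indexing. The ID ranges follow the public Cityscapes label definitions and are an assumption about this dataset; `ID_RANGES`, `LOOKUP` and `remap_mask` are hypothetical names.

```python
import numpy as np

CATEGORIES = ['void', 'flat', 'construction', 'object',
              'nature', 'sky', 'human', 'vehicle']

# Assumed Cityscapes label-ID ranges (inclusive), in CATEGORIES order.
ID_RANGES = [(0, 6), (7, 10), (11, 16), (17, 20),
             (21, 22), (23, 23), (24, 25), (26, 33)]

# Lookup table: position = original label ID, value = category index (0..7).
LOOKUP = np.zeros(34, dtype=np.uint8)
for category_index, (first_id, last_id) in enumerate(ID_RANGES):
    LOOKUP[first_id:last_id + 1] = category_index


def remap_mask(mask):
    """Map a mask of original label IDs (0..33) to category indices (0..7)."""
    return LOOKUP[mask]


# Tiny 2x3 mask: road (7), building (11), sky (23), person (24), car (26), unlabeled (0)
mask = np.array([[7, 11, 23], [24, 26, 0]])
print(remap_mask(mask))
```

Only the masks go through this remapping; the input images are left untouched.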

In [None]:
# AUGMENT AND MERGE 
# OPTION TO AUG OR NOT

val_image, val_mask = augConcat(val_image.numpy(), val_mask.numpy()) 

train_image, train_mask = augConcat(train_image.numpy(), train_mask.numpy() ) 
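
`augConcat` is defined in the `utils` modules and is not shown in this notebook. The sketch below is a NumPy-only stand-in for the idea, not the Albumentations pipeline itself: it flips each image and its mask together, changes the brightness of the image only, and concatenates the result with the originals so the dataset doubles. Rotation is left out for brevity. `flip_brighten_concat` and its `brightness` parameter are hypothetical, and images are assumed to be floats scaled to [0, 1].

```python
import numpy as np


def flip_brighten_concat(images, masks, brightness=0.1):
    """Return the originals plus one augmented copy of every image/mask pair.

    images: float array of shape (N, H, W, 3), assumed scaled to [0, 1]
    masks:  array of shape (N, H, W, C)
    """
    # Geometric transform: flip the width axis of images AND masks,
    # so every pixel keeps its label.
    aug_images = images[:, :, ::-1, :]
    aug_masks = masks[:, :, ::-1, :]

    # Photometric transform: brightness applies to the images only.
    aug_images = np.clip(aug_images + brightness, 0.0, 1.0)

    return (np.concatenate([images, aug_images], axis=0),
            np.concatenate([masks, aug_masks], axis=0))


rng = np.random.default_rng(0)
images = rng.random((4, 8, 8, 3))
masks = rng.integers(0, 8, size=(4, 8, 8, 1))
all_images, all_masks = flip_brighten_concat(images, masks)
print(all_images.shape, all_masks.shape)  # (8, 8, 8, 3) (8, 8, 8, 1)
```

Applying the flip to the mask as well is the important part: a geometric change that touches only the image would leave the labels pointing at the wrong pixels.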

In [None]:
train_image.shape , val_image.shape

In [94]:
class_names = list(cats.keys())
class_names

['void', 'flat', 'construction', 'object', 'nature', 'sky', 'human', 'vehicle']

In [89]:
val_image.shape, val_mask.shape

(TensorShape([500, 224, 224, 3]), TensorShape([500, 224, 224, 1]))

In [90]:
# Convert Numpy data to DataSets for training (a choice for convenience)
val_dataset=tf.data.Dataset.from_tensor_slices((val_image, val_mask ))

train_dataset=tf.data.Dataset.from_tensor_slices((train_image, train_mask ))

# TRAINING

In [None]:
# Download the pretrained VGG16 weights if they are not already present
#import tensorflow as tf
#from tensorflow.keras import layers, Model
#!wget https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5

#from utils_fcn_vgg_model import *

model_vgg_fcn = segmentation_model(n_classes=8)

In [None]:
#model_vgg_fcn.summary()


tf.config.run_functions_eagerly(True)

sgd = tf.keras.optimizers.SGD(learning_rate=1E-2,
                              momentum=0.9, 
                              nesterov=True)

model_vgg_fcn.compile(loss='categorical_crossentropy',
                      optimizer=sgd,
                      metrics=['accuracy'] )

In [None]:
train_dataset1=train_dataset.take(2000)
val_dataset1=val_dataset.take(300)


In [None]:
EPOCHS = 5
VAL_SUBSPLITS = 5
BUFFER_SIZE = 250
BATCH_SIZE = 32

#train_data_unet1 = train_data_unet.batch(BATCH_SIZE)
#val_data_unet1 = val_data_unet.batch(BATCH_SIZE)

train_data_vgg = train_dataset1.cache().shuffle(BUFFER_SIZE).batch(BATCH_SIZE)
val_data_vgg = val_dataset1.cache().shuffle(BUFFER_SIZE).batch(BATCH_SIZE)



print(train_data_vgg.element_spec)

model_history = model_vgg_fcn.fit(train_data_vgg, 
          epochs=EPOCHS, 
          validation_data=val_data_vgg, 
          verbose=2,
          callbacks=[EarlyStopping(
              patience=3,
              min_delta=0.05,
              baseline=0.8,
              mode='min',
              monitor='val_loss',
              restore_best_weights=True,
              verbose=1)
          ])
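
The stopping rule configured above can be illustrated without Keras. The sketch below is a simplified re-implementation of patience-based early stopping in `min` mode with a `min_delta` threshold. It ignores `baseline` and weight restoration and is not the Keras source code. The function name and the loss values are made up for illustration.

```python
def simulate_early_stopping(val_losses, patience=3, min_delta=0.05):
    """Return (stop_epoch, best_epoch), both 0-based.

    stop_epoch is None if training runs through every epoch.
    """
    best_loss = float("inf")
    best_epoch = None
    wait = 0
    for epoch, loss in enumerate(val_losses):
        # An epoch counts as an improvement only if it beats the best loss
        # by more than min_delta.
        if loss < best_loss - min_delta:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return None, best_epoch


losses = [1.00, 0.80, 0.79, 0.78, 0.77, 0.50]  # made-up validation losses
print(simulate_early_stopping(losses))  # (4, 1)
```

With `min_delta=0.05`, the small gains from 0.80 down to 0.77 do not reset the patience counter, so this run stops at epoch index 4 and never reaches the 0.50 epoch. A large `min_delta` combined with a short patience can therefore end training early.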

# PREDICTION

In [None]:
for i, j in val_data_vgg.take(1):
    y_true_img = i
    y_true_seg = j
    
print(y_true_img.shape,y_true_seg.shape )

In [None]:
tf.data.experimental.enable_debug_mode()

In [None]:
# Make Prediction
# get the model prediction
results = model_vgg_fcn.predict(y_true_img)
                        #, steps=validation_steps)

# for each pixel, get the index of the class channel with the highest probability
results = np.argmax(results, axis=3)

# collapse the 8-channel one-hot true mask to a 2D map of class indices
y_true_seg = np.argmax(y_true_seg, axis=3)

In [None]:
results.shape , results[0].shape, np.max(results[0])

In [None]:
y_true_seg.shape , y_true_seg[0].shape, np.max(y_true_seg[0])

In [None]:

# generate a list that contains one color for each class
#colors = sns.color_palette(None, len(cats))


In [None]:
integer_slider=10

In [None]:
# visualize the output and metrics
colors = sns.color_palette(None, 8)

# compute metrics
iou, dice_score = compute_metrics(y_true_seg[integer_slider], 
                                  results[integer_slider], n_classes=8) 

show_predictions(y_true_img[integer_slider], 
                 [results[integer_slider], y_true_seg[integer_slider]],
                 ["Image", "Predicted Mask", "True Mask"], 
                 iou, 
                 dice_score)
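
`compute_metrics` comes from the `utils` modules and is not shown in this notebook. A common way to compute per-class IoU and Dice scores from two class-index maps is sketched below. It is an assumption about what such a helper does, not the author's implementation, and `per_class_iou_dice` is a hypothetical name. Classes absent from both masks are reported as NaN rather than counted as perfect.

```python
import numpy as np


def per_class_iou_dice(y_true, y_pred, n_classes=8):
    """Per-class IoU and Dice for two integer class-index maps of equal shape."""
    ious, dices = [], []
    for class_id in range(n_classes):
        true_px = (y_true == class_id)
        pred_px = (y_pred == class_id)
        intersection = np.logical_and(true_px, pred_px).sum()
        total = true_px.sum() + pred_px.sum()  # pixels in truth + pixels in prediction
        union = total - intersection           # pixels in either one
        if total == 0:
            # Class appears in neither mask: the scores are undefined.
            ious.append(float("nan"))
            dices.append(float("nan"))
        else:
            ious.append(intersection / union)
            dices.append(2 * intersection / total)
    return ious, dices


y_true = np.array([[0, 0], [1, 1]])
y_pred = np.array([[0, 1], [1, 1]])
ious, dices = per_class_iou_dice(y_true, y_pred, n_classes=2)
print(ious)   # class 0: 1/2, class 1: 2/3
print(dices)  # class 0: 2/3, class 1: 4/5
```

Dice is always at least as large as IoU for the same class, so the two scores should be compared within a metric, not against each other.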