# Semantic Segmentation
The task is to segment input images into crops, weeds and soil (background). The dataset contains four groups of pictures, called teams, and within each team the pictures are further divided by crop type: Haricot and Mais.   
We chose to work on the Mais pictures from the Weedelec team.
The pictures from this team are very high resolution, which is one of the main challenges given the limited computational resources available for training deep neural networks.  
Our work builds on the UNet architecture by adding different modules and experimenting with different depths. In addition to the architectures themselves, we built an infrastructure to train them either on patches or on downsampled images.  
Patching was tested with and without overlapping patches. Where patches overlap, we resolve the conflict by taking the minimum of the predicted class values: background (class 0) is the most frequent class in the images, so the minimum favours the most likely label for a disputed pixel.  
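
A minimal sketch of the minimum rule for overlapping patches, written in plain Python for illustration. It assumes class 0 is the background and that two neighbouring label strips share `overlap` columns (with `overlap` greater than zero); `merge_min` and the toy strips are hypothetical names, not part of the notebook's code.

```python
def merge_min(left, right, overlap):
    """Merge two horizontally adjacent label strips whose last/first
    `overlap` columns cover the same pixels, keeping the smaller class id.
    Assumes overlap > 0."""
    merged = []
    for row_l, row_r in zip(left, right):
        # Pixels seen by both patches: keep the minimum label.
        shared = [min(a, b) for a, b in zip(row_l[-overlap:], row_r[:overlap])]
        merged.append(row_l[:-overlap] + shared + row_r[overlap:])
    return merged


# Toy 1x4 label strips (0 = background, 1 = crop, 2 = weed) overlapping by 2 columns.
left = [[1, 1, 2, 0]]
right = [[0, 2, 2, 2]]
merged = merge_min(left, right, overlap=2)
print(merged)  # [[1, 1, 0, 0, 2, 2]]
```

In the shared columns the two patches disagree (2 vs 0 and 0 vs 2), and the rule settles both pixels on the background label.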
Downsampling was used to stay within the GPU memory limit, balancing a reduction in image size against a reduction in batch size. At prediction time we upsample the predicted masks with nearest-neighbor interpolation.  
As the loss we chose the sparse categorical focal loss: our targets are not one-hot encoded, so we need a sparse loss, and we also need to account for class imbalance (see https://arxiv.org/abs/2006.14822 for further details).  
As the optimizer we chose Lazy Adam, which is well suited to sparse updates (see https://www.tensorflow.org/addons/tutorials/optimizers_lazyadam).  
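
To illustrate why a focal loss helps with class imbalance, here is a simplified per-pixel sketch in plain Python. It assumes the standard focal form `-(1 - p_t)^gamma * log(p_t)` with an integer (sparse) label and no class weights; the helper names and the toy probabilities are hypothetical, and the notebook itself relies on `SparseCategoricalFocalLoss` from the `focal-loss` package.

```python
import math


def cross_entropy(probs, label):
    """Plain sparse cross-entropy for one pixel."""
    return -math.log(probs[label])


def sparse_focal_loss(probs, label, gamma=2.0):
    """Focal loss for one pixel: cross-entropy scaled by (1 - p_t)^gamma,
    where p_t is the probability assigned to the true class."""
    p_t = probs[label]
    return ((1.0 - p_t) ** gamma) * cross_entropy(probs, label)


# Softmax outputs ordered as (background, crop, weed).
easy = [0.9, 0.05, 0.05]  # background pixel, predicted confidently
hard = [0.6, 0.1, 0.3]    # weed pixel, predicted with low probability

for name, probs, label in (("easy", easy, 0), ("hard", hard, 2)):
    print(name, cross_entropy(probs, label), sparse_focal_loss(probs, label))
```

With `gamma=2` the confidently classified background pixel contributes about 1% of its cross-entropy, while the poorly classified weed pixel keeps roughly half of it, so frequent easy pixels dominate the total loss much less.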
Network Architectures:
For some of our network architectures we implemented the Renet layer (see https://arxiv.org/abs/1511.07053), substantially adapting the implementations found in https://github.com/fvisin/reseg and https://github.com/hydxqing/ReNet-pytorch-keras-chapter3 . Our implementation relies on TensorFlow- and Keras-specific methods to be performant and compatible with the rest of the model. We use this module to break the incoming feature maps into smaller patches and capture local information.  
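
The patch-splitting step of the Renet layer can be pictured with the following plain-Python sketch. It is a simplification under stated assumptions: a single-channel feature map, dimensions that are exact multiples of the patch size, and no recurrent layers; `extract_patches` is a hypothetical helper, not the notebook's implementation.

```python
def extract_patches(grid, ph, pw):
    """Split a 2-D grid (list of rows) into non-overlapping ph x pw patches.
    Returns one sequence per row of patches; each patch is flattened
    into a 1-D feature vector."""
    h, w = len(grid), len(grid[0])
    sequences = []
    for top in range(0, h - ph + 1, ph):
        sequence = []
        for left in range(0, w - pw + 1, pw):
            patch = [grid[top + i][left + j] for i in range(ph) for j in range(pw)]
            sequence.append(patch)
        sequences.append(sequence)
    return sequences


grid = [[r * 4 + c for c in range(4)] for r in range(4)]  # 4x4 toy feature map
sequences = extract_patches(grid, 2, 2)
print(len(sequences), len(sequences[0]), len(sequences[0][0]))  # 2 2 4
print(sequences[0][0])  # [0, 1, 4, 5]
```

Each row of patches becomes a sequence of feature vectors, which is the kind of input a recurrent layer can then sweep over.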
- UNet: the depth parameter controls the number of convolution blocks in the encoder and decoder paths, so depth 4 gives a 4-block encoder and a 4-block decoder. The start_f parameter sets the number of filters of the first convolution block, which doubles at each subsequent block.   
Results:
  - input=256x256 downsampled, depth=4 start_f=32: 0.37
  - input=512x512 tiling (patching with no overlap), depth=5 start_f=32: 0.69
- DuccioNet: this is a UNet that places three Renet layers between the encoder and the decoder. Dividing the feature maps into smaller patches is intended to better identify weeds, which are very fine-grained (high-resolution) details.  
Results:
  - input=1152x768 downsampled, depth=4 start_f=32: 0.64
  - input=864x864 patching with overlap=54, depth=5 start_f=32: 0.67
- FrebianiNet: this is a DuccioNet with three atrous (dilated) convolutions in parallel with the Renet layers; their output is concatenated with that of the three Renet layers. The atrous convolutions at the bottom of the network are inspired by https://ieeexplore.ieee.org/document/8999616.   
Results:
  - input=1152x768 downsampled, depth=4 start_f=32: 0.64
  - input=864x864 patching with overlap=54, depth=5 start_f=32: 0.69 (should have run for more epochs, but training is very computationally expensive and hard to carry out on Colab) 

In [None]:
%tensorflow_version 2.x
from google.colab import drive
drive.mount('/content/drive')
!cp "/content/drive/My Drive/Development_Dataset.zip" .
!mkdir Results
!unzip -q Development_Dataset.zip
!pip install focal-loss

In [None]:
# Cell output set up for Jupyter
from pathlib import Path
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"
from itertools import cycle
from focal_loss import SparseCategoricalFocalLoss

In [None]:
import tensorflow as tf
import tensorflow_addons as tfa
import os
from PIL import Image
import numpy as np
from tensorflow.keras.layers import Conv2D,MaxPool2D,Cropping2D,Concatenate,Conv2DTranspose, Activation, BatchNormalization, Dropout
from tensorflow.keras.callbacks import ReduceLROnPlateau
from tensorflow_addons.layers import WeightNormalization


SEED = 1234
tf.random.set_seed(SEED)  

# Config


In [None]:
dataset_name = "Development_Dataset"
folder_name = "Dataset"
crop_list = ["Haricot", "Mais"]
team_list = ["Bipbip", "Pead", "Roseau", "Weedelec"]
curr_crop = "Mais"
curr_team = "Weedelec"
curr_patch_size = 864
img_h = 768
img_w = 1152
img_folder = "Images"
ann_folder = "Annotations"
ratio = 0.8
batch_size = 4

def change_config(new_width = img_w , new_height = img_h ,
                  new_team = curr_team, new_crop = curr_crop , new_patch_size = curr_patch_size,
                  new_ratio = ratio, new_batch_size = batch_size):
  if new_width:
    global img_w
    img_w = new_width
  if new_height:  
    global img_h 
    img_h = new_height
  if new_team:
    global curr_team 
    curr_team = new_team
  if new_crop:
    global curr_crop
    curr_crop = new_crop
  if new_patch_size:
    global curr_patch_size 
    curr_patch_size = new_patch_size
    img_h = curr_patch_size 
    img_w = curr_patch_size 
  global img_folder
  global ann_folder
  img_folder = "Image_Patches" if new_patch_size else "Images"
  ann_folder = "Annotation_Patches" if new_patch_size else "Annotations"
  if new_ratio:
    global ratio
    ratio = new_ratio
  if new_batch_size:
    global batch_size
    batch_size = new_batch_size
change_config()
def print_config():
  print("The current batch size is: " + str(batch_size) + "\n"
      "The shape of the image is: (" + str(img_w) + ',' + str(img_h) + ")\n"
      "The current team is: " + curr_team + "\n"
      "The current crop is: " + curr_crop + "\n"
      "The current patch size is: " + str(curr_patch_size) + "\n"
      "The current split ratio is: "+ str(ratio) + "\n"
        )
  
print_config()

The current batch size is: 4
The shape of the image is: (864,864)
The current team is: Weedelec
The current crop is: Mais
The current patch size is: 864
The current split ratio is: 0.8



# Building Directory structure

In [None]:
def build_directory_structure():
  Path().joinpath(folder_name).mkdir(parents=True, exist_ok=True)
  Path().joinpath(folder_name,"Annotations").mkdir(parents=True, exist_ok=True)
  Path().joinpath(folder_name,"Images").mkdir(parents=True, exist_ok=True)
  Path().joinpath(folder_name,"Splits").mkdir(parents=True, exist_ok=True)
  Path().joinpath(folder_name,"Splits","train.txt").open("w+").close()
  Path().joinpath(folder_name,"Splits","val.txt").open("w+").close()

def fill_directory(team = curr_team ,crop = curr_crop ,patch = None):
  #Checking if team and crop are specified, if not it is considered global
  if not team:
    if not crop:
      # Visit every (team, crop) combination; zip with cycle would only pair each team with one crop
      for t in team_list:
        for c in crop_list:
          fill_directory(t,c)
      return
    for t in team_list:
      fill_directory(t,crop)
    return
  if not crop:
    for c in crop_list:
      fill_directory(team,c)
    return
  #Implementing the method for crop and team specified
  team_crop_directory = Path().joinpath(dataset_name,"Training",team,crop)
  for i,j in [("Images","Images"),("Masks","Annotations")]:
    for ext in ["*.jpg","*.png"]:
      for path in team_crop_directory.joinpath(i).glob(ext):
        file_destination = str(Path().joinpath(folder_name,j,path.name))
        path.rename(file_destination)
  if patch:
    patchify(patch,
             [(Path().joinpath(folder_name,"Images"),Path().joinpath(folder_name,"Image_Patches")),
              (Path().joinpath(folder_name,"Annotations"),Path().joinpath(folder_name,"Annotation_Patches"))])

def split(ratio: float = ratio,image_folder = img_folder):
  images = Path().joinpath(folder_name,image_folder)
  num_images = len([x for x in images.glob("*")])
  training_num = int(num_images * ratio)
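  train_path = Path().joinpath(folder_name,"Splits","train.txt")
  val_path = Path().joinpath(folder_name,"Splits","val.txt")
  # Context managers make sure both files are flushed and closed before they are read back
  with train_path.open("a") as train, val_path.open("a") as val:
    for i, img in enumerate(images.glob("*")):
      if i < training_num:
        train.write(img.stem + "\n")
      else:
        val.write(img.stem + "\n")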

def patchify(patch_size,transfer_list):
   if not patch_size:
     return
   for i,j in transfer_list:
    Path().joinpath(j).mkdir(parents=True, exist_ok=True)
    for ext in ["*.jpg","*.png"]:
      for path in Path().joinpath(i).glob(ext):
        k = 0
        img = Image.open(path)
        if img.size[0] < patch_size or img.size[1] < patch_size:
          print("Image from " + str(path) + " of shape " + str(img.size) + " too small for patch size " + str(patch_size) + " resizing...")
          img = img.resize((patch_size, patch_size))
          
        new_patches =patchify_overlap(np.array(img),(patch_size,patch_size),54)
        for m in new_patches:
          m = Image.fromarray(m)
          m.save(Path().joinpath(j, path.stem + '_' + str(k) + path.suffix))
          k +=  1
        
       # for k in range(0,img.width // patch_size):
       #   for s in range(0,img.height // patch_size):
       #     left = k*patch_size
       #     upper = s*patch_size
       #     patch = img.crop((left,upper,left+patch_size,upper+patch_size))
       #     patch.save(Path().joinpath(j, path.stem + '_' + str(k) + '-' + str(s) + path.suffix))


In [None]:
import time
import numpy as np
from numpy.lib.stride_tricks import as_strided


def window_nd(a, window, steps = None, axis = None, outlist = False):
    ashp = np.array(a.shape)
    if a.shape[0] < window[0] or a.shape[1] < window[1]:
      print("Patch size" + str(window) + " too large for this picture of size " + str(a.shape))
      out = np.empty((1,a.shape[0],a.shape[1],3),dtype=np.uint8)
      out[0] = a
      return out
      

    if axis is not None:
        axs = np.array(axis, ndmin = 1)
    else:
        axs = np.arange(ashp.size)

    window = np.array(window, ndmin = 1)
    wshp = ashp.copy()
    wshp[axs] = window

    stp = np.ones_like(ashp)
    if steps:
        steps = np.array(steps, ndmin = 1)
        stp[axs] = steps

    astr = np.array(a.strides)

    shape = tuple((ashp - wshp) // stp + 1) + tuple(wshp)
    strides = tuple(astr * stp) + tuple(astr)

    as_strided = np.lib.stride_tricks.as_strided
    a_view = np.squeeze(as_strided(a, 
                                 shape = shape, 
                                 strides = strides, writeable=False))
    if outlist:
        return list(a_view.reshape((-1,) + tuple(wshp)))
    else:
        # return view (N, p_h, p_w, channels)
        return a_view.reshape((-1,) + tuple(wshp)) #a_view


def patchify_overlap(img, patch_shape=(1000,1000), overlap=10):

    img_h, img_w = img.shape[:2]
    p_h, p_w = patch_shape[:2]

    return window_nd(img, (p_h, p_w), 
        steps=(p_h-overlap,p_w-overlap), axis=(0,1))

# simple loop to collect image back with overlap

def unpatchify_overlap(patches, image_size, overlap=10):
    img_h, img_w = image_size[:2]
    n_p, p_h, p_w = patches.shape[:3]

    n_h = (img_h - overlap) // (p_h - overlap)
    if (img_h - overlap) % (p_h - overlap) > 0:
        n_h += 1
    n_w = (img_w - overlap) // (p_w - overlap)

    img = np.zeros((img_h, img_w, image_size[2]), dtype=patches.dtype)

    patch_idx = 0
    pos_h = 0
    pos_w = 0

    for i in range(n_h):
        patch_offset_h = overlap//2 if i > 0 else 0
        height_left = img_h - pos_h
        h_to_insert = np.min([p_h - patch_offset_h, height_left])

        for j in range(n_w):
            p = patches[patch_idx]
            patch_offset_w = overlap//2 if j > 0 else 0
            width_left = img_w - pos_w
            w_to_insert = np.min([p_w - patch_offset_w, width_left])

            print('h:{}, w:{}, h_i:{}, w_i:{}'.format(pos_h, pos_w,
                h_to_insert, w_to_insert))

            img[pos_h:(pos_h+h_to_insert),pos_w:(pos_w+w_to_insert),:] = (
                    p[patch_offset_h:(h_to_insert + patch_offset_h),
                      patch_offset_w:(w_to_insert + patch_offset_w), :])

            pos_w += w_to_insert - overlap // 2
            patch_idx += 1
            print('patch {}/{}'.format(patch_idx, len(patches)))

            if patch_idx > len(patches) - 1:
                return img

        pos_w = 0
        pos_h += h_to_insert - overlap // 2

    return img
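
The sliding-window arithmetic used for overlapping patches can be checked with a small plain-Python sketch. It mirrors the `(length - window) // step + 1` shape computation in `window_nd`, counting only full windows; the 2048x1536 image size is a hypothetical example, and `count_patches` and `covered` are illustrative helpers that are not part of the notebook.

```python
def count_patches(length, patch, overlap):
    """Number of full windows of size `patch` along one axis when
    consecutive windows share `overlap` pixels."""
    if length < patch:
        return 0
    step = patch - overlap
    return (length - patch) // step + 1


def covered(length, patch, overlap):
    """Pixels along the axis that fall inside at least one full window."""
    n = count_patches(length, patch, overlap)
    return 0 if n == 0 else patch + (n - 1) * (patch - overlap)


# Hypothetical 2048x1536 image with the notebook's patch size (864) and overlap (54).
for length in (2048, 1536):
    print(length, count_patches(length, 864, 54), covered(length, 864, 54))
```

For these hypothetical sizes the windows cover 1674 of 2048 pixels along one axis and 864 of 1536 along the other, so a border region is left out unless the image is padded or resized first.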




In [None]:
def run_data_config(ratio: float = ratio,team = curr_team ,crop = curr_crop ,patch = None,image_folder = img_folder):
  build_directory_structure()
  fill_directory(team = team, crop = crop, patch = patch)
  split(ratio = ratio, image_folder = image_folder)
run_data_config(patch = 864 )

In [None]:
#Path_folder is the folder where the predicted small masks are
#prefix is the prefix of the names of the mask to search
def unpatchify(full_size,patch_size,path_folder,prefix):
  if not patch_size:
    return
  #img_new = Image.new(size=full_size,mode="RGB")
  patches = list(Path(path_folder).glob(prefix + '*'))
  print(len(patches))
  patch_array = np.empty((len(patches),patch_size,patch_size,3),dtype=np.uint8)
  print(patch_array.shape)
  print("Current unpatchify prefix: " + prefix)
  for patch in patches:
    print(patch)
    index = int(patch.stem.split('_')[-1])
    patch_array[index] = np.array(Image.open(patch))
  img_new = unpatchify_overlap(patch_array,full_size,54)
  return img_new

# Defining augmentation/custom dataset


In [None]:
# ImageDataGenerator
# ------------------

from tensorflow.keras.preprocessing.image import ImageDataGenerator

def image_augmentation(rotation=0,w_shift=0,h_shift=0,zoom=0,h_flip=False,v_flip=False, fill_mode="", rescale=1, preprocess_input=None):
  return ImageDataGenerator(rotation_range=rotation,
                                      width_shift_range=w_shift,
                                      height_shift_range=h_shift,
                                      zoom_range=zoom,
                                      horizontal_flip=h_flip,
                                      vertical_flip=v_flip,
                                      fill_mode=fill_mode,
                                      rescale=rescale,
                                      preprocessing_function=preprocess_input)



In [None]:
def read_rgb_mask(img_path):
    '''
    img_path: path to the mask file
    Returns the numpy array containing target values
    '''

    mask_img = Image.open(img_path)
    mask_arr = np.array(mask_img)

    new_mask_arr = np.zeros(mask_arr.shape[:2], dtype=mask_arr.dtype)

    # Use RGB dictionary in 'RGBtoTarget.txt' to convert RGB to target
    new_mask_arr[np.where(np.all(mask_arr == [216, 124, 18], axis=-1))] = 0
    new_mask_arr[np.where(np.all(mask_arr == [255, 255, 255], axis=-1))] = 1
    new_mask_arr[np.where(np.all(mask_arr == [216, 67, 82], axis=-1))] = 2

    return new_mask_arr
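
The colour-to-class conversion can be illustrated on a tiny mask with plain Python. The mapping is copied from `read_rgb_mask`; `rgb_to_classes` and the toy mask are hypothetical, and the fallback to class 0 for unlisted colours reflects the zero-initialised target array in that function.

```python
# Colour-to-class mapping copied from read_rgb_mask.
RGB_TO_CLASS = {
    (216, 124, 18): 0,
    (255, 255, 255): 1,
    (216, 67, 82): 2,
}


def rgb_to_classes(rgb_rows, default=0):
    """Map every RGB pixel to its class id; unknown colours get `default`."""
    return [[RGB_TO_CLASS.get(tuple(px), default) for px in row] for row in rgb_rows]


toy_mask = [
    [(216, 124, 18), (255, 255, 255)],
    [(216, 67, 82), (0, 0, 0)],  # (0, 0, 0) is not in the mapping
]
classes = rgb_to_classes(toy_mask)
print(classes)  # [[0, 1], [2, 0]]
```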


In [None]:
from PIL import Image


class CustomDataset(tf.keras.utils.Sequence):

  """
    CustomDataset inheriting from tf.keras.utils.Sequence.

    3 main methods:
      - __init__: save dataset params like directory, filenames..
      - __len__: return the total number of samples in the dataset
      - __getitem__: return a sample from the dataset

    Note: 
      - the custom dataset return a single sample from the dataset. Then, we use 
        a tf.data.Dataset object to group samples into batches.
      - in this case we have a different structure of the dataset in memory. 
        We have all the images in the same folder and the training and validation splits
        are defined in text files.

  """

  def __init__(self, dataset_dir, which_subset, img_generator=None, mask_generator=None, 
               preprocessing_function=None, out_shape=[img_h, img_w],img_folder = img_folder ,ann_folder = ann_folder):
    if which_subset == 'training':
      subset_file = os.path.join(dataset_dir, 'Splits', 'train.txt')
    elif which_subset == 'validation':
      subset_file = os.path.join(dataset_dir, 'Splits', 'val.txt')
    
    with open(subset_file, 'r') as f:
      lines = f.readlines()
    
    subset_filenames = []
    for line in lines:
      subset_filenames.append(line.strip()) 

    self.which_subset = which_subset
    self.dataset_dir = dataset_dir
    self.subset_filenames = subset_filenames
    self.img_generator = img_generator
    self.mask_generator = mask_generator
    self.preprocessing_function = preprocessing_function
    self.out_shape = out_shape
    self.img_folder = img_folder
    self.ann_folder = ann_folder

  def __len__(self):
    return len(self.subset_filenames)

  def __getitem__(self, index):
    # Read Image
    curr_filename = self.subset_filenames[index]
    img = Image.open(os.path.join(self.dataset_dir, self.img_folder, curr_filename + '.jpg'))
    mask = Image.fromarray(read_rgb_mask(os.path.join(self.dataset_dir, self.ann_folder , curr_filename + '.png')))

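    # Resize image and mask
    img = img.resize(self.out_shape)
    # Nearest-neighbour resampling keeps the mask values valid class ids
    mask = mask.resize(self.out_shape, resample=Image.NEAREST)
    
    img_arr = np.array(img)
    mask_arr = np.array(mask)

    mask_arr = np.expand_dims(mask_arr, -1)
    # Default to the unaugmented mask so out_mask is always defined
    out_mask = mask_arr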

    if self.which_subset == 'training':
      if self.img_generator is not None and self.mask_generator is not None:
        # Perform data augmentation
        # We can get a random transformation from the ImageDataGenerator using get_random_transform
        # and we can apply it to the image using apply_transform
        img_t = self.img_generator.get_random_transform(img_arr.shape, seed=SEED)
        mask_t = self.mask_generator.get_random_transform(mask_arr.shape, seed=SEED)
        img_arr = self.img_generator.apply_transform(img_arr, img_t)
        # ImageDataGenerator use bilinear interpolation for augmenting the images.
        # Thus, when applied to the masks it will output 'interpolated classes', which
        # is an unwanted behaviour. As a trick, we can transform each class mask 
        # separately and then we can cast to integer values (as in the binary segmentation notebook).
        # Finally, we merge the augmented binary masks to obtain the final segmentation mask.
        out_mask = np.zeros_like(mask_arr)
        for c in np.unique(mask_arr):
          if c > 0:
            curr_class_arr = np.float32(mask_arr == c)
            curr_class_arr = self.mask_generator.apply_transform(curr_class_arr, mask_t)
            # from [0, 1] to {0, 1}
            curr_class_arr = np.uint8(curr_class_arr)
            # recover original class
            curr_class_arr = curr_class_arr * c 
            out_mask += curr_class_arr
    else:
      out_mask = mask_arr
    
    if self.preprocessing_function is not None:
        img_arr = self.preprocessing_function(img_arr)

    return img_arr, np.float32(out_mask)

# Instantiating custom dataset

In [None]:
from tensorflow.keras.applications.vgg19 import preprocess_input
dataset = CustomDataset(Path().joinpath(folder_name), 'training', 
                        img_generator=image_augmentation(rotation=30, 
                                                         w_shift=30, 
                                                         h_shift=30, 
                                                         zoom=0.1, 
                                                         h_flip=True, 
                                                         v_flip=True, 
                                                         fill_mode='reflect',
                                                         preprocess_input=tf.keras.applications.vgg16.preprocess_input), 
                        mask_generator=image_augmentation(rotation=30, 
                                                          w_shift=30, 
                                                          h_shift=30, 
                                                          zoom=0.1, 
                                                          h_flip=True, 
                                                          v_flip=True, 
                                                          fill_mode='reflect',
                                                          preprocess_input=tf.keras.applications.vgg16.preprocess_input))
dataset_valid = CustomDataset(Path().joinpath(folder_name),
                              'validation',)

In [None]:
train_dataset = tf.data.Dataset.from_generator(lambda: dataset,
                                               output_types=(tf.float32, tf.float32),
                                               output_shapes=([img_w, img_h, 3], [img_w, img_h, 1]))

train_dataset = train_dataset.batch(batch_size)

train_dataset = train_dataset.repeat()

valid_dataset = tf.data.Dataset.from_generator(lambda: dataset_valid,
                                               output_types=(tf.float32, tf.float32),
                                               output_shapes=([img_w, img_h, 3], [img_w,img_h, 1]))
valid_dataset = valid_dataset.batch(batch_size)

valid_dataset = valid_dataset.repeat()

# Renet Layer

In [None]:
from tensorflow.keras.layers import Permute, LSTM, Bidirectional, Lambda, UpSampling2D

class RenetLayer(tf.keras.layers.Layer):
  
  def __init__(self,patch_size,n_hidden,stack_sublayers,batch_size,shape,regularizer):
    super(RenetLayer,self).__init__()
    self.patch_size = patch_size
    self.n_hidden = n_hidden
    self.stack_sublayers = stack_sublayers
    self.batch_size = batch_size
    self.input_layer = None
    self.regularizer = regularizer
    self.shape = shape
    _, cwidth, cheight, cchannels = self.shape
    pwidth, pheight, pchannels = self.patch_size
    psize = pheight * pwidth * pchannels
    npatchesH = cheight // pheight
    npatchesW = cwidth // pwidth
    self.patch_size = list(self.patch_size)
    self.patch_size.insert(0, 1)
    self.patch_size[-1] = 1
    self.tf_patches = Lambda(lambda x: tf.image.extract_patches(x,sizes=self.patch_size, strides=self.patch_size, rates=[1, 1, 1, 1], padding="SAME"))
    self.l_sub0 = Lambda(lambda x: tf.reshape(x,(self.batch_size,npatchesW, psize))) 
    # LSTM takes 3d tensor (batch, timesteps, features)
    
    self.l_bidir0_forward = LSTM(n_hidden, return_sequences=True, kernel_regularizer=self.regularizer)
    self.l_bidir0_backward = LSTM(n_hidden, return_sequences=True, kernel_regularizer=self.regularizer)
    self.l_bidir0_concat = Concatenate(axis = 2)
    
    self.l_sub1_l = Lambda(lambda x: tf.reshape(x,(self.batch_size, npatchesH, 2*n_hidden)))
    
    self.l_bidir1_forward = LSTM(n_hidden, return_sequences=True, kernel_regularizer=self.regularizer)
    self.l_bidir1_backward = LSTM(n_hidden, return_sequences=True, kernel_regularizer=self.regularizer)
    self.l_bidir1_concat = Concatenate(axis = 2)

    self.l_sub1_bil = Lambda(lambda x: tf.reshape(x,(self.batch_size, npatchesW, npatchesH, 2 * self.n_hidden)))
  def call(self, inputs):
    x = self.tf_patches(inputs)
    x = self.l_sub0(x)
    
    y = self.l_bidir0_forward(x)
    z = self.l_bidir0_backward(y)
    x = self.l_bidir0_concat([y, z])
    
    x = self.l_sub1_l(x)

    y = self.l_bidir1_forward(x)
    z = self.l_bidir1_backward(y)
    x = self.l_bidir1_concat([y, z])

    x = self.l_sub1_bil(x)
    return x
  
  def get_config(self):
    return {"patch_size":self.patch_size,
            "n_hidden": self.n_hidden,
            "stack_sublayers": self.stack_sublayers,
            "batch_size": self.batch_size,
            "shape": self.shape,
            "regularizer":self.regularizer}

# Unet Backbone

In [None]:
def get_norm_activation(function, y):
  t = BatchNormalization()(y)
  t = Activation(function)(t)
  return t

def get_convolution_layer(x, filters, kernel_regularizer=None):
  z = Conv2D(filters=filters,kernel_size=(3,3),use_bias=False,padding='same', kernel_regularizer=kernel_regularizer)(x)
  z = get_norm_activation("relu", z)
  z = Conv2D(filters=filters,kernel_size=(3,3),use_bias=False,padding='same', kernel_regularizer=kernel_regularizer)(z)
  z = get_norm_activation("relu", z)
  return z

def get_decoding_layer(x, filters, kernel_regularizer=None):
  y = Conv2D(filters=filters,kernel_size=(3,3),use_bias=False,padding='same', kernel_regularizer=kernel_regularizer)(x)
  y = get_norm_activation("relu", y)
  y = Conv2D(filters=filters,kernel_size=(3,3),use_bias=False,padding='same', kernel_regularizer=kernel_regularizer)(y)
  y = get_norm_activation("relu", y)
  y = Conv2DTranspose(filters=filters//2,kernel_size=(2,2),padding='same',strides=2)(y)
  return y

def get_down_and_crop(x, cropped):
  cropped.append(Cropping2D(cropping=((0,0)))(x))
  s = MaxPool2D(pool_size=(2,2))(x)
  return s, cropped

def get_encoding_layer(x, filters, cropped, kernel_regularizer=None):
  m = get_convolution_layer(x, filters, kernel_regularizer)
  m, cropped = get_down_and_crop(m, cropped)
  return m, cropped

def build_unet(start_f, depth,size, kernel_regularizer=None, p_dropout=0):
  cropped = []
  inputs = tf.keras.Input(shape=(size[0],size[1],3))

  # Encoding
  for i in range(0, depth):
    x, cropped = get_encoding_layer(inputs if i==0 else x, start_f, cropped, kernel_regularizer)
    start_f *= 2
  x = get_convolution_layer(x, start_f, kernel_regularizer)
  
  x = Dropout(p_dropout)(x) # regularization at the bottleneck

  # Decoding
  start_f = start_f //2
  print(cropped)
  x = Conv2DTranspose(filters=start_f,kernel_size=(2,2),padding='same',strides=2)(x)
  x = Concatenate()([cropped[-1],x])

  for j in range(1, depth):
    x = get_decoding_layer(x, start_f, kernel_regularizer)
    x = Concatenate()([cropped[-(j + 1)],x])
    start_f = start_f // 2
  
  x = get_convolution_layer(x, start_f, kernel_regularizer)

  outputs = Conv2D(filters=3,kernel_size=(1,1),use_bias=False,activation='softmax')(x)
  return tf.keras.Model(inputs,outputs),inputs,outputs
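
The way `start_f` and `depth` shape the network can be traced without building a model. The sketch below is plain Python and only mirrors the bookkeeping in `build_unet` (filters double and spatial size halves at each encoder block); `unet_plan` is a hypothetical helper, and the divisibility check is our reading of the layer arithmetic (2x2 pooling followed by stride-2 transposed convolutions), not a check the notebook performs.

```python
def unet_plan(start_f, depth, size):
    """Return the (filters, height, width) of every skip connection
    and of the bottleneck for a UNet built as in build_unet."""
    h, w = size
    if h % (2 ** depth) or w % (2 ** depth):
        # Otherwise the upsampled maps would not match the skip connections.
        raise ValueError("each side must be divisible by 2**depth")
    skips = []
    f = start_f
    for _ in range(depth):
        skips.append((f, h, w))  # feature map stored before pooling
        f, h, w = f * 2, h // 2, w // 2
    return skips, (f, h, w)


skips, bottleneck = unet_plan(32, 5, (288, 288))
print(skips)
print(bottleneck)  # (1024, 9, 9)
```

For `start_f=32`, `depth=5` and a 288x288 input the skip connections go from 32 filters at 288x288 down to 512 filters at 18x18, which matches the tensors printed by the UNet cell.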

# UNet

In [None]:
regularizer = tf.keras.regularizers.l2(1e-4)
model,_,_ = build_unet(32, 5,[288,288], tf.keras.regularizers.l2(1e-4), 0)

model.summary()

[<KerasTensor: shape=(None, 288, 288, 32) dtype=float32 (created by layer 'cropping2d')>, <KerasTensor: shape=(None, 144, 144, 64) dtype=float32 (created by layer 'cropping2d_1')>, <KerasTensor: shape=(None, 72, 72, 128) dtype=float32 (created by layer 'cropping2d_2')>, <KerasTensor: shape=(None, 36, 36, 256) dtype=float32 (created by layer 'cropping2d_3')>, <KerasTensor: shape=(None, 18, 18, 512) dtype=float32 (created by layer 'cropping2d_4')>]
Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_1 (InputLayer)            [(None, 288, 288, 3) 0                                            
__________________________________________________________________________________________________
conv2d (Conv2D)                 (None, 288, 288, 32) 864         input_1[0][0]                    
_______________________________________

# DuccioNet: Open all

In [None]:
def build_unet_renet(start_f, depth,size, kernel_regularizer=None, p_dropout=0):
  cropped = []
  inputs = tf.keras.Input(shape=(size[0],size[1],3))

  # Encoding
  for i in range(0, depth):
    x, cropped = get_encoding_layer(inputs if i==0 else x, start_f, cropped, kernel_regularizer)
    start_f *= 2
  x = get_convolution_layer(x, start_f, kernel_regularizer)
  x = Dropout(p_dropout)(x) 
  x = RenetLayer((2, 2,512),256,True,-1,x.shape,kernel_regularizer)(x)
  x = RenetLayer((1,1,512),256,True,-1,x.shape,kernel_regularizer)(x)
  x = RenetLayer((1,1,512),256,True,-1,x.shape,kernel_regularizer)(x)
                                              
  # Decoding
  start_f = start_f //2
  print(cropped)
  x = Conv2DTranspose(filters=start_f,kernel_size=(2,2),padding='same',strides=2)(x)
  x = Conv2DTranspose(filters=start_f,kernel_size=(2,2),padding='same',strides=2)(x)
  x = Concatenate()([cropped[-1],x])

  for j in range(1, depth):
    x = get_decoding_layer(x, start_f, kernel_regularizer)
    x = Concatenate()([cropped[-(j + 1)],x])
    start_f = start_f // 2
  
  x = get_convolution_layer(x, start_f, kernel_regularizer)

  outputs = Conv2D(filters=3,kernel_size=(1,1),use_bias=False,activation='softmax')(x)
  return tf.keras.Model(inputs,outputs),inputs,outputs

regularizer = tf.keras.regularizers.l2(1e-4)
patch_learning = True
if patch_learning:
  change_config(new_patch_size=864, new_batch_size=4)
  model,_,_ = build_unet_renet(32, 4,[curr_patch_size,curr_patch_size], tf.keras.regularizers.l2(1e-4), 0.5)
else:
  model,_,_ = build_unet_renet(32, 4,[img_h,img_w], tf.keras.regularizers.l2(1e-4), 0.5)
model.summary()

[<KerasTensor: shape=(None, 864, 864, 32) dtype=float32 (created by layer 'cropping2d_9')>, <KerasTensor: shape=(None, 432, 432, 64) dtype=float32 (created by layer 'cropping2d_10')>, <KerasTensor: shape=(None, 216, 216, 128) dtype=float32 (created by layer 'cropping2d_11')>, <KerasTensor: shape=(None, 108, 108, 256) dtype=float32 (created by layer 'cropping2d_12')>]
Model: "model_2"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_3 (InputLayer)            [(None, 864, 864, 3) 0                                            
__________________________________________________________________________________________________
conv2d_42 (Conv2D)              (None, 864, 864, 32) 864         input_3[0][0]                    
__________________________________________________________________________________________________
batch_normalization

# FrebianiNet

In [None]:
def build_unet_frebiani(start_f, depth,size, kernel_regularizer=None, p_dropout=0):
  cropped = []
  inputs = tf.keras.Input(shape=(size[0],size[1],3))

  # Encoding
  for i in range(0, depth):
    x, cropped = get_encoding_layer(inputs if i==0 else x, start_f, cropped, kernel_regularizer)
    start_f *= 2
  x = get_convolution_layer(x, start_f, kernel_regularizer)
  x = Dropout(p_dropout)(x) 
  y = Conv2D(filters=start_f,kernel_size=(3,3),use_bias=False,padding='same', kernel_regularizer=kernel_regularizer,dilation_rate=(2,2))(x)
  y = Conv2D(filters=start_f ,kernel_size=(3,3),use_bias=False,padding='same', kernel_regularizer=kernel_regularizer,dilation_rate=(2,2))(y)
  y = Conv2D(filters=start_f,kernel_size=(3,3),use_bias=False,padding='same', kernel_regularizer=kernel_regularizer,dilation_rate=(2,2))(y)
  x = RenetLayer((2, 2,512),256,True,-1,x.shape,kernel_regularizer)(x)
  x = RenetLayer((1,1,512),256,True,-1,x.shape,kernel_regularizer)(x)
  x = RenetLayer((1,1,512),256,True,-1,x.shape,kernel_regularizer)(x)
  x = Conv2DTranspose(filters=start_f,kernel_size=(2,2),padding='same',strides=2)(x)
  x = Concatenate()([y,x])                                           
  # Decoding
  start_f = start_f //2
  print(cropped)
  x = Conv2DTranspose(filters=start_f,kernel_size=(2,2),padding='same',strides=2)(x)
  x = Concatenate()([cropped[-1],x])

  for j in range(1, depth):
    x = get_decoding_layer(x, start_f, kernel_regularizer)
    x = Concatenate()([cropped[-(j + 1)],x])
    start_f = start_f // 2
  
  x = get_convolution_layer(x, start_f, kernel_regularizer)

  outputs = Conv2D(filters=3,kernel_size=(1,1),use_bias=False,activation='softmax')(x)
  return tf.keras.Model(inputs,outputs),inputs,outputs

regularizer = tf.keras.regularizers.l2(1e-4)
patch_learning = True
change_config(new_patch_size=None)
if patch_learning:
  model,_,_ = build_unet_frebiani(32, 4,[curr_patch_size,curr_patch_size], tf.keras.regularizers.l2(1e-4), 0.5)
else:
  model,_,_ = build_unet_frebiani(32, 4,[img_w,img_h], tf.keras.regularizers.l2(1e-4), 0.5)
model.summary()

[<KerasTensor: shape=(None, 864, 864, 32) dtype=float32 (created by layer 'cropping2d_18')>, <KerasTensor: shape=(None, 432, 432, 64) dtype=float32 (created by layer 'cropping2d_19')>, <KerasTensor: shape=(None, 216, 216, 128) dtype=float32 (created by layer 'cropping2d_20')>, <KerasTensor: shape=(None, 108, 108, 256) dtype=float32 (created by layer 'cropping2d_21')>]
Model: "model_4"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_5 (InputLayer)            [(None, 864, 864, 3) 0                                            
__________________________________________________________________________________________________
conv2d_90 (Conv2D)              (None, 864, 864, 32) 864         input_5[0][0]                    
__________________________________________________________________________________________________
batch_normalizatio

# Model Compilation

In [None]:
# Optimization params
# -------------------

# Loss
loss = SparseCategoricalFocalLoss(gamma=2)
# learning rate
lr = 1e-3
optimizer = tfa.optimizers.LazyAdam(
        learning_rate=lr)
# -------------------

# Intersection over union (IoU) computed per class over the whole batch,
# then averaged over the crop (1) and weed (2) classes; background (0) is excluded
def meanIoU(y_true, y_pred):
    # get predicted class from softmax
    y_pred = tf.expand_dims(tf.argmax(y_pred, -1), -1)

    per_class_iou = []

    for i in range(1,3): # exclude the background class 0
      # Get prediction and target related to only a single class (i)
      class_pred = tf.cast(tf.where(y_pred == i, 1, 0), tf.float32)
      class_true = tf.cast(tf.where(y_true == i, 1, 0), tf.float32)
      intersection = tf.reduce_sum(class_true * class_pred)
      union = tf.reduce_sum(class_true) + tf.reduce_sum(class_pred) - intersection
    
      iou = (intersection + 1e-7) / (union + 1e-7)
      per_class_iou.append(iou)

    return tf.reduce_mean(per_class_iou)

# Validation metrics
# ------------------
metrics = ['accuracy', meanIoU]
# ------------------

# Compile Model
model.compile(optimizer=optimizer, loss=loss, metrics=metrics)
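
For intuition about the loss configured above, the effect of `gamma=2` can be sketched in NumPy. This is an illustrative re-implementation of the standard focal loss formula, FL = -(1 - p_t)^gamma * log(p_t), and not the code of the `SparseCategoricalFocalLoss` class used in this notebook; the helper name `sparse_focal_loss_np` and the toy probabilities are hypothetical.

```python
import numpy as np

def sparse_focal_loss_np(y_true, probs, gamma=2.0, eps=1e-7):
    """Per-sample focal loss for integer labels (hypothetical helper).

    y_true: int array of shape (N,), class indices
    probs:  float array of shape (N, C), softmax outputs
    """
    y_true = np.asarray(y_true)
    probs = np.clip(np.asarray(probs, dtype=np.float64), eps, 1.0)
    # Probability assigned to the true class of each sample
    p_t = probs[np.arange(len(y_true)), y_true]
    # (1 - p_t)^gamma down-weights samples that are already well classified
    return -((1.0 - p_t) ** gamma) * np.log(p_t)

# Toy example: an easy background pixel and a hard weed pixel
labels = np.array([0, 2])
probs = np.array([[0.9, 0.05, 0.05],
                  [0.6, 0.3, 0.1]])
focal = sparse_focal_loss_np(labels, probs, gamma=2.0)
cross_entropy = sparse_focal_loss_np(labels, probs, gamma=0.0)  # gamma=0 reduces to cross-entropy
print(focal, cross_entropy)
```

The easy pixel keeps only 1% of its cross-entropy value while the hard pixel keeps 81%, which is how the loss shifts attention away from the dominant background class.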

# Callbacks

In [None]:
from datetime import datetime


exps_dir = Path("/").joinpath("content", "drive", "MyDrive", "Colab Notebooks", "Homework2","Results")
exps_dir.mkdir(parents=True, exist_ok=True)

now = datetime.now().strftime('%b%d_%H-%M-%S')

model_name = 'UNet'

exp_dir = Path(exps_dir).joinpath(model_name + '_' + str(now))
exp_dir.mkdir(parents=True, exist_ok=True)
    
callbacks = []

# Model checkpoint
ckpt_dir = Path(exp_dir).joinpath('ckpts')
ckpt_dir.mkdir(parents=True, exist_ok=True)

ckpt_callback = tf.keras.callbacks.ModelCheckpoint(filepath=os.path.join(ckpt_dir, 'cp_{epoch:02d}.ckpt'), 
                                                   save_weights_only=True)  # False to save the model directly
callbacks.append(ckpt_callback)

# Visualize Learning on Tensorboard
# ---------------------------------
tb_dir = Path(exp_dir).joinpath('tb_logs')
tb_dir.mkdir(parents=True, exist_ok=True)
    
# By default shows losses and metrics for both training and validation
tb_callback = tf.keras.callbacks.TensorBoard(log_dir=tb_dir,
                                             profile_batch=0,
                                             histogram_freq=0)  # if 1 shows weights histograms
callbacks.append(tb_callback)

# Early Stopping
# --------------
early_stop = True
if early_stop:
    es_callback = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True)
    callbacks.append(es_callback)

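# Learning Rate Annealing
# mode='max' because a higher IoU is better; the default 'auto' mode treats any metric without 'acc' in its name as one to minimize
learning_rate_reduction=ReduceLROnPlateau(monitor='val_meanIoU', mode='max', patience=7, verbose=1, factor=0.5, min_lr=1e-6)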

lr_scheduling = True
if lr_scheduling:
    callbacks.append(learning_rate_reduction)
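
To see what the learning-rate schedule does, here is a minimal pure-Python sketch of a reduce-on-plateau rule for a metric that should increase, using the same `patience=7`, `factor=0.5` and `min_lr=1e-6` as above. It is a simplification for illustration (no cooldown, no minimum delta) and not the Keras implementation; `simulate_plateau_schedule` and the metric values are hypothetical.

```python
def simulate_plateau_schedule(metric_history, lr=1e-3, factor=0.5,
                              patience=7, min_lr=1e-6):
    """Return the learning rate in effect after each epoch (hypothetical helper)."""
    best = float("-inf")
    wait = 0
    lrs = []
    for value in metric_history:
        if value > best:
            # The monitored metric improved: remember it and reset the counter
            best = value
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                # No improvement for `patience` epochs: shrink the learning rate
                lr = max(lr * factor, min_lr)
                wait = 0
        lrs.append(lr)
    return lrs

# Two improving epochs, seven stagnant ones, then a new best value
history = [0.30, 0.35] + [0.34] * 7 + [0.40]
lrs = simulate_plateau_schedule(history)
print(lrs)
```

In this toy run the learning rate is halved once, at the seventh stagnant epoch, and stays at the reduced value afterwards.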

# Model fit

In [None]:
print_config()
steps_train = len(dataset) // batch_size 
steps_val = len(dataset_valid) // batch_size 
model.fit(x=train_dataset,
          epochs=100,  # the training dataset must repeat, since steps_per_epoch is fixed
          steps_per_epoch=steps_train,
          validation_data=valid_dataset,
          validation_steps=steps_val, 
          callbacks=callbacks)

# Load from checkpoint, set correct checkpoint number

In [None]:
model.load_weights(Path("/").joinpath("content", "drive", "MyDrive", "Colab Notebooks", "Homework2", "Results", "UNet_Dec11_18-32-03", "ckpts", "cp_11.ckpt"))


# Prediction

In [None]:
import json

def rle_encode(img):
    '''
    img: numpy array, 1 - foreground, 0 - background
    Returns the run-length encoding as a space-separated string of "start length" pairs (starts are 1-indexed)
    '''
    pixels = img.flatten()
    pixels = np.concatenate([[0], pixels, [0]])
    runs = np.where(pixels[1:] != pixels[:-1])[0] + 1
    runs[1::2] -= runs[::2]
    return ' '.join(str(x) for x in runs)

def read_rgb_mask(img_path):
    '''
    img_path: path to the mask file
    Returns the numpy array containing target values
    '''

    mask_img = Image.open(img_path)
    mask_arr = np.array(mask_img)

    new_mask_arr = np.zeros(mask_arr.shape[:2], dtype=mask_arr.dtype)

    # Use RGB dictionary in 'RGBtoTarget.txt' to convert RGB to target
    new_mask_arr[np.where(np.all(mask_arr == [216, 124, 18], axis=-1))] = 0
    new_mask_arr[np.where(np.all(mask_arr == [255, 255, 255], axis=-1))] = 1
    new_mask_arr[np.where(np.all(mask_arr == [216, 67, 82], axis=-1))] = 2

    return new_mask_arr
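
def predict(img,width,height):
  '''
  img: PIL image already sized for the model
  width, height: size of img, used to allocate the RGB mask
  Returns the predicted mask as an RGB PIL image
  '''
  # PIL arrays are (height, width, 3); the model is fed (width, height, 3)
  image_array = np.array(img, dtype=np.float32).transpose((1, 0, 2))
  out_softmax = model.predict(x=tf.expand_dims(image_array, 0))
  # Most probable class per pixel, then drop the batch dimension
  predicted_class = tf.argmax(out_softmax, -1)
  predicted_class = predicted_class[0, ...]
  prediction = np.zeros([width, height, 3])
  prediction[np.where(predicted_class == 0)] = [0, 0, 0]
  prediction[np.where(predicted_class == 1)] = [255, 255, 255]
  prediction[np.where(predicted_class == 2)] = [216, 67, 82]
  # Transpose back to (height, width, 3) before building the image
  prediction = Image.fromarray(np.array(prediction, dtype=np.uint8).transpose((1, 0, 2))) 
  return prediction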

def build_json(skip_teams=None):
    # skip_teams is currently unused: every team folder is processed
    submission_dict = {}
    #Creating folder for downsampled masks
    downsampled_masks = Path().joinpath('Results',"Downsampled_Masks")
    downsampled_masks.mkdir(parents=True, exist_ok=True)
    img_patches = Path().joinpath('Test_Dev',"Image_Patches")
    img_patches.mkdir(parents=True, exist_ok=True)
    for team_folder in [f for f in Path().joinpath("Development_Dataset", "Test_Dev").iterdir() if f.is_dir()]:
        for crop_folder in [f for f in Path(team_folder).iterdir() if f.is_dir()]:
          img_folder = Path(crop_folder).joinpath("Images")
          patchify(curr_patch_size,[(img_folder,img_patches)])
          for img_file in img_folder.iterdir():
              submission_dict[img_file.stem] = {}
              if curr_patch_size:
                width, height = Image.open(img_file).size
                if width < curr_patch_size or height < curr_patch_size:
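                  # Image smaller than one patch: predict background everywhere,
                  # as a (height, width, 3) array like the other branches
                  prediction = np.zeros((height, width, 3), dtype=np.uint8)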
                else:
                  for img_patch in img_patches.glob(img_file.stem + "*"):
                    img_patch_opened = Image.open(img_patch)
                    width_p, height_p = img_patch_opened.size
                    prediction = predict(img_patch_opened, width_p, height_p)
                    prediction.save(downsampled_masks.joinpath(img_patch.stem + ".png"))
                  prediction = unpatchify((height,width, 3),curr_patch_size,downsampled_masks,img_file.stem)
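              else:
                img = Image.open(img_file)
                o_width , o_height = img.size
                img = img.resize((img_w,img_h))
                width,height = img.size
                print("Image size in input to predict is: " + str(img.size))
                prediction = predict(img,width,height)
                # Upsample with nearest neighbor so that mask colors are not blended into values that match no class
                prediction = prediction.resize((o_width,o_height), resample=Image.NEAREST)
                prediction = np.array(prediction)
                # The encoded mask has the original size, so report that size below
                width, height = o_width, o_height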
              print("Prediction shape: " + str(prediction.shape))
              new_mask_arr = np.zeros(prediction.shape[:2], dtype=prediction.dtype)
              # Use RGB dictionary in 'RGBtoTarget.txt' to convert RGB to target
              new_mask_arr[np.where(np.all(prediction == [216, 124, 18], axis=-1))] = 0
              new_mask_arr[np.where(np.all(prediction == [255, 255, 255], axis=-1))] = 1
              new_mask_arr[np.where(np.all(prediction == [216, 67, 82], axis=-1))] = 2
              submission_dict[img_file.stem]["shape"] = (width, height)
              submission_dict[img_file.stem]["team"] = team_folder.name
              submission_dict[img_file.stem]["crop"] = crop_folder.name
              submission_dict[img_file.stem]['segmentation'] = {}
              # RLE encoding
              # crop
              rle_encoded_crop = rle_encode(new_mask_arr == 1)
              # weed
              rle_encoded_weed = rle_encode(new_mask_arr == 2)
              submission_dict[img_file.stem]['segmentation']['crop'] = rle_encoded_crop
              submission_dict[img_file.stem]['segmentation']['weed'] = rle_encoded_weed

    Path().joinpath("predictions").mkdir(parents=True, exist_ok=True)
    submission_path = Path().joinpath("predictions", "submission.json")
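    # Use a context manager so the file is flushed and closed after writing
    with Path(submission_path).open("w") as submission_file:
        json.dump(submission_dict, submission_file)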

build_json()
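
As a sanity check on the submission format, the run-length encoding above can be inverted. The sketch below repeats `rle_encode` so that it runs on its own and adds a hypothetical `rle_decode` helper (not part of the original notebook) that rebuilds the mask from the 1-indexed "start length" pairs.

```python
import numpy as np

def rle_encode(img):
    """Same logic as the notebook's rle_encode: 1-indexed 'start length' pairs."""
    pixels = img.flatten()
    pixels = np.concatenate([[0], pixels, [0]])
    runs = np.where(pixels[1:] != pixels[:-1])[0] + 1
    runs[1::2] -= runs[::2]
    return ' '.join(str(x) for x in runs)

def rle_decode(rle, shape):
    """Hypothetical inverse of rle_encode; returns a boolean mask of `shape`."""
    flat = np.zeros(int(np.prod(shape)), dtype=bool)
    values = [int(v) for v in rle.split()]
    for start, length in zip(values[0::2], values[1::2]):
        flat[start - 1:start - 1 + length] = True  # starts are 1-indexed
    return flat.reshape(shape)

mask = np.array([[0, 1, 1, 0],
                 [0, 0, 1, 1]], dtype=bool)
encoded = rle_encode(mask)
decoded = rle_decode(encoded, mask.shape)
print(encoded)  # prints "2 2 7 2"
```

Decoding a few entries of `submission.json` this way and comparing them with the saved masks is a cheap way to catch a swapped width and height before submitting.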