# U-Net architecture definition

We now define the structure of our FCNN. Let's start by importing the packages. NB: we do not import Keras directly; we import it through TensorFlow (`tensorflow.keras`). We do this to be sure that we always use the same, mutually compatible packages.

In [1]:
import os

# Set these before importing TensorFlow so that they are sure to take effect.
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'   # silence TensorFlow C++ log messages
os.environ["CUDA_VISIBLE_DEVICES"] = "0"   # make only GPU 0 visible

import numpy as np
import tensorflow as tf
from tensorflow.keras import backend as K
from tensorflow.keras import regularizers, initializers
from tensorflow.keras.activations import softmax
from tensorflow.keras.layers import Input, BatchNormalization, SpatialDropout2D, Conv2D, Conv2DTranspose, Concatenate, Activation, Add
from tensorflow.keras.models import Sequential, Model

**Regularizer**: we can apply a penalty on layer parameters or layer activity during optimization. In the following, we will use `kernel_regularizer`, which applies a penalty on the kernel weights of the layer.

**Initializer**: before training a NN we can choose the distribution from which the initial weights are drawn.
In this exercise, both regularizers and initializers are declared inside the convolutional layer.
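
As a rough illustration of what an L2 kernel regularizer adds to the loss, the sketch below computes the penalty by hand with NumPy. It assumes the penalty has the form `reg * sum(w**2)`; the helper name `l2_penalty`, the toy kernel shape and the standard deviation of the random weights are made up for this example.

```python
import numpy as np

def l2_penalty(kernel, reg):
    """L2 penalty as assumed here: reg times the sum of squared kernel weights."""
    return reg * np.sum(np.square(kernel))

# Toy 3x3 kernel with 1 input channel and 2 filters, drawn from a normal
# distribution (similar in spirit to kernel_initializer='random_normal').
rng = np.random.default_rng(0)
kernel = rng.normal(loc=0.0, scale=0.05, size=(3, 3, 1, 2))

penalty = l2_penalty(kernel, reg=0.1)
print(f"L2 penalty added to the loss: {penalty:.6f}")
```

The larger the weights, the larger the penalty, so the optimizer is pushed towards small kernel values.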

**Input**: `Input` is a function used to instantiate a Keras tensor. A Keras tensor is a symbolic tensor-like object, which we augment with certain attributes that allow us to build a Keras model just by knowing the inputs and outputs of the model. Examples of these attributes are the input shape and the batch size. You can find the full attribute list [here](https://keras.io/api/layers/core_layers/input/).

**BatchNormalization**: [Batch normalization](https://arxiv.org/abs/1502.03167) is a technique that speeds up and stabilizes training by reducing the internal covariate shift; it also has a regularizing effect. This layer normalizes its inputs using the mean and standard deviation of the current batch, keeping the mean of its output close to 0 and the standard deviation close to 1.
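
The normalization step can be sketched in a few lines of NumPy. This is only a training-time illustration under simplifying assumptions: the learnable scale and shift are left at 1 and 0, the moving averages used at inference time are ignored, and the function name `batch_norm_train` is hypothetical.

```python
import numpy as np

def batch_norm_train(x, eps=1e-3):
    """Normalize each channel (last axis) with the statistics of the current batch.
    The learnable scale (gamma) and shift (beta) are kept at 1 and 0 for simplicity."""
    mean = x.mean(axis=(0, 1, 2), keepdims=True)
    var = x.var(axis=(0, 1, 2), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(4, 8, 8, 16))  # (batch, height, width, channels)
y = batch_norm_train(x)

print(y.mean(axis=(0, 1, 2))[:3])  # close to 0 for every channel
print(y.std(axis=(0, 1, 2))[:3])   # close to 1 for every channel
```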

**SpatialDropout2D**: Dropout is a method used to regularize the NN. During training it randomly sets input units to 0 at each step, with a frequency given by `rate`, which helps prevent overfitting. [SpatialDropout2D](https://arxiv.org/abs/1411.4280) instead drops entire feature maps. This helps the network to learn independent feature maps.
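
The difference between the two kinds of dropout can be seen by building the masks by hand. The NumPy sketch below is an illustration only (it assumes the usual rescaling of the kept values by `1 / (1 - rate)`, and all variable names are invented for the example).

```python
import numpy as np

rng = np.random.default_rng(0)
rate = 0.2
x = np.ones((1, 4, 4, 10))  # (batch, height, width, channels)

# Standard dropout: an independent keep/drop decision for every element.
elementwise_mask = rng.random(x.shape) >= rate
dropout_out = x * elementwise_mask / (1.0 - rate)

# Spatial dropout: one decision per (sample, channel), broadcast over height and width.
channel_mask = rng.random((x.shape[0], 1, 1, x.shape[-1])) >= rate
spatial_out = x * channel_mask / (1.0 - rate)

# Each feature map is now either entirely zero or entirely kept (and rescaled).
per_map_values = [np.unique(spatial_out[0, :, :, c]) for c in range(x.shape[-1])]
print(per_map_values)
```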

**Conv2D**: [2D Convolution Layer](https://keras.io/api/layers/convolution_layers/convolution2d/). In the link you can find the parameters we can set in the convolutional layers.

**Conv2DTranspose**: as discussed in this morning's lesson, the U-Net has a U structure: in the left part, (strided) convolutions, pooling and activations are used to compress the information, while in the right part the compressed information is upsampled back to the input size. We can perform upsampling mainly in two ways:
1) A simple parameter-free upsampling: 

<center><img src="../images/upsampling.png" alt="fishy" class="bg-primary mb-1" width="500px"/></center>
    

2) A transposed convolution, with the advantage that the network has kernels that can learn the best way to upsample the information. This is how it works:
    
<center><img src="../images/deconvolution.png" alt="fishy" class="bg-primary mb-1" width="700px"></center>
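
Both upsampling strategies can be sketched for a single-channel image in NumPy. This is a toy illustration, not the Keras implementation: the helper names are hypothetical, and the transposed convolution uses no padding and a hand-picked kernel instead of a learned one.

```python
import numpy as np

def upsample_nearest(x, factor=2):
    """Parameter-free upsampling: repeat every pixel `factor` times along each axis."""
    return np.repeat(np.repeat(x, factor, axis=0), factor, axis=1)

def transposed_conv2d(x, kernel, stride=2):
    """Single-channel transposed convolution without padding: every input pixel
    pastes a scaled copy of the kernel into the output; overlaps are summed."""
    h, w = x.shape
    k = kernel.shape[0]
    out = np.zeros(((h - 1) * stride + k, (w - 1) * stride + k))
    for i in range(h):
        for j in range(w):
            out[i * stride:i * stride + k, j * stride:j * stride + k] += x[i, j] * kernel
    return out

x = np.array([[1.0, 2.0],
              [3.0, 4.0]])

up_fixed = upsample_nearest(x)                      # 4x4 output
up_learned = transposed_conv2d(x, np.ones((2, 2)))  # 4x4 output, all-ones kernel

print(up_fixed)
print(np.array_equal(up_fixed, up_learned))
```

With a 2x2 kernel of ones and stride 2, the transposed convolution reproduces the parameter-free upsampling; during training the kernel values are free to move away from this choice.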

**Concatenate**: it simply concatenates two tensors. In this case, we are using the concatenate layer to create the skip connections between the left and the right part of the U-Net.

**Activation**: the activation layer applies a non-linear function to the output of the previous layer. In this case we are going to use a simple ReLU. 

**Add**: the add layer simply sums two input tensors element-wise. In this architecture it is used to build a residual connection inside each block of both the compression and the decompression path.
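
The effect of the two merging operations on tensor shapes can be checked with plain NumPy arrays. The shapes below are arbitrary and chosen only for illustration.

```python
import numpy as np

skip = np.zeros((1, 64, 64, 16))      # e.g. a feature map from the compression path
upsampled = np.ones((1, 64, 64, 16))  # e.g. the upsampled map at the same depth

# Concatenate along the channel axis: spatial sizes must match, channels stack.
merged = np.concatenate([skip, upsampled], axis=-1)
print(merged.shape)    # (1, 64, 64, 32)

# Add: all dimensions must match, and the shape does not change.
residual = skip + upsampled
print(residual.shape)  # (1, 64, 64, 16)
```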


# Compression path definition
In this part of the code, we define the structure of each block of the compression path.

In [2]:
def unet_layer_left(previous_layer, n_filters, ker_size_1, strides_1, ker_size_2, strides_2, ker_size_3, strides_3, reg):
    '''Definition of the layers of one block of the left part of the U-Net. 
    Parameters
    ------------------
    previous_layer : input layer for this block;
    n_filters : number of filters of all the Conv2D layers;
    ker_size_1 : kernel dimension of the first Conv2D layer of this block;
    strides_1 : stride of the first Conv2D layer;
    ker_size_2/3 : kernel dimension of the second/third Conv2D layers;
    strides_2/3 : stride of the second/third Conv2D layers;
    reg : float, L2 regularization factor applied to the kernels of all Conv2D layers;
    '''
    layer_L=Conv2D(filters=n_filters,kernel_size=ker_size_1,strides=strides_1,kernel_regularizer=regularizers.l2(reg), kernel_initializer='random_normal')(previous_layer)
    layer_L=BatchNormalization(axis=-1)(layer_L)  
    layer_L_shortcut=layer_L
    layer_L=Activation('relu')(layer_L)
    layer_L=SpatialDropout2D(0.2)(layer_L)
    layer_L=Conv2D(filters=n_filters,kernel_size=ker_size_2,strides=strides_2,padding='same',kernel_regularizer=regularizers.l2(reg), kernel_initializer='random_normal')(layer_L)
    layer_L=BatchNormalization(axis=-1)(layer_L)  
    layer_L=Activation('relu')(layer_L)
    layer_L=SpatialDropout2D(0.2)(layer_L)
    layer_L=Conv2D(filters=n_filters,kernel_size=ker_size_3,strides=strides_3,padding='same',kernel_regularizer=regularizers.l2(reg), kernel_initializer='random_normal')(layer_L)
    layer_L=BatchNormalization(axis=-1)(layer_L)  
    layer_L=Add()([layer_L,layer_L_shortcut])
    layer_L=Activation('relu')(layer_L)
    layer_L=SpatialDropout2D(0.2)(layer_L)
    return layer_L

**Questions**:
1) How many convolution layers are there? 
2) Why did we store the variable 'layer_L' in 'layer_L_shortcut'?
3) What is padding? What does 'same' mean?

# Bottleneck block

The following cell contains the code for the bottom block of the U-Net.

In [3]:
def unet_layer_bottleneck(previous_layer, n_filters, ker_size_1, strides_1, ker_size_2, strides_2, reg):
    '''Definition of the bottleneck (bottom) block of the U-Net. 
    Parameters
    ------------------
    previous_layer : input layer for this block;
    n_filters : number of filters of all the Conv2D layers;
    ker_size_1 : kernel dimension of the first Conv2D layer of this block;
    strides_1 : stride of the first Conv2D layer;
    ker_size_2 : kernel dimension of the second Conv2D layer;
    strides_2 : stride of the second Conv2D layer;
    reg : float, L2 regularization factor applied to the kernels of all Conv2D layers;
    '''
    layer_L=Conv2D(filters=n_filters,kernel_size=ker_size_1,strides=strides_1,kernel_regularizer=regularizers.l2(reg), kernel_initializer='random_normal')(previous_layer)
    layer_L=BatchNormalization(axis=-1)(layer_L)  
    layer_L_shortcut=layer_L 
    layer_L=Activation('relu')(layer_L)
    layer_L=Conv2D(filters=n_filters,kernel_size=ker_size_2,strides=strides_2,padding='same',kernel_regularizer=regularizers.l2(reg), kernel_initializer='random_normal')(layer_L)
    layer_L=BatchNormalization(axis=-1)(layer_L)    
    layer_L=Add()([layer_L,layer_L_shortcut])
    layer_L=Activation('relu')(layer_L)
    return layer_L

# Decompression path definition
Here, we define the decompression path. As you can see each block of the decompression path takes as input the corresponding left layer in order to build the skip connections. 

In [4]:
def unet_layer_right(previous_layer, layer_left, n_filters, ker_size_1, strides_1, output_pad, ker_size_2, strides_2, ker_size_3, strides_3, reg):
    '''Definition of the layers of one block of the right part of the U-Net. 
    Parameters
    ------------------
    previous_layer : input layer for this block;
    layer_left : output layer of the left part at the same depth;
    n_filters : int, number of filters of all the convolutional layers;
    ker_size_1 : int, kernel dimension of the Conv2DTranspose layer that opens this block;
    strides_1 : int, stride of the Conv2DTranspose layer;
    output_pad : tuple or None, output_padding passed to the Conv2DTranspose layer;
    ker_size_2/3 : int, kernel dimension of the two following Conv2D layers;
    strides_2/3 : int, stride of the two following Conv2D layers;
    reg : float, L2 regularization factor applied to the kernels of all convolutional layers;
    '''
    layer_R=Conv2DTranspose(filters=n_filters,kernel_size=ker_size_1,strides=strides_1,output_padding=output_pad,kernel_regularizer=regularizers.l2(reg), kernel_initializer='random_normal')(previous_layer)
    layer_R=BatchNormalization(axis=-1)(layer_R)
    layer_R_shortcut=layer_R
    layer_R=Activation('relu')(layer_R)
    merge=Concatenate(axis=-1)([layer_left,layer_R])
    layer_R=Conv2D(filters=n_filters,kernel_size=ker_size_2,strides=strides_2,padding='same',kernel_regularizer=regularizers.l2(reg), kernel_initializer='random_normal')(merge)
    layer_R=BatchNormalization(axis=-1)(layer_R)  
    layer_R=Activation('relu')(layer_R)
    layer_R=SpatialDropout2D(0.2)(layer_R)
    layer_R=Conv2D(filters=n_filters,kernel_size=ker_size_3,strides=strides_3,padding='same',kernel_regularizer=regularizers.l2(reg), kernel_initializer='random_normal')(layer_R)
    layer_R=BatchNormalization(axis=-1)(layer_R)  
    layer_R=Add()([layer_R,layer_R_shortcut])
    layer_R=Activation('relu')(layer_R)
    layer_R=SpatialDropout2D(0.2)(layer_R)

    return layer_R

**Questions**
1) Why is axis set to -1 in the concatenate layer? What does it mean?
2) What condition(s) have to be satisfied in order to perform the concatenation?

**Very hard question**
3) If the input image size is not a power of 2, which parameters of the convolutional layers should we check in order to avoid errors? 

# U-NET architecture
Here we put together all the pieces we wrote above in order to write the full U-Net architecture.
The structure is: first we put the compression path (L -> Left), then the bottleneck block and lastly the decompression path (R -> Right).

In [5]:
def U_net(input_size):
    '''This function builds the network without compiling it.
    The L2 regularization factor is fixed to 0.1 in every block.
    Parameters
    ---------------------
    input_size : tuple, size of the input (height, width, channels)
    '''
    inputs=Input(shape=(input_size))
    Level_1_L = unet_layer_left(inputs, n_filters=16, ker_size_1=1,strides_1=1,ker_size_2=3, strides_2=1, ker_size_3=1, strides_3=1, reg = 0.1)
    Level_2_L = unet_layer_left(Level_1_L, n_filters=32,ker_size_1=2, strides_1=2, ker_size_2=3,strides_2=1, ker_size_3=1, strides_3=1, reg = 0.1)
    Level_3_L = unet_layer_left(Level_2_L, n_filters=64, ker_size_1=2, strides_1=2, ker_size_2=3, strides_2=1, ker_size_3=1, strides_3=1, reg = 0.1)
    Level_4_L = unet_layer_left(Level_3_L, n_filters=128, ker_size_1=2, strides_1=2, ker_size_2=3, strides_2=1, ker_size_3=1, strides_3=1, reg = 0.1)
    Level_5_L = unet_layer_left(Level_4_L, n_filters=256, ker_size_1=2, strides_1=2, ker_size_2=3, strides_2=1, ker_size_3=1, strides_3=1, reg = 0.1)
    Level_6_L = unet_layer_bottleneck(Level_5_L, n_filters= 512, ker_size_1=1, strides_1=1, ker_size_2=3, strides_2=1, reg=0.1)
    Level_5_R = unet_layer_right(Level_6_L, Level_5_L, n_filters=256, ker_size_1=1, strides_1=1, output_pad=None, ker_size_2=3, strides_2=1, ker_size_3=3, strides_3=1, reg=0.1)
    Level_4_R = unet_layer_right(Level_5_R, Level_4_L, n_filters=128, ker_size_1=2, strides_1=2, output_pad=None,ker_size_2=3, strides_2=1, ker_size_3=1, strides_3=1, reg = 0.1)
    Level_3_R = unet_layer_right(Level_4_R, Level_3_L, n_filters=64, ker_size_1=2, strides_1=2, output_pad=None, ker_size_2=3, strides_2=1, ker_size_3=1, strides_3=1, reg = 0.1)
    Level_2_R = unet_layer_right(Level_3_R, Level_2_L, n_filters=32, ker_size_1=2, strides_1=2, output_pad=None, ker_size_2=3, strides_2=1, ker_size_3=1, strides_3=1, reg = 0.1)
    Level_1_R = unet_layer_right(Level_2_R, Level_1_L, n_filters=16, ker_size_1=2, strides_1=2, output_pad=None, ker_size_2=3, strides_2=1, ker_size_3=1, strides_3=1, reg = 0.1)
    output=Conv2D(filters=1,kernel_size=1,strides=1, activation = 'sigmoid', kernel_regularizer=regularizers.l2(0.1))(Level_1_R)
    model=Model(inputs=inputs,outputs=output)
    return model

# TEST
Let's test the architecture:
- we generate a set of mock data
- we train the U-Net

In [6]:
#mock data
test_data = np.zeros((2,256,256,1))
test_label = np.zeros((2,256,256,1))

In [7]:
# we call the U-Net function
model = U_net((256,256,1))

In [8]:
met = tf.keras.metrics.MeanIoU(num_classes=2) # built-in Keras metric: Intersection over Union (defined here, but not passed to compile)
model.compile(optimizer="rmsprop", loss="binary_crossentropy") # optimizer and loss; the output is a single sigmoid channel, so we use binary cross-entropy
model.summary() # print the model
history=model.fit(test_data, test_label, epochs=3,verbose=1) # here we fit the model

Model: "model"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
 input_1 (InputLayer)           [(None, 256, 256, 1  0           []                               
                                )]                                                                
                                                                                                  
 conv2d (Conv2D)                (None, 256, 256, 16  32          ['input_1[0][0]']                
                                )                                                                 
                                                                                                  
 batch_normalization (BatchNorm  (None, 256, 256, 16  64         ['conv2d[0][0]']                 
 alization)                     )                                                             
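
To make the Intersection over Union idea concrete, here is a small NumPy sketch for a binary mask. It is an assumption-laden illustration rather than the Keras metric: it scores only the foreground class, thresholds the sigmoid probabilities at 0.5, and the function name `binary_iou` is hypothetical.

```python
import numpy as np

def binary_iou(y_true, y_prob, threshold=0.5):
    """IoU of the foreground class after thresholding predicted probabilities."""
    y_pred = y_prob >= threshold
    y_true = y_true.astype(bool)
    intersection = np.logical_and(y_true, y_pred).sum()
    union = np.logical_or(y_true, y_pred).sum()
    # Convention assumed here: two empty masks count as a perfect match.
    return 1.0 if union == 0 else intersection / union

y_true = np.array([[1, 1, 0],
                   [0, 1, 0]])
y_prob = np.array([[0.9, 0.4, 0.2],
                   [0.1, 0.8, 0.7]])

print(binary_iou(y_true, y_prob))  # 2 overlapping pixels / 4 pixels in the union = 0.5
```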

**What if the input size is not a power of 2?**

In [14]:
test_data = np.zeros((2,200,200,1))
test_label = np.zeros((2,200,200,1))

In [15]:
model = U_net((200,200,1))

ValueError: A `Concatenate` layer requires inputs with matching shapes except for the concatenation axis. Received: input_shape=[(None, 25, 25, 128), (None, 24, 24, 128)]

What does this error mean? Why do the two inputs of the `Concatenate` layer at level 4 (`Level_4_L` and the upsampled tensor) have different shapes? What should we do? 

In [16]:
def U_net_adj(input_size):
    '''This function builds the network without compiling it.
    The L2 regularization factor is fixed to 0.1 in every block.
    Parameters
    ---------------------
    input_size : tuple, size of the input (height, width, channels)
    '''
    inputs=Input(shape=(input_size))
    Level_1_L = unet_layer_left(inputs, n_filters=16, ker_size_1=1,strides_1=1,ker_size_2=3, strides_2=1, ker_size_3=1, strides_3=1, reg = 0.1)
    Level_2_L = unet_layer_left(Level_1_L, n_filters=32,ker_size_1=2, strides_1=2, ker_size_2=3,strides_2=1, ker_size_3=1, strides_3=1, reg = 0.1)
    Level_3_L = unet_layer_left(Level_2_L, n_filters=64, ker_size_1=2, strides_1=2, ker_size_2=3, strides_2=1, ker_size_3=1, strides_3=1, reg = 0.1)
    Level_4_L = unet_layer_left(Level_3_L, n_filters=128, ker_size_1=2, strides_1=2, ker_size_2=3, strides_2=1, ker_size_3=1, strides_3=1, reg = 0.1)
    Level_5_L = unet_layer_left(Level_4_L, n_filters=256, ker_size_1=2, strides_1=2, ker_size_2=3, strides_2=1, ker_size_3=1, strides_3=1, reg = 0.1)
    Level_6_L = unet_layer_bottleneck(Level_5_L, n_filters= 512, ker_size_1=1, strides_1=1, ker_size_2=3, strides_2=1, reg=0.1)
    Level_5_R = unet_layer_right(Level_6_L, Level_5_L, n_filters=256, ker_size_1=1, strides_1=1, output_pad=None, ker_size_2=3, strides_2=1, ker_size_3=3, strides_3=1, reg=0.1)
    Level_4_R = unet_layer_right(Level_5_R, Level_4_L, n_filters=128, ker_size_1=2, strides_1=2, output_pad=(1,1),ker_size_2=3, strides_2=1, ker_size_3=1, strides_3=1, reg = 0.1)
    Level_3_R = unet_layer_right(Level_4_R, Level_3_L, n_filters=64, ker_size_1=2, strides_1=2, output_pad=None, ker_size_2=3, strides_2=1, ker_size_3=1, strides_3=1, reg = 0.1)
    Level_2_R = unet_layer_right(Level_3_R, Level_2_L, n_filters=32, ker_size_1=2, strides_1=2, output_pad=None, ker_size_2=3, strides_2=1, ker_size_3=1, strides_3=1, reg = 0.1)
    Level_1_R = unet_layer_right(Level_2_R, Level_1_L, n_filters=16, ker_size_1=2, strides_1=2, output_pad=None, ker_size_2=3, strides_2=1, ker_size_3=1, strides_3=1, reg = 0.1)
    output=Conv2D(filters=1,kernel_size=1,strides=1, activation = 'sigmoid', kernel_regularizer=regularizers.l2(0.1))(Level_1_R)
    model=Model(inputs=inputs,outputs=output)
    return model

In [17]:
model = U_net_adj((200,200,1))
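
The shape mismatch and its fix can be traced with simple arithmetic, without building the model. The sketch below follows only the first layer of each block, since the other convolutions use stride 1 with `padding='same'` and keep the spatial size. The output-length formulas are the ones assumed for `padding='valid'` from the Keras documentation, and the helper names are hypothetical.

```python
def conv_valid(size, kernel, stride):
    """Output length of a convolution with padding='valid'."""
    return (size - kernel) // stride + 1

def deconv_valid(size, kernel, stride, output_padding=None):
    """Output length of a transposed convolution with padding='valid'
    (formula assumed from the Keras Conv2DTranspose documentation)."""
    if output_padding is None:
        return size * stride + max(kernel - stride, 0)
    return (size - 1) * stride + kernel + output_padding

def trace(size, output_pads):
    """Return (left, right) spatial sizes at levels 1..5 of the U-Net above.
    `output_pads` maps a level to the output_padding of its transposed convolution."""
    left = [conv_valid(size, 1, 1)]                # level 1: kernel 1, stride 1
    for _ in range(4):                             # levels 2..5: kernel 2, stride 2
        left.append(conv_valid(left[-1], 2, 2))
    right = {5: deconv_valid(left[4], 1, 1)}       # level 5: kernel 1, stride 1
    for level in (4, 3, 2, 1):                     # levels 4..1: kernel 2, stride 2
        right[level] = deconv_valid(right[level + 1], 2, 2, output_pads.get(level))
    return left, right

for pads in ({}, {4: 1}):
    left, right = trace(200, pads)
    mismatches = [lvl for lvl in right if right[lvl] != left[lvl - 1]]
    print("output_pads:", pads, "| left:", left, "| right:", right, "| mismatched levels:", mismatches)
```

With an input of 200 the left path gives 200, 100, 50, 25, 12: the odd size 25 is halved to 12, and upsampling 12 gives 24 instead of 25. Adding one pixel of output padding at level 4 restores 25, after which all the remaining levels match again.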

In [18]:
met = tf.keras.metrics.MeanIoU(num_classes=2) # built-in Keras metric: Intersection over Union (defined here, but not passed to compile)
model.compile(optimizer="rmsprop", loss="binary_crossentropy") # optimizer and loss; the output is a single sigmoid channel, so we use binary cross-entropy
model.summary() # print the model
history=model.fit(test_data, test_label, epochs=3,verbose=1) # here we fit the model

Model: "model_2"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
 input_5 (InputLayer)           [(None, 200, 200, 1  0           []                               
                                )]                                                                
                                                                                                  
 conv2d_94 (Conv2D)             (None, 200, 200, 16  32          ['input_5[0][0]']                
                                )                                                                 
                                                                                                  
 batch_normalization_106 (Batch  (None, 200, 200, 16  64         ['conv2d_94[0][0]']              
 Normalization)                 )                                                           

As we did for the data generator, we need to save this code to a .py file so that it can be called from other pieces of code. Please name the file UNET_architecture.py.