# Chapter 15 - Classifying Images with Deep Convolutional Neural Networks

## Image classification tasks
### Image classification
Identify main object in an image
<img src="./images/Seg_ImageClassification.png" alt="Classification" style="width: 600px;"/>

### Classification and localization
Identify main object in an image and calculate a bounding box (single object)
<img src="./images/Seg_ObjectLocation.png" alt="Location" style="width: 600px;"/>

### Object detection
Detection of multiple objects in an image
<img src="./images/Seg_ObjectDetection.png" alt="Detection" style="width: 600px;"/>

### Semantic segmentation
- Assign a class to every pixel in an image
  - Individual object instances are not distinguished
- Also referred to as dense prediction
- Labels (ground truth) are therefore also pixel-wise instead of image-wise
<img src="./images/Seg_Semantic.png" alt="Semantic" style="width: 600px;"/>  
("Fully Convolutional Networks for Semantic Segmentation", Long et al. 2015)

### One more level
<img src="./images/Seg_Comparison.png" alt="Comparison" style="width: 800px;"/>

#### One missing building block:
## Transposed convolutions
Convolutions for:
- detecting features/patterns (stride = 1)
- down-sampling, reducing resolution (stride > 1)
- up-sampling, increasing resolution (transposed convolutions, also called fractionally strided convolutions)  
  
Figures from: https://medium.com/activating-robotic-minds/up-sampling-with-transposed-convolution-9ae4f2df52d0  
Ordinary convolution:  
<img src="./images/Convolution.png" alt="Convolution" style="width: 600px;"/>  

Transposed convolution:
<img src="./images/Deconvolution.png" alt="Deconvolution" style="width: 600px;"/>

Stretch image:  
<img src="./images/tConv_stretch.png" alt="Stretched image" style="width: 250px;"/>

Create kernel matrix matching number of pixels:
<img src="./images/tConv_kernel.png" alt="Kernel" style="width: 150px;"/>
<img src="./images/tConv_kernel_rearrange.png" alt="Rearranged kernel" style="width: 600px;"/>

| Matrix product:  | Rearrange:  |
| --- | --- |
| <img src="./images/tConv_product.png" alt="Convolved" style="width: 700px;"/> | <img src="./images/tConv_rearrange.png" alt="Rearranged" style="width: 150px;"/> |

| Transposed kernel matrix and stretched image: | Rearranged result: |
| --- | --- |
| <img src="./images/tConv_tkernel_rearrange.png" alt="Transposed kernel" style="width: 600px;"/> | <img src="./images/tConv_trearrange.png" alt="Rearranged" style="width: 200px;"/> |
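
The matrix view in the figures above can be reproduced in a few lines of NumPy. This is a minimal sketch, assuming a 4x4 image, a 3x3 kernel with arbitrary values, stride 1 and no padding; `conv_matrix` is a hypothetical helper written for this illustration, not a library function. Note that "convolution" here follows the deep learning convention (no kernel flipping).

```python
import numpy as np

def conv_matrix(kernel, in_size):
    """Build C so that C @ image.ravel() is a stride-1 convolution without padding."""
    k = kernel.shape[0]
    out_size = in_size - k + 1
    C = np.zeros((out_size * out_size, in_size * in_size))
    for i in range(out_size):
        for j in range(out_size):
            # Place the kernel at output position (i, j) and flatten it into one row
            patch = np.zeros((in_size, in_size))
            patch[i:i + k, j:j + k] = kernel
            C[i * out_size + j] = patch.ravel()
    return C

image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.array([[1., 2., 1.],
                   [0., 1., 0.],
                   [2., 0., 1.]])  # arbitrary example values

C = conv_matrix(kernel, in_size=4)            # kernel matrix, shape (4, 16)
small = (C @ image.ravel()).reshape(2, 2)     # ordinary convolution: 4x4 -> 2x2
large = (C.T @ small.ravel()).reshape(4, 4)   # transposed convolution: 2x2 -> 4x4
```

The transposed convolution restores the spatial shape, not the original pixel values: `large` is in general different from `image`. In a network the entries of the kernel are learned, so the up-sampling is trainable.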

## Semantic Segmentation models
- Typically based on the building blocks mentioned so far in this course
- Important step forward with "Fully Convolutional Networks for Semantic Segmentation", Long et al. 2015
  - Series of convolutions and pooling blocks
  - Deconvolution (transposed/fractionally strided convolution) or bilinear upsampling at the end (possibly combining information from two or more levels) to upscale to full image size
    - trade-off between spatially fine details and semantic precision
  - E.g. Inception V3 as basis
    - Exchange dense layers with Conv2d
    - Upscaling at the end
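
Exchanging dense layers with Conv2d works because a dense layer applied to a flattened feature map computes the same thing as a convolution whose kernel covers the whole map. The NumPy sketch below illustrates this; all shapes and the random weights are assumptions chosen for the example, not values from a specific network.

```python
import numpy as np

rng = np.random.default_rng(0)
h, w, c, n_out = 4, 4, 3, 5                      # illustrative sizes
features = rng.normal(size=(h, w, c))            # final feature map of a classifier
dense_weights = rng.normal(size=(h * w * c, n_out))

# Dense layer on the flattened feature map
dense_out = features.ravel() @ dense_weights     # shape (n_out,)

# The same weights viewed as one h x w convolution kernel per output unit
conv_kernel = dense_weights.reshape(h, w, c, n_out)
conv_out = np.einsum('hwc,hwco->o', features, conv_kernel)

# On a larger feature map the kernel slides and yields a coarse score map
big = rng.normal(size=(6, 7, c))
score_map = np.array([
    [np.einsum('hwc,hwco->o', big[i:i + h, j:j + w], conv_kernel)
     for j in range(big.shape[1] - w + 1)]
    for i in range(big.shape[0] - h + 1)
])                                               # shape (3, 4, n_out)
```

The coarse `score_map` is what the upscaling step at the end then brings back to full image size.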

### Fully Convolutional Network
Long et al. 2015
<img src="./images/Seg_FCN.png" alt="FCN" style="width: 800px;"/>

### U-Net
Olaf Ronneberger et al. original figure (many other variations exist):
<img src="./images/Seg_Unet.png" alt="U-net" style="width: 800px;"/>

In [None]:
"""
Version of U-Net with dropout and size preservation (padding= 'same')
""" 
def conv2d_block(input_tensor, n_filters, kernel_size = 3, batchnorm = True):
    """Function to add 2 convolutional layers with the parameters passed to it"""
    # first layer
    x = Conv2D(filters = n_filters, kernel_size = (kernel_size, kernel_size),\
              kernel_initializer = 'he_normal', padding = 'same')(input_tensor)
    if batchnorm:
        x = BatchNormalization()(x)
    x = Activation('relu')(x)
    
    # second layer
    x = Conv2D(filters = n_filters, kernel_size = (kernel_size, kernel_size),\
              kernel_initializer = 'he_normal', padding = 'same')(x)
    if batchnorm:
        x = BatchNormalization()(x)
    x = Activation('relu')(x)
    
    return x


def get_unet(input_img, n_filters = 16, dropout = 0.1, batchnorm = True, n_classes = 2):
    # Contracting Path
    c1 = conv2d_block(input_img, n_filters * 1, kernel_size = 3, batchnorm = batchnorm)
    p1 = MaxPooling2D((2, 2))(c1)
    p1 = Dropout(dropout)(p1)
    
    c2 = conv2d_block(p1, n_filters * 2, kernel_size = 3, batchnorm = batchnorm)
    p2 = MaxPooling2D((2, 2))(c2)
    p2 = Dropout(dropout)(p2)
    
    c3 = conv2d_block(p2, n_filters * 4, kernel_size = 3, batchnorm = batchnorm)
    p3 = MaxPooling2D((2, 2))(c3)
    p3 = Dropout(dropout)(p3)
    
    c4 = conv2d_block(p3, n_filters * 8, kernel_size = 3, batchnorm = batchnorm)
    p4 = MaxPooling2D((2, 2))(c4)
    p4 = Dropout(dropout)(p4)
    
    c5 = conv2d_block(p4, n_filters = n_filters * 16, kernel_size = 3, batchnorm = batchnorm)
    
    # Expansive Path
    u6 = Conv2DTranspose(n_filters * 8, (3, 3), strides = (2, 2), padding = 'same')(c5)
    u6 = concatenate([u6, c4])
    u6 = Dropout(dropout)(u6)
    c6 = conv2d_block(u6, n_filters * 8, kernel_size = 3, batchnorm = batchnorm)
    
    u7 = Conv2DTranspose(n_filters * 4, (3, 3), strides = (2, 2), padding = 'same')(c6)
    u7 = concatenate([u7, c3])
    u7 = Dropout(dropout)(u7)
    c7 = conv2d_block(u7, n_filters * 4, kernel_size = 3, batchnorm = batchnorm)
    
    u8 = Conv2DTranspose(n_filters * 2, (3, 3), strides = (2, 2), padding = 'same')(c7)
    u8 = concatenate([u8, c2])
    u8 = Dropout(dropout)(u8)
    c8 = conv2d_block(u8, n_filters * 2, kernel_size = 3, batchnorm = batchnorm)
    
    u9 = Conv2DTranspose(n_filters * 1, (3, 3), strides = (2, 2), padding = 'same')(c8)
    u9 = concatenate([u9, c1])
    u9 = Dropout(dropout)(u9)
    c9 = conv2d_block(u9, n_filters * 1, kernel_size = 3, batchnorm = batchnorm)
    
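    # softmax over a single channel is constantly 1, so use sigmoid for binary masks
    outputs = Conv2D(n_classes, (1, 1), activation = 'sigmoid' if n_classes == 1 else 'softmax')(c9)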
    model = Model(inputs=[input_img], outputs=[outputs])
    return model

In [None]:
from tensorflow.keras.layers import Input, Conv2D, BatchNormalization, Activation, MaxPooling2D, Dropout, Conv2DTranspose, concatenate
from tensorflow.keras.models import Model
input_img = Input(shape=(128,128,3))
model = get_unet(input_img, n_filters = 32, dropout = 0.0, batchnorm = True, n_classes = 1)
model.summary()
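
To see why the skip connections in this size-preserving variant line up, it can help to trace the feature-map sizes by hand. The sketch below is plain Python and independent of Keras; `unet_shapes` is a hypothetical helper, and it assumes `padding='same'`, 2x2 max pooling and stride-2 transposed convolutions as in the code above.

```python
def unet_shapes(size, n_filters=16, depth=4):
    """Return (block name, spatial size, channels) for each conv block of the U-Net."""
    if size % 2 ** depth != 0:
        # Pooling rounds down, so the up-sampled maps would not match the skip connections
        raise ValueError(f"input size must be divisible by {2 ** depth}")
    shapes = []
    # Contracting path: each pooling halves the size, the filters double
    for level in range(depth):
        shapes.append((f"c{level + 1}", size, n_filters * 2 ** level))
        size //= 2
    # Bottleneck
    shapes.append((f"c{depth + 1}", size, n_filters * 2 ** depth))
    # Expansive path: each transposed convolution doubles the size, the filters halve
    for level in reversed(range(depth)):
        size *= 2
        shapes.append((f"c{2 * depth + 1 - level}", size, n_filters * 2 ** level))
    return shapes

for name, size, channels in unet_shapes(128, n_filters=32):
    print(f"{name}: {size} x {size} x {channels}")
```

With four pooling steps the input height and width have to be divisible by 16; otherwise `concatenate` receives tensors of different spatial size.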

### Other semantic segmentation networks
- V-Net for 3D imaging data
- Various architectures, not all very intuitive
  - Some take the scene into account
  - ... or even more advanced stuff
  - Many use ROIs (regions of interest) as intermediate steps
  - Some use sets of atrous (dilated) convolutions  
  
https://medium.com/@arthur_ouaknine/review-of-deep-learning-algorithms-for-image-semantic-segmentation-509a600f7b57

## Augmentation in semantic segmentation
- Masks must match images
- No native support for "double" augmentation in Keras
  - Possibility: Two parallel augmentations -> zip -> yield
  - Two ImageDataGenerator-s with same seed
    - One for images
    - One for masks

## Loss functions
- Pixel-wise correctness/overlap
    - Binary/categorical cross-entropy
    - Dice coefficient
    - Binary F$_\beta$
- Boundary based
    - Hausdorff distance
- Losses for semantic segmentation: https://github.com/JunMa11/SegLoss
- https://neptune.ai/blog/image-segmentation-tips-and-tricks-from-kaggle-competitions#loss-functions
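
As an illustration of an overlap-based loss, the Dice coefficient for a binary mask can be written as $2|A \cap B| / (|A| + |B|)$. The following is a minimal NumPy sketch; the `smooth` term is a common convention for avoiding division by zero on empty masks and is an assumption here, not something prescribed by the sources above.

```python
import numpy as np

def dice_coefficient(y_true, y_pred, smooth=1.0):
    """Dice overlap between a binary mask and a (soft or binary) prediction."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    intersection = np.sum(y_true * y_pred)
    return (2.0 * intersection + smooth) / (np.sum(y_true) + np.sum(y_pred) + smooth)

def dice_loss(y_true, y_pred, smooth=1.0):
    """Loss that decreases as the overlap grows."""
    return 1.0 - dice_coefficient(y_true, y_pred, smooth)

mask = np.array([[1, 1], [0, 0]])
prediction = np.array([[1, 0], [0, 0]])
overlap = dice_coefficient(mask, prediction, smooth=0.0)  # 2 * 1 / (2 + 1)
```

For training in Keras, the same formula would need to be expressed with tensor operations so that it is differentiable.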