# Machine Learning Engineer Nanodegree - Final Project
This Notebook contains the final project for Udacity's Machine Learning Engineer Nanodegree.

## Project: TGS Salt Identification Challenge - Kaggle Competition
Link: https://www.kaggle.com/c/tgs-salt-identification-challenge

# Table of Contents

#### **0.[References](#References)**

#### **1.[Objectives and Motivation](#Objectives)**

#### **2.[Introduction to Semantic Segmentation](#Introduction)**

#### **3.[Project Description](#Description)**

#### **4.[Methodology](#Methodology)**

#### **5.[Neural Network Architecture](#Architecture)**

#### **6.[Here is where the code Starts](#Code)**

#### **7.[Discussion](#Discussion)**

# References

   1 - Really good Semantic Segmentation material <br>
        https://github.com/tangzhenyu/SemanticSegmentation_DL <br>
        
   2 - Segmentation models with pre-loaded weights and different backbones <br>
        https://github.com/qubvel/segmentation_models/tree/master/segmentation_models <br>
        
   3 - Introduction to geophysics and base code for this model <br>
        https://www.kaggle.com/jesperdramsch/intro-to-seismic-salt-and-how-to-geophysics <br>
   
   4 - Semantic Segmentation explained <br>
       https://www.mathworks.com/help/vision/ug/semantic-segmentation-basics.html <br>
       
   5 - Unet model <br>
       http://blog.qure.ai/notes/semantic-segmentation-deep-learning-review <br>
       
   6 - Inception V3 model <br>
       https://codelabs.developers.google.com/codelabs/cpb102-txf-learning/index.html?index=..%2F..%2Findex#1 <br>
       
   7 - Sebastian Thrun's article <br>
       https://cs.stanford.edu/people/esteva/nature/ <br>

# Objectives

This Notebook contains the final project for Udacity's Machine Learning Nanodegree. The motivation to work with Deep Learning in the final project is to dig deep into Machine Learning Solutions.

This project has the following main objective:
- Develop a Neural Network architecture and implement it

## Specific Objectives
The Specific Objectives for this project are:
- Participate in my first Kaggle challenge
- Gain knowledge of Deep Learning and Semantic Segmentation
- Understand Salt Segmentation
- Gain insight into how to work with geophysical data
- Test the Neural Net

# Introduction
## Semantic Segmentation Algorithms and Papers

Segmentation is essential for image analysis tasks. Semantic segmentation describes the process of associating each pixel of an image with a class label (such as flower, person, road, sky, ocean, or car).
<img src="https://www.mathworks.com/help/vision/ug/semanticsegmentation_transferlearning.png">
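
As a minimal sketch of what "a class label per pixel" means in practice, the snippet below turns a toy array of per-pixel class scores into a label map by taking the highest-scoring class at each pixel. The class names and score values are hypothetical and only serve as an illustration.

```python
import numpy as np

# Hypothetical class names, for illustration only
class_names = ["sky", "road", "car"]

# Toy per-pixel class scores with shape (height, width, n_classes)
scores = np.array([
    [[0.9, 0.05, 0.05], [0.8, 0.1, 0.1]],
    [[0.1, 0.7, 0.2], [0.1, 0.2, 0.7]],
])

# The label map assigns each pixel the index of its highest-scoring class
label_map = scores.argmax(axis=-1)
print(label_map)
# [[0 0]
#  [1 2]]

# Translate the indices back to class names
print([[class_names[i] for i in row] for row in label_map])
```

In the salt identification task there are only two classes (salt and not salt), so the label map reduces to a binary mask.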


Applications for semantic segmentation include:

- Autonomous driving
- Industrial inspection
- Classification of terrain visible in satellite imagery
- Medical imaging analysis

VOC2012 and MSCOCO are among the most widely used benchmark datasets for semantic segmentation.

#### Some datasets for deep learning semantic segmentation
+ [VOC2012](http://host.robots.ox.ac.uk/pascal/VOC/voc2012/)
+ [Cityscapes](https://www.cityscapes-dataset.com/)
+ [Mapillary](https://www.mapillary.com/dataset/vistas)
+ [ADE20K](http://groups.csail.mit.edu/vision/datasets/ADE20K/)
+ [PASCAL Context](http://www.cs.stanford.edu/~roozbeh/pascal-context/)

#### 2D Semantic Segmentation Papers:
+ Arxiv-2018 ExFuse: Enhancing Feature Fusion for Semantic Segmentation (87.9% mean IoU on VOC2012) [[Paper]](https://arxiv.org/pdf/1804.03821.pdf)
+ CVPR-2018 spotlight Learning to Adapt Structured Output Space for Semantic Segmentation  [[Paper]](https://arxiv.org/abs/1802.10349) [[Code]](https://github.com/wasidennis/AdaptSegNet)
+ Arxiv-2018 Adversarial Learning for Semi-supervised Semantic Segmentation [[Paper]](https://arxiv.org/abs/1802.07934) [[Code]](https://github.com/hfslyc/AdvSemiSeg)
+ Arxiv-2018 Context Encoding for Semantic Segmentation [[Paper]](https://arxiv.org/pdf/1803.08904.pdf) [[Code]](https://github.com/zhanghang1989/MXNet-Gluon-SyncBN)
+ CVPR-2018 Dynamic-structured Semantic Propagation Network [[Paper]](https://arxiv.org/abs/1803.06067)
+ DeepLab v3+: Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation [[Paper]](https://arxiv.org/pdf/1802.02611.pdf) [[Code]](https://github.com/tensorflow/models/tree/master/research/deeplab)
+ Deep Value Networks Learn to Evaluate and Iteratively Refine Structured Outputs [[Paper]](https://arxiv.org/pdf/1703.04363.pdf)[[Code]](https://github.com/gyglim/dvn)
+ ICCV-2017 Semantic Line Detection and Its Applications [[Paper]](http://openaccess.thecvf.com/content_ICCV_2017/papers/Lee_Semantic_Line_Detection_ICCV_2017_paper.pdf)

#### 3D Semantic Segmentation Papers
- PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation [[Paper]](http://stanford.edu/%7Erqi/pointnet/)
- PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space (2017) [[Paper]](https://arxiv.org/pdf/1706.02413.pdf)
- Learning 3D Mesh Segmentation and Labeling (2010) [[Paper]](https://people.cs.umass.edu/~kalo/papers/LabelMeshes/LabelMeshes.pdf)
- Unsupervised Co-Segmentation of a Set of Shapes via Descriptor-Space Spectral Clustering (2011) [[Paper]](https://www.cs.sfu.ca/~haoz/pubs/sidi_siga11_coseg.pdf)
- Single-View Reconstruction via Joint Analysis of Image and Shape Collections (2015) [[Paper]](https://www.cs.utexas.edu/~huangqx/modeling_sig15.pdf)


#### Medical Image Semantic Segmentation Papers
+ Arxiv-2018 Deep learning and its application to medical image segmentation [[Paper]](https://arxiv.org/pdf/1803.08691)
- Deep neural networks segment neuronal membranes in electron microscopy images
- Semantic Image Segmentation with Deep Learning [[Paper]](http://www.robots.ox.ac.uk/~sadeep/files/crfasrnn_presentation.pdf)
- Automatic Liver and Tumor Segmentation of CT and MRI Volumes Using Cascaded Fully Convolutional Neural Networks [[Paper]](https://arxiv.org/pdf/1702.05970.pdf)
- DeepNAT: Deep Convolutional Neural Network for Segmenting Neuroanatomy [[Paper]](https://arxiv.org/pdf/1702.08192.pdf)



#### Popular Methods and Implementations
- U-Net [[Paper]](https://arxiv.org/pdf/1505.04597.pdf) [[PyTorch]](https://github.com/tangzhenyu/SemanticSegmentation_DL/tree/master/U-net)
- SegNet [[Paper]](https://arxiv.org/pdf/1511.00561.pdf) [[Caffe]](https://github.com/alexgkendall/caffe-segnet)
- DeepLab [[Paper]](https://arxiv.org/pdf/1606.00915.pdf) [[Caffe]](https://bitbucket.org/deeplab/deeplab-public/)
- FCN [[Paper]](https://arxiv.org/pdf/1605.06211.pdf) [[TensorFlow]](https://github.com/tangzhenyu/SemanticSegmentation_DL/tree/master/FCN)
- ENet [[Paper]](https://arxiv.org/pdf/1606.02147.pdf) [[Caffe]](https://github.com/TimoSaemann/ENet)
- LinkNet [[Paper]](https://arxiv.org/pdf/1707.03718.pdf) [[Torch]](https://github.com/e-lab/LinkNet)
- DenseNet [[Paper]](https://arxiv.org/pdf/1608.06993.pdf)
- Tiramisu [[Paper]](https://arxiv.org/pdf/1611.09326.pdf)
- DilatedNet [[Paper]](https://arxiv.org/pdf/1511.07122.pdf)
- PixelNet [[Paper]](https://arxiv.org/pdf/1609.06694.pdf) [[Caffe]](https://github.com/aayushbansal/PixelNet)
- ICNet [[Paper]](https://arxiv.org/pdf/1704.08545.pdf) [[Caffe]](https://github.com/hszhao/ICNet)
- ERFNet [[Paper]](http://www.robesafe.uah.es/personal/eduardo.romera/pdfs/Romera17iv.pdf) [[Torch]](https://github.com/Eromera/erfnet)
- RefineNet [[Paper]](https://arxiv.org/pdf/1611.06612.pdf) [[TensorFlow]](https://github.com/tangzhenyu/SemanticSegmentation_DL/tree/master/RefineNet)
- PSPNet [[Paper]](https://arxiv.org/pdf/1612.01105.pdf) [[Project]](https://hszhao.github.io/projects/pspnet/) [[Caffe]](https://github.com/hszhao/PSPNet)
- Dilated convolution [[Paper]](https://arxiv.org/pdf/1511.07122.pdf) [[Caffe]](https://github.com/fyu/dilation)
- DeconvNet [[Paper]](https://arxiv.org/pdf/1505.04366.pdf) [[Caffe]](http://cvlab.postech.ac.kr/research/deconvnet/)
- FRRN [[Paper]](https://arxiv.org/pdf/1611.08323.pdf) [[Lasagne]](https://github.com/TobyPDE/FRRN)
- GCN [[Paper]](https://arxiv.org/pdf/1703.02719.pdf) [[PyTorch]](https://github.com/ZijunDeng/pytorch-semantic-segmentation)
- LRR [[Paper]](https://arxiv.org/pdf/1605.02264.pdf) [[MatConvNet]](https://github.com/golnazghiasi/LRR)
- DUC, HDC [[Paper]](https://arxiv.org/pdf/1702.08502.pdf) [[PyTorch]](https://github.com/ZijunDeng/pytorch-semantic-segmentation)
- MultiNet [[Paper]](https://arxiv.org/pdf/1612.07695.pdf) [[TensorFlow (MultiNet)]](https://github.com/MarvinTeichmann/MultiNet) [[TensorFlow (KittiSeg)]](https://github.com/MarvinTeichmann/KittiSeg)
- Segaware [[Paper]](https://arxiv.org/pdf/1708.04607.pdf) [[Caffe]](https://github.com/aharley/segaware)
- Semantic Segmentation using Adversarial Networks [[Paper]](https://arxiv.org/pdf/1611.08408.pdf) [[Chainer]](https://github.com/oyam/Semantic-Segmentation-using-Adversarial-Networks)
- In-Place Activated BatchNorm (obtains #1 positions) [[Paper]](https://arxiv.org/abs/1712.02616) [[PyTorch]](https://github.com/mapillary/inplace_abn)


# Description
Several areas of Earth with large accumulations of oil and gas also have huge deposits of salt below the surface.

Unfortunately, knowing precisely where large salt deposits are is very difficult. Professional seismic imaging still requires expert human interpretation of salt bodies. This leads to very subjective, highly variable renderings. More alarmingly, it leads to potentially dangerous situations for oil and gas company drillers.

To create the most accurate seismic images and 3D renderings, TGS (the world’s leading geoscience data company) is hoping Kaggle’s machine learning community will be able to build an algorithm that automatically and accurately identifies if a subsurface target is salt or not.

# Methodology

The methodology followed the steps below:

**Data Preparation:** The images had to be processed in order to feed the neural net and match the required input.

**Train:** Train on 3600 images, with 400 held out for validation.

**Test:** Run the trained model on all 18000 test images.

**Submission:** The results were converted into a CSV file with run-length encoded (RLE) masks and submitted to Kaggle.

The architecture was tested on a p2.xlarge instance of Amazon EC2 (AMI).
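
As a minimal sketch of how the 3600/400 split arises, assuming 4000 training images and relying on Keras' documented behaviour of taking the validation samples from the end of the arrays (before shuffling) when `validation_split` is used:

```python
import numpy as np

n_samples = 4000          # training images (3600 + 400)
validation_split = 0.1    # same fraction passed to model.fit in this notebook

# Keras takes the validation samples from the end of the arrays, before shuffling
split_at = int(n_samples * (1 - validation_split))
indices = np.arange(n_samples)
train_idx, val_idx = indices[:split_at], indices[split_at:]

print(len(train_idx), len(val_idx))  # 3600 400
```

This is also why the prediction cell slices `X_train` at 90% of its length to separate training and validation predictions.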

# Architecture

The architecture was a U-Net model with an Inception v3 backbone and ImageNet weights.
 
**U-Net** <br>
In this architecture, the encoder gradually reduces the spatial dimension with pooling layers, and the decoder gradually recovers the object details and spatial dimension.
<img src="http://blog.qure.ai/assets/images/segmentation-review/unet.png">

U-Net was chosen for its broad range of applications in semantic segmentation problems.
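
The encoder/decoder idea can be illustrated with a shape-only sketch in NumPy. This is not the model used in this notebook: the convolutions are omitted entirely, and the helper names `max_pool_2x2` and `upsample_2x` are hypothetical. It only shows how pooling halves the spatial size, and how upsampling plus skip connections restore it.

```python
import numpy as np

def max_pool_2x2(x):
    """Halve height and width by taking the max over 2x2 blocks."""
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).max(axis=(1, 3))

def upsample_2x(x):
    """Double height and width by nearest-neighbour repetition."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

x = np.random.rand(224, 224, 3)   # input size used in this notebook

# Encoder: each pooling step halves the spatial resolution
skips = []
for _ in range(3):
    skips.append(x)
    x = max_pool_2x2(x)
bottleneck_shape = x.shape
print("bottleneck:", bottleneck_shape)   # (28, 28, 3)

# Decoder: upsample, then concatenate the encoder feature map of equal size
for skip in reversed(skips):
    x = np.concatenate([upsample_2x(x), skip], axis=-1)
print("output:", x.shape)                # (224, 224, 12)
```

The channel count grows in the decoder only because the skip connections are concatenated; in a real U-Net, convolutions after each concatenation control the number of channels.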

**Inception v3** <br>
This model consists of two parts:
- Feature extraction part with a convolutional neural network.
- Classification part with fully-connected and softmax layers.

The pre-trained Inception-v3 model achieves state-of-the-art accuracy for recognizing general objects with 1000 classes, like "Zebra", "Dalmatian", and "Dishwasher". The model extracts general features from input images in the first part and classifies them based on those features in the second part.
<img src="https://codelabs.developers.google.com/codelabs/cpb102-txf-learning/img/bfea25ba557fbffc.png">

Inception v3 was tested as a backbone with ImageNet weights. Its choice was inspired by Sebastian Thrun's work: https://cs.stanford.edu/people/esteva/nature/

# Code

In [2]:
#import all modules
import os
import sys
import random
import warnings

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

from tqdm import tqdm_notebook, tnrange
from skimage.io import imread, imshow, concatenate_images
from skimage.transform import resize
from skimage.morphology import label
from sklearn.model_selection import train_test_split

from keras.utils import plot_model
from keras.models import Model, load_model
from keras.layers import Input
from keras.layers.core import Lambda, RepeatVector, Reshape
from keras.layers.convolutional import Conv2D, Conv2DTranspose
from keras.layers.pooling import MaxPooling2D
from keras.layers.merge import concatenate
from keras import models
from keras import layers
from keras.callbacks import EarlyStopping, ModelCheckpoint, ReduceLROnPlateau
from keras import backend as K
from keras.layers import Dense, Flatten
from keras.layers import BatchNormalization, Dropout, GlobalMaxPooling2D
from keras import optimizers

import tensorflow as tf
warnings.filterwarnings("ignore")

from keras.preprocessing.image import ImageDataGenerator, array_to_img, img_to_array, load_img

In [3]:
# Set some parameters
im_width = 224
im_height = 224
border = 5
im_chan = 3 # Number of channels expected by the ImageNet-pretrained backbone; the single image channel is replicated across them
n_features = 1 # Number of extra features, like depth
path_train = './train/'
path_test = './test/'
df_depths = pd.read_csv('./depths.csv', index_col='id')
train_ids = next(os.walk(path_train+"images"))[2]
test_ids = next(os.walk(path_test))[2]

In [4]:
# Get and resize train images and masks
X_train = np.zeros((len(train_ids), im_height, im_width, im_chan), dtype=np.uint8)
Y_train = np.zeros((len(train_ids), im_height, im_width, 1), dtype=bool)  # np.bool was removed in recent NumPy releases
print('Getting and resizing train images and masks ... ')
sys.stdout.flush()
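for n, id_ in tqdm_notebook(enumerate(train_ids), total=len(train_ids)):
    img = load_img(os.path.join(path_train, 'images', id_))
    x = img_to_array(img)[:, :, 1]  # keep a single channel of the loaded image
    x = resize(x, (im_height, im_width, 1), mode='constant', preserve_range=True)
    X_train[n] = x  # broadcast the single channel across all im_chan channels
    mask = img_to_array(load_img(os.path.join(path_train, 'masks', id_)))[:, :, 1]
    # Any non-zero value becomes True when stored in the boolean mask array
    Y_train[n] = resize(mask, (im_height, im_width, 1), mode='constant', preserve_range=True)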

print('Done!')

Getting and resizing train images and masks ... 



Done!
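
The loading loop stores a single-channel image in a three-channel slot of `X_train`. A small sketch of why that works, using toy 4x4 data instead of the real images: NumPy broadcasts the trailing axis of length 1, so the same values are copied into every channel.

```python
import numpy as np

batch = np.zeros((1, 4, 4, 3), dtype=np.uint8)            # (n, height, width, channels)
gray = np.arange(16, dtype=np.float64).reshape(4, 4, 1)   # toy single-channel image

# Assigning a (4, 4, 1) array to a (4, 4, 3) slot broadcasts along the last axis
batch[0] = gray

same = np.array_equal(batch[0, :, :, 0], batch[0, :, :, 2])
print(batch.shape, same)   # (1, 4, 4, 3) True
```

The float values are also cast to `uint8` on assignment, which is what happens to the resized images in the loop.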


In [5]:
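# Define the IoU metric (uses the TensorFlow 1.x metrics API)
# It averages the mean IoU over prediction thresholds from 0.5 to 0.95 and serves as a
# proxy for the metric Kaggle uses to evaluate the submission, which thresholds the IoU value itself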
def mean_iou(y_true, y_pred):
    prec = []
    for t in np.arange(0.5, 1.0, 0.05):
        y_pred_ = tf.to_int32(y_pred > t)
        score, up_opt = tf.metrics.mean_iou(y_true, y_pred_, 2)
        K.get_session().run(tf.local_variables_initializer())
        with tf.control_dependencies([up_opt]):
            score = tf.identity(score)
        prec.append(score)
    return K.mean(K.stack(prec), axis=0)
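
For comparison, here is a NumPy sketch of a per-image score in the spirit of the competition metric. It assumes the common simplification of treating each image as containing at most one object, so the score is the fraction of IoU thresholds (0.5 to 0.95 in steps of 0.05) that the predicted mask clears. The function name `image_score` is hypothetical, and this is not Kaggle's reference implementation.

```python
import numpy as np

THRESHOLDS = np.arange(0.5, 1.0, 0.05)   # 0.5, 0.55, ..., 0.95

def image_score(y_true, y_pred):
    """Score one image: fraction of IoU thresholds that the prediction clears."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    union = np.logical_or(y_true, y_pred).sum()
    if union == 0:
        # Both masks empty: counted as a perfect prediction
        return 1.0
    iou = np.logical_and(y_true, y_pred).sum() / union
    return float(np.mean(iou > THRESHOLDS))

truth = np.zeros((4, 4), dtype=bool)
truth[:2, :] = True      # 8 salt pixels
pred = truth.copy()
pred[1, 3] = False       # 7 salt pixels, all inside the truth

print(image_score(truth, pred))   # IoU = 7/8 = 0.875
```

The overall score would then be the mean of `image_score` over all images. Unlike `mean_iou` above, the thresholds here are applied to the IoU value, not to the predicted probabilities.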

In [6]:
""" Utility functions for segmentation models """
from functools import wraps

def get_layer_number(model, layer_name):
    """
    Help find layer in Keras model by name
    Args:
        model: Keras `Model`
        layer_name: str, name of layer

    Returns:
        index of layer

    Raises:
        ValueError: if model does not contain a layer with such name
    """
    for i, l in enumerate(model.layers):
        if l.name == layer_name:
            return i
    raise ValueError('No layer with name {} in model {}.'.format(layer_name, model.name))


def extract_outputs(model, layers, include_top=False):
    """
    Help extract intermediate layer outputs from model
    Args:
        model: Keras `Model`
        layers: list of integers/str, list of layer indexes or names to extract output from
        include_top: bool, include final model layer output

    Returns:
        list of tensors (outputs)
    """
    layers_indexes = ([get_layer_number(model, l) if isinstance(l, str) else l
                      for l in layers])
    outputs = [model.layers[i].output for i in layers_indexes]

    if include_top:
        outputs.insert(0, model.output)

    return outputs

def reverse(l):
    """Reverse list"""
    return list(reversed(l))

# decorator for models aliases, to add doc string
def add_docstring(doc_string=None):
    def decorator(fn):
        if fn.__doc__:
            fn.__doc__ += doc_string
        else:
            fn.__doc__ = doc_string

        @wraps(fn)
        def wrapper(*args, **kwargs):
            return fn(*args, **kwargs)
        return wrapper
    return decorator

In [7]:
#import the model
from segmentation_models.unet import models

model = models.UInceptionV3()

In [8]:
# Display the model summary, then compile the model with the IoU metric
model.summary()
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=[mean_iou])

__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_1 (InputLayer)            (None, None, None, 3 0                                            
__________________________________________________________________________________________________
conv2d_1 (Conv2D)               (None, None, None, 3 864         input_1[0][0]                    
__________________________________________________________________________________________________
batch_normalization_1 (BatchNor (None, None, None, 3 96          conv2d_1[0][0]                   
__________________________________________________________________________________________________
activation_1 (Activation)       (None, None, None, 3 0           batch_normalization_1[0][0]      
__________________________________________________________________________________________________
conv2d_2 (

In [10]:
# Train the model, stopping early if val_loss does not improve for 5 epochs
# and saving only the best checkpoint
earlystopper = EarlyStopping(patience=5, verbose=1)
checkpointer = ModelCheckpoint('model-tgs-salt-1.h5', verbose=1, save_best_only=True)
results = model.fit(X_train, Y_train, validation_split=0.1, batch_size=8, epochs=30, 
                    callbacks=[earlystopper, checkpointer])


Train on 3600 samples, validate on 400 samples
Epoch 1/30

Epoch 00001: val_loss improved from inf to 0.45920, saving model to model-tgs-salt-1.h5
Epoch 2/30

Epoch 00002: val_loss improved from 0.45920 to 0.40777, saving model to model-tgs-salt-1.h5
Epoch 3/30

Epoch 00003: val_loss improved from 0.40777 to 0.28230, saving model to model-tgs-salt-1.h5
Epoch 4/30

Epoch 00004: val_loss did not improve from 0.28230
Epoch 5/30

Epoch 00005: val_loss did not improve from 0.28230
Epoch 6/30

Epoch 00006: val_loss improved from 0.28230 to 0.25024, saving model to model-tgs-salt-1.h5
Epoch 7/30

Epoch 00007: val_loss did not improve from 0.25024
Epoch 8/30

Epoch 00008: val_loss did not improve from 0.25024
Epoch 9/30

Epoch 00009: val_loss did not improve from 0.25024
Epoch 10/30

Epoch 00010: val_loss improved from 0.25024 to 0.19512, saving model to model-tgs-salt-1.h5
Epoch 11/30

Epoch 00011: val_loss did not improve from 0.19512
Epoch 12/30

Epoch 00012: val_loss did not improve from 0

In [15]:
# Get and resize test images
X_test = np.zeros((len(test_ids), im_height, im_width, im_chan), dtype=np.uint8)
sizes_test = []
print('Getting and resizing test images ... ')
sys.stdout.flush()
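for n, id_ in tqdm_notebook(enumerate(test_ids), total=len(test_ids)):
    img = load_img(os.path.join(path_test, id_))
    x = img_to_array(img)[:, :, 1]  # keep a single channel of the loaded image
    sizes_test.append([x.shape[0], x.shape[1]])  # original size, used later to upsample predictions
    x = resize(x, (im_height, im_width, 1), mode='constant', preserve_range=True)
    X_test[n] = x  # broadcast the single channel across all im_chan channels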

print('Done!')

Getting and resizing test images ... 



Done!


In [16]:
# Predict on train, val and test
model = load_model('model-tgs-salt-1.h5', custom_objects={'mean_iou': mean_iou})
preds_train = model.predict(X_train[:int(X_train.shape[0]*0.9)], verbose=1)
preds_val = model.predict(X_train[int(X_train.shape[0]*0.9):], verbose=1)
preds_test = model.predict(X_test, verbose=1)

# Threshold predictions
preds_train_t = (preds_train > 0.5).astype(np.uint8)
preds_val_t = (preds_val > 0.5).astype(np.uint8)
preds_test_t = (preds_test > 0.5).astype(np.uint8)



In [17]:
preds_test_upsampled = []
for i in tnrange(len(preds_test)):
    preds_test_upsampled.append(resize(np.squeeze(preds_test[i]), 
                                       (sizes_test[i][0], sizes_test[i][1]), 
                                       mode='constant', preserve_range=True))



In [18]:
preds_test_upsampled[0].shape

(101, 101)

In [19]:
# RLE mask
def RLenc(img, order='F', format=True):
    """
    img is binary mask image, shape (r,c)
    order is down-then-right, i.e. Fortran
    format determines if the order needs to be preformatted (according to submission rules) or not

    returns run length as an array or string (if format is True)
    """
    pixels = img.reshape(img.shape[0] * img.shape[1], order=order)  # flatten column by column
    runs = []  ## list of run lengths
    r = 0  ## the current run length
    pos = 1  ## count starts from 1 per WK
    for c in pixels:
        if (c == 0):
            if r != 0:
                runs.append((pos, r))
                pos += r
                r = 0
            pos += 1
        else:
            r += 1

    # if last run is unsaved (i.e. data ends with 1)
    if r != 0:
        runs.append((pos, r))
        pos += r
        r = 0

    if format:
        z = ''

        for rr in runs:
            z += '{} {} '.format(rr[0], rr[1])
        return z[:-1]
    else:
        return runs

pred_dict = {fn[:-4]:RLenc(np.round(preds_test_upsampled[i])) for i,fn in tqdm_notebook(enumerate(test_ids))}



In [20]:
#Prepare for submission
sub = pd.DataFrame.from_dict(pred_dict,orient='index')
sub.index.names = ['id']
sub.columns = ['rle_mask']
sub.to_csv('submission.csv')
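
A quick way to sanity-check an encoded submission is to decode it back into a mask. The helper below, `rle_decode`, is a hypothetical sketch that is not part of the original pipeline; it assumes the same convention as `RLenc` with `order='F'`: 1-indexed start positions, with pixels numbered top to bottom and then left to right.

```python
import numpy as np

def rle_decode(rle, shape=(101, 101)):
    """Decode a 'start length start length ...' string into a binary mask."""
    flat = np.zeros(shape[0] * shape[1], dtype=np.uint8)
    tokens = [int(t) for t in rle.split()]
    # Tokens alternate between 1-indexed start positions and run lengths
    for start, length in zip(tokens[0::2], tokens[1::2]):
        flat[start - 1:start - 1 + length] = 1
    # Column-major reshape restores the down-then-right pixel order
    return flat.reshape(shape, order='F')

mask = rle_decode('1 3 10 2', shape=(3, 4))
print(mask)
# [[1 0 0 1]
#  [1 0 0 1]
#  [1 0 0 0]]
```

Encoding a mask with `RLenc` and decoding the result with a helper like this should give back the original mask.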

# Discussion

The model was successfully implemented. The submission achieved a score of 0.634 on Kaggle's IoU-based (Intersection over Union) metric.

Further studies might include image augmentation, use of depths and stratification. The model could also be tested with no pre-loaded weights to check if the IoU will increase.
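
As one illustration of the image augmentation mentioned above, a horizontal flip could be applied to each image together with its mask, so the labels stay aligned with the pixels. This is a sketch of the idea on toy data, not something that was run in this project; the function name `horizontal_flip` is hypothetical.

```python
import numpy as np

def horizontal_flip(image, mask):
    """Flip an image and its mask left to right so they stay aligned."""
    return np.fliplr(image), np.fliplr(mask)

image = np.arange(12).reshape(3, 4)   # toy 3x4 "image"
mask = image > 8                      # toy mask: last three pixels of the bottom row

flipped_image, flipped_mask = horizontal_flip(image, mask)
print(flipped_image[0])   # [3 2 1 0]
print(flipped_mask[2])    # [ True  True  True False]
```

The flip changes where the salt is but not how much of it there is, so the mask coverage is preserved.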

All the objectives were fully satisfied. This project was quite challenging since it involved deep learning with image segmentation.