<a href="https://colab.research.google.com/github/secutron/steel-defect-detection/blob/master/Training_ipynb.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

To open the notebook in Colab, replace github.com in the URL with https://colab.research.google.com/github/
-->
https://colab.research.google.com/github/secutron/steel-defect-detection/blob/master/Training.ipynb

<img src='https://github.com/secutron/steel-defect-detection/blob/master/severstal.jpg?raw=1' width=400px align=right style="float:right">
<br>
<br>

## Severstal: Steel Defect Detection
### - Detect and classify defects in steel

### by Karthik Kumar Billa    
### [GitHub](https://github.com/rook0falcon)  |   [LinkedIn](https://www.linkedin.com/in/karthik-kumar-billa/)  |  [Kaggle](https://www.kaggle.com/knightwisdom)  |  [Medium](https://medium.com/@guildbilla)

---

## Notebook for Training

<img src='https://github.com/secutron/steel-defect-detection/blob/master/image_defects.png?raw=1' width="300" align=right style="float:right" >

## 1. Business Problem

### 1.1 Introduction

Steel is one of the most important building materials of modern times. Steel buildings are resistant to natural and man-made wear, which has made the material ubiquitous around the world. Identifying defects will help make production of steel more efficient. Severstal is leading the charge in efficient steel mining and production.

Credits: 
https://www.kaggle.com/c/severstal-steel-defect-detection/overview

### 1.2 Problem description
Severstal is now looking to machine learning to improve automation, increase efficiency, and maintain high quality in their production.

The production process of flat sheet steel is especially delicate. From heating and rolling, to drying and cutting, several machines touch flat steel by the time it’s ready to ship. Today, Severstal uses images from high frequency cameras to power a defect detection algorithm.

This notebook will help engineers improve the algorithm by localizing and classifying surface defects on a steel sheet.

### 1.3 Source/Useful Links

Data Source: 
https://www.kaggle.com/c/severstal-steel-defect-detection/data

Competition hosting company: 
https://www.severstal.com/

For Classification: Xception: 
https://keras.io/applications/#xception

For Segmentation: Unet - EfficientNetB1: 
https://github.com/qubvel/segmentation_models

Training and predictions: 
Google Colab https://colab.research.google.com/

Installing segmentation_models packages in Kaggle Kernels (useful for making an Inference kernel on Kaggle Platform): 
https://www.kaggle.com/c/severstal-steel-defect-detection/discussion/113195

Sample submission on kaggle:
https://www.kaggle.com/knightwisdom/13012020-sever-submission

### 1.4 Business objectives and constraints: 

1. Maximize dice score
2. Multi-label probability estimates
3. Defect identification and localization should be fast. Ideally, inference keeps pace with the frame rate of the cameras and finishes in a few seconds. The inference kernel should take <= 1 hour of run-time.
4. Save model weights to make inference possible anytime.


### Keywords: Steel, Defect, Identification, Localization, Dice coefficient, segmentation models, Tensorflow, Run Length Encoding

## 2. Machine Learning Problem

### 2.1 Data Description
Source: https://www.kaggle.com/c/severstal-steel-defect-detection/data

<pre>
Folder/<br>
    sample_submission.csv    3 columns <br>
    train.csv                3 columns <br>
    test_images/             5506 .jpg images <br>
    train_images/            12568 .jpg images <br>
</pre>
Each image has a resolution of **256x1600** (height x width).

train.csv contains details of the images in which defects are present. Its columns are: 

<pre>
ImageId, ClassId, EncodedPixels
</pre>
Test data ImageIds can be found in sample_submission.csv or can be directly accessed from Image file names.

Corresponding images can be accessed from train and test folders with the help of ImageIds.



Number of Defect Classes: **4**

### 2.2 Translating to Machine Learning Problem

#### 2.2.1 Type of Machine Learning Problem

There are 4 different classes of steel surface defects and we need to locate the defect => **Multi-label Image Segmentation**

#### 2.2.2 Performance Metric

**Dice coefficient:** https://en.wikipedia.org/wiki/S%C3%B8rensen%E2%80%93Dice_coefficient

This metric is used to gauge similarity of two samples. The Dice coefficient can be used to compare the pixel-wise agreement between a predicted segmentation and its corresponding ground truth. The formula is given by:
<pre>
    <img src='https://github.com/secutron/steel-defect-detection/blob/master/dice.jpg?raw=1' width=150px align=left>
</pre>
where X is the predicted set of pixels and Y is the ground truth. The Dice coefficient is defined to be 1 when both X and Y are empty. The leaderboard score is the mean of the Dice coefficients for each [ImageId, ClassId] pair in the test set.<br>
Source: https://www.kaggle.com/c/severstal-steel-defect-detection/overview/evaluation
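
As a minimal sketch of the definition above, the Dice coefficient can be computed directly on sets of pixel indices. The function name `dice` and the sample pixel sets are illustrative only and are not part of the competition code.

```python
def dice(pred_pixels, true_pixels):
    """Dice coefficient between two sets of pixel indices."""
    X, Y = set(pred_pixels), set(true_pixels)
    if not X and not Y:
        return 1.0  # defined as 1 when both sets are empty
    return 2 * len(X & Y) / (len(X) + len(Y))

# Leaderboard-style score: mean Dice over (ImageId, ClassId) pairs
pairs = [
    ({1, 2, 3}, {2, 3, 4}),  # partial overlap: 2*2 / (3+3)
    (set(), set()),          # no mask predicted, none present: 1
    ({5, 6}, set()),         # mask predicted where none exists: 0
]
scores = [dice(pred, truth) for pred, truth in pairs]
mean_dice = sum(scores) / len(scores)
```

The second pair shows why predicting an empty mask correctly matters: a defect-free image with no predicted mask earns the full score of 1.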

#### 2.2.3 Machine Learning Objectives and Constraints

**Objective:** 
1. Maximize Dice coefficient
2. Identify and locate the type of defect present in the image. Masks generated after predictions should be converted into EncodedPixels.

**EncodedPixels:**<br>
In order to reduce the submission file size, our metric uses run-length encoding on the pixel values. Instead of submitting an exhaustive list of indices for your segmentation, you will submit pairs of values that contain a start position and a run length. E.g. '1 3' implies starting at pixel 1 and running a total of 3 pixels (1,2,3).

The competition format requires a space delimited list of pairs. For example, '1 3 10 5' implies pixels 1,2,3,10,11,12,13,14 are to be included in the mask. The metric checks that the pairs are sorted, positive, and the decoded pixel values are not duplicated. The pixels are numbered from top to bottom, then left to right: 1 is pixel (1,1), 2 is pixel (2,1), etc.
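
The encoding described above can be illustrated with a small pure-Python sketch. The helper names `rle_to_pixels` and `pixel_to_row_col` are hypothetical, and the default height of 256 is taken from the image resolution stated earlier.

```python
def rle_to_pixels(rle):
    """Expand a 'start length start length ...' string into 1-based pixel numbers."""
    values = [int(v) for v in rle.split()]
    pixels = []
    for start, length in zip(values[0::2], values[1::2]):
        pixels.extend(range(start, start + length))
    return pixels

def pixel_to_row_col(pixel, height=256):
    """Map a 1-based pixel number to 1-based (row, column).

    Pixels are counted top to bottom first, then left to right.
    """
    return ((pixel - 1) % height + 1, (pixel - 1) // height + 1)

decoded = rle_to_pixels('1 3 10 5')
top_of_second_column = pixel_to_row_col(257)
```

Here `decoded` holds the pixels 1, 2, 3, 10, 11, 12, 13, 14 from the example above, and pixel 257 is the first pixel of the second column because each column holds 256 pixels.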

**Preparing environment for Deep Learning**

In [0]:
# Loading google drive to access the dataset
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [0]:
# Using segmentation_models for image segmentation task, https://github.com/qubvel/segmentation_models
! pip install segmentation-models

Collecting segmentation-models
  Downloading https://files.pythonhosted.org/packages/da/b9/4a183518c21689a56b834eaaa45cad242d9ec09a4360b5b10139f23c63f4/segmentation_models-1.0.1-py3-none-any.whl
Collecting image-classifiers==1.0.0
  Downloading https://files.pythonhosted.org/packages/81/98/6f84720e299a4942ab80df5f76ab97b7828b24d1de5e9b2cbbe6073228b7/image_classifiers-1.0.0-py3-none-any.whl
Collecting efficientnet==1.0.0
  Downloading https://files.pythonhosted.org/packages/97/82/f3ae07316f0461417dc54affab6e86ab188a5a22f33176d35271628b96e0/efficientnet-1.0.0-py3-none-any.whl
Installing collected packages: image-classifiers, efficientnet, segmentation-models
Successfully installed efficientnet-1.0.0 image-classifiers-1.0.0 segmentation-models-1.0.1


In [0]:
# Import libraries
import warnings
warnings.filterwarnings("ignore")

from time import time
from datetime import datetime

import pandas as pd
import numpy as np
import os
import cv2
from PIL import Image
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline

from sklearn.model_selection import train_test_split

import keras
from keras.preprocessing.image import ImageDataGenerator
from keras import backend as K
from keras.layers import GlobalAveragePooling2D, Dense, Conv2D, BatchNormalization, Dropout
from keras.models import Model, load_model
import tensorflow as tf
from tensorflow.python.keras.callbacks import TensorBoard
from keras.callbacks import ModelCheckpoint

from sklearn.metrics import recall_score
from random import random
from random import seed

# https://github.com/qubvel/segmentation_models
import segmentation_models
print(segmentation_models.__version__)

import segmentation_models as sm
from segmentation_models import Unet
from segmentation_models import get_preprocessing

from tensorflow.keras.utils import plot_model

Using TensorFlow backend.


Segmentation Models: using `keras` framework.
1.0.1


In [0]:
!pip install kaggle



In [0]:
# Download kaggle.json from https://www.kaggle.com/secutron01/account
# https://blog.naver.com/dongguri2/221543258598
# https://teddylee777.github.io/colab/google-colab-%EB%9F%B0%ED%83%80%EC%9E%84-%EC%97%B0%EA%B2%B0%EB%81%8A%EA%B9%80%EB%B0%A9%EC%A7%80

## upload denied by company!
##from google.colab import files
##files.upload()

!mkdir -p ~/.kaggle
!cp '/content/drive/My Drive/kaggle.json' ~/.kaggle/
!chmod 600 ~/.kaggle/kaggle.json


In [0]:
!kaggle competitions download -c severstal-steel-defect-detection

Downloading 001d3d093.jpg to /content
  0% 0.00/104k [00:00<?, ?B/s]
100% 104k/104k [00:00<00:00, 41.8MB/s]
Downloading 0025bde0c.jpg to /content
  0% 0.00/129k [00:00<?, ?B/s]
100% 129k/129k [00:00<00:00, 42.5MB/s]
Downloading 001d1b355.jpg to /content
  0% 0.00/87.6k [00:00<?, ?B/s]
100% 87.6k/87.6k [00:00<00:00, 92.0MB/s]
Downloading 0030401a5.jpg to /content
  0% 0.00/106k [00:00<?, ?B/s]
100% 106k/106k [00:00<00:00, 101MB/s]
Downloading 002e73b3c.jpg to /content
  0% 0.00/135k [00:00<?, ?B/s]
100% 135k/135k [00:00<00:00, 43.3MB/s]
Downloading 003ac9d2a.jpg to /content
  0% 0.00/89.9k [00:00<?, ?B/s]
100% 89.9k/89.9k [00:00<00:00, 77.6MB/s]
Downloading 000418bfc.jpg to /content
  0% 0.00/138k [00:00<?, ?B/s]
100% 138k/138k [00:00<00:00, 44.6MB/s]
Downloading 001982b08.jpg to /content
  0% 0.00/116k [00:00<?, ?B/s]
100% 116k/116k [00:00<00:00, 116MB/s]
Downloading 000789191.jpg to /content
  0% 0.00/100k [00:00<?, ?B/s]
100% 100k/100k [00:00<00:00, 103MB/s]
Downloading 000a4bcdd.jpg

In [0]:
# https://stackoverflow.com/questions/31984387/command-line-for-7z-to-extract-specific-files-from-specific-folders-inside-an-ar
# extracting raw data
! 7z e '/content/drive/My Drive/severstal_february/archive.zip' -oA1_train       train_images/*.jpg
! 7z e '/content/drive/My Drive/severstal_february/archive.zip' -oA3_trainlabels train.csv


7-Zip [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on,64 bits,2 CPUs Intel(R) Xeon(R) CPU @ 2.30GHz (306F0),ASM,AES-NI)

Scanning the drive for archives:
  0M Scan /content/drive/My Drive/severstal_february/                                                     
ERROR: No such file or directory
/content/drive/My Drive/severstal_february/archive.zip



System ERROR:
Unknown error -2147024894

7-Zip [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on,64 bits,2 CPUs Intel(R) Xeon(R) CPU @ 2.30GHz (306F0),ASM,AES-NI)

Scanning the drive for archives:
  0M Scan /content/drive/My Drive/severstal_february/                                                     

In [0]:
train_path = '/content/A1_train/'
train_image_names = os.listdir(train_path)
trainLabels = pd.read_csv('/content/A3_trainlabels/train.csv')

FileNotFoundError: ignored

**Generating X_train, X_val and X_test**

In [0]:
train_image_names[:5]

In [0]:
tr_img_id = []
tr_cls_id = []
for i in os.listdir(train_path):
    tr_img_id.append(i)
    tr_cls_id.append(1)
    tr_img_id.append(i)
    tr_cls_id.append(2)
    tr_img_id.append(i)
    tr_cls_id.append(3)
    tr_img_id.append(i)
    tr_cls_id.append(4)
train_img_nms = pd.DataFrame(tr_img_id,columns=['ImageId'])
train_img_nms['ClassId'] = tr_cls_id
train_img_nms.head()

In [0]:
train_df = pd.merge(train_img_nms, trainLabels,how='outer',on=['ImageId','ClassId'])
train_df = train_df.fillna('')
train_df.head()

In [0]:
train_data = pd.pivot_table(train_df, values='EncodedPixels', index='ImageId',columns='ClassId', aggfunc=np.sum).astype(str)
train_data = train_data.reset_index()
train_data.columns = ['ImageId','Defect_1','Defect_2','Defect_3','Defect_4']
train_data.head()

In [0]:
tmp = []
for i in range(len(train_data)):
    if all((train_data['Defect_1'][i]=='',train_data['Defect_2'][i]=='',train_data['Defect_3'][i]=='',train_data['Defect_4'][i]=='')):
        tmp.append(0)
    else:
        tmp.append(1)
train_data['hasDefect'] = tmp

tmp = []
for i in range(len(train_data)):
    if train_data['Defect_1'][i]=='':
        tmp.append(0)
    else:
        tmp.append(1)
train_data['hasDefect_1'] = tmp

tmp = []
for i in range(len(train_data)):
    if train_data['Defect_2'][i]=='':
        tmp.append(0)
    else:
        tmp.append(1)
train_data['hasDefect_2'] = tmp

tmp = []
for i in range(len(train_data)):
    if train_data['Defect_3'][i]=='':
        tmp.append(0)
    else:
        tmp.append(1)
train_data['hasDefect_3'] = tmp

tmp = []
for i in range(len(train_data)):
    if train_data['Defect_4'][i]=='':
        tmp.append(0)
    else:
        tmp.append(1)
train_data['hasDefect_4'] = tmp

train_data.head()
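
The same indicator columns can also be derived without explicit loops. The following is a sketch on a small made-up frame (the ImageIds and run-length strings are placeholders), assuming the `Defect_k` columns hold an empty string when a defect is absent, as in the cell above.

```python
import pandas as pd

# Toy stand-in for train_data; all values are placeholders
toy = pd.DataFrame({
    'ImageId':  ['a.jpg', 'b.jpg', 'c.jpg'],
    'Defect_1': ['', '1 3', ''],
    'Defect_2': ['', '', ''],
    'Defect_3': ['', '10 5', '7 2'],
    'Defect_4': ['', '', ''],
})

defect_cols = [f'Defect_{k}' for k in range(1, 5)]

# 1 when at least one of the four classes has a non-empty run-length string
toy['hasDefect'] = toy[defect_cols].ne('').any(axis=1).astype(int)

for k in range(1, 5):
    # 1 when the run-length string for class k is non-empty, else 0
    toy[f'hasDefect_{k}'] = toy[f'Defect_{k}'].ne('').astype(int)
```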

In [0]:
# For stratified sampling, stratified based on minority label priority
# Label 2 : 247
# Label 4 : 801
# Label 1 : 897
# Label 3 : 5150
tmp = []
for i in range(len(train_data)):
    if train_data['hasDefect_2'].iloc[i]==1:
        tmp.append(2)
    elif train_data['hasDefect_4'].iloc[i]==1:
        tmp.append(4)
    elif train_data['hasDefect_1'].iloc[i]==1:
        tmp.append(1)
    elif train_data['hasDefect_3'].iloc[i]==1:
        tmp.append(3)
    else:
        tmp.append(0)
train_data['stratify']=tmp
train_data.head()
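
The minority-first priority used for the stratification label can be expressed compactly with `np.select`, which returns the choice for the first condition that holds. This is a sketch on made-up indicator values, not a replacement for the cell above.

```python
import numpy as np
import pandas as pd

# Toy indicator columns; values are placeholders
toy = pd.DataFrame({
    'hasDefect_1': [0, 1, 1, 0, 0],
    'hasDefect_2': [0, 0, 1, 0, 0],
    'hasDefect_3': [0, 1, 0, 1, 1],
    'hasDefect_4': [0, 0, 0, 0, 1],
})

# Conditions are checked in order, rarest class first: 2, 4, 1, 3
priority = [2, 4, 1, 3]
conditions = [toy[f'hasDefect_{k}'].eq(1).to_numpy() for k in priority]

# Images with no defect at all fall through to the default label 0
toy['stratify'] = np.select(conditions, priority, default=0)
```

An image with both class 1 and class 3 defects is labelled 1, because class 1 is the rarer of the two.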

**Train, Validation and Test split**

In [0]:
X = train_data.copy()
X_train, X_test = train_test_split(X, test_size = 0.1, stratify = X['stratify'],random_state=42)
X_train, X_val = train_test_split(X_train, test_size = 0.2, stratify = X_train['stratify'],random_state=42)
print(X_train.shape, X_val.shape, X_test.shape)
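
Because the second split is taken from what remains after the first, the effective shares of the full dataset are not 0.1 and 0.2. A short calculation makes the resulting proportions explicit (exact row counts also depend on how the split sizes are rounded).

```python
test_frac = 0.1   # first split: share held out as X_test
val_frac = 0.2    # second split: share of the remainder used as X_val

train_share = (1 - test_frac) * (1 - val_frac)   # about 0.72 of all images
val_share = (1 - test_frac) * val_frac           # about 0.18 of all images
test_share = test_frac                           # 0.10 of all images
```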

## 3. Exploratory Data Analysis

In [0]:
X_train.head()

In [0]:
# Sample Image
fig, ax = plt.subplots(1,1,figsize=(8, 7))
img = Image.open(str(train_path + X_train.ImageId.iloc[0]))
plt.imshow(img)
ax.set_title(X_train.ImageId.iloc[0])
plt.show()
print(img.size)

**Summary:** The images are 1600 pixels wide and 256 pixels high (PIL reports size as width x height).

In [0]:
print("No. of Images in train set: ", X_train.shape[0],'\n','-'*50)

tmp = [sum(X_train['hasDefect_1']==1),
       sum(X_train['hasDefect_2']==1),
       sum(X_train['hasDefect_3']==1),
       sum(X_train['hasDefect_4']==1)]
fig, ax = plt.subplots()
sns.barplot(x=['1','2','3','4'],y=tmp,palette = "rocket")
ax.set_title("Number of images for each label")
ax.set_xlabel("Label")
plt.show()
print("No. of Images having: Label 1 = {}, Label 2 = {}, Label 3 = {}, Label 4 = {}".format(tmp[0],tmp[1],tmp[2],tmp[3]),'\n','-'*50)

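# sort_index() orders the counts by number of labels (0, 1, 2, ...) so the bars match the x tick labels
tmp = (X_train['hasDefect_1']+X_train['hasDefect_2']+X_train['hasDefect_3']+X_train['hasDefect_4']).value_counts().sort_index()
fig, ax = plt.subplots()
sns.barplot(x=['No label','1','2'],y=tmp.reindex([0,1,2],fill_value=0).values,palette = "rocket")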
ax.set_title("Number of labels for each image")
ax.set_xlabel("Label")
plt.show()
print("No. of Images with no defects: {}, with only one label: {}, with two labels: {}".format(tmp[0],tmp[1],tmp[2]))

#### Observation: The dataset is highly imbalanced. This will make predicting the minority class (Class 2) difficult. 

In [0]:
# 5 images having no defects
tmp = []
cnt=0
print("Sample images with no defects:")
for i in X_train['ImageId'][X_train['hasDefect']==0]:
    if cnt<5:
        fig, ax = plt.subplots(1,1,figsize=(8, 7))
        img = Image.open(str(train_path + i))
        plt.imshow(img)
        ax.set_title(i)
        plt.show()
        cnt+=1

### Observation: 
The surface of non-defective steel may still show different features or profiles. Note that "defect" here is limited to the 4 defect types in this dataset. The steel surface may contain other defects, but those should not be detected.

In [0]:
# We need a function to convert EncodedPixels into mask
# https://www.kaggle.com/paulorzp/rle-functions-run-lenght-encode-decode

def rle2mask(mask_rle, shape=(1600,256)):
    '''
    mask_rle: run-length as string formatted (start length)
    shape: (width,height) of array to return 
    Returns numpy array, 1 - mask, 0 - background
    This function is specific to this competition

    '''
    s = mask_rle.split()
    starts, lengths = [np.asarray(x, dtype=int) for x in (s[0:][::2], s[1:][::2])]
    starts -= 1
    ends = starts + lengths
    img = np.zeros(shape[0]*shape[1], dtype=np.uint8)
    for lo, hi in zip(starts, ends):
        img[lo:hi] = 1
    return img.reshape(shape).T

def mask2rle(img):
    '''
    img: numpy array, 1 - mask, 0 - background
    Returns run length as string formatted
    This function is specific to this competition
    '''
    pixels= img.T.flatten()
    pixels = np.concatenate([[0], pixels, [0]])
    runs = np.where(pixels[1:] != pixels[:-1])[0] + 1
    runs[1::2] -= runs[::2]
    return ' '.join(str(x) for x in runs)
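
A quick round-trip check on a tiny image helps confirm that the two functions are inverses and that pixels are numbered column by column. The functions are repeated here in condensed form only so that the snippet runs on its own; the 4x3 image size is chosen purely for illustration.

```python
import numpy as np

def rle2mask(mask_rle, shape=(1600, 256)):
    # Same logic as above: shape is (width, height), result is (height, width)
    s = mask_rle.split()
    starts = np.asarray(s[0::2], dtype=int) - 1   # convert to 0-based starts
    lengths = np.asarray(s[1::2], dtype=int)
    img = np.zeros(shape[0] * shape[1], dtype=np.uint8)
    for lo, hi in zip(starts, starts + lengths):
        img[lo:hi] = 1
    return img.reshape(shape).T

def mask2rle(img):
    # Same logic as above: flatten column by column, then find run boundaries
    pixels = np.concatenate([[0], img.T.flatten(), [0]])
    runs = np.where(pixels[1:] != pixels[:-1])[0] + 1
    runs[1::2] -= runs[::2]
    return ' '.join(str(x) for x in runs)

rle = '1 3 10 2'
mask = rle2mask(rle, shape=(4, 3))   # 4 pixels wide, 3 pixels high
round_trip = mask2rle(mask)
```

The first run fills the whole first column (pixels 1 to 3), and the second run fills the top two pixels of the fourth column (pixels 10 and 11).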

In [0]:
# Visualization: Sample images having defect
for k in [1,2,3,4]:
    tmp = []
    cnt=0
    print("Sample images with Class {} defect:".format(k))
    for i in X_train[X_train[f'hasDefect_{k}']==1][['ImageId',f'Defect_{k}']].values:
        if cnt<5:
            fig, (ax1,ax2) = plt.subplots(nrows = 1,ncols = 2,figsize=(15, 7))
            img = Image.open(str(train_path + i[0]))
            ax1.imshow(img)
            ax1.set_title(i[0])
            cnt+=1
            ax2.imshow(rle2mask(i[1]))
            ax2.set_title(i[0]+'_mask_'+str(k))
            plt.show()
    print('-'*80)

### Observation:
The regional profiles of the masks look broadly similar across defect classes, so the classes are hard to tell apart from mask shape alone. Some tendencies are still visible: defect type 1 images tend to have multiple small regions, while defect type 3 and type 4 images tend to have multiple medium-sized regions. Defect type 2 and type 3 images also share some regional characteristics.

#### 'area' as a new feature
Used for thresholding masks after generating predictions

In [0]:
fig, ((ax1, ax2), (ax3, ax4)) = plt.subplots(nrows=2, ncols=2,figsize=(12,7))

tmp = X_train['Defect_1'][X_train['hasDefect_1']==1].apply(lambda s: sum([int(k) for k in s.split(' ')[1::2]]))
ax1.hist(tmp.values,bins = 25)
ax1.set_xlabel('Defect_1_area')
ax1.set_ylabel('Number of Images')


tmp = X_train['Defect_2'][X_train['hasDefect_2']==1].apply(lambda s: sum([int(k) for k in s.split(' ')[1::2]]))
ax2.hist(tmp.values,bins = 25)
ax2.set_xlabel('Defect_2_area')
ax2.set_ylabel('Number of Images')


tmp = X_train['Defect_3'][X_train['hasDefect_3']==1].apply(lambda s: sum([int(k) for k in s.split(' ')[1::2]]))
ax3.hist(tmp.values,bins = 25)
ax3.set_xlabel('Defect_3_area')
ax3.set_ylabel('Number of Images')


tmp = X_train['Defect_4'][X_train['hasDefect_4']==1].apply(lambda s: sum([int(k) for k in s.split(' ')[1::2]]))
ax4.hist(tmp.values,bins = 25)
ax4.set_xlabel('Defect_4_area')
ax4.set_ylabel('Number of Images')
plt.tight_layout()
plt.show()
print('-'*50)

plt.figure(figsize=(8,4))
for i in [1,2,3,4]:
    tmp = X_train[f'Defect_{i}'][X_train[f'hasDefect_{i}']==1].apply(lambda s: sum([int(k) for k in s.split(' ')[1::2]]))
    plt.plot(tmp,np.zeros_like(tmp)+i,'o')
plt.xlabel('area')
plt.ylabel('Defect type')
plt.show()
print('-'*50)

tmp =[]
for i in [1,2,3,4]:

    tmp.append(X_train[f'Defect_{i}'][X_train[f'hasDefect_{i}']==1].apply(lambda s: sum([int(k) for k in s.split(' ')[1::2]])).describe())
area_df = pd.DataFrame(tmp)
area_df.index=['Defect_1','Defect_2','Defect_3','Defect_4']
area_df

### Observation: 
There is considerable overlap in the range of area across defect types. The minimum areas are close to one another, while the maximum areas differ widely. We can use the minimum and maximum areas observed in the training images to threshold defect predictions on test images.

In [0]:
# removing areas below 2 percentile and above 98 percentile to threshold area of predicted masks
tmp = []
for i in [1,2,3,4]:
    tmp_1 = X_train[f'Defect_{i}'][X_train[f'hasDefect_{i}']==1].apply(lambda s: sum([int(k) for k in s.split(' ')[1::2]])).sort_values().reset_index().drop('index',axis=1)
    tmp.append([tmp_1.iloc[int(0.02*len(tmp_1))].values[0],tmp_1.iloc[-int(0.02*len(tmp_1))].values[0]])
print('Limiting area to above 2 percentile and below 98 percentile values: \n',tmp)
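
The same idea can be sketched with `np.percentile`. Note that this swaps in NumPy's interpolated percentile for the index-based lookup used above, so the resulting limits may differ slightly. The run-length strings below are made up, and `mask_area` is a hypothetical helper.

```python
import numpy as np

def mask_area(rle):
    """Area of a mask: the sum of the run lengths (every second value)."""
    return sum(int(v) for v in rle.split()[1::2])

# Synthetic EncodedPixels strings standing in for one defect class
areas = np.array([mask_area(r) for r in ['1 3 10 5', '4 100', '7 20 50 30', '2 1']])

# Lower and upper area limits at the 2nd and 98th percentiles
low, high = np.percentile(areas, [2, 98])
```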

In [0]:
area_threshold = pd.DataFrame([[500,15500],[700,10000],[1100,160000],[2800,127000]],
                               columns=['min','max'], index=['defect_1','defect_2','defect_3','defect_4'])
area_threshold # to threshold predictions

### Summary:
Based on the range of area for each defect, we will threshold predictions to filter out outliers. For example, some predicted masks have only 4 pixels with value 1. Such a mask will reduce the performance of the model on the final metric.
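
One way such a filter could look is sketched below. The limits are copied from the `area_threshold` dataframe above, while the function name `filter_by_area` and the choice of inclusive bounds are assumptions made for illustration.

```python
import numpy as np

# (min, max) area per defect class, copied from area_threshold above
AREA_LIMITS = {1: (500, 15500), 2: (700, 10000), 3: (1100, 160000), 4: (2800, 127000)}

def filter_by_area(mask, class_id, limits=AREA_LIMITS):
    """Return the mask unchanged if its area is within limits, else an empty mask."""
    lo, hi = limits[class_id]
    area = int(mask.sum())
    if lo <= area <= hi:
        return mask
    return np.zeros_like(mask)

tiny = np.zeros((256, 1600), dtype=np.uint8)
tiny[0, :4] = 1                 # only 4 positive pixels: treated as an outlier
plausible = np.zeros((256, 1600), dtype=np.uint8)
plausible[:40, :40] = 1         # 1600 positive pixels: within the class 1 limits

tiny_kept = int(filter_by_area(tiny, class_id=1).sum())
plausible_kept = int(filter_by_area(plausible, class_id=1).sum())
```

Replacing an implausible mask with an empty one is safer for the metric than keeping it, since an empty prediction on a defect-free image scores 1.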

## EDA conclusion: 

a) The dataset is imbalanced, so we use stratified sampling to split it into train and validation datasets. <br> 

b) This is a multi-label image segmentation problem. Since around 50% of the images have no defects, identifying defect-free images is just as important as identifying defects. <br> 

c) Using the area thresholds from the 'test_thresolds' dataframe and the class probability thresholds (which are determined after predictions from the neural networks), we will keep the number of predicted images per defect close to the values in the 'count' column. <br>

d) Procedure:
  1. We will have a binary classification model to filter images with defects from no defect images. 
  2. A 4-label classification model to predict the probabilities of images belonging to each class.
  3. 4 segmentation models for four different classes to generate masks for each test image.
  4. Convert masks to EncodedPixels and filter them as per classification probabilities.

e) We are generating a new solution to the business problem with available libraries: tensorflow, keras and segmentation_models. 

### Model architecture:
<img src='https://github.com/secutron/steel-defect-detection/blob/master/model_arch_new.jpg?raw=1' width=1000px align=left>

The blue dots in the architecture image indicate that an input is given at that level, while the black dot near "Apply thresholds" marks where thresholds are applied to the predicted masks. At this stage, predictions are filtered by defect presence probability, defect class probability, and defect area.
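
The threshold stage can be sketched as a single decision function. Everything here is hypothetical: the function name, the argument names, and in particular the 0.5 probability cut-offs, which are placeholders because the actual thresholds are determined after the networks have produced predictions.

```python
def keep_prediction(p_has_defect, p_class, area, area_min, area_max,
                    p_has_defect_min=0.5, p_class_min=0.5):
    """Keep a predicted mask only if all three checks pass.

    p_has_defect: output of the binary classifier
    p_class:      output of the multi-label classifier for this class
    area:         number of positive pixels in the predicted mask
    """
    return (p_has_defect >= p_has_defect_min
            and p_class >= p_class_min
            and area_min <= area <= area_max)

kept = keep_prediction(0.9, 0.8, area=1200, area_min=500, area_max=15500)
dropped_low_prob = keep_prediction(0.2, 0.8, area=1200, area_min=500, area_max=15500)
dropped_small = keep_prediction(0.9, 0.8, area=4, area_min=500, area_max=15500)
```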

## 4. Data preparation and Model Building

In [0]:
train_segmentation = True
train_classification_binary = True
train_classification_multi = True
epochs = 30

In [0]:
# Metrics
# For image segmentation
# COMPETITION METRIC
# https://www.kaggle.com/xhlulu/severstal-simple-keras-u-net-boilerplate
def dice_coef(y_true, y_pred, smooth=K.epsilon()):
    '''
    This function returns dice coefficient of similarity between y_true and y_pred
    Dice coefficient is also referred to as F1_score, but we will use this name for image segmentation models
    For example, 
    let an instance of y_true and y_pred be [[1,1],[0,1]] and [[1,0],[0,1]]
    this metric first flattens them into [1,1,0,1] and [1,0,0,1],
    then intersection is calculated as 1*1 + 1*0 + 0*0 + 1*1 = 2 and sum(y_true)+sum(y_pred)= 3+2 = 5
    this returns the value (2.* 2 + 1e-7)/(3 + 2 + 1e-7) ~ 0.8    
    '''
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    intersection = K.sum(y_true_f * y_pred_f)
    return (2. * intersection + smooth) / (K.sum(y_true_f) + K.sum(y_pred_f) + smooth)

# Custom metrics, https://stackoverflow.com/questions/59196793/why-are-my-metrics-of-my-cnn-not-changing-with-each-epoch
# For classification
def recall_m(y_true, y_pred):
    '''
    This function returns recall_score between y_true and y_pred
    This function is ported as a metric to the Neural Network Models
    Keras backend is used to take care of batch type training, the metric takes in a batch of y_true and corresponding y_pred 
    as input and returns recall score of the batch
    '''
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1))) # calculates number of true positives
    possible_positives = K.sum(K.round(K.clip(y_true, 0, 1)))      # calculates number of actual positives
    recall = true_positives / (possible_positives + K.epsilon())   # K.epsilon avoids division by zero
    return recall

def precision_m(y_true, y_pred):
    '''
    This function returns precision_score between y_true and y_pred
    This function is ported as a metric to the Neural Network Models
    Keras backend is used to take care of batch type training, the metric takes in a batch of y_true and corresponding y_pred 
    as input and returns precision score of the batch
    '''
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))  # calculates number of true positives
    predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1)))      # calculates number of predicted positives   
    precision = true_positives /(predicted_positives + K.epsilon()) # K.epsilon avoids division by zero
    return precision
    
def f1_score_m(y_true, y_pred):
    '''
    This function returns f1_score between y_true and y_pred
    This function is ported as a metric to the Neural Network Models
    Keras backend is used to take care of batch type training, the metric takes in a batch of y_true and corresponding y_pred 
    as input and returns f1 score of the batch
    '''
    precision = precision_m(y_true, y_pred)  # calls precision metric and takes the precision score of the batch
    recall = recall_m(y_true, y_pred)        # calls recall metric and takes the recall score of the batch
    return 2*((precision*recall)/(precision+recall+K.epsilon()))

dependencies = {
    'recall_m':recall_m,
    'precision_m':precision_m,
    'dice_coef':dice_coef,
    'f1_score_m':f1_score_m,
    'dice_loss':sm.losses.dice_loss
}
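
To sanity-check the metrics outside of a training run, the same formulas can be mirrored in NumPy. This is a sketch rather than the notebook's code: `EPS` stands in for `K.epsilon()` and the `*_np` function names are made up. It reuses the worked example from the `dice_coef` docstring.

```python
import numpy as np

EPS = 1e-7  # stand-in for K.epsilon()

def dice_np(y_true, y_pred, smooth=EPS):
    t, p = y_true.ravel(), y_pred.ravel()
    return (2.0 * np.sum(t * p) + smooth) / (np.sum(t) + np.sum(p) + smooth)

def precision_recall_f1_np(y_true, y_pred):
    tp = np.sum(np.round(np.clip(y_true * y_pred, 0, 1)))                   # true positives
    precision = tp / (np.sum(np.round(np.clip(y_pred, 0, 1))) + EPS)   # tp / predicted positives
    recall = tp / (np.sum(np.round(np.clip(y_true, 0, 1))) + EPS)      # tp / actual positives
    f1 = 2 * precision * recall / (precision + recall + EPS)
    return precision, recall, f1

y_true = np.array([[1., 1.], [0., 1.]])
y_pred = np.array([[1., 0.], [0., 1.]])

dice = dice_np(y_true, y_pred)
precision, recall, f1 = precision_recall_f1_np(y_true, y_pred)
```

On binary inputs the Dice coefficient and the F1 score coincide (both are about 0.8 here), which matches the remark in the `dice_coef` docstring.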

### 4.1 Binary Classification

- Train a model to predict the probability that an image contains a defect.

### 4.1.1 Data Preparation

In [0]:
X_train_binary = X_train[['ImageId','hasDefect']]
X_val_binary = X_val[['ImageId','hasDefect']]
X_test_binary = X_test[['ImageId','hasDefect']]

print(X_train.shape, X_val.shape, X_test.shape)
print(X_train_binary.shape, X_val_binary.shape, X_test_binary.shape)

In [0]:
# https://keras.io/preprocessing/image/
# https://stackoverflow.com/questions/52754492/write-custom-data-generator-for-keras

# DataGenerator for the binary classification model with image augmentations

train_DataGenerator_1 = ImageDataGenerator(rescale=1./255., shear_range=0.2, zoom_range=0.05, rotation_range=5,
                           width_shift_range=0.2, height_shift_range=0.2, horizontal_flip=True, vertical_flip=True)

test_DataGenerator_1 = ImageDataGenerator(rescale=1./255)

train_generator = train_DataGenerator_1.flow_from_dataframe(
        dataframe=X_train_binary.astype(str),
        directory=train_path,
        x_col="ImageId",
        y_col="hasDefect",
        target_size=(256,512),
        batch_size=16,
        class_mode='binary')

validation_generator = test_DataGenerator_1.flow_from_dataframe(
        dataframe=X_val_binary.astype(str),
        directory=train_path,
        x_col="ImageId",
        y_col="hasDefect",
        target_size=(256,512),
        batch_size=16,
        class_mode='binary')

### 4.1.2 Binary Classification Model Definition

In [0]:
# https://www.youtube.com/watch?v=2U6Jl7oqRkM
# Using a pretrained model from keras for classification: 
# Selecting Xception pretrained model
# https://keras.io/applications/

base_model = keras.applications.xception.Xception(include_top = False, input_shape = (256,512,3))

# add a global spatial average pooling layer
x = base_model.output
x = GlobalAveragePooling2D()(x)

# let's add a fully-connected layer
x = Dense(1024, activation='relu')(x)
x = BatchNormalization()(x)
x = Dropout(0.3)(x)

x = Dense(512, activation='relu')(x)
x = BatchNormalization()(x)
x = Dropout(0.3)(x)

x = Dense(64, activation='relu')(x)

# and the prediction layer
predictions = Dense(1, activation='sigmoid')(x)

# this is the model we will train
model = Model(inputs=base_model.input, outputs=predictions)
model.summary()

### 4.1.3 Binary Classification Model Training

In [0]:
model.compile(optimizer='adam', loss='binary_crossentropy',metrics=['acc',f1_score_m,precision_m,recall_m])
if train_classification_binary==True:
    logdir = "/content/drive/My Drive/severstal_february/severstal_logs/scalars/" + datetime.now().strftime("%Y%m%d-%H%M%S")+'_binary_01_02_2020'

    # https://www.tensorflow.org/tensorboard/r2/scalars_and_keras
    file_writer = tf.summary.FileWriter(logdir + "/metrics")
    tensorboard = keras.callbacks.TensorBoard(log_dir=logdir,histogram_freq=0,write_images=True)
    # https://keras.io/callbacks/

    mc = ModelCheckpoint('/content/drive/My Drive/severstal_february/severstal_model/severstal_binary_01_02_2020.h5', monitor='val_f1_score_m', mode='max', verbose=1, save_best_only=True)
    history = model.fit_generator(train_generator, validation_data = validation_generator, epochs = epochs, verbose=1, callbacks = [mc,tensorboard])
    file_writer.close()

In [0]:
vy = history.history['val_loss']
ty = history.history['loss']
x = list(range(1,len(vy)+1))
fig,ax = plt.subplots(1,1)
ax.plot(x,vy,'r',label = "Validation loss")
ax.plot(x,ty,'b',label = "loss")
ax.set_xlabel('epoch')
ax.set_ylabel('Loss: BCE')
plt.legend()
plt.grid()
plt.show()

Tensorboard visualization (similar to above plot)

<img src='https://github.com/secutron/steel-defect-detection/blob/master/binary_tensorboard/train_loss.jpg?raw=1' width=600px align = left> 

**Summary:** Train loss reduction is smooth.

<img src='https://github.com/secutron/steel-defect-detection/blob/master/binary_tensorboard/val_loss.jpg?raw=1' width=600px align = left>

**Summary:** The binary cross-entropy loss varies widely on the validation set. This suggests that the model is having a hard time generalizing to unseen data.

In [0]:
vy = history.history['val_f1_score_m']
ty = history.history['f1_score_m']
x = list(range(1,len(vy)+1))
fig,ax = plt.subplots(1,1)
ax.plot(x,vy,'r',label = "Validation f1_score")
ax.plot(x,ty,'b',label = "f1_score") # Train set
ax.set_xlabel('epoch')
ax.set_ylabel('Metric: F1_score')
plt.legend()
plt.grid()
plt.show()

<img src='https://github.com/secutron/steel-defect-detection/blob/master/binary_tensorboard/train_f1.jpg?raw=1' width=600px align = left>

<img src='https://github.com/secutron/steel-defect-detection/blob/master/binary_tensorboard/val_f1.jpg?raw=1' width=600px align = left>

**Summary:** As with the BCE loss, the f1_score on the validation set varies a lot. This suggests that the training dataset may be insufficient.

In [0]:
vy = history.history['val_precision_m']
ty = history.history['precision_m']
x = list(range(1,len(vy)+1))
fig,ax = plt.subplots(1,1)
ax.plot(x,vy,'r',label = "Validation precision_m")
ax.plot(x,ty,'b',label = "precision_m") # Train set
ax.set_xlabel('epoch')
ax.set_ylabel('Metric: Precision')
plt.legend()
plt.grid()
plt.show()

**Summary:** Precision improves smoothly with each epoch. Precision is the fraction of predicted positives that are actually positive, so high precision means few false positives: the model rarely flags a defect-free image as defective.

In [0]:
vy = history.history['val_recall_m']
ty = history.history['recall_m']
x = list(range(1,len(vy)+1))
fig,ax = plt.subplots(1,1)
ax.plot(x,vy,'r',label = "Validation recall")
ax.plot(x,ty,'b',label = "recall")
ax.set_xlabel('epoch')
ax.set_ylabel('Metric: recall')
plt.legend()
plt.grid()
plt.show()

**Summary:** Recall varies considerably from epoch to epoch. Recall is the fraction of actual positives that the model predicts as positive, so high recall means few false negatives. The fluctuation suggests that the model is less reliable at catching every defective image.

**F1_score metric justification:** In this defect detection pipeline it is very important that the classification models achieve both high precision and high recall. f1_score, the harmonic mean of the two, is therefore a suitable metric for the classification models. Validation f1_score is monitored during training, and model weights are saved whenever it improves.
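As a small illustration of how precision, recall and f1_score relate, the sketch below computes all three from thresholded predictions. `precision_recall_f1` is a hypothetical NumPy helper written for this example; it is not the notebook's `precision_m`, `recall_m` or `f1_score_m` Keras metrics, and the labels and probabilities are made up.

```python
import numpy as np

def precision_recall_f1(y_true, y_prob, threshold=0.5):
    """Precision, recall and F1 for binary labels (hypothetical helper)."""
    y_true = np.asarray(y_true).astype(bool)
    y_hat = np.asarray(y_prob) >= threshold
    tp = np.sum(y_true & y_hat)     # defective, predicted defective
    fp = np.sum(~y_true & y_hat)    # defect-free, predicted defective
    fn = np.sum(y_true & ~y_hat)    # defective, predicted defect-free
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return float(precision), float(recall), float(f1)

# Made-up labels and predicted probabilities
y_true = [1, 1, 1, 0, 0, 1]
y_prob = [0.9, 0.8, 0.3, 0.6, 0.1, 0.7]
p, r, f1 = precision_recall_f1(y_true, y_prob)
print(p, r, f1)  # 3 TP, 1 FP, 1 FN
```

Because F1 is a harmonic mean, it stays low unless precision and recall are both high, which is why it suits a checkpoint criterion here.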

### 4.1.4 Binary Classification Evaluation

In [0]:
# Loading best model trained on binary classification
model = load_model('/content/drive/My Drive/severstal_february/severstal_model/severstal_binary_01_02_2020.h5', custom_objects=dependencies)

In [0]:
# Image augmentations must not be applied during evaluation, so the DataGenerators are redefined without them.

train_generator = test_DataGenerator_1.flow_from_dataframe(dataframe=X_train_binary.astype(str),
                                                           directory=train_path,
                                                           x_col="ImageId",
                                                           y_col="hasDefect",
                                                           target_size=(256,512),
                                                           batch_size=16,
                                                           class_mode='binary',
                                                           shuffle=False)

validation_generator = test_DataGenerator_1.flow_from_dataframe(dataframe=X_val_binary.astype(str),
                                                                directory=train_path,
                                                                x_col="ImageId",
                                                                y_col="hasDefect",
                                                                target_size=(256,512),
                                                                batch_size=16,
                                                                class_mode='binary',
                                                                shuffle=False)

test_generator = test_DataGenerator_1.flow_from_dataframe(dataframe=X_test_binary.astype(str),
                                                          directory=train_path,
                                                          x_col="ImageId",
                                                          y_col="hasDefect",
                                                          target_size=(256,512),
                                                          batch_size=16,
                                                          class_mode='binary',
                                                          shuffle=False)

In [0]:
train_evaluate = model.evaluate(train_generator,verbose=1)
print('Train set evaluation score:')
pd.DataFrame(train_evaluate,columns = [' '], index=['binary_crossentropy','acc','f1_score_m','precision_m','recall_m'])

In [0]:
val_evaluate = model.evaluate(validation_generator,verbose=1)
print('Validation set evaluation score:')
pd.DataFrame(val_evaluate,columns = [' '], index=['binary_crossentropy','acc','f1_score_m','precision_m','recall_m'])

In [0]:
test_evaluate = model.evaluate(test_generator,verbose=1)
print('Test set evaluation score:')
pd.DataFrame(test_evaluate,columns = [' '], index=['binary_crossentropy','acc','f1_score_m','precision_m','recall_m'])

**Summary:** The model performs well on the train, validation and test datasets. The loss and metric values are similar across the three datasets, which indicates that the model is not overfitting. The f1_score of 0.921 on the validation dataset is acceptable.
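The comparison described above can be made explicit by computing the per-metric gap between the train scores and the scores on another split. This is a minimal sketch: `generalization_gap` is a hypothetical helper and the score values are placeholders chosen for illustration, not results from this notebook.

```python
def generalization_gap(train_scores, other_scores):
    """Absolute per-metric difference between two dicts of scores (hypothetical helper)."""
    return {name: abs(train_scores[name] - other_scores[name])
            for name in train_scores}

# Placeholder values for illustration only (not results from this notebook)
train_scores = {"f1_score_m": 0.90, "precision_m": 0.92, "recall_m": 0.88}
val_scores = {"f1_score_m": 0.88, "precision_m": 0.91, "recall_m": 0.85}

gaps = generalization_gap(train_scores, val_scores)
largest = max(gaps, key=gaps.get)
print(gaps, largest)
```

Small gaps across all metrics are consistent with a model that is not overfitting; a large gap on one metric shows where to look first.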

## 4.2 Multi-label classification

### 4.2.1 Data Preparation

In [0]:
X_train_multi = X_train[['ImageId','hasDefect_1','hasDefect_2','hasDefect_3','hasDefect_4']][X_train['hasDefect']==1]
X_val_multi = X_val[['ImageId','hasDefect_1','hasDefect_2','hasDefect_3','hasDefect_4']][X_val['hasDefect']==1]
X_test_multi = X_test[['ImageId','hasDefect_1','hasDefect_2','hasDefect_3','hasDefect_4']][X_test['hasDefect']==1]

print(X_train.shape, X_val.shape, X_test.shape)
print(X_train_multi.shape, X_val_multi.shape, X_test_multi.shape)

In [0]:
# https://keras.io/preprocessing/image/
# https://stackoverflow.com/questions/52754492/write-custom-data-generator-for-keras

# DataGenerator for the multi label classification model with image augmentations
train_DataGenerator_2 = ImageDataGenerator(rescale=1./255., shear_range=0.2, zoom_range=0.05, rotation_range=5,
                           width_shift_range=0.2, height_shift_range=0.2, horizontal_flip=True, vertical_flip=True)


train_generator = train_DataGenerator_2.flow_from_dataframe(
        dataframe=X_train_multi.astype(str),
        directory='/content/A1_train',
        x_col="ImageId",
        y_col=["hasDefect_1","hasDefect_2","hasDefect_3","hasDefect_4"],
        target_size=(256,512),
        batch_size=16,
        class_mode='other')


test_DataGenerator_2 = ImageDataGenerator(rescale=1./255)
validation_generator = test_DataGenerator_2.flow_from_dataframe(
        dataframe=X_val_multi.astype(str),
        directory='/content/A1_train',
        x_col="ImageId",
        y_col=["hasDefect_1","hasDefect_2","hasDefect_3","hasDefect_4"],
        target_size=(256,512),
        batch_size=16,
        class_mode='other')

### 4.2.2 Multi-Label Classification Model Definition

In [0]:
# Using a pretrained model from keras for classification: 
# Selecting Xception pretrained model
# https://keras.io/applications/

base_model = keras.applications.xception.Xception(include_top = False, input_shape = (256,512,3))

# add a global spatial average pooling layer
x = base_model.output
x = GlobalAveragePooling2D()(x)

# let's add fully-connected layers
x = Dense(1024, activation='relu')(x)
x = BatchNormalization()(x)
x = Dropout(0.3)(x)

x = Dense(512, activation='relu')(x)
x = BatchNormalization()(x)
x = Dropout(0.3)(x)

x = Dense(64, activation='relu')(x)

# and the prediction layer
predictions = Dense(4, activation='sigmoid')(x)

# this is the model we will train
model = Model(inputs=base_model.input, outputs=predictions)
model.summary()
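The prediction layer above uses four sigmoid units rather than a softmax, so each defect type gets its own independent probability and an image can carry several labels at once. The sketch below shows that behavior with plain NumPy; the logits are made-up numbers, and the 0.5 threshold is an assumption for illustration.

```python
import numpy as np

def sigmoid(z):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-z))

# Made-up logits for two images and four defect types
logits = np.array([[2.0, -1.0, 0.5, -3.0],
                   [-2.0, -2.0, 3.0, 1.0]])

probs = sigmoid(logits)               # one independent probability per defect type
labels = (probs >= 0.5).astype(int)   # assumed threshold of 0.5 per label
print(labels)
print(probs.sum(axis=1))              # rows need not sum to 1, unlike softmax
```

This independence is also why `binary_crossentropy` is a natural loss for the multi-label model: it treats each of the four outputs as its own binary problem.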

### 4.2.3 Multi-label Classification Model Training

In [0]:
model.compile(optimizer='adam', loss='binary_crossentropy',metrics=['acc',f1_score_m,precision_m,recall_m])
if train_classification_multi==True:
    logdir = "/content/drive/My Drive/severstal_february/severstal_logs/scalars/" + datetime.now().strftime("%Y%m%d-%H%M%S")+'_multi_Defect_01_02_2020'
    # https://www.tensorflow.org/tensorboard/r2/scalars_and_keras
    file_writer = tf.summary.FileWriter(logdir + "/metrics")
    tensorboard = keras.callbacks.TensorBoard(log_dir=logdir,histogram_freq=0,write_images=True)
    # https://keras.io/callbacks/
    mc = ModelCheckpoint('/content/drive/My Drive/severstal_february/severstal_model/severstal_multi_01_02_2020.h5', monitor='val_f1_score_m', mode='max', verbose=1, save_best_only=True)
    history = model.fit_generator(train_generator, validation_data = validation_generator, epochs = epochs, verbose=1, callbacks = [mc,tensorboard])
    file_writer.close()
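The checkpoint callback above is configured with `monitor='val_f1_score_m'`, `mode='max'` and `save_best_only=True`, so weights are written only at epochs where the monitored value beats the best seen so far. The pure-Python sketch below mimics that rule. `epochs_that_save` is a hypothetical helper, not Keras code, and the score list is made up.

```python
def epochs_that_save(values, mode="max"):
    """1-based epochs at which a save-best-only rule would write a checkpoint."""
    saved, best = [], None
    for epoch, value in enumerate(values, start=1):
        if mode == "max":
            improved = best is None or value > best
        else:
            improved = best is None or value < best
        if improved:
            best = value
            saved.append(epoch)
    return saved

# Made-up validation scores, one per epoch
val_f1 = [0.61, 0.70, 0.68, 0.74, 0.73]
saved_epochs = epochs_that_save(val_f1, mode="max")
print(saved_epochs)  # the last entry is the epoch whose weights remain on disk
```

Since each save overwrites the same file, the model loaded later for evaluation corresponds to the last epoch in this list, not necessarily the final training epoch.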

In [0]:
# plot on "loss" vs epoch
vy = history.history['val_loss']
ty = history.history['loss']
x = list(range(1,len(vy)+1))
fig,ax = plt.subplots(1,1)
ax.plot(x,vy,'r',label = "Validation loss")
ax.plot(x,ty,'b',label = "Train loss")
ax.set_xlabel('Epoch')
ax.set_ylabel('Loss: Binary Cross Entropy')
plt.legend()
plt.grid()
plt.show()

<img src='https://github.com/secutron/steel-defect-detection/blob/master/multi_tensorboard/train_loss.jpg?raw=1' width=600px align = left>

<img src='https://github.com/secutron/steel-defect-detection/blob/master/multi_tensorboard/val_loss.jpg?raw=1' width=600px align = left>

**Summary:** The binary cross entropy loss decreases smoothly on the validation set.

In [0]:
# Plot on f1_score vs epoch
vy = history.history['val_f1_score_m']
ty = history.history['f1_score_m']
x = list(range(1,len(vy)+1))
fig,ax = plt.subplots(1,1)
ax.plot(x,vy,'r',label = "Validation f1_score")
ax.plot(x,ty,'b',label = "Train f1_score")
ax.set_xlabel('epoch')
ax.set_ylabel('Metric: F1_score')
plt.legend()
plt.grid()
plt.show()

<img src='https://github.com/secutron/steel-defect-detection/blob/master/multi_tensorboard/train_f1.jpg?raw=1' width=600px align = left>

<img src='https://github.com/secutron/steel-defect-detection/blob/master/multi_tensorboard/val_f1.jpg?raw=1' width=600px align = left>

**Summary:** As with the BCE loss, f1_score on the validation set improves smoothly.

### 4.2.4 Multi-label Classification Evaluation

In [0]:
# loading best saved multi_label classification model
model = load_model('/content/drive/My Drive/severstal_february/severstal_model/severstal_multi_01_02_2020.h5', custom_objects=dependencies)

In [0]:
# During evaluation we do not require image augmentations
train_generator = test_DataGenerator_2.flow_from_dataframe(dataframe=X_train_multi.astype(str),
                                                           directory=train_path,
                                                           x_col="ImageId",
                                                           y_col=["hasDefect_1","hasDefect_2","hasDefect_3","hasDefect_4"],
                                                           target_size=(256,512),
                                                           batch_size=16,
                                                           class_mode='other',
                                                           shuffle=False)

validation_generator = test_DataGenerator_2.flow_from_dataframe(dataframe=X_val_multi.astype(str),
                                                                directory=train_path,
                                                                x_col="ImageId",
                                                                y_col=["hasDefect_1","hasDefect_2","hasDefect_3","hasDefect_4"],
                                                                target_size=(256,512),
                                                                batch_size=16,
                                                                class_mode='other',
                                                                shuffle=False)

test_generator = test_DataGenerator_2.flow_from_dataframe(dataframe=X_test_multi.astype(str),
                                                          directory=train_path,
                                                          x_col="ImageId",
                                                          y_col=["hasDefect_1","hasDefect_2","hasDefect_3","hasDefect_4"],
                                                          target_size=(256,512),
                                                          batch_size=16,
                                                          class_mode='other',
                                                          shuffle=False)


In [0]:
train_evaluate = model.evaluate(train_generator,verbose=1)
print('Train set evaluation score:')
pd.DataFrame(train_evaluate,columns = [' '], index=['binary_crossentropy','acc','f1_score_m','precision_m','recall_m'])

In [0]:
val_evaluate = model.evaluate(validation_generator,verbose=1)
print('Validation set evaluation score:')
pd.DataFrame(val_evaluate,columns = [' '], index=['binary_crossentropy','acc','f1_score_m','precision_m','recall_m'])

In [0]:
test_evaluate = model.evaluate(test_generator,verbose=1)
print('Test set evaluation score:')
pd.DataFrame(test_evaluate,columns = [' '], index=['binary_crossentropy','acc','f1_score_m','precision_m','recall_m'])

**Summary:** The multi-label classification model generalizes well to unseen data: the evaluation values on the validation and test sets are close to those on the train set.

## 4.3 Image segmentation

### 4.3.1 Data preparation

In [0]:
# Dividing the datasets w.r.t. Class Label (defect type)
train_data_1 = X_train[X_train['hasDefect_1']==1][['ImageId','Defect_1']]
train_data_2 = X_train[X_train['hasDefect_2']==1][['ImageId','Defect_2']]
train_data_3 = X_train[X_train['hasDefect_3']==1][['ImageId','Defect_3']]
train_data_4 = X_train[X_train['hasDefect_4']==1][['ImageId','Defect_4']]

val_data_1 = X_val[X_val['hasDefect_1']==1][['ImageId','Defect_1']]
val_data_2 = X_val[X_val['hasDefect_2']==1][['ImageId','Defect_2']]
val_data_3 = X_val[X_val['hasDefect_3']==1][['ImageId','Defect_3']]
val_data_4 = X_val[X_val['hasDefect_4']==1][['ImageId','Defect_4']]

test_data_1 = X_test[X_test['hasDefect_1']==1][['ImageId','Defect_1']]
test_data_2 = X_test[X_test['hasDefect_2']==1][['ImageId','Defect_2']]
test_data_3 = X_test[X_test['hasDefect_3']==1][['ImageId','Defect_3']]
test_data_4 = X_test[X_test['hasDefect_4']==1][['ImageId','Defect_4']]

train_data_1.columns = train_data_2.columns = train_data_3.columns = train_data_4.columns = ['ImageId','EncodedPixels']
val_data_1.columns = val_data_2.columns = val_data_3.columns = val_data_4.columns = ['ImageId','EncodedPixels']
test_data_1.columns = test_data_2.columns = test_data_3.columns = test_data_4.columns = ['ImageId','EncodedPixels']

print(test_data_1.head())
print('-'*50)
print(X_train.shape, X_val.shape, X_test.shape)
print(train_data_1.shape,val_data_1.shape,test_data_1.shape)
print(train_data_2.shape,val_data_2.shape,test_data_2.shape)
print(train_data_3.shape,val_data_3.shape,test_data_3.shape)
print(train_data_4.shape,val_data_4.shape,test_data_4.shape)

In [0]:
# Code reference: https://www.kaggle.com/cdeotte/keras-unet-with-eda
def rle2maskResize(rle):
    '''
    Generates the mask for an image from its run-length encoding (RLE).
    Takes EncodedPixels as input, decodes it into a 256x1600 mask and returns the mask
    downsampled to 256x800, the shape used by all segmentation models.
    '''
    if pd.isnull(rle) or rle == '': # An empty EncodedPixels string yields an empty mask
        return np.zeros((256,800), dtype=np.uint8)

    height = 256
    width = 1600
    mask = np.zeros(width*height, dtype=np.uint8)

    array = np.asarray([int(x) for x in rle.split()])
    starts = array[0::2]-1 # RLE pixel positions are 1-based while array indices are 0-based
    lengths = array[1::2]  # Every second element is a run length: the number of consecutive active pixels (value = 1)
    for index, start in enumerate(starts):
        mask[int(start):int(start+lengths[index])] = 1 # Mark every pixel in the run as defect

    return mask.reshape((height,width),order='F')[::,::2] # Column-major reshape, then keep every second column
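To make the decoding steps concrete, here is a toy version on a 4x3 mask instead of 256x1600. `rle_decode` is a hypothetical helper written for this illustration and the RLE string is made up; it follows the same conventions as the function above (1-based starts, column-major pixel order).

```python
import numpy as np

def rle_decode(rle, height, width):
    """Decode a '<start> <length> ...' RLE string into a (height, width) mask.

    Hypothetical helper: starts are 1-based and pixels are numbered
    top-to-bottom, then left-to-right (column-major order).
    """
    mask = np.zeros(height * width, dtype=np.uint8)
    if rle:
        numbers = np.asarray(rle.split(), dtype=int)
        starts, lengths = numbers[0::2] - 1, numbers[1::2]
        for start, length in zip(starts, lengths):
            mask[start:start + length] = 1
    return mask.reshape((height, width), order="F")

# Two runs: pixels 2-4 (first column) and pixels 9-10 (third column)
mask = rle_decode("2 3 9 2", height=4, width=3)
half_width = mask[:, ::2]   # keep every second column, as rle2maskResize does
print(mask)
print(half_width)
```

Reshaping with `order="F"` is what places consecutive pixel numbers down a column rather than along a row; with the default row-major order the runs would land in the wrong places.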

In [0]:
# Code references:
# https://www.kaggle.com/cdeotte/keras-unet-with-eda
# https://stackoverflow.com/questions/52754492/write-custom-data-generator-for-keras
# DataGenerator custom built for training segmentation models with random image augmentations
class train_DataGenerator_3(keras.utils.Sequence): # with augmentation for training
    '''
    The DataGenerator takes a batch of ImageIds (default batch size 8) and returns the image array
    to the model together with its mask.
    Each ImageId is used to locate the image file in data_path; the image is read and resized from
    256x1600 to 256x800.
    A set of random numbers decides which random image augmentations are applied to the batch.
    Shuffling is enabled during training to vary the order in which images are processed at each epoch.
    '''
    def __init__(self, df, batch_size = 8,  shuffle=True, 
                 preprocess=None, info=None):
        super().__init__()
        self.df = df
        self.shuffle = shuffle
        self.batch_size = batch_size
        self.preprocess = preprocess
        self.info = {} if info is None else info # avoid sharing one mutable default dict across instances
        self.data_path = '/content/A1_train/'
        self.on_epoch_end()

    def __len__(self):
        return int(np.floor(len(self.df) / self.batch_size))
    
    def on_epoch_end(self):
        self.indexes = np.arange(len(self.df))
        if self.shuffle == True:
            np.random.shuffle(self.indexes)
    
    def __getitem__(self, index): 
        X = np.empty((self.batch_size,256,800,3),dtype=np.float32)
        X1 = np.empty((self.batch_size,256,800,3),dtype=np.float32)

        y = np.empty((self.batch_size,256,800,1),dtype=np.int8)
        y1 = np.empty((self.batch_size,256,800,1),dtype=np.int8)

        indexes = self.indexes[index*self.batch_size:(index+1)*self.batch_size]
        for i,f in enumerate(self.df['ImageId'].iloc[indexes]):
            self.info[index*self.batch_size+i]=f
            X[i,] = Image.open(self.data_path + f).resize((800,256))           
            y[i,:,:,0] = rle2maskResize(self.df['EncodedPixels'].iloc[indexes[i]])
        if self.preprocess is not None: X = self.preprocess(X)

        # generate some random image augmentations
        augment = random()
        if augment>0.35:
            in_gen1 = ImageDataGenerator()
            augment1 = random()
            augment2 = random()
            augment3 = random()
            augment4 = random()
            augment5 = random()
            augment6 = random()

            args = dict(tx = 0, ty = 0, zx = 1.0, zy= 1.0, flip_horizontal = False, flip_vertical = False)

            if augment1>0.5:
                args.update({'tx':50})

            if augment2>0.5:
                args.update({'ty':25})

            if augment3>0.5:
                args.update({'zx':0.9})

            if augment4>0.5:
                args.update({'zy':0.9})

            if augment5>0.5:
                args.update({'flip_horizontal' : True})

            if augment6>0.5:
                args.update({'flip_vertical' : True})

            for i,h in enumerate(X):
                X1[i] = in_gen1.apply_transform(h, transform_parameters = args)
            for i,g in enumerate(y):
                y1[i] = in_gen1.apply_transform(g, transform_parameters = args)
            return X1, y1
        else:
            return X, y
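The generator above passes the same `args` dictionary to `apply_transform` for both the images and the masks. That matters for segmentation: if the image and its mask were transformed differently, the labels would no longer line up with the pixels. The sketch below demonstrates the idea with plain NumPy flips as a stand-in for the Keras transform; `flip_pair` and the tiny arrays are assumptions made for illustration.

```python
import numpy as np

def flip_pair(image, mask, horizontal, vertical):
    """Apply identical flips to an image (H, W, C) and its mask (H, W, 1)."""
    if horizontal:
        image, mask = image[:, ::-1], mask[:, ::-1]
    if vertical:
        image, mask = image[::-1], mask[::-1]
    return image, mask

rng = np.random.default_rng(0)
image = rng.random((4, 6, 3)).astype(np.float32)   # values in [0, 1)
mask = np.zeros((4, 6, 1), dtype=np.int8)
mask[0, 1, 0] = 1        # one "defect" pixel in the mask
image[0, 1] = 9.0        # tag the same pixel in the image so it can be tracked

aug_image, aug_mask = flip_pair(image, mask, horizontal=True, vertical=True)
row, col = np.argwhere(aug_mask[..., 0] == 1)[0]
print(row, col, aug_image[row, col])   # the tagged pixel moved with its label
```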

In [0]:
class test_DataGenerator_3(keras.utils.Sequence): # without augmentations for predictions
    '''
    The DataGenerator takes a batch of ImageIds (default batch size 1) and returns the image array
    to the model together with its mask (an empty mask when EncodedPixels is missing).
    Image augmentations are not required during the prediction and evaluation stages, so the train
    DataGenerator is modified to create this test DataGenerator.
    Each ImageId is used to locate the image file in data_path; the image is read and resized from
    256x1600 to 256x800.
    Shuffling is disabled during predictions to make sure each prediction belongs to its corresponding ImageId.
    '''
    def __init__(self, df, batch_size = 1, shuffle=False, 
                 preprocess=None, info=None):
        super().__init__()
        self.df = df
        self.shuffle = shuffle
        self.batch_size = batch_size
        self.preprocess = preprocess
        self.info = {} if info is None else info # avoid sharing one mutable default dict across instances
        self.data_path = '/content/A1_train/'
        self.on_epoch_end()

    def __len__(self):
        return int(np.floor(len(self.df) / self.batch_size))
    
    def on_epoch_end(self):
        self.indexes = np.arange(len(self.df))
        if self.shuffle == True:
            np.random.shuffle(self.indexes)
    
    def __getitem__(self, index): 
        X = np.empty((self.batch_size,256,800,3),dtype=np.float32)
        y = np.empty((self.batch_size,256,800,1),dtype=np.int8)

        indexes = self.indexes[index*self.batch_size:(index+1)*self.batch_size]
        for i,f in enumerate(self.df['ImageId'].iloc[indexes]):
            self.info[index*self.batch_size+i]=f
            X[i,] = Image.open(self.data_path + f).resize((800,256))      
            y[i,:,:,0] = rle2maskResize(self.df['EncodedPixels'].iloc[indexes[i]])
        if self.preprocess is not None: X = self.preprocess(X)
        return X, y
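Both generators compute the number of batches as `np.floor(len(df) / batch_size)`, which silently drops any samples left over after the last full batch. With the evaluation default of `batch_size = 1` nothing is dropped, but with the training default of 8 up to 7 images per epoch are skipped. The sketch below shows the arithmetic; `batches_per_epoch` is a hypothetical helper and the sample counts are made up.

```python
import math

def batches_per_epoch(n_samples, batch_size, drop_last=True):
    """Return (number of batches, number of samples left unused)."""
    if drop_last:
        n_batches = n_samples // batch_size          # floor, as in __len__ above
        unused = n_samples - n_batches * batch_size
    else:
        n_batches = math.ceil(n_samples / batch_size)
        unused = 0
    return n_batches, unused

print(batches_per_epoch(10, 8))                    # one full batch, two images skipped
print(batches_per_epoch(10, 8, drop_last=False))   # last batch is partial
print(batches_per_epoch(10, 1))                    # batch size 1 never drops samples
```

Switching to the ceiling variant would also require `__getitem__` to size its arrays to the actual batch length, since it currently allocates `batch_size` rows with `np.empty`.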

### 4.3.2 Segmentation Model definition

In [0]:
# https://github.com/tensorflow/tpu/blob/master/models/official/efficientnet/preprocessing.py
# preprocesses image to input to the segmentation_model, generally image pixel value standardization
preprocess = get_preprocessing('efficientnetb1') 

# https://github.com/qubvel/segmentation_models
# segmentation using pretrained weights for faster convergence
model = Unet('efficientnetb1', classes=1, activation='sigmoid', encoder_weights='imagenet') 
model.summary()

### 4.3.3 Segmentation Model Training, Evaluation and Predictions

### I) Defect 1

In [0]:
if train_segmentation == True:
    # TRAIN AND VALIDATE MODEL
    # Defect 1
    model.compile(optimizer='adam', loss=sm.losses.dice_loss,metrics=[dice_coef])
    train_batches = train_DataGenerator_3(train_data_1,shuffle=True,preprocess=preprocess)    
    valid_batches = test_DataGenerator_3(val_data_1,preprocess=preprocess)
    
    # https://www.tensorflow.org/tensorboard/r2/scalars_and_keras
    logdir = "/content/drive/My Drive/severstal_february/severstal_logs/scalars/" + datetime.now().strftime("%Y%m%d-%H%M%S")+'_Defect_1_01_02_2020'
    file_writer = tf.summary.FileWriter(logdir + "/metrics")
    tensorboard = keras.callbacks.TensorBoard(log_dir=logdir,histogram_freq=0,write_images=True)
    
    # https://keras.io/callbacks/
    mc = ModelCheckpoint('/content/drive/My Drive/severstal_february/severstal_model/severstal_segmentation_Defect_1_01_02_2020.h5', monitor='val_dice_coef', mode='max', verbose=1, save_best_only=True)
    #model training
    history = model.fit_generator(train_batches, validation_data = valid_batches, epochs = epochs, verbose=1, callbacks = [mc,tensorboard])
    
    # plotting the metric
    vy = history.history['val_dice_coef']
    ty = history.history['dice_coef']
    x = list(range(1,len(vy)+1))
    fig,ax = plt.subplots(1,1)
    ax.plot(x,vy,'r',label = "Validation dice")
    ax.plot(x,ty,'b',label = "Train dice")
    ax.set_xlabel('epoch')
    ax.set_ylabel('Metric: dice coef')
    plt.legend()
    plt.grid()
    plt.show()
    file_writer.close()

<img src='https://github.com/secutron/steel-defect-detection/blob/master/defect_1_tensorboard/train_dice.jpg?raw=1' width=600px align = left>

<img src='https://github.com/secutron/steel-defect-detection/blob/master/defect_1_tensorboard/val_dice.jpg?raw=1' width=600px align = left>

**Summary:** The model's performance improves smoothly at each epoch. The model also does not appear to be overfitting the training set, since the dice coefficient values on the two datasets are close to each other and improve together.

### Defect type 1: Evaluation

In [0]:
model = load_model('/content/drive/My Drive/severstal_february/severstal_model/severstal_segmentation_Defect_1_01_02_2020.h5', custom_objects=dependencies)

In [0]:
train_evaluate = model.evaluate(test_DataGenerator_3(train_data_1,preprocess=preprocess),verbose=1)
print('Train set evaluation score:')
pd.DataFrame(train_evaluate, columns = [' '], index=['dice_loss','dice_coef'])

In [0]:
validation_evaluate = model.evaluate(test_DataGenerator_3(val_data_1,preprocess=preprocess),verbose=1)
print('Validation set evaluation score:')
pd.DataFrame(validation_evaluate,columns = [' '], index=['dice_loss','dice_coef'])

In [0]:
test_evaluate = model.evaluate(test_DataGenerator_3(test_data_1,preprocess=preprocess),verbose=1)
print('Test set evaluation score:')
pd.DataFrame(test_evaluate,columns = [' '], index=['dice_loss','dice_coef'])

**Summary:** The dice coefficient values for the Defect 1 train, validation and test images differ noticeably from each other. <br>

**Dice loss = 1 - dice coefficient.**
The model's dice coefficient needs improvement, which may be achieved by training the model further, to 100+ epochs.

**Note:** Dice coefficient is also known as F1_score.
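The relationships stated above can be checked on a small example. The sketch below computes the dice coefficient of two binary masks as 2|A∩B| / (|A| + |B|) and compares it with F1 computed from true positives, false positives and false negatives. `dice_coefficient` is a hypothetical NumPy helper, not the notebook's `dice_coef` metric or `sm.losses.dice_loss`, and the masks are made up.

```python
import numpy as np

def dice_coefficient(y_true, y_pred):
    """Dice coefficient of two binary masks (hypothetical helper)."""
    y_true = np.asarray(y_true, dtype=np.float64).ravel()
    y_pred = np.asarray(y_pred, dtype=np.float64).ravel()
    denominator = y_true.sum() + y_pred.sum()
    if denominator == 0:
        return 1.0   # assumed convention: two empty masks agree perfectly
    return float(2.0 * np.sum(y_true * y_pred) / denominator)

# Made-up 2x4 masks
truth = np.array([[1, 1, 0, 0],
                  [1, 0, 0, 0]])
pred = np.array([[1, 0, 0, 0],
                 [1, 1, 0, 0]])

dice = dice_coefficient(truth, pred)
dice_loss = 1.0 - dice

# F1 from pixel counts: 2*TP / (2*TP + FP + FN)
tp = int(np.sum((truth == 1) & (pred == 1)))
fp = int(np.sum((truth == 0) & (pred == 1)))
fn = int(np.sum((truth == 1) & (pred == 0)))
f1 = 2 * tp / (2 * tp + fp + fn)
print(dice, dice_loss, f1)
```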

In [0]:
# Train dataset prediction visualization
train_preds = model.predict_generator(test_DataGenerator_3(train_data_1[10:20],preprocess=preprocess),verbose=1)
for i in range(10):
    fig, (ax1,ax2,ax3) = plt.subplots(1,3,figsize=(20, 13))
    img = cv2.imread(str("/content/A1_train/" + train_data_1[10:20].ImageId.values[i]))
    ax1.imshow(img)
    ax1.set_title(train_data_1[10:20].ImageId.values[i])

    ax2.imshow(rle2mask(train_data_1[10:20].EncodedPixels.values[i]))
    ax2.set_title('Ground Truth Mask')

    c1 = Image.fromarray(train_preds[i][:,:,0])
    ax3.imshow(np.array(c1.resize((1600,256)))>0.5)
    ax3.set_title('Predicted Mask')
    plt.show()

**Summary:** The visualizations above show how well the model has fit the training images. The predicted masks only approximate the outlines of the ground truth masks, which suggests that the model can be trained further to identify type 1 defects.
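In the visualization cell above, each 256x800 prediction is resized back to 1600 columns and thresholded at 0.5 before plotting. The sketch below illustrates the same two steps on a tiny array using NumPy only. It is a simplified stand-in under stated assumptions: `to_full_width_mask` is a hypothetical helper, it thresholds first and then repeats columns (nearest-neighbor), whereas the notebook resizes with PIL and thresholds afterwards.

```python
import numpy as np

def to_full_width_mask(pred, threshold=0.5, factor=2):
    """Binarize a probability map and repeat each column `factor` times."""
    binary = (np.asarray(pred) > threshold).astype(np.uint8)
    return np.repeat(binary, factor, axis=1)

# Made-up 2x3 probability map standing in for a 256x800 prediction
pred = np.array([[0.1, 0.9, 0.6],
                 [0.4, 0.2, 0.8]])
full = to_full_width_mask(pred)
print(full.shape)
print(full)
```

Column repetition is the natural inverse of the `[::, ::2]` downsampling used when the training masks are built, which is why it is used for this illustration.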

In [0]:
# Validation set
val_preds = model.predict_generator(test_DataGenerator_3(val_data_1[10:20],preprocess=preprocess),verbose=1)
for i in range(10):
    fig, (ax1,ax2,ax3) = plt.subplots(1,3,figsize=(20, 13))
    img = cv2.imread(str("/content/A1_train/" + val_data_1[10:20].ImageId.values[i]))
    ax1.imshow(img)
    ax1.set_title(val_data_1[10:20].ImageId.values[i])

    ax2.imshow(rle2mask(val_data_1[10:20].EncodedPixels.values[i]))
    ax2.set_title('Ground Truth Mask')

    c1 = Image.fromarray(val_preds[i][:,:,0])
    ax3.imshow(np.array(c1.resize((1600,256)))>0.5)
    ax3.set_title('Predicted Mask')
    plt.show()

**Summary:** The visualizations on the validation dataset indicate that the model performs well at locating the defects it was trained on.

In [0]:
# Test set
test_preds = model.predict_generator(test_DataGenerator_3(test_data_1[10:20],preprocess=preprocess),verbose=1)
for i in range(10):
    fig, (ax1,ax2,ax3) = plt.subplots(1,3,figsize=(20, 13))
    img = cv2.imread(str("/content/A1_train/" + test_data_1[10:20].ImageId.values[i]))
    ax1.imshow(img)
    ax1.set_title(test_data_1[10:20].ImageId.values[i])

    ax2.imshow(rle2mask(test_data_1[10:20].EncodedPixels.values[i]))
    ax2.set_title('Ground Truth Mask')

    c1 = Image.fromarray(test_preds[i][:,:,0])
    ax3.imshow(np.array(c1.resize((1600,256)))>0.5)
    ax3.set_title('Predicted Mask')
    plt.show()

**Summary:** The visualizations above show that, on test images, the predicted defect locations are similar to those in the ground truth masks. The predicted masks only approximate the outlines of the ground truth masks, which suggests that the model can be trained further to identify type 1 defects.

### II) Defect 2

In [0]:
if train_segmentation == True:
    # TRAIN AND VALIDATE MODEL
    # Defect 2
    model.compile(optimizer='adam', loss=sm.losses.dice_loss,metrics=[dice_coef])
    train_batches = train_DataGenerator_3(train_data_2,shuffle=True,preprocess=preprocess)    
    valid_batches = test_DataGenerator_3(val_data_2,preprocess=preprocess)
    
    # https://www.tensorflow.org/tensorboard/r2/scalars_and_keras
    logdir = "/content/drive/My Drive/severstal_february/severstal_logs/scalars/" + datetime.now().strftime("%Y%m%d-%H%M%S")+'_Defect_2_01_02_2020'
    file_writer = tf.summary.FileWriter(logdir + "/metrics")
    tensorboard = keras.callbacks.TensorBoard(log_dir=logdir,histogram_freq=0,write_images=True)
    
    # https://keras.io/callbacks/
    mc = ModelCheckpoint('/content/drive/My Drive/severstal_february/severstal_model/severstal_segmentation_Defect_2_01_02_2020.h5', monitor='val_dice_coef', mode='max', verbose=1, save_best_only=True)
    #model training
    history = model.fit_generator(train_batches, validation_data = valid_batches, epochs = epochs, verbose=1, callbacks = [mc,tensorboard])
    
    # plotting the metric
    vy = history.history['val_dice_coef']
    ty = history.history['dice_coef']
    x = list(range(1,len(vy)+1))
    fig,ax = plt.subplots(1,1)
    ax.plot(x,vy,'r',label = "Validation dice")
    ax.plot(x,ty,'b',label = "Train dice")
    ax.set_xlabel('epoch')
    ax.set_ylabel('Metric: dice coef')
    plt.legend()
    plt.grid()
    plt.show()
    file_writer.close()

<img src='https://github.com/secutron/steel-defect-detection/blob/master/defect_2_tensorboard/train_dice.jpg?raw=1' width=600px align = left>

<img src='https://github.com/secutron/steel-defect-detection/blob/master/defect_2_tensorboard/val_dice.jpg?raw=1' width=600px align = left>

**Summary:** The dice coefficient stabilizes in the 0.65-0.7 range on the validation set. To improve performance, the number of training images should be increased and other neural network architectures should be explored.

### Defect type 2: Evaluation

In [0]:
model = load_model('/content/drive/My Drive/severstal_february/severstal_model/severstal_segmentation_Defect_2_01_02_2020.h5', custom_objects=dependencies)

In [0]:
train_evaluate = model.evaluate(test_DataGenerator_3(train_data_2,preprocess=preprocess),verbose=1)
print('Train set evaluation score:')
pd.DataFrame(train_evaluate, columns = [' '], index=['dice_loss','dice_coef'])

In [0]:
validation_evaluate = model.evaluate(test_DataGenerator_3(val_data_2,preprocess=preprocess),verbose=1)
print('Validation set evaluation score:')
pd.DataFrame(validation_evaluate,columns = [' '], index=['dice_loss','dice_coef'])

In [0]:
test_evaluate = model.evaluate(test_DataGenerator_3(test_data_2,preprocess=preprocess),verbose=1)
print('Test set evaluation score:')
pd.DataFrame(test_evaluate,columns = [' '], index=['dice_loss','dice_coef'])

**Summary:** The dice coefficients of the test set and validation set predictions are close to each other, which implies that the model generalizes well to unseen images.

In [0]:
# Train dataset prediction visualization
train_preds = model.predict_generator(test_DataGenerator_3(train_data_2[10:20],preprocess=preprocess),verbose=1)
for i in range(10):
    fig, (ax1,ax2,ax3) = plt.subplots(1,3,figsize=(20, 13))
    img = cv2.imread(str("/content/A1_train/" + train_data_2[10:20].ImageId.values[i]))
    ax1.imshow(img)
    ax1.set_title(train_data_2[10:20].ImageId.values[i])

    ax2.imshow(rle2mask(train_data_2[10:20].EncodedPixels.values[i]))
    ax2.set_title('Ground Truth Mask')

    c1 = Image.fromarray(train_preds[i][:,:,0])
    ax3.imshow(np.array(c1.resize((1600,256)))>0.5)
    ax3.set_title('Predicted Mask')
    plt.show()

**Summary:** The visualizations indicate that training is satisfactory: the predicted defect regions match the ground truth masks on the training set.

In [0]:
# Validation set
val_preds = model.predict_generator(test_DataGenerator_3(val_data_2[10:20],preprocess=preprocess),verbose=1)
for i in range(10):
    fig, (ax1,ax2,ax3) = plt.subplots(1,3,figsize=(20, 13))
    img = cv2.imread(str("/content/A1_train/" + val_data_2[10:20].ImageId.values[i]))
    ax1.imshow(img)
    ax1.set_title(val_data_2[10:20].ImageId.values[i])

    ax2.imshow(rle2mask(val_data_2[10:20].EncodedPixels.values[i]))
    ax2.set_title('Ground Truth Mask')

    c1 = Image.fromarray(val_preds[i][:,:,0])
    ax3.imshow(np.array(c1.resize((1600,256)))>0.5)
    ax3.set_title('Predicted Mask')
    plt.show()

**Summary:** The visualizations indicate that the model's predictive performance is satisfactory: the predicted defect regions match the ground truth masks on the validation set.

In [0]:
# Test set
test_preds = model.predict_generator(test_DataGenerator_3(test_data_2[10:20],preprocess=preprocess),verbose=1)
for i in range(10):
    fig, (ax1,ax2,ax3) = plt.subplots(1,3,figsize=(20, 13))
    img = cv2.imread(str("/content/A1_train/" + test_data_2[10:20].ImageId.values[i]))
    ax1.imshow(img)
    ax1.set_title(test_data_2[10:20].ImageId.values[i])

    ax2.imshow(rle2mask(test_data_2[10:20].EncodedPixels.values[i]))
    ax2.set_title('Ground Truth Mask')

    c1 = Image.fromarray(test_preds[i][:,:,0])
    ax3.imshow(np.array(c1.resize((1600,256)))>0.5)
    ax3.set_title('Predicted Mask')
    plt.show()

**Summary:** The visualization indicates that the model's predictive performance is satisfactory: the predicted defect regions match the ground-truth masks on the test set.

### III) Defect 3

In [0]:
if train_segmentation:
    # Train and validate the segmentation model for defect 3
    model.compile(optimizer='adam', loss=sm.losses.dice_loss,metrics=[dice_coef])
    train_batches = train_DataGenerator_3(train_data_3,shuffle=True,preprocess=preprocess)    
    valid_batches = test_DataGenerator_3(val_data_3,preprocess=preprocess)
    
    # https://www.tensorflow.org/tensorboard/r2/scalars_and_keras
    logdir = "/content/drive/My Drive/severstal_february/severstal_logs/scalars/" + datetime.now().strftime("%Y%m%d-%H%M%S")+'_Defect_3_01_02_2020'
    file_writer = tf.summary.FileWriter(logdir + "/metrics")
    tensorboard = keras.callbacks.TensorBoard(log_dir=logdir,histogram_freq=0,write_images=True)
    
    # https://keras.io/callbacks/
    mc = ModelCheckpoint('/content/drive/My Drive/severstal_february/severstal_model/severstal_segmentation_Defect_3_01_02_2020.h5', monitor='val_dice_coef', mode='max', verbose=1, save_best_only=True)
    #model training
    history = model.fit_generator(train_batches, validation_data = valid_batches, epochs = epochs, verbose=1, callbacks = [mc,tensorboard])
    
    # plotting the metric
    vy = history.history['val_dice_coef']
    ty = history.history['dice_coef']
    x = list(range(1,len(vy)+1))
    fig,ax = plt.subplots(1,1)
    ax.plot(x,vy,'r',label = "Validation dice")
    ax.plot(x,ty,'b',label = "Train dice")
    ax.set_xlabel('epoch')
    ax.set_ylabel('Metric: dice coef')
    plt.legend()
    plt.grid()
    plt.show()
    file_writer.close()

<img src='https://github.com/secutron/steel-defect-detection/blob/master/defect_3_tensorboard/train_dice.jpg?raw=1' width=600px align = left>

<img src='https://github.com/secutron/steel-defect-detection/blob/master/defect_3_tensorboard/val_dice.jpg?raw=1' width=600px align = left>

**Summary:** The model's performance plateaued near a dice coefficient of 0.72 on defect type 3. Improving on this likely requires training for 100+ epochs, and alternative neural network architectures can also be explored.
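
Whether the curve has really flattened can be checked numerically from `history.history['val_dice_coef']`. The sketch below applies a simple patience-style rule, similar in spirit to early stopping: it reports the best epoch and how many epochs have passed since then. The function name, the `min_delta` threshold, and the sample values are all made up for illustration and are not results from this notebook.

```python
def epochs_since_best(values, min_delta=0.0):
    """Return (best_epoch, epochs_since_best) for a higher-is-better metric.

    Epochs are 1-indexed, matching the x-axis of the training plots.
    An epoch only counts as an improvement if it beats the best value
    so far by more than `min_delta`.
    """
    best_epoch, best_value = 0, float("-inf")
    for epoch, value in enumerate(values, start=1):
        if value > best_value + min_delta:
            best_epoch, best_value = epoch, value
    return best_epoch, len(values) - best_epoch

# Made-up validation dice values, for illustration only.
val_dice = [0.41, 0.55, 0.63, 0.69, 0.71, 0.72, 0.72, 0.71, 0.72]
best_epoch, stale = epochs_since_best(val_dice, min_delta=0.005)
print(best_epoch, stale)
```

If the number of stale epochs keeps growing while training continues, extra epochs alone are unlikely to help, which is when changing the architecture becomes the more promising option.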

### Defect type 3: Evaluation

In [0]:
model = load_model('/content/drive/My Drive/severstal_february/severstal_model/severstal_segmentation_Defect_3_01_02_2020.h5', custom_objects=dependencies)

In [0]:
train_evaluate = model.evaluate(test_DataGenerator_3(train_data_3,preprocess=preprocess),verbose=1)
print('Train set evaluation score:')
pd.DataFrame(train_evaluate, columns = [' '], index=['dice_loss','dice_coef'])

In [0]:
validation_evaluate = model.evaluate(test_DataGenerator_3(val_data_3,preprocess=preprocess),verbose=1)
print('Validation set evaluation score:')
pd.DataFrame(validation_evaluate,columns = [' '], index=['dice_loss','dice_coef'])

In [0]:
test_evaluate = model.evaluate(test_DataGenerator_3(test_data_3,preprocess=preprocess),verbose=1)
print('Test set evaluation score:')
pd.DataFrame(test_evaluate,columns = [' '], index=['dice_loss','dice_coef'])

**Summary:** The dice coefficients on the validation and test sets are close to each other, which implies that the model generalizes well to unseen images.

In [0]:
# Train dataset prediction visualization
train_preds = model.predict_generator(test_DataGenerator_3(train_data_3[10:20],preprocess=preprocess),verbose=1)
for i in range(10):
    fig, (ax1,ax2,ax3) = plt.subplots(1,3,figsize=(20, 13))
    img = cv2.imread(str("/content/A1_train/" + train_data_3[10:20].ImageId.values[i]))
    ax1.imshow(img)
    ax1.set_title(train_data_3[10:20].ImageId.values[i])

    ax2.imshow(rle2mask(train_data_3[10:20].EncodedPixels.values[i]))
    ax2.set_title('Ground Truth Mask')

    c1 = Image.fromarray(train_preds[i][:,:,0])
    ax3.imshow(np.array(c1.resize((1600,256)))>0.5)
    ax3.set_title('Predicted Mask')
    plt.show()

**Summary:** The visualization indicates that training is satisfactory: the predicted defect regions match the ground-truth masks on the training set.

In [0]:
# Validation set
val_preds = model.predict_generator(test_DataGenerator_3(val_data_3[10:20],preprocess=preprocess),verbose=1)
for i in range(10):
    fig, (ax1,ax2,ax3) = plt.subplots(1,3,figsize=(20, 13))
    img = cv2.imread(str("/content/A1_train/" + val_data_3[10:20].ImageId.values[i]))
    ax1.imshow(img)
    ax1.set_title(val_data_3[10:20].ImageId.values[i])

    ax2.imshow(rle2mask(val_data_3[10:20].EncodedPixels.values[i]))
    ax2.set_title('Ground Truth Mask')

    c1 = Image.fromarray(val_preds[i][:,:,0])
    ax3.imshow(np.array(c1.resize((1600,256)))>0.5)
    ax3.set_title('Predicted Mask')
    plt.show()

**Summary:** The visualization indicates that the model's predictive performance is satisfactory: the predicted defect regions match the ground-truth masks on the validation set.

In [0]:
# Test set
test_preds = model.predict_generator(test_DataGenerator_3(test_data_3[10:20],preprocess=preprocess),verbose=1)
for i in range(10):
    fig, (ax1,ax2,ax3) = plt.subplots(1,3,figsize=(20, 13))
    img = cv2.imread(str("/content/A1_train/" + test_data_3[10:20].ImageId.values[i]))
    ax1.imshow(img)
    ax1.set_title(test_data_3[10:20].ImageId.values[i])

    ax2.imshow(rle2mask(test_data_3[10:20].EncodedPixels.values[i]))
    ax2.set_title('Ground Truth Mask')

    c1 = Image.fromarray(test_preds[i][:,:,0])
    ax3.imshow(np.array(c1.resize((1600,256)))>0.5)
    ax3.set_title('Predicted Mask')
    plt.show()

**Summary:** The visualization indicates that the model's predictive performance is satisfactory: the predicted defect regions match the ground-truth masks on the test set.

### IV) Defect 4

In [0]:
if train_segmentation:
    # Train and validate the segmentation model for defect 4
    model.compile(optimizer='adam', loss=sm.losses.dice_loss,metrics=[dice_coef])
    train_batches = train_DataGenerator_3(train_data_4,shuffle=True,preprocess=preprocess)    
    valid_batches = test_DataGenerator_3(val_data_4,preprocess=preprocess)
    
    # https://www.tensorflow.org/tensorboard/r2/scalars_and_keras
    logdir = "/content/drive/My Drive/severstal_february/severstal_logs/scalars/" + datetime.now().strftime("%Y%m%d-%H%M%S")+'_Defect_4_01_02_2020'
    file_writer = tf.summary.FileWriter(logdir + "/metrics")
    tensorboard = keras.callbacks.TensorBoard(log_dir=logdir,histogram_freq=0,write_images=True)
    
    # https://keras.io/callbacks/
    mc = ModelCheckpoint('/content/drive/My Drive/severstal_february/severstal_model/severstal_segmentation_Defect_4_01_02_2020.h5', monitor='val_dice_coef', mode='max', verbose=1, save_best_only=True)
    #model training
    history = model.fit_generator(train_batches, validation_data = valid_batches, epochs = epochs, verbose=1, callbacks = [mc,tensorboard])
    
    # plotting the metric
    vy = history.history['val_dice_coef']
    ty = history.history['dice_coef']
    x = list(range(1,len(vy)+1))
    fig,ax = plt.subplots(1,1)
    ax.plot(x,vy,'r',label = "Validation dice")
    ax.plot(x,ty,'b',label = "Train dice")
    ax.set_xlabel('epoch')
    ax.set_ylabel('Metric: dice coef')
    plt.legend()
    plt.grid()
    plt.show()
    file_writer.close()

<img src='https://github.com/secutron/steel-defect-detection/blob/master/defect_4_tensorboard/train_dice.jpg?raw=1' width=600px align = left>

<img src='https://github.com/secutron/steel-defect-detection/blob/master/defect_4_tensorboard/val_dice.jpg?raw=1' width=600px align = left>

**Summary:** The model's performance plateaued near a dice coefficient of 0.78 on defect type 4. Improving on this likely requires training for 100+ epochs, and alternative neural network architectures can also be explored.

### Defect type 4: Evaluation

In [0]:
model = load_model('/content/drive/My Drive/severstal_february/severstal_model/severstal_segmentation_Defect_4_01_02_2020.h5', custom_objects=dependencies)

In [0]:
train_evaluate = model.evaluate(test_DataGenerator_3(train_data_4,preprocess=preprocess),verbose=1)
print('Train set evaluation score:')
pd.DataFrame(train_evaluate, columns = [' '], index=['dice_loss','dice_coef'])

In [0]:
validation_evaluate = model.evaluate(test_DataGenerator_3(val_data_4,preprocess=preprocess),verbose=1)
print('Validation set evaluation score:')
pd.DataFrame(validation_evaluate,columns = [' '], index=['dice_loss','dice_coef'])

In [0]:
test_evaluate = model.evaluate(test_DataGenerator_3(test_data_4,preprocess=preprocess),verbose=1)
print('Test set evaluation score:')
pd.DataFrame(test_evaluate,columns = [' '], index=['dice_loss','dice_coef'])

**Summary:** The dice coefficients on the validation and test sets are close to each other, which implies that the model generalizes well to unseen images.

In [0]:
# Train dataset prediction visualization
train_preds = model.predict_generator(test_DataGenerator_3(train_data_4[10:20],preprocess=preprocess),verbose=1)
for i in range(10):
    fig, (ax1,ax2,ax3) = plt.subplots(1,3,figsize=(20, 13))
    img = cv2.imread(str("/content/A1_train/" + train_data_4[10:20].ImageId.values[i]))
    ax1.imshow(img)
    ax1.set_title(train_data_4[10:20].ImageId.values[i])

    ax2.imshow(rle2mask(train_data_4[10:20].EncodedPixels.values[i]))
    ax2.set_title('Ground Truth Mask')

    c1 = Image.fromarray(train_preds[i][:,:,0])
    ax3.imshow(np.array(c1.resize((1600,256)))>0.5)
    ax3.set_title('Predicted Mask')
    plt.show()

In [0]:
# Validation set
val_preds = model.predict_generator(test_DataGenerator_3(val_data_4[10:20],preprocess=preprocess),verbose=1)
for i in range(10):
    fig, (ax1,ax2,ax3) = plt.subplots(1,3,figsize=(20, 13))
    img = cv2.imread(str("/content/A1_train/" + val_data_4[10:20].ImageId.values[i]))
    ax1.imshow(img)
    ax1.set_title(val_data_4[10:20].ImageId.values[i])

    ax2.imshow(rle2mask(val_data_4[10:20].EncodedPixels.values[i]))
    ax2.set_title('Ground Truth Mask')

    c1 = Image.fromarray(val_preds[i][:,:,0])
    ax3.imshow(np.array(c1.resize((1600,256)))>0.5)
    ax3.set_title('Predicted Mask')
    plt.show()

**Summary:** The visualization indicates that the model's predictive performance is satisfactory: the predicted defect regions match the ground-truth masks on the validation set.

In [0]:
# Test set
test_preds = model.predict_generator(test_DataGenerator_3(test_data_4[10:20],preprocess=preprocess),verbose=1)
for i in range(10):
    fig, (ax1,ax2,ax3) = plt.subplots(1,3,figsize=(20, 13))
    img = cv2.imread(str("/content/A1_train/" + test_data_4[10:20].ImageId.values[i]))
    ax1.imshow(img)
    ax1.set_title(test_data_4[10:20].ImageId.values[i])

    ax2.imshow(rle2mask(test_data_4[10:20].EncodedPixels.values[i]))
    ax2.set_title('Ground Truth Mask')

    c1 = Image.fromarray(test_preds[i][:,:,0])
    ax3.imshow(np.array(c1.resize((1600,256)))>0.5)
    ax3.set_title('Predicted Mask')
    plt.show()

**Summary:** The visualization indicates that the model's predictive performance is satisfactory: the predicted defect regions match the ground-truth masks on the test set.

In [0]:
%%html
<!-- https://stackoverflow.com/questions/21892570/ipython-notebook-align-table-to-the-left-of-cell -->
<style>
table {float:left}
</style>


### Performance of the above trained models:

**Binary Classifier:** <br>

| Dataset | binary_crossentropy | acc | f1_score_m | precision_m | recall_m |
| :---: | :---: | :---: | :---: | :---: | :---: |
| X_train | 0.202241 | 0.923630 | 0.921999 | 0.949316 | 0.905966 |
| X_val | 0.240638 | 0.912064 | 0.912423 | 0.937087 | 0.898664 |
| X_test | 0.194755 | 0.926810 | 0.921435 | 0.955327 | 0.902135 |


**Multi Label Classifier:** <br>

| Dataset | binary_crossentropy | acc | f1_score_m | precision_m | recall_m |
| :---: | :---: | :---: | :---: | :---: | :---: |
| X_train | 0.081054 | 0.968118 | 0.940510 | 0.945815 | 0.937232 |
| X_val | 0.092119 | 0.962500 | 0.929417 | 0.929264 | 0.931588 |
| X_test | 0.094178 | 0.965517 | 0.936398 | 0.941134 | 0.933854 |


**Segmentation models: Dice Coefficient:** <br>

| Dataset | Defect 1 model | Defect 2 model | Defect 3 model | Defect 4 model |
| :---: | :---: | :---: | :---: | :---: |
| X_train | 0.714258 | 0.766948 | 0.735519 | 0.821943 |
| X_val | 0.665121 | 0.678812 | 0.709641 | 0.76066 |
| X_test | 0.611203 | 0.655394 | 0.698548 | 0.78822 |
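
The generalization claims made in the per-defect summaries can be checked directly against this table. The short sketch below copies the dice coefficients from the table and computes the train-to-validation and validation-to-test gaps for each model; the dictionary layout and key names are only an illustrative choice.

```python
# Dice coefficients copied from the segmentation table above.
dice = {
    "Defect 1": {"train": 0.714258, "val": 0.665121, "test": 0.611203},
    "Defect 2": {"train": 0.766948, "val": 0.678812, "test": 0.655394},
    "Defect 3": {"train": 0.735519, "val": 0.709641, "test": 0.698548},
    "Defect 4": {"train": 0.821943, "val": 0.76066, "test": 0.78822},
}

# A small gap between splits suggests the model is not overfitting badly.
gaps = {
    name: {
        "train_minus_val": round(s["train"] - s["val"], 6),
        "val_minus_test": round(s["val"] - s["test"], 6),
    }
    for name, s in dice.items()
}

for name, gap in gaps.items():
    print(name, gap)
```

With these numbers, the validation-to-test gap stays within about 0.055 for all four models, the Defect 2 model shows the largest train-to-validation gap (about 0.088), and the Defect 4 model scores slightly higher on the test set than on the validation set.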


## 5. Predictions and Kaggle Score

See Inference.ipynb for how to make predictions on new images.

In [0]:
# Upload the predictions on the raw test data produced by final.ipynb to Kaggle as a Dataset, then run the following code in a Kaggle Kernel.

In [0]:
import numpy as np  # linear algebra
import pandas as pd  # data processing, CSV file I/O (e.g. pd.read_csv)

# Start from the sample submission with every prediction left empty
df = pd.read_csv('../input/severstal-steel-defect-detection/sample_submission.csv')
df['EncodedPixels'] = ''
#df_submit = pd.read_csv("../input/sever-sub-13012020/sever_sub_13012020.csv")
# Predictions uploaded as a Kaggle Dataset; missing masks become empty strings
df_submit = pd.read_csv("../input/sever-3101/severstal_final_test_preds_2.csv").fillna('')
if df.columns[0] == 'ImageId_ClassId':
    df.set_index('ImageId_ClassId', inplace=True)
    df_submit.set_index('ImageId_ClassId', inplace=True)

    # Copy each predicted row into the matching row of the submission frame
    for name, row in df_submit.iterrows():
        df.loc[name] = row

    df.reset_index(inplace=True)

df.to_csv('submission.csv', index=False)
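
The CSV loaded above is expected to already hold run-length strings in its `EncodedPixels` column. For readers wondering how a binary mask becomes such a string, here is a minimal sketch. It assumes the competition's format ("start length" pairs, 1-indexed, pixels numbered top to bottom and then left to right); the name `encode_rle` is hypothetical and this is not necessarily the encoder used in final.ipynb.

```python
import numpy as np

def encode_rle(mask):
    """Encode a 2-D binary mask as a 'start length start length ...' string."""
    # Transpose before flattening so pixels are visited column by column.
    pixels = np.asarray(mask).astype(np.uint8).T.flatten()
    # Pad with zeros so runs touching the borders still produce two changes.
    padded = np.concatenate([[0], pixels, [0]])
    # 1-indexed positions where the value changes (run starts and run ends).
    runs = np.where(padded[1:] != padded[:-1])[0] + 1
    # Convert each (start, end) pair into (start, length).
    runs[1::2] -= runs[::2]
    return " ".join(str(r) for r in runs)

mask = np.array([[1, 0],
                 [1, 1],
                 [0, 0]])
encoded = encode_rle(mask)
print(encoded)
```

An all-zero mask encodes to an empty string, which matches the empty `EncodedPixels` default used when building the submission frame.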

### Mean Dice Coefficient of test data predictions:

<img src='https://github.com/secutron/steel-defect-detection/blob/master/final_ipynb_score.jpg?raw=1' width=1200px align=left>

## 6. Summary

1. Images and their masks (in the form of EncodedPixels) are provided to train a deep learning model to detect and classify defects in steel (multi-label classification). The competition is hosted by Severstal on Kaggle.
2. Exploratory data analysis revealed that the dataset is imbalanced. A new feature, 'area', is created to clip predictions to segmentation areas within a determined range. Different classes overlap at smaller values of the area feature, so the classes cannot be separated based solely on 'area'. Most images either contain a single defect or have no defect.
3. A six-model architecture is built to train and test on this dataset: one binary classifier, one multi-label classifier and four segmentation models.
4. The image data receives minimal preprocessing. Pixel value scaling and image augmentations for model training are performed by DataGenerators.
5. Minority-class-priority stratified sampling is used to split the train set into train and validation sets.
6. Pre-trained deep learning models are used: the Xception architecture for classification, and the Unet architecture with an efficientnetb1 backbone pre-trained on the ImageNet dataset for segmentation.
7. TensorBoard is used to save logs and visualize model performance at each epoch. The models perform satisfactorily on the defined metrics. Some confusion remains in both the classification and segmentation models, as defect detection and localization are not perfect.
8. A final.ipynb notebook is submitted, which is an inference version of this notebook.
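
Point 2 mentions using the 'area' feature to keep only predictions whose segmented area falls within a determined range. One possible reading of that step is sketched below: a predicted mask is discarded when its pixel count lies outside the allowed range. The function name `filter_by_area` and the thresholds are hypothetical; the actual ranges would come from the exploratory data analysis.

```python
import numpy as np

def filter_by_area(mask, min_area, max_area):
    """Keep a predicted binary mask only if its area is within [min_area, max_area]."""
    mask = np.asarray(mask).astype(bool)
    area = int(mask.sum())  # area measured as the number of positive pixels
    if min_area <= area <= max_area:
        return mask.astype(np.uint8)
    # Outside the range: treat the prediction as "no defect".
    return np.zeros(mask.shape, dtype=np.uint8)

# Toy prediction with a 2x3 defect region (area = 6 pixels).
pred = np.zeros((4, 6), dtype=np.uint8)
pred[1:3, 2:5] = 1

kept = filter_by_area(pred, min_area=5, max_area=10)      # hypothetical range
dropped = filter_by_area(pred, min_area=10, max_area=20)  # hypothetical range
print(int(kept.sum()), int(dropped.sum()))
```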