<a href="https://colab.research.google.com/github/FemiAdesola/Data-Science/blob/main/Maintenance_%26_Manufacturing_Department.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# <ins>**THE DATA SCIENCE FOR BUSINESS**


<table>
  <tr><td>
    <img src="https://drive.google.com/uc?id=1rcxnQuLqFyn8l9hQmdyp-yxSaXhxJPed"
         alt="Fashion MNIST sprite"  width="1000">
  </td></tr>
  <tr><td align="center">
    <b>Figure 1. Predict Defects Using Deep Learning</b>
  </td></tr>
</table>


# <ins>**Maintenance & Manufacturing Department**
+ Artificial Intelligence and Machine Learning are transforming the manufacturing industry. According to a report released by the World Economic Forum, these technologies will play significant roles in the fourth industrial revolution. Major areas that can benefit are:
  + Maintenance Department
  + Production Department
  + Supply Chain Department
+ Deep learning has proven highly effective at detecting and localizing defects in imagery data, which could significantly improve production efficiency in the manufacturing industry.
+ A good example from LandingAI:

https://landing.ai/defect-detection


## <ins>**Case Study**

In this case study, we will assume that you work as an AI/ML consultant.
+ You have been hired by a steel manufacturing company in San Diego and tasked with automating the detection and localization of defects in steel manufacturing.
+ Detecting defects helps improve manufacturing quality and reduce waste due to production defects.
+ The team has collected images of steel surfaces and has approached you to develop a model that can detect and localize defects in real time.
+ You have been provided with 12600 images that contain 4 types of defects, along with their locations on the steel surface.

![alt text](https://drive.google.com/uc?id=1nzRA5KU0eWo0YPZtSxUWOoMlvooKGCkt)

## <ins>**WHAT IS IMAGE SEGMENTATION?**
+ The goal of image segmentation is to understand and extract information from images at the pixel level.
+ Image segmentation can be used for object recognition and localization, which offers tremendous value in applications such as medical imaging and self-driving cars.
+ In practice, this means training a neural network to produce a pixel-wise mask of the image.
+ Modern image segmentation techniques are based on deep learning and make use of common architectures such as CNNs, FCNs (Fully Convolutional Networks), and deep encoder-decoders.
+ You will be using the ResUNet architecture to solve the current task.
+ When we applied CNNs to image classification problems, we had to convert the image into a vector and possibly add a classification head at the end.
+ In the case of U-Net, however, we convert (encode) the image into a vector and then upsample (decode) it back into an image.
+ With U-Net, the input and output have the same size, so the size of the image is preserved.
+ **Classical CNNs** are generally used when the entire image needs to be assigned a single class label.
+ **U-Net** performs classification at the pixel level.
+ **U-Net** formulates a loss function for every pixel in the input image.
+ A **softmax** function is applied to every pixel, which turns the segmentation problem into a classification problem performed on every pixel of the image.

https://aditi-mittal.medium.com/introduction-to-u-net-and-res-net-for-image-segmentation-9afcb432ee2f
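
To make the per-pixel classification idea concrete, here is a minimal NumPy sketch that applies softmax along the class axis of a small score array and then picks the most likely class for each pixel. The array layout (height, width, classes), the made-up scores, and the helper name `pixelwise_softmax` are assumptions for illustration only; they are not part of the notebook's model.

```python
import numpy as np

def pixelwise_softmax(logits):
    """Apply softmax over the last (class) axis independently for every pixel."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

# Hypothetical scores for a 2x2 image and 3 classes (values are made up).
logits = np.array([[[2.0, 0.5, 0.1], [0.1, 3.0, 0.2]],
                   [[0.3, 0.2, 4.0], [1.0, 1.0, 1.0]]])

probs = pixelwise_softmax(logits)   # same shape as logits, sums to 1 per pixel
mask = probs.argmax(axis=-1)        # predicted class index for every pixel
print(mask)
```

Each pixel ends up with its own probability distribution over the classes, which is why a per-pixel classification loss can be used for segmentation.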


# <ins>**IMPORT LIBRARIES AND DATASETS**

In [1]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import zipfile
import cv2
from skimage import io
import tensorflow as tf
from tensorflow.keras import Sequential  # use the public Keras API rather than the private tensorflow.python module
from tensorflow.keras import layers, optimizers
from tensorflow.keras.applications import DenseNet121
from tensorflow.keras.applications.resnet50 import ResNet50
from tensorflow.keras.layers import *
from tensorflow.keras.models import Model, load_model
from tensorflow.keras.initializers import glorot_uniform
from tensorflow.keras.utils import plot_model
from tensorflow.keras.callbacks import ReduceLROnPlateau, EarlyStopping, ModelCheckpoint, LearningRateScheduler
from IPython.display import display
from tensorflow.keras import backend as K
from sklearn.preprocessing import StandardScaler, normalize
import os
from google.colab import files
%matplotlib inline


In [2]:
# You will need to mount your drive using the following commands:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


## <ins>**MASK:**
+ The goal of image segmentation is to understand the image at the pixel level. It associates each pixel with a certain class. The output produced by an image segmentation model is called a "mask" of the image.
+ Masks can be represented by associating pixel values with their coordinates. For example, a black image of shape (2,2) can be represented as:\
[[0, 0], B B\
[0, 0]] B B

If our output mask is as follows:\
[[255, 0], W B\
[0, 255]] B W

+ To represent this mask, we first flatten the image into a 1-D array, which gives [255,0,0,255]. We can then mark each index that belongs to the mask, which gives [1,0,0,1].
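
The flattening step can be sketched in a few lines of NumPy. The variable names are illustrative, and row-major flattening is assumed here because it matches the [255,0,0,255] ordering in the example.

```python
import numpy as np

# The 2x2 output mask from the example: white (255) on the diagonal.
mask_image = np.array([[255, 0],
                       [0, 255]], dtype=np.uint8)

flat = mask_image.flatten()            # row-major 1-D view of the pixels
binary = (flat == 255).astype(int)     # 1 where the pixel belongs to the mask

print(flat.tolist(), binary.tolist())
```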

## <ins>**RUN LENGTH ENCODING (RLE):**
+ Representing a mask pixel by pixel can be impractical, because the length of the mask equals the product of the height and width of the image.
+ To overcome this, we use a lossless data compression technique called run-length encoding (RLE), which stores each run of consecutive identical data elements as a single data value and a count.
+ For example, assume we have an image (a single row, wrapped below for display) containing plain black text on a solid white background. **B** represents a black pixel and **W** represents a white pixel:

  WWWWWWWWWWWWBWWWWWWWW \
  WWWWWBBBWWWWWWWWWWWWW \
  WWWWWWWWWWWWBWWWWWWWW \
  WWWWWW

+ Run-length encoding (RLE):
    + 12W1B12W3B24W1B14W

+ This can be interpreted as a sequence of twelve Ws, one B, twelve Ws, three Bs, and so on.
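
As a small illustration of the idea, the sketch below encodes and decodes a row of W/B pixels. The helper names `rle_encode` and `rle_decode` are hypothetical, and the row is built from the run lengths in the encoding above rather than copied from the wrapped display.

```python
from itertools import groupby

def rle_encode(row):
    """Collapse each run of identical symbols into '<count><symbol>'."""
    return "".join(f"{len(list(group))}{symbol}" for symbol, group in groupby(row))

def rle_decode(encoded):
    """Expand '<count><symbol>' pairs back into the original row."""
    out, count = [], ""
    for ch in encoded:
        if ch.isdigit():
            count += ch                   # accumulate multi-digit counts
        else:
            out.append(ch * int(count))   # emit the run and reset the counter
            count = ""
    return "".join(out)

# Row rebuilt from the run lengths quoted above.
row = "W" * 12 + "B" + "W" * 12 + "B" * 3 + "W" * 24 + "B" + "W" * 14
encoded = rle_encode(row)
print(encoded)
```

Decoding the result with `rle_decode(encoded)` returns the original row, which is what makes the scheme lossless.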

In [37]:
# data containing defect images with segmentation mask
defect_class_mask_df = pd.read_csv('/content/drive/My Drive/Colab Notebooks/Manufacturing/train_M.csv')

The goal of image segmentation is to train a neural network to produce a pixel-wise mask of the image.

In [38]:
# data containing defective and non defective images
all_images_df = pd.read_csv('/content/drive/My Drive/Colab Notebooks/Manufacturing/defect_and_no_defect.csv')

In [39]:
defect_class_mask_df

| Unnamed: 0 | ImageId | ClassId | EncodedPixels |
|---|---|---|---|
| 0 | d2291de5c.jpg | 1 | 147963 3 148213 9 148461 18 148711 24 148965 2... |
| 1 | 78416c3d0.jpg | 3 | 54365 3 54621 7 54877 10 55133 12 55388 14 556... |
| 2 | 2283f2183.jpg | 3 | 201217 43 201473 128 201729 213 201985 5086 20... |
| 3 | f0dc068a8.jpg | 3 | 159207 26 159412 77 159617 128 159822 179 1600... |
| 4 | 00d639396.jpg | 3 | 229356 17 229595 34 229850 36 230105 37 230360... |
| ... | ... | ... | ... |
| 5743 | c12842f5e.jpg | 3 | 88 23 342 29 596 34 850 39 1105 44 1361 46 161... |
| 5744 | 2222a03b3.jpg | 3 | 63332 4 63587 11 63841 20 64096 27 64351 35 64... |
| 5745 | b43ea2c01.jpg | 1 | 185024 7 185279 11 185535 12 185790 13 186045 ... |
| 5746 | 1bc37a6f4.jpg | 3 | 303867 1 304122 3 304376 6 304613 3 304630 9 3... |


In [40]:
all_images_df

| Unnamed: 0 | ImageID | label |
|---|---|---|
| 0 | 0002cc93b.jpg | 1 |
| 1 | 0007a71bf.jpg | 1 |
| 2 | 000a4bcdd.jpg | 1 |
| 3 | 000f6bf48.jpg | 1 |
| 4 | 0014fce06.jpg | 1 |
| ... | ... | ... |
| 12992 | 0482ee1d6.jpg | 0 |
| 12993 | 04802a6c2.jpg | 0 |
| 12994 | 03ae2bc91.jpg | 0 |
| 12995 | 04238d7e3.jpg | 0 |


# <ins>**VISUALIZE AND EXPLORE DATASET**

In [41]:
defect_class_mask_df['mask'] = defect_class_mask_df['ClassId'].map(lambda x: 1)

In [43]:
defect_class_mask_df.head(10)

| Unnamed: 0 | ImageId | ClassId | EncodedPixels | mask |
|---|---|---|---|---|
| 0 | d2291de5c.jpg | 1 | 147963 3 148213 9 148461 18 148711 24 148965 2... | 1 |
| 1 | 78416c3d0.jpg | 3 | 54365 3 54621 7 54877 10 55133 12 55388 14 556... | 1 |
| 2 | 2283f2183.jpg | 3 | 201217 43 201473 128 201729 213 201985 5086 20... | 1 |
| 3 | f0dc068a8.jpg | 3 | 159207 26 159412 77 159617 128 159822 179 1600... | 1 |
| 4 | 00d639396.jpg | 3 | 229356 17 229595 34 229850 36 230105 37 230360... | 1 |
| 5 | 17d02873a.jpg | 3 | 254980 43 255236 127 255492 211 255748 253 256... | 1 |
| 6 | 47b5ab1bd.jpg | 3 | 128976 8 129230 12 129484 16 129739 23 129995 ... | 1 |
| 7 | a6ecee828.jpg | 3 | 179011 27 179126 73 179259 39 179375 80 179497... | 1 |
| 8 | 11aaf18e2.jpg | 3 | 303235 2 303489 7 303743 9 303997 11 304181 2 ... | 1 |
| 9 | cdf669a1f.jpg | 4 | 310246 11 310499 25 310753 28 311007 31 311262... | 1 |


In [5]:
# Some images are classified with more than one defect, let's explore this further
# we have one image with 3 types of defects
# we have 272 images with 2 types of defects
# we have 5201 images with 1 type of defect
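
One possible way to obtain counts like these is to sum the `mask` column per image and then count how often each total occurs. The sketch below uses a made-up toy frame instead of the real `defect_class_mask_df`, so the image names, classes, and resulting counts are illustrative only.

```python
import pandas as pd

# Toy stand-in for defect_class_mask_df (image names and classes are made up).
toy_df = pd.DataFrame({
    "ImageId": ["a.jpg", "a.jpg", "b.jpg", "c.jpg", "c.jpg", "c.jpg"],
    "ClassId": [1, 3, 3, 1, 2, 4],
})
toy_df["mask"] = 1   # one row per (image, defect type), as in the notebook

# Number of defect types per image, then how many images have 1, 2 or 3 types.
defect_type = toy_df.groupby("ImageId")["mask"].sum()
print(defect_type.value_counts())
```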



In [6]:
# Let's count defective and non defective images


In [7]:
# Visualize images with defects along with their corresponding labels
# Images are 256 x 1600

In [8]:
# Let's try to use the rle2mask on a sample image
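
A minimal sketch of what an `rle2mask` helper might look like is shown below. It assumes the `EncodedPixels` string holds "start length" pairs, that positions are 1-based, and that pixels are numbered down each column first (column-major order). None of these conventions is stated in the notebook, so verify them against the data source before relying on this.

```python
import numpy as np

def rle2mask(rle, height=256, width=1600):
    """Decode a 'start length start length ...' string into a binary mask.

    Assumes 1-based positions in column-major order (an assumption, see above).
    """
    values = np.array(rle.split(), dtype=int)
    starts, lengths = values[0::2] - 1, values[1::2]   # convert to 0-based starts
    flat = np.zeros(height * width, dtype=np.uint8)
    for start, length in zip(starts, lengths):
        flat[start:start + length] = 1                 # mark each run of pixels
    return flat.reshape((height, width), order="F")    # column-major reshape

# Made-up encoding: 3 pixels at the top of column 0, 2 pixels at the top of column 1.
mask = rle2mask("1 3 257 2")
print(mask.shape, int(mask.sum()))
```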


In [9]:
# Let's show the mask


# <ins>**UNDERSTAND THE THEORY AND INTUITION BEHIND CONVOLUTIONAL NEURAL NETWORKS, RESNETS, AND TRANSFER LEARNING**

![alt text](https://drive.google.com/uc?id=1HD2FFDD8fonGMyHARfw8ZqaofP3Udek6)

![alt text](https://drive.google.com/uc?id=1-HAo3xcPKGoH-gG8495p12o33nUC1j6W)

![alt text](https://drive.google.com/uc?id=1zmzg777lS1PGkTyJXA5fPmrJ9mcKneDi)


## <ins>**TRANSFER LEARNING TRAINING STRATEGIES**
+ **Strategy #1 Steps:**
  + Freeze the trained CNN network weights from the first layers.
  + Only train the newly added dense layers (with randomly initialized weights).
+ **Strategy #2 Steps:**
  + Initialize the CNN network with the pre-trained weights.
  + Retrain the entire CNN network with a very small learning rate. This is critical to ensure that you do not aggressively change the trained weights.
+ **Transfer learning advantages are:**
  + It provides fast training progress, because you don't have to start from scratch with randomly initialized weights.
  + You can achieve strong results even with a small training dataset.

# <ins>**BUILD AND TRAIN A DEEP LEARNING MODEL TO DETECT WHETHER A DEFECT IS PRESENT IN AN IMAGE OR NOT**

In [10]:
# split the data (defective and non defective) into training and testing

In [11]:
# create an image generator for the training and validation datasets
# we will divide the data into training, validation and testing
# Training = 9390
# validation = 1657
# testing = 1950


# Create a data generator which scales the data from 0 to 1 and makes validation split of 0.15
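
The three counts above can be reproduced with simple arithmetic. The sketch assumes a 15% test split with the test size rounded up, followed by a 0.15 validation split rounded down. The test fraction and both rounding rules are assumptions, chosen because they are consistent with the numbers quoted above.

```python
import math

total_images = 12997          # 9390 + 1657 + 1950, the counts quoted above
test_fraction = 0.15          # assumption: not stated in the notebook
validation_fraction = 0.15    # the validation split mentioned above

n_test = math.ceil(total_images * test_fraction)        # test size rounded up
n_train_val = total_images - n_test
n_validation = int(n_train_val * validation_fraction)   # validation size rounded down
n_train = n_train_val - n_validation

print(n_train, n_validation, n_test)
```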


In [12]:
# Create a data generator for test images



In [13]:
# freeze the model weights


In [14]:
# use early stopping to exit training if validation loss has not decreased after a certain number of epochs (patience)


# save the best model with the lowest validation loss
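
The patience logic behind early stopping and best-model checkpointing can be illustrated without any deep learning framework. The function below is a simplified simulation over a made-up list of validation losses. It is not the Keras callback implementation, and the name `simulate_early_stopping` is hypothetical.

```python
def simulate_early_stopping(val_losses, patience=2):
    """Return (stopped_epoch, best_epoch) for a sequence of validation losses."""
    best_loss = float("inf")
    best_epoch = None
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # A checkpoint of the model would be saved here.
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch   # stop: no improvement for `patience` epochs
    return len(val_losses) - 1, best_epoch

# Made-up validation losses: training stops before the last value is reached.
stopped, best = simulate_early_stopping([0.9, 0.7, 0.6, 0.65, 0.62, 0.5], patience=2)
print(stopped, best)
```

Note that a larger patience would have let training reach the final, lower loss, which is the trade-off when choosing this value.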


In [15]:
# (WARNING TAKES LONG TIME (~90 mins)!)

In [16]:
# save the trained model architecture for future use

# <ins>**ASSESS TRAINED MODEL PERFORMANCE**

In [17]:
# Make prediction (WARNING TAKES LONG TIME (~10 mins)!)



In [18]:
# Since we used a sigmoid activation at the end, the result contains continuous values from 0 to 1.
# The network is first used to classify whether an image has a defect or not.
# Defective images are then passed through the segmentation network to get the location and type of the defect.
# Let's choose a threshold of 0.01, so that an image skips the segmentation network only when we are highly certain that it has no defect.
# If we are not confident, we pass the image through the segmentation network.
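
As a small illustration of this low threshold, the sketch below flags every image whose predicted defect probability is at least 0.01. The probabilities are made up, and the variable names are assumptions for this example.

```python
import numpy as np

# Hypothetical sigmoid outputs from the classifier (values are made up).
probabilities = np.array([0.003, 0.02, 0.55, 0.97, 0.009])
threshold = 0.01

# Only images we are highly certain are defect-free fall below the threshold.
needs_segmentation = probabilities >= threshold
print(needs_segmentation.astype(int).tolist())
```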


In [19]:
# Because we used a test generator, the number of predicted images was limited to 1936 due to the batch size
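
One way to arrive at 1936 is to predict on whole batches only. With 1950 test images and a batch size of 16 (an assumption, since the batch size is not stated here), the last partial batch is dropped:

```python
test_images = 1950
batch_size = 16      # assumption: not stated in the notebook

full_batches = test_images // batch_size        # number of complete batches
images_predicted = full_batches * batch_size    # images covered by those batches

print(full_batches, images_predicted)
```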



In [20]:
# Find the accuracy of the model


In [21]:
# Plot the confusion matrix


In [22]:
# Print the classification report
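
The metrics in the three cells above can be computed by hand for a binary classifier. The sketch below builds a 2x2 confusion matrix from made-up labels and derives accuracy, precision, and recall from it. Libraries such as scikit-learn offer ready-made helpers for this, and plain NumPy is used here only to keep the example self-contained.

```python
import numpy as np

# Made-up ground truth and predictions (1 = defect, 0 = no defect).
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 0, 1, 0, 1, 1, 0])

# Rows are true labels, columns are predicted labels.
confusion = np.zeros((2, 2), dtype=int)
for t, p in zip(y_true, y_pred):
    confusion[t, p] += 1

tn, fp, fn, tp = confusion.ravel()
accuracy = (tp + tn) / confusion.sum()
precision = tp / (tp + fp)   # of the predicted defects, how many were real
recall = tp / (tp + fn)      # of the real defects, how many were found

print(confusion)
print(accuracy, precision, recall)
```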


# <ins>**UNDERSTAND THE THEORY AND INTUITION BEHIND RESUNET (SEGMENTATION)**

![alt text](https://drive.google.com/uc?id=1D7mAjdEFv6cIb4UFiXwndJy6enzZQzpb)

![alt text](https://drive.google.com/uc?id=1TK1Y9gry62NORdA-EjWD8HJadBZMa_sL)

## <ins>**RESUNET ARCHITECTURE:**
1. The encoder, or contracting path, consists of 4 blocks:
  + The first block consists of a 3x3 convolution layer + ReLU + batch normalization.
  + The remaining three blocks consist of Res-blocks followed by 2x2 max pooling.
2. Bottleneck:
  + It sits between the contracting and expanding paths.
  + It consists of a Res-block followed by a 2x2 up-sampling conv layer.
3. The decoder, or expanding path, consists of 4 blocks:
  + The 3 blocks following the bottleneck consist of Res-blocks followed by a 2x2 up-sampling conv layer.
  + The final block consists of a Res-block followed by a 1x1 conv layer.


## **RESUNET ADDITIONAL RESOURCES:**

Paper #1: https://arxiv.org/abs/1505.04597


Paper #2: https://arxiv.org/abs/1904.00592



https://aditi-mittal.medium.com/introduction-to-u-net-and-res-net-for-image-segmentation-9afcb432ee2f


# <ins>**BUILD A RESUNET SEGMENTATION MODEL**

In [23]:
# splitting the data into train and test data


In [24]:
#creating separate list for imageId, classId and rle to pass into the generator

In [25]:
# function to upsample and concatenate the values passed
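
The upsample-and-concatenate step of the decoder can be sketched with plain NumPy arrays. This illustration assumes nearest-neighbour 2x upsampling and a channels-last layout. In a Keras model the same idea would typically be expressed with up-sampling and concatenation layers, so treat `upsample_concat` as a hypothetical stand-in rather than the notebook's function.

```python
import numpy as np

def upsample_concat(x, skip):
    """Double the height and width of x, then join it with skip on the channel axis."""
    upsampled = x.repeat(2, axis=0).repeat(2, axis=1)   # nearest-neighbour upsampling
    return np.concatenate([upsampled, skip], axis=-1)   # merge decoder and skip features

x = np.arange(4, dtype=float).reshape(2, 2, 1)   # low-resolution decoder feature map
skip = np.ones((4, 4, 3))                        # matching encoder (skip) feature map
merged = upsample_concat(x, skip)
print(merged.shape)
```

The skip connection restores spatial detail that was lost during pooling, which is why the two feature maps must have the same height and width before they are joined.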

## Loss function:

We need a custom loss function to train this ResUNet, so we use the loss function as-is from https://github.com/nabsabraham/focal-tversky-unet/blob/master/losses.py


@article{focal-unet,
  title={A novel Focal Tversky loss function with improved Attention U-Net for lesion segmentation},
  author={Abraham, Nabila and Khan, Naimul Mefraz},
  journal={arXiv preprint arXiv:1810.07842},
  year={2018}
}
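
For intuition, here is a NumPy sketch of a Focal Tversky loss in the spirit of the cited paper. The Tversky index weighs false negatives and false positives differently, and the focal exponent reshapes the resulting loss. The default values `alpha=0.7`, `gamma=0.75`, and `smooth=1.0` are assumptions for this sketch, so check the linked `losses.py` for the values and the Keras implementation actually used in training.

```python
import numpy as np

def tversky(y_true, y_pred, alpha=0.7, smooth=1.0):
    """Tversky index between a ground-truth mask and a predicted mask."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    true_pos = np.sum(y_true * y_pred)
    false_neg = np.sum(y_true * (1 - y_pred))
    false_pos = np.sum((1 - y_true) * y_pred)
    return (true_pos + smooth) / (
        true_pos + alpha * false_neg + (1 - alpha) * false_pos + smooth
    )

def focal_tversky(y_true, y_pred, gamma=0.75):
    """Focal Tversky loss: (1 - Tversky index) raised to the power gamma."""
    return (1 - tversky(y_true, y_pred)) ** gamma

y_true = np.array([[1, 1], [0, 0]])
perfect = focal_tversky(y_true, y_true)       # prediction equals the ground truth
poor = focal_tversky(y_true, 1 - y_true)      # prediction is the exact opposite
print(perfect, poor)
```

With `alpha` above 0.5, missed defect pixels (false negatives) are penalized more heavily than false alarms, which suits masks where defects cover only a small part of the image.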

In [26]:
# using early stopping to exit training if validation loss has not decreased after a certain number of epochs (patience)

# save the best model with the lowest validation loss


In [27]:
# save the model for future use


# <ins>**ASSESS TRAINED SEGMENTATION MODEL PERFORMANCE**

In [28]:
# data containing test images for segmentation task



In [29]:
# create a dataframe for the result

In [30]:
# Let's show the images along with their original (ground truth) masks


In [31]:
# visualize the results (model predictions)