<a href="https://colab.research.google.com/github/cagBRT/computer-vision/blob/master/CV9c_Semantic_Segmentation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
!git clone -l -s https://github.com/cagBRT/computer-vision.git cloned-repo
%cd cloned-repo

In [None]:
!git clone https://github.com/rkuo2000/image-segmentation-keras
#%cd image-segmentation-keras

In [None]:
#!pip install --upgrade keras
!pip install tensorflow==2.8

In [None]:
!pip install keras-segmentation

In [None]:
# import the necessary packages
import numpy as np
#import time
import cv2
import os
import matplotlib.pyplot as plt
from google.colab.patches import cv2_imshow
from six.moves import urllib
from matplotlib import gridspec
from PIL import Image
from tensorflow import keras
from keras.utils import np_utils

Semantic segmentation assigns every pixel of an image to one of a fixed set of classes.
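
As a minimal sketch of what "a class per pixel" means, the snippet below takes a hypothetical array of per-pixel class scores and keeps the highest-scoring class at each position. The scores are made up for illustration and do not come from any model in this notebook.

```python
import numpy as np

# Hypothetical per-pixel class scores for a 2x3 image and 3 classes,
# shaped (height, width, n_classes). The values are made up.
scores = np.array([
    [[0.1, 0.7, 0.2], [0.8, 0.1, 0.1], [0.2, 0.2, 0.6]],
    [[0.3, 0.4, 0.3], [0.1, 0.1, 0.8], [0.9, 0.05, 0.05]],
])

# Semantic segmentation keeps, for every pixel, the highest-scoring class.
label_map = scores.argmax(axis=-1)
print(label_map)
```

The result has the same height and width as the image, with one integer class id per pixel.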

# **PSPNet: the model**<br>
PSPNet (Pyramid Scene Parsing Network)<br>
- **Semantic Segmentation** assigns a category label to each pixel, but only for pixels that belong to known object classes.
- **Scene Parsing**, which builds on semantic segmentation, assigns a category label to ALL pixels in the image.<br>


In [None]:
imageP = cv2.imread("/content/cloned-repo/images/sceneParsing.png")
cv2_imshow(imageP)

PSPNet uses global context information to parse a scene, which addresses three kinds of errors made by FCN:<br>
- **Mismatched Relationship**: FCN predicts the boat in the yellow box as a “car” based on its appearance, but a car is seldom found on a river.
- **Confusion Categories**: FCN predicts the object in the box as partly skyscraper and partly building. The whole object should be labeled either skyscraper or building, not both.
- **Inconspicuous Classes**: The pillow looks similar to the sheet. Without the global scene category, the parser may fail to pick out the pillow.
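
One way to picture how global information reaches every pixel is pyramid pooling: the feature map is averaged over coarse grids, each pooled grid is stretched back to full size, and the results are stacked onto the original features. The sketch below is a NumPy illustration under stated assumptions, not the notebook's or the library's implementation. The bin sizes (1, 2, 3, 6) follow the PSPNet paper; the helper names are hypothetical, nearest-neighbor upsampling stands in for the bilinear upsampling of the paper, and the 1x1 convolutions are omitted.

```python
import numpy as np

def adaptive_avg_pool(feature_map, bins):
    """Average-pool an (H, W, C) array into a (bins, bins, C) grid."""
    h, w, c = feature_map.shape
    pooled = np.zeros((bins, bins, c), dtype=float)
    for i in range(bins):
        # Row range of cell i: floor for the start, ceiling for the end.
        r0, r1 = (i * h) // bins, ((i + 1) * h + bins - 1) // bins
        for j in range(bins):
            c0, c1 = (j * w) // bins, ((j + 1) * w + bins - 1) // bins
            pooled[i, j] = feature_map[r0:r1, c0:c1].mean(axis=(0, 1))
    return pooled

def upsample_nearest(pooled, height, width):
    """Stretch a (bins, bins, C) grid back to (height, width, C)."""
    bins = pooled.shape[0]
    rows = np.arange(height) * bins // height
    cols = np.arange(width) * bins // width
    return pooled[rows][:, cols]

def pyramid_pooling(feature_map, bin_sizes=(1, 2, 3, 6)):
    """Concatenate the features with their pooled, upsampled context maps."""
    h, w, _ = feature_map.shape
    branches = [feature_map.astype(float)]
    for bins in bin_sizes:
        pooled = adaptive_avg_pool(feature_map, bins)
        branches.append(upsample_nearest(pooled, h, w))
    return np.concatenate(branches, axis=-1)

# Random stand-in for a CNN feature map with 4 channels.
features = np.random.default_rng(0).random((12, 12, 4))
fused = pyramid_pooling(features)
print(fused.shape)
```

With a single bin, every pixel receives the average of the whole feature map, which is the kind of scene-level cue that could tell a classifier a river is present.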

In [None]:
imageP = cv2.imread("/content/cloned-repo/images/contextAgg.png")
imageP = cv2.resize(imageP, (900,600))
cv2_imshow(imageP)

**PSPNet in video segmentation**

In [None]:
# Alias the import so it does not shadow PIL's Image imported earlier
from IPython.display import Image as IPyImage
IPyImage(url="https://miro.medium.com/max/1200/1*J33mxWAtCSEV1GsWV3vKLQ.gif")


**PSPNet Architecture**

In [None]:
imageP = cv2.imread("/content/cloned-repo/images/pspNet.png")
cv2_imshow(imageP)

**An example image**<br>
This image is one frame from a series of frames taken from a video.

In [None]:
image = cv2.imread("/content/cloned-repo/datasetSeg/images_prepped_test/0016E5_07971.png")
cv2_imshow(image)

# **Install the required libraries**

In [None]:
!apt-get install -y libsm6 libxext6 libxrender-dev
!pip install opencv-python

In [None]:
import cv2
from google.colab.patches import cv2_imshow

# **The datasets**<br>
The PSPNet architecture is trained on three different datasets. The model names also differ in backbone depth (50 versus 101 layers). <br>
As a result, each pretrained model returns a different segmentation of the same image. <br>

- **pspnet_50_ADE_20K**: the 20,000-image ADE20K challenge dataset. ADE20K is a landmark image segmentation dataset containing a large corpus of both indoor and outdoor images. Every image has an accompanying segmentation mask that assigns each pixel to one of 150 classes.<br>
The dataset can be found here: https://groups.csail.mit.edu/vision/datasets/ADE20K/<br>
The labels can be found here: https://github.com/CSAILVision/sceneparsing/tree/master/visualizationCode/color150<br>
<br>
- **pspnet_101_cityscapes**: this dataset focuses on semantic understanding of urban street scenes. It has 30 classes, 5,000 finely annotated images, and 20,000 coarsely annotated images. <br>
The dataset can be found here: https://www.cityscapes-dataset.com/dataset-overview/<br>
<br>
- **pspnet_101_voc12**: the main goal of the dataset is the detection and identification of individual objects from a number of visual object classes in realistic scenes. The dataset has 21 classes.<br>
The class list can be found here: https://github.com/NVIDIA/DIGITS/blob/master/examples/semantic-segmentation/pascal-voc-classes.txt<br>
The dataset can be found here: https://deepai.org/dataset/pascal-voc


# **Loading the pretrained models**
Three pretrained PSPNet models are loaded, each trained on a different dataset. <br> The same image is passed to each model so that the outputs can be compared.

In [None]:
from keras_segmentation.pretrained import pspnet_50_ADE_20K,pspnet_101_cityscapes, pspnet_101_voc12

model1 = pspnet_50_ADE_20K() # load the pretrained model trained on the ADE20K dataset
model2 = pspnet_101_cityscapes() # load the pretrained model trained on the Cityscapes dataset
model3 = pspnet_101_voc12() # load the pretrained model trained on the Pascal VOC 2012 dataset

# Run each model on the same image and save the colored segmentation maps
out1 = model1.predict_segmentation(
    inp=image,
    out_fname="out1.png"
)
out2 = model2.predict_segmentation(
    inp=image,
    out_fname="out2.png"
)
out3 = model3.predict_segmentation(
    inp=image,
    out_fname="out3.png"
)
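
The saved PNG files hold the colored maps, while the returned values can be inspected directly. Assuming `predict_segmentation` returns a 2-D array of integer class ids (an assumption worth checking against the installed version), a rough sketch for summarizing which classes dominate a prediction looks like this. The helper `class_shares` is hypothetical and the demo array is made up.

```python
import numpy as np

def class_shares(label_map):
    """Return {class_id: fraction of pixels} for a 2-D array of class ids."""
    ids, counts = np.unique(label_map, return_counts=True)
    return {int(i): float(n) / label_map.size for i, n in zip(ids, counts)}

# Stand-in for a prediction such as `out2`; the values are made up.
demo = np.array([[0, 0, 1, 1],
                 [0, 2, 2, 1],
                 [0, 2, 2, 1]])
shares = class_shares(demo)
print(shares)
```

Class ids mean different things for each dataset, so the shares are only comparable after mapping ids to that dataset's label list.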

**A helper function for plotting the images**

In [None]:
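def plotting(image, outSeg):
  # cv2.imread returns BGR arrays, but matplotlib expects RGB
  image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
  outSeg = cv2.cvtColor(outSeg, cv2.COLOR_BGR2RGB)

  plt.figure(figsize=(15, 8))
  grid_spec = gridspec.GridSpec(1, 3, width_ratios=[6, 6, 6])

  # Panel 1: the input image
  plt.subplot(grid_spec[0])
  plt.imshow(image)
  plt.axis('off')
  plt.title('input image')

  # Panel 2: the colored segmentation map
  plt.subplot(grid_spec[1])
  plt.imshow(outSeg)
  plt.axis('off')
  plt.title('segmentation map')

  # Panel 3: the segmentation map drawn semi-transparently over the image
  plt.subplot(grid_spec[2])
  plt.imshow(image)
  plt.imshow(outSeg, alpha=0.7)
  plt.axis('off')
  plt.title('segmentation overlay')
  plt.show()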
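
The overlay panel relies on matplotlib to draw the segmentation map with `alpha=0.7` on top of the image. If the blended picture is needed as an array, for example to save it, the same effect can be approximated with a per-pixel weighted average. This is a sketch with a hypothetical helper `blend_overlay`; it assumes both arrays have the same shape and 8-bit values, and matplotlib's own compositing may differ in detail.

```python
import numpy as np

def blend_overlay(image, seg, alpha=0.7):
    """Alpha-blend a color segmentation map over an image of the same shape."""
    if image.shape != seg.shape:
        raise ValueError("image and segmentation map must have the same shape")
    # Weighted average: alpha for the segmentation, (1 - alpha) for the image.
    mixed = (1.0 - alpha) * image.astype(float) + alpha * seg.astype(float)
    return np.clip(np.rint(mixed), 0, 255).astype(np.uint8)

# Made-up constant arrays standing in for a photo and its segmentation map.
photo = np.full((2, 2, 3), 100, dtype=np.uint8)
mask = np.full((2, 2, 3), 200, dtype=np.uint8)
overlay = blend_overlay(photo, mask)
print(overlay[0, 0])
```

A higher `alpha` makes the class colors more prominent, while a lower value keeps more of the original scene visible.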

In [None]:
imageO1 = cv2.imread("out1.png")
imageO2 = cv2.imread("out2.png")
imageO3 = cv2.imread("out3.png")

**The model trained on the ADE20K dataset**

In [None]:
plotting(image, imageO1)

**The model trained on the Cityscapes dataset**

In [None]:
plotting(image, imageO2)

**The model trained on the Pascal VOC 2012 dataset**

In [None]:
plotting(image, imageO3)