# **Semantic vs Instance vs Panoptic**

Image segmentation forms the basis of numerous computer vision projects. It partitions the visual input into regions so that it can be processed for tasks such as image classification and object detection. However, not all segmentation techniques delineate the objects in an image with equally satisfactory accuracy. Some merely identify which kinds of objects are present in the image, some separate the individual occurrences of each object type, and others perform both tasks. Accordingly, recent image segmentation methods can be classified into three categories: semantic segmentation, instance segmentation and panoptic segmentation.

To read more about it, please refer to [this](https://analyticsindiamag.com/semantic-vs-instance-vs-panoptic-which-image-segmentation-technique-to-choose/) article.
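
As a rough illustration of how the three outputs differ, the toy sketch below builds label maps by hand for a tiny 4x6 "image" containing two cats on grass. The class ids and the array layout are assumptions chosen only for this example; they are not produced by any of the models used later.

```python
import numpy as np

# Hypothetical class ids, chosen only for illustration.
CAT, GRASS = 1, 2

# Semantic segmentation: one class id per pixel.
# Both cats share the same id, so they cannot be told apart.
semantic = np.array([
    [2, 2, 2, 2, 2, 2],
    [2, 1, 1, 2, 1, 1],
    [2, 1, 1, 2, 1, 1],
    [2, 2, 2, 2, 2, 2],
])

# Instance segmentation: one id per countable object, 0 where no object is found.
instance = np.array([
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 2, 2],
    [0, 1, 1, 0, 2, 2],
    [0, 0, 0, 0, 0, 0],
])

# Panoptic segmentation: every pixel carries a (class id, instance id) pair.
panoptic = np.stack([semantic, instance], axis=-1)

print("classes present:", np.unique(semantic))
print("object instances:", len(np.unique(instance)) - 1)
print("left cat pixel:", panoptic[1, 1], "right cat pixel:", panoptic[1, 4])
```

The two cat pixels share a class id but differ in instance id, while the grass pixels have a class id and no instance id, which is the combination that panoptic segmentation provides.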

## **Practical Implementation**

To compare the three image segmentation techniques, we apply each of them to a common image. Have a look at the input image as well as the code and output of each segmentation method.

## **Semantic Segmentation**

We use the [PixelLib](https://pypi.org/project/pixellib/) Python library here, which is built to make segmenting images and videos easy.

Install PixelLib and its dependencies as follows:

In [None]:
!python -m pip install pip --upgrade --user -q --no-warn-script-location
!python -m pip install numpy pandas seaborn matplotlib scipy statsmodels scikit-learn tensorflow keras opencv-python pillow scikit-image torch torchvision \
     tqdm pixellib --user -q --no-warn-script-location

import IPython
IPython.Application.instance().kernel.do_shutdown(True)


Import statements

In [None]:
import pixellib
from pixellib.semantic import semantic_segmentation

Instantiate the semantic_segmentation class of pixellib

In [None]:
segment_image = semantic_segmentation()

Load the Xception model trained on Pascal VOC for segmenting objects. The model can be downloaded from [here](https://github.com/ayoolaolafenwa/PixelLib/releases/download/1.1/deeplabv3_xception_tf_dim_ordering_tf_kernels.h5).

In [None]:
# !wget https://github.com/ayoolaolafenwa/PixelLib/releases/download/1.1/deeplabv3_xception_tf_dim_ordering_tf_kernels.h5

In [None]:
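# The loader expects a local .h5 file, so this assumes the weights have already been
# downloaded to the working directory (from the link above or the GitLab mirror
# https://gitlab.com/AnalyticsIndiaMagazine/practicedatasets/-/raw/main/image_segmentation/deeplabv3_xception_tf_dim_ordering_tf_kernels.h5)
segment_image.load_pascalvoc_model("deeplabv3_xception_tf_dim_ordering_tf_kernels.h5")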

Perform semantic segmentation on an image

In [None]:
# !wget http://images.cocodataset.org/val2017/000000281759.jpg

In [None]:
segment_image.segmentAsPascalvoc("000000281759.jpg", output_image_name = "testoutput.jpg")

In [None]:
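import matplotlib.pyplot as plt
import cv2
img = cv2.imread("testoutput.jpg")
# OpenCV loads images as BGR; convert to RGB so matplotlib shows the correct colours
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
plt.imshow(img)
plt.show()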

## **Instance Segmentation**

In [None]:
import pixellib
from pixellib.instance import instance_segmentation

Load the Mask R-CNN model to perform instance segmentation. The model can be downloaded from [here](https://github.com/ayoolaolafenwa/PixelLib/releases/download/1.2/mask_rcnn_coco.h5).

In [None]:
# ! wget https://github.com/ayoolaolafenwa/PixelLib/releases/download/1.2/mask_rcnn_coco.h5

In [None]:
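segment_image = instance_segmentation()
# The loader expects a local .h5 file, so this assumes the weights have already been
# downloaded to the working directory (from the link above or the GitLab mirror
# https://gitlab.com/AnalyticsIndiaMagazine/practicedatasets/-/raw/main/image_segmentation/mask_rcnn_coco.h5)
segment_image.load_model("mask_rcnn_coco.h5")
segment_image.segmentImage("000000281759.jpg", output_image_name = "testoutput2.jpg")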

In [None]:
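import matplotlib.pyplot as plt
import cv2
img = cv2.imread("testoutput2.jpg")
# OpenCV loads images as BGR; convert to RGB so matplotlib shows the correct colours
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
plt.imshow(img)
plt.show()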

# **Panoptic Segmentation**

Panoptic segmentation is an image segmentation method used for computer vision tasks. It unifies two distinct approaches to segmenting images, namely semantic segmentation and instance segmentation.

To read more about it, please refer to [this](https://analyticsindiamag.com/guide-to-panoptic-segmentation-a-semantic-instance-segmentation-approach/) article.


Import the required libraries

In [None]:
from PIL import Image
import requests
import io
import math
import matplotlib.pyplot as plt
%config InlineBackend.figure_format = 'retina'
import torch
from torch import nn
from torchvision.models import resnet50
import torchvision.transforms as T
import numpy
torch.set_grad_enabled(False);
import itertools
import seaborn as sns 

Install the Panoptic API from GitHub for panoptic inference

In [None]:
!python -m pip install git+https://github.com/cocodataset/panopticapi.git --user -q

Import the installed API

In [None]:
import panopticapi
from panopticapi.utils import id2rgb, rgb2id 

List of COCO semantic classes:

In [None]:
CLASSES = [
     'N/A', 'person', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus',
     'train', 'truck', 'boat', 'traffic light', 'fire hydrant', 'N/A',
     'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse',
     'sheep', 'cow', 'elephant', 'bear', 'zebra', 'giraffe', 'N/A',
     'backpack', 'umbrella', 'N/A', 'N/A', 'handbag', 'tie', 'suitcase',
     'frisbee', 'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat',
     'baseball glove', 'skateboard', 'surfboard', 'tennis racket',
     'bottle', 'N/A', 'wine glass', 'cup', 'fork', 'knife', 'spoon',
     'bowl', 'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot',
     'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch', 'potted plant',
     'bed', 'N/A', 'dining table', 'N/A', 'N/A', 'toilet', 'N/A',
     'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone', 
     'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'N/A', 'book', 
     'clock', 'vase', 'scissors', 'teddy bear', 'hair drier', 'toothbrush'] 

Map the COCO class indices above to contiguous indices that skip the 'N/A' placeholders (the Detectron2 model uses a different numbering convention, so we need to convert)

In [None]:
coco2d2 = {}
count = 0
for i, c in enumerate(CLASSES):
  if c != "N/A":
    coco2d2[i] = count
    count+=1 
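
To see what this mapping produces, the following sketch applies the same idea to a short toy list. The helper name `contiguous_ids` and the toy list are hypothetical and used only for illustration.

```python
def contiguous_ids(classes, placeholder="N/A"):
    """Map each original index to a contiguous index, skipping placeholder entries."""
    mapping = {}
    for index, name in enumerate(classes):
        if name != placeholder:
            # len(mapping) is the next unused contiguous index
            mapping[index] = len(mapping)
    return mapping


toy_classes = ["N/A", "person", "bicycle", "N/A", "car"]
mapping = contiguous_ids(toy_classes)
print(mapping)  # {1: 0, 2: 1, 4: 2}
```

Indices 0 and 3 hold placeholders, so they do not appear in the mapping, and the remaining classes are numbered 0, 1, 2 without gaps.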

Perform standard PyTorch mean-std input image normalization

In [None]:
transform = T.Compose([
    T.Resize(800),
    T.ToTensor(),
    T.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
]) 
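
For intuition, the arithmetic behind the `ToTensor` and `Normalize` steps can be sketched in plain NumPy. This is a simplified stand-in rather than torchvision's implementation: it assumes an H x W x 3 `uint8` input, skips the resize step, and the function name `normalize` is hypothetical.

```python
import numpy as np

# Per-channel statistics, the same values passed to T.Normalize above
MEAN = np.array([0.485, 0.456, 0.406])
STD = np.array([0.229, 0.224, 0.225])


def normalize(image_uint8):
    """Scale an H x W x 3 uint8 image to [0, 1], standardize each channel,
    and move the channel axis first (3 x H x W)."""
    scaled = image_uint8.astype(np.float64) / 255.0
    standardized = (scaled - MEAN) / STD
    return standardized.transpose(2, 0, 1)


white_pixel = np.full((1, 1, 3), 255, dtype=np.uint8)
normalized = normalize(white_pixel)
print(normalized.shape)
print(normalized[:, 0, 0])
```

A pure white pixel becomes `(1 - mean) / std` in each channel, which shows that the normalized values are no longer confined to the range 0 to 1.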

Load a pre-trained model from torch hub and request the post-processor

In [None]:
model, postprocessor = torch.hub.load('facebookresearch/detr', 'detr_resnet101_panoptic', pretrained=True, return_postprocessor=True, num_classes=250)
model.eval();

Retrieve an image from the validation set of COCO dataset for testing purpose

In [None]:
url = "http://images.cocodataset.org/val2017/000000281759.jpg"
im = Image.open(requests.get(url, stream=True).raw)

Mean-std normalize the test image, add a batch dimension (batch size: 1) and run it through the model

In [None]:
img = transform(im).unsqueeze(0)
out = model(img) 

Compute the probability score for each possible class, excluding the “no-object” class (the last one)

In [None]:
scores = out["pred_logits"].softmax(-1)[..., :-1].max(-1)[0]

Keep only the masks whose confidence exceeds 0.85

In [None]:
keep = scores > 0.85
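
The scoring and thresholding steps can be reproduced on made-up logits with NumPy to see what they compute. The logits below are invented for illustration (one image, two queries, two real classes plus the trailing "no-object" class), and `query_scores` is a hypothetical helper name.

```python
import numpy as np


def query_scores(logits):
    """Softmax over the class axis, drop the last ("no-object") class,
    and keep the best remaining probability for each query."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    probs = exp / exp.sum(axis=-1, keepdims=True)
    return probs[..., :-1].max(axis=-1)


# Shape: (batch=1, queries=2, classes=3); the last column is "no-object"
toy_logits = np.array([[[4.0, 0.0, 0.0],    # query 0: confident about class 0
                        [0.0, 0.0, 5.0]]])  # query 1: mostly "no-object"
toy_scores = query_scores(toy_logits)
toy_keep = toy_scores > 0.85
print(toy_scores)
print(toy_keep)
```

Only the first query survives the threshold: the second one puts nearly all of its probability on the "no-object" class, so its best real-class score is close to zero.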

Plot the masks satisfying the confidence level condition

In [None]:
ncols = 5
# Use at least one row; squeeze=False keeps axs two-dimensional even with a single row
nrows = max(1, math.ceil(keep.sum().item() / ncols))
fig, axs = plt.subplots(ncols=ncols, nrows=nrows, figsize=(18, 10), squeeze=False)
for line in axs:
    for a in line:
        a.axis('off')
for i, mask in enumerate(out["pred_masks"][keep]):
    ax = axs[i // ncols, i % ncols]
    ax.imshow(mask, cmap="cividis")
    ax.axis('off')
fig.tight_layout()

Merge the individual predictions obtained above into a unified panoptic segmentation. For that, we use DETR’s postprocessor.

The postprocessor requires the target size of the predictions as input

In [None]:
result = postprocessor(out, torch.as_tensor(img.shape[-2:]).unsqueeze(0))[0]

The segmentation is stored in a special-format PNG

In [None]:
panoptic_seg = Image.open(io.BytesIO(result['png_string']))
panoptic_seg = numpy.array(panoptic_seg, dtype=numpy.uint8).copy()

Retrieve the instance id corresponding to each mask

In [None]:
panoptic_seg_id = rgb2id(panoptic_seg)
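
The COCO panoptic format packs each segment id into the three colour channels of the PNG as `id = R + 256 * G + 256^2 * B`. The sketch below re-implements that encoding and its inverse in NumPy for illustration; `rgb_to_id` and `id_to_rgb` are hypothetical names for this example, not the `panopticapi` functions themselves.

```python
import numpy as np


def rgb_to_id(color):
    """Decode an (..., 3) array of RGB values into integer segment ids."""
    color = np.asarray(color, dtype=np.int64)  # widen so the sum cannot overflow uint8
    return color[..., 0] + 256 * color[..., 1] + 256 * 256 * color[..., 2]


def id_to_rgb(segment_id):
    """Encode integer segment ids back into an (..., 3) uint8 RGB array."""
    segment_id = np.asarray(segment_id, dtype=np.int64)
    red = segment_id % 256
    green = (segment_id // 256) % 256
    blue = segment_id // (256 * 256)
    return np.stack([red, green, blue], axis=-1).astype(np.uint8)


toy_png = np.array([[[1, 2, 0], [0, 0, 1]]], dtype=np.uint8)  # a 1 x 2 "image"
toy_ids = rgb_to_id(toy_png)
print(toy_ids)
```

Converting to a wider integer type before the arithmetic matters here: multiplying `uint8` values by 256 directly would overflow.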

Colour each mask individually and plot the visualization

In [None]:
import itertools
import seaborn as sns
palette = itertools.cycle(sns.color_palette())
# Reset the image, then paint every segment id with the next colour of the palette
panoptic_seg[:, :, :] = 0
for seg_id in range(panoptic_seg_id.max() + 1):
  panoptic_seg[panoptic_seg_id == seg_id] = numpy.asarray(next(palette)) * 255
plt.figure(figsize=(15,15))
plt.imshow(panoptic_seg)
plt.axis('off')
plt.show()

Use Detectron2’s plotting utilities to better visualize the above panoptic segmentation results.

Install Detectron2 and import the utilities

In [None]:
# install dependencies: 
!python -m pip install pyyaml==5.1 --user -q
import torch, torchvision
print(torch.__version__, torch.cuda.is_available())
!gcc --version
# opencv is pre-installed on colab

In [None]:
# install detectron2:
# See https://detectron2.readthedocs.io/tutorials/install.html for instructions
import torch
assert torch.__version__.startswith("1.8")   # need to manually install torch 1.8
!python -m pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu101/torch1.8/index.html --user -q
# exit(0)  # After installation, you need to "restart runtime" in Colab. This line can also restart runtime

In [None]:
# Some basic setup:
# Setup detectron2 logger
import detectron2
from detectron2.utils.logger import setup_logger
setup_logger()

In [None]:
from detectron2.config import get_cfg
from detectron2.utils.visualizer import Visualizer
from detectron2.data import MetadataCatalog

Extract the segments information and the panoptic result from DETR’s prediction

In [None]:
from copy import deepcopy
segments_info = deepcopy(result["segments_info"])

Decode the panoptic prediction, which is stored as a special-format PNG, and record its size

In [None]:
panoptic_seg = Image.open(io.BytesIO(result['png_string']))
final_w, final_h = panoptic_seg.size 

Convert the PNG into a segment id map

In [None]:
panoptic_seg = numpy.array(panoptic_seg, dtype=numpy.uint8)
panoptic_seg = torch.from_numpy(rgb2id(panoptic_seg))

Convert the COCO category ids in the segments information to Detectron2’s contiguous class ids

In [None]:
meta = MetadataCatalog.get("coco_2017_val_panoptic_separated")
for segment in segments_info:
    c = segment["category_id"]
    # "Things" (countable objects) and "stuff" (background regions) use separate id tables
    if segment["isthing"]:
        segment["category_id"] = meta.thing_dataset_id_to_contiguous_id[c]
    else:
        segment["category_id"] = meta.stuff_dataset_id_to_contiguous_id[c]
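
The remapping logic can be tried out without Detectron2. In the sketch below, the two dictionaries are hypothetical stand-ins for `meta.thing_dataset_id_to_contiguous_id` and `meta.stuff_dataset_id_to_contiguous_id`, with made-up values, and `remap_categories` is an illustrative helper name.

```python
from copy import deepcopy

# Made-up lookup tables standing in for the Detectron2 metadata (assumption)
thing_to_contiguous = {1: 0, 18: 16}
stuff_to_contiguous = {184: 40}


def remap_categories(segments, thing_map, stuff_map):
    """Return a copy of the segments with category ids translated through
    the table that matches each segment's "isthing" flag."""
    remapped = deepcopy(segments)  # leave the caller's list untouched
    for segment in remapped:
        table = thing_map if segment["isthing"] else stuff_map
        segment["category_id"] = table[segment["category_id"]]
    return remapped


toy_segments = [
    {"id": 0, "isthing": True, "category_id": 18},
    {"id": 1, "isthing": False, "category_id": 184},
]
remapped = remap_categories(toy_segments, thing_to_contiguous, stuff_to_contiguous)
print(remapped)
```

Working on a deep copy mirrors the `deepcopy(result["segments_info"])` step above, so the original prediction stays available if the cell is re-run.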

Visualize the improved prediction results

In [None]:
# Visualizer expects an RGB image and matplotlib displays RGB, so no channel flip is needed
v = Visualizer(numpy.array(im.copy().resize((final_w, final_h))), meta, scale=1.0)
v._default_font_size = 20
v = v.draw_panoptic_seg_predictions(panoptic_seg, segments_info, area_threshold=0)
plt.imshow(v.get_image())
plt.show()

# **Related Articles:**

> * [Comparison of Semantic, Instance and Panoptic Segmentation](https://analyticsindiamag.com/semantic-vs-instance-vs-panoptic-which-image-segmentation-technique-to-choose/)

> * [Panoptic Segmentation](https://analyticsindiamag.com/guide-to-panoptic-segmentation-a-semantic-instance-segmentation-approach/)

> * [PaddleSeg](https://analyticsindiamag.com/guide-to-asymmetric-non-local-neural-networks-using-paddleseg/)

> * [MMDetection](https://analyticsindiamag.com/guide-to-mmdetection-an-object-detection-python-toolbox/)

> * [Facebook D2Go to Mobile](https://analyticsindiamag.com/facebooks-d2go-brings-detectron2-to-mobile/)

> * [Multi Class Image Classification with Tensorflow and Keras](https://analyticsindiamag.com/multi-label-image-classification-with-tensorflow-keras/)

> * [Transfer Learning in Tensorflow Keras](https://analyticsindiamag.com/a-practical-guide-to-implement-transfer-learning-in-tensorflow/)