<a href="https://colab.research.google.com/github/thesteve0/impatient-computer-vision/blob/main/5_segmentation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Segmentation

In this notebook we take on segmentation, a task that tries to assign every pixel in an image to a class.
There are at least three types of segmentation:

1. Instance Segmentation - like object detection, except that each detected object gets a pixel mask instead of a box. All the pixels the model thinks belong to an object are marked as part of that object, and the object carries a class label.
2. Semantic Segmentation - all pixels are assigned to a class, but individual objects are not told apart. For example, if one person is standing partially in front of another person, there will just be one big "person" blob that outlines the two people, with no line in the middle.
3. Panoptic Segmentation - every pixel is assigned to a class, as in semantic segmentation, but some classes also get instance segmentation. In our "find the people in this picture" example, every pixel that doesn't belong to a person can be put into one class ("background"), and each individual person can then be uniquely identified.
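
To make the three flavors above concrete, here is a minimal sketch on a toy mask built with NumPy. Nothing here comes from a real model: the tiny array, the `PERSON` class id, and the choice to store panoptic output as a (class, instance) pair per pixel are all assumptions made for illustration.

```python
import numpy as np

# Toy 4x6 "image" containing two people standing side by side (made-up data).
# Instance segmentation: each person gets their own id (0 = no object).
instance_mask = np.array([
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 2, 2, 0],
    [0, 1, 1, 2, 2, 0],
    [0, 0, 0, 0, 0, 0],
])

PERSON = 15  # assumed class id for "person"; the real value depends on the model

# Semantic segmentation: both people collapse into a single "person" blob.
semantic_mask = np.where(instance_mask > 0, PERSON, 0)

# Panoptic segmentation: every pixel has a class, and person pixels also keep
# an instance id. Here both are stored as a (class, instance) pair per pixel.
panoptic_mask = np.stack([semantic_mask, instance_mask], axis=-1)

print(np.unique(semantic_mask))   # class ids present in the semantic mask
print(np.unique(instance_mask))   # instance ids present in the instance mask
print(panoptic_mask[1, 1], panoptic_mask[1, 3])  # one pixel from each person
```

Notice that the semantic mask cannot tell you where one person ends and the next begins, while the instance ids can.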


Time to do housekeeping:

In [None]:
from google.colab import drive
drive.mount('/content/drive')

!pip install fiftyone==1.4.1 torch torchvision

import fiftyone as fo
import fiftyone.zoo as foz

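name = "our-photos"
# Avoid calling this variable `dir`, which would shadow Python's built-in dir()
dataset_dir = "/content/drive/MyDrive/impatient-cv/flickr-labeled"

# Load the labeled dataset that was exported in FiftyOne's own format
dataset = fo.Dataset.from_dir(
    dataset_dir=dataset_dir,
    dataset_type=fo.types.FiftyOneDataset,
    name=name
)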


## Loading the model and looking at the results
Let's go ahead and load one of the models from the zoo and apply it to our photos. This model, DeepLabV3, was also trained on COCO and has a ResNet-101 backbone. Once it has run, we'll load the results in the app.

By the way, the model was not trained on the full COCO dataset. It only uses the 20 classes from the PASCAL VOC Challenge:

http://host.robots.ox.ac.uk/pascal/VOC/voc2012/segexamples/index.html

The hint is in the torchvision documentation for the weights:
https://docs.pytorch.org/vision/main/models/generated/torchvision.models.segmentation.deeplabv3_resnet101.html#torchvision.models.segmentation.DeepLabV3_ResNet101_Weights

Note: while this may look like instance segmentation, it is actually semantic segmentation. The model only produces masks for the classes it recognizes. Every other pixel gets a zero, which you can think of as "other".
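
To see what "everything else gets a zero" means in practice, here is a small sketch that counts the pixels per class in a semantic mask. The mask is hand-made rather than produced by the model, `class_pixel_counts` is a hypothetical helper, and the label order in `VOC_CLASSES` is my assumption based on torchvision's PASCAL VOC category list, so check it against the labels your model actually reports.

```python
import numpy as np

# Assumed label order: index 0 is background, followed by the 20 VOC classes.
VOC_CLASSES = [
    "background", "aeroplane", "bicycle", "bird", "boat", "bottle", "bus",
    "car", "cat", "chair", "cow", "diningtable", "dog", "horse", "motorbike",
    "person", "pottedplant", "sheep", "sofa", "train", "tvmonitor",
]

def class_pixel_counts(mask):
    """Return {class name: pixel count} for every class id found in the mask."""
    ids, counts = np.unique(mask, return_counts=True)
    return {VOC_CLASSES[i]: int(c) for i, c in zip(ids, counts)}

# Hand-made 4x4 mask: mostly background, a 2x2 "person", one "dog" pixel.
mask = np.zeros((4, 4), dtype=np.uint8)
mask[1:3, 1:3] = 15   # person
mask[3, 0] = 12       # dog

print(class_pixel_counts(mask))
```

On real photos you should expect the "background" count to dominate, since anything outside the 20 classes lands there.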

In [None]:
model = foz.load_zoo_model("deeplabv3-resnet101-coco-torch")

# Run the model on every sample and store the masks in the "segmentations" field
dataset.apply_model(
    model,
    label_field="segmentations",
    progress=True,
    num_workers=2,
    batch_size=64
)

session = fo.launch_app(dataset, auto=False)
session.url

## Wrap up

Segmentation is actually quite a sophisticated modeling task. The models that do it are a bit more involved, so we are not going to dig in much deeper on this one. Sometimes people use segmentation as a way to "crop" out items they want to classify. For example, if we were trying to identify faces in images, we could first segment out all the human faces and subtract the background. This would give a much cleaner image to try to classify.
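
The "crop" idea can be sketched in a few lines of NumPy. This is only an illustration on made-up arrays: `cut_out` is a hypothetical helper, not a FiftyOne or torchvision function, and the class id 15 for "person" is an assumption.

```python
import numpy as np

def cut_out(image, mask, class_id):
    """Blank out every pixel not in class_id, then crop tightly to what is left.

    image: (H, W, 3) array, mask: (H, W) array of class ids.
    Returns None when the class does not appear in the mask.
    """
    keep = mask == class_id
    if not keep.any():
        return None
    # Subtract the background by replacing it with zeros (black).
    blanked = np.where(keep[..., None], image, np.zeros_like(image))
    # Find the rows and columns that contain at least one kept pixel.
    rows = np.where(keep.any(axis=1))[0]
    cols = np.where(keep.any(axis=0))[0]
    return blanked[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]

# Made-up 5x5 gray image with a small "person" region in the mask.
image = np.full((5, 5, 3), 200, dtype=np.uint8)
mask = np.zeros((5, 5), dtype=np.uint8)
mask[1:4, 2:4] = 15
mask[1, 2] = 0  # one background pixel inside the bounding box

crop = cut_out(image, mask, class_id=15)
print(crop.shape)
```

The resulting crop keeps only the pixels of the chosen class, which is the cleaner input a downstream classifier would see.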

The next notebook covers a more specialized technique called Keypoint Detection. It will allow us to estimate human (or other animal) poses or actions.

[Keypoints](https://github.com/thesteve0/impatient-computer-vision/blob/main/6_keypoints.ipynb)