<a href="https://colab.research.google.com/github/sjain-21/cv/blob/main/CV_A0_model_1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Model 1: Segment Anything Model (SAM)

---

[![GitHub](https://badges.aleen42.com/src/github.svg)](https://github.com/facebookresearch/segment-anything) [![arXiv](https://img.shields.io/badge/arXiv-2304.02643-b31b1b.svg)](https://arxiv.org/abs/2304.02643)

The Segment Anything Model (SAM) is a promptable segmentation model from Meta AI that can "cut out" any object in any image from a single click. It generalizes zero-shot to unfamiliar objects and images, without additional training. This notebook extends the [official notebook](https://colab.research.google.com/github/facebookresearch/segment-anything/blob/main/notebooks/automatic_mask_generator_example.ipynb) prepared by Meta AI.

## Use GPU Acceleration

If you are running this notebook in Google Colab, navigate to `Edit` -> `Notebook settings` -> `Hardware accelerator`, set it to `GPU`, and then click `Save`. This ensures the notebook uses a GPU, which significantly speeds up mask generation (this notebook only runs inference; no training is involved).

## Steps in this Tutorial

In this tutorial, we are going to cover:

- **Before you start** - Make sure you have access to the GPU
- Install Segment Anything Model (SAM)
- Download Example Data
- Load Model
- Automated Mask Generation

## Let's begin!

## Before you start

Let's make sure that we have access to a GPU by running the `nvidia-smi` command. If the command fails or lists no GPU, navigate to `Edit` -> `Notebook settings` -> `Hardware accelerator`, set it to `GPU`, and then click `Save`.

In [None]:
!nvidia-smi

**NOTE:** To make it easier to manage datasets, images, and models, we create a `HOME` constant.

In [None]:
import os
HOME = os.getcwd()
print("HOME:", HOME)

## Install Segment Anything Model (SAM) and other dependencies

In [None]:
%cd {HOME}

import sys
!{sys.executable} -m pip install 'git+https://github.com/facebookresearch/segment-anything.git'

In [None]:
!pip install -q jupyter_bbox_widget roboflow dataclasses-json supervision

## Download the SAM model

In [None]:
%cd {HOME}
!mkdir -p {HOME}/weights
%cd {HOME}/weights

!wget -q https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth

In [None]:
import os

CHECKPOINT_PATH = os.path.join(HOME, "weights", "sam_vit_h_4b8939.pth")
print(CHECKPOINT_PATH, "; exist:", os.path.isfile(CHECKPOINT_PATH))
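
Because `wget -q` suppresses its output, a failed or truncated download can go unnoticed until the model fails to load. The following is a minimal sketch of a slightly stricter check. The helper `describe_checkpoint` and its `min_bytes` threshold are hypothetical and chosen only for illustration; they are not part of SAM.

```python
import os


def describe_checkpoint(path, min_bytes=1_000_000):
    """Return "missing", "too small", or "ok" for a checkpoint file.

    `min_bytes` is an arbitrary illustrative threshold, not an official value.
    """
    if not os.path.isfile(path):
        return "missing"
    if os.path.getsize(path) < min_bytes:
        # A very small file usually means the download was interrupted.
        return "too small"
    return "ok"


# Example call with a relative path; in this notebook you would pass CHECKPOINT_PATH.
print(describe_checkpoint(os.path.join("weights", "sam_vit_h_4b8939.pth")))
```

If the result is anything other than `ok`, re-running the download cell without `-q` should show what went wrong.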

## Download Example Data

**NOTE:** Let's download a few example images. Feel free to use your own images or videos.

In [None]:
%cd {HOME}
!mkdir -p {HOME}/data
%cd {HOME}/data

!wget -q https://upload.wikimedia.org/wikipedia/commons/thumb/e/e9/Times_Sq_Feb_2017_4.jpg/640px-Times_Sq_Feb_2017_4.jpg
!wget -q https://upload.wikimedia.org/wikipedia/commons/thumb/6/6b/Liberty_Island_photo_Don_Ramey_Logan.jpg/640px-Liberty_Island_photo_Don_Ramey_Logan.jpg

## Load model

In [None]:
import torch

DEVICE = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
MODEL_TYPE = "vit_h"
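
`MODEL_TYPE` has to match the checkpoint file that was downloaded. If the `vit_h` model does not fit in GPU memory, the segment-anything repository also lists smaller `vit_l` and `vit_b` checkpoints. The sketch below pairs each model type with its checkpoint, assuming the filenames published in the repository README at the time of writing; `checkpoint_for` is a hypothetical helper, not part of the library.

```python
import os

# Assumption: filenames as listed in the segment-anything README.
BASE_URL = "https://dl.fbaipublicfiles.com/segment_anything/"
CHECKPOINT_FILES = {
    "vit_h": "sam_vit_h_4b8939.pth",  # largest model, used in this notebook
    "vit_l": "sam_vit_l_0b3195.pth",
    "vit_b": "sam_vit_b_01ec64.pth",  # smallest model
}


def checkpoint_for(model_type, weights_dir="weights"):
    """Return (download_url, local_path) for a SAM model type."""
    filename = CHECKPOINT_FILES[model_type]
    return BASE_URL + filename, os.path.join(weights_dir, filename)


url, path = checkpoint_for("vit_b")
print(url)
print(path)
```

Switching models then means downloading from the returned URL and setting both `MODEL_TYPE` and `CHECKPOINT_PATH` to the matching pair.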

In [None]:
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator, SamPredictor

sam = sam_model_registry[MODEL_TYPE](checkpoint=CHECKPOINT_PATH).to(device=DEVICE)

## Automated Mask Generation

To run automatic mask generation, pass a SAM model to the `SamAutomaticMaskGenerator` class. The model was already loaded above from the checkpoint at `CHECKPOINT_PATH`. Running on CUDA with the default model (`vit_h`) is recommended.
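
`SamAutomaticMaskGenerator` also accepts optional keyword arguments that control how densely points are sampled and how strictly masks are filtered. The sketch below lists a few of them with values that mirror the library defaults at the time of writing (an assumption worth re-checking against your installed version). `merge_settings` and `build_mask_generator` are hypothetical helpers introduced for illustration.

```python
# Assumption: these values mirror the defaults in the segment-anything source.
GENERATOR_DEFAULTS = {
    "points_per_side": 32,           # sampling grid is points_per_side ** 2 points
    "pred_iou_thresh": 0.88,         # drop masks whose predicted IoU is lower
    "stability_score_thresh": 0.95,  # drop masks that are less stable
    "min_mask_region_area": 0,       # values > 0 remove small regions (needs OpenCV)
}


def merge_settings(**overrides):
    """Return a copy of the defaults with the given overrides applied."""
    return {**GENERATOR_DEFAULTS, **overrides}


def build_mask_generator(sam_model, **overrides):
    """Create a SamAutomaticMaskGenerator from the merged settings."""
    # Imported lazily so this cell can be defined before SAM is installed.
    from segment_anything import SamAutomaticMaskGenerator

    return SamAutomaticMaskGenerator(sam_model, **merge_settings(**overrides))


# A sparser grid is faster but may miss small objects.
print(merge_settings(points_per_side=16))
```

For example, `build_mask_generator(sam, points_per_side=16)` could replace the plain constructor call in the next cell when speed matters more than fine detail.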

In [None]:
mask_generator = SamAutomaticMaskGenerator(sam)

In [None]:
import os

IMAGE_NAME1 = "640px-Times_Sq_Feb_2017_4.jpg"
IMAGE_NAME2 = "640px-Liberty_Island_photo_Don_Ramey_Logan.jpg"

IMAGE_PATH1 = os.path.join(HOME, "data", IMAGE_NAME1)
IMAGE_PATH2 = os.path.join(HOME, "data", IMAGE_NAME2)

### Generate masks with SAM

In [None]:
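import cv2
import supervision as sv

# Image 1: OpenCV loads images as BGR, while SAM expects RGB input
image_bgr1 = cv2.imread(IMAGE_PATH1)
if image_bgr1 is None:  # cv2.imread returns None instead of raising on failure
    raise FileNotFoundError(IMAGE_PATH1)
image_rgb1 = cv2.cvtColor(image_bgr1, cv2.COLOR_BGR2RGB)

sam_result1 = mask_generator.generate(image_rgb1)

# Image 2: same steps as above
image_bgr2 = cv2.imread(IMAGE_PATH2)
if image_bgr2 is None:
    raise FileNotFoundError(IMAGE_PATH2)
image_rgb2 = cv2.cvtColor(image_bgr2, cv2.COLOR_BGR2RGB)

sam_result2 = mask_generator.generate(image_rgb2)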

### Output format

`SamAutomaticMaskGenerator` returns a `list` of masks, where each mask is a `dict` containing various information about the mask:

* `segmentation` - `[np.ndarray]` - the mask with `(H, W)` shape, and `bool` type
* `area` - `[int]` - the area of the mask in pixels
* `bbox` - `[List[int]]` - the bounding box of the mask in `xywh` format
* `predicted_iou` - `[float]` - the model's own prediction for the quality of the mask
* `point_coords` - `[List[List[float]]]` - the sampled input point that generated this mask
* `stability_score` - `[float]` - an additional measure of mask quality
* `crop_box` - `[List[int]]` - the crop of the image used to generate this mask in `xywh` format
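
To illustrate how these fields can be used, here is a minimal sketch that filters masks by their quality scores, sorts them by area, and converts `bbox` from `xywh` to `xyxy`. The records are synthetic stand-ins for real SAM output (the `segmentation` array is omitted), and the helper names and thresholds are hypothetical.

```python
def xywh_to_xyxy(bbox):
    """Convert [x, y, width, height] to [x_min, y_min, x_max, y_max]."""
    x, y, w, h = bbox
    return [x, y, x + w, y + h]


def select_masks(masks, min_iou=0.9, min_stability=0.9):
    """Keep masks above both quality thresholds, largest area first."""
    kept = [
        m for m in masks
        if m["predicted_iou"] >= min_iou and m["stability_score"] >= min_stability
    ]
    return sorted(kept, key=lambda m: m["area"], reverse=True)


# Synthetic records shaped like SAM output; the values are made up.
example_masks = [
    {"area": 1200, "bbox": [10, 20, 40, 30], "predicted_iou": 0.97, "stability_score": 0.98},
    {"area": 300, "bbox": [0, 0, 20, 15], "predicted_iou": 0.85, "stability_score": 0.99},
    {"area": 5000, "bbox": [5, 5, 100, 50], "predicted_iou": 0.93, "stability_score": 0.96},
]

for mask in select_masks(example_masks):
    print(mask["area"], xywh_to_xyxy(mask["bbox"]))
# prints:
# 5000 [5, 5, 105, 55]
# 1200 [10, 20, 50, 50]
```

The same functions should work on `sam_result1` directly, since each of its entries is a `dict` with these keys.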

In [None]:
print(sam_result1[0].keys())

### Results visualisation with Supervision

As of version `0.5.0`, Supervision has native support for SAM.

In [None]:
mask_annotator = sv.MaskAnnotator(color_lookup=sv.ColorLookup.INDEX)

detections1 = sv.Detections.from_sam(sam_result=sam_result1)

annotated_image1 = mask_annotator.annotate(scene=image_bgr1.copy(), detections=detections1)

sv.plot_images_grid(
    images=[image_bgr1, annotated_image1],
    grid_size=(1, 2),
    titles=['source image', 'segmented image']
)

In [None]:
detections2 = sv.Detections.from_sam(sam_result=sam_result2)

annotated_image2 = mask_annotator.annotate(scene=image_bgr2.copy(), detections=detections2)

sv.plot_images_grid(
    images=[image_bgr2, annotated_image2],
    grid_size=(1, 2),
    titles=['source image', 'segmented image']
)