[![Fixel Algorithms](https://fixelalgorithms.co/images/CCExt.png)](https://fixelalgorithms.gitlab.io)

# AI for System Engineers and Project Managers

## Deep Learning - Computer Vision - Segment Anything Model (SAM)

Demonstrates using a _Zero Shot Model_ for segmentation.

> Notebook by:
> - Royi Avital RoyiAvital@fixelalgorithms.com

## Revision History

| Version | Date       | User        |Content / Changes                                                   |
|---------|------------|-------------|--------------------------------------------------------------------|
| 1.0.000 | 05/03/2025 | Royi Avital | First version                                                      |

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/FixelAlgorithmsTeam/FixelCourses/blob/master/AIProgram/2024_02/0037FeaturesTransform.ipynb)

In [None]:
# Import Packages

# General Tools
import numpy as np
import scipy as sp
import pandas as pd

# Machine Learning

# Deep Learning
import onnxruntime

# Image Processing
import skimage as ski

# Miscellaneous
import math
import os
import pickle
from platform import python_version
import random
import onedrivedownloader #<! https://github.com/loribonna/onedrivedownloader

# Typing
from typing import Callable, Dict, List, Optional, Self, Set, Tuple, Union

# Visualization
import matplotlib as mpl
from matplotlib.patches import Rectangle
import matplotlib.pyplot as plt
import seaborn as sns

# Jupyter
from IPython import get_ipython

## Notations

* <font color='red'>(**?**)</font> Question to answer interactively.
* <font color='blue'>(**!**)</font> Simple task to add code for the notebook.
* <font color='green'>(**@**)</font> Optional / Extra self practice.
* <font color='brown'>(**#**)</font> Note / Useful resource / Food for thought.

Code Notations:

```python
someVar    = 2; #<! Notation for a variable
vVector    = np.random.rand(4) #<! Notation for 1D array
mMatrix    = np.random.rand(4, 3) #<! Notation for 2D array
tTensor    = np.random.rand(4, 3, 2, 3) #<! Notation for nD array (Tensor)
tuTuple    = (1, 2, 3) #<! Notation for a tuple
lList      = [1, 2, 3] #<! Notation for a list
dDict      = {1: 3, 2: 2, 3: 1} #<! Notation for a dictionary
oObj       = MyClass() #<! Notation for an object
dfData     = pd.DataFrame() #<! Notation for a data frame
dsData     = pd.Series() #<! Notation for a series
hObj       = plt.Axes() #<! Notation for an object / handler / function handler
```

### Code Exercise

 - Single line fill

 ```python
 valToFill = ???
 ```

 - Multi Line to Fill (At least one)

```python
# You need to start writing
?????
```

 - Section to Fill

```python
#===========================Fill This===========================#
# 1. Explanation about what to do.
# !! Remarks to follow / take under consideration.
mX = ???

?????
#===============================================================#
```

In [None]:
# Configuration
# %matplotlib inline

seedNum = 512
np.random.seed(seedNum)
random.seed(seedNum)

# Matplotlib default color palette
lMatPltLibclr = ['#1f77b4', '#ff7f0e', '#2ca02c', '#d62728', '#9467bd', '#8c564b', '#e377c2', '#7f7f7f', '#bcbd22', '#17becf']
# sns.set_theme() #<! Apply SeaBorn theme

runInGoogleColab = 'google.colab' in str(get_ipython())

In [None]:
# Constants

FIG_SIZE_DEF    = (8, 8)
ELM_SIZE_DEF    = 50
CLASS_COLOR     = ('b', 'r')
EDGE_COLOR      = 'k'
MARKER_SIZE_DEF = 10
LINE_WIDTH_DEF  = 2

PROJECT_NAME      = 'FixelCourses'
DATA_FOLDER_PATH  = 'DataSets'
MODEL_FOLDER_PATH = 'Models'

BASE_FOLDER      = os.getcwd()[:len(os.getcwd()) - (os.getcwd()[::-1].lower().find(PROJECT_NAME.lower()[::-1]))]

L_IMG_EXT = ['.png', '.jpeg', '.jpg']

In [None]:
# Courses Packages

from SAM2ONNX import SAM2Image



In [None]:
# General Auxiliary Functions


## Object Segmentation

Object Segmentation is solved by an _Image to Image_ model.    
It effectively applies Regression / Classification per pixel.
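
To make the per pixel view concrete, the following is a minimal sketch (not part of SAM) which turns a tensor of per pixel class scores into a label map. The function name `scores_to_label_map` and the toy scores are assumptions made for illustration only.

```python
import numpy as np

def scores_to_label_map(tScores: np.ndarray) -> np.ndarray:
    """Converts per pixel class scores (H x W x C) into a label map (H x W)."""
    if tScores.ndim != 3:
        raise ValueError('Expected an array of shape (H, W, C)')
    # Classification per pixel: pick the class with the highest score
    return np.argmax(tScores, axis = 2)

# Toy example (hypothetical values): 2 x 2 image with 3 classes
tScores = np.array([[[0.1, 0.7, 0.2], [0.8, 0.1, 0.1]],
                    [[0.2, 0.2, 0.6], [0.3, 0.4, 0.3]]])
mLabel = scores_to_label_map(tScores) #<! Same spatial size as the input
print(mLabel)
```

The output has the same height and width as the input, which is what makes the model _Image to Image_.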

### Facebook's / Meta's Segment Anything Model (SAM)

![](https://i.postimg.cc/7YMBt9Dm/sam-architecture.png)
<!-- ![](https://i.imgur.com/gxFI99L.png) -->

* <font color='brown'>(**#**)</font> [SAM 1 Online Demo](https://segment-anything.com) ([Hacker News Discussion](https://news.ycombinator.com/item?id=35455566), [SAM2 Paper Review](https://openreview.net/forum?id=Ha6RTeWMd0)).
* <font color='brown'>(**#**)</font> The SAM model can be used for: text based prompting, generating a segmentation from a bounding box, generating a bounding box from points, tracking, etc.
* <font color='brown'>(**#**)</font> It is integrated into many applications. See [Segmenting Remote Sensing Imagery with Text Prompts and the Segment Anything Model 2](https://samgeo.gishub.org/examples/sam2_text_prompts).
* <font color='brown'>(**#**)</font> [SAM-HQ](https://github.com/SysCV/sam-hq) - High Quality variant.
* <font color='brown'>(**#**)</font> Available on MATLAB in [`segmentAnythingModel`](https://www.mathworks.com/help/images/ref/segmentanythingmodel.html).
* <font color='brown'>(**#**)</font> [Segment Anything Model and Friends](https://www.lightly.ai/post/segment-anything-model-and-friends) ([Discussion on HackerNews](https://news.ycombinator.com/item?id=41180632)).
* <font color='brown'>(**#**)</font> [Latent Space - Segment Anything 2: Demo First Model Development](https://www.latent.space/p/sam2) - Interview with one of the developers of SAM2.
* <font color='brown'>(**#**)</font> [Kornia SAM](https://kornia.readthedocs.io/en/latest/models/segment_anything.html).
* <font color='brown'>(**#**)</font> [Segment Anything Model (SAM): Explained](https://scribe.rip/2900743cb61e).
* <font color='brown'>(**#**)</font> [Highly Accurate Dichotomous Image Segmentation ECCV 2022](https://github.com/xuebinqin/DIS).
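
Masks, points and bounding boxes are closely related. As a small illustration, a tight bounding box can be derived from a binary mask using NumPy alone. This is a sketch and not part of the SAM API. The helper name `mask_to_box` is hypothetical and the box is given as `(xMin, yMin, xMax, yMax)` in pixel coordinates.

```python
import numpy as np

def mask_to_box(mMask: np.ndarray):
    """Returns (xMin, yMin, xMax, yMax) of the non zero pixels, None for an empty mask."""
    vRows = np.flatnonzero(np.any(mMask, axis = 1)) #<! Rows which contain the object
    vCols = np.flatnonzero(np.any(mMask, axis = 0)) #<! Columns which contain the object
    if vRows.size == 0:
        return None
    return int(vCols[0]), int(vRows[0]), int(vCols[-1]), int(vRows[-1])

# Toy mask (hypothetical): a rectangle on rows 2..4 and columns 3..6
mMask = np.zeros((6, 8), dtype = np.uint8)
mMask[2:5, 3:7] = 1
tuBox = mask_to_box(mMask)
print(tuBox)
```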

In [None]:
# Parameters

# Data
imgUrl = r'https://upload.wikimedia.org/wikipedia/commons/thumb/c/c1/Racing_Terriers_%282490056817%29.jpg/1280px-Racing_Terriers_%282490056817%29.jpg'
imgUrl = 'https://i.postimg.cc/KvwnNg3J/Dogs-Running001.jpg' #<! Wikipedia
imgUrl = 'https://i.imgur.com/XillQsz.jpeg'                  #<! Wikipedia
imgUrl = 'https://i.postimg.cc/SR5zDwRJ/Dogs-Running002.jpg' #<! 002
imgUrl = 'https://i.imgur.com/Zbwxxwy.jpeg'                  #<! 002
imgUrl = 'https://i.postimg.cc/ncjmCcS7/Dogs-Running003.jpg' #<! 003
imgUrl = 'https://i.imgur.com/WU0k57v.jpeg'                  #<! 003


modelUrl = 'https://technionmail-my.sharepoint.com/:u:/g/personal/royia_technion_ac_il/EfN_b1spF0ZCtBBEwhbjfTYBlGaG1jkQtXRrCoGjMlNDXQ?e=cGVlmq' #<! All models
modelUrl = 'https://technionmail-my.sharepoint.com/:u:/g/personal/royia_technion_ac_il/EdOA62hrFyREuwwim0tXWNIBHX5IInJkJKgknXFJJQZubg?e=0xwYi0' #<! Tiny model

modelDecFileName = 'sam2.1_hiera_tiny_decoder.onnx'
modelEncFileName = 'sam2.1_hiera_tiny_encoder.onnx'

# Pre Processing

# Model
modelName = 'SAM2'

# Points
lPtCoord = [np.array([[420, 440], [200, 500], [525, 400]]), np.array([[360, 275], [370, 210], [300, 450], [320, 400]]), np.array([[810, 440], [1200, 400]]), np.array([[920, 314], [950, 475]])]
lLblMode = [np.array([1, 1, 1]), np.array([1, 1, 1, 1]), np.array([1, 1]), np.array([1, 1])] #<! 1 -> Additive, 0 -> Subtractive

# Data Visualization


## Generate / Load Data

The image shows running dogs.



In [None]:
# Verify Data is Available

modelsPath = os.path.join(BASE_FOLDER, MODEL_FOLDER_PATH)

if not (os.path.isfile(os.path.join(modelsPath, modelDecFileName)) and os.path.isfile(os.path.join(modelsPath, modelEncFileName))):
    # Download, unzip and remove ZIP file
    onedrivedownloader.download(modelUrl, os.path.join(BASE_FOLDER, MODEL_FOLDER_PATH, modelName + '.zip'), unzip = True, clean = True)

In [None]:
# Load / Generate Data 

mI = ski.io.imread(imgUrl)

### Plot Data

In [None]:
# Plot the Data

hF, hA = plt.subplots(1, 1, figsize = (12, 12))
hA.imshow(mI)

* <font color='brown'>(**#**)</font> Some of the images are not well annotated.

## Load Model

The model is executed by [ONNX Runtime](https://github.com/microsoft/onnxruntime) through a wrapping class.

* <font color='brown'>(**#**)</font> ONNX Runtime is a general run time, though it has optimizations specific to several hardware platforms.
* <font color='brown'>(**#**)</font> For NVIDIA based hardware the most optimized Run Time is [TensorRT](https://github.com/NVIDIA/TensorRT).
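
In ONNX Runtime the hardware specific back ends are exposed as _Execution Providers_. The sketch below lists the available ones and builds a preference ordered list. The helper `select_providers` is hypothetical, and whether the `SAM2Image` wrapper accepts a providers list is not shown in this notebook, so the session creation is left as a comment.

```python
from typing import List, Sequence

def select_providers(lAvailable: Sequence[str], lPreferred: Sequence[str]) -> List[str]:
    """Keeps the preferred providers which are available, in order of preference."""
    lSelected = [providerName for providerName in lPreferred if providerName in lAvailable]
    if 'CPUExecutionProvider' not in lSelected:
        lSelected.append('CPUExecutionProvider') #<! The CPU is the last fall back
    return lSelected

lPreferred = ['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider']

try:
    import onnxruntime
    lAvailable = onnxruntime.get_available_providers()
except ImportError:
    lAvailable = ['CPUExecutionProvider'] #<! Assumption: CPU only when the package is missing

lProviders = select_providers(lAvailable, lPreferred)
print(lProviders)
# A session could then be created with (the model path is a placeholder):
# oSession = onnxruntime.InferenceSession(modelPath, providers = lProviders)
```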

In [None]:
# Model

oSam = SAM2Image(os.path.join(modelsPath, modelEncFileName), os.path.join(modelsPath, modelDecFileName))

## Inference

In [None]:
# Set the Image -> Generate Embeddings

oSam.set_image(mI) #<! Input should be UINT8
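
Since the input should be UINT8, images loaded from other sources (grayscale, RGBA, floating point) may need a conversion first. The following is a minimal sketch of such a conversion. The helper `to_uint8_rgb` is hypothetical, it assumes floating point images are in the range `[0, 1]`, and it assumes a 3 channel input is what the wrapper expects.

```python
import numpy as np

def to_uint8_rgb(mI) -> np.ndarray:
    """Returns an H x W x 3 UINT8 array from a grayscale / RGB / RGBA image."""
    mI = np.asarray(mI)
    if np.issubdtype(mI.dtype, np.floating):
        # Assumption: floating point images are in the range [0, 1]
        mI = np.round(255.0 * np.clip(mI, 0.0, 1.0)).astype(np.uint8)
    elif mI.dtype != np.uint8:
        raise TypeError(f'Unsupported data type: {mI.dtype}')
    if mI.ndim == 2:
        mI = np.stack([mI, mI, mI], axis = -1) #<! Grayscale -> 3 channels
    elif (mI.ndim == 3) and (mI.shape[2] == 4):
        mI = mI[:, :, :3] #<! Drop the alpha channel
    elif not ((mI.ndim == 3) and (mI.shape[2] == 3)):
        raise ValueError(f'Unsupported image shape: {mI.shape}')
    return np.ascontiguousarray(mI)

# Toy example (hypothetical): a white floating point grayscale image
mOut = to_uint8_rgb(np.ones((4, 5)))
print(mOut.shape, mOut.dtype)
```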


In [None]:
# Add Annotations
lMask = []
for lblId, (vPtCoord, lblMode) in enumerate(zip(lPtCoord, lLblMode)):
    for ii in range(lblMode.shape[0]):
        oSam.add_point((vPtCoord[ii][0], vPtCoord[ii][1]), lblMode[ii], lblId)

dMasks = oSam.get_masks()
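
The annotation loop relies on each array of points having a matching array of label modes. A small check before calling the model can catch mismatches early. The sketch below is independent of the `SAM2Image` class. The helper `flatten_prompts` is hypothetical and only assumes the `(x, y)` points and `0` / `1` modes convention used in the parameters cell.

```python
import numpy as np

def flatten_prompts(lPtCoord, lLblMode):
    """Validates the prompts and returns a list of (lblId, x, y, lblMode) tuples."""
    if len(lPtCoord) != len(lLblMode):
        raise ValueError('The number of point arrays and label mode arrays must match')
    lPrompt = []
    for lblId, (mPtCoord, vLblMode) in enumerate(zip(lPtCoord, lLblMode)):
        mPtCoord = np.asarray(mPtCoord)
        vLblMode = np.asarray(vLblMode)
        if (mPtCoord.ndim != 2) or (mPtCoord.shape[1] != 2):
            raise ValueError(f'Label {lblId}: points must be an N x 2 array')
        if mPtCoord.shape[0] != vLblMode.shape[0]:
            raise ValueError(f'Label {lblId}: a single mode per point is required')
        if not np.all(np.isin(vLblMode, (0, 1))):
            raise ValueError(f'Label {lblId}: modes must be 1 (Additive) or 0 (Subtractive)')
        for (xCoord, yCoord), lblMode in zip(mPtCoord, vLblMode):
            lPrompt.append((lblId, int(xCoord), int(yCoord), int(lblMode)))
    return lPrompt

# Toy prompts (hypothetical): 2 objects, the 2nd point of the 1st object is subtractive
lPts    = [np.array([[420, 440], [200, 500]]), np.array([[810, 440]])]
lModes  = [np.array([1, 0]), np.array([1])]
lPrompt = flatten_prompts(lPts, lModes)
print(lPrompt)
```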

In [None]:
# Display a Single Mask
plt.imshow(dMasks[0])

In [None]:
# Plot the image with masks
hF, hA = plt.subplots(1, 1, figsize=(12, 12))
hA.imshow(mI)

lClrCmaps = ['Blues', 'Greens', 'Oranges', 'Purples', 'Reds']

# Overlay masks
for lblId, mM in dMasks.items():
    # Plot the annotation points of the label (Single legend entry per label)
    mPtCoord = lPtCoord[lblId]
    hA.scatter(mPtCoord[:, 0], mPtCoord[:, 1], c = lMatPltLibclr[lblId], s = 125, label = f'{lblId}')
    # Overlay the mask of the label (The alpha is zero where the mask is zero)
    hA.imshow(mM, alpha = 0.5 * mM, cmap = lClrCmaps[lblId])

hA.legend()
plt.show()
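
The overlay can also be computed explicitly, for instance in order to save the blended image to a file. The following is a minimal sketch of alpha blending a single mask with a solid color. The helper `blend_mask` is hypothetical and assumes non zero mask pixels mark the object, which is not verified here against the output format of `get_masks()`.

```python
import numpy as np

def blend_mask(mI: np.ndarray, mMask: np.ndarray, vColor, alphaVal: float = 0.5) -> np.ndarray:
    """Blends a solid color into an H x W x 3 UINT8 image where the mask is non zero."""
    mB = mMask.astype(bool)
    mO = mI.astype(np.float64) #<! Works on a copy, the input is kept intact
    vC = np.asarray(vColor, dtype = np.float64)
    # Convex combination of the image and the color on the masked pixels only
    mO[mB] = ((1.0 - alphaVal) * mO[mB]) + (alphaVal * vC)
    return np.round(mO).astype(np.uint8)

# Toy example (hypothetical): gray image with a single masked pixel
mImg  = np.full((2, 2, 3), 100, dtype = np.uint8)
mMask = np.array([[1, 0], [0, 0]], dtype = np.uint8)
mOut  = blend_mask(mImg, mMask, (200, 0, 100), alphaVal = 0.5)
print(mOut[0, 0], mOut[1, 1])
```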



### Larger Model Result

![](https://github.com/ibaiGorordo/ONNX-SAM2-Segment-Anything/raw/main/doc/img/sam2_masked_img.jpg)