# Qualitative Evaluation with Grounded SAM 2

#### Course: Deep Neural Engineering (IM1102)
#### Group: Ellen Cordemans, Ilse Harmers & Sem Pepels

The code in this notebook is adapted from [1].


---



**References**

[1] Gallagher, J. (2024, July 31). How to Label Data with Grounded SAM 2. Roboflow Blog. Retrieved April 14, 2025, from https://blog.roboflow.com/label-data-with-grounded-sam-2/

## Environment Set-Up

In [None]:
# Checking the availability of Google Colab's GPU. Note that this notebook cannot be run without a GPU backend.
!nvidia-smi
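
The same check can be done from Python, which is convenient when later cells should be skipped without a GPU. The sketch below is a minimal illustration using only the standard library; the helper name `gpu_available` is our own and not part of any library used in this notebook.

```python
import shutil
import subprocess


def gpu_available() -> bool:
    """Return True if `nvidia-smi` is on the PATH and exits successfully."""
    if shutil.which("nvidia-smi") is None:
        return False
    try:
        completed = subprocess.run(
            ["nvidia-smi"], capture_output=True, timeout=30, check=False
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return completed.returncode == 0


print("GPU backend detected:", gpu_available())
```

A `False` result suggests that the Colab runtime type should be switched to a GPU before continuing.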

In [None]:
# Installing Grounded SAM 2 from the Autodistill (Python) library.
# Note that this step can take a while (~ 2 minutes during our runs).
!pip install -q autodistill-grounded-sam-2

In [None]:
# Making sure that the Google Colab environment has the right version of the Transformers library installed.
!pip uninstall -y transformers
!pip install -q transformers==4.49.0
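
To confirm that the pinned version is the one actually installed, the package metadata can be queried with the standard library. This is a minimal sketch; `installed_version` and `EXPECTED_TRANSFORMERS` are hypothetical names introduced here for illustration.

```python
from importlib.metadata import PackageNotFoundError, version

EXPECTED_TRANSFORMERS = "4.49.0"  # the version pinned in the cell above


def installed_version(package: str):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


found = installed_version("transformers")
if found != EXPECTED_TRANSFORMERS:
    print(f"Expected transformers {EXPECTED_TRANSFORMERS}, found {found}.")
else:
    print("transformers version OK:", found)
```

This reads the installed metadata rather than importing the library, so it reflects what is on disk rather than what is already loaded in the session.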

In [None]:
# Making a directory called 'data' where the data will be stored.
# If the reader intends to run all cells in this notebook from scratch, but not in a Google Colab environment, then this step can be skipped
# as long as the "image_path" variable is adjusted as well. If the reader is running all cells in Google Colab without adjusting the
# aforementioned variable, then the images should be uploaded to this new directory.
import os
HOME = os.getcwd()
print("HOME:", HOME)

%cd {HOME}
!mkdir -p {HOME}/data
%cd {HOME}/data
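
For readers working outside Google Colab, the shell commands above can be replaced by plain Python. The sketch below is one possible alternative, not the set-up used for our runs; `ensure_data_dir` and `portable_image_path` are hypothetical names, and the file name is taken from the image path used later in this notebook.

```python
from pathlib import Path


def ensure_data_dir(base: Path) -> Path:
    """Create `base/data` if needed and return its path."""
    data_dir = base / "data"
    data_dir.mkdir(parents=True, exist_ok=True)  # no error if it already exists
    return data_dir


# Assumption: the current working directory plays the role of HOME above.
data_dir = ensure_data_dir(Path.cwd())

# Building the image path relative to the data directory keeps it portable.
portable_image_path = str(data_dir / "house2.jpg")
print(portable_image_path)
```

Unlike `%cd`, this does not change the working directory; it only builds a path that can be assigned to the image path variable.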

## Grounded SAM 2

In [None]:
# Importing the required models, functions and libraries.
# Note that this step can take a while (~ 4 minutes in our runs).
from autodistill_grounded_sam_2 import GroundedSAM2
from autodistill.detection import CaptionOntology
from autodistill.utils import plot
import cv2
import supervision as sv

In [None]:
# This variable sets the path to the image that will be processed by Grounded SAM 2.
# The path can be modified if another image is to be processed instead.
image_path = "/content/data/house2.jpg"
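
Because `cv2.imread` returns `None` for a missing file instead of raising an error, it can help to validate the path before running the model. The sketch below is a minimal illustration; `check_image_path` and `SUPPORTED_SUFFIXES` are hypothetical names, and the list of extensions is an assumption rather than an exhaustive list of what the model accepts.

```python
from pathlib import Path

# Assumption: illustrative set of extensions, not an exhaustive list.
SUPPORTED_SUFFIXES = {".jpg", ".jpeg", ".png"}


def check_image_path(path: str) -> list:
    """Return a list of human-readable problems with `path` (empty if none)."""
    problems = []
    p = Path(path)
    if not p.is_file():
        problems.append(f"File not found: {p}")
    if p.suffix.lower() not in SUPPORTED_SUFFIXES:
        problems.append(f"Unexpected file extension: {p.suffix!r}")
    return problems


for problem in check_image_path("/content/data/house2.jpg"):
    print(problem)
```

If nothing is printed, the file exists and has one of the listed extensions.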

In [None]:
# Setting up the Grounded SAM 2 model with our ontology. The ontology has the following structure: {"prompt": "label"}.
# The prompt is given to the grounding model (Florence-2) and the results have the specified label attached to them.
base_model = GroundedSAM2(
    ontology=CaptionOntology(
        {
            "door": "door",
            "window": "window",
            "front yard": "front yard",
        }
    )
)
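
The ontology also determines which integer class ID belongs to which label. The sketch below illustrates this with a plain dictionary; it assumes that class IDs follow the insertion order of the ontology, which is our understanding of Autodistill's behaviour and not something verified in this notebook. The names `class_names` and `class_id_of` are hypothetical.

```python
# The same {"prompt": "label"} mapping as in the cell above.
ontology = {
    "door": "door",
    "window": "window",
    "front yard": "front yard",
}

# Assumption: class IDs follow the insertion order of the dictionary.
class_names = list(ontology.values())
class_id_of = {name: idx for idx, name in enumerate(class_names)}

print(class_id_of)  # {'door': 0, 'window': 1, 'front yard': 2}
```

Under this assumption, reordering the entries in the ontology would also change which class ID, and therefore which colour, each label receives.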

In [None]:
# Processing the image with Grounded SAM 2.
results = base_model.predict(image_path)

In [None]:
# Setting up the labels for the label annotator in the next cell. We want to display both the label and the confidence score for each detection.
classes = base_model.ontology.classes()

labels = [
    f"{classes[class_id]} {confidence:0.2f}"
    for _, _, confidence, class_id, _, _
    in results
]
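
To make the label format concrete, the sketch below applies the same f-string to a few made-up detections. The class IDs and confidence values are hypothetical and chosen purely for illustration; they are not output from Grounded SAM 2.

```python
class_names = ["door", "window", "front yard"]

# Hypothetical (class_id, confidence) pairs, made up for illustration only.
mock_detections = [(0, 0.9134), (1, 0.5), (2, 0.34567)]

# Same format as above: class name followed by the confidence with two decimals.
mock_labels = [
    f"{class_names[class_id]} {confidence:0.2f}"
    for class_id, confidence in mock_detections
]

print(mock_labels)  # ['door 0.91', 'window 0.50', 'front yard 0.35']
```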

In [None]:
# This line ensures that the color codes match between our Grounded SAM 2 results and our Roboflow dataset.
color = sv.ColorPalette.from_hex(['#FE0056', '#8622FF', '#00FFCE'])

# Setting up the annotators for our model's results.
box_annotator = sv.BoxAnnotator(color=color)
mask_annotator = sv.MaskAnnotator(color=color)
label_annotator = sv.LabelAnnotator(color=color, text_color=sv.Color.BLACK)

# Reading image.
image = cv2.imread(image_path)

# Annotating the model's results.
annotated_image = mask_annotator.annotate(scene=image.copy(), detections=results)
annotated_image = box_annotator.annotate(scene=annotated_image, detections=results)
annotated_image = label_annotator.annotate(scene=annotated_image, detections=results, labels=labels)

# Plotting the annotated end result.
sv.plot_image(annotated_image, size=(8, 8))
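
The colour matching relies on each class ID selecting one entry of the palette. The sketch below spells out that mapping in plain Python; it assumes that the annotators colour detections by class ID and pick the palette entry at `class_id % len(palette)`. The helper `hex_to_rgb` is a hypothetical name introduced here.

```python
def hex_to_rgb(hex_code: str) -> tuple:
    """Convert '#RRGGBB' to an (R, G, B) tuple of integers."""
    value = hex_code.lstrip("#")
    return tuple(int(value[i:i + 2], 16) for i in (0, 2, 4))


palette = ["#FE0056", "#8622FF", "#00FFCE"]
class_names = ["door", "window", "front yard"]

# Assumption: the colour for a detection is palette[class_id % len(palette)].
for class_id, name in enumerate(class_names):
    hex_code = palette[class_id % len(palette)]
    print(f"{name}: {hex_code} -> RGB {hex_to_rgb(hex_code)}")
```

Under this assumption, doors are drawn in `#FE0056`, windows in `#8622FF` and front yards in `#00FFCE`.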