This repository has been archived by the owner on Nov 21, 2023. It is now read-only.

autodistill/autodistill-sam-clip

Autodistill SAM-CLIP

Important

This model has been replaced by the SAM-CLIP combination implemented with the Autodistill model combination API. This API lets you combine a detection model with a classification model for auto-labeling. See the code snippet below for an example of using SAM-CLIP with the new API.

New SAM-CLIP API

First, install the GroundedSAM and CLIP Autodistill modules:

pip install autodistill autodistill-grounded-sam autodistill-clip

To use the new API, choose an abstract class to identify (e.g. "logo") with a base detection model (in the example below, GroundedSAM, which uses Grounding DINO for detection). Then, choose the classes that the classification model should assign to each detection (e.g. "McDonalds", "Burger King"):

from autodistill_clip import CLIP
from autodistill.detection import CaptionOntology
from autodistill_grounded_sam import GroundedSAM
import supervision as sv

from autodistill.core.custom_detection_model import CustomDetectionModel
import cv2

classes = ["McDonalds", "Burger King"]


SAMCLIP = CustomDetectionModel(
    detection_model=GroundedSAM(
        CaptionOntology({"logo": "logo"})
    ),
    classification_model=CLIP(
        CaptionOntology({k: k for k in classes})
    )
)

IMAGE = "logo.jpg"

results = SAMCLIP.predict(IMAGE)

image = cv2.imread(IMAGE)

annotator = sv.MaskAnnotator()
label_annotator = sv.LabelAnnotator()

# build one label per detection from its class ID and confidence
labels = [
    f"{classes[class_id]} {confidence:0.2f}"
    for class_id, confidence in zip(results.class_id, results.confidence)
]

annotated_frame = annotator.annotate(
    scene=image.copy(), detections=results
)
annotated_frame = label_annotator.annotate(
    scene=annotated_frame, labels=labels, detections=results
)

sv.plot_image(annotated_frame, size=(8, 8))

Archived Contents

This repository contains the code supporting the SAM-CLIP base model for use with Autodistill.

SAM-CLIP uses the Segment Anything Model (SAM) to generate masks for the objects in an image. CLIP is then used to find the masks that are related to the given prompt.
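The matching step can be illustrated with a minimal sketch. It does not call SAM or CLIP. It assumes each mask crop and each prompt has already been turned into an embedding vector, and it uses cosine similarity, the usual way CLIP image and text embeddings are compared. The toy vectors, the `match_masks_to_prompts` helper, and the `threshold` value are hypothetical and are not taken from this repository.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def match_masks_to_prompts(mask_embeddings, prompt_embeddings, threshold=0.5):
    """Keep each mask whose best-scoring prompt reaches the threshold.

    Hypothetical helper: returns a list of (mask_index, prompt, score).
    """
    matches = []
    for index, mask_vector in enumerate(mask_embeddings):
        best_prompt, best_score = max(
            (
                (prompt, cosine_similarity(mask_vector, prompt_vector))
                for prompt, prompt_vector in prompt_embeddings.items()
            ),
            key=lambda pair: pair[1],
        )
        if best_score >= threshold:
            matches.append((index, best_prompt, best_score))
    return matches


# Toy 3-dimensional vectors standing in for real embeddings (assumption).
prompt_embeddings = {
    "shipping container": [1.0, 0.0, 0.0],
    "truck": [0.0, 1.0, 0.0],
}
mask_embeddings = [
    [0.9, 0.1, 0.0],  # close to "shipping container"
    [0.0, 0.0, 1.0],  # unrelated to both prompts, so it is dropped
    [0.2, 0.8, 0.0],  # close to "truck"
]

matches = match_masks_to_prompts(mask_embeddings, prompt_embeddings)
for index, prompt, score in matches:
    print(f"mask {index}: {prompt} ({score:0.2f})")
```

In a real pipeline the mask vectors would come from running an image encoder on each masked region, and masks that match no prompt well would be discarded, as the second toy mask is here.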

Read the full Autodistill documentation.

Read the SAM-CLIP Autodistill documentation.

Installation

To use the SAM-CLIP base model, you will need to install the following dependency:

pip3 install autodistill-sam-clip

Quickstart

from autodistill.detection import CaptionOntology
from autodistill_sam_clip import SAMCLIP


# define an ontology to map class names to our CLIP prompt
# the ontology dictionary has the format {caption: class}
# where caption is the prompt sent to the base model, and class is the label that will
# be saved for that caption in the generated annotations
# then, load the model
base_model = SAMCLIP(ontology=CaptionOntology({"shipping container": "container"}))

# label all images in a folder called `context_images`
base_model.label("./context_images", extension=".jpeg")
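The `{caption: class}` ontology format described in the comments above can be pictured as a plain dictionary. The sketch below uses only built-in Python, not the `CaptionOntology` class, and the second entry (`"cargo truck": "truck"`) is a hypothetical addition for illustration. It also assumes that class IDs follow the order in which classes appear in the dictionary, which matches how `classes[class_id]` is used in the new-API example but is not confirmed here for every model.

```python
# Plain-dict stand-in for an ontology: {caption sent to the model: label saved}.
ontology = {
    "shipping container": "container",
    "cargo truck": "truck",  # hypothetical extra entry
}

# Captions are the prompts given to the base model.
prompts = list(ontology.keys())

# Classes are the labels written into the generated annotations.
classes = list(ontology.values())


def class_id_for_prompt(prompt):
    """Return the index of the label saved for a prompt (assumed ordering)."""
    return classes.index(ontology[prompt])


print(prompts)
print(classes)
print(class_id_for_prompt("cargo truck"))
```

Keeping the caption separate from the class lets you phrase the prompt descriptively for the model while saving a short, stable label in the dataset.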

License

The code in this repository is licensed under an Apache 2.0 license.

🏆 Contributing

We love your input! Please see the core Autodistill contributing guide to get started. Thank you 🙏 to all our contributors!