# Zero-shot object detection

We show an example of running zero-shot object detection with Google's OWL-ViT model ([HF model link](https://huggingface.co/google/owlvit-base-patch32)). We also quickly expose the example through a Gradio app. This notebook was tested on the PyTorch 2.0.0 Python 3.10 GPU image on SageMaker Studio.

In [None]:
!pip install transformers

## Model setup
We pull the model from the Hugging Face Model Hub and show some examples.

In [None]:
from transformers import pipeline
checkpoint = "google/owlvit-base-patch32"
detector = pipeline(model=checkpoint, task="zero-shot-object-detection")

In [None]:
from PIL import Image
image = Image.open('meme.jpg').convert("RGB")

In [None]:
image

In [None]:
predictions = detector(
    image,
    candidate_labels=["human face", "plaid shirt"],
    threshold=0.05,
)

In [None]:
predictions
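
With a threshold as low as 0.05, the detector can return several low-confidence boxes per label. The sketch below shows one way to keep only the best box for each label. It assumes the output shape the later cells rely on (a list of dicts with `score`, `label`, and `box` keys); the sample data, the helper name `best_per_label`, and the `min_score` cutoff are hypothetical and chosen for illustration.

```python
# Hypothetical sample in the same shape as the pipeline output used in this
# notebook: a list of dicts with "score", "label", and "box" keys.
sample_predictions = [
    {"score": 0.31, "label": "human face",
     "box": {"xmin": 40, "ymin": 20, "xmax": 120, "ymax": 110}},
    {"score": 0.07, "label": "human face",
     "box": {"xmin": 200, "ymin": 30, "xmax": 260, "ymax": 95}},
    {"score": 0.12, "label": "plaid shirt",
     "box": {"xmin": 30, "ymin": 115, "xmax": 150, "ymax": 300}},
]


def best_per_label(predictions, min_score=0.1):
    """Keep the highest-scoring prediction per label, ignoring scores below min_score."""
    best = {}
    for prediction in predictions:
        if prediction["score"] < min_score:
            continue  # skip low-confidence detections
        label = prediction["label"]
        if label not in best or prediction["score"] > best[label]["score"]:
            best[label] = prediction
    return best


summary = best_per_label(sample_predictions)
for label, prediction in summary.items():
    print(label, prediction["score"], prediction["box"])
```

In the notebook, the same helper could be called on the real `predictions` list before drawing, which tends to give a less cluttered image.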

In [None]:
from PIL import ImageDraw
draw = ImageDraw.Draw(image)

for prediction in predictions:
    box = prediction["box"]
    label = prediction["label"]
    score = prediction["score"]

    # box is a dict with keys xmin, ymin, xmax, ymax (pixel coordinates, in that order)
    xmin, ymin, xmax, ymax = box.values()
    draw.rectangle((xmin, ymin, xmax, ymax), outline="red", width=1)
    draw.text((xmin, ymin), f"{label}: {round(score,2)}", fill="white")

In [None]:
image

In [None]:
from PIL import Image
image = Image.open('keanu.jpeg').convert("RGB")

In [None]:
image

In [None]:
predictions = detector(
    image,
    candidate_labels=["jeans", "blazer", "human face", "food"],
    threshold=0.05,
)

In [None]:
predictions

In [None]:
from PIL import ImageDraw
draw = ImageDraw.Draw(image)

for prediction in predictions:
    box = prediction["box"]
    label = prediction["label"]
    score = prediction["score"]

    # box is a dict with keys xmin, ymin, xmax, ymax (pixel coordinates, in that order)
    xmin, ymin, xmax, ymax = box.values()
    draw.rectangle((xmin, ymin, xmax, ymax), outline="red", width=1)
    draw.text((xmin, ymin), f"{label}: {round(score,2)}", fill="white")

In [None]:
image
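
The drawing loop is repeated for each image, so it could be factored into a small helper. The following is a minimal sketch, not part of the original notebook: the name `draw_predictions` is hypothetical, it draws on a copy so the input image stays unchanged, and it reads the box coordinates by key instead of relying on dict ordering. A blank image and a made-up prediction stand in for the model output so the cell runs on its own.

```python
from PIL import Image, ImageDraw


def draw_predictions(image, predictions, outline="red"):
    """Return a copy of `image` with one labelled rectangle per prediction."""
    annotated = image.copy()  # leave the caller's image untouched
    draw = ImageDraw.Draw(annotated)
    for prediction in predictions:
        box = prediction["box"]
        # Read coordinates by key rather than relying on dict ordering
        xmin, ymin, xmax, ymax = (box[key] for key in ("xmin", "ymin", "xmax", "ymax"))
        draw.rectangle((xmin, ymin, xmax, ymax), outline=outline, width=1)
        caption = f"{prediction['label']}: {round(prediction['score'], 2)}"
        draw.text((xmin, ymin), caption, fill="white")
    return annotated


# Hypothetical stand-in data so the sketch runs without the model
blank = Image.new("RGB", (100, 100), "black")
fake_predictions = [
    {"score": 0.42, "label": "demo",
     "box": {"xmin": 10, "ymin": 10, "xmax": 60, "ymax": 60}},
]
annotated = draw_predictions(blank, fake_predictions)
```

With the real detector, `draw_predictions(image, predictions)` would replace each copy of the loop above.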

## Quick externalization for demonstration

Gradio provides a quick way to share your demo with the outside world. Here's a simple example:

In [None]:
!pip install gradio --upgrade

In [None]:
import gradio as gr
from PIL import ImageDraw

def detect_zeroshot(labels, image):
    # Split the comma-separated labels and strip surrounding whitespace
    candidate_labels = [label.strip() for label in labels.split(",") if label.strip()]
    predictions = detector(
        image,
        candidate_labels=candidate_labels,
        threshold=0.05,
    )
    # Draw each detected box and its label directly on the input image
    draw = ImageDraw.Draw(image)
    for prediction in predictions:
        box = prediction["box"]
        label = prediction["label"]
        score = prediction["score"]
        xmin, ymin, xmax, ymax = box.values()
        draw.rectangle((xmin, ymin, xmax, ymax), outline="red", width=1)
        draw.text((xmin, ymin), f"{label}: {round(score,2)}", fill="white")
    return image, predictions

demo = gr.Interface(
    fn=detect_zeroshot,
    inputs=[gr.Textbox(label="comma separated labels"), gr.Image(type="pil")],
    outputs=[gr.Image(type="pil"), gr.JSON()],
)

demo.launch()
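
Labels typed into the textbox may contain stray spaces, empty entries (for example from a trailing comma), or repeats. As a sketch, a hypothetical helper `parse_labels` (not part of Gradio or transformers) could normalize the string before it is passed to the detector as `candidate_labels`.

```python
def parse_labels(text):
    """Split a comma-separated string into clean, de-duplicated labels."""
    labels = []
    for part in text.split(","):
        label = part.strip()
        # Drop empty entries and repeated labels, keeping first-seen order
        if label and label not in labels:
            labels.append(label)
    return labels


parsed = parse_labels("human face, plaid shirt,, human face ")
print(parsed)
```

Keeping the parsing in its own function also makes it easy to check without loading the model.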