<a href="https://colab.research.google.com/github/JibbyGeorge-DB/HuggingFace/blob/main/HuggingFace_Pipeline_OD.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Object Detection with a Hugging Face Open-Source Model and Gradio

In [None]:
import requests
from PIL import Image
from io import BytesIO
from transformers import pipeline

url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/cat.jpg"

# Download the image and fail early if the request did not succeed.
response = requests.get(url)
response.raise_for_status()
# Wrap the downloaded bytes in an in-memory stream, open it as a PIL (Pillow) image,
# and convert it to RGB so the model always receives three color channels.
raw_image = Image.open(BytesIO(response.content)).convert("RGB")
# raw_image  # uncomment to display the image in the notebook

In [None]:

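# Load the DETR object-detection model (weights are downloaded on first use).
od_pipe = pipeline("object-detection", model="facebook/detr-resnet-101")
# Run detection; the result is a list with one dict per detected object.
pipeline_output = od_pipe(raw_image)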

In [None]:
pipeline_output

[{'score': 0.9987227320671082,
  'label': 'cat',
  'box': {'xmin': 79, 'ymin': 13, 'xmax': 532, 'ymax': 431}}]
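
Each entry in the output is a plain dict, so it can be filtered and post-processed with ordinary Python. The following is a minimal sketch, not part of the original notebook: `summarize_detections` and its `min_score` parameter are hypothetical names introduced for illustration, and the sample data is copied from the output above so the snippet runs without loading the model.

```python
# Sample output copied from the cell above, so this sketch runs without the model.
sample_output = [
    {"score": 0.9987227320671082,
     "label": "cat",
     "box": {"xmin": 79, "ymin": 13, "xmax": 532, "ymax": 431}},
]

def summarize_detections(detections, min_score=0.9):
    """Hypothetical helper: keep confident detections and report box size in pixels."""
    summary = []
    for det in detections:
        # Skip detections below the (assumed) confidence threshold.
        if det["score"] < min_score:
            continue
        box = det["box"]
        summary.append({
            "label": det["label"],
            "score": round(det["score"], 3),
            "width": box["xmax"] - box["xmin"],
            "height": box["ymax"] - box["ymin"],
        })
    return summary

print(summarize_detections(sample_output))
```

The same width and height arithmetic is what a drawing routine needs when it turns corner coordinates into a rectangle.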

Visualize Results (Draw Bounding Boxes)
An object detection pipeline returns a list of detections, each with a label, a confidence score, and the pixel coordinates of a bounding box (xmin, ymin, xmax, ymax). The render_results_in_image helper below draws these boxes on the image manually with matplotlib:

In [None]:
import matplotlib.pyplot as plt
import matplotlib.patches as patches
from PIL import Image
from io import BytesIO

def render_results_in_image(pil_img, results):
    fig, ax = plt.subplots(1, figsize=(10, 6))
    ax.imshow(pil_img)

    for res in results:
        # 1. Get coordinates from the result
        box = res['box']
        label = res['label']
        score = res['score']

        # 2. Draw the rectangle (xmin, ymin, width, height)
        rect = patches.Rectangle(
            (box['xmin'], box['ymin']),
            box['xmax'] - box['xmin'],
            box['ymax'] - box['ymin'],
            linewidth=2, edgecolor='red', facecolor='none'
        )
        ax.add_patch(rect)

        # 3. Add the label text
        ax.text(box['xmin'], box['ymin'], f"{label} {score:.2f}",
                bbox=dict(facecolor='yellow', alpha=0.5))

    ax.axis('off')

    # Render the figure into an in-memory PNG and return it as a PIL image.
    buf = BytesIO()
    fig.savefig(buf, format='png', bbox_inches='tight', pad_inches=0)
    buf.seek(0)
    plt.close(fig)  # Close the figure so the notebook does not display it a second time
    return Image.open(buf).convert("RGB")

In [None]:
render_results_in_image(raw_image, pipeline_output)
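
Because the matplotlib helper re-renders the figure, the returned image generally does not keep the original pixel dimensions. As a hedged alternative, and not the notebook's own method, the boxes can be drawn directly on a copy of the image with Pillow's `ImageDraw`. The function name `draw_boxes_with_pil` and the blank test image are assumptions made for this sketch.

```python
from PIL import Image, ImageDraw

def draw_boxes_with_pil(pil_img, results, color="red", width=3):
    """Hypothetical alternative: draw boxes on a copy, keeping the original image size."""
    annotated = pil_img.copy()  # leave the input image untouched
    draw = ImageDraw.Draw(annotated)
    for res in results:
        box = res["box"]
        # Pillow takes the two opposite corners, so no width/height arithmetic is needed.
        draw.rectangle(
            [box["xmin"], box["ymin"], box["xmax"], box["ymax"]],
            outline=color, width=width,
        )
        # Label drawn just inside the top-left corner using Pillow's default font.
        draw.text((box["xmin"] + 4, box["ymin"] + 4),
                  f"{res['label']} {res['score']:.2f}", fill=color)
    return annotated

# Demo on a blank image with a detection shaped like the pipeline output.
blank = Image.new("RGB", (640, 480), "white")
fake_results = [{"score": 0.99, "label": "cat",
                 "box": {"xmin": 79, "ymin": 13, "xmax": 532, "ymax": 431}}]
annotated = draw_boxes_with_pil(blank, fake_results)
print(annotated.size)
```

In the notebook, `draw_boxes_with_pil(raw_image, pipeline_output)` would be the equivalent call, under the same assumptions.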

In [None]:
import gradio as gr

def get_pipeline_prediction(input_image):
    """Run object detection on the uploaded image and return it with boxes drawn."""
    detections = od_pipe(input_image)
    processed_image = render_results_in_image(input_image, detections)
    return processed_image

demo = gr.Interface(
    fn=get_pipeline_prediction,
    inputs=gr.Image(label="Input image", type="pil"),
    outputs=gr.Image(label="Output image with predicted instances", type="pil"),
)

In [None]:
# share=True creates a temporary public URL; debug=True keeps the cell running and prints errors.
demo.launch(debug=True, share=True)