<a href="https://colab.research.google.com/github/dionny/ai-tutorial-notebooks/blob/main/chopper.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Element Chopping Using Object Detection

This notebook uses object detection to chop UI elements out of a screenshot, working from purely visual input.

After running the next cell, use the widget to upload a screenshot from your local file system. It may be a browser screenshot of a website or a screenshot from a mobile device, and it must be a PNG file.

In [None]:
!rm -rf helpers
!git clone https://github.com/dionny/ai-tutorial-notebooks.git helpers
from helpers import notebook_utils

import ipywidgets

uploader = ipywidgets.FileUpload(
    accept='.png',
    multiple=False,
)

display(uploader)

Visualize the uploaded screenshot.

In [None]:
image = notebook_utils.get_uploaded_image(uploader)
display(ipywidgets.Image(value=notebook_utils.convert_image_to_bytes(image), width='25%'))

Send the uploaded image to the object detection service to chop the image. Please be patient: this request can take up to 30 seconds.

In [None]:
import io
import requests

from IPython.display import Markdown

host = 'https://deep-vision.dionny.dev'
url = f'{host}/deep_vision/'

# Serialize the uploaded image to an in-memory PNG buffer
# (requests builds the multipart/form-data body from it below)
buffer = io.BytesIO()
image.save(buffer, format='PNG')
buffer.seek(0)

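# Send request to the deep vision service. The timeout (in seconds) keeps the
# cell from hanging indefinitely if the service does not respond.
files = {'file': ('image.png', buffer, 'image/png')}
response = requests.post(url, files=files, timeout=60)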

# Display success or failure message
message = 'Request succeeded.'
if response.status_code != 200:
    message = f'Request failed with status code {response.status_code}.'
display(Markdown(message))
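
The request above can also be wrapped in a small helper that applies a timeout and raises on HTTP errors instead of checking `status_code` by hand. This is only a sketch: `post_png` and its `post` parameter are hypothetical names, not part of `notebook_utils` or the service, and the 60-second default is an assumption chosen to leave room above the roughly 30 seconds mentioned earlier.

```python
import io


def post_png(png_bytes, url, post=None, timeout=60):
    """Upload PNG bytes as multipart/form-data and return the parsed JSON.

    `post` is a hypothetical injection point (useful for stubbing in tests);
    when omitted, the helper falls back to `requests.post`.
    """
    if post is None:
        import requests  # imported lazily so a stub can be used without it
        post = requests.post

    # Same field name, file name, and content type as the cell above.
    files = {'file': ('image.png', io.BytesIO(png_bytes), 'image/png')}
    response = post(url, files=files, timeout=timeout)

    # Raises an exception for 4xx/5xx responses when using requests.
    response.raise_for_status()
    return response.json()
```

In this notebook it could be called as `post_png(buffer.getvalue(), url)`, reusing the `buffer` and `url` defined in the cell above.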

The service uses object detection to identify elements in the screenshot. Run the cell below to view the JSON response.

In [None]:
import json
json_response = response.json()
print(json.dumps(json_response, indent=2))
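
Judging by the fields the display cell reads (`name`, `confidence`, and a `box` with `x1`, `y1`, `x2`, `y2`), the response appears to be a list of detections. The sketch below uses made-up sample values in that shape, not real service output, and a hypothetical helper `filter_by_confidence` to show how low-confidence detections might be dropped before display. The 0.5 threshold is an arbitrary assumption.

```python
# Made-up sample in the shape the display cell expects; labels and numbers
# are illustrative only and do not come from the service.
sample_response = [
    {'name': 'button', 'confidence': 0.91,
     'box': {'x1': 10, 'y1': 20, 'x2': 110, 'y2': 60}},
    {'name': 'icon', 'confidence': 0.35,
     'box': {'x1': 200, 'y1': 15, 'x2': 232, 'y2': 47}},
    {'name': 'text', 'confidence': 0.97,
     'box': {'x1': 10, 'y1': 80, 'x2': 300, 'y2': 100}},
]


def filter_by_confidence(elements, threshold=0.5):
    """Keep detections at or above the threshold, highest confidence first."""
    kept = [e for e in elements if e['confidence'] >= threshold]
    return sorted(kept, key=lambda e: e['confidence'], reverse=True)


for element in filter_by_confidence(sample_response):
    print(f"{element['name']}: {element['confidence']:.2f}")
```

Applied to the real response, `filter_by_confidence(json_response)` would return a shorter list that the display cell can iterate over in the same way.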

Run the cell below to display the chopped elements and screenshots with bounding boxes.

In [None]:
import ipywidgets
from IPython.display import display

elements = json_response

# Create a grid to display screenshots with bounding boxes, the detected element name, and confidence.
num_elements = len(elements)
num_columns = 3  # Screenshot, type, and confidence
grid_gap = '30px'
title_grid = ipywidgets.GridspecLayout(1, num_columns, grid_gap=grid_gap)
grid = ipywidgets.GridspecLayout(num_elements, num_columns, grid_gap=grid_gap)

# Populate column titles.
column_titles = ['Screenshot with Bounding Box', 'Type', 'Confidence']
for i, column_title in enumerate(column_titles):
    title_grid[0, i] = ipywidgets.HTML(value=f'<h1>{column_title}</h1>')

# Populate grid.
for i, element in enumerate(elements):
    name = element['name']
    confidence = f"{element['confidence']:.2f}"
    box = element['box']
    top_left = (box['x1'], box['y1'])
    bottom_right = (box['x2'], box['y2'])

    screenshot_with_bounding_box = notebook_utils.draw_bounding_box(image, top_left, bottom_right)

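    grid[i, 0] = ipywidgets.Image(
        value=notebook_utils.convert_image_to_bytes(screenshot_with_bounding_box),
        # max_width is a Layout property, not an Image trait, so set it via layout.
        layout=ipywidgets.Layout(max_width='50%'),
    )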
    grid[i, 1] = ipywidgets.HTML(value=f'<h2>{name}</h2>')
    grid[i, 2] = ipywidgets.HTML(value=f'<h2>{confidence}</h2>')

# Display grid.
display(title_grid)
display(grid)
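
The grid above draws each bounding box on the full screenshot. To chop an element out as its own image, the box can first be converted into a crop rectangle. The sketch below is one way to do that under stated assumptions: `crop_box` is a hypothetical helper, and it assumes the box coordinates are pixel values in the screenshot's coordinate space, which the source does not confirm.

```python
import math


def crop_box(box, image_width, image_height):
    """Convert a detection box to a (left, upper, right, lower) tuple.

    Coordinates are expanded outward to whole pixels and clamped to the
    image bounds. Returns None if the clamped box has no area.
    """
    left = max(0, min(math.floor(box['x1']), image_width))
    upper = max(0, min(math.floor(box['y1']), image_height))
    right = max(0, min(math.ceil(box['x2']), image_width))
    lower = max(0, min(math.ceil(box['y2']), image_height))

    if right <= left or lower <= upper:
        return None
    return (left, upper, right, lower)
```

Assuming `image` is a PIL image (as the earlier `image.save(buffer, format='PNG')` call suggests), an element could then be cropped with `image.crop(crop_box(element['box'], *image.size))` after checking that `crop_box` did not return `None`.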
