# Install required libraries

In [None]:
!pip install ultralytics

# Load YOLOv8

In [None]:
from ultralytics import YOLO
import cv2

# Load a pre-trained YOLOv8 model
model = YOLO('yolov8n.pt') # 'n' is for the nano version, the smallest and fastest



In [None]:


# Load other pre-trained YOLOv8 models

# Load the small version of YOLOv8
# model_s = YOLO('yolov8s.pt')

# Load the medium version of YOLOv8
# model_m = YOLO('yolov8m.pt')

# Load the large version of YOLOv8
# model_l = YOLO('yolov8l.pt')

# Load the extra large version of YOLOv8
model_x = YOLO('yolov8x.pt')

# Use the extra-large model for the rest of the notebook.
# To try another size, uncomment its line above and assign that variable instead.
model = model_x


In [None]:
# List files in the working directory with human-readable sizes (KB / MB / GB)

!ls -lah

# Run prediction

In [None]:
# Path to your image
image_path = '/content/photo_2025-07-13 11.06.57.jpeg'

# Run inference on the image
results = model(image_path)

# The 'results' object contains the detections.
# We can visualize them directly.
# The plot() method returns a NumPy array of the image with detections.
annotated_image = results[0].plot()



# Review the results

In [None]:
results

This output from YOLOv8 describes **two objects** that your model detected in an image. Both objects were identified as belonging to the **same class (class `0`)**, but with low confidence scores.

Here is a breakdown of what each attribute means.

***

### Core Detection Information

* **`shape: torch.Size([2, 6])`**
    This tells you the dimensions of the main results tensor. It means there are **2** detected objects, and for each object, there are **6** associated values.

* **`data`**: `tensor([[333.52, 1083.7, 354.98, 1142.0, 0.39918, 0.0], [361.72, 1082.8, 383.89, 1152.8, 0.31925, 0.0]])`
    This is the raw data for the two detections. Each row `[x1, y1, x2, y2, confidence, class]` represents one object.
    * **Detection 1**: Bounding box from roughly `(334, 1084)` to `(355, 1142)`, confidence `0.399`, class `0`.
    * **Detection 2**: Bounding box from roughly `(362, 1083)` to `(384, 1153)`, confidence `0.319`, class `0`.

* **`cls: tensor([0., 0.])`**
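    This tensor lists the **class index** for each of the two detected objects. Both were classified as `0`. For the COCO-pretrained weights loaded in this notebook, index `0` is 'person'; for a custom-trained model it is the first class name in the dataset's `.yaml` file. The full index-to-name mapping is available as `results[0].names`.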

* **`conf: tensor([0.3992, 0.3192])`**
    This shows the **confidence score** for each detection.
    * The first object has a confidence of **~40%**.
    * The second object has a confidence of **~32%**.
    These are relatively low scores, suggesting the model is not very certain about these detections.
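
The row layout described above can be unpacked in a few lines. The following is a minimal sketch in plain Python that reuses the values printed in `data`. The `parse_detections` helper, the `names` mapping, and the `0.35` threshold are assumptions chosen for illustration, not part of the YOLOv8 API.

```python
# Rows copied from the `data` tensor above: [x1, y1, x2, y2, confidence, class]
rows = [
    [333.52, 1083.7, 354.98, 1142.0, 0.39918, 0.0],
    [361.72, 1082.8, 383.89, 1152.8, 0.31925, 0.0],
]

# Assumption: index-to-name mapping written by hand for this illustration.
names = {0: "person"}


def parse_detections(rows, names, min_conf=0.0):
    """Turn raw detection rows into dictionaries, skipping low-confidence rows."""
    detections = []
    for x1, y1, x2, y2, conf, cls in rows:
        if conf < min_conf:
            continue
        class_id = int(cls)
        detections.append({
            "box": (x1, y1, x2, y2),
            "confidence": conf,
            # Fall back to a generic label when the index is not in the mapping
            "label": names.get(class_id, f"class_{class_id}"),
        })
    return detections


all_detections = parse_detections(rows, names)
confident = parse_detections(rows, names, min_conf=0.35)  # hypothetical threshold

for det in all_detections:
    print(f"{det['label']}: {det['confidence']:.0%} at {det['box']}")
```

With a real result object, the same rows should be obtainable with `results[0].boxes.data.tolist()`, and the mapping with `results[0].names`.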

***

### Bounding Box Formats 📦

YOLO provides the bounding box coordinates in multiple convenient formats. All values are in pixels unless the attribute name ends with `n` (normalized).

* **`xyxy`**: `tensor([[ 333.52, 1083.67, 354.97, 1142.00], [ 361.71, 1082.81, 383.89, 1152.81]])`
    The bounding box coordinates in `[x_min, y_min, x_max, y_max]` format. `(x_min, y_min)` is the top-left corner and `(x_max, y_max)` is the bottom-right corner.

* **`xywh`**: `tensor([[ 344.24, 1112.83, 21.45, 58.32], [ 372.80, 1117.81, 22.17, 70.00]])`
    The bounding box in `[x_center, y_center, width, height]` format.

* **`xyxyn`** and **`xywhn`**:
    These are the **normalized** versions of the formats above. The coordinates are scaled to the range `[0, 1]` by dividing x values and widths by the original image width, and y values and heights by the original image height. This makes them independent of the image resolution.
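
The relationship between these formats is simple arithmetic. The sketch below recomputes `xywh` and `xyxyn` from the `xyxy` values shown above, assuming the `orig_shape` of `(1280, 960)` reported for this image. The helper names `xyxy_to_xywh` and `normalize_xyxy` are hypothetical and exist only for this illustration.

```python
def xyxy_to_xywh(box):
    """Convert (x_min, y_min, x_max, y_max) to (x_center, y_center, width, height)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1)


def normalize_xyxy(box, orig_shape):
    """Scale pixel coordinates to [0, 1]; orig_shape is (height, width)."""
    height, width = orig_shape
    x1, y1, x2, y2 = box
    return (x1 / width, y1 / height, x2 / width, y2 / height)


orig_shape = (1280, 960)  # (height, width), as reported by orig_shape
boxes_xyxy = [
    (333.52, 1083.67, 354.97, 1142.00),
    (361.71, 1082.81, 383.89, 1152.81),
]

boxes_xywh = [xyxy_to_xywh(box) for box in boxes_xyxy]
boxes_xyxyn = [normalize_xyxy(box, orig_shape) for box in boxes_xyxy]

for xywh, xyxyn in zip(boxes_xywh, boxes_xyxyn):
    print("xywh: ", [round(v, 2) for v in xywh])
    print("xyxyn:", [round(v, 4) for v in xyxyn])
```

The recomputed `xywh` values agree with the tensor above to within rounding of the printed coordinates.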

***

### Image & Tracking Information

* **`orig_shape: (1280, 960)`**
    The dimensions of the original input image: **1280 pixels in height** and **960 pixels in width**.

* **`id: None`** and **`is_track: False`**
    These attributes are used for object tracking. Since `id` is `None`, it means you ran a standard prediction, not object tracking. The `id` would otherwise assign a unique, consistent ID to each object across multiple frames.

In [None]:
results[0].boxes

In [None]:
annotated_image

# Display the image

In [None]:

import io

import cv2
from IPython.display import Image, display
from PIL import Image as PILImage

# plot() returns a BGR NumPy array (OpenCV convention); convert to RGB for PIL
annotated_image_rgb = cv2.cvtColor(annotated_image, cv2.COLOR_BGR2RGB)

# Encode the image as PNG in memory and display it inline in Colab
pil_img = PILImage.fromarray(annotated_image_rgb)
byte_arr = io.BytesIO()
pil_img.save(byte_arr, format='PNG')  # or 'JPEG'
display(Image(byte_arr.getvalue()))