## Problem:
1. YOLO is not reliable in competition because of limited training data and unseen obstacles.
2. Things to look out for:
- False positives: dark shadows, lamp posts, people walking, and any boundaries in the parking lot.

- False negatives: anything depth-related (e.g., the ramp), low-light areas, and partially obscured obstacles (e.g., several cones clustered together).

- Unexpected obstacles appeared on the course that we were unprepared for (e.g., white/orange barricades, neon green/yellow cones).

## Project Objective:
1. Develop a generalizable object detection algorithm that takes advantage of existing foundation models.
2. This pipeline runs in parallel with the RANSAC method to identify obstacles and generate a binary mask.
3. The transformation from binary mask to occupancy grid will be handled downstream.
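
As a rough illustration of the "binary mask" output mentioned above, the sketch below rasterizes pixel-space bounding boxes into a mask. The helper name `boxes_to_mask` and the convention that 1 marks an obstacle are assumptions made for this example; the actual pipeline may instead derive the mask from segmentation output.

```python
import numpy as np


def boxes_to_mask(boxes_xyxy, height, width):
    """Rasterize [x1, y1, x2, y2] pixel boxes into a binary uint8 mask.

    Hypothetical helper: 1 marks pixels covered by at least one box.
    """
    mask = np.zeros((height, width), dtype=np.uint8)
    for x1, y1, x2, y2 in boxes_xyxy:
        # Round outward to whole pixels and clip to the image bounds.
        c0 = max(0, int(np.floor(x1)))
        r0 = max(0, int(np.floor(y1)))
        c1 = min(width, int(np.ceil(x2)))
        r1 = min(height, int(np.ceil(y2)))
        if c1 > c0 and r1 > r0:
            mask[r0:r1, c0:c1] = 1
    return mask


# Small example: one box on a 4x5 image.
example = boxes_to_mask([[1.2, 0.0, 3.0, 2.0]], height=4, width=5)
```

Rounding outward is a conservative choice for obstacle avoidance, since it never shrinks a detected region.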


# Task
Set up an object detection pipeline using Ultralytics YOLO to process your images or videos from Google Drive, aiming to detect obstacles and visualize the inference results for improved generalizability.

## Install Ultralytics

### Subtask:
Install the Ultralytics library, which includes YOLO, using pip. This step will ensure all necessary dependencies are met.


**Reasoning**:
The subtask is to install the Ultralytics library with opencv-python dependency. I will use a pip install command within a code block to achieve this.



In [1]:
!pip install ultralytics[opencv-python]
#!pip install torch torchvision==0.15.2 --extra-index-url https://download.pytorch.org/whl/cu118

Collecting ultralytics[opencv-python]
  Downloading ultralytics-8.3.223-py3-none-any.whl.metadata (37 kB)
Collecting ultralytics-thop>=2.0.18 (from ultralytics[opencv-python])
  Downloading ultralytics_thop-2.0.18-py3-none-any.whl.metadata (14 kB)
Downloading ultralytics_thop-2.0.18-py3-none-any.whl (28 kB)
Downloading ultralytics-8.3.223-py3-none-any.whl (1.1 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.1/1.1 MB 32.6 MB/s eta 0:00:00
Installing collected packages: ultralytics-thop, ultralytics
Successfully installed ultralytics-8.3.223 ultralytics-thop-2.0.18


## Import YOLO and Verify Installation

### Subtask:
Import the `YOLO` class from the `ultralytics` library and verify that the installation is working correctly by checking the library version.

**Reasoning**:
Now that the Ultralytics library is installed, I will import the YOLO class and print the Ultralytics version to verify the installation as described in the subtask.



In [2]:
from ultralytics import YOLO
import ultralytics

print("Ultralytics YOLO imported successfully.")
print(f"Ultralytics version: {ultralytics.__version__}")

Creating new Ultralytics Settings v0.0.6 file ✅ 
View Ultralytics Settings with 'yolo settings' or at '/root/.config/Ultralytics/settings.json'
Update Settings with 'yolo settings key=value', i.e. 'yolo settings runs_dir=path/to/dir'. For help see https://docs.ultralytics.com/quickstart/#ultralytics-settings.
Ultralytics YOLO imported successfully.
Ultralytics version: 8.3.223


## Mount Google Drive

### Subtask:
Mount your Google Drive to this Colab environment. This will allow the notebook to access images or videos stored in your Drive for object detection.

**Reasoning**:
To achieve the subtask of mounting Google Drive, I will use the `google.colab` library's `drive.mount()` function, which will prompt the user for authentication.



In [3]:
from google.colab import drive

drive.mount('/content/gdrive')


Mounted at /content/gdrive


## Load YOLO Model and Define Data Paths

### Subtask:
Load a pre-trained YOLO model and define the paths to your input images or videos and the desired output directory, both located within your mounted Google Drive. This sets up the model and specifies where to find and save your data.

#### Instructions:
1. Initialize a `YOLO` model using a pre-trained weight (e.g., `'yolov8n.pt'`).
2. Define the `input_path` variable to point to the directory in your Google Drive containing the images or videos you want to process.
3. Define the `output_path` variable to specify a directory in your Google Drive where the inference results will be saved. Ensure this directory exists or will be created by the model.
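
One way to make the "ensure this directory exists" step explicit is to check the input and create the output directory before running inference. This is a minimal sketch; `prepare_paths` is a hypothetical helper name, and the usage below runs on a throwaway temporary directory rather than the real Drive paths.

```python
import tempfile
from pathlib import Path


def prepare_paths(input_path, output_path):
    """Fail early if the input is missing; create the output directory if needed."""
    src = Path(input_path)
    if not src.exists():
        raise FileNotFoundError(f"Input not found: {src}")
    out = Path(output_path)
    out.mkdir(parents=True, exist_ok=True)  # no error if it already exists
    return src, out


# Usage on a temporary directory (stand-in for the mounted Drive).
with tempfile.TemporaryDirectory() as tmp:
    video = Path(tmp) / "clip.mp4"
    video.touch()
    src, out = prepare_paths(video, Path(tmp) / "out" / "run1")
    created = out.is_dir()
```

Failing early here gives a clearer error than discovering a bad path partway through inference.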

**Reasoning**:
Following the subtask instructions, I will initialize a YOLO model with a pre-trained weight and define placeholder paths for input data and output results within the mounted Google Drive.



In [5]:
model = YOLO('/content/gdrive/Shareddrives/STUDENT-Robotics | UMARV/2025-2026/Computer Vision/YOLO Weights/Copy of bestDriveableArea.pt')  # Load custom-trained weights from the shared drive

# Define paths within your Google Drive
# IMPORTANT: Replace these with your actual paths!
input_path = '/content/gdrive/Shareddrives/STUDENT-Robotics | UMARV/2025-2026/Computer Vision/CV Onboarding/Train-Checkpoint-CV.mp4'  # e.g., folder with images/videos
output_path = '/content/gdrive/MyDrive/out/'  # e.g., folder to save results

print(f"YOLO model loaded: {model.model.yaml.get('name', 'Unknown model')}")
print(f"Input path set to: {input_path}")
print(f"Output path set to: {output_path}")

YOLO model loaded: Unknown model
Input path set to: /content/gdrive/Shareddrives/STUDENT-Robotics | UMARV/2025-2026/Computer Vision/CV Onboarding/Train-Checkpoint-CV.mp4
Output path set to: /content/gdrive/MyDrive/out/


## Perform Object Detection (Inference)

### Subtask:
Run the YOLO model inference on the input images or videos located in your Google Drive and save the detection results, including annotated images/videos, to the specified output directory.

#### Instructions:
1. Use the `model.predict()` method to perform inference. Specify the `source` as your `input_path`, and pass `project=output_path` together with a `name` for the run subdirectory so that results are written under your output directory.
2. Consider adding arguments like `conf` for confidence thresholding and `save_txt` or `save_conf` if you need detailed detection results in text format.
3. Make sure to set `save=True` to save the annotated output.

**Reasoning**:
Following the subtask instructions, I will use the `model.predict()` method with the defined `input_path` and `output_path` to perform inference and save the annotated results. I will also include `conf` for setting a confidence threshold and `save_txt` to save the detection coordinates.



In [6]:
results = model.predict(
    source=input_path,
    save=True,
    save_txt=True,
    conf=0.25,  # Set a confidence threshold, e.g., 0.25
    project=output_path,
    name='inference_results' # Subdirectory within output_path for this run
)

print(f"Inference complete. Results saved to: {output_path.rstrip('/')}/inference_results")


inference results will accumulate in RAM unless `stream=True` is passed, causing potential out-of-memory
errors for large sources or long-running streams and videos. See https://docs.ultralytics.com/modes/predict/ for help.

Example:
    results = model(source=..., stream=True)  # generator of Results objects
    for r in results:
        boxes = r.boxes  # Boxes object for bbox outputs
        masks = r.masks  # Masks object for segment masks outputs
        probs = r.probs  # Class probabilities for classification outputs

video 1/1 (frame 1/617) /content/gdrive/Shareddrives/STUDENT-Robotics | UMARV/2025-2026/Computer Vision/CV Onboarding/Train-Checkpoint-CV.mp4: 384x640 1 driveable, 54.3ms
video 1/1 (frame 2/617) /content/gdrive/Shareddrives/STUDENT-Robotics | UMARV/2025-2026/Computer Vision/CV Onboarding/Train-Checkpoint-CV.mp4: 384x640 1 driveable, 13.9ms
video 1/1 (frame 3/617) /content/gdrive/Shareddrives/STUDENT-Robotics | UMARV/2025-2026/Computer Vision/CV Onboarding/Train-Ch
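
The warning above suggests `stream=True` for long videos so that results are not all held in RAM. A minimal sketch of that pattern is a function that consumes results one frame at a time and keeps only a running tally. The helper name `count_detections` is hypothetical; it assumes each result exposes `.boxes` whose items have a `.cls` tensor, as in the example printed by the warning.

```python
from collections import Counter


def count_detections(results, names):
    """Consume per-frame results lazily and tally detections per class name.

    `results` can be any iterable, including the generator returned when
    predicting with stream=True; `names` maps class IDs to class names.
    """
    totals = Counter()
    frames = 0
    for r in results:
        frames += 1
        for box in r.boxes:
            totals[names[int(box.cls.item())]] += 1
    return frames, totals


# Intended notebook usage (not run here; needs the model and input video):
# frames, totals = count_detections(
#     model.predict(source=input_path, stream=True, conf=0.25),
#     model.names,
# )
```

Because only counters are kept, memory use stays roughly constant regardless of video length.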

In [7]:
# Accessing bounding box information from the results object
for r in results:
    # r is a Results object for a single image or frame
    boxes = r.boxes  # Boxes object

    for box in boxes:
        # box is a Box object for a single detection
        class_id = box.cls.item()  # Class ID
        confidence = box.conf.item()  # Confidence score
        xyxy = box.xyxy[0].tolist()  # Bounding box coordinates in [x1, y1, x2, y2] format

        print(f"Detected object: Class ID {class_id}, Confidence: {confidence:.2f}, Bounding Box: {xyxy}")

        # You can also access other formats:
        # xywh = box.xywh[0].tolist() # [x_center, y_center, width, height]
        # xywhn = box.xywhn[0].tolist() # Normalized [x_center, y_center, width, height]

Detected object: Class ID 0.0, Confidence: 0.40, Bounding Box: [0.0, 118.16436767578125, 1277.439453125, 720.0]
Detected object: Class ID 0.0, Confidence: 0.44, Bounding Box: [0.0, 115.88153076171875, 1277.47900390625, 719.587646484375]
Detected object: Class ID 0.0, Confidence: 0.44, Bounding Box: [0.0, 116.72137451171875, 1277.4969482421875, 719.4033813476562]
Detected object: Class ID 0.0, Confidence: 0.44, Bounding Box: [0.0, 115.0098876953125, 1277.4794921875, 719.537109375]
Detected object: Class ID 0.0, Confidence: 0.45, Bounding Box: [0.0, 115.61968994140625, 1277.460693359375, 719.5526123046875]
Detected object: Class ID 0.0, Confidence: 0.44, Bounding Box: [0.0, 115.646240234375, 1277.4688720703125, 719.741455078125]
Detected object: Class ID 0.0, Confidence: 0.43, Bounding Box: [0.0, 116.09521484375, 1277.48291015625, 719.7427978515625]
Detected object: Class ID 0.0, Confidence: 0.44, Bounding Box: [0.0, 116.5897216796875, 1277.467529296875, 719.7259521484375]
Detected objec
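
The alternative formats mentioned in the comments (`xywh`, `xywhn`) are related to `xyxy` by simple arithmetic. The sketch below converts a normalized `[x_center, y_center, width, height]` box back to pixel corners. The function name `xywhn_to_xyxy` is hypothetical, and the 1280x720 frame size is an assumption suggested by the box extents printed above.

```python
def xywhn_to_xyxy(xywhn, img_w, img_h):
    """Convert normalized [x_center, y_center, width, height] to pixel [x1, y1, x2, y2]."""
    xc, yc, w, h = xywhn
    # Scale the normalized values back to pixels.
    xc, w = xc * img_w, w * img_w
    yc, h = yc * img_h, h * img_h
    # Move from center/size to corner coordinates.
    return [xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2]


# A box centered in the frame and covering half of each dimension.
corners = xywhn_to_xyxy([0.5, 0.5, 0.5, 0.5], img_w=1280, img_h=720)
```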

**Reasoning**:
The previous `model.predict()` call failed because the specified `input_path` (`/content/gdrive/MyDrive/YOLO_Data/images`) does not exist on Google Drive. To resolve this, I will update the `input_path` to point to an existing video file found in the Google Drive (e.g., `/content/gdrive/MyDrive/2025-05-11 23-08-41.mkv`) for a runnable example. This will allow the inference step to proceed, but I will explicitly remind the user to replace this path with their actual data source, which could be a specific image, video, or a directory containing multiple images/videos.



**Reasoning**:
The `input_path` has been corrected to point to an existing video file. I will now re-attempt the object detection inference using the `model.predict()` method with the updated `input_path` and the previously defined `output_path`, including the `conf` and `save_txt` arguments to ensure the subtask of performing and saving the detection results is completed.



## Visualize Results

### Subtask:
Display the annotated video or images generated by the YOLO inference directly within the notebook to review the detection results.


**Reasoning**:
To display the inference results, I need to find the generated video file within the specified output directory. I will construct the full path to the output directory and use `glob` to locate the video file. If a video is found, I'll use `IPython.display.Video` to show it in the notebook. Since the previous step was processing a video, I will primarily look for video files. Also, it is common for Ultralytics to put results in a subdirectory like 'predict' or 'predictN', so I will try to search within those directories as well.
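
The search step described above could look roughly like the following. `find_output_videos` and the suffix list are assumptions for illustration; the search is recursive so that run subdirectories such as `predict` or `inference_results` are covered without hard-coding their names.

```python
from pathlib import Path

# Assumed set of container formats to look for.
VIDEO_SUFFIXES = {".avi", ".mp4", ".mkv", ".mov"}


def find_output_videos(output_dir):
    """Return video files under output_dir (searched recursively), newest first."""
    root = Path(output_dir)
    videos = [
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in VIDEO_SUFFIXES
    ]
    return sorted(videos, key=lambda p: p.stat().st_mtime, reverse=True)


# Intended notebook usage (not run here; assumes IPython, as in Colab):
# from IPython.display import Video, display
# videos = find_output_videos(output_path)
# if videos:
#     display(Video(str(videos[0]), embed=True))
```

If the saved file is an AVI, the browser may not be able to play it inline, in which case converting it to MP4 first is one option.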



## Final Task

### Subtask:
Summarize the setup, inference, and visualization process, providing instructions on how to interpret the results and potential next steps for further analysis or fine-tuning.
