# 🚀 **Waldo AI: The Ultimate Hide-and-Seek Showdown**

[![Waldo AI](https://github.com/kaopanboonyuen/kaopanboonyuen.github.io/raw/main/files/logo/NAC2025_WaldoAI.png)](https://github.com/kaopanboonyuen/where-is-waldo)

## By Kao Panboonyuen

### This Colab notebook will guide you through:

* ✅ Preparing and loading the dataset
* ✅ Exploring images and labels
* ✅ Performing exploratory data analysis (EDA)
* ✅ Training a small AI model for object detection
* ✅ Evaluating performance with accuracy, confusion matrix, precision, recall, F1-score
* ✅ Performing inference and error analysis

### 📌 **Step 1: Install & Import Required Libraries**

In this step, we'll install the libraries needed to train YOLOv12: the `ultralytics` package, which provides the YOLO models, and PyTorch, which enables GPU acceleration.

In [None]:
!pip install torch==2.0.0+cu117 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117
!pip install ultralytics

In [None]:
!pip install --upgrade torchvision

In [None]:
import os
import shutil
import zipfile
import matplotlib.pyplot as plt
import cv2
from sklearn.metrics import precision_recall_fscore_support, confusion_matrix
import seaborn as sns

import warnings
warnings.filterwarnings("ignore")

import locale
locale.getpreferredencoding = lambda: "UTF-8"

In [None]:
import torch  # Required for the CUDA checks below

# Check if CUDA (GPU support) is available for PyTorch and print the result (True/False).
print(torch.cuda.is_available())

# Print the number of CUDA-compatible devices (GPUs) available on the system.
print(torch.cuda.device_count())

# Print the name of the first CUDA-compatible device (GPU) available (index 0).

In [None]:
# Write your code here

In [None]:
# Importing the PyTorch library, which provides tools for machine learning and deep learning.

In [None]:
# Write your code here

In [None]:
# Importing the YOLO (You Only Look Once) model from the 'ultralytics' package for object detection tasks.

In [None]:
# Write your code here

### 📌 **Step 2: Download and Prepare the Dataset**

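Next, we will download the dataset from the GitHub repository. Dataset B is split into five parts, so we download each part, reassemble them into a single ZIP file, and extract it.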

In [None]:
# # # Optional Dataset (Type: A)

# # Define the global variables for the developer name and repository name
# DEV_NAME = 'kaopanboonyuen'
# REPO_NAME = 'where-is-waldo'

# # Use f-string to create the dynamic URL for the dataset
# url = f'https://github.com/{DEV_NAME}/{REPO_NAME}/raw/main/dataset/waldo-dataset-a.zip'

# # Download the zip file using wget
# # !wget https://github.com/{DEV_NAME}/{REPO_NAME}/raw/main/dataset/waldo-dataset-a.zip -O waldo-dataset-a.zip
# !wget {url} -O waldo-dataset-a.zip

# # Unzip the downloaded file
# # !unzip /content/waldo-dataset-a.zip >> logs.log
# !unzip /content/waldo-dataset-a.zip >> logs.log

In [None]:
# Define the global variables for the developer name and repository name
DEV_NAME = # Write your code here
REPO_NAME = # Write your code here

# Use f-string to create the dynamic URLs
url_part1 = f'https://github.com/{DEV_NAME}/{REPO_NAME}/raw/main/dataset/waldo-dataset-b.zip.part1'
url_part2 = f'https://github.com/{DEV_NAME}/{REPO_NAME}/raw/main/dataset/waldo-dataset-b.zip.part2'
url_part3 = f'https://github.com/{DEV_NAME}/{REPO_NAME}/raw/main/dataset/waldo-dataset-b.zip.part3'
url_part4 = f'https://github.com/{DEV_NAME}/{REPO_NAME}/raw/main/dataset/waldo-dataset-b.zip.part4'
url_part5 = f'https://github.com/{DEV_NAME}/{REPO_NAME}/raw/main/dataset/waldo-dataset-b.zip.part5'

# Use wget with the generated URLs for each part
!wget {url_part1} -O waldo-dataset-b.zip.part1
!wget {url_part2} -O waldo-dataset-b.zip.part2
!wget {url_part3} -O waldo-dataset-b.zip.part3
!wget {url_part4} -O waldo-dataset-b.zip.part4
!wget {url_part5} -O waldo-dataset-b.zip.part5

In [None]:
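def reassemble_zip(parts, output_file):
    """Concatenate the split parts, in the order given, into a single ZIP file."""
    with open(output_file, 'wb') as output_zip:
        for part in parts:
            with open(part, 'rb') as part_file:
                # Stream each part in chunks instead of loading it fully into memory
                shutil.copyfileobj(part_file, output_zip)
    print(f"Reassembled to {output_file}.")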

parts = ['waldo-dataset-b.zip.part1', 'waldo-dataset-b.zip.part2',
         'waldo-dataset-b.zip.part3', 'waldo-dataset-b.zip.part4', 'waldo-dataset-b.zip.part5']  # List of split parts
output_file = 'waldo-dataset-b.zip'  # Output ZIP file

In [None]:
# Call reassemble_zip to combine the split parts into a single ZIP file.

# Write your code here
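
Before extracting, it can help to confirm that the reassembled archive is intact. The following is a minimal, self-contained sketch that uses only the standard library: it builds a tiny demo archive, splits it, joins the parts again, and checks the result. The helper names (`split_file`, `join_parts`, `zip_is_intact`) and the demo file names are hypothetical and are not part of the notebook or the dataset.

```python
import os
import tempfile
import zipfile

def split_file(path, num_parts):
    """Split a file into num_parts byte chunks named <path>.part1, <path>.part2, ..."""
    with open(path, 'rb') as f:
        data = f.read()
    chunk = -(-len(data) // num_parts)  # ceiling division
    part_paths = []
    for i in range(num_parts):
        part_path = f"{path}.part{i + 1}"
        with open(part_path, 'wb') as part:
            part.write(data[i * chunk:(i + 1) * chunk])
        part_paths.append(part_path)
    return part_paths

def join_parts(part_paths, output_file):
    """Concatenate the parts, in order, into output_file."""
    with open(output_file, 'wb') as out:
        for part_path in part_paths:
            with open(part_path, 'rb') as part:
                out.write(part.read())

def zip_is_intact(path):
    """Return True if path is a readable ZIP whose members all pass their CRC check."""
    if not zipfile.is_zipfile(path):
        return False
    with zipfile.ZipFile(path) as zf:
        return zf.testzip() is None  # testzip() returns the first bad member, or None

with tempfile.TemporaryDirectory() as tmp:
    # Build a small demo archive (hypothetical content, not the Waldo dataset)
    original = os.path.join(tmp, 'demo.zip')
    with zipfile.ZipFile(original, 'w') as zf:
        zf.writestr('hello.txt', 'hello waldo ' * 100)

    parts = split_file(original, 3)
    rebuilt = os.path.join(tmp, 'rebuilt.zip')
    join_parts(parts, rebuilt)

    rebuilt_ok = zip_is_intact(rebuilt)       # all parts joined in order
    first_part_ok = zip_is_intact(parts[0])   # a single part is not a valid archive

print("Rebuilt archive intact:", rebuilt_ok)
print("First part alone intact:", first_part_ok)
```

In the notebook, the same kind of check could be run on `waldo-dataset-b.zip` after reassembly; a failed check usually points to a missing part or a wrong part order.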

In [None]:
with zipfile.ZipFile('write-your-file-name.zip', 'r') as zip_ref:
    zip_ref.extractall('/content/')

### 📌 **Step 3: Organize Data in YOLOv12 Format**

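YOLOv12 expects each dataset split to contain an `images` folder and a matching `labels` folder, where every image has a `.txt` label file with the same base name. We'll move the extracted `train` and `valid` splits into `/content/dataset` in that layout.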

In [None]:
os.makedirs('/content/dataset', exist_ok=True)
os.makedirs('/content/dataset/train', exist_ok=True)
os.makedirs('/content/dataset/valid', exist_ok=True)

shutil.move('/content/waldo-dataset-b/train/images', '/content/dataset/train/images')
shutil.move('/content/waldo-dataset-b/train/labels', '/content/dataset/train/labels')
shutil.move('/content/waldo-dataset-b/valid/images', '/content/dataset/valid/images')
shutil.move('/content/waldo-dataset-b/valid/labels', '/content/dataset/valid/labels')
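
After moving the folders, a quick consistency check can reveal images and labels that do not pair up. This is a minimal sketch under the assumption that images and labels share a base name; the helper `find_unpaired` and the demo file names are hypothetical. It runs on a temporary folder here so that it stays self-contained.

```python
import os
import tempfile

def find_unpaired(images_dir, labels_dir, image_exts=('.jpg', '.jpeg', '.png')):
    """Return (images without a label file, label files without an image), by base name."""
    image_stems = {
        os.path.splitext(name)[0]
        for name in os.listdir(images_dir)
        if name.lower().endswith(image_exts)
    }
    label_stems = {
        os.path.splitext(name)[0]
        for name in os.listdir(labels_dir)
        if name.endswith('.txt')
    }
    return sorted(image_stems - label_stems), sorted(label_stems - image_stems)

with tempfile.TemporaryDirectory() as tmp:
    images_dir = os.path.join(tmp, 'images')
    labels_dir = os.path.join(tmp, 'labels')
    os.makedirs(images_dir)
    os.makedirs(labels_dir)

    # Demo files: 'a' is paired, 'b' has no label, 'c' has no image
    for name in ('a.jpg', 'b.jpg'):
        open(os.path.join(images_dir, name), 'wb').close()
    for name in ('a.txt', 'c.txt'):
        open(os.path.join(labels_dir, name), 'w').close()

    missing_labels, orphan_labels = find_unpaired(images_dir, labels_dir)

print("Images without labels:", missing_labels)
print("Labels without images:", orphan_labels)
```

In the notebook, the same function could be pointed at `/content/dataset/train/images` and `/content/dataset/train/labels` to list any mismatches worth inspecting.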

### 📌 **Step 4: Preview Dataset with Random Image Samples**

In [None]:
import random
import os
import cv2
import numpy as np
from matplotlib import pyplot as plt

def show_random_label_overlay(train_images_dir, train_labels_dir, num_images=4):
    """
    Show random images with their label boxes overlaid in a grid (1 row, num_images columns).

    :param train_images_dir: Path to the directory containing the images
    :param train_labels_dir: Path to the directory containing the labels
    :param num_images: Number of random images to show in the grid (default is 4)
    """
    fig, axs = plt.subplots(1, num_images, figsize=(15, 5))

    for i in range(num_images):
        # Randomly select an image and its corresponding label
        image_file = random.choice(os.listdir(train_images_dir))
        label_file = os.path.splitext(image_file)[0] + '.txt'  # Labels are .txt files with the same base name as the image, whatever the image extension

        # Load the image
        image = cv2.imread(os.path.join(train_images_dir, image_file))
        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

        # Load the label file
        label_path = os.path.join(train_labels_dir, label_file)
        with open(label_path, 'r') as file:
            labels = file.readlines()

        # Loop through each label and draw the bounding box on the image
        for label in labels:
            parts = label.strip().split()
            class_id = int(parts[0])
            x_center, y_center, width, height = map(float, parts[1:])

            # Convert normalized coordinates to pixel values
            img_height, img_width, _ = image.shape
            x_center = int(x_center * img_width)
            y_center = int(y_center * img_height)
            width = int(width * img_width)
            height = int(height * img_height)

            # Calculate the top-left and bottom-right corners of the bounding box
            x1 = x_center - width // 2
            y1 = y_center - height // 2
            x2 = x_center + width // 2
            y2 = y_center + height // 2

            # Draw the bounding box on the image
            cv2.rectangle(image, (x1, y1), (x2, y2), (255, 0, 0), 2)  # Red color for bounding box

        # Display the image with bounding boxes overlaid
        axs[i].imshow(image)
        axs[i].axis('off')  # Turn off axis

    plt.tight_layout()
    plt.show()

In [None]:
# Call the function to show random images with label overlays

In [None]:
# Write your code here
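
The coordinate conversion inside the preview function can be tried in isolation. Each YOLO label line stores `class x_center y_center width height`, with the four box values normalized to the range 0 to 1. The sketch below converts one such line to pixel corners; the function name `yolo_to_corners` and the sample values are illustrative only.

```python
def yolo_to_corners(label_line, img_width, img_height):
    """Convert one YOLO label line to (class_id, (x1, y1, x2, y2)) in pixels."""
    parts = label_line.split()
    class_id = int(parts[0])
    x_center, y_center, width, height = (float(v) for v in parts[1:5])

    # Scale normalized values to pixels, then step half the box size from the center
    x1 = round((x_center - width / 2) * img_width)
    y1 = round((y_center - height / 2) * img_height)
    x2 = round((x_center + width / 2) * img_width)
    y2 = round((y_center + height / 2) * img_height)
    return class_id, (x1, y1, x2, y2)

# A box centered in a 640x480 image, a quarter of its width and half of its height
example = yolo_to_corners("1 0.5 0.5 0.25 0.5", img_width=640, img_height=480)
print(example)  # (1, (240, 120, 400, 360))
```

Working through the numbers: the center is at (320, 240) pixels and the box is 160 by 240 pixels, so the corners are 80 pixels to each side horizontally and 120 pixels vertically.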

### 📌 **Step 5: Configure the Dataset YAML File**

For YOLOv12 to know how to process the dataset, we need to create a `data.yaml` configuration file. This file specifies where the dataset is located and defines the class names.

In [None]:
# ## FOR WALDO-A DATASET

# data_yaml = """
# train: /content/dataset/train/images
# val: /content/dataset/valid/images

# nc: 4
# names: ['odlaw', 'wally', 'wenda', 'wizard_whitebeard']
# """
# with open("/content/dataset/data.yaml", "w") as f:
#     f.write(data_yaml)

In [None]:
## FOR WALDO-B DATASET

data_yaml = """
train: # Write your path here
val: # Write your path here

nc: # Write your code here
names: # Write your code here
"""
with open("/content/dataset/data.yaml", "w") as f:
    f.write(data_yaml)
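
In this file, `nc` must equal the number of entries in `names`. One way to avoid a mismatch is to generate the file text from a single list. The sketch below assumes a hypothetical helper `build_data_yaml` and, purely as an example, uses the four class names from the commented-out dataset A cell; the class list for dataset B is not assumed here.

```python
def build_data_yaml(train_dir, val_dir, class_names):
    """Build the text of a YOLO data.yaml so that nc always matches the names list."""
    names = ", ".join(f"'{name}'" for name in class_names)
    return (
        f"train: {train_dir}\n"
        f"val: {val_dir}\n"
        "\n"
        f"nc: {len(class_names)}\n"
        f"names: [{names}]\n"
    )

# Example values taken from the dataset A configuration above
yaml_text = build_data_yaml(
    "/content/dataset/train/images",
    "/content/dataset/valid/images",
    ["odlaw", "wally", "wenda", "wizard_whitebeard"],
)
print(yaml_text)
```

The returned string can be written to `/content/dataset/data.yaml` in the same way as the cell above does.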

### 📌 **Step 6: Load the YOLOv12 Model**

Now, load the YOLOv12 nano model (YOLOv12n), the smallest variant in the family, from the `ultralytics` package.

In [None]:
# Import the YOLO class used to load the pretrained YOLOv12 model
from ultralytics import YOLO

In [None]:
# Load the pretrained model (adjust the URL or model name if necessary)
model = YOLO("Write your model here")  # Replace this with the correct path or model

### 📌 **Step 7: Train the Model**

Now that everything is ready, we can start training. We will fine-tune the pre-trained YOLOv12 model on the Waldo dataset.

In [None]:
# Write your trainer code here

### 📌 **Step 8: Evaluate the Model**

Once training is complete, evaluate the model's performance on the validation dataset.

In [None]:
results = # Write your code here  # Evaluate on validation set

### 📌 **Step 9: Inference and Error Analysis**

After evaluation, we run inference on a few images, visualize the predictions, and look for the cases where the model gets it wrong.

In [None]:
# Perform inference on validation images
results = # Write your code here

In [None]:
# Load the trained model
model = YOLO('Write your best model path here')  # Replace with the correct path to your trained model

# Perform inference on the image
img_path = 'Write your inference path here'
results = model(img_path)

# Access the first result from the list (since results is a list)
result = results[0]

# Render the results (bounding boxes, labels, and confidence scores)
# Write your code here  # This will display the image with bounding boxes and labels overlaid

### 📌 **Step 10: Confusion Matrix and Precision-Recall-F1 Analysis**

Evaluate the predictions using metrics like Precision, Recall, and F1-Score:

In [None]:
from IPython.display import Image, display

# Path to the confusion matrix image
confusion_matrix_path = 'Write your CF matrix image here'

# Display the confusion matrix image
display(Image(filename=confusion_matrix_path))
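
To make the three metrics concrete, here is a small worked example that computes them from raw counts. The counts (8 true positives, 2 false positives, 4 false negatives) are made up for illustration and are not results from this notebook.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 from true positive, false positive, and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # share of detections that are correct
    recall = tp / (tp + fn) if (tp + fn) else 0.0     # share of real objects that were found
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two
    else:
        f1 = 0.0
    return precision, recall, f1

# Hypothetical counts: 8 correct detections, 2 false alarms, 4 missed objects
precision, recall, f1 = precision_recall_f1(tp=8, fp=2, fn=4)
print(f"Precision: {precision:.3f}, Recall: {recall:.3f}, F1: {f1:.3f}")
```

With these counts, precision is 8/10, recall is 8/12, and F1 is 16/22. A model that rarely raises false alarms but misses many objects shows high precision and low recall, which is the pattern to look for when reading the confusion matrix.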

### 📌 **Step 11: Summary of Waldo Faces Detected**

🎉 **Yeah, it's the final step!** We've been on an exciting journey, and now it's time to find out who found the most Waldo faces! 🕵️‍♂️

In [None]:
DEV_NAME = # Write your code here
REPO_NAME = # Write your code here

# Use f-string to create the dynamic URL
url = f'https://github.com/{DEV_NAME}/{REPO_NAME}/raw/main/dataset/waldo-dataset-test.zip'

# Now use wget with the generated URL
!wget {url} -O waldo-dataset-test.zip

In [None]:
with zipfile.ZipFile('Write your file here', 'r') as zip_ref:
    zip_ref.extractall('/content/')

In [None]:
import os
import cv2
from pathlib import Path
import torch
from matplotlib import pyplot as plt

# Load YOLO model with the best weights
model = YOLO('Write your best model path here')

# Path to the test dataset
test_images_dir = Path('Write your path here')

# Initialize counters
total_waldo_faces = 0

# Loop through the test images
for img_path in test_images_dir.glob('*.jpg'):
    print(f"Processing {img_path.name}... 🕵️‍♂️")

    # Read the image
    img = cv2.imread(str(img_path))
    img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

    # Make predictions with YOLO (Ultralytics treats NumPy arrays as BGR, so pass the original OpenCV image)
    results = model(img)

    # Extract bounding boxes and labels
    pred = results[0]  # Get the first result from the list
    boxes = pred.boxes.xywh  # Box coordinates (x, y, width, height)
    names = pred.names  # Class names

    # Loop through predictions and filter for 'waldorotation' and 'wendarotation'
    waldo_faces = []
    for i, box in enumerate(boxes):
        label = int(pred.boxes.cls[i].item())  # Class index for this detection, as an int key for names
        if names[label] in ['waldorotation', 'wendarotation']:  # Check if it's either 'waldorotation' or 'wendarotation'
            waldo_faces.append(box)

    # Count the number of Waldo faces found
    waldo_count = len(waldo_faces)
    total_waldo_faces += waldo_count

    # Draw the bounding boxes directly on the original image (img_rgb)
    for box in waldo_faces:
        x, y, w, h = box
        x1, y1, x2, y2 = int(x - w / 2), int(y - h / 2), int(x + w / 2), int(y + h / 2)
        cv2.rectangle(img_rgb, (x1, y1), (x2, y2), (0, 255, 0), 2)

    # Show the image with bounding boxes overlaid
    plt.figure(figsize=(10, 10))
    plt.imshow(img_rgb)
    plt.title(f"{img_path.name} - Found {waldo_count} Waldo Face{'s' if waldo_count != 1 else ''} 😎")
    plt.axis('off')
    plt.show()

# Summary after processing all images
print(f"\n🎉🎉🎉 Total Waldo Faces Found: {total_waldo_faces} 😎")
print("The winner will be the student who finds the most Waldo faces! 🏆")

# DONE