# PLEASE READ BEFORE YOU START!

## PLEASE CREATE AN ACCOUNT AT ROBOFLOW IN ORDER FOR THIS WORKSHOP TO GO SMOOTHLY!

# Pretrained Weights Section (COCO Dataset)


## Step 1:
### Below are the dependencies that you will need in order to follow the workshop!

###### TIP: Run the cell by clicking on the button right here

## ⇓

In [None]:
#This installs the deep learning framework (PyTorch), its image processing tools (torchvision), and its audio tools (torchaudio). The video frames in this workshop are handled by OpenCV, so torchaudio is optional.
!pip install torch torchvision torchaudio

#This installs a library for real-time computer vision, image processing, and video manipulation (OpenCV).
!pip install opencv-python

#This downloads the YOLOv5 repository from GitHub, which contains the code for loading, running, and training the model, including with our custom weights later on. (YOLOv5 is known for being good at real-time processing.)
!git clone https://github.com/ultralytics/yolov5

#This enters the directory and downloads the other needed files to run YOLOv5.
%cd yolov5
!pip install -r requirements.txt

Collecting nvidia-cuda-nvrtc-cu12==12.4.127 (from torch)
  Downloading nvidia_cuda_nvrtc_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-runtime-cu12==12.4.127 (from torch)
  Downloading nvidia_cuda_runtime_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-cupti-cu12==12.4.127 (from torch)
  Downloading nvidia_cuda_cupti_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cudnn-cu12==9.1.0.70 (from torch)
  Downloading nvidia_cudnn_cu12-9.1.0.70-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cublas-cu12==12.4.5.8 (from torch)
  Downloading nvidia_cublas_cu12-12.4.5.8-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cufft-cu12==11.2.1.3 (from torch)
  Downloading nvidia_cufft_cu12-11.2.1.3-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-curand-cu12==10.3.5.147 (from torch)
  Downloading nvidia_curand_cu12-10.3.5

## Step 2:
Here are the libraries that you will need to import in order to follow our workshop.

In [None]:
import torch
import cv2
import numpy as np
import os
from google.colab import files

## Step 3:
This checks whether you're using the CPU or the GPU for processing. This matters because a GPU significantly accelerates model training and inference: its parallel architecture is optimized for deep learning workloads (it can process many operations at once), whereas a CPU processes tasks largely sequentially (one after another). In Colab you can switch to a GPU via Runtime -> Change runtime type.

In [None]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Using device: {device}")

Using device: cpu


## Step 4:
Here is how you load the pretrained model that YOLOv5 provides for you. YOLOv5 comes with weights trained on a dataset called COCO (Common Objects in Context). It detects 80 object classes, including everyday items like:

buses, trains, cows, bears, skateboards, bananas, people, laptops, and even teddy bears!

In [None]:
model = torch.hub.load('ultralytics/yolov5', 'yolov5s', pretrained=True)
model.to(device)

Downloading: "https://github.com/ultralytics/yolov5/zipball/master" to /root/.cache/torch/hub/master.zip


Creating new Ultralytics Settings v0.0.6 file ✅ 
View Ultralytics Settings with 'yolo settings' or at '/root/.config/Ultralytics/settings.json'
Update Settings with 'yolo settings key=value', i.e. 'yolo settings runs_dir=path/to/dir'. For help see https://docs.ultralytics.com/quickstart/#ultralytics-settings.


YOLOv5 🚀 2025-2-22 Python-3.11.11 torch-2.5.1+cu124 CPU

Downloading https://github.com/ultralytics/yolov5/releases/download/v7.0/yolov5s.pt to yolov5s.pt...
100%|██████████| 14.1M/14.1M [00:00<00:00, 76.1MB/s]

Fusing layers... 
YOLOv5s summary: 213 layers, 7225885 parameters, 0 gradients, 16.4 GFLOPs
Adding AutoShape... 


AutoShape(
  (model): DetectMultiBackend(
    (model): DetectionModel(
      (model): Sequential(
        (0): Conv(
          (conv): Conv2d(3, 32, kernel_size=(6, 6), stride=(2, 2), padding=(2, 2))
          (act): SiLU(inplace=True)
        )
        (1): Conv(
          (conv): Conv2d(32, 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
          (act): SiLU(inplace=True)
        )
        (2): C3(
          (cv1): Conv(
            (conv): Conv2d(64, 32, kernel_size=(1, 1), stride=(1, 1))
            (act): SiLU(inplace=True)
          )
          (cv2): Conv(
            (conv): Conv2d(64, 32, kernel_size=(1, 1), stride=(1, 1))
            (act): SiLU(inplace=True)
          )
          (cv3): Conv(
            (conv): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1))
            (act): SiLU(inplace=True)
          )
          (m): Sequential(
            (0): Bottleneck(
              (cv1): Conv(
                (conv): Conv2d(32, 32, kernel_size=(1, 1), stride=(1, 1))
  

In [None]:
custom_path = '/content/drive/MyDrive/my_yolov5s.pt' #@param {type:"string"}

model = torch.hub.load('.', 'custom', path=custom_path, source='local', force_reload=True)
model.to(device)

YOLOv5 🚀 2025-2-22 Python-3.11.11 torch-2.5.1+cu124 CPU



Exception: [Errno 2] No such file or directory: '/content/drive/MyDrive/my_yolov5s.pt'. Cache may be out of date, try `force_reload=True` or see https://docs.ultralytics.com/yolov5/tutorials/pytorch_hub_model_loading for help.
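
The error above appears when `custom_path` does not point to an existing file (for example, when Google Drive has not been mounted). Here is a minimal sketch of a guard, assuming you would rather fall back to the pretrained `yolov5s.pt` weights than stop. The helper name `pick_weights` is hypothetical and not part of YOLOv5.

```python
import os

def pick_weights(custom_path, fallback="yolov5s.pt"):
    """Return custom_path if that file exists, otherwise the fallback name."""
    if custom_path and os.path.isfile(custom_path):
        return custom_path
    # Assumption: falling back to the pretrained weights is acceptable here.
    print(f"Custom weights not found at {custom_path!r}; using {fallback}")
    return fallback
```

The returned value could then be passed as `path=` in the `torch.hub.load('.', 'custom', ...)` call from the cell above.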

## Step 5:
We are going to ask you to upload a video in order to test the COCO-trained YOLOv5 model. We will give you some time to take a video if needed. It will work with any video, though the time it takes to process depends on the video's length.

In [None]:
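print("Please upload a video file")
uploaded = files.upload()
# files.upload() returns an empty dict if the upload is cancelled, so check before indexing.
if not uploaded:
    raise RuntimeError("No file was uploaded. Re-run this cell and choose a video.")
video_path = list(uploaded.keys())[0]
print(f"Video uploaded: {video_path}")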

Please upload a video file


IndexError: list index out of range
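
Since the model runs once per frame, processing time grows with the number of frames. Here is a rough back-of-the-envelope sketch. The function name and the numbers are hypothetical, and the real time per frame depends on your hardware and video resolution.

```python
def estimate_processing_seconds(duration_s, video_fps, seconds_per_frame):
    """Rough estimate: number of frames times the time spent on each frame."""
    total_frames = int(duration_s * video_fps)
    return total_frames * seconds_per_frame

# Hypothetical numbers: a 10 s clip at 30 fps, 0.25 s of inference per frame.
print(estimate_processing_seconds(10, 30, 0.25))
```

Halving the clip length or the frame rate halves the estimate, which is why short clips are the safer choice on a CPU runtime.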

## Step 6:
Here we take your video input and run model inference on every frame. Google Colab sadly does not allow live video inference, so we save the result as 'output_video.avi'; feel free to change the name to whatever you would like. Please be careful with the video size and length, as this process could take a while, and make sure that you ran every cell before this one. (The warnings filter is there because YOLOv5 is somewhat outdated and prints a really annoying FutureWarning message with each frame.)

In [None]:
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)

cap = cv2.VideoCapture(video_path)
output_path = 'output_video.avi'
fourcc = cv2.VideoWriter_fourcc(*'XVID')
fps = cap.get(cv2.CAP_PROP_FPS) or 30  # some files report 0 FPS; 30 is an assumed fallback
frame_width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
frame_height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
out = cv2.VideoWriter(output_path, fourcc, fps, (frame_width, frame_height))

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break

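    # OpenCV reads frames as BGR, but the model expects RGB.
    frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

    # Run detection on this frame.
    results = model(frame_rgb)

    # render() draws the boxes and labels and returns the annotated images.
    annotated_frame = results.render()[0]

    # Convert back to BGR before writing the frame with OpenCV.
    annotated_frame_bgr = cv2.cvtColor(annotated_frame, cv2.COLOR_RGB2BGR)
    out.write(annotated_frame_bgr)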

cap.release()
out.release()
cv2.destroyAllWindows()

print(f"Processed video saved as {output_path}")
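
Besides drawing boxes, it can be useful to summarize what was detected. With the YOLOv5 hub model, `results.pandas().xyxy[0].to_dict("records")` gives one dict per detection, including `name` and `confidence` fields. The sketch below works on plain dicts of that shape; the helper name `count_detections`, the threshold, and the sample data are made up for illustration.

```python
from collections import Counter

def count_detections(frames, min_confidence=0.5):
    """Count detections per class name across all frames.

    `frames` is a list with one entry per frame; each entry is a list of
    dicts that have 'name' and 'confidence' keys.
    """
    counts = Counter()
    for detections in frames:
        for det in detections:
            if det["confidence"] >= min_confidence:
                counts[det["name"]] += 1
    return counts

# Made-up detections for two frames, only to show the shape of the data.
sample = [
    [{"name": "person", "confidence": 0.91}, {"name": "dog", "confidence": 0.42}],
    [{"name": "person", "confidence": 0.88}],
]
print(count_detections(sample))
```

Inside the loop above, you would append the records for each frame to a list and call the helper once the video has been processed.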

## Step 7:
Download it and check it out!!!  

In [None]:
files.download(output_path)

# Custom Weights Section (Test)

## Step 1:

Now it's time to get a little more creative. Most things will be exactly the same, so we don't even need to download anything else. Here we are going to use custom data in order to train the model to do our bidding.

First we are going to use Roboflow and create a workflow. I recommend using their website for its dynamic layout, but for the sake of time we will use the API.

In order to use the API, we will need to create an account at Roboflow, create a project, and then go to Settings -> API Key. Keep it safe.

Also make sure to install all the libraries.

In [None]:
from google.colab import output
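# `void` makes the script return undefined instead of the Window object, which cannot be sent back to Python.
output.eval_js('void window.open("https://app.roboflow.com")')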

MessageError: DataCloneError: Failed to execute 'postMessage' on 'MessagePort': [object Window] could not be cloned.

## Step 2:
Now that you have acquired your API key, we can begin creating the custom-trained model.

In [None]:
!pip install roboflow
!git clone https://github.com/vc1492a/Hey-Waldo.git
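# Note: `!cd` runs in a temporary subshell and does not change the notebook's working directory, so we stay where we are and refer to the dataset as Hey-Waldo/128.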


In [None]:
!mkdir -p waldo_dataset/images waldo_dataset/annotations
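# Copy (rather than move) the images so that Hey-Waldo/128 stays available for the upload cell below.
!cp -r Hey-Waldo/128/* waldo_dataset/images/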

mv: cannot stat '128/*': No such file or directory
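
The folder setup above depends on the current working directory, which is easy to get wrong in a notebook. Here is a sketch of the same step in Python that stops with a clear message when the source folder is missing. It copies rather than moves, and the helper name `copy_dataset` is hypothetical.

```python
import shutil
from pathlib import Path

def copy_dataset(src_dir, dst_dir):
    """Copy everything under src_dir into dst_dir and return the number of files there."""
    src = Path(src_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"Source folder not found: {src.resolve()}")
    # dirs_exist_ok lets the copy succeed when dst_dir was already created.
    shutil.copytree(src, dst_dir, dirs_exist_ok=True)
    return sum(1 for path in Path(dst_dir).rglob("*") if path.is_file())
```

In this notebook the call would look like `copy_dataset("Hey-Waldo/128", "waldo_dataset/images")`, assuming the repository was cloned into the current directory.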


## Step 3:
Now that we have installed the Roboflow package, downloaded the dataset, and set up the images and annotations folders, we can connect to Roboflow and upload the images.

In [None]:
import os
import getpass
from roboflow import Roboflow

api_key = getpass.getpass("Enter your Roboflow API Key: ")
rf = Roboflow(api_key=api_key)

Enter your Roboflow API Key: ··········
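
To avoid retyping the key every time the notebook restarts, one option is to read it from an environment variable and only prompt when it is missing. This is a minimal sketch; the variable name `ROBOFLOW_API_KEY` and the helper `get_api_key` are assumptions for illustration, not something the Roboflow package requires.

```python
import getpass
import os

def get_api_key(env_var="ROBOFLOW_API_KEY"):
    """Return the key from the environment if it is set, otherwise prompt for it."""
    key = os.environ.get(env_var)
    if key:
        return key
    # getpass hides the input, so the key does not end up in the cell output.
    return getpass.getpass("Enter your Roboflow API Key: ")
```

Either way, the key never appears in the saved notebook, which matters if you share it.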


In [None]:
workspace_id = getpass.getpass("Enter your Workspace ID: ")
workspace = rf.workspace(workspace_id)

Enter your Workspace ID: ··········
loading Roboflow workspace...


In [None]:
project_id = getpass.getpass("Enter your Project ID: ")
project = rf.project(project_id)

Enter your Project ID: ··········


In [None]:
dataset_path = "Hey-Waldo/128"
annotation_path = "waldo_dataset/annotations"

for category in ["notwaldo"]:
    image_folder = os.path.join(dataset_path, category)
    annotation_folder = os.path.join(annotation_path, category)

    for filename in os.listdir(image_folder):
        if filename.lower().endswith((".jpg", ".png")):
            image_file = os.path.join(image_folder, filename)
            # Swap the image extension (.jpg or .png) for .txt to find the matching annotation.
            annotation_file = os.path.join(annotation_folder, os.path.splitext(filename)[0] + '.txt')

            if os.path.exists(annotation_file):
                project.upload(image_file, annotation_path=annotation_file)
                print(f"✅ Uploaded {filename} with annotation ({category})")
            else:
                project.upload(image_file)
                print(f"⚠️ Uploaded {filename} WITHOUT annotation ({category})")

print("🎯 Dataset Upload Complete!")

⚠️ Uploaded 7_2_6.jpg WITHOUT annotation (notwaldo)
⚠️ Uploaded 16_4_7.jpg WITHOUT annotation (notwaldo)
⚠️ Uploaded 7_5_5.jpg WITHOUT annotation (notwaldo)
⚠️ Uploaded 20_6_6.jpg WITHOUT annotation (notwaldo)
⚠️ Uploaded 1_6_7.jpg WITHOUT annotation (notwaldo)
⚠️ Uploaded 21_4_6.jpg WITHOUT annotation (notwaldo)
⚠️ Uploaded 9_3_2.jpg WITHOUT annotation (notwaldo)
⚠️ Uploaded 4_1_0.jpg WITHOUT annotation (notwaldo)
⚠️ Uploaded 10_3_4.jpg WITHOUT annotation (notwaldo)
⚠️ Uploaded 11_4_5.jpg WITHOUT annotation (notwaldo)
⚠️ Uploaded 12_1_3.jpg WITHOUT annotation (notwaldo)
⚠️ Uploaded 13_7_5.jpg WITHOUT annotation (notwaldo)
⚠️ Uploaded 4_3_7.jpg WITHOUT annotation (notwaldo)
⚠️ Uploaded 16_7_0.jpg WITHOUT annotation (notwaldo)
⚠️ Uploaded 20_0_4.jpg WITHOUT annotation (notwaldo)
⚠️ Uploaded 6_6_0.jpg WITHOUT annotation (notwaldo)
⚠️ Uploaded 4_3_5.jpg WITHOUT annotation (notwaldo)
⚠️ Uploaded 14_2_2.jpg WITHOUT annotation (notwaldo)
⚠️ Uploaded 18_1_5.jpg WITHOUT annotation (notwaldo)
⚠
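
Every line in the log above says "WITHOUT annotation", which means no matching `.txt` file was found for those images. Before uploading, a dry run can show which images have labels. The sketch below assumes one `.txt` label file per image with the same base name; the helper name `pair_images_with_labels` is hypothetical.

```python
import os

def pair_images_with_labels(image_folder, annotation_folder, extensions=(".jpg", ".png")):
    """Return a list of (image_path, label_path or None), one pair per image."""
    pairs = []
    for filename in sorted(os.listdir(image_folder)):
        stem, ext = os.path.splitext(filename)
        if ext.lower() not in extensions:
            continue  # skip anything that is not an image
        label = os.path.join(annotation_folder, stem + ".txt")
        pairs.append((os.path.join(image_folder, filename),
                      label if os.path.exists(label) else None))
    return pairs
```

Counting the pairs whose second element is `None` tells you how many images would be uploaded unlabeled.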

## Step 4:

Now that we have uploaded all the files that we would like to annotate, we should go to the website and either use their automated annotations or annotate manually. Automated annotation works very well with items that are already covered by COCO and other big datasets, but since Waldo is quite niche, manual annotations are better.

But since that takes a while as you can see, we did all the dirty work behind the scenes.

Download the dataset from this link.

In [None]:
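# `void` makes the script return undefined instead of the Window object, which cannot be sent back to Python.
output.eval_js('void window.open("https://app.roboflow.com/wheres-waldo-fk6zk/detect-and-classify-object-detection-3zmfh/2")')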

Okay, now that we are here, we will use Roboflow's model training for simplicity. You can always download the zip file yourself and train on your own annotations from the command line, where you control the hyperparameters. An example command is below.

#### python train.py --img 640 --batch 16 --epochs 50 --data ../waldo_dataset/dataset.yaml --weights yolov5s.pt
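
The `--data` argument in the command above points to a YAML file that tells YOLOv5 where the images are and what the classes are called. Here is a minimal sketch that builds such a file's contents. The folder names `images/train` and `images/val` and the single class name `waldo` are assumptions, so adjust them to match the dataset you exported.

```python
def build_dataset_yaml(root, class_names, train="images/train", val="images/val"):
    """Build the text of a YOLOv5-style dataset YAML file."""
    lines = [f"path: {root}", f"train: {train}", f"val: {val}", "names:"]
    # Class ids are the positions in the list, starting at 0.
    lines += [f"  {index}: {name}" for index, name in enumerate(class_names)]
    return "\n".join(lines) + "\n"

yaml_text = build_dataset_yaml("../waldo_dataset", ["waldo"])
print(yaml_text)
```

Writing `yaml_text` to `dataset.yaml` inside the dataset folder would give the training script a file to load.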

But once you're training your model, you also have to think about how your data is processed. For example, randomly cropped data can help a machine not associate detection with size, rotated data can help it not associate detection with the degree of turn, etc. These small variations in the data help the model generalize and recognize objects more reliably.
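
To make the idea concrete, here is a toy sketch of two augmentations on a tiny "image" stored as a nested list. It is only an illustration: in practice you would apply augmentations through Roboflow's settings or an image library, and the bounding boxes would need the same transformation as the pixels. The function names are hypothetical.

```python
import random

def random_crop(image, crop_h, crop_w, rng=random):
    """Cut a crop_h x crop_w window at a random position from a 2D list."""
    top = rng.randint(0, len(image) - crop_h)
    left = rng.randint(0, len(image[0]) - crop_w)
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]

def rotate_90(image):
    """Rotate a 2D list by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
print(rotate_90(image))
print(random_crop(image, 2, 2, random.Random(0)))
```

Each call yields a slightly different view of the same picture, so the model sees more variety without any new photos being collected.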