# **Computer Vision**

# **YOLO (You Only Look Once)**
YOLOv5 is a deep learning model for object detection in videos and real-time feeds. It is the fifth version of the original YOLO model. It is open source and available on GitHub.

* **Single Neural Network:** YOLOv5 is based on a single CNN that predicts multiple bounding boxes and their classes.
* **Image Processing:** The input is converted into a fixed-size frame (640×640 pixels). The frame is then divided into a grid of smaller cells.
* **Bounding Box Prediction and Class Prediction:** Bounding box prediction refers to evaluating the grid cells to decide where an object starts and ends. Class prediction means predicting which class that object belongs to.
* **Non-max suppression:** To avoid drawing redundant boxes, YOLOv5 uses a technique called non-max suppression. It eliminates overlapping boxes that cover the same object.
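
The non-max suppression step can be illustrated with a minimal pure-Python sketch. This is not the YOLOv5 implementation: the function names `iou` and `greedy_nms` are hypothetical, the sketch ignores classes, and the default thresholds simply mirror the values 0.25 and 0.45 passed to `non_max_suppression` later in this notebook.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, conf_thres=0.25, iou_thres=0.45):
    """Return the indices of the boxes to keep, highest score first."""
    # Drop low-confidence boxes, then visit the rest from best to worst
    candidates = [i for i in range(len(boxes)) if scores[i] > conf_thres]
    order = sorted(candidates, key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with a better one
        if all(iou(boxes[i], boxes[k]) <= iou_thres for k in keep):
            keep.append(i)
    return keep


# Two near-duplicate boxes around one object, plus one separate box
boxes = [(10, 10, 110, 110), (12, 12, 112, 112), (200, 200, 300, 300)]
scores = [0.90, 0.80, 0.70]
kept = greedy_nms(boxes, scores)
print(kept)  # [0, 2]
```

The second box overlaps the first with an IoU of about 0.92, so only the higher-scoring one survives; the third box does not overlap anything and is kept.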

In [2]:
import os
import torch
import numpy as np
import cv2


In [3]:
!git clone https://github.com/ultralytics/yolov5

Cloning into 'yolov5'...
remote: Enumerating objects: 16575, done.
remote: Counting objects: 100% (53/53), done.
remote: Compressing objects: 100% (35/35), done.
remote: Total 16575 (delta 28), reused 37 (delta 18), pack-reused 16522
Receiving objects: 100% (16575/16575), 15.03 MiB | 14.05 MiB/s, done.
Resolving deltas: 100% (11387/11387), done.


In [4]:
%cd yolov5

/content/yolov5


In [5]:
!pip install -qr requirements.txt

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
imageio 2.31.6 requires pillow<10.1.0,>=8.3.2, but you have pillow 10.3.0 which is incompatible.

In [6]:
from models.common import DetectMultiBackend
from utils.general import non_max_suppression, check_file

In [7]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

In [10]:
def load_model(model_name='yolov5s.pt'):
  # Weights live in weights/ inside the cloned yolov5 directory
  model_path = os.path.join('weights', model_name)
  if not os.path.isfile(model_path):
    # wget -O does not create missing directories, so create weights/ first
    os.makedirs(os.path.dirname(model_path), exist_ok=True)
    url = f'https://github.com/ultralytics/yolov5/releases/download/v6.0/{model_name}'
    # -q keeps the cell output quiet
    !wget -q {url} -O {model_path}
  # DetectMultiBackend wraps the checkpoint and exposes model.names
  model = DetectMultiBackend(model_path, device=device)
  return model
model = load_model()

Fusing layers... 
YOLOv5s summary: 213 layers, 7225885 parameters, 0 gradients, 16.4 GFLOPs


In [11]:
%cd ..

/content


In [15]:
def process_frame(frame, model, size=640):
  orig_h, orig_w = frame.shape[:2]
  # OpenCV reads frames as BGR; convert to RGB for the model
  frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
  # Plain resize to size x size (the aspect ratio is not preserved)
  frame_resized = cv2.resize(frame_rgb, (size, size))
  # HWC -> CHW, scale pixel values to [0, 1], add a batch dimension
  frame_transposed = np.transpose(frame_resized, (2, 0, 1))
  frame_norm = frame_transposed / 255.0
  frame_tensor = torch.from_numpy(frame_norm).float().unsqueeze(0).to(device)
  with torch.no_grad():  # inference only, no gradients needed
    pred = model(frame_tensor)
  # Confidence threshold 0.25, IoU threshold 0.45, all classes
  pred = non_max_suppression(pred, 0.25, 0.45, classes=None)
  frame_output = frame_rgb.copy()
  if pred[0] is not None and len(pred[0]):
    det = pred[0]
    # Map boxes from the resized input back to the original resolution
    scale_factors = torch.tensor([orig_w / size, orig_h / size, orig_w / size, orig_h / size], device=device)
    det[:, :4] = det[:, :4] * scale_factors
    for *xyxy, conf, cls in det:
      x1, y1, x2, y2 = map(int, xyxy)
      cv2.rectangle(frame_output, (x1, y1), (x2, y2), (255, 0, 0), 2)
      cv2.putText(frame_output, f'{model.names[int(cls)]},{conf:.2f}', (x1, y1 - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (255, 0, 0), 2)
  # Return outside the if block so frames without detections are returned too
  return cv2.cvtColor(frame_output, cv2.COLOR_RGB2BGR)
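
The box rescaling inside `process_frame` can be checked in isolation with a small pure-Python sketch. It assumes the same plain stretch to `size` x `size` used above (no letterboxing). The helper name `scale_box` and the clipping to the frame bounds are illustrative additions, not part of the notebook or of YOLOv5.

```python
def scale_box(box, orig_w, orig_h, size=640):
    """Map (x1, y1, x2, y2) from the size x size model input back to the
    original frame and clip the result to the frame bounds."""
    sx, sy = orig_w / size, orig_h / size  # per-axis stretch factors

    def clip(value, upper):
        # Keep the coordinate inside [0, upper] and truncate to an int pixel
        return int(min(max(value, 0), upper))

    x1, y1, x2, y2 = box
    return (
        clip(x1 * sx, orig_w - 1),
        clip(y1 * sy, orig_h - 1),
        clip(x2 * sx, orig_w - 1),
        clip(y2 * sy, orig_h - 1),
    )


# A box covering the bottom-right quarter of the 640x640 input,
# mapped back to a 1280x720 frame
print(scale_box((320, 320, 640, 640), orig_w=1280, orig_h=720))
# (640, 360, 1279, 719)
```

Because the width and height are stretched by different factors, the x and y coordinates need separate scale factors, which is why the notebook builds a four-element `scale_factors` tensor.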

In [16]:
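def process_video(video_path, output_path, model):
  cap = cv2.VideoCapture(video_path)
  if not cap.isOpened():
    raise IOError(f"Could not open video: {video_path}")
  # Match the source resolution; fall back to 20 fps if no frame rate is reported
  width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
  height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
  fps = cap.get(cv2.CAP_PROP_FPS) or 20
  vid_write = cv2.VideoWriter(output_path, cv2.VideoWriter_fourcc(*'mp4v'), fps, (width, height))
  while cap.isOpened():
    ret, frame = cap.read()
    if not ret:  # end of the video or a read error
      break
    # process_frame returns a BGR frame at the original resolution
    processed_frame = process_frame(frame, model)
    vid_write.write(processed_frame)
  cap.release()
  vid_write.release()
  print(f"Processing completed successfully! The saved file is in {output_path}")
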

In [None]:
video_path='/content/video1.mp4'
output_path='/content/ProcessedVideo1.mp4'
process_video(video_path,output_path,model)

In [None]:
video_path='/content/video2.mp4'
output_path='/content/ProcessedVideo2.mp4'
process_video(video_path,output_path,model)