<a href="https://colab.research.google.com/github/pranjaldatta/vehicle-detection-assignment/blob/main/ml_vehicledet_notebook.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
# Check the GPU we are running

!nvidia-smi

Thu Nov  5 10:01:52 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 455.32.00    Driver Version: 418.67       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla P100-PCIE...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   36C    P0    26W / 250W |      0MiB / 16280MiB |      0%      Default |
|                               |                      |                 ERR! |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

## Clone The Repositories

Here we clone the two repositories this notebook needs:
* PyVision: ready-to-use computer vision algorithms with pretrained weights
* vehicle-detection-assignment: the repository containing the test video and other files required for this assignment

In [None]:
# Clone the PyVision repository for ready-to-use code
# Author: Pranjal Datta

!git clone https://github.com/pranjaldatta/PyVision.git
!git clone https://github.com/pranjaldatta/vehicle-detection-assignment.git

# Fixed tqdm for colab
# IGNORE the errors
!pip install --force https://github.com/chengs/tqdm/archive/colab.zip -qqq

## Imports

Before we proceed any further, we make the necessary imports.

In [13]:
# Import the other dependencies. (Ignore the tqdm warnings.)

import cv2
import numpy as np 
from google.colab.patches import cv2_imshow

import os
from IPython.display import HTML
from base64 import b64encode
from tqdm.autonotebook import tqdm
import torch

## Instantiate the Model 

For this assignment we use the **You Only Look Once (v3)** detection algorithm, better known as YOLOv3. It is a single-shot detector that has had a major impact on the field of object detection. The paper can be found [here](https://pjreddie.com/media/files/papers/YOLOv3.pdf).

The model implementation used here comes from the [PyVision repository](https://github.com/pranjaldatta/PyVision); its documentation is [here](https://github.com/pranjaldatta/PyVision/tree/master/pyvision/detection/yolov3).
The implementation handles pretrained-weight download, model loading, and pre- and post-processing during inference *in the background, without any user intervention*. The user only has to *import, instantiate and call the API*!

In [15]:
# Instantiate the model with pretrained weights
# We use the YOLOv3 model for object detection

from PyVision.pyvision.detection import yolov3


# checking if CUDA is enabled
if torch.cuda.is_available():
  device = "gpu"
else:
  device = "cpu"

yolo = yolov3.YOLOv3(device=device, confidence=0.8, nms_thresh=0.6)

Weight file exists.
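
The `confidence` and `nms_thresh` arguments passed above presumably correspond to the two usual post-processing steps of a YOLO-style detector: low-scoring boxes are dropped, then overlapping boxes are pruned by non-maximum suppression (NMS). The sketch below illustrates that idea in plain Python. It is not PyVision's implementation; the `(x1, y1, x2, y2)` box layout and the function names `iou` and `nms` are assumptions made for the example.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, confidence=0.8, nms_thresh=0.6):
    """Return indices of boxes kept after score filtering and greedy NMS."""
    # Keep only confident detections, highest score first
    order = sorted(
        (i for i, s in enumerate(scores) if s >= confidence),
        key=lambda i: scores[i],
        reverse=True,
    )
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with an already kept box
        if all(iou(boxes[i], boxes[j]) <= nms_thresh for j in keep):
            keep.append(i)
    return keep


# Toy data: two heavily overlapping boxes, one distant box, one low-score box
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60), (0, 0, 5, 5)]
scores = [0.95, 0.90, 0.85, 0.40]
kept = nms(boxes, scores)
print(kept)
```

With these toy values the second box overlaps the first with an IoU of about 0.68 and is suppressed, and the last box falls below the confidence threshold. Raising `nms_thresh` keeps more overlapping boxes; lowering `confidence` admits weaker detections.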


## Video Parameters

Here we define the input video path, the output file names, and the codec and frame size needed to write the result to a file.

Finally, we open the video with `cv2.VideoCapture` so that frames can be read during the inference loop.

In [5]:
video_path = "/content/vehicle-detection-assignment/road-traffic.mp4"

In [6]:
saveName = "detect.mp4"
compressedName = "detect_compressed.mp4"

In [7]:
size = 800
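# Lowercase 'mp4v' is the tag the MP4 container expects; 'MP4V' makes OpenCV warn and fall back to it
fourcc = cv2.VideoWriter_fourcc(*'mp4v')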

writer = cv2.VideoWriter(filename=saveName, fourcc=fourcc, fps=30, frameSize=(size, size))
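
One detail worth checking here: `frameSize` is given as `(width, height)`, while a NumPy frame's `shape` is `(height, width, channels)`, and OpenCV typically skips frames whose size does not match the writer instead of raising an error. A small guard such as the hypothetical `matches_writer` helper below (not part of OpenCV or PyVision) makes a mismatch visible before writing.

```python
def matches_writer(frame_shape, frame_size):
    """Check a frame's (height, width, ...) shape against a (width, height) writer size."""
    height, width = frame_shape[:2]
    return (width, height) == tuple(frame_size)


size = 800
ok_square = matches_writer((size, size, 3), (size, size))   # square frames: order does not matter
ok_wide = matches_writer((720, 1280, 3), (1280, 720))       # 1280x720 frame, writer given (w, h)
bad = matches_writer((720, 1280, 3), (720, 1280))           # width and height swapped
print(ok_square, ok_wide, bad)
```

Because this notebook resizes every frame to a square `size` x `size` image, the ordering cannot go wrong here, but it matters as soon as the output is not square.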

In [8]:
cap = cv2.VideoCapture(video_path)


## Inference Loop

Finally, we run inference. For the model's return values, check the [docs](https://github.com/pranjaldatta/PyVision/tree/master/pyvision/detection/yolov3). We store the detected frames in a list, to be written to a video later by `VideoWriter`.

In [9]:
MAX_FRAMES = 700
det_frames = []

if MAX_FRAMES > 3000:
  print("[WARN]: test video contains 3000 frames. Setting MAX_FRAMES to 3000")
  MAX_FRAMES = 3000

progress_bar = tqdm(range(MAX_FRAMES))
for idx in progress_bar:

  ret, frame = cap.read()
  if not ret:
    # No frame returned: end of video or read error
    break
  frame = cv2.resize(frame, (size, size))

  try:
    time, imgs, objs = yolo.detect(frame, save=False, show=False)
  except Exception as err:
    # Skip this frame rather than reusing results from a previous one
    print(f"[WARN]: detection failed on frame {idx+1}: {err}")
    continue

  progress_bar.set_description(
      "Frame Id: {}/{}: {:.4f} secs".format(idx+1, MAX_FRAMES, time)
  )
  det_frames.append(imgs)

# Release the input video once all frames have been read
cap.release()

In [10]:
# Write the detected frames into a video. The result can then be downloaded from the File Explorer tab.

for frame in tqdm(det_frames):
  writer.write(frame)
writer.release()
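
Buffering every detected frame in `det_frames` before writing keeps the loop simple but costs memory. A rough estimate, assuming each stored frame is an 800×800 three-channel `uint8` array (which depends on what `yolo.detect` actually returns); `buffered_bytes` is a helper made up for this calculation:

```python
def buffered_bytes(num_frames, height, width, channels=3, bytes_per_value=1):
    """Approximate memory needed to hold num_frames uncompressed frames."""
    return num_frames * height * width * channels * bytes_per_value


total = buffered_bytes(num_frames=700, height=800, width=800)
print(f"{total} bytes, about {total / 2**30:.2f} GiB")
```

Under that assumption, 700 frames take roughly 1.25 GiB, so for longer clips it may be preferable to call `writer.write` inside the inference loop instead of collecting the frames first.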




In [11]:
# Compression step: re-encode with H.264 (libx264) so that the video works on Colab

if os.path.exists(compressedName):
  os.remove(compressedName)
compressed_path = compressedName
os.system(f"ffmpeg -i {saveName} -vcodec libx264 {compressed_path}")


0
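
`os.system` returns only the exit status (the `0` above) and passes the command through a shell. A possible alternative, sketched under the assumption that `ffmpeg` is on the `PATH`, builds the argument list explicitly and lets `subprocess.run` raise an error on failure. The helper names `build_ffmpeg_cmd` and `compress` are made up for this example.

```python
import subprocess


def build_ffmpeg_cmd(src, dst):
    """Build the ffmpeg argument list for re-encoding src to H.264."""
    # -y overwrites dst if it already exists, so no manual os.remove is needed
    return ["ffmpeg", "-y", "-i", src, "-vcodec", "libx264", dst]


def compress(src, dst):
    """Run ffmpeg; raises subprocess.CalledProcessError on a non-zero exit."""
    subprocess.run(build_ffmpeg_cmd(src, dst), check=True)


cmd = build_ffmpeg_cmd("detect.mp4", "detect_compressed.mp4")
print(" ".join(cmd))
```

Calling `compress(saveName, compressedName)` would then replace the `os.system` line. Passing a list avoids shell quoting problems if a file name ever contains spaces.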

In [12]:
if MAX_FRAMES > 700:
  print("Too many frames! Colab takes too much time to display large videos! Download the video from 'File Explorer' and play it locally!")
  raise ValueError("Too many frames to display! Please download and play the video locally!")

# Show video (the file is embedded in the page as a base64 data URI)
with open(compressed_path, 'rb') as f:
  mp4 = f.read()
decoded_vid = "data:video/mp4;base64," + b64encode(mp4).decode()
HTML(f'<video width=800 controls><source src="{decoded_vid}" type="video/mp4"></video>')
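
The display step works by inlining the whole file into the page as a base64 data URI. Base64 inflates the payload by about a third, which is one reason large videos become slow to show this way. A minimal sketch of the same idea as reusable functions follows; the names `video_data_uri` and `video_tag` are illustrative and not part of any library.

```python
from base64 import b64encode


def video_data_uri(raw, mime="video/mp4"):
    """Encode raw bytes as a data URI."""
    return f"data:{mime};base64,{b64encode(raw).decode('ascii')}"


def video_tag(raw, width=800):
    """Return an HTML <video> element with the bytes embedded inline."""
    return (
        f'<video width="{width}" controls>'
        f'<source src="{video_data_uri(raw)}" type="video/mp4">'
        "</video>"
    )


# Tiny stand-in payload; a real call would pass the bytes of the MP4 file
uri = video_data_uri(b"abc")
tag = video_tag(b"abc")
print(uri)
```

In the notebook, passing the bytes read from `compressed_path` to `video_tag` and wrapping the result in `HTML(...)` would render the same player as the cell above.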