# **Object Detection in Video**

Object detection is one of the most active areas of computer vision. Many techniques exist for detecting objects in images and videos, and each has its own trade-offs in computing resources and execution time. Detecting objects in videos, in particular, usually demands considerable technical knowledge and resources.

So there is always demand for an easy and quick method of object detection. In this article, we show how to detect objects in a video in only 5 steps. We use the pixellib library, which detects objects through instance segmentation, together with a pre-trained Mask R-CNN model that identifies the objects seen in the video. In this implementation, we detect vehicles in a traffic video.

To read more about the model, please refer to [this](https://analyticsindiamag.com/hands-on-guide-to-detect-objects-in-video-in-5-steps/) article.

# **Code Implementation**

We now walk through the steps for detecting objects in a video.

1. Install library and dependencies

In the first step, we need to install the pixellib library and its dependencies.

In [None]:

# Upgrade pip, then install pixellib and the other packages used in this notebook
!python -m pip install pip --upgrade --user -q --no-warn-script-location
!python -m pip install numpy pandas seaborn matplotlib scipy statsmodels scikit-learn nltk opencv-python tensorflow keras torch torchvision \
    tqdm scikit-image pixellib pytube --user -q --no-warn-script-location

# Restart the kernel so the newly installed packages can be imported
import IPython
IPython.Application.instance().kernel.do_shutdown(True)
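
After the kernel restarts, it can help to confirm that the key packages actually installed. The following is a minimal sketch using only the standard library; the helper name `installed_versions` and the short list of packages checked are illustrative choices, not part of the original notebook.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Packages this walkthrough relies on directly (illustrative subset)
required = ["pixellib", "tensorflow", "opencv-python", "pytube"]
for name, version in installed_versions(required).items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```

Any package reported as not installed should be reinstalled before continuing to the next step.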


2. Load pre-trained Mask-RCNN weights

As we are going to use the Mask R-CNN model to detect objects, we will download its pre-trained weights.



In [2]:
# !wget --quiet https://github.com/matterport/Mask_RCNN/releases/download/v2.0/mask_rcnn_coco.h5
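
Where `wget` is not available, the same file can be fetched from Python. This is a minimal sketch, assuming the release URL from the cell above is still reachable; the helper `download_if_missing` and the local file name are hypothetical names introduced here for illustration.

```python
import os
import urllib.request

# URL taken from the wget command above
WEIGHTS_URL = "https://github.com/matterport/Mask_RCNN/releases/download/v2.0/mask_rcnn_coco.h5"


def download_if_missing(url, dest):
    """Download url to dest unless dest already exists; return dest."""
    if not os.path.exists(dest):
        # Write to a temporary name first so an interrupted download
        # never leaves a partial file at dest.
        tmp = dest + ".part"
        urllib.request.urlretrieve(url, tmp)
        os.replace(tmp, dest)
    return dest


# Example call (the weights file is large, so the first run takes a while):
# weights_path = download_if_missing(WEIGHTS_URL, "mask_rcnn_coco.h5")
```

Because the function skips the download when the file is already present, re-running the notebook does not fetch the weights again.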

3. Import libraries

Now we import the pixellib library installed earlier, along with the instance_segmentation class, because we are going to detect objects using the instance segmentation method.



In [3]:
import pixellib
from pixellib.instance import instance_segmentation

4. Instantiate instance segmentation model and load Mask-RCNN weights

In this step, we will instantiate the instance segmentation class that is provided by pixellib and load the Mask R-CNN model with its pre-trained weights.

In [4]:
segment_video = instance_segmentation()
segment_video.load_model("https://gitlab.com/AnalyticsIndiaMagazine/practicedatasets/-/raw/main/object_detection/mask_rcnn_coco.h5")



5. Detect objects

In this step, we run Mask R-CNN object detection on a video. We use a traffic video in which we want to detect vehicles.

The process_video method takes a frames_per_second argument, which sets the number of frames per second the output video will have.
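
To see what a given frame rate means for the result, the arithmetic can be sketched as follows. This assumes each processed frame is written to the output exactly once, which the source does not state explicitly; the helper names are hypothetical, and `source_frame_info` relies on OpenCV's `VideoCapture` properties to read the input video's own frame count and frame rate.

```python
def output_duration_seconds(frame_count, frames_per_second):
    """Playback length of a video with frame_count frames written at frames_per_second."""
    if frames_per_second <= 0:
        raise ValueError("frames_per_second must be positive")
    return frame_count / frames_per_second


def source_frame_info(path):
    """Read (frame_count, fps) of a video file with OpenCV."""
    import cv2  # imported here so the arithmetic above works without OpenCV

    capture = cv2.VideoCapture(path)
    try:
        frame_count = int(capture.get(cv2.CAP_PROP_FRAME_COUNT))
        fps = capture.get(cv2.CAP_PROP_FPS)
    finally:
        capture.release()
    return frame_count, fps


# 450 frames written at 15 frames per second play back in 30 seconds
print(output_duration_seconds(450, 15))  # 30.0
```

If the chosen frames_per_second is lower than the source frame rate, the output plays more slowly than the original; matching the source rate keeps the playback speed unchanged.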

In [8]:
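from pytube import YouTube

# Download the traffic video from YouTube into the current working directory
video_url = 'https://www.youtube.com/watch?v=G7FZu0-q0j0'
youtube = YouTube(video_url)
video = youtube.streams.first()  # first available stream
video.download()  # returns the path of the saved file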

'/content/Vehicle Detection from Traffic Video using Mask RCNN.mp4'

In [None]:
segment_video.process_video("Vehicle Detection from Traffic Video using Mask RCNN.mp4", show_bboxes=True, frames_per_second=15, output_video_name="object_detect.mp4")



No. of frames: 1
No. of frames: 2
No. of frames: 3
...
No. of frames: 56
...

({'class_ids': array([7, 3, 3, 1, 3], dtype=int32),
  'masks': array([[[False, False, False, False, False],
          [False, False, False, False, False],
          [False, False, False, False, False],
          ...,
          [False, False, False, False, False],
          [False, False, False, False, False],
          [False, False, False, False, False]],

         ...,
  ...
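
The class_ids in this output are numeric labels. The sketch below turns them into readable counts, assuming the IDs index into the COCO label list in the order used by the Matterport Mask R-CNN implementation (with index 0 as background); that ordering and the helper `count_detections` are assumptions for illustration and are not stated in this notebook.

```python
from collections import Counter

# Assumption: first entries of the COCO label list in the order used by the
# Matterport Mask R-CNN implementation (index 0 is the background class).
COCO_CLASS_NAMES = ["BG", "person", "bicycle", "car", "motorcycle",
                    "airplane", "bus", "train", "truck"]


def count_detections(class_ids, class_names=COCO_CLASS_NAMES):
    """Count detected instances per class name; unknown IDs keep their number."""
    names = [class_names[i] if 0 <= i < len(class_names) else f"id_{i}"
             for i in class_ids]
    return Counter(names)


# class_ids copied from the output above
counts = count_detections([7, 3, 3, 1, 3])
print(counts.most_common())
```

Under this assumed ordering, the five detections in the frame are three cars, one train, and one person.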

In [None]:
from IPython.display import Video
Video("object_detect.mp4")