# Object detection with ImageAI

In this tutorial we will be using the [ImageAI library](https://github.com/OlafenwaMoses/ImageAI).
It provides classes and pre-trained models for both object detection and image recognition, as well as the ability to train your own custom models.

## Install ImageAI

In [2]:
# !pip install -U imageai

In [3]:
# NOTE: The large model files are not included in the repo because of GitHub's file size limits.
# Please download them with the wget commands below.

## Upload the models to Colab

Download them to your local computer and then upload to Colab.

 * [YOLO v3](https://github.com/OlafenwaMoses/ImageAI/releases/download/3.0.0-pretrained/yolov3.pt)
 * [Tiny YOLO v3](https://github.com/OlafenwaMoses/ImageAI/releases/download/3.0.0-pretrained/tiny-yolov3.pt)
 * [RetinaNet](https://github.com/OlafenwaMoses/ImageAI/releases/download/3.0.0-pretrained/retinanet_resnet50_fpn_coco-eeacb38b.pth)


```Python
from google.colab import files

uploaded = files.upload()
```

## Set runtime to GPU

 Runtime > Change runtime type > Hardware accelerator > GPU
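
To confirm that the GPU runtime is actually active, a quick check along the following lines can help. This is a minimal sketch that assumes PyTorch is the backend (the `.pt` model files and the `torch` install further down suggest so); the helper name `pick_device` is hypothetical and not part of ImageAI.

```python
def pick_device():
    """Return "cuda" if PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the check also runs without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print("Running on:", pick_device())
```

If this prints `cpu` on Colab, the runtime type was probably not switched, or the session needs a restart.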

## The data

Upload `tutorial_images.zip` and unzip it.

In [4]:
# from google.colab import files
# uploaded = files.upload()

In [5]:
# # Google Colab code to download all the files.
# # Download retinanet
# !wget https://github.com/OlafenwaMoses/ImageAI/releases/download/3.0.0-pretrained/retinanet_resnet50_fpn_coco-eeacb38b.pth

# # Download YOLOv3
# !wget https://github.com/OlafenwaMoses/ImageAI/releases/download/3.0.0-pretrained/yolov3.pt

# # Download Tiny YOLOv3
# !wget https://github.com/OlafenwaMoses/ImageAI/releases/download/3.0.0-pretrained/tiny-yolov3.pt
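
Outside Colab, where `wget` may not be available, the same downloads can be done from Python. The sketch below uses only the standard library and skips files that are already present; `download_if_missing` is a hypothetical helper name, not something ImageAI provides.

```python
import os
import urllib.request


def download_if_missing(url, target_dir="."):
    """Download `url` into `target_dir` unless a file of that name already exists."""
    filename = os.path.join(target_dir, os.path.basename(url))
    if not os.path.isfile(filename):
        urllib.request.urlretrieve(url, filename)
    return filename


# Example call (commented out because it needs network access):
# download_if_missing(
#     "https://github.com/OlafenwaMoses/ImageAI/releases/download/3.0.0-pretrained/yolov3.pt"
# )
```

Because the check is on the file name only, a partially downloaded file would have to be deleted by hand before retrying.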

In [6]:
# Since the files are already set up in my directory, I will skip this.

# !mkdir -p data/images
# !unzip tutorial_images.zip -d data/images
# image_path = 'data/images'
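
The `mkdir` and `unzip` shell commands above can also be replaced by a few lines of standard-library Python, which behaves the same on every platform. This is only a sketch; `extract_archive` is a hypothetical helper name.

```python
import os
import zipfile


def extract_archive(zip_path, target_dir):
    """Extract `zip_path` into `target_dir` and return the archive's member names."""
    os.makedirs(target_dir, exist_ok=True)  # same effect as `mkdir -p`
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target_dir)
        return archive.namelist()


# Example call, assuming the zip file sits in the working directory:
# extract_archive("tutorial_images.zip", "data/images")
```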

In [7]:
# pip install imageai
# pip install opencv-python
# pip install torch torchvision

In [8]:
import os
from imageai.Detection import ObjectDetection

Create a new instance of the `ObjectDetection` class.

In [9]:
detector = ObjectDetection()

Set the model type of the object detection instance.

In [10]:
# YOLOv3
detector.setModelTypeAsYOLOv3()

Set the path to the downloaded, pre-trained model.

In [11]:
# YOLOv3
detector.setModelPath("yolov3.pt")

Load the model.

In [12]:
detector.loadModel()



## Test the model

In [13]:
detections = detector.detectObjectsFromImage(input_image=os.path.join("/Users/christine/VSCode/MMAI5500_remote/MMAI5500/lecture_notes/lecture_5/data/images", "11.jpg"),
                                             output_image_path="11_detected.jpg",
                                             minimum_percentage_probability=30)

In [14]:
for eachObject in detections:
    print(eachObject["name"] , " : ", eachObject["percentage_probability"], " : ", eachObject["box_points"] )
    print("--------------------------------")

person  :  99.99  :  [322, 37, 548, 359]
--------------------------------
person  :  99.99  :  [99, 52, 302, 362]
--------------------------------
person  :  100.0  :  [577, 72, 798, 457]
--------------------------------
person  :  99.97  :  [0, 91, 280, 525]
--------------------------------
cup  :  99.84  :  [565, 403, 627, 486]
--------------------------------
cup  :  99.76  :  [466, 321, 508, 380]
--------------------------------
cup  :  99.66  :  [334, 390, 384, 458]
--------------------------------
cup  :  97.75  :  [507, 339, 543, 397]
--------------------------------
laptop  :  99.99  :  [169, 256, 418, 413]
--------------------------------
cell phone  :  99.28  :  [507, 399, 562, 422]
--------------------------------
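
Each detection is a dictionary with the keys `name`, `percentage_probability` and `box_points`, as the printed output shows. For images with many objects, a per-class count is often easier to read than the raw list. The sketch below uses `collections.Counter`; `summarize` is a hypothetical helper, and the sample entries are copied from the output above.

```python
from collections import Counter


def summarize(detections):
    """Count how many times each class name appears in a list of detections."""
    return Counter(d["name"] for d in detections)


# Three entries copied from the printed results, for illustration only.
sample = [
    {"name": "person", "percentage_probability": 99.99, "box_points": [322, 37, 548, 359]},
    {"name": "cup", "percentage_probability": 99.84, "box_points": [565, 403, 627, 486]},
    {"name": "laptop", "percentage_probability": 99.99, "box_points": [169, 256, 418, 413]},
]
print(summarize(sample))
```

Calling `summarize(detections)` on the full result gives the total per class, and `sum(summarize(detections).values())` gives the number of detected objects.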


In [15]:
# RetinaNet
detector.setModelPath("/Users/christine/VSCode/MMAI5500_remote/MMAI5500/lecture_notes/lecture_5/retinanet_resnet50_fpn_coco-eeacb38b.pth")

ValueError: invalid path, path not pointing to a valid file.
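
The `ValueError` says that the given path does not point to a file. One way to catch this before handing the path to `setModelPath` is to check it first and list the model files that are actually in that folder. This is a sketch using only the standard library; `resolve_model_path` is a hypothetical helper name.

```python
import os


def resolve_model_path(path):
    """Return the absolute path of a model file, or raise with a hint about what exists."""
    full_path = os.path.abspath(path)
    if os.path.isfile(full_path):
        return full_path
    folder = os.path.dirname(full_path)
    candidates = []
    if os.path.isdir(folder):
        # Collect files that look like model weights to help spot typos.
        candidates = sorted(f for f in os.listdir(folder) if f.endswith((".pt", ".pth")))
    raise FileNotFoundError(
        f"No model file at {full_path}. Model files found in that folder: {candidates}"
    )
```

It could then be used as `detector.setModelPath(resolve_model_path("yolov3.pt"))`.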

In [10]:
# RetinaNet
detector.setModelTypeAsRetinaNet()
detector.setModelPath("/Users/christine/VSCode/MMAI5500_remote/MMAI5500/lecture_notes/lecture_5/retinanet_resnet50_fpn_coco-eeacb38b.pth")
detector.loadModel()

detections = detector.detectObjectsFromImage(input_image=os.path.join("/Users/christine/VSCode/MMAI5500_remote/MMAI5500/lecture_notes/lecture_5/data/images", "11.jpg"),
                                             output_image_path="11_detected.jpg",
                                             minimum_percentage_probability=30)



## Task 1

Test the performance on all images in `tutorial_images` using `YOLOv3` and `minimum_percentage_probability = 30`.

In [None]:
# Did this above
my_images = ["6.jpg", "7.jpg", "8.jpg"]

In [23]:
image_path = "/Users/christine/VSCode/MMAI5500_remote/MMAI5500/lecture_notes/lecture_5/data/images"

In [24]:
detector = ObjectDetection()

# YOLOv3
detector.setModelTypeAsYOLOv3()
detector.setModelPath("yolov3.pt")
detector.loadModel()




In [25]:
# Import Statements
from glob import glob
fnames = glob(image_path +"/*.jpg")
print(fnames)

['/Users/christine/VSCode/MMAI5500_remote/MMAI5500/lecture_notes/lecture_5/data/images/01.jpg', '/Users/christine/VSCode/MMAI5500_remote/MMAI5500/lecture_notes/lecture_5/data/images/03.jpg', '/Users/christine/VSCode/MMAI5500_remote/MMAI5500/lecture_notes/lecture_5/data/images/02.jpg', '/Users/christine/VSCode/MMAI5500_remote/MMAI5500/lecture_notes/lecture_5/data/images/12.jpg', '/Users/christine/VSCode/MMAI5500_remote/MMAI5500/lecture_notes/lecture_5/data/images/06.jpg', '/Users/christine/VSCode/MMAI5500_remote/MMAI5500/lecture_notes/lecture_5/data/images/07.jpg', '/Users/christine/VSCode/MMAI5500_remote/MMAI5500/lecture_notes/lecture_5/data/images/13.jpg', '/Users/christine/VSCode/MMAI5500_remote/MMAI5500/lecture_notes/lecture_5/data/images/05.jpg', '/Users/christine/VSCode/MMAI5500_remote/MMAI5500/lecture_notes/lecture_5/data/images/11.jpg', '/Users/christine/VSCode/MMAI5500_remote/MMAI5500/lecture_notes/lecture_5/data/images/10.jpg', '/Users/christine/VSCode/MMAI5500_remote/MMAI5500

In [20]:
len(fnames)

13
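
Note that `glob` returns the files in arbitrary order, which is why `03.jpg` appears before `02.jpg` in the list above. When the position in the list should line up with the file names, sorting the result helps. A minimal sketch, with `list_images` as a hypothetical helper name:

```python
import os
from glob import glob


def list_images(folder, pattern="*.jpg"):
    """Return the paths in `folder` that match `pattern`, sorted by name."""
    return sorted(glob(os.path.join(folder, pattern)))


# Example call: fnames = list_images(image_path)
```

Sorting by name works here because the files are zero-padded (`01.jpg`, `02.jpg`, ...); without padding, `10.jpg` would sort before `2.jpg`.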

In [22]:
output_folder = "/Users/christine/VSCode/MMAI5500_remote/MMAI5500/output_path"
os.makedirs(output_folder, exist_ok=True)  # make sure the output folder exists

# Note: i is the position in fnames, not the number in the image's file name.
for i, item in enumerate(fnames, start=1):
    output_image_path = os.path.join(output_folder, f"output_image_{i}.jpg")
    detections = detector.detectObjectsFromImage(input_image=item,
                                                 output_image_path=output_image_path,
                                                 minimum_percentage_probability=30)
    # Results for each image:
    for eachObject in detections:
        print(eachObject["name"], " : ", eachObject["percentage_probability"], " : ", eachObject["box_points"])
        print("--------------------------------")

person  :  100.0  :  [174, 0, 669, 431]
--------------------------------
person  :  100.0  :  [93, 10, 333, 396]
--------------------------------
person  :  100.0  :  [476, 41, 703, 373]
--------------------------------
person  :  99.94  :  [351, 85, 379, 171]
--------------------------------
person  :  99.52  :  [1, 78, 31, 247]
--------------------------------
person  :  98.75  :  [80, 82, 125, 233]
--------------------------------
person  :  99.95  :  [372, 81, 399, 159]
--------------------------------
bicycle  :  99.81  :  [401, 135, 712, 421]
--------------------------------
bicycle  :  99.83  :  [67, 145, 320, 421]
--------------------------------
car  :  95.39  :  [455, 91, 531, 160]
--------------------------------
traffic light  :  99.97  :  [658, 10, 685, 24]
--------------------------------
person  :  100.0  :  [282, 5, 442, 167]
--------------------------------
person  :  99.94  :  [281, 49, 335, 150]
--------------------------------
bottle  :  43.97  :  [227, 130, 241, 15

**Tutorial question 1**: How many objects were detected in images 6, 7 and 9?

**Tutorial question 2**: What is detected in image 4?

In [None]:
# Objects identified = 1, 1, 1

In [None]:
# person


## Task 2
Compare the performance of the 3 models.

In [None]:
# Put the detection in a for-loop.

# Build a list of images and feed them to the detector one by one, giving each output a more descriptive name.
# The output for image 1 will be 1_detected, for image 2 it will be 2_detected, and so on.
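
The loop outlined in the notes above could look roughly like this. It is a sketch, not a definitive implementation: `compare_models` and `build_detectors` are hypothetical helper names, the output file naming is an assumption, and `build_detectors` assumes that all three model files sit in the working directory. The comparison only counts detections per image, so judging which model is "best" still requires looking at the output images.

```python
import os


def compare_models(image_paths, detectors, min_probability=30):
    """Return {model_name: {image_path: number_of_detections}}.

    `detectors` maps a label to a loaded detector object that offers
    a detectObjectsFromImage(...) method like ImageAI's ObjectDetection.
    """
    results = {}
    for model_name, detector in detectors.items():
        counts = {}
        for path in image_paths:
            stem = os.path.splitext(os.path.basename(path))[0]
            detections = detector.detectObjectsFromImage(
                input_image=path,
                output_image_path=f"{stem}_{model_name}_detected.jpg",
                minimum_percentage_probability=min_probability,
            )
            counts[path] = len(detections)
        results[model_name] = counts
    return results


def build_detectors():
    """Load the three pre-trained models used in this tutorial."""
    from imageai.Detection import ObjectDetection  # imported lazily

    specs = {
        "yolov3": ("setModelTypeAsYOLOv3", "yolov3.pt"),
        "tiny-yolov3": ("setModelTypeAsTinyYOLOv3", "tiny-yolov3.pt"),
        "retinanet": ("setModelTypeAsRetinaNet", "retinanet_resnet50_fpn_coco-eeacb38b.pth"),
    }
    detectors = {}
    for name, (setter, weights) in specs.items():
        detector = ObjectDetection()
        getattr(detector, setter)()  # e.g. detector.setModelTypeAsYOLOv3()
        detector.setModelPath(weights)
        detector.loadModel()
        detectors[name] = detector
    return detectors


# Example call: results = compare_models(fnames, build_detectors())
```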

In [None]:
# Which model is best? Any of them, really. But we saw in the waterfall image that it mistakenly detected a ledge as a person.

**Tutorial question 3**: Which model was best?

## Object detection in video

In [27]:
!mkdir -p data/video
!unzip tutorial_video.zip -d data/video
video_path = 'data/video'

Archive:  tutorial_video.zip
  inflating: data/video/traffic.mp4  
  inflating: data/video/traffic-mini.mp4  
  inflating: data/video/holo1.mp4    


In [26]:
from imageai.Detection import VideoObjectDetection
detector = VideoObjectDetection()

In [28]:
detector.setModelTypeAsYOLOv3()
detector.setModelPath("yolov3.pt")

In [29]:
detector.loadModel()



In [30]:
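# Store the result under a new name so that video_path keeps pointing at the input folder.
detected_video_path = detector.detectObjectsFromVideo(input_file_path=os.path.join(video_path, "traffic-mini.mp4"),
                                                      output_file_path="traffic-mini_detected",
                                                      frames_per_second=5, log_progress=True)
print(detected_video_path)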

OpenCV: FFMPEG: tag 0x5634504d/'MP4V' is not supported with codec id 12 and format 'mp4 / MP4 (MPEG-4 Part 14)'
OpenCV: FFMPEG: fallback to use tag 0x7634706d/'mp4v'


Processing Frame :  1
Processing Frame :  2
Processing Frame :  3
Processing Frame :  4
Processing Frame :  5
Processing Frame :  6
Processing Frame :  7
Processing Frame :  8
Processing Frame :  9
Processing Frame :  10
Processing Frame :  11
Processing Frame :  12
Processing Frame :  13
Processing Frame :  14
Processing Frame :  15
Processing Frame :  16
Processing Frame :  17
Processing Frame :  18
Processing Frame :  19
Processing Frame :  20
Processing Frame :  21
Processing Frame :  22
Processing Frame :  23
Processing Frame :  24
Processing Frame :  25
Processing Frame :  26
Processing Frame :  27
Processing Frame :  28
Processing Frame :  29
Processing Frame :  30
Processing Frame :  31
Processing Frame :  32
Processing Frame :  33
Processing Frame :  34
Processing Frame :  35
Processing Frame :  36
Processing Frame :  37
Processing Frame :  38
Processing Frame :  39
Processing Frame :  40
Processing Frame :  41
Processing Frame :  42
Processing Frame :  43
Processing Frame :  