#**[tinyvision.ai](https://tinyvision.ai/) Object Detection Tutorial**
---
*Thank you for purchasing a tinyvision.ai Visual SoM development kit. We are eager to see how you put this "tiny" yet powerful object detector to solve everyday problems.*

This tutorial focuses on training a brand-new object detection model that detects human beings in an image stream from a camera. The process extends readily to multi-class object detection, for instance detecting different fruits in a bag, and you can expand the list of objects as you choose. We use the Open Images database for the training and testing images, so it is a useful exercise to search Open Images for the object class you are interested in. If the class is covered by Open Images, annotated training and testing images are easy to obtain. If it is absent, you will have to obtain and annotate the training and testing images yourself. A variety of tools exist for this, for instance the software package LabelImg.

This tutorial uses TensorFlow GPU version 1.14.0. We need this version of TensorFlow to ensure compatibility with the Lattice software packages that will be used after this model training exercise.

#Installing and loading necessary packages


Note that Google Colab currently supports TensorFlow 2.0, but TensorFlow 1.14.0 is used in this tutorial.

**NOTE** After executing the next cell, you must restart your runtime in order to use the installed packages. If no further installations are needed, you can proceed to the next cell without any hurdle. Otherwise, you might have to restart your runtime again. Restarting the runtime takes a couple of seconds.



In [0]:
!apt-get install tree
!pip install -U tensorflow-gpu==1.14.0 awscli easydict numpy pandas matplotlib opencv-python tqdm

In [0]:
from __future__ import print_function, absolute_import
from __future__ import division, unicode_literals
import warnings
warnings.filterwarnings("ignore")

import tensorflow as tf
print(tf.__version__)
# %load_ext tensorboard

import numpy as np
import os, sys, glob, cv2
from tqdm import notebook
tqdm = notebook.tqdm
import datetime, time
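
Because the pinned TensorFlow version only takes effect after the runtime restart, a small check can catch a stale runtime early. The following is a minimal sketch; the helper name `matches_required_version` is hypothetical and not part of the tutorial's code.

```python
def matches_required_version(installed, required="1.14"):
    """Return True if the installed version has the required major.minor prefix.

    `installed` is a version string such as tf.__version__ ("1.14.0").
    """
    installed_parts = installed.split(".")[:2]
    required_parts = required.split(".")[:2]
    return installed_parts == required_parts


# In the notebook, one might call (assuming `tf` has been imported as above):
#   assert matches_required_version(tf.__version__), "Restart the runtime first"
print(matches_required_version("1.14.0"))
print(matches_required_version("2.0.0"))
```

Comparing the split components rather than using a plain string prefix avoids accepting a version such as "1.140.0" by accident.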

#**Downloading OIDv4 Toolkit**

Here, we download the GitHub repository containing the Python scripts needed for downloading the images of certain object classes from the Open Images database. 

In [0]:
## Copy the GitHub repository for Open Images Downloader Toolkit
!git clone https://github.com/EscVM/OIDv4_ToolKit.git

#**Define the Project and Get Data**

We define the project name and the objects that the model trained here will detect. We choose the objects and set the maximum number of images that we want for each class. Most projects will have multiple object classes, so we use the option of downloading the images of all the objects into the same folder; hence the "--multiclass 1" parameter specification. 

**Note:** the images and labels downloaded in this step will be available in the Google Colab workspace. This workspace is temporary, in the sense that the downloaded data is stored only for a particular session/runtime. When the session/runtime is restarted or changed, the data is lost.

In [0]:
#### Case of multi-class object detection problem
# PROJECT_NAME = "Fruits"
# !python ./OIDv4_ToolKit/main.py downloader --classes Apple Orange Banana Mango --limit 800 --type_csv all --multiclass 1 -y

#### Case of single class object detection problem
PROJECT_NAME = "Humans"
!python ./OIDv4_ToolKit/main.py downloader --classes Person --limit 2000 --type_csv all --multiclass 1 -y

**Training and Testing Data**

Here we create the folder structure needed to separate the training and testing datasets. We combine the training and validation images and labels into a single folder and treat it as training data, whereas all the testing data is kept for evaluating the trained object detection model. The ratio between training and testing data is roughly 70:30 in our examples, a ratio commonly found in the machine learning literature. 
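
To see how the merged split works out for a particular download, the ratio can be computed from the three image counts. This is a small sketch; the function name and the counts in the example are hypothetical and chosen only to illustrate a 70:30 outcome.

```python
def split_percentages(n_train, n_valid, n_test):
    """Return (training %, testing %) when train and validation images are merged."""
    n_training = n_train + n_valid
    total = n_training + n_test
    if total == 0:
        raise ValueError("No images were counted")
    return 100.0 * n_training / total, 100.0 * n_test / total


# Hypothetical counts, not taken from an actual Open Images download
train_pct, test_pct = split_percentages(n_train=1200, n_valid=200, n_test=600)
print("training: %.1f%%, testing: %.1f%%" % (train_pct, test_pct))
```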

In [0]:
#### making directories to store the resized data
BASE_PATH = "/content/Resized_Data/" + PROJECT_NAME
!mkdir -p $BASE_PATH"/training/images"
!mkdir -p $BASE_PATH"/training/labels"
!mkdir -p $BASE_PATH"/testing/images"
!mkdir -p $BASE_PATH"/testing/labels"
!mkdir -p $BASE_PATH"/ImageSets"

In [0]:
### Find the folder name that is relevant to your project
folders_OID = os.listdir("./OID/Dataset/train/")

### Choosing the Humans dataset, hence using the word "Person" to detect the right folder. Change the search keyword
### as per your project's requirements.
### You can replace "Person" by "Apple" to search for raw data needed for the "Fruits" project.

### Make sure you have only one folder that contains your training data. If the following assert check fails, 
### try changing the search phrase, so that you can get the unique folder that you need for this project.

folder_OID = [folder for folder in folders_OID if "Person" in folder.split("_")]
print("The folder in demand: ", folder_OID)

assert len(folder_OID) == 1
folder_OID = folder_OID[0]

The folder in demand:  ['Person']
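
The folder lookup above matches the keyword against the underscore-separated parts of each folder name rather than against the raw string. A minimal sketch of the same idea as a reusable function follows; `find_dataset_folder` and the multi-class folder name in the example are hypothetical.

```python
def find_dataset_folder(folder_names, keyword):
    """Return the single folder whose underscore-separated parts contain `keyword`."""
    matches = [name for name in folder_names if keyword in name.split("_")]
    if len(matches) != 1:
        raise ValueError(
            "Expected exactly one folder for %r, found %r" % (keyword, matches)
        )
    return matches[0]


print(find_dataset_folder(["Person"], "Person"))
# Hypothetical multi-class folder name, used only for illustration
print(find_dataset_folder(["Apple_Orange_Banana_Mango"], "Apple"))
```

Matching on whole parts means that a folder such as "Pineapple" would not be selected when searching for "Apple", which a plain substring test would allow.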


#**Image and Bounding Box Resizing**

In this step, we resize the images downloaded from OID to some user-defined values. Note that these image dimensions will also be used for the training algorithm. After the image resizing, we also need to ensure that the bounding box surrounding the objects in the image are resized as well. So a Python script in one of the upcoming cells does this task for each image and its corresponding label text file.
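
The per-line label conversion can be sketched in isolation as follows. It assumes, as the resizing cell in this tutorial implies, that each OID label line holds a class name followed by four pixel coordinates (xmin, ymin, xmax, ymax). The function name `oid_line_to_kitti` is hypothetical, and the two-decimal formatting is a choice made for this sketch.

```python
def oid_line_to_kitti(line, ratio_x, ratio_y):
    """Scale one OID label line and emit a 15-field KITTI-style label line."""
    words = line.split()
    # Multi-word class names are joined with underscores
    name = "_".join(word for word in words if word.isalpha())
    xmin, ymin, xmax, ymax = [float(word) for word in words if not word.isalpha()]
    box = [xmin * ratio_x, ymin * ratio_y, xmax * ratio_x, ymax * ratio_y]
    # Unused KITTI fields (truncation, occlusion, angle, 3D data) are zero-filled
    fields = [name, "0.00", "0", "0.00"]
    fields += ["%.2f" % value for value in box]
    fields += ["0.00"] * 7
    return " ".join(fields)


# Example: a 1024 x 768 image resized to 64 x 64
print(oid_line_to_kitti("Person 256.0 192.0 768.0 576.0", 64 / 1024, 64 / 768))
```

The x coordinates are scaled by the width ratio and the y coordinates by the height ratio, so a box keeps its relative position even when the aspect ratio of the image changes.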

**NOTE** After the resizing operation, make sure you execute the step of copying the resized training and testing images and bounding boxes to your Google Drive account. If you quit the session after resizing the images and the boxes, the data will be lost from the (temporary) Google Colab workspace and you will have to start again from the download stage described above.

In [0]:
### setting up the folders for accessing the images and labels for resizing operation
### Make sure you are in the "/content" directory, the default Colab working directory.
%cd "/content"
folders = ["train", "test", "validation"]
train_folder = os.path.join("/content/OID/Dataset/train", folder_OID)
test_folder = os.path.join("/content/OID/Dataset/test", folder_OID)
valid_folder = os.path.join("/content/OID/Dataset/validation", folder_OID)

store_train_folder = os.path.join(BASE_PATH, "training")
store_test_folder = os.path.join(BASE_PATH, "testing")

In [0]:
# the target image size is 64 x 64 pixels. 
IMAGE_WIDTH = 64
IMAGE_HEIGHT = 64
TARGET_SIZE = (IMAGE_WIDTH, IMAGE_HEIGHT)


tasks = ["train", "valid", "test"]
#### add functionality to check what all actions have been completed. Start where you stopped function
for task in tasks:
    if task in ("train", "valid"):
        ### Ensure that the training and validation images will be used for training, rest for testing
        image_save_folder = os.path.join(store_train_folder, "images")
        label_save_folder = os.path.join(store_train_folder, "labels")
    elif task == "test":
        image_save_folder = os.path.join(store_test_folder, "images")
        label_save_folder = os.path.join(store_test_folder, "labels")

    ## Rest of the portion of the for loop is independent of the task
    ## (once the image/label folders have been specified)
    
    ### find the names (without extensions) of the image files
    image_folder = {"train": train_folder, "valid": valid_folder, "test": test_folder}[task]
    files = os.listdir(image_folder)
    filenames = [f.split(".")[0] for f in files if f.endswith(".jpg")]
    ### the image annotations are found using the following path
    label_folder = os.path.join(image_folder, "Label")
    
    for i in tqdm(range(len(filenames)), desc=task.upper()):
        ## Load the image and resize to TARGET_SIZE
        img = cv2.imread(os.path.join(image_folder, filenames[i] + ".jpg"))
        height, width, channel = img.shape
        ratio_x, ratio_y = TARGET_SIZE[0]/width, TARGET_SIZE[1]/height
        img = cv2.resize(img, TARGET_SIZE, interpolation=cv2.INTER_AREA)

        ## Save the resized image file
        cv2.imwrite(os.path.join(image_save_folder, filenames[i] + ".jpg"), img)
        del img

        ## Resizing the bounding box information for the above loaded image
        with open(os.path.join(label_folder, filenames[i] + ".txt"), mode="r") as f:
            txt = f.readlines()
        txt = [line.replace("\n", "") for line in txt]
        txt_new = []
        for line in txt:
            line = line.split(" ")
            ### storing the labels in the KITTI format, as required by the squeezeDet algorithm
            objects = [word for word in line if word.isalpha()]
            if len(objects) > 1: 
                objects = "_".join(objects)
            else:
                objects = objects[0]

            nums = [float(word) for word in line if not word.isalpha()]
            nums[0], nums[1] = nums[0] * ratio_x, nums[1] * ratio_y
            nums[2], nums[3] = nums[2] * ratio_x, nums[3] * ratio_y 
            nums = list(map(str, nums))
            line = " ".join([objects] + ["0.00", "0", "0.00"] + nums + ["0.00"]*7)
            txt_new.append(line)

        ## saving the resized bounding box numbers
        with open(os.path.join(label_save_folder, filenames[i] + ".txt"), mode="w") as f_w:
            for line in txt_new:
                f_w.write(line)
                f_w.write("\n")

### Generating image sets for training and testing
!ls $BASE_PATH"/training/images/" | grep ".jpg" | sed s/.jpg// > $BASE_PATH"/ImageSets/train.txt"
!ls $BASE_PATH"/testing/images/" | grep ".jpg" | sed s/.jpg// > $BASE_PATH"/ImageSets/test.txt"
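
The image-set files can also be produced in Python, which makes it easy to check that every image has a matching label file. This is a sketch of one possible approach; `write_image_set` and `images_without_labels` are hypothetical helpers, not part of the tutorial's repository.

```python
import os


def write_image_set(image_dir, out_file, extension=".jpg"):
    """Write the extension-less names of all images in image_dir, one per line."""
    stems = sorted(
        os.path.splitext(name)[0]
        for name in os.listdir(image_dir)
        if name.endswith(extension)
    )
    with open(out_file, "w") as f:
        for stem in stems:
            f.write(stem + "\n")
    return stems


def images_without_labels(image_dir, label_dir, extension=".jpg"):
    """Return image names (without extension) that have no .txt label file."""
    labels = {os.path.splitext(name)[0] for name in os.listdir(label_dir)}
    images = {
        os.path.splitext(name)[0]
        for name in os.listdir(image_dir)
        if name.endswith(extension)
    }
    return sorted(images - labels)


# Possible usage in the notebook (paths follow the folders created earlier):
#   write_image_set(BASE_PATH + "/training/images", BASE_PATH + "/ImageSets/train.txt")
#   print(images_without_labels(BASE_PATH + "/training/images", BASE_PATH + "/training/labels"))
```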

#**Connect your Google Drive Account**

Connect your Google Drive account to store the resized image and object label files. 

We are not interested in storing the high-resolution images (downloaded from OID) in Google Drive. There are two reasons for this decision:
1. We do not want to fill up the free space in your Google account. It's better to store a 4 KB image than a 1 MB image. 
2. During training, the images are resized anyway to a much smaller size than what we receive from OID. So it makes sense to store the images in the final (desired) dimensions directly before the training step. Moreover, if we performed the resizing operation by repeatedly transporting data between Colab and Drive, we would observe extremely low speeds of image resizing and storage in Drive. 

Solution: Since the OID images and labels are present in the temporary workspace of Colab, we can resize the images and labels right here. Then, we can send the small-sized images to a specific Drive folder. That way, the image resizing happens at very high speeds. And you can store the raw data for future training purposes without cluttering up your Google account space.
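
The storage argument can be made concrete with a rough estimate. The sketch below uses the per-image sizes quoted above (about 1 MB raw, about 4 KB resized) and the download limit of 2000 images from this tutorial; actual file sizes will vary, and the helper name is hypothetical.

```python
def dataset_size_mb(n_images, kb_per_image):
    """Approximate dataset size in MB for a given average image size in KB."""
    return n_images * kb_per_image / 1024.0


raw_mb = dataset_size_mb(2000, 1024)   # about 1 MB per raw image
resized_mb = dataset_size_mb(2000, 4)  # about 4 KB per resized image
print("raw: %.0f MB, resized: %.1f MB" % (raw_mb, resized_mb))
```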

**NOTE**: In the upcoming cell (after the Drive mount), we copy the resized data from the Colab workspace to a specific folder in Drive. This transfer takes time: after the copy command returns, some of the data is still in the pipeline between Colab and Drive, in the sense that it has left Colab but has not yet reached Drive. The cell therefore waits after initiating the copy so that Drive can finish storing the data.

In [0]:
from google.colab import drive
drive.mount('/content/drive')

Here we create a new folder "TinyVision" in your Google Drive to store the training and testing data, as well as the Python code needed for training. For the Python code, we clone a GitHub repository. In that repository, we have to create a folder named "data" (case sensitive). This newly created folder will store the (resized) training data.

In [0]:
DEST_PATH = r"/content/drive/My\ Drive/TinyVision/tinyvision_ai_Object_Detection"  ### raw string: the backslash escapes the space for the shell
%mkdir -p $DEST_PATH
%cd $DEST_PATH
!git clone https://github.com/chatterjeesandipan/SqueezeDet_Quantized.git
%mkdir -p ./SqueezeDet_Quantized/data

In [0]:
### Copying the Resized_Data to ./SqueezeDet_Quantized/data/
%cd "/content"
!rsync --recursive --progress $BASE_PATH $DEST_PATH"/SqueezeDet_Quantized/data"

### Do NOT REMOVE the statements below:
SECONDS_WAIT = int(20 * 60)  ### first number shows the number of minutes
for i in tqdm(range(SECONDS_WAIT), desc="Uploading to Drive"):
    time.sleep(1)

### After copying the resized data, you should see the same folder structures at the source as well as the destination folders
### Source folder
!tree -d "/content/Resized_Data"

### Destination folder
%cd $DEST_PATH
!tree -d "./SqueezeDet_Quantized/data"
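
Comparing the two `tree -d` listings shows that the folder structure matches, but not that every file has arrived. As a complement, the per-folder file counts can be compared. This is a minimal sketch with hypothetical helper names; it checks counts only, not file contents.

```python
import os


def count_files(root):
    """Map each sub-directory (relative to root) to its number of files."""
    counts = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        counts[os.path.relpath(dirpath, root)] = len(filenames)
    return counts


def copy_is_complete(src_root, dst_root):
    """Return True if both trees have the same folders and file counts."""
    return count_files(src_root) == count_files(dst_root)


# Possible usage in the notebook. Note that Python path functions need the plain
# path ("My Drive"), not the shell-escaped form stored in DEST_PATH:
#   copy_is_complete("/content/Resized_Data/" + PROJECT_NAME,
#                    "./SqueezeDet_Quantized/data/" + PROJECT_NAME)
```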

#**Training Step**

Here we execute the training code. The training code uses the framework developed by [Bichen Wu](https://github.com/BichenWuUCB/squeezeDet). This framework trains an object detection model that uses considerably fewer parameters than typical VGG or ResNet models. The paper published by Wu *et al.* can be found [here](https://arxiv.org/abs/1612.01051).

Note that the training code needs the class names for which the object detection model will be trained. Ensure that all the class names are written in lower case and separated by a blank space. The names will be processed in the training code before they are made available to the main training framework.
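
The class-name string can be derived from the names used for the download. The following sketch lowercases each name and separates the names with single spaces. Joining multi-word names with an underscore mirrors the label files written by the resizing cell, but whether the training code accepts such names is an assumption that has not been verified here; the function name is hypothetical.

```python
def build_classes_argument(oid_class_names):
    """Build a lower-case, space-separated class string from OID class names."""
    return " ".join(
        name.strip().lower().replace(" ", "_") for name in oid_class_names
    )


print(build_classes_argument(["Person"]))
print(build_classes_argument(["Apple", "Orange", "Banana", "Mango"]))
```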

Unfortunately, in Colab, TensorBoard visualization of the training process while it runs is not available, because the code cells are processed sequentially. Hence, the only option is to wait for the training process to complete and then launch a TensorBoard window to visualize the graph, the loss scalars and the performance on some sample training images.

In [0]:
%cd $DEST_PATH
LOG_PATH = "./SqueezeDet_Quantized/LOGS/" + PROJECT_NAME + "/train"
DATA_PATH = "./SqueezeDet_Quantized/data/" + PROJECT_NAME

### Check that you have more than 800 training samples in the DATA_PATH training/images directory
assert len(os.listdir(os.path.join(DATA_PATH, "training/images"))) > 800,"Too few training samples, Get more data"
       
### Generating image sets for training and testing
!ls $DATA_PATH"/training/images/" | grep ".jpg" | sed s/.jpg// > $DATA_PATH"/ImageSets/train.txt"
!ls $DATA_PATH"/testing/images/" | grep ".jpg" | sed s/.jpg// > $DATA_PATH"/ImageSets/test.txt"

### For the class names, look for the names you used to download training data from OIDv4.
### In this tutorial, we have the "Fruits" and "Humans" projects.

### In this example, for the fruit detection case
# --classes="apple banana orange mango" 

### In this example, for the human detection case
# --classes="person" 

!python ./SqueezeDet_Quantized/src/train.py --dataset=KITTI \
--data_path=$DATA_PATH --image_set=train --batch_size=128 \
--net=squeezeDet  --classes="person" --gpu=0 \
--train_dir=$LOG_PATH --learning_rate=0.01 \
--summary_step=250 --checkpoint_step=500 --max_steps=10000 
# --image_width=$IMAGE_WIDTH --image_height=$IMAGE_HEIGHT   
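
The flags above determine how much work the training run does. The sketch below works through the arithmetic for the values used in this tutorial. The training-set size of 2000 images is a hypothetical figure, and the checkpoint and summary counts are approximate because they depend on whether the training script also writes at the first or last step.

```python
def training_budget(max_steps, batch_size, n_train_images,
                    checkpoint_step, summary_step):
    """Summarize the work implied by the training flags."""
    images_seen = max_steps * batch_size
    return {
        "images_seen": images_seen,
        "approx_epochs": images_seen / float(n_train_images),
        "checkpoints": max_steps // checkpoint_step,
        "summaries": max_steps // summary_step,
    }


budget = training_budget(max_steps=10000, batch_size=128, n_train_images=2000,
                         checkpoint_step=500, summary_step=250)
for key in sorted(budget):
    print(key, budget[key])
```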

#**Tensorboard visualization**

We visualize the model training metrics on TensorBoard, along with the computation graph and some sample images on which inference is performed while training. Unfortunately, because the cells operate serially in Colab, visualizing the training process on TensorBoard while it is running is not possible at this moment.


In [0]:
%load_ext tensorboard
%tensorboard --logdir $LOG_PATH

In [0]:
### The tensorboard header mentions the code needed to stop the tensorboard visualization. Implement that code here.
### It generally appears like the following. Note that the number (606) will be different for each tensorboard run.
!kill 606

#**Generating Frozen Inference Graph (.pb File)**

Here we use the script "genpb.py" to obtain the frozen inference graph of the trained model. This gives us the .pbtxt and .pb files. After this step, follow the instructions for downloading the Lattice software (to be discussed later). 
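
Before freezing the graph, it can help to confirm that the training run actually left a checkpoint in the log folder. The sketch below looks for files named like `model.ckpt-<step>.index`, which is the usual naming produced by a TensorFlow 1.x saver. The `model.ckpt` prefix is an assumption about the training script, and the helper name is hypothetical.

```python
import os
import re


def latest_checkpoint_step(ckpt_dir, prefix="model.ckpt"):
    """Return the highest step number among checkpoint index files, or None."""
    pattern = re.compile(r"^" + re.escape(prefix) + r"-(\d+)\.index$")
    steps = []
    for name in os.listdir(ckpt_dir):
        match = pattern.match(name)
        if match:
            steps.append(int(match.group(1)))
    return max(steps) if steps else None


# Possible usage in the notebook, after changing into the project folder:
#   print(latest_checkpoint_step("./SqueezeDet_Quantized/LOGS/Humans/train"))
```

A return value of `None` would suggest that training did not reach its first checkpoint or that the log path is wrong.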

In [0]:
LOG_PATH = "./SqueezeDet_Quantized/LOGS/" + PROJECT_NAME + "/train"
DATA_PATH = "./SqueezeDet_Quantized/data/" + PROJECT_NAME
%cd $DEST_PATH

#### Note class names must be provided in lower case
#### if project_name is "Humans", then enter --classes person
#### if project_name is "Fruits", then enter --classes apple banana orange mango

!python ./SqueezeDet_Quantized/src/genpb.py --ckpt_dir $LOG_PATH --freeze True \
--classes person --image_width 64 --image_height 64 