<a href="https://colab.research.google.com/github/elephantscale/E2E-Object-Detection-in-TFLite/blob/master/colab_training/Tutorial_Object_Detection_with_TFLite.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Tutorial: Object Detection with TFLite

## Introduction

Imagine that you have a niece or a nephew and you want to give them a present.
When you were growing up, your aunt gave you a "Find the Duck" book. You had lots of fun
finding the duck on every page of this board book. Today, you want to make this book
into a computer game. For that, you need to be able to teach the computer how to find
the duck. This is what this tutorial will teach you.

<img src="find-the-duck-50.png"/>

## Our plan

The task that you are about to undertake is called "object detection." The good news is that
Google's TensorFlow library already does most of the groundwork for object detection.
Furthermore, TensorFlow Lite, a part of that library, will help you put your application on a
phone or another device. The end result of your object detection will look like the screenshot below:
out of a known set of objects, you will be able to detect which ones are present
in the picture and where they are located.

<img src="object-detection.png"/>

We will do it in three steps. First, you will prepare the data: collect images of the objects that you want to identify
and convert them to the TFRecord format that the Object Detection API expects.
Second, you will train the model with this data. Third, you will export the model
to TFLite, preparing it to be used in your phone app. In the next tutorial,
we will teach you how to use the resulting TFLite model in your phone app. So, let us start.

## Data collection

Dataset homepage: https://www.kaggle.com/mbkinaci/fruit-images-for-object-detection

In [None]:
!wget https://github.com/elephantscale/E2E-Object-Detection-in-TFLite/raw/master/data/Fruit_Images_for_Object_Detection.zip

In [None]:
!unzip -qq Fruit_Images_for_Object_Detection.zip

## Generate intermediate files

To generate TFRecords from our fruit dataset, we first generate a `.csv` file that contains the following fields:
- filename
- width
- height
- class
- xmin
- ymin
- xmax
- ymax
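
The `xml_to_csv.py` script is not reproduced in this notebook. As a rough sketch of what such a conversion involves, one annotation file could be flattened into rows with the fields listed above as shown here. This assumes the annotations are Pascal VOC-style XML, which this tutorial does not confirm; the function name `voc_xml_to_rows` and the sample XML are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample annotation in Pascal VOC style (assumed format).
SAMPLE_XML = """
<annotation>
  <filename>apple_1.jpg</filename>
  <size><width>300</width><height>200</height></size>
  <object>
    <name>apple</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>150</xmax><ymax>180</ymax></bndbox>
  </object>
</annotation>
"""

def voc_xml_to_rows(xml_text):
    """Return one dict per annotated object, keyed by the CSV field names."""
    root = ET.fromstring(xml_text.strip())
    filename = root.findtext("filename")
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    rows = []
    # An image with several objects yields several rows with the same filename.
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        rows.append({
            "filename": filename,
            "width": width,
            "height": height,
            "class": obj.findtext("name"),
            "xmin": int(box.findtext("xmin")),
            "ymin": int(box.findtext("ymin")),
            "xmax": int(box.findtext("xmax")),
            "ymax": int(box.findtext("ymax")),
        })
    return rows

rows = voc_xml_to_rows(SAMPLE_XML)
print(rows)
```

Each row describes one bounding box, so a single image can appear on several lines of the resulting `.csv` file.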

In [None]:
!wget https://raw.githubusercontent.com/elephantscale/E2E-Object-Detection-in-TFLite/master/colab_training/xml_to_csv.py

In [None]:
!python xml_to_csv.py

In [None]:
!head -5 /content/train_zip/train_labels.csv

In [None]:
!head -5 /content/test_zip/test_labels.csv

Now that we have `.csv` files we can do some basic exploratory data analysis (EDA) to better understand the dataset.

## Basic EDA

In [None]:
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import cv2
import os

In [None]:
train_df = pd.read_csv("/content/train_zip/train_labels.csv")
test_df = pd.read_csv("/content/test_zip/test_labels.csv")

In [None]:
train_df.head()

In [None]:
test_df.head()

In [None]:
train_df["class"].value_counts()

In [None]:
test_df["class"].value_counts()

In [None]:
def show_images(df, is_train=True):
    if is_train:
        root = "/content/train_zip/train"
    else:
        root = "/content/test_zip/test"
    plt.figure(figsize=(15,15))
    for i in range(10):
        # Draw a random row position as a plain Python int.
        n = int(np.random.randint(df.shape[0]))
        plt.subplot(5,5,i+1)
        plt.xticks([])
        plt.yticks([])
        plt.grid(False)
        image = plt.imread(os.path.join(root, df["filename"].iloc[n]))
        plt.imshow(image)
        label = df["class"].iloc[n]
        plt.xlabel(label)
    plt.show()

In [None]:
show_images(train_df)

In [None]:
show_images(test_df, is_train=False)

In [None]:
def verify_annotations(df, is_train=True):
    if is_train:
        root = "/content/train_zip/train"
    else:
        root = "/content/test_zip/test"
    
    plt.figure(figsize=(12,12))
    for i in range(3):
        # Draw a random row position as a plain Python int.
        n = int(np.random.randint(df.shape[0]))
        plt.subplot(1,3,i+1)
        plt.xticks([])
        plt.yticks([])

        # Copy so OpenCV can draw on the array even if imread returns a read-only buffer.
        image = plt.imread(os.path.join(root, df["filename"].iloc[n])).copy()
        xmin, ymin = int(df["xmin"].iloc[n]), int(df["ymin"].iloc[n])
        xmax, ymax = int(df["xmax"].iloc[n]), int(df["ymax"].iloc[n])
        cv2.rectangle(image, (xmin, ymin), (xmax, ymax), (255,0,0), 3)
        plt.imshow(image)
    
    plt.show()

In [None]:
verify_annotations(train_df, is_train=True)

In [None]:
verify_annotations(test_df, is_train=False)

As we can see, the dataset has annotation issues. Training can suffer a lot from this, so a model trained on this dataset might yield unexpected results.
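
Visual inspection only covers a few random images. As a complementary sketch, the boxes could also be screened programmatically for obvious problems such as inverted corners or coordinates outside the image. The function `find_bad_boxes` and the sample rows are hypothetical; with the DataFrames loaded above, `train_df.to_dict("records")` would supply rows in the same shape. This check only catches geometric errors, not boxes drawn around the wrong object.

```python
def find_bad_boxes(rows):
    """Return (row_index, reason) pairs for boxes that look invalid."""
    problems = []
    for i, r in enumerate(rows):
        if r["xmin"] >= r["xmax"] or r["ymin"] >= r["ymax"]:
            problems.append((i, "empty or inverted box"))
        elif (r["xmin"] < 0 or r["ymin"] < 0
              or r["xmax"] > r["width"] or r["ymax"] > r["height"]):
            problems.append((i, "box outside image"))
    return problems

# Hypothetical rows using the CSV field names from this tutorial.
sample_rows = [
    {"width": 300, "height": 200, "xmin": 10, "ymin": 20, "xmax": 150, "ymax": 180},
    {"width": 300, "height": 200, "xmin": 150, "ymin": 20, "xmax": 10, "ymax": 180},
    {"width": 300, "height": 200, "xmin": 10, "ymin": 20, "xmax": 350, "ymax": 180},
]

for index, reason in find_bad_boxes(sample_rows):
    print(index, reason)
```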

## Generate TFRecords and `.pbtxt`

Explaining the steps of creating TFRecords is out of scope here. Please follow this Kaggle kernel that sheds some light on the process. 

The utility scripts that I used in the following cells were adapted from [this repository](https://github.com/anirbankonar123/CorrosionDetector). 
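
For orientation only, here is a minimal sketch of the preparation that typically happens before a record is written: rows from the `.csv` file are grouped per image, and box coordinates are scaled to the 0 to 1 range, which is the convention the Object Detection API uses for box fields in TFRecords. This is not the code of `generate_tfrecord.py`; the function `group_and_normalize` and the sample rows are hypothetical, and the actual serialization with TensorFlow is left out.

```python
from collections import defaultdict

def group_and_normalize(rows):
    """Group CSV rows by filename and scale box corners to [0, 1]."""
    grouped = defaultdict(list)
    for r in rows:
        grouped[r["filename"]].append({
            "class": r["class"],
            "xmin": r["xmin"] / r["width"],
            "xmax": r["xmax"] / r["width"],
            "ymin": r["ymin"] / r["height"],
            "ymax": r["ymax"] / r["height"],
        })
    return dict(grouped)

# Hypothetical rows: one image containing two annotated fruits.
sample_rows = [
    {"filename": "mixed_1.jpg", "width": 300, "height": 200, "class": "apple",
     "xmin": 0, "ymin": 0, "xmax": 150, "ymax": 100},
    {"filename": "mixed_1.jpg", "width": 300, "height": 200, "class": "banana",
     "xmin": 150, "ymin": 100, "xmax": 300, "ymax": 200},
]

per_image = group_and_normalize(sample_rows)
print(per_image["mixed_1.jpg"])
```

Grouping matters because one TFRecord example describes one image together with all of its boxes, whereas the `.csv` file holds one line per box.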

In [None]:
%tensorflow_version 1.x
import tensorflow as tf 
print(tf.__version__)

!git clone https://github.com/tensorflow/models.git

%cd models/research
!pip install --upgrade pip
# Compile protos.
!protoc object_detection/protos/*.proto --python_out=.
# Install TensorFlow Object Detection API.
!cp object_detection/packages/tf1/setup.py .
!python -m pip install --use-feature=2020-resolver .

In [None]:
!wget https://raw.githubusercontent.com/elephantscale/E2E-Object-Detection-in-TFLite/master/colab_training/generate_tfrecord.py

In [None]:
!python generate_tfrecord.py \
    --csv_input=/content/train_zip/train_labels.csv \
    --output_path=/content/train_zip/train.record

Before running the cell below, please edit the `path` variable in the `main()` function of `generate_tfrecord.py`. `generate_tfrecord.py` should be located at `/content/models/research`.

In [None]:
!python generate_tfrecord.py \
    --csv_input=/content/test_zip/test_labels.csv \
    --output_path=/content/test_zip/test.record

In [None]:
!pwd
!ls -lh /content/test_zip/*.record
!ls -lh /content/train_zip/*.record

Be sure to store these `.record` files somewhere safe. Next, we need to generate a `.pbtxt` file that defines a mapping between our classes and integers. In the `generate_tfrecord.py` script, we used the following mapping:

```python
def class_text_to_int(row_label):
    if row_label == 'orange':
        return 1
    elif row_label == 'banana':
        return 2
    elif row_label == 'apple':
        return 3
    else:
        return None
```

In [None]:
label_encodings = {
    "orange": 1,
    "banana": 2,
    "apple": 3
}

# The context manager closes the file before it is printed below.
with open("/content/label_map.pbtxt", "w") as f:
    for (k, v) in label_encodings.items():
        item = ("item {\n"
                "\tid: " + str(v) + "\n"
                "\tname: '" + k + "'\n"
                "}\n")
        f.write(item)

!cat /content/label_map.pbtxt
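
Because the label map must agree with the mapping used in `generate_tfrecord.py`, a quick round-trip check can help catch mismatches. The sketch below builds the same text format as the cell above and reads it back with a regular expression. The helper names `build_label_map` and `parse_label_map` are hypothetical, and the regex only handles the simple `id`/`name` layout written in this notebook, not the full `.pbtxt` syntax.

```python
import re

def build_label_map(encodings):
    """Render a {name: id} dict in the item format used above."""
    return "".join(
        "item {\n\tid: %d\n\tname: '%s'\n}\n" % (idx, name)
        for name, idx in encodings.items()
    )

def parse_label_map(text):
    """Read back {name: id} from the simple item format (assumed layout)."""
    pattern = re.compile(r"item\s*\{\s*id:\s*(\d+)\s*name:\s*'([^']+)'\s*\}")
    return {name: int(idx) for idx, name in pattern.findall(text)}

label_encodings = {"orange": 1, "banana": 2, "apple": 3}
text = build_label_map(label_encodings)

# The parsed mapping should match the one used to create the TFRecords.
assert parse_label_map(text) == label_encodings
print(text)
```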

Be sure to save this file as well. Next, we will train a custom detection model with what we have so far. Follow the steps in [this notebook](https://colab.research.google.com/github/sayakpaul/E2E-Object-Detection-in-TFLite/blob/master/colab_training/Training_MobileDet_Custom_Dataset.ipynb).