<a href="https://colab.research.google.com/github/Oreobird/person_detect/blob/main/train_person_detect.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Setup Environment

Install Miniconda, then create a Python 3.6 environment in which to install tensorflow==1.15.0:

In [None]:
%env PYTHONPATH = # /env/python


In [None]:
!wget https://repo.anaconda.com/miniconda/Miniconda3-py38_4.12.0-Linux-x86_64.sh
!chmod +x Miniconda3-py38_4.12.0-Linux-x86_64.sh
!./Miniconda3-py38_4.12.0-Linux-x86_64.sh -b -f -p /usr/local
!conda update conda -y

In [6]:
import sys
sys.path.append('/usr/local/lib/python3.8/site-packages')

In [None]:
!conda create -n myenv python=3.6 -y

In [None]:
%%shell
eval "$(conda shell.bash hook)"
conda activate myenv
pip install tensorflow==1.15
pip install contextlib2
pip install Pillow
pip install tf_slim
pip install matplotlib
pip install ipykernel

## Fetch source code

Mount your Google Drive:

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Clone the TensorFlow models GitHub repository into Google Drive:

In [None]:
! git clone --depth=1 https://github.com/tensorflow/models.git /content/drive/MyDrive/person_detect

Clone the TensorFlow GitHub repository and check out v1.15.0. Its freeze_graph.py tool is used later to freeze the graph:

In [None]:
! git clone https://github.com/tensorflow/tensorflow /content/drive/MyDrive/tensorflow

In [None]:
! cd /content/drive/MyDrive/tensorflow && git checkout -f v1.15.0 && cd -

Modify /content/drive/MyDrive/person_detect/research/slim/nets/mobilenet_v1.py to enable global average pooling and to use Conv2d_11_pointwise as the final endpoint:
```
net, end_points = mobilenet_v1_base(inputs, scope=scope, min_depth=min_depth, depth_multiplier=depth_multiplier, conv_defs=conv_defs, final_endpoint='Conv2d_11_pointwise')
...
mobilenet_v1_025 = wrapped_partial(mobilenet_v1, depth_multiplier=0.25, global_pool=True)
```
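
To see what this change does to the network output, the sketch below traces the feature-map shape through the MobileNet v1 layer stack for a 96x96 input. The stride and depth table is an assumption that should be checked against the conv_defs in mobilenet_v1.py; the function name `feature_map_shape` is hypothetical, and SAME padding and a minimum depth of 8 are assumed.

```python
import math

# (stride, depth) for Conv2d_0 followed by the 13 depthwise-separable blocks.
# Assumed to mirror the conv_defs table in slim's mobilenet_v1.py.
CONV_DEFS = [
    (2, 32), (1, 64), (2, 128), (1, 128), (2, 256), (1, 256), (2, 512),
    (1, 512), (1, 512), (1, 512), (1, 512), (1, 512), (2, 1024), (1, 1024),
]


def feature_map_shape(input_size, depth_multiplier, final_index, min_depth=8):
    """Return (height, width, channels) after the layer at final_index."""
    size = input_size
    depth = None
    for stride, base_depth in CONV_DEFS[:final_index + 1]:
        # SAME padding: the spatial size is divided by the stride, rounded up.
        size = math.ceil(size / stride)
        # The depth multiplier thins every layer, but never below min_depth.
        depth = max(int(base_depth * depth_multiplier), min_depth)
    return (size, size, depth)


print(feature_map_shape(96, 0.25, 11))  # ending at Conv2d_11_pointwise
print(feature_map_shape(96, 0.25, 13))  # ending at Conv2d_13_pointwise
```

Under these assumptions, stopping at Conv2d_11_pointwise leaves a 6x6x128 feature map instead of 3x3x256, and global_pool=True averages over whatever spatial size remains.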

## Download the Dataset
The Visual Wake Words dataset contains images that belong to two classes: person (labelled 1) and not-person (labelled 0). It is derived from the COCO dataset, which contains 80 categories (e.g. cat, dog, umbrella).

In [None]:
%%shell
eval "$(conda shell.bash hook)"
conda activate myenv
python3 /content/drive/MyDrive/person_detect/research/slim/download_and_convert_data.py \
    --logtostderr \
    --dataset_name=visualwakewords \
    --dataset_dir=person_detection_dataset \
    --foreground_class_of_interest='person' \
    --small_object_area_threshold=0.005

When it's done, you'll have a set of TFRecords in the person_detection_dataset/ directory holding the labeled image information.
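
The `--foreground_class_of_interest` and `--small_object_area_threshold` flags control how COCO annotations become binary labels. Here is a minimal sketch of that rule, assuming an image counts as "person" when at least one annotation of the foreground class covers more than the threshold fraction of the image area. The function name and the dictionary keys are hypothetical and do not mirror the conversion script's internals.

```python
def visual_wake_words_label(annotations, image_width, image_height,
                            foreground_class="person", area_threshold=0.005):
    """Return 1 if a large enough foreground object is present, else 0."""
    image_area = image_width * image_height
    for annotation in annotations:
        # Objects of other classes never make the image a positive example.
        if annotation["category"] != foreground_class:
            continue
        # Objects that are too small relative to the image are ignored.
        if annotation["area"] / image_area > area_threshold:
            return 1
    return 0


# Hypothetical annotations for a 640x480 image (threshold area: 1536 pixels).
example = [
    {"category": "dog", "area": 50000},
    {"category": "person", "area": 2000},
]
print(visual_wake_words_label(example, 640, 480))
```

With the 0.005 threshold used above, a person occupying less than 0.5% of the image does not turn the image into a positive example.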

## TensorBoard

In [None]:
%load_ext tensorboard
%tensorboard --logdir /content/drive/MyDrive/person_detect/train

## Train the model

In [None]:
%%shell
eval "$(conda shell.bash hook)"
conda activate myenv

python3 /content/drive/MyDrive/person_detect/research/slim/train_image_classifier.py \
    --clone_on_cpu=True \
    --alsologtostderr \
    --dataset_name=visualwakewords \
    --dataset_dir=person_detection_dataset \
    --dataset_split_name=train \
    --train_image_size=96 \
    --use_grayscale=True \
    --preprocessing_name=mobilenet_v1 \
    --model_name=mobilenet_v1_025 \
    --train_dir=/content/drive/MyDrive/person_detect/train \
    --save_summaries_secs=300 \
    --learning_rate=0.045 \
    --label_smoothing=0.1 \
    --learning_rate_decay_factor=0.98 \
    --num_epochs_per_decay=2.5 \
    --moving_average_decay=0.9999 \
    --batch_size=96 \
    --max_number_of_steps=1000000
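
The learning-rate flags above describe a stepwise exponential decay. The sketch below reproduces the arithmetic, assuming the script's default exponential decay type with staircase behaviour and a single clone. `num_samples_per_epoch` (the size of the training split) is an assumed parameter, and the function name is hypothetical.

```python
def learning_rate_at(step, num_samples_per_epoch, batch_size=96,
                     initial_lr=0.045, decay_factor=0.98,
                     num_epochs_per_decay=2.5):
    """Learning rate at a given global step under staircase decay."""
    # Number of training steps between two consecutive decays.
    decay_steps = int(num_samples_per_epoch * num_epochs_per_decay / batch_size)
    # Staircase: the rate only changes once per full decay period.
    return initial_lr * decay_factor ** (step // decay_steps)


# 82783 is an assumed training-split size, used only for illustration.
print(learning_rate_at(100000, num_samples_per_epoch=82783))
```

Because each decay removes only 2% of the rate, the schedule shrinks slowly, which is one reason the step budget is set so high.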

## Evaluate the model

In [None]:
%%shell
eval "$(conda shell.bash hook)"
conda activate myenv
python3 /content/drive/MyDrive/person_detect/research/slim/eval_image_classifier.py \
    --alsologtostderr \
    --dataset_name=visualwakewords \
    --dataset_dir=person_detection_dataset \
    --dataset_split_name=val \
    --eval_image_size=96 \
    --use_grayscale=True \
    --preprocessing_name=mobilenet_v1 \
    --model_name=mobilenet_v1_025 \
    --checkpoint_path=/content/drive/MyDrive/person_detect/train/model.ckpt-123456
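
The step number in `model.ckpt-123456` is a placeholder; the real value depends on how long training ran. Here is a minimal sketch for finding the newest checkpoint prefix in the training directory, assuming the usual TF1 layout in which each checkpoint has a `model.ckpt-<step>.index` file. The helper name is hypothetical; inside the TensorFlow environment, `tf.train.latest_checkpoint` serves the same purpose.

```python
import os
import re


def latest_checkpoint_prefix(train_dir):
    """Return the checkpoint prefix with the highest step, or None."""
    if not os.path.isdir(train_dir):
        return None
    pattern = re.compile(r"^(model\.ckpt-(\d+))\.index$")
    best_step, best_prefix = -1, None
    for name in os.listdir(train_dir):
        match = pattern.match(name)
        if match and int(match.group(2)) > best_step:
            best_step, best_prefix = int(match.group(2)), match.group(1)
    if best_prefix is None:
        return None
    # The prefix (without .index) is what --checkpoint_path expects.
    return os.path.join(train_dir, best_prefix)


print(latest_checkpoint_prefix("/content/drive/MyDrive/person_detect/train"))
```

The same prefix can be passed as `--input_checkpoint` when freezing the graph.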

## Convert the TF model to a TF Lite model for Inference

### Generate the model graph

In [None]:
%%shell
eval "$(conda shell.bash hook)"
conda activate myenv
python3 /content/drive/MyDrive/person_detect/research/slim/export_inference_graph.py \
    --alsologtostderr \
    --dataset_name=visualwakewords \
    --image_size=96 \
    --use_grayscale=True \
    --model_name=mobilenet_v1_025 \
    --output_file=/content/drive/MyDrive/person_detect/train/person_detection_graph.pb

### Generate the frozen model graph (combine model graph and trained weights)

In [None]:
%%shell
eval "$(conda shell.bash hook)"
conda activate myenv
python3 /content/drive/MyDrive/tensorflow/tensorflow/python/tools/freeze_graph.py \
    --input_graph=/content/drive/MyDrive/person_detect/train/person_detection_graph.pb \
    --input_checkpoint=/content/drive/MyDrive/person_detect/train/model.ckpt-0 \
    --input_binary=true \
    --output_node_names=MobilenetV1/Predictions/Reshape_1,MobilenetV1/MobilenetV1/Conv2d_11_pointwise/Conv2D \
    --output_graph=/content/drive/MyDrive/person_detect/train/person_detection_frozen_graph.pb

### Generate the TensorFlow Lite File with Quantization

Create the file /content/drive/MyDrive/person_detect/gen_quant_tflite.py with the following contents:

In [None]:
import io

import numpy as np
import PIL.Image  # "import PIL" alone does not load the Image submodule
import tensorflow.compat.v1 as tf

def representative_dataset_gen():
  # Yield 250 preprocessed validation images so the converter can calibrate
  # the quantization ranges.
  record_iterator = tf.python_io.tf_record_iterator(path='/content/person_detection_dataset/val.record-00000-of-00010')

  for _ in range(250):
    string_record = next(record_iterator)
    example = tf.train.Example()
    example.ParseFromString(string_record)
    image_stream = io.BytesIO(example.features.feature['image/encoded'].bytes_list.value[0])
    image = PIL.Image.open(image_stream)
    # Match the training input: 96x96, single-channel grayscale.
    image = image.resize((96, 96))
    image = image.convert('L')
    array = np.array(image)
    array = np.expand_dims(array, axis=2)  # add the channel axis
    array = np.expand_dims(array, axis=0)  # add the batch axis
    # Scale pixel values from [0, 255] to [-1, 1].
    array = ((array / 127.5) - 1.0).astype(np.float32)
    yield([array])

converter = tf.lite.TFLiteConverter.from_frozen_graph(
    '/content/drive/MyDrive/person_detect/train/person_detection_frozen_graph.pb',
    ['input'],
    ['MobilenetV1/Predictions/Reshape_1',
     'MobilenetV1/MobilenetV1/Conv2d_11_pointwise/Conv2D'])
# Full-integer quantization: int8 weights, activations, inputs and outputs.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

tflite_quant_model = converter.convert()
# Write the model through a context manager so the file is closed properly.
with open("/content/drive/MyDrive/person_detect/train/person_detection_model.tflite", "wb") as f:
  f.write(tflite_quant_model)

Run quantization:

In [None]:
%%shell
eval "$(conda shell.bash hook)"
conda activate myenv
python3 /content/drive/MyDrive/person_detect/gen_quant_tflite.py
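
With `inference_input_type` and `inference_output_type` set to `tf.int8`, the converted model exchanges int8 values that relate to real numbers through an affine mapping, real = scale * (q - zero_point). The sketch below shows that mapping in plain Python. The scale of 1/256 and zero point of -128 are assumed values for illustration; the actual parameters should be read from the converted model, for instance from the `quantization` entry returned by `tf.lite.Interpreter.get_output_details()`.

```python
def quantize(real_value, scale, zero_point):
    """Map a real number to int8 using the affine quantization scheme."""
    q = round(real_value / scale) + zero_point
    # Clamp to the representable int8 range.
    return max(-128, min(127, q))


def dequantize(q, scale, zero_point):
    """Map an int8 value back to the real number it represents."""
    return scale * (q - zero_point)


# Assumed quantization parameters, for illustration only.
scale, zero_point = 1 / 256, -128
print(dequantize(127, scale, zero_point))
print(quantize(0.5, scale, zero_point))
```

On the device, the int8 scores coming out of the model can be compared directly, or converted back with `dequantize` when a probability-like value is needed.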

### Generate the C source file

In [42]:
%%shell
eval "$(conda shell.bash hook)"
conda activate myenv

# Install xxd if it is not available
apt-get -qq install xxd
# Save the file as a C source file
xxd -i /content/drive/MyDrive/person_detect/train/person_detection_model.tflite > /content/drive/MyDrive/person_detect/train/person_detect_model_data.cc
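
`xxd -i` derives the C array name from the path it is given, so the command above yields a long identifier built from the full Drive path. Here is a minimal Python sketch of the same conversion with an explicit array name. The function name, the default of 12 bytes per line, and the name `g_person_detect_model_data` are assumptions for illustration.

```python
def bytes_to_c_array(data, name, bytes_per_line=12):
    """Render bytes as a C array definition in the style of `xxd -i`."""
    lines = []
    for offset in range(0, len(data), bytes_per_line):
        chunk = data[offset:offset + bytes_per_line]
        lines.append("  " + ", ".join("0x%02x" % b for b in chunk))
    body = ",\n".join(lines)
    return ("unsigned char %s[] = {\n%s\n};\nunsigned int %s_len = %d;\n"
            % (name, body, name, len(data)))


# Illustrative input; a real run would pass the bytes of the .tflite file.
print(bytes_to_c_array(b"TFL3", "g_person_detect_model_data"))
```

Choosing the name explicitly avoids having to rename the generated symbol by hand before using it in firmware code.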

