# Face Detection using TAO FaceNet

Transfer learning is the process of transferring learned features from one application to another. It is a commonly used training technique in which you take a model trained on one task and retrain it for a different task.

Train Adapt Optimize (TAO) Toolkit is a simple, easy-to-use, Python-based AI toolkit for taking purpose-built AI models and customizing them with your own data.

<img align="center" src="https://developer.nvidia.com/sites/default/files/akamai/embedded-transfer-learning-toolkit-software-stack-1200x670px.png" width="1080"> 

## Learning Objectives
In this notebook, you will learn how to leverage the simplicity and convenience of TAO to:

* Take a pretrained ResNet-18 model and train a ResNet-18 FaceNet model on the WIDER FACE dataset
* Prune the trained FaceNet model
* Retrain the pruned model to recover lost accuracy
* Export the pruned model
* Run inference on the trained model


### Table of Contents

This notebook shows an example use case of Face Detection using FaceNet in the Train Adapt Optimize (TAO) Toolkit.

1. [Set up env variables, map drives, and install dependencies](#head-0)
2. [Prepare dataset and pre-trained model](#head-2)
    1. [Verify and prepare dataset](#head-2-1)
    2. [Prepare tfrecords from kitti format dataset](#head-2-2)
    3. [Download pre-trained model](#head-2-3)
3. [Provide training specification](#head-3)
4. [Run TAO training](#head-4)
5. [Evaluate trained models](#head-5)
6. [Prune trained models](#head-6)
7. [Retrain pruned models](#head-7)
8. [Evaluate retrained model](#head-8)
9. [Visualize inferences](#head-9)


## 1. Set up env variables, map drives and install dependencies <a class="anchor" id="head-0"></a>

When using the purpose-built pretrained models from NGC, set the `$KEY` environment variable to the key given in the model overview. Otherwise, loading them as pretrained models can fail.

*Note: Remove any stray artifacts or files left over from previous experiments in the `$LOCAL_PROJECT_DIR` paths described below. Leftover checkpoint files, for example, may interfere with creating the training graph for a new experiment.*

This notebook requires an environment variable, `$LOCAL_PROJECT_DIR`, set to the path of your workspace. The dataset is expected to reside in `$LOCAL_PROJECT_DIR/facenet/data`, and the artifacts generated by the TAO experiments are written to `$LOCAL_PROJECT_DIR/facenet`. The subsequent cells give more information on setting up the dataset and on the supported steps in the TAO workflow.

*Note: By default, this notebook runs training on 1 GPU. To use more GPUs, update the env variable `$NUM_GPUS` accordingly.*
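The note about stray artifacts can be checked before starting a new experiment. The following is a minimal sketch, not part of the TAO tooling: the helper name `find_stale_artifacts` and the list of file patterns are assumptions, chosen from file names that appear in the directory listings later in this notebook.

```python
import os
from pathlib import Path

# Assumed patterns, based on file names seen in this notebook's listings.
STALE_PATTERNS = ("*.ckzip", "*.tlt", "events.out.tfevents.*", "graph.pbtxt")


def find_stale_artifacts(experiment_dir, skip_dirs=("pretrain_models",)):
    """Return files under experiment_dir that look like leftovers of an earlier run."""
    root = Path(experiment_dir)
    if not root.is_dir():
        return []
    found = []
    for pattern in STALE_PATTERNS:
        for path in root.rglob(pattern):
            # Leave the downloaded pretrained model out of the report.
            if any(part in skip_dirs for part in path.relative_to(root).parts):
                continue
            found.append(path)
    return sorted(found)


project_dir = os.getenv("LOCAL_PROJECT_DIR", os.getcwd())
for stale in find_stale_artifacts(os.path.join(project_dir, "facenet")):
    print(stale)
```

The sketch only lists candidates and deletes nothing, so you can review the paths before removing them by hand.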

In [1]:
# Setting up env variables for cleaner command-line commands.
import os

%env KEY=nvidia_tlt
%env NUM_GPUS=1

# Set this path if you don't run the notebook from the samples directory.
# %env NOTEBOOK_ROOT=~/tao-samples/facenet

# Please define this local project directory that needs to be mapped to the TAO docker session.
# The dataset is expected to be present in $LOCAL_PROJECT_DIR/facenet/data, while the results for the steps
# in this notebook will be stored at $LOCAL_PROJECT_DIR/facenet
# !PLEASE MAKE SURE TO UPDATE THE LOCAL_PROJECT_DIR!.
%env LOCAL_PROJECT_DIR=/home/jupyter/imported_files/files

# $LOCAL_PROJECT_DIR is also expected to hold the dependency folder:
# $LOCAL_PROJECT_DIR/deps should exist for dependency installation.

os.environ["LOCAL_DATA_DIR"] = os.path.join(
    os.getenv("LOCAL_PROJECT_DIR", os.getcwd()),
    "facenet/data"
)
os.environ["LOCAL_EXPERIMENT_DIR"] = os.path.join(
    os.getenv("LOCAL_PROJECT_DIR", os.getcwd()),
    "facenet"
)

# The sample spec files are present in the same path as the downloaded samples.
os.environ["LOCAL_SPECS_DIR"] = os.path.join(
    os.getenv("LOCAL_PROJECT_DIR", os.getcwd()),
    "facenet/specs"
)

# Showing list of specification files and the experiment directory contents.
!ls -rlt $LOCAL_SPECS_DIR
!ls -rlt $LOCAL_EXPERIMENT_DIR

env: KEY=nvidia_tlt
env: NUM_GPUS=1
env: LOCAL_PROJECT_DIR=/home/jupyter/imported_files/files
total 24
-rw-r--r-- 1 jupyter jupyter 1045 Jun 15 19:23 facenet_inference_kitti_etlt.txt
-rw-r--r-- 1 jupyter jupyter  340 Sep  6 17:41 facenet_tfrecords_kitti_train.txt
-rw-r--r-- 1 jupyter jupyter  344 Sep  6 17:54 facenet_tfrecords_kitti_val.txt
-rw-r--r-- 1 jupyter jupyter 3271 Sep  6 20:20 facenet_train_resnet18_kitti.txt
-rw-r--r-- 1 jupyter jupyter 3334 Sep  6 21:25 facenet_retrain_resnet18_kitti.txt
-rw-r--r-- 1 jupyter jupyter 1023 Sep  6 22:01 facenet_inference_kitti_tlt.txt
total 730568
-rw-r--r-- 1 jupyter jupyter      6971 Jun 15 19:23 convert_wider_to_kitti.py
drwxr-xr-x 8 jupyter jupyter      4096 Sep  6 17:42 data
drwxr-xr-x 3 jupyter jupyter      4096 Sep  6 18:06 pretrain_models
drwxr-xr-x 4 jupyter jupyter      4096 Sep  6 20:55 experiment_dir_unpruned
drwxr-xr-x 2 jupyter jupyter      4096 Sep  6 21:03 experiment_dir_pruned
drwxr-xr-x 4 jupyter jupyter      4096 Sep  6 21:4

In [4]:
# Install requirement
!pip3 install -r $LOCAL_PROJECT_DIR/deps/requirements-pip.txt

Defaulting to user installation because normal site-packages is not writeable
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Collecting numpy<=1.17.5,>=1.17.0
  Downloading numpy-1.17.5-cp36-cp36m-manylinux1_x86_64.whl (20.0 MB)
     |████████████████████████████████| 20.0 MB 5.0 MB/s            
Collecting h5py<=3.6.0,>=3.1.0
  Downloading h5py-3.1.0-cp36-cp36m-manylinux1_x86_64.whl (4.0 MB)
     |████████████████████████████████| 4.0 MB 69.3 MB/s            
Collecting pycocotools<=2.0.4,>=2.0.2
  Downloading pycocotools-2.0.4.tar.gz (106 kB)
     |████████████████████████████████| 106 kB 65.7 MB/s            
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting cached-property
  Downloading cached_property-1.5.2-py2.py3-none-any.whl (7.6 kB)
Building wheels for collected packages: pycocotools
  Building wheel for pycocotools (pypr

## 2. Prepare dataset and pre-trained model <a class="anchor" id="head-2"></a>

This tutorial uses the WIDER FACE dataset. For more details, visit http://shuoyang1213.me/WIDERFACE/. Download the [training](https://drive.google.com/file/d/0B6eKvaijfFUDQUUwd21EckhUbWs/view?usp=sharing) and [validation](https://drive.google.com/file/d/0B6eKvaijfFUDd3dIRmpvSk8tLUk/view?usp=sharing) set images and the ground truth [labels](http://mmlab.ie.cuhk.edu.hk/projects/WIDERFace/support/bbx_annotation/wider_face_split.zip), and place the zip files in `$LOCAL_DATA_DIR`.

Notes:
1. The detection workflow requires the dataset to be in KITTI format, so we need to convert it.
2. The pretrained FaceNet model was trained on grayscale images 416 pixels high and 736 pixels wide. We will convert the WIDER FACE dataset to this format.
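To make the conversion concrete, here is a minimal sketch of how a single WIDER FACE box, stored as `x y w h` in `wider_face_*_bbx_gt.txt`, could be turned into a KITTI label line for an image resized to 736x416. The function name and the plain stretch-to-fit scaling are assumptions made for illustration; the `convert_wider_to_kitti.py` script used below may handle resizing and rounding differently.

```python
def wider_box_to_kitti(x, y, w, h, src_w, src_h, dst_w=736, dst_h=416, label="face"):
    """Convert one WIDER FACE box (x, y, w, h) to a KITTI label line.

    Assumes the image is stretched from (src_w, src_h) to (dst_w, dst_h),
    so box coordinates are scaled independently along each axis.
    """
    sx = dst_w / src_w
    sy = dst_h / src_h
    x1, y1 = x * sx, y * sy
    x2, y2 = (x + w) * sx, (y + h) * sy
    # KITTI fields: type, truncated, occluded, alpha, bbox (4 values),
    # dimensions (3), location (3), rotation_y. Unused fields are set to 0.
    fields = [label, "0", "0", "0",
              f"{x1:.2f}", f"{y1:.2f}", f"{x2:.2f}", f"{y2:.2f}",
              "0", "0", "0", "0", "0", "0", "0"]
    return " ".join(fields)


# A 50x60 box at (100, 200) in a 1472x832 image, scaled by 0.5 on both axes.
print(wider_box_to_kitti(100, 200, 50, 60, src_w=1472, src_h=832))
# face 0 0 0 50.00 100.00 75.00 130.00 0 0 0 0 0 0 0
```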


### 2.1. Verify and prepare dataset <a class="anchor" id="head-2-1"></a>

In [5]:
# Check the dataset is present
!mkdir -p $LOCAL_DATA_DIR
!if [ ! -f $LOCAL_DATA_DIR/WIDER_train.zip ]; then echo 'Train Image zip file not found, please download.'; else echo 'Found Train Image zip file.';fi
!if [ ! -f $LOCAL_DATA_DIR/WIDER_val.zip ]; then echo 'Validation Image zip file not found, please download.'; else echo 'Found Validation Image zip file.';fi
!if [ ! -f $LOCAL_DATA_DIR/wider_face_split.zip ]; then echo 'Label zip file not found, please download.'; else echo 'Found Labels zip file.';fi

Found Train Image zip file.
Found Validation Image zip file.
Found Labels zip file.


In [6]:
# Unpack the downloaded datasets into $LOCAL_DATA_DIR.
!unzip -u $LOCAL_DATA_DIR/WIDER_train.zip -d $LOCAL_DATA_DIR
!unzip -u $LOCAL_DATA_DIR/WIDER_val.zip -d $LOCAL_DATA_DIR
!unzip -u $LOCAL_DATA_DIR/wider_face_split.zip -d $LOCAL_DATA_DIR

Archive:  /home/jupyter/imported_files/files/facenet/data/WIDER_train.zip
   creating: /home/jupyter/imported_files/files/facenet/data/WIDER_train/
   creating: /home/jupyter/imported_files/files/facenet/data/WIDER_train/images/
   creating: /home/jupyter/imported_files/files/facenet/data/WIDER_train/images/0--Parade/
  inflating: /home/jupyter/imported_files/files/facenet/data/WIDER_train/images/0--Parade/0_Parade_marchingband_1_100.jpg  
  inflating: /home/jupyter/imported_files/files/facenet/data/WIDER_train/images/0--Parade/0_Parade_marchingband_1_1015.jpg  
  inflating: /home/jupyter/imported_files/files/facenet/data/WIDER_train/images/0--Parade/0_Parade_marchingband_1_1018.jpg  
  inflating: /home/jupyter/imported_files/files/facenet/data/WIDER_train/images/0--Parade/0_Parade_marchingband_1_1022.jpg  
  inflating: /home/jupyter/imported_files/files/facenet/data/WIDER_train/images/0--Parade/0_Parade_marchingband_1_1030.jpg  
  inflating: /home/jupyter/imported_files/files/facenet/

In [7]:
# verify
!ls -l $LOCAL_DATA_DIR/WIDER_train/
!ls -l $LOCAL_DATA_DIR/WIDER_val/
!ls -l $LOCAL_DATA_DIR/wider_face_split/

total 4
drwxr-xr-x 63 jupyter jupyter 4096 Nov 18  2015 images
total 4
drwxr-xr-x 63 jupyter jupyter 4096 Nov 18  2015 images
total 8904
-rwxr-xr-x 1 jupyter jupyter     589 Mar 31  2017 readme.txt
-rwxr-xr-x 1 jupyter jupyter   91674 Nov 18  2015 wider_face_test.mat
-rwxr-xr-x 1 jupyter jupyter  877727 Mar 31  2017 wider_face_test_filelist.txt
-rwxr-xr-x 1 jupyter jupyter 1554123 Mar 31  2017 wider_face_train.mat
-rwxr-xr-x 1 jupyter jupyter 4947163 Apr  1  2019 wider_face_train_bbx_gt.txt
-rwxr-xr-x 1 jupyter jupyter  397768 Mar 31  2017 wider_face_val.mat
-rwxr-xr-x 1 jupyter jupyter 1231252 Apr  1  2019 wider_face_val_bbx_gt.txt


In [3]:
# Convert wider train dataset to kitti format
!python3 convert_wider_to_kitti.py --input_image_dir=$LOCAL_DATA_DIR/WIDER_train/images \
                                   --input_label_file=$LOCAL_DATA_DIR/wider_face_split/wider_face_train_bbx_gt.txt \
                                   --output_dir=$LOCAL_DATA_DIR/training/ \
                                   --image_height=416 --image_width=736 --grayscale

Total 12880 samples in dataset


In [4]:
# Convert wider validation dataset to kitti format
!python3 convert_wider_to_kitti.py --input_image_dir=$LOCAL_DATA_DIR/WIDER_val/images \
                                   --input_label_file=$LOCAL_DATA_DIR/wider_face_split/wider_face_val_bbx_gt.txt \
                                   --output_dir=$LOCAL_DATA_DIR/validation/ \
                                   --image_height=416 --image_width=736 --grayscale

Total 3226 samples in dataset


In [2]:
# verify
import os

DATA_DIR = os.environ.get('LOCAL_DATA_DIR')
num_training_images = len(os.listdir(os.path.join(DATA_DIR, "training/images")))
num_training_labels = len(os.listdir(os.path.join(DATA_DIR, "training/labels")))
num_val_images = len(os.listdir(os.path.join(DATA_DIR, "validation/images")))
num_val_labels = len(os.listdir(os.path.join(DATA_DIR, "validation/labels")))
print("Number of images in the training set. {}".format(num_training_images))
print("Number of labels in the training set. {}".format(num_training_labels))
print("Number of images in the validation set. {}".format(num_val_images))
print("Number of labels in the validation set. {}".format(num_val_labels))

Number of images in the training set. 12880
Number of labels in the training set. 12880
Number of images in the validation set. 3226
Number of labels in the validation set. 3226
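Matching counts do not guarantee that every image has its own label file. As a complementary check, the sketch below compares file stems between the two folders. The helper name `unmatched_samples` is hypothetical, and it assumes that an image and its label share the same base name, as in the sample label shown in the next cell.

```python
import os


def unmatched_samples(images_dir, labels_dir):
    """Return (images without a label, labels without an image), compared by file stem."""
    image_stems = {os.path.splitext(name)[0] for name in os.listdir(images_dir)}
    label_stems = {os.path.splitext(name)[0] for name in os.listdir(labels_dir)}
    return sorted(image_stems - label_stems), sorted(label_stems - image_stems)


# Example usage, assuming LOCAL_DATA_DIR is set as in the first cell:
# data_dir = os.environ["LOCAL_DATA_DIR"]
# no_label, no_image = unmatched_samples(
#     os.path.join(data_dir, "training/images"),
#     os.path.join(data_dir, "training/labels"),
# )
# print(len(no_label), len(no_image))
```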


In [6]:
# Sample kitti label.
!cat $LOCAL_DATA_DIR/training/labels/30_Surgeons_Surgeons_30_227.txt

face 0 0 0 159 51 327 302 0 0 0 0 0 0 0
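A KITTI label line has 15 whitespace-separated fields: the class name, truncation, occlusion, alpha, four bounding-box coordinates (`xmin ymin xmax ymax`), and seven 3D fields that are unused here and set to 0. As an illustration, the sketch below parses the sample line above; the `KittiObject` type and `parse_kitti_line` helper are hypothetical names, not part of TAO.

```python
from typing import NamedTuple, Tuple


class KittiObject(NamedTuple):
    label: str
    truncated: float
    occluded: int
    alpha: float
    bbox: Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax) in pixels


def parse_kitti_line(line):
    """Parse the 2D detection fields of one KITTI label line."""
    parts = line.split()
    if len(parts) != 15:
        raise ValueError("expected 15 fields, got {}".format(len(parts)))
    xmin, ymin, xmax, ymax = (float(v) for v in parts[4:8])
    return KittiObject(
        label=parts[0],
        truncated=float(parts[1]),
        occluded=int(float(parts[2])),
        alpha=float(parts[3]),
        bbox=(xmin, ymin, xmax, ymax),
    )


obj = parse_kitti_line("face 0 0 0 159 51 327 302 0 0 0 0 0 0 0")
print(obj.label, obj.bbox)
# face (159.0, 51.0, 327.0, 302.0)
```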


### 2.2. Prepare tfrecords from kitti format dataset <a class="anchor" id="head-2-2"></a>

* Update the tfrecords spec file to take in your kitti format dataset
* Create the tfrecords using the `detectnet_v2 dataset_convert` command

*Note: Tfrecords only need to be generated once.*

In [3]:
print("TFrecords conversion spec file for kitti training")
!cat $LOCAL_SPECS_DIR/facenet_tfrecords_kitti_train.txt

TFrecords conversion spec file for kitti training
kitti_config {
  root_directory_path: "/home/jupyter/imported_files/files/facenet/data/training"
  image_dir_name: "images"
  label_dir_name: "labels"
  image_extension: ".png"
  partition_mode: "random"
  num_partitions: 2
  val_split: 10
  num_shards: 10
}
image_directory_path: "/home/jupyter/imported_files/files/facenet/data/training"
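The spec values translate into split sizes and shard file names that you can predict before running the converter. The sketch below assumes that `val_split` is a percentage and that shard names follow the `fold-…-of-…-shard-…-of-…` pattern seen in the directory listing further down; both helper names are hypothetical, and the converter's exact rounding is not confirmed here.

```python
def split_counts(num_samples, val_split_percent):
    """Return (train, val) sample counts for a percentage-based validation split."""
    num_val = num_samples * val_split_percent // 100
    return num_samples - num_val, num_val


def shard_names(prefix, num_partitions, num_shards):
    """List shard file names in the pattern used by the tfrecords listing."""
    return [
        f"{prefix}-fold-{fold:03d}-of-{num_partitions:03d}"
        f"-shard-{shard:05d}-of-{num_shards:05d}"
        for fold in range(num_partitions)
        for shard in range(num_shards)
    ]


print(split_counts(12880, 10))
# (11592, 1288)
print(shard_names("kitti_train", 2, 10)[0])
# kitti_train-fold-000-of-002-shard-00000-of-00010
```

With 12880 training images and `val_split: 10`, this gives 11592 training and 1288 validation samples, which agrees with the counts reported in the converter log in the next cell.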


In [4]:
# Creating a new directory for the output tfrecords dump.
print("Converting Tfrecords for wider train dataset")
!mkdir -p $LOCAL_DATA_DIR/tfrecords && rm -rf $LOCAL_DATA_DIR/tfrecords/*
!detectnet_v2 dataset_convert \
                  -d $LOCAL_SPECS_DIR/facenet_tfrecords_kitti_train.txt \
                  -o $LOCAL_DATA_DIR/tfrecords/training/kitti_train

Converting Tfrecords for wider train dataset
Using TensorFlow backend.
Using TensorFlow backend.
2022-09-06 17:53:57,251 [INFO] iva.detectnet_v2.dataio.build_converter: Instantiating a kitti converter
2022-09-06 17:53:57,251 [INFO] iva.detectnet_v2.dataio.dataset_converter_lib: Creating output directory /home/jupyter/imported_files/files/facenet/data/tfrecords/training
2022-09-06 17:53:57,296 [INFO] iva.detectnet_v2.dataio.kitti_converter_lib: Num images in
Train: 11592	Val: 1288
2022-09-06 17:53:57,296 [INFO] iva.detectnet_v2.dataio.kitti_converter_lib: Validation data in partition 0. Hence, while choosing the validationset during training choose validation_fold 0.
2022-09-06 17:53:57,307 [INFO] iva.detectnet_v2.dataio.dataset_converter_lib: Writing partition 0, shard 0


2022-09-06 17:53:58,463 [INFO] iva.detectnet_v2.dataio.dataset_converter_lib: Writing partition 0, shard 1
2022-09-06 17:54:00,288 [INFO] iva.detectnet_v2.dataio.dataset_converter_lib: Writing partition 0, shard 2
20

In [None]:
# Convert the validation set; the tfrecords output directory was created in the previous cell.
print("Converting Tfrecords for wider validation dataset")
!detectnet_v2 dataset_convert \
                  -d $LOCAL_SPECS_DIR/facenet_tfrecords_kitti_val.txt \
                  -o $LOCAL_DATA_DIR/tfrecords/validation/kitti_val

Converting Tfrecords for wider validation dataset
Using TensorFlow backend.


In [10]:
!ls -rlt $LOCAL_DATA_DIR/tfrecords/training/

total 16464
-rw-r--r-- 1 jupyter jupyter  151192 Sep  6 17:53 kitti_train-fold-000-of-002-shard-00000-of-00010
-rw-r--r-- 1 jupyter jupyter  160778 Sep  6 17:54 kitti_train-fold-000-of-002-shard-00001-of-00010
-rw-r--r-- 1 jupyter jupyter  143162 Sep  6 17:54 kitti_train-fold-000-of-002-shard-00002-of-00010
-rw-r--r-- 1 jupyter jupyter  159020 Sep  6 17:54 kitti_train-fold-000-of-002-shard-00003-of-00010
-rw-r--r-- 1 jupyter jupyter  193272 Sep  6 17:54 kitti_train-fold-000-of-002-shard-00004-of-00010
-rw-r--r-- 1 jupyter jupyter  193114 Sep  6 17:54 kitti_train-fold-000-of-002-shard-00005-of-00010
-rw-r--r-- 1 jupyter jupyter  149012 Sep  6 17:54 kitti_train-fold-000-of-002-shard-00006-of-00010
-rw-r--r-- 1 jupyter jupyter  301980 Sep  6 17:54 kitti_train-fold-000-of-002-shard-00007-of-00010
-rw-r--r-- 1 jupyter jupyter  170560 Sep  6 17:54 kitti_train-fold-000-of-002-shard-00008-of-00010
-rw-r--r-- 1 jupyter jupyter  154122 Sep  6 17:54 kitti_train-fold-000-of-002-shard-00009-of-0001

In [None]:
!ls -rlt $LOCAL_DATA_DIR/tfrecords/validation/

### 2.3. Download pre-trained model <a class="anchor" id="head-2-3"></a>

Download the correct pretrained model from the NGC model registry for your experiment. Note that DetectNet_v2 expects input that is normalized to the range 0-1.

The pretrained FaceNet model is available at `nvidia/tao/facenet`.

After downloading the pretrained model, place the files in `$LOCAL_EXPERIMENT_DIR/pretrain_models`. The model will then be at the following path:

* pretrained model in `$LOCAL_EXPERIMENT_DIR/pretrain_models/facenet_vunpruned_v2.0/model.tlt`

In [28]:
# Installing NGC CLI on the local machine.
## Download and install
%env CLI=ngccli_cat_linux.zip
!mkdir -p $LOCAL_PROJECT_DIR/ngccli

# Remove any previously existing CLI installations
!rm -rf $LOCAL_PROJECT_DIR/ngccli/*
!wget "https://ngc.nvidia.com/downloads/$CLI" -P $LOCAL_PROJECT_DIR/ngccli
!unzip -u "$LOCAL_PROJECT_DIR/ngccli/$CLI" -d $LOCAL_PROJECT_DIR/ngccli/
!rm $LOCAL_PROJECT_DIR/ngccli/*.zip 
os.environ["PATH"]="{}/ngccli/ngc-cli:{}".format(os.getenv("LOCAL_PROJECT_DIR", ""), os.getenv("PATH", ""))

env: CLI=ngccli_cat_linux.zip
--2022-09-06 22:43:15--  https://ngc.nvidia.com/downloads/ngccli_cat_linux.zip
Resolving ngc.nvidia.com (ngc.nvidia.com)... 13.32.164.19, 13.32.164.13, 13.32.164.118, ...
Connecting to ngc.nvidia.com (ngc.nvidia.com)|13.32.164.19|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 34018893 (32M) [application/zip]
Saving to: ‘/home/jupyter/imported_files/files/ngccli/ngccli_cat_linux.zip’


2022-09-06 22:43:16 (69.9 MB/s) - ‘/home/jupyter/imported_files/files/ngccli/ngccli_cat_linux.zip’ saved [34018893/34018893]

Archive:  /home/jupyter/imported_files/files/ngccli/ngccli_cat_linux.zip
   creating: /home/jupyter/imported_files/files/ngccli/ngc-cli/
   creating: /home/jupyter/imported_files/files/ngccli/ngc-cli/yarl/
  inflating: /home/jupyter/imported_files/files/ngccli/ngc-cli/yarl/_quoting_c.cpython-39-x86_64-linux-gnu.so  
  inflating: /home/jupyter/imported_files/files/ngccli/ngc-cli/libselinux.so.1  
   creating: /home/jupyter/imp

In [29]:
# List models available in the model registry.
!ngc registry model list nvidia/tao/facenet:unpruned*

+-------+-------+-------+-------+-------+-------+------+-------+-------+
| Versi | Accur | Epoch | Batch | GPU   | Memor | File | Statu | Creat |
| on    | acy   | s     | Size  | Model | y Foo | Size | s     | ed    |
|       |       |       |       |       | tprin |      |       | Date  |
|       |       |       |       |       | t     |      |       |       |
+-------+-------+-------+-------+-------+-------+------+-------+-------+
| unpru | 83.87 | 70    | 1     | V100  | 44.3  | 44.3 | UPLOA | Aug   |
| ned_v |       |       |       |       |       | MB   | D_COM | 19,   |
| 2.0   |       |       |       |       |       |      | PLETE | 2021  |
+-------+-------+-------+-------+-------+-------+------+-------+-------+


In [14]:
# Position the pretrain model to the target destination.
!mkdir -p $LOCAL_EXPERIMENT_DIR/pretrain_models

In [15]:
# Download the pretrained model from NGC
!ngc registry model download-version nvidia/tao/facenet:unpruned_v2.0 \
    --dest $LOCAL_EXPERIMENT_DIR/pretrain_models

Downloaded 44.31 MB in 5s, Download speed: 8.85 MB/s                
--------------------------------------------------------------------------------
   Transfer id: facenet_vunpruned_v2.0
   Download status: Completed
   Downloaded local path: /home/jupyter/imported_files/files/facenet/pretrain_models/facenet_vunpruned_v2.0
   Total files downloaded: 1
   Total downloaded size: 44.31 MB
   Started at: 2022-09-06 18:06:14.861136
   Completed at: 2022-09-06 18:06:19.875264
   Duration taken: 5s
--------------------------------------------------------------------------------


In [5]:
# Check the pretrained model is present
!if [ ! -f $LOCAL_EXPERIMENT_DIR/pretrain_models/facenet_vunpruned_v2.0/model.tlt ]; then echo 'Pretrain model file not found, please download.'; else echo 'Found Pretrain model file.';fi

Found Pretrain model file.


## 3. Provide training specification <a class="anchor" id="head-3"></a>
The training spec file, `$LOCAL_SPECS_DIR/facenet_train_resnet18_kitti.txt`, covers:
* Tfrecords for the training dataset: to use the newly generated tfrecords, update the `dataset_config` parameter in the spec file
* Augmentation parameters for on-the-fly data augmentation
* Other training (hyper-)parameters such as batch size, number of epochs, and learning rate

*Note: Please set the `load_graph` option to `true` in the model_config to load the pretrained facenet model.*
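A common mistake when editing `dataset_config` is a `tfrecords_path` pattern that matches no files. The sketch below is a small sanity check under stated assumptions: it extracts the patterns with a regular expression rather than parsing the spec properly, and the helper names are hypothetical.

```python
import glob
import re

# Matches lines such as: tfrecords_path: "/path/to/tfrecords/kitti_train-*"
TFRECORDS_RE = re.compile(r'tfrecords_path:\s*"([^"]+)"')


def tfrecord_patterns(spec_text):
    """Return every tfrecords_path value found in the spec text."""
    return TFRECORDS_RE.findall(spec_text)


def patterns_without_matches(spec_text):
    """Return the tfrecords_path patterns that match no file on disk."""
    return [p for p in tfrecord_patterns(spec_text) if not glob.glob(p)]


# Hypothetical spec fragment with a path that is assumed not to exist.
example_spec = '''
dataset_config {
  data_sources {
    tfrecords_path: "/tmp/does-not-exist/tfrecords/training/kitti_train-*"
  }
}
'''
print(patterns_without_matches(example_spec))
```

To check the real spec, read the file at `$LOCAL_SPECS_DIR/facenet_train_resnet18_kitti.txt` and pass its contents to `patterns_without_matches`; an empty list means every pattern matched at least one file.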

In [6]:
!cat $LOCAL_SPECS_DIR/facenet_train_resnet18_kitti.txt

random_seed: 42
dataset_config {
  data_sources {
    tfrecords_path: "/home/jupyter/imported_files/files/facenet/data/tfrecords/training/kitti_train-*"
    image_directory_path: "/home/jupyter/imported_files/files/facenet/data/training"
  }
  validation_data_source {
    tfrecords_path: "/home/jupyter/imported_files/files/facenet/data/tfrecords/validation/kitti_val-*"
    image_directory_path: "/home/jupyter/imported_files/files/facenet/data/validation"
  }
  image_extension: "png"
  target_class_mapping {
    key: "face"
    value: "face"
  }
}
augmentation_config {
  preprocessing {
    output_image_width: 736
    output_image_height: 416
    min_bbox_width: 1.0
    min_bbox_height: 1.0
    output_image_channel: 3
  }
  spatial_augmentation {
    hflip_probability: 0.5
    zoom_min: 0.7
    zoom_max: 1.8
    translate_max_x: 12.0
    translate_max_y: 12.0
  }
  color_augmentation {
    hue_rotation_max: 25.0
    saturation_shift_max: 0.2
    contrast_scale_max: 0.1
    contrast_cent

## 4. Run TAO training <a class="anchor" id="head-4"></a>
* Provide the sample spec file and the output directory location for models

*Note: Training may take hours to complete. The rest of the notebook assumes that training was done in single-GPU mode. If you train in multi-GPU mode, expect to update the pruning and inference steps with new pruning thresholds and updated parameters in clusterfile.json for optimum performance.*

*DetectNet_v2 now supports restarting from a checkpoint. If the training job is killed prematurely, you can resume training from the closest checkpoint by re-running the **same** command line. Make sure to use the <u>**same number of GPUs**</u> when restarting the training.*

*When training with `NUM_GPUS` > 1, you may need to modify `batch_size_per_gpu` and `learning_rate` to get an mAP similar to that of a 1-GPU training run. In most cases, scaling down the batch size by a factor of `NUM_GPUS` or scaling up the learning rate by a factor of `NUM_GPUS` is a good place to start.*
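The two starting points from the note above can be written out as simple arithmetic. This is only a sketch: the function name is hypothetical, and the learning rate of `5e-4` in the example is an assumed value for illustration, not one taken from the spec file.

```python
def multi_gpu_starting_points(batch_size_per_gpu, learning_rate, num_gpus):
    """Return the two suggested starting configurations for multi-GPU training."""
    return {
        # Option 1: keep the learning rate, shrink the per-GPU batch size.
        "scale_batch_down": {
            "batch_size_per_gpu": max(1, batch_size_per_gpu // num_gpus),
            "learning_rate": learning_rate,
        },
        # Option 2: keep the per-GPU batch size, raise the learning rate.
        "scale_lr_up": {
            "batch_size_per_gpu": batch_size_per_gpu,
            "learning_rate": learning_rate * num_gpus,
        },
    }


# Batch size 32 as in the single-GPU run; the learning rate is an assumed example value.
for name, cfg in multi_gpu_starting_points(32, 5e-4, num_gpus=4).items():
    print(name, cfg)
```

Either option is a place to start rather than a guarantee; compare the resulting mAP against the single-GPU run and adjust from there.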

In [21]:
!pip install h5py==2.10.0 

Defaulting to user installation because normal site-packages is not writeable
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Collecting h5py==2.10.0
  Downloading h5py-2.10.0-cp36-cp36m-manylinux1_x86_64.whl (2.9 MB)
     |████████████████████████████████| 2.9 MB 4.8 MB/s            
Installing collected packages: h5py
  Attempting uninstall: h5py
    Found existing installation: h5py 3.1.0
    Uninstalling h5py-3.1.0:
      Successfully uninstalled h5py-3.1.0
Successfully installed h5py-2.10.0


First, we evaluate the pretrained Face Detect network on the WIDER validation set.

In [7]:
!detectnet_v2 evaluate -e $LOCAL_SPECS_DIR/facenet_train_resnet18_kitti.txt\
                           -m $LOCAL_EXPERIMENT_DIR/pretrain_models/facenet_vunpruned_v2.0/model.tlt\
                           -k $KEY

Using TensorFlow backend.
Using TensorFlow backend.


2022-09-06 20:26:22,537 [INFO] iva.detectnet_v2.spec_handler.spec_loader: Merging specification from /home/jupyter/imported_files/files/facenet/specs/facenet_train_resnet18_kitti.txt




2022-09-06 20:26:23,143 [INFO] root: Loading model weights.

2022-09-06 20:26:25,995 [INFO] iva.detectnet_v2.objectives.bbox_objective: Default L1 loss function will be used.
2022-09-06 20:26:25,996 [INFO] root: Building dataloader.
2022-09-06 20:26:27,324 [INFO] root: Sampling mode of the dataloader was set to user_defined.
2022-09-06 20:26:27,325 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Serial augmentation enabled = False
2022-09-06 20:26:27,325 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Pseudo sharding enabled = False
2022-09-06 20:26:27,325 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Max Image Dimensions (all sources): (0, 0)
2022-09-06 20:26:27,326 [INFO] m

Next, we train the model on the WIDER train set, starting from the pretrained Face Detect model weights.

In [8]:
!detectnet_v2 train -e $LOCAL_SPECS_DIR/facenet_train_resnet18_kitti.txt \
                        -r $LOCAL_EXPERIMENT_DIR/experiment_dir_unpruned \
                        -k $KEY \
                        -n resnet18_detector \
                        --gpus $NUM_GPUS

Using TensorFlow backend.
Using TensorFlow backend.




2022-09-06 20:29:18,592 [INFO] iva.common.logging.logging: Log file already exists at /home/jupyter/imported_files/files/facenet/experiment_dir_unpruned/status.json
2022-09-06 20:29:18,593 [INFO] root: Starting DetectNet_v2 Training job
2022-09-06 20:29:18,593 [INFO] __main__: Loading experiment spec at /home/jupyter/imported_files/files/facenet/specs/facenet_train_resnet18_kitti.txt.
2022-09-06 20:29:18,595 [INFO] iva.detectnet_v2.spec_handler.spec_loader: Merging specification from /home/jupyter/imported_files/files/facenet/specs/facenet_train_resnet18_kitti.txt
2022-09-06 20:29:18,600 [INFO] root: Training gridbox model.


2022-09-06 20:29:19,862 [INFO] root: Sampling mode of the dataloader was set to user_defined.
2022-09-06 20:29:19,862 [INFO] __main__: Cannot iterate over exactly 12880 samples with a batch size of 32; each epoch will therefore take one extra step.

2022-09-06 20:29:19,896 [INFO] root: Building DetectNet

In [9]:
print('Model for each epoch:')
print('---------------------')
!ls -lrthR $LOCAL_EXPERIMENT_DIR/experiment_dir_unpruned/

Model for each epoch:
---------------------
/home/jupyter/imported_files/files/facenet/experiment_dir_unpruned/:
total 1.2G
-rw-r--r-- 1 jupyter jupyter 180M Sep  6 18:36 model.step-0.ckzip
-rw-r--r-- 1 jupyter jupyter  45M Sep  6 18:37 model.step-0.tlt
-rw-r--r-- 1 jupyter jupyter 180M Sep  6 19:04 model.step-4030.ckzip
-rw-r--r-- 1 jupyter jupyter  45M Sep  6 19:04 model.step-4030.tlt
-rw-r--r-- 1 jupyter jupyter 180M Sep  6 19:30 model.step-8060.ckzip
-rw-r--r-- 1 jupyter jupyter  45M Sep  6 19:30 model.step-8060.tlt
-rw-r--r-- 1 jupyter jupyter  12M Sep  6 20:19 events.out.tfevents.1662489397.07aec2efe5bc
-rw-r--r-- 1 jupyter jupyter 3.4K Sep  6 20:29 experiment_spec.txt
drwxr-xr-x 2 jupyter jupyter 4.0K Sep  6 20:29 events
-rw-r--r-- 1 jupyter jupyter 8.6M Sep  6 20:29 graph.pbtxt
-rw-r--r-- 1 jupyter jupyter 180M Sep  6 20:29 model.step-12090.ckzip
-rw-r--r-- 1 jupyter jupyter  45M Sep  6 20:30 model.step-12090.tlt
-rw-r--r-- 1 jupyter jupyter 180M Sep  6 20:55 model.step-16120.c
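The step numbers in the checkpoint file names can be related back to epochs. The training log reports 12880 samples with a batch size of 32, which does not divide evenly, so each epoch takes one extra step. The sketch below works through that arithmetic; the helper name is hypothetical, and reading the result as a checkpoint every 10 epochs is an inference from the file names, not a value read from the spec.

```python
import math


def steps_per_epoch(num_samples, batch_size_per_gpu, num_gpus=1):
    """Number of optimizer steps needed to see every sample once."""
    return math.ceil(num_samples / (batch_size_per_gpu * num_gpus))


steps = steps_per_epoch(12880, 32)
print(steps)
# 403

# Map the checkpoint step numbers from the listing above to epochs.
print([step // steps for step in (4030, 8060, 12090, 16120)])
# [10, 20, 30, 40]
```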

## 5. Evaluate the trained model <a class="anchor" id="head-5"></a>

In [10]:
!detectnet_v2 evaluate -e $LOCAL_SPECS_DIR/facenet_train_resnet18_kitti.txt\
                           -m $LOCAL_EXPERIMENT_DIR/experiment_dir_unpruned/weights/resnet18_detector.tlt \
                           -k $KEY

Using TensorFlow backend.
Using TensorFlow backend.


2022-09-06 20:57:47,472 [INFO] iva.detectnet_v2.spec_handler.spec_loader: Merging specification from /home/jupyter/imported_files/files/facenet/specs/facenet_train_resnet18_kitti.txt




2022-09-06 20:57:48,029 [INFO] root: Loading model weights.

2022-09-06 20:57:50,649 [INFO] iva.detectnet_v2.objectives.bbox_objective: Default L1 loss function will be used.
2022-09-06 20:57:50,650 [INFO] root: Building dataloader.
2022-09-06 20:57:52,695 [INFO] root: Sampling mode of the dataloader was set to user_defined.
2022-09-06 20:57:52,696 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Serial augmentation enabled = False
2022-09-06 20:57:52,696 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Pseudo sharding enabled = False
2022-09-06 20:57:52,697 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Max Image Dimensions (all sources): (0, 0)
2022-09-06 20:57:52,697 [INFO] m

## 6. Prune the trained model <a class="anchor" id="head-6"></a>
* Specify the trained model to prune
* Equalization criterion
* Threshold for pruning
* A key to save and load the model
* Output directory to store the model

*Usually, you only need to adjust `-pth` (threshold) to trade off accuracy against model size. A higher `pth` gives you a smaller model (and thus higher inference speed) but worse accuracy. The threshold to use depends on the dataset. A `pth` value of `5.2e-6` is just a starting point. If the retrain accuracy is good, you can increase this value to get smaller models. Otherwise, lower this value to get better accuracy.*

*In some internal studies, we have noticed that a `pth` value of 0.01 is a good starting point for detectnet_v2 models.*
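To build intuition for how the threshold drives model size, the sketch below shows a generic magnitude-based channel selection: each filter's L2 norm is normalized by the largest norm in the layer, and filters at or below the threshold are dropped. This is not TAO's implementation. The normalization scheme is an assumption, the function name is hypothetical, and the sketch ignores how pruning is equalized across connected layers (the `-eq union` option).

```python
import math


def channels_to_keep(filters, threshold):
    """Return indices of filters whose normalized L2 norm exceeds the threshold.

    filters: one flat list of weights per output channel.
    """
    norms = [math.sqrt(sum(w * w for w in f)) for f in filters]
    max_norm = max(norms, default=0.0)
    if max_norm == 0.0:
        return []
    return [i for i, n in enumerate(norms) if n / max_norm > threshold]


# Four toy filters with normalized norms of roughly 1, 0.1, 1e-6, and 0.
toy_filters = [[3.0, 4.0], [0.3, 0.4], [3e-6, 4e-6], [0.0, 0.0]]

for pth in (5.2e-6, 0.01, 0.5):
    print(pth, channels_to_keep(toy_filters, pth))
```

In this toy layer, a very small threshold removes only the near-zero filters, while a large one also removes a filter that still carries weight. That is the size versus accuracy trade-off described above.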

In [11]:
# Create an output directory if it doesn't exist.
!mkdir -p $LOCAL_EXPERIMENT_DIR/experiment_dir_pruned

In [12]:
!detectnet_v2 prune \
                  -m $LOCAL_EXPERIMENT_DIR/experiment_dir_unpruned/weights/resnet18_detector.tlt \
                  -o $LOCAL_EXPERIMENT_DIR/experiment_dir_pruned/resnet18_nopool_bn_detectnet_v2_pruned.tlt \
                  -eq union \
                  -pth 0.0000052 \
                  -k $KEY

Using TensorFlow backend.
Using TensorFlow backend.
2022-09-06 21:02:58,462 [INFO] modulus.pruning.pruning: Exploring graph for retainable indices
2022-09-06 21:02:59,314 [INFO] modulus.pruning.pruning: Pruning model and appending pruned nodes to new graph
2022-09-06 21:03:23,200 [INFO] iva.common.magnet_prune: Pruning ratio (pruned model / original model): 0.9430816061809462


In [13]:
!ls -rlt $LOCAL_EXPERIMENT_DIR/experiment_dir_pruned/

total 42792
-rw-r--r-- 1 jupyter jupyter 43816808 Sep  6 21:03 resnet18_nopool_bn_detectnet_v2_pruned.tlt


## 7. Retrain the pruned model <a class="anchor" id="head-7"></a>
* The model needs to be retrained to recover the accuracy lost to pruning
* Specify a retraining spec that uses the pruned model as pretrained weights

*Note: For retraining, set the `load_graph` option to `true` in the model_config to load the pruned model graph. If the model shows a decrease in mAP after retraining, the originally trained model may have been pruned too much. Try reducing the pruning threshold so that less of the model is pruned, and retrain with the new model.*

In [14]:
# Printing the retrain experiment file.
# Note: We have updated the experiment file to use the
# newly pruned model as pretrained weights, and the
# load_graph option is set to true.
!cat $LOCAL_SPECS_DIR/facenet_retrain_resnet18_kitti.txt

random_seed: 42
dataset_config {
  data_sources {
    tfrecords_path: "/home/jupyter/imported_files/files/facenet/data/tfrecords/training/kitti_train-*"
    image_directory_path: "/home/jupyter/imported_files/files/facenet/data/training"
  }
  validation_data_source {
    tfrecords_path: "/home/jupyter/imported_files/files/facenet/data/tfrecords/validation/kitti_val-*"
    image_directory_path: "/home/jupyter/imported_files/files/facenet/data/validation"
  }
  image_extension: "png"
  target_class_mapping {
    key: "face"
    value: "face"
  }
}
augmentation_config {
  preprocessing {
    output_image_width: 736
    output_image_height: 416
    min_bbox_width: 1.0
    min_bbox_height: 1.0
    output_image_channel: 3
  }
  spatial_augmentation {
    hflip_probability: 0.5
    zoom_min: 0.7
    zoom_max: 1.8
    translate_max_x: 12.0
    translate_max_y: 12.0
  }
  color_augmentation {
    hue_rotation_max: 25.0
    saturation_shift_max: 0.20000000298
    contrast_scale_max: 0.100000001

In [4]:
# Retraining using the pruned model as pretrained weights 
!detectnet_v2 train -e $LOCAL_SPECS_DIR/facenet_retrain_resnet18_kitti.txt \
                        -r $LOCAL_EXPERIMENT_DIR/experiment_dir_retrain \
                        -k $KEY \
                        -n resnet18_detector_pruned \
                        --gpus $NUM_GPUS

Using TensorFlow backend.
Using TensorFlow backend.




2022-09-06 21:26:04,068 [INFO] iva.common.logging.logging: Log file already exists at /home/jupyter/imported_files/files/facenet/experiment_dir_retrain/status.json
2022-09-06 21:26:04,068 [INFO] root: Starting DetectNet_v2 Training job
2022-09-06 21:26:04,068 [INFO] __main__: Loading experiment spec at /home/jupyter/imported_files/files/facenet/specs/facenet_retrain_resnet18_kitti.txt.
2022-09-06 21:26:04,070 [INFO] iva.detectnet_v2.spec_handler.spec_loader: Merging specification from /home/jupyter/imported_files/files/facenet/specs/facenet_retrain_resnet18_kitti.txt
2022-09-06 21:26:04,075 [INFO] root: Training gridbox model.


2022-09-06 21:26:05,392 [INFO] root: Sampling mode of the dataloader was set to user_defined.
2022-09-06 21:26:05,392 [INFO] __main__: Cannot iterate over exactly 12880 samples with a batch size of 32; each epoch will therefore take one extra step.






2022-09-06 21:26:05,428 [INFO] root: Building Detect

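The training log above reports that 12880 samples cannot be split evenly into batches of 32, so each epoch takes one extra step. The arithmetic can be reproduced with a short sketch. It assumes the batch size printed in the log is the effective number of samples consumed per step; the variable names are illustrative only.

```python
from math import ceil

num_samples = 12880  # training samples reported in the log
batch_size = 32      # batch size reported in the log

# Number of complete batches and the samples left over for a final partial batch.
full_steps, remainder = divmod(num_samples, batch_size)
steps_per_epoch = ceil(num_samples / batch_size)

print(f"full batches: {full_steps}, leftover samples: {remainder}")
# -> full batches: 402, leftover samples: 16
print(f"steps per epoch: {steps_per_epoch}")
# -> steps per epoch: 403
```
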
In [5]:
# Listing the newly retrained model.
!ls -rlt $LOCAL_EXPERIMENT_DIR/experiment_dir_retrain/weights

total 42788
-rw-r--r-- 1 jupyter jupyter 43813032 Sep  6 21:40 resnet18_detector_pruned.tlt


## 8. Evaluate the retrained model <a class="anchor" id="head-8"></a>

This section evaluates the pruned and retrained model using the `detectnet_v2 evaluate` command.

In [6]:
!detectnet_v2 evaluate -e $LOCAL_SPECS_DIR/facenet_retrain_resnet18_kitti.txt \
                           -m $LOCAL_EXPERIMENT_DIR/experiment_dir_retrain/weights/resnet18_detector_pruned.tlt \
                           -k $KEY

Using TensorFlow backend.
Using TensorFlow backend.


2022-09-06 21:57:36,781 [INFO] iva.detectnet_v2.spec_handler.spec_loader: Merging specification from /home/jupyter/imported_files/files/facenet/specs/facenet_retrain_resnet18_kitti.txt




2022-09-06 21:57:37,251 [INFO] root: Loading model weights.

2022-09-06 21:57:39,591 [INFO] iva.detectnet_v2.objectives.bbox_objective: Default L1 loss function will be used.
2022-09-06 21:57:39,591 [INFO] root: Building dataloader.
2022-09-06 21:57:40,897 [INFO] root: Sampling mode of the dataloader was set to user_defined.
2022-09-06 21:57:40,898 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Serial augmentation enabled = False
2022-09-06 21:57:40,898 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Pseudo sharding enabled = False
2022-09-06 21:57:40,898 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: Max Image Dimensions (all sources): (0, 0)
2022-09-06 21:57:40,898 [INFO]

## 9. Visualize inferences <a class="anchor" id="head-9"></a>
In this section, we run the `detectnet_v2 inference` tool to generate inferences with the pruned and retrained model.

In [7]:
# Running inference for detection on n images
!detectnet_v2 inference -e $LOCAL_SPECS_DIR/facenet_inference_kitti_tlt.txt \
                            -o $LOCAL_EXPERIMENT_DIR/tlt_infer_testing \
                            -i $LOCAL_DATA_DIR/validation/images \
                            -k $KEY

Using TensorFlow backend.
Using TensorFlow backend.
INFO: Merging specification from /home/jupyter/imported_files/files/facenet/specs/facenet_inference_kitti_tlt.txt
INFO: Creating output inference directory
INFO: Overlain images will be saved in the output path.
INFO: Constructing inferencer




INFO: Loading model from /home/jupyter/imported_files/files/facenet/experiment_dir_retrain/weights/resnet18_detector_pruned.tlt:

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_1 (InputLayer)         (None, 3, 416, 736)       0         
_________________________________________________________________
model_1 (Model)              [(None, 1, 26, 46), (None 10893397  
Total params: 10,893,397
Trainable params: 10,882,277
Non-trainable params: 11,120
_________________________________________________________________
INFO: Initialized model
INFO: Commencing inference
100%|████████████████████

The `inference` tool produces two outputs:
1. Overlaid images in `$LOCAL_EXPERIMENT_DIR/tlt_infer_testing/images_annotated`
2. Frame-by-frame bounding box labels in KITTI format in `$LOCAL_EXPERIMENT_DIR/tlt_infer_testing/labels`

*Note: To run inference on a single image, replace the path passed to the `-i` flag of the `inference` command with the path to that image.*
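
To work with the label files programmatically, the sketch below reads KITTI-format text and extracts the class name and bounding box from each line. It relies only on the standard KITTI field order, where field 0 is the class and fields 4 to 7 are the box corners `x1 y1 x2 y2`. The `Detection` type, the `parse_kitti_labels` helper, and the sample line are made up for illustration and are not part of TAO.

```python
from typing import List, NamedTuple


class Detection(NamedTuple):
    label: str
    x1: float
    y1: float
    x2: float
    y2: float


def parse_kitti_labels(text: str) -> List[Detection]:
    """Parse KITTI label text into a list of detections (hypothetical helper)."""
    detections = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 8:
            # Skip blank or malformed lines that lack the bbox fields.
            continue
        x1, y1, x2, y2 = (float(v) for v in fields[4:8])
        detections.append(Detection(fields[0], x1, y1, x2, y2))
    return detections


# Made-up label line, not real output from the inference run.
sample = "face 0.00 0 0.00 100.0 50.0 180.0 140.0 0.00 0.00 0.00 0.00 0.00 0.00 0.00\n"
for det in parse_kitti_labels(sample):
    # Print the class name followed by the box width and height in pixels.
    print(det.label, det.x2 - det.x1, det.y2 - det.y1)
```

Any fields after the bounding box are ignored here, so the helper works whether or not a trailing confidence value is present.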

In [None]:
# Simple grid visualizer
%matplotlib inline
import matplotlib.pyplot as plt
import os
from math import ceil
valid_image_ext = ['.jpg', '.png', '.jpeg', '.ppm']

def visualize_images(image_dir, num_cols=4, num_images=10):
    """Show up to num_images images from image_dir in a grid of num_cols columns."""
    output_path = os.path.join(os.environ['LOCAL_EXPERIMENT_DIR'], image_dir)
    num_rows = int(ceil(float(num_images) / float(num_cols)))
    # squeeze=False keeps axarr 2-D even when the grid has a single row or column.
    f, axarr = plt.subplots(num_rows, num_cols, figsize=[80,30], squeeze=False)
    # Sort the listing so the same images appear in the same order on every run.
    a = [os.path.join(output_path, image) for image in sorted(os.listdir(output_path))
         if os.path.splitext(image)[1].lower() in valid_image_ext]
    for idx, img_path in enumerate(a[:num_images]):
        col_id = idx % num_cols
        row_id = idx // num_cols
        img = plt.imread(img_path)
        axarr[row_id, col_id].imshow(img)
    # Adjust the layout after the images have been drawn.
    f.tight_layout()

In [None]:
# Visualizing the first 12 images.
OUTPUT_PATH = 'tlt_infer_testing/images_annotated' # relative path from $LOCAL_EXPERIMENT_DIR.
COLS = 4 # number of columns in the visualizer grid.
IMAGES = 12 # number of images to visualize.

visualize_images(OUTPUT_PATH, num_cols=COLS, num_images=IMAGES)