# Face Mask Detection using NVIDIA TLT 

The MIT License (MIT)

Copyright (c) 2019-2020, NVIDIA CORPORATION.

Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"), to deal in
the Software without restriction, including without limitation the rights to
use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of
the Software, and to permit persons to whom the Software is furnished to do so,
subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS
FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

## DetectNet_v2 with ResNet-18 example use case

The goal of this notebook is to use the NVIDIA Transfer Learning Toolkit (TLT) to train a face mask detection model and make it ready for deployment.
Along the way, the notebook serves as an example use case of object detection with DetectNet_v2 in TLT.

0. [Set up env variables](#head-0)
1. [Prepare dataset and pre-trained model](#head-1)
    1. [Download dataset and convert to KITTI format](#head-1-1)
    2. [Prepare tfrecords from KITTI-format dataset](#head-1-2)
    3. [Download pre-trained model](#head-1-3)
2. [Provide training specification](#head-2)
3. [Run TLT training](#head-3)
4. [Evaluate trained models](#head-4)
5. [Prune trained models](#head-5)
6. [Retrain pruned models](#head-6)
7. [Evaluate retrained model](#head-7)
8. [Visualize inferences](#head-8)
9. [Deploy](#head-9)
    1. [Int8 Optimization](#head-9-1)
    2. [Generate TensorRT engine](#head-9-2)
10. [Verify Deployed Model](#head-10)
    1. [Inference using TensorRT engine](#head-10-1)

![Face Mask Detection Output](https://raw.githubusercontent.com/NVIDIA-AI-IOT/face-mask-detection/master/images/face-mask-detect-output.png)

## 0. Set up env variables <a class="anchor" id="head-0"></a>
When using the purpose-built pretrained models from NGC, set the `$KEY` environment variable to the key given in the model overview. Failing to do so can lead to errors when loading them as pretrained models.

*Note: Remove any stray artifacts/files from the `$USER_EXPERIMENT_DIR` or `$DATA_DOWNLOAD_DIR` paths set below that may have been generated by previous experiments. Leftover checkpoint files and similar artifacts may interfere with creating a training graph for a new experiment.*

In [1]:
# Setting up env variables for cleaner command line commands.
print("Update directory paths if needed")
%env KEY=tlt_encode
# User directory - pre-trained/unpruned/pruned/final models will be saved here
#%env USER_EXPERIMENT_DIR=/home/detectnet_v2 
%env USER_EXPERIMENT_DIR=/tlt/face-mask-detection

# Download directory - tfrecords will be generated here
#%env DATA_DOWNLOAD_DIR=/home/data_fm_0916           
%env DATA_DOWNLOAD_DIR=/tlt/data/facemask/KITTI/train/

# Spec Directory
#%env SPECS_DIR=/home/detectnet_v2/specs 
%env SPECS_DIR=/tlt/face-mask-detection/tlt_specs
# Number of GPUs used for training
%env NUM_GPUS=1

Update directory paths if needed
env: KEY=tlt_encode
env: USER_EXPERIMENT_DIR=/tlt/face-mask-detection
env: DATA_DOWNLOAD_DIR=/tlt/data/facemask/KITTI/train/
env: SPECS_DIR=/tlt/face-mask-detection/tlt_specs
env: NUM_GPUS=1
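
The note above asks for leftover files to be cleared from the experiment and data directories. A minimal sketch of how one might list candidates before deleting anything by hand is shown below. The helper name `find_stale_artifacts` and the file patterns are assumptions for illustration only; adjust them to whatever your previous runs actually produced.

```python
import os
from pathlib import Path

# Hypothetical patterns; what earlier runs leave behind depends on your setup.
DEFAULT_PATTERNS = ("checkpoint", "*.ckpt*", "*.tlt")


def find_stale_artifacts(root, patterns=DEFAULT_PATTERNS):
    """Return sorted file paths under `root` whose name matches any pattern."""
    root = Path(root)
    if not root.is_dir():
        return []
    matches = set()
    for pattern in patterns:
        # rglob searches the directory tree recursively.
        matches.update(p for p in root.rglob(pattern) if p.is_file())
    return sorted(matches)


# Only lists files; nothing is deleted.
for var in ("USER_EXPERIMENT_DIR", "DATA_DOWNLOAD_DIR"):
    directory = os.environ.get(var, "").strip()
    if not directory:
        print(f"{var} is not set")
        continue
    for path in find_stale_artifacts(directory):
        print(f"{var}: {path}")
```

The sketch deliberately stops at listing, so that pretrained weights or datasets are not removed by accident.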


## 1. Prepare dataset and pre-trained model <a class="anchor" id="head-1"></a>

### A. Download dataset and convert to KITTI format <a class="anchor" id="head-1-1"></a>

In this experiment we will use four different datasets:

1. Faces with Mask:
    - Kaggle Medical Mask Dataset [Download Link](https://www.kaggle.com/ivandanilovich/medical-masks-dataset-images-tfrecords)
    - MAFA - MAsked FAces [Download Link](https://drive.google.com/drive/folders/1nbtM1n0--iZ3VVbNGhocxbnBGhMau_OG)
2. Faces without Mask:
    - FDDB Dataset [Download Link](http://vis-www.cs.umass.edu/fddb/)
    - WiderFace Dataset [Download Link](http://shuoyang1213.me/WIDERFACE/)

- Download the data using the provided links, such that all images and label files are in one folder. The expected directory structure is described in the GitHub repo.
- Convert the dataset to KITTI format.
- Use the KITTI-format directory as `$DATA_DOWNLOAD_DIR`.


Note: We do not use all the images from MAFA and WiderFace. Combined, we will use about 6000 faces each with and without a mask.
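
For the conversion step, each image gets a text label file with one line per object. The sketch below formats such a line in the standard 15-field KITTI object layout, filling the fields that 2D detection does not use with zeros. The function name `kitti_label_line` is hypothetical, and zero-filling the unused fields is an assumption about what the downstream converter accepts; the class names `mask` and `no-mask` are the ones used in this notebook's training spec.

```python
def kitti_label_line(class_name, xmin, ymin, xmax, ymax):
    """Format one object as a 15-field KITTI label line."""
    if xmax <= xmin or ymax <= ymin:
        raise ValueError("bounding box must have positive width and height")
    fields = [
        class_name,                # object type (must not contain spaces)
        "0.00",                    # truncated
        "0",                       # occluded
        "0.00",                    # alpha (observation angle)
        f"{xmin:.2f}", f"{ymin:.2f}", f"{xmax:.2f}", f"{ymax:.2f}",  # 2D bbox
        "0.00", "0.00", "0.00",    # 3D dimensions (unused here)
        "0.00", "0.00", "0.00",    # 3D location (unused here)
        "0.00",                    # rotation_y (unused here)
    ]
    return " ".join(fields)


print(kitti_label_line("mask", 10, 20, 110, 140))
```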

### B. Prepare tfrecords from KITTI-format dataset <a class="anchor" id="head-1-2"></a>

* Update the tfrecords spec file to take in your KITTI-format dataset
* Create the tfrecords using `tlt-dataset-convert`

*Note: TFRecords only need to be generated once.*

In [2]:
print("TFrecords conversion spec file for kitti training")
!cat $SPECS_DIR/detectnet_v2_tfrecords_kitti_trainval.txt

TFrecords conversion spec file for kitti training
kitti_config {
  root_directory_path: "/tlt/data/facemask/KITTI/train"
  image_dir_name: "images"
  label_dir_name: "labels"
  image_extension: ".jpg"
  partition_mode: "random"
  num_partitions: 2
  val_split: 20
  num_shards: 10 }


In [20]:
# Creating a new directory for the output tfrecords dump.
print("Converting Tfrecords for kitti trainval dataset")
!tlt-dataset-convert -d $SPECS_DIR/detectnet_v2_tfrecords_kitti_trainval.txt \
                     -o $DATA_DOWNLOAD_DIR/tfrecords/kitti_trainval/kitti_trainval

Converting Tfrecords for kitti trainval dataset
2021-11-28 12:13:20.145983: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
Using TensorFlow backend.
2021-11-28 12:13:24,609 - iva.detectnet_v2.dataio.build_converter - INFO - Instantiating a kitti converter
2021-11-28 12:13:24,659 - iva.detectnet_v2.dataio.kitti_converter_lib - INFO - Num images in
Train: 6308	Val: 1576
2021-11-28 12:13:24,659 - iva.detectnet_v2.dataio.kitti_converter_lib - INFO - Validation data in partition 0. Hence, while choosing the validationset during training choose validation_fold 0.
2021-11-28 12:13:24,672 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 0


2021-11-28 12:13:24,971 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 1
2021-11-28 12:13:25,251 - iva.detectnet_v2.dataio.dataset_converter_lib - INFO - Writing partition 0, shard 2
2021-11-28 12:13:25,531 - iva.det
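
The converter log above reports 6308 training and 1576 validation images, which is consistent with holding out `val_split: 20` percent of 7884 images and rounding down. A quick check of that arithmetic follows; the rounding-down rule is inferred from the logged numbers, not taken from the converter's source.

```python
def split_counts(total_images, val_split_percent):
    """Return (train, val) counts, rounding the validation share down."""
    val = total_images * val_split_percent // 100
    return total_images - val, val


train, val = split_counts(6308 + 1576, 20)
print(train, val)  # 6308 1576
```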

In [21]:
!ls -rlt $DATA_DOWNLOAD_DIR/tfrecords/kitti_trainval/

total 5196
-rw-r--r-- 1 root root 109916 Nov 28 12:13 kitti_trainval-fold-000-of-002-shard-00000-of-00010
-rw-r--r-- 1 root root 106034 Nov 28 12:13 kitti_trainval-fold-000-of-002-shard-00001-of-00010
-rw-r--r-- 1 root root 106561 Nov 28 12:13 kitti_trainval-fold-000-of-002-shard-00002-of-00010
-rw-r--r-- 1 root root 103324 Nov 28 12:13 kitti_trainval-fold-000-of-002-shard-00003-of-00010
-rw-r--r-- 1 root root 104889 Nov 28 12:13 kitti_trainval-fold-000-of-002-shard-00004-of-00010
-rw-r--r-- 1 root root 107046 Nov 28 12:13 kitti_trainval-fold-000-of-002-shard-00005-of-00010
-rw-r--r-- 1 root root 105324 Nov 28 12:13 kitti_trainval-fold-000-of-002-shard-00006-of-00010
-rw-r--r-- 1 root root 104674 Nov 28 12:13 kitti_trainval-fold-000-of-002-shard-00007-of-00010
-rw-r--r-- 1 root root 104523 Nov 28 12:13 kitti_trainval-fold-000-of-002-shard-00008-of-00010
-rw-r--r-- 1 root root 112047 Nov 28 12:13 kitti_trainval-fold-000-of-002-shard-00009-of-00010
-rw-r--r-- 1 root root 42179
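
The shard file names in the listing encode the fold and shard indices, matching `num_partitions: 2` and `num_shards: 10` in the conversion spec. A small sketch for reading those indices back out of a file name is given below; the helper name `parse_shard_name` is hypothetical and the pattern is derived only from the names shown in this listing.

```python
import re

SHARD_RE = re.compile(
    r"^(?P<prefix>.+)-fold-(?P<fold>\d+)-of-(?P<folds>\d+)"
    r"-shard-(?P<shard>\d+)-of-(?P<shards>\d+)$"
)


def parse_shard_name(name):
    """Split a tfrecords shard file name into its prefix and indices."""
    match = SHARD_RE.match(name)
    if match is None:
        raise ValueError(f"not a shard file name: {name!r}")
    info = match.groupdict()
    return {
        "prefix": info["prefix"],
        "fold": int(info["fold"]),
        "folds": int(info["folds"]),
        "shard": int(info["shard"]),
        "shards": int(info["shards"]),
    }


print(parse_shard_name("kitti_trainval-fold-000-of-002-shard-00003-of-00010"))
```

Grouping files by `fold` makes it easy to confirm that every fold has the expected number of shards.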

### C. Download pre-trained model <a class="anchor" id="head-1-3"></a>
Download the correct pretrained model from the NGC model registry for your experiment. Please note that for DetectNet_v2, the input is expected to be 0-1 normalized with input channels in RGB order. Therefore, for optimum results please download models with `*_detectnet_v2` in their name string. All other models expect input preprocessing with mean subtraction and input channels in BGR order. Thus, using them as pretrained weights may result in suboptimal performance. 
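
To make the difference between the two input conventions concrete, here is a minimal pure-Python sketch applied to a single pixel. The function names are hypothetical, and the per-channel mean values are the ones commonly used with Caffe-style ImageNet models; they are an assumption for illustration, not values taken from TLT.

```python
# Per-channel means in BGR order; assumed here purely for illustration.
BGR_MEANS = (103.939, 116.779, 123.68)


def rgb_unit_scale(pixel_rgb):
    """Scale an 8-bit RGB pixel to the 0-1 range, keeping RGB order."""
    return tuple(channel / 255.0 for channel in pixel_rgb)


def bgr_mean_subtract(pixel_rgb, means=BGR_MEANS):
    """Reorder an 8-bit RGB pixel to BGR and subtract per-channel means."""
    r, g, b = pixel_rgb
    return tuple(channel - mean for channel, mean in zip((b, g, r), means))


pixel = (255, 128, 0)
print(rgb_unit_scale(pixel))      # values between 0 and 1, RGB order
print(bgr_mean_subtract(pixel))   # roughly zero-centred values, BGR order
```

Weights trained under one convention see very differently scaled and ordered inputs under the other, which is why mixing them tends to hurt accuracy.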

In [4]:
# List models available in the model registry.
!ngc registry model list nvidia/tlt_pretrained_detectnet_v2:*

+-------+-------+-------+-------+-------+-------+-------+-------+-------+
| Versi | Accur | Epoch | Batch | GPU   | Memor | File  | Statu | Creat |
| on    | acy   | s     | Size  | Model | y Foo | Size  | s     | ed    |
|       |       |       |       |       | tprin |       |       | Date  |
|       |       |       |       |       | t     |       |       |       |
+-------+-------+-------+-------+-------+-------+-------+-------+-------+
| vgg19 | 82.6  | 80    | 1     | V100  | 153.8 | 153.7 | UPLOA | Apr   |
|       |       |       |       |       |       | 7 MB  | D_COM | 29,   |
|       |       |       |       |       |       |       | PLETE | 2020  |
| vgg16 | 82.2  | 80    | 1     | V100  | 113.2 | 113.2 | UPLOA | Apr   |
|       |       |       |       |       |       | MB    | D_COM | 29,   |
|       |       |       |       |       |       |       | PLETE | 2020  |
| squee | 65.67 | 80    | 1     | V100  | 6.5   | 6.46  | UPLOA | Apr   |
| zenet |       |       |

In [5]:
# Create the target destination to download the model.
!mkdir -p $USER_EXPERIMENT_DIR/pretrained_resnet18/

In [6]:
# Download the pretrained model from NGC
!ngc registry model download-version nvidia/tlt_pretrained_detectnet_v2:resnet18 \
    --dest $USER_EXPERIMENT_DIR/pretrained_resnet18

Downloaded 82.28 MB in 27s, Download speed: 3.04 MB/s               
----------------------------------------------------
Transfer id: tlt_pretrained_detectnet_v2_vresnet18 Download status: Completed.
Downloaded local path: /tlt/face-mask-detection/pretrained_resnet18/tlt_pretrained_detectnet_v2_vresnet18-1
Total files downloaded: 1 
Total downloaded size: 82.28 MB
Started at: 2021-11-24 15:34:47.675148
Completed at: 2021-11-24 15:35:14.706673
Duration taken: 27s
----------------------------------------------------


In [17]:
!ls -rlt $USER_EXPERIMENT_DIR/pretrained_resnet18/tlt_pretrained_detectnet_v2_vresnet18

total 91160
-rw------- 1 1001 1001 93345248 Nov 10 07:04 resnet18.hdf5


## 2. Provide training specification <a class="anchor" id="head-2"></a>
* Tfrecords for the train datasets
    * In order to use the newly generated tfrecords, update the dataset_config parameter in the spec file at `$SPECS_DIR/detectnet_v2_train_resnet18_kitti.txt` 
    * Update the fold number to use for evaluation. In case of random data split, please use fold `0` only
    * For sequence-wise split, you may use any fold generated from the dataset convert tool
* Pre-trained models
* Augmentation parameters for on the fly data augmentation
* Other training (hyper-)parameters such as batch size, number of epochs, learning rate etc.
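
Before training, it can help to confirm that the spec file points at the tfrecords you generated and uses the intended validation fold. The sketch below does this with regular expressions on the spec text. It is not a real protobuf-text parser, and the helper name `read_spec_fields` is hypothetical.

```python
import re


def read_spec_fields(spec_text):
    """Extract tfrecords paths and the validation fold from spec text."""
    # Drop commented-out lines so they are not mistaken for live settings.
    live = "\n".join(
        line for line in spec_text.splitlines()
        if not line.lstrip().startswith("#")
    )
    paths = re.findall(r'tfrecords_path:\s*"([^"]*)"', live)
    fold = re.search(r"validation_fold:\s*(\d+)", live)
    return {
        "tfrecords_paths": paths,
        "validation_fold": int(fold.group(1)) if fold else None,
    }


sample = '''
dataset_config {
  data_sources {
    tfrecords_path: "/tlt/data/facemask/KITTI/train/tfrecords/kitti_trainval/*"
  }
  validation_fold: 0
  #validation_data_source: {
    #tfrecords_path: "/home/data/tfrecords/kitti_val/*"
  #}
}
'''
print(read_spec_fields(sample))
```

To check a real file, pass in the contents of the spec at `$SPECS_DIR/detectnet_v2_train_resnet18_kitti.txt`.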

In [8]:
!cat $SPECS_DIR/detectnet_v2_train_resnet18_kitti.txt

random_seed: 42
dataset_config {
  data_sources {
    tfrecords_path: "/tlt/data/facemask/KITTI/train/tfrecords/kitti_trainval/*"
    image_directory_path: "/tlt/data/facemask/KITTI/train"
  }
  image_extension: "jpg"
  target_class_mapping {
    key: "mask"
    value: "mask"
  }
  target_class_mapping {
    key: "no-mask"
    value: "no-mask"
  }
  validation_fold: 0
  #validation_data_source: {
    #tfrecords_path: "/home/data/tfrecords/kitti_val/*"
    #image_directory_path: "/home/data/test"
  #}
}


augmentation_config {
  preprocessing {
    output_image_width: 960
    output_image_height: 544
    min_bbox_width: 1.0
    min_bbox_height: 1.0
    output_image_channel: 3
  }
  spatial_augmentation {
    hflip_probability: 0.5
    vflip_probability: 0.0
    zoom_min: 1.0
    zoom_max: 1.0
    translate_max_x: 8.0
    translate_max_y: 8.0
  }
  color_augmentation {
    hue_rotation_max: 25.0
    saturation_shift_max: 0.20000000298
    contras

## 3. Run TLT training <a class="anchor" id="head-3"></a>
* Provide the sample spec file and the output directory location for models

*Note: Training may take hours to complete. Also, the rest of the notebook assumes that training was done in single-GPU mode. When running in multi-GPU mode, expect to update the pruning and inference steps with new pruning thresholds and updated parameters in `clusterfile.json` for optimum performance.*

*DetectNet_v2 now supports restarting from a checkpoint. In case the training job is killed prematurely, you may resume training from the closest checkpoint by re-running the same command line. Make sure to use the same number of GPUs when restarting the training.*
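
The progress lines in the training log report the time taken for the last epoch together with an ETA. In this run the ETA is consistent with multiplying the number of remaining epochs by the last epoch's duration. The sketch below reproduces that arithmetic for two logged lines (epoch 3 and epoch 18 of 120); the formula is inferred from the logged numbers, not taken from the TLT source, and the function name is hypothetical.

```python
from datetime import timedelta


def remaining_time(current_epoch, total_epochs, seconds_per_epoch):
    """Estimate remaining time as epochs left times the last epoch's duration."""
    epochs_left = total_epochs - current_epoch
    return timedelta(seconds=epochs_left * seconds_per_epoch)


# Epoch durations copied from the log lines for epochs 3 and 18.
print(remaining_time(3, 120, 29.745577))
print(remaining_time(18, 120, 30.027291))
```

The log itself prints `0:58:00.232548` and `0:51:02.783664` for these two lines, and the estimates agree with both to well under a millisecond.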

In [22]:
%env NUM_GPUS=4
!CUDA_VISIBLE_DEVICES=0,1,2,3 tlt-train detectnet_v2 -e $SPECS_DIR/detectnet_v2_train_resnet18_kitti.txt \
                        -r $USER_EXPERIMENT_DIR/experiment_dir_unpruned \
                        -k $KEY \
                        -n resnet18_detector \
                        --gpus $NUM_GPUS

env: NUM_GPUS=4
2021-11-28 12:13:52.454596: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2021-11-28 12:13:52.454593: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2021-11-28 12:13:52.475758: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2021-11-28 12:13:52.488468: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
Using TensorFlow backend.
Using TensorFlow backend.
Using TensorFlow backend.
Using TensorFlow backend.
2021-11-28 12:13:57.267462: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2021-11-28 12:13:57.267978: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2021-11-28 12:

2021-11-28 12:14:00,676 [INFO] iva.detectnet_v2.scripts.train: Cannot iterate over exactly 6308 samples with a batch size of 8; each epoch will therefore take one extra step.
2021-11-28 12:14:00,676 [INFO] iva.detectnet_v2.scripts.train: Cannot iterate over exactly 197 steps per epoch with 8 processors; each processor will therefore take one extra step per epoch.
2021-11-28 12:14:00,698 [INFO] iva.detectnet_v2.scripts.train: Cannot iterate over exactly 6308 samples with a batch size of 8; each epoch will therefore take one extra step.
2021-11-28 12:14:00,698 [INFO] iva.detectnet_v2.scripts.train: Cannot iterate over exactly 197 steps per epoch with 8 processors; each processor will therefore take one extra step per epoch.
2021-11-28 12:14:00,731 [INFO] iva.detectnet_v2.scripts.train: Cannot iterate over exactly 6308 samples with a batch size of 8; each epoch will therefore take one extra step.
2021-11-28 12:14:00,731 [INFO] iva.detectnet_v2.scripts.train: Cannot iterate over exactly 19

2021-11-28 12:14:18,140 [INFO] iva.detectnet_v2.dataloader.default_dataloader: Bounding box coordinates were detected in the input specification! Bboxes will be automatically converted to polygon coordinates.
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_1 (InputLayer)            (None, 3, 544, 960)  0                                            
__________________________________________________________________________________________________
conv1 (Conv2D)                  (None, 64, 272, 480) 9472        input_1[0][0]                    
__________________________________________________________________________________________________
bn_conv1 (BatchNormalization)   (None, 64, 272, 480) 256         conv1[0][0]                      
__________________________________________________________________________________________

2021-11-28 12:14:18.197996: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1665] Found device 0 with properties: 
name: Tesla V100-SXM2-16GB major: 7 minor: 0 memoryClockRate(GHz): 1.312
pciBusID: 0000:06:00.0
2021-11-28 12:14:18.199795: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1665] Found device 1 with properties: 
name: Tesla V100-SXM2-16GB major: 7 minor: 0 memoryClockRate(GHz): 1.312
pciBusID: 0000:07:00.0
2021-11-28 12:14:18.201555: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1665] Found device 2 with properties: 
name: Tesla V100-SXM2-16GB major: 7 minor: 0 memoryClockRate(GHz): 1.312
pciBusID: 0000:0a:00.0
2021-11-28 12:14:18.203293: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1665] Found device 3 with properties: 
name: Tesla V100-SXM2-16GB major: 7 minor: 0 memoryClockRate(GHz): 1.312
pciBusID: 0000:0b:00.0
2021-11-28 12:14:18.203348: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.

__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_1 (InputLayer)            (None, 3, 544, 960)  0                                            
__________________________________________________________________________________________________
conv1 (Conv2D)                  (None, 64, 272, 480) 9472        input_1[0][0]                    
__________________________________________________________________________________________________
bn_conv1 (BatchNormalization)   (None, 64, 272, 480) 256         conv1[0][0]                      
__________________________________________________________________________________________________
activation_1 (Activation)       (None, 64, 272, 480) 0           bn_conv1[0][0]                   
__________________________________________________________________________________________________


2021-11-28 12:14:18,891 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: shuffle: True - shard 0 of 4
2021-11-28 12:14:18,903 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: sampling 1 datasets with weights:
2021-11-28 12:14:18,904 [INFO] modulus.blocks.data_loaders.multi_source_loader.data_loader: source: 0 weight: 1.000000
2021-11-28 12:14:18,927 [INFO] iva.detectnet_v2.dataloader.default_dataloader: Bounding box coordinates were detected in the input specification! Bboxes will be automatically converted to polygon coordinates.
2021-11-28 12:14:18.986196: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1665] Found device 0 with properties: 
name: Tesla V100-SXM2-16GB major: 7 minor: 0 memoryClockRate(GHz): 1.312
pciBusID: 0000:06:00.0
2021-11-28 12:14:18.987980: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1665] Found device 1 with properties: 
name: Tesla V100-SXM2-16GB major: 7 minor: 0 memoryClockRate(GHz): 1.312
pciBusID: 0000:07:00

2021-11-28 12:14:26.183767: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1665] Found device 0 with properties: 
name: Tesla V100-SXM2-16GB major: 7 minor: 0 memoryClockRate(GHz): 1.312
pciBusID: 0000:0b:00.0
2021-11-28 12:14:26.183875: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2021-11-28 12:14:26.184032: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2021-11-28 12:14:26.184112: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2021-11-28 12:14:26.184161: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2021-11-28 12:14:26.184208: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.11
2021-11-28 12:14:26.184254: I tensorflow/stream_executor/platfo

2021-11-28 12:15:19.803747: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2021-11-28 12:15:19.818971: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
node1:289519:290048 [0] NCCL INFO Bootstrap : Using [0]lo:127.0.0.1<0> [1]enp1s0f0:10.19.222.12<0> [2]enp139s0:11.11.11.11<0> [3]ens1:1.1.1.1<0> [4]veth02c95e2:fe80::89b:2fff:fedc:14ba%veth02c95e2<0> [5]veth88073f6:fe80::14ad:a8ff:feea:be33%veth88073f6<0> [6]vethbf68aa2:fe80::c8b:96ff:fed0:3587%vethbf68aa2<0> [7]veth89fc99d:fe80::d87d:e2ff:fe9a:ad1d%veth89fc99d<0> [8]vethb46690e:fe80::e82e:73ff:fe81:19b9%vethb46690e<0> [9]veth050e390:fe80::c0fb:a1ff:fee1:158c%veth050e390<0> [10]veth607b702:fe80::3cb9:ccff:fe70:daa5%veth607b702<0> [11]veth9399886:fe80::c8bc:30ff:fe4e:88a8%veth9399886<0> [12]vethb9e10ef:fe80::1ccd:93ff:fe20:2a99%vethb9e10ef<0> [13]veth618003f:fe80::8cbe:69ff:fe9a:b28%veth618003f<0>
node1:28

node1:289519:290048 [0] NCCL INFO Channel 00 : 0[6000] -> 1[7000] via P2P/IPC
node1:289523:290057 [3] NCCL INFO Channel 00 : 3[b000] -> 0[6000] via P2P/IPC
node1:289521:290050 [1] NCCL INFO Channel 00 : 1[7000] -> 2[a000] via P2P/IPC
node1:289522:290049 [2] NCCL INFO Channel 00 : 2[a000] -> 1[7000] via P2P/IPC
node1:289523:290057 [3] NCCL INFO Channel 00 : 3[b000] -> 2[a000] via P2P/IPC
node1:289519:290048 [0] NCCL INFO Channel 00 : 0[6000] -> 3[b000] via P2P/IPC
node1:289521:290050 [1] NCCL INFO Channel 01 : 1[7000] -> 0[6000] via P2P/IPC
node1:289522:290049 [2] NCCL INFO Channel 01 : 2[a000] -> 1[7000] via P2P/IPC
node1:289523:290057 [3] NCCL INFO Channel 01 : 3[b000] -> 2[a000] via P2P/IPC
node1:289519:290048 [0] NCCL INFO Channel 01 : 0[6000] -> 3[b000] via P2P/IPC
node1:289521:290050 [1] NCCL INFO Channel 01 : 1[7000] -> 2[a000] via P2P/IPC
node1:289522:290049 [2] NCCL INFO Channel 01 : 2[a000] -> 3[b000] via P2P/IPC
node1:289523:290057 [3] NCCL INFO Channel 01 : 3[b000] -> 0[6000

2021-11-28 12:17:04,098 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 3/120: loss: 0.00157 Time taken: 0:00:29.745577 ETA: 0:58:00.232548
2021-11-28 12:17:04,839 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 212.956
2021-11-28 12:17:08,720 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 206.134
2021-11-28 12:17:12,531 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 209.979
2021-11-28 12:17:16,353 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 209.341
2021-11-28 12:17:20,046 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 216.651
2021-11-28 12:17:23,835 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 211.130
2021-11-28 12:17:27,603 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 212.359
2021-11-28 12:17:31,412 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 210.066
2021-11-28 12:17:34,149 [INFO] /usr/local/lib/pyth

2021-11-28 12:26:57,249 [INFO] iva.detectnet_v2.evaluation.evaluation: step 160 / 197, 2.34s/step
2021-11-28 12:27:20,345 [INFO] iva.detectnet_v2.evaluation.evaluation: step 170 / 197, 2.31s/step
2021-11-28 12:27:43,588 [INFO] iva.detectnet_v2.evaluation.evaluation: step 180 / 197, 2.32s/step
2021-11-28 12:28:07,532 [INFO] iva.detectnet_v2.evaluation.evaluation: step 190 / 197, 2.39s/step
Matching predictions to ground truth, class 1/2.: 100% 416520/416520 [00:38<00:00, 10761.93it/s]
Matching predictions to ground truth, class 2/2.: 100% 61406/61406 [00:07<00:00, 8417.07it/s]
Epoch 10/120

Validation cost: 0.000441
Mean average_precision (in %): 15.7306

class name      average precision (in %)
------------  --------------------------
mask                             2.70893
no-mask                         28.7523

Median Inference Time: 0.006170
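
The mean average precision reported above is the arithmetic mean of the per-class average precisions. A short check using the epoch 10 figures from this log (the function name is hypothetical):

```python
def mean_average_precision(per_class_ap):
    """Average the per-class AP values (given in percent)."""
    if not per_class_ap:
        raise ValueError("need at least one class")
    return sum(per_class_ap.values()) / len(per_class_ap)


ap_epoch_10 = {"mask": 2.70893, "no-mask": 28.7523}
print(round(mean_average_precision(ap_epoch_10), 4))  # 15.7306
```

Because the classes are weighted equally, a weak class such as `mask` at this early epoch pulls the mean down regardless of how many instances each class has.
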
2021-11-28 12:29:30,701 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 10/120: loss: 0.000

2021-11-28 12:33:23,929 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 212.162
2021-11-28 12:33:27,749 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 209.481
2021-11-28 12:33:30,038 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 18/120: loss: 0.00073 Time taken: 0:00:30.027291 ETA: 0:51:02.783664
2021-11-28 12:33:31,519 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 212.207
2021-11-28 12:33:35,342 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 209.302
2021-11-28 12:33:39,113 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 212.201
2021-11-28 12:33:42,855 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 213.798
2021-11-28 12:33:46,649 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 210.917
2021-11-28 12:33:50,544 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 205.382
2021-11-28 12:33:54,372 [INFO] modulus.hooks.samp

2021-11-28 12:37:31,063 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 206.377
2021-11-28 12:37:31,534 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 24/120: loss: 0.00075 Time taken: 0:00:30.434929 ETA: 0:48:41.753174
2021-11-28 12:37:34,895 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 208.782
2021-11-28 12:37:38,715 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 209.484
2021-11-28 12:37:42,489 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 211.962
2021-11-28 12:37:46,263 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 212.039
2021-11-28 12:37:50,063 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 210.522
2021-11-28 12:37:53,816 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 213.193
2021-11-28 12:37:57,646 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 208.937
2021-11-28 12:38:01,461 [INFO] modulus.hooks.samp

2021-11-28 12:41:17,236 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 16.963
2021-11-28 12:41:21,068 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 208.837
2021-11-28 12:41:24,928 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 207.255
2021-11-28 12:41:28,709 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 211.614
2021-11-28 12:41:32,469 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 212.834
2021-11-28 12:41:36,286 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 209.579
2021-11-28 12:41:40,086 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 210.576
2021-11-28 12:41:43,810 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 214.825
2021-11-28 12:41:45,932 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 31/120: loss: 0.00080 Time taken: 0:00:30.041361 ETA: 0:44:33.681095
2021-11-28 12:41:47,626 [INFO] modulus.hooks.sampl

2021-11-28 12:45:47,496 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 39/120: loss: 0.00037 Time taken: 0:00:30.213867 ETA: 0:40:47.323242
2021-11-28 12:45:47,804 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 211.653
2021-11-28 12:45:51,542 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 214.050
2021-11-28 12:45:55,416 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 206.525
2021-11-28 12:45:59,247 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 208.852
2021-11-28 12:46:03,071 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 209.252
2021-11-28 12:46:06,856 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 211.365
2021-11-28 12:46:10,714 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 207.376
2021-11-28 12:46:14,455 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 213.893
2021-11-28 12:46:24,356 [INFO] iva.detectnet_v2.e

2021-11-28 12:49:32,209 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 201.840
2021-11-28 12:49:36,129 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 204.094
2021-11-28 12:49:40,067 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 203.157
2021-11-28 12:49:43,995 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 203.708
2021-11-28 12:49:48,010 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 199.288
2021-11-28 12:49:51,987 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 201.225
2021-11-28 12:49:55,917 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 203.594
2021-11-28 12:49:59,935 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 199.111
2021-11-28 12:50:01,353 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 46/120: loss: 0.00073 Time taken: 0:00:31.345527 ETA: 0:38:39.569011
2021-11-28 12:50:03,842 [INFO] modulus.hooks.samp

2021-11-28 12:59:24,882 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 215.443
2021-11-28 12:59:28,606 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 214.860
2021-11-28 12:59:32,349 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 213.722
2021-11-28 12:59:36,036 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 217.044
2021-11-28 12:59:39,818 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 211.536
2021-11-28 12:59:43,082 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 52/120: loss: 0.00055 Time taken: 0:00:29.555578 ETA: 0:33:29.779304
2021-11-28 12:59:43,548 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 214.480
2021-11-28 12:59:47,275 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 214.711
2021-11-28 12:59:51,030 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 213.117
2021-11-28 12:59:54,799 [INFO] modulus.hooks.samp

2021-11-28 13:04:17,321 [INFO] iva.detectnet_v2.evaluation.evaluation: step 20 / 197, 0.12s/step
2021-11-28 13:04:18,754 [INFO] iva.detectnet_v2.evaluation.evaluation: step 30 / 197, 0.14s/step
2021-11-28 13:04:20,634 [INFO] iva.detectnet_v2.evaluation.evaluation: step 40 / 197, 0.19s/step
2021-11-28 13:04:22,027 [INFO] iva.detectnet_v2.evaluation.evaluation: step 50 / 197, 0.14s/step
2021-11-28 13:04:23,183 [INFO] iva.detectnet_v2.evaluation.evaluation: step 60 / 197, 0.12s/step
2021-11-28 13:04:24,222 [INFO] iva.detectnet_v2.evaluation.evaluation: step 70 / 197, 0.10s/step
2021-11-28 13:04:25,739 [INFO] iva.detectnet_v2.evaluation.evaluation: step 80 / 197, 0.15s/step
2021-11-28 13:04:26,935 [INFO] iva.detectnet_v2.evaluation.evaluation: step 90 / 197, 0.12s/step
2021-11-28 13:04:28,166 [INFO] iva.detectnet_v2.evaluation.evaluation: step 100 / 197, 0.12s/step
2021-11-28 13:04:29,340 [INFO] iva.detectnet_v2.evaluation.evaluation: step 110 / 197, 0.12s/step
2021-11-28 13:04:30,623 [INF

2021-11-28 13:07:48,413 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 214.620
2021-11-28 13:07:52,225 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 209.923
2021-11-28 13:07:55,947 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 214.938
2021-11-28 13:07:59,654 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 215.833
2021-11-28 13:08:03,394 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 213.951
2021-11-28 13:08:07,099 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 215.931
2021-11-28 13:08:09,635 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 67/120: loss: 0.00039 Time taken: 0:00:29.556439 ETA: 0:26:06.491263
2021-11-28 13:08:10,847 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 213.482
2021-11-28 13:08:14,528 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 217.397
2021-11-28 13:08:18,286 [INFO] modulus.hooks.samp

2021-11-28 13:11:26,380 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 212.132
2021-11-28 13:11:30,158 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 211.783
2021-11-28 13:11:33,977 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 209.508
2021-11-28 13:11:37,750 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 212.065
2021-11-28 13:11:41,535 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 211.351
2021-11-28 13:11:42,295 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 73/120: loss: 0.00019 Time taken: 0:00:30.005606 ETA: 0:23:30.263479
2021-11-28 13:11:45,323 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 211.253
2021-11-28 13:11:49,167 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 208.110
2021-11-28 13:11:52,973 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 210.253
2021-11-28 13:11:56,774 [INFO] modulus.hooks.samp

2021-11-28 13:15:34,077 [INFO] iva.detectnet_v2.evaluation.evaluation: step 120 / 197, 0.16s/step
2021-11-28 13:15:35,627 [INFO] iva.detectnet_v2.evaluation.evaluation: step 130 / 197, 0.16s/step
2021-11-28 13:15:37,193 [INFO] iva.detectnet_v2.evaluation.evaluation: step 140 / 197, 0.16s/step
2021-11-28 13:15:38,740 [INFO] iva.detectnet_v2.evaluation.evaluation: step 150 / 197, 0.15s/step
2021-11-28 13:15:40,371 [INFO] iva.detectnet_v2.evaluation.evaluation: step 160 / 197, 0.16s/step
2021-11-28 13:15:42,004 [INFO] iva.detectnet_v2.evaluation.evaluation: step 170 / 197, 0.16s/step
2021-11-28 13:15:43,527 [INFO] iva.detectnet_v2.evaluation.evaluation: step 180 / 197, 0.15s/step
2021-11-28 13:15:44,974 [INFO] iva.detectnet_v2.evaluation.evaluation: step 190 / 197, 0.14s/step
Matching predictions to ground truth, class 1/2.: 100% 8022/8022 [00:00<00:00, 9386.60it/s]
Matching predictions to ground truth, class 2/2.: 100% 2634/2634 [00:00<00:00, 7104.75it/s]
Epoch 80/120

Validation cost: 0

2021-11-28 13:19:20,958 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 212.740
2021-11-28 13:19:24,706 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 213.512
2021-11-28 13:19:28,415 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 215.697
2021-11-28 13:19:32,139 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 214.831
2021-11-28 13:19:35,840 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 216.198
2021-11-28 13:19:39,570 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 214.501
2021-11-28 13:19:43,330 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 88/120: loss: 0.00017 Time taken: 0:00:29.588490 ETA: 0:15:46.831696
2021-11-28 13:19:43,330 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 212.790
2021-11-28 13:19:46,999 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 218.084
2021-11-28 13:19:50,672 [INFO] modulus.hooks.samp

2021-11-28 13:22:57,400 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 213.919
2021-11-28 13:23:01,193 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 210.914
2021-11-28 13:23:04,882 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 216.928
2021-11-28 13:23:08,660 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 211.747
2021-11-28 13:23:12,400 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 213.939
2021-11-28 13:23:14,364 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 94/120: loss: 0.00013 Time taken: 0:00:29.637800 ETA: 0:12:50.582787
2021-11-28 13:23:16,156 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 213.036
2021-11-28 13:23:19,858 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 216.115
2021-11-28 13:23:23,571 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 215.493
2021-11-28 13:23:27,288 [INFO] modulus.hooks.samp

Matching predictions to ground truth, class 2/2.: 100% 3124/3124 [00:00<00:00, 7404.89it/s]
Epoch 100/120

Validation cost: 0.000175
Mean average_precision (in %): 83.6791

class name      average precision (in %)
------------  --------------------------
mask                             84.1044
no-mask                          83.2538

Median Inference Time: 0.006171
2021-11-28 13:26:43,887 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 21.908
2021-11-28 13:26:44,025 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 100/120: loss: 0.00012 Time taken: 0:01:02.080986 ETA: 0:20:41.619730
2021-11-28 13:26:47,623 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 214.230
2021-11-28 13:26:51,371 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 213.478
2021-11-28 13:26:55,080 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 215.758
2021-11-28 13:26:58,821 [INFO] modulus.hooks.sample_counter

2021-11-28 13:30:41,780 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 215.810
2021-11-28 13:30:45,486 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 215.880
2021-11-28 13:30:49,215 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 214.576
2021-11-28 13:30:52,962 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 213.529
2021-11-28 13:30:56,658 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 216.428
2021-11-28 13:31:00,410 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 213.280
2021-11-28 13:31:04,120 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 215.657
2021-11-28 13:31:07,783 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 218.401
2021-11-28 13:31:08,995 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 109/120: loss: 0.00013 Time taken: 0:00:29.462610 ETA: 0:05:24.088715
2021-11-28 13:31:11,500 [INFO] modulus.hooks.sam

2021-11-28 13:34:17,037 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 220.479
2021-11-28 13:34:20,718 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 217.366
2021-11-28 13:34:24,502 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 211.478
2021-11-28 13:34:28,230 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 214.588
2021-11-28 13:34:31,910 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 217.446
2021-11-28 13:34:35,610 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 216.211
2021-11-28 13:34:38,730 [INFO] /usr/local/lib/python3.6/dist-packages/modulus/hooks/task_progress_monitor_hook.pyc: Epoch 115/120: loss: 0.00011 Time taken: 0:00:29.339855 ETA: 0:02:26.699277
2021-11-28 13:34:39,324 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 215.442
2021-11-28 13:34:43,041 [INFO] modulus.hooks.sample_counter_hook: Train Samples / sec: 215.261
2021-11-28 13:34:46,761 [INFO] modulus.hooks.sam

In [10]:
print('Model for each epoch:')
print('---------------------')
!ls -lh $USER_EXPERIMENT_DIR/experiment_dir_unpruned/weights

Model for each epoch:
---------------------
total 43M
-rw-r--r-- 1 1001 1001 43M Nov 28 09:54 resnet18_detector.tlt


## 4. Evaluate the trained model <a class="anchor" id="head-4"></a>

In [None]:
!tlt-evaluate detectnet_v2 -e $SPECS_DIR/detectnet_v2_train_resnet18_kitti.txt \
                           -m $USER_EXPERIMENT_DIR/experiment_dir_unpruned/weights/resnet18_detector.tlt \
                           -k $KEY

## 5. Prune the trained model <a class="anchor" id="head-5"></a>
* Specify pre-trained model
* Equalization criterion (applicable to ResNets and MobileNets)
* Threshold for pruning.
* A key to save and load the model
* Output directory to store the model

*Usually, you only need to adjust `-pth` (threshold) to trade accuracy against model size. A higher `pth` gives you a smaller model (and thus higher inference speed) but worse accuracy. The right threshold depends on the dataset. A `pth` value of `5.2e-6` is just a starting point. If the retrained accuracy is good, you can increase this value to get smaller models. Otherwise, lower it to get better accuracy.*

*In some internal studies, we have noticed that a `pth` value of 0.01 is a good starting point for detectnet_v2 models.*
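
One rough way to see how a given `-pth` value affects model size is to compare the `.tlt` file sizes before and after pruning. The sketch below is not part of TLT: `size_ratio` is a hypothetical helper, and the paths in the usage comment assume the directory layout used in this notebook. File size is only a proxy for the pruning ratio, so treat the result as indicative.

```python
import os

def size_ratio(unpruned_path, pruned_path):
    """Return the pruned file size as a fraction of the unpruned file size."""
    unpruned = os.path.getsize(unpruned_path)
    pruned = os.path.getsize(pruned_path)
    if unpruned == 0:
        raise ValueError("unpruned model file is empty")
    return pruned / unpruned

# Usage (assumed paths, run after the pruning cell):
# base = os.environ['USER_EXPERIMENT_DIR']
# ratio = size_ratio(
#     os.path.join(base, 'experiment_dir_unpruned/weights/resnet18_detector.tlt'),
#     os.path.join(base, 'experiment_dir_pruned/resnet18_nopool_bn_detectnet_v2_pruned.tlt'))
# print('Pruned model is {:.1%} of the original size'.format(ratio))
```

A ratio close to 1 suggests the threshold removed very little, while a very small ratio may indicate that too much was pruned for retraining to recover accuracy.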

In [None]:
# Create an output directory if it doesn't exist.
!mkdir -p $USER_EXPERIMENT_DIR/experiment_dir_pruned

In [None]:
print("Change the threshold (-pth) value according to your experiments")

!tlt-prune -m $USER_EXPERIMENT_DIR/experiment_dir_unpruned/weights/resnet18_detector.tlt \
           -o $USER_EXPERIMENT_DIR/experiment_dir_pruned/resnet18_nopool_bn_detectnet_v2_pruned.tlt \
           -eq union \
           -pth 0.8 \
           -k $KEY

In [None]:
!ls -rlt $USER_EXPERIMENT_DIR/experiment_dir_pruned/

## 6. Retrain the pruned model <a class="anchor" id="head-6"></a>
* The model needs to be retrained to recover the accuracy lost during pruning.
* Specify the retraining specification with the pruned model as the pretrained weights.

*Note: For retraining, please set the `load_graph` option to `true` in the model_config to load the pruned model graph. If the model shows a decrease in mAP after retraining, the originally trained model may have been pruned too aggressively. Try reducing the pruning threshold, thereby reducing the pruning ratio, and use the new model to retrain.*
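
Before launching a retraining run, it can be useful to confirm that the spec file really enables `load_graph`. The following is a minimal sketch, assuming the spec is a plain-text file in which the option appears as `load_graph: true`; `load_graph_enabled` is a hypothetical helper and not a TLT function.

```python
import re

def load_graph_enabled(spec_text):
    """Return True if the spec text sets load_graph to true."""
    # Drop '#' comments so a commented-out option is not counted.
    uncommented = "\n".join(
        line.split("#", 1)[0] for line in spec_text.splitlines()
    )
    return re.search(r"\bload_graph\s*:\s*true\b", uncommented) is not None

# Usage (assumed path, matching the retrain spec used in this notebook):
# import os
# spec = os.path.join(os.environ['SPECS_DIR'],
#                     'detectnet_v2_retrain_resnet18_kitti.txt')
# with open(spec) as f:
#     print('load_graph enabled:', load_graph_enabled(f.read()))
```

The check is purely textual and does not parse the spec structure, so it only catches the common mistake of leaving the option unset or set to `false`.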

In [None]:
# Print the retrain experiment file.
# Note: We have updated the experiment file to use the
# newly pruned model as the pretrained weights, and the
# load_graph option is set to true.
!cat $SPECS_DIR/detectnet_v2_retrain_resnet18_kitti.txt

In [None]:
# Retraining using the pruned model as pretrained weights 
!tlt-train detectnet_v2 -e $SPECS_DIR/detectnet_v2_retrain_resnet18_kitti.txt \
                        -r $USER_EXPERIMENT_DIR/experiment_dir_retrain \
                        -k $KEY \
                        -n resnet18_detector_pruned \
                        --gpus $NUM_GPUS

In [None]:
# Listing the newly retrained model.
!ls -rlt $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights

## 7. Evaluate the retrained model <a class="anchor" id="head-7"></a>

This section evaluates the pruned and retrained model, using `tlt-evaluate`.

In [None]:
!tlt-evaluate detectnet_v2 -e $SPECS_DIR/detectnet_v2_retrain_resnet18_kitti.txt \
                           -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/resnet18_detector_pruned.tlt \
                           -k $KEY

## 8. Visualize inferences <a class="anchor" id="head-8"></a>
In this section, we run the `tlt-infer` tool to generate inferences on the trained models. To render bboxes from more classes, please edit the spec file `detectnet_v2_inference_kitti_tlt.txt` to include all the classes you would like to visualize and edit the rest of the file accordingly.

For this, you will need to create a `test_images` directory containing at least 8 images of masked and unmasked faces. They can come from the test data or simply be face captures from your own photos.

In [None]:
# Running inference for detection on n images
!tlt-infer detectnet_v2 -e $SPECS_DIR/detectnet_v2_inference_kitti_tlt.txt \
                        -o $USER_EXPERIMENT_DIR/tlt_infer_testing \
                        -i $DATA_DOWNLOAD_DIR/test_images \
                        -k $KEY

The `tlt-infer` tool produces two outputs. 
1. Overlain images in `$USER_EXPERIMENT_DIR/tlt_infer_testing/images_annotated`
2. Frame by frame bbox labels in kitti format located in `$USER_EXPERIMENT_DIR/tlt_infer_testing/labels`

*Note: To run inferences for a single image, simply replace the path to the -i flag in `tlt-infer` command with the path to the image.*
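
The KITTI label files written by `tlt-infer` can also be inspected programmatically. The sketch below counts detections per class across a labels directory, relying only on the KITTI convention that the first whitespace-separated field of each line is the class name. `count_kitti_classes` is a hypothetical helper, and the path in the usage comment assumes the output directory used above.

```python
import os
from collections import Counter

def count_kitti_classes(label_dir):
    """Count detections per class across all KITTI .txt files in label_dir."""
    counts = Counter()
    for name in sorted(os.listdir(label_dir)):
        if not name.endswith(".txt"):
            continue
        with open(os.path.join(label_dir, name)) as f:
            for line in f:
                fields = line.split()
                if fields:  # skip blank lines
                    counts[fields[0]] += 1  # first KITTI field is the class name
    return counts

# Usage (assumed path):
# labels = os.path.join(os.environ['USER_EXPERIMENT_DIR'],
#                       'tlt_infer_testing/labels')
# print(count_kitti_classes(labels))
```

A quick per-class count like this makes it easy to spot a model that predicts only one of the two classes on your test images.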

In [None]:
# Simple grid visualizer
import matplotlib.pyplot as plt
import os
from math import ceil
valid_image_ext = ['.jpg', '.png', '.jpeg', '.ppm']

def visualize_images(image_dir, num_cols=4, num_images=10):
    output_path = os.path.join(os.environ['USER_EXPERIMENT_DIR'], image_dir)
    num_rows = int(ceil(float(num_images) / float(num_cols)))
    # squeeze=False keeps axarr 2-D even when the grid has a single row or column.
    f, axarr = plt.subplots(num_rows, num_cols, figsize=[80,30], squeeze=False)
    f.tight_layout()
    a = [os.path.join(output_path, image) for image in os.listdir(output_path) 
         if os.path.splitext(image)[1].lower() in valid_image_ext]
    for idx, img_path in enumerate(a[:num_images]):
        col_id = idx % num_cols
        # Integer division: in Python 3, "/" returns a float, which cannot index an array.
        row_id = idx // num_cols
        img = plt.imread(img_path)
        axarr[row_id, col_id].imshow(img)

In [None]:
# Visualizing the first 8 images.
OUTPUT_PATH = 'tlt_infer_testing/images_annotated' # relative path from $USER_EXPERIMENT_DIR.
COLS = 4 # number of columns in the visualizer grid.
IMAGES = 8 # number of images to visualize.

visualize_images(OUTPUT_PATH, num_cols=COLS, num_images=IMAGES)

## 9. Deploy! <a class="anchor" id="head-9"></a>

In [None]:
!mkdir -p $USER_EXPERIMENT_DIR/experiment_dir_final
# Remove any pre-existing copy of the .etlt file.
import os
output_file=os.path.join(os.environ['USER_EXPERIMENT_DIR'],
                         "experiment_dir_final/resnet18_detector.etlt")
if os.path.exists(output_file):
    os.remove(output_file)
!tlt-export detectnet_v2 \
            -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/resnet18_detector_pruned.tlt \
            -o $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.etlt \
            -k $KEY

In [None]:
print('Exported model:')
print('------------')
!ls -lh $USER_EXPERIMENT_DIR/experiment_dir_final

### A. Int8 Optimization <a class="anchor" id="head-9-1"></a>
The DetectNet_v2 model supports int8 inference mode in TensorRT (TRT). To use int8 mode, we must first calibrate the model to run 8-bit inference. This involves two steps:

* Generate calibration tensorfile from the training data using tlt-int8-tensorfile
* Use tlt-export to generate int8 calibration table.

*Note: For this example, we generate a calibration tensorfile containing up to 40 batches of training data (the `-m 40` argument below).
Ideally, use at least 10-20% of the training data to calibrate the model. The more data provided during calibration, the closer int8 inferences are to fp32 inferences.*
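
The 10-20% guideline can be turned into a batch count with simple arithmetic. The following is a minimal sketch: `calibration_batches` is a hypothetical helper, and the dataset size and batch size you pass in are your own values, not something read from TLT.

```python
from math import ceil

def calibration_batches(num_train_images, batch_size, fraction):
    """Number of batches needed to cover `fraction` of the training images."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    # Round up so the requested fraction is fully covered.
    return int(ceil(num_train_images * fraction / batch_size))

# Usage (hypothetical numbers): batches covering 20% of 5000 images
# at a batch size of 4.
# print(calibration_batches(5000, 4, 0.2))
```

The result is a candidate value for the maximum number of batches passed to `tlt-int8-tensorfile` with `-m`.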

In [None]:
!tlt-int8-tensorfile detectnet_v2 -e $SPECS_DIR/detectnet_v2_retrain_resnet18_kitti.txt \
                                  -m 40 \
                                  -o $USER_EXPERIMENT_DIR/experiment_dir_final/calibration.tensor

In [None]:
!rm -rf $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.etlt
!rm -rf $USER_EXPERIMENT_DIR/experiment_dir_final/calibration.bin
!tlt-export detectnet_v2 \
            -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/resnet18_detector_pruned.tlt \
            -o $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.etlt \
            -k $KEY  \
            --cal_data_file $USER_EXPERIMENT_DIR/experiment_dir_final/calibration.tensor \
            --data_type int8 \
            --batches 20 \
            --batch_size 4 \
            --max_batch_size 4\
            --engine_file $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.trt.int8 \
            --cal_cache_file $USER_EXPERIMENT_DIR/experiment_dir_final/calibration.bin \
            --verbose

### B. Generate TensorRT engine <a class="anchor" id="head-9-2"></a>
Verify engine generation using the `tlt-converter` utility included with the docker.

The `tlt-converter` utility produces TensorRT engines optimized for the platform it runs on. Therefore, to get maximum performance, please instantiate this docker and execute the `tlt-converter` command on your target device, with the exported `.etlt` file and the calibration cache (for int8 mode). The converter utility included in this docker only works on x86 devices with discrete NVIDIA GPUs.

For Jetson devices, please download the Jetson version of the converter from the dev zone link [here](https://developer.nvidia.com/tlt-converter). 

If you choose to integrate your model into DeepStream directly, you may do so by copying the exported `.etlt` file along with the calibration cache to the target device and updating the spec file that configures the `gst-nvinfer` element to point to the newly exported model. This file is usually called `config_infer_primary.txt` for detection models and `config_infer_secondary_*.txt` for classification models.

In [None]:
!tlt-converter $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.etlt \
               -k $KEY \
               -c $USER_EXPERIMENT_DIR/experiment_dir_final/calibration.bin \
               -o output_cov/Sigmoid,output_bbox/BiasAdd \
               -d 3,544,960 \
               -i nchw \
               -m 64 \
               -t int8 \
               -e $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.trt \
               -b 4

## 10. Verify Deployed Model <a class="anchor" id="head-10"></a>
Verify the exported model by visualizing inferences on TensorRT.
In addition to running inference on a `.tlt` model in [step 8](#head-8), the `tlt-infer` tool is also capable of consuming the converted `TensorRT engine` from [step 9.B](#head-9-2).

*If the accuracy of int8 inference seems to degrade after int8 calibration, it could be because there wasn't enough data in the calibration tensorfile used to calibrate the model, or because the training data is not entirely representative of your test images, so the calibration may be incorrect. In that case, you may either regenerate the calibration tensorfile with more batches of the training data and recalibrate the model, or calibrate the model on a few images from the test set. The latter may be done using the `--cal_image_dir` flag in the `tlt-export` tool. For more information, please follow the instructions in the USER GUIDE.*

### A. Inference using TensorRT engine <a class="anchor" id="head-10-1"></a>

In [None]:
!tlt-infer detectnet_v2 -e $SPECS_DIR/detectnet_v2_inference_kitti_etlt.txt \
                        -o $USER_EXPERIMENT_DIR/etlt_infer_testing \
                        -i $DATA_DOWNLOAD_DIR/test_images \
                        -k $KEY

In [None]:
# Visualize the first 8 inferenced images.
OUTPUT_PATH = 'etlt_infer_testing/images_annotated' # relative path from $USER_EXPERIMENT_DIR.
COLS = 4 # number of columns in the visualizer grid.
IMAGES = 8 # number of images to visualize.

visualize_images(OUTPUT_PATH, num_cols=COLS, num_images=IMAGES)