# Instance Segmentation using TLT MaskRCNN

Transfer learning is the process of transferring learned features from one application to another. It is a commonly used training technique in which you take a model trained on one task and retrain it for a different task.

Transfer Learning Toolkit (TLT) is a simple and easy-to-use Python-based AI toolkit for taking purpose-built AI models and customizing them with users' own data.

<img align="center" src="https://developer.nvidia.com/sites/default/files/akamai/embedded-transfer-learning-toolkit-software-stack-1200x670px.png" width="1080"> 

## Learning Objectives
In this notebook, you will learn how to leverage the simplicity and convenience of TLT to:

* Take a pretrained ResNet50 model and train a MaskRCNN model on the COCO dataset
* Evaluate the trained model
* Run inference with the trained model and visualize the result
* Export the trained model to a .etlt file for deployment to DeepStream
* Run inference on the exported .etlt model to verify deployment using TensorRT

### Table of Contents
This notebook shows an example use case for instance segmentation using the Transfer Learning Toolkit.

0. [Set up env variables and map drives](#head-0)
1. [Installing the TLT Launcher](#head-1)
2. [Prepare dataset and pre-trained model](#head-2)
3. [Provide training specification](#head-3)
4. [Run TLT training](#head-4)
5. [Evaluate trained models](#head-5)
6. [Visualize inferences](#head-6)
7. [Deploy](#head-7)
8. [Verify the deployed model](#head-8)

## 0. Set up env variables and map drives <a class="anchor" id="head-0"></a>
When using the purpose-built pretrained models from NGC, please make sure to set the `$KEY` environment variable to the key mentioned in the model overview. Failing to do so can lead to errors when trying to load them as pretrained models.

This notebook requires the user to set an env variable called `$LOCAL_PROJECT_DIR` to the path of the user's workspace. The dataset for this notebook is expected to reside in `$LOCAL_PROJECT_DIR/data`, while the collaterals generated by the TLT experiment will be written to `$LOCAL_PROJECT_DIR/mask_rcnn`. More information on how to set up the dataset and on the supported steps in the TLT workflow is provided in the subsequent cells.
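
As a quick sanity check on the settings described above, the following minimal sketch reports anything that is missing before the notebook is run. `check_workspace` is a hypothetical helper written for illustration and is not part of TLT; it only looks at `KEY`, `LOCAL_PROJECT_DIR` and the two sub-directories named in the text.

```python
import os
from pathlib import Path


def check_workspace(env=None):
    """Return a list of human-readable problems with the workspace settings.

    Hypothetical helper for illustration only; not part of TLT.
    """
    env = os.environ if env is None else env
    problems = []

    # $KEY is needed to load the pretrained models from NGC.
    if not env.get("KEY"):
        problems.append("KEY is not set")

    project_dir = env.get("LOCAL_PROJECT_DIR")
    if not project_dir:
        problems.append("LOCAL_PROJECT_DIR is not set")
        return problems

    # The dataset lives in <project>/data; results go to <project>/mask_rcnn.
    for sub in ("data", "mask_rcnn"):
        path = Path(project_dir) / sub
        if not path.is_dir():
            problems.append(f"missing directory: {path}")
    return problems


if __name__ == "__main__":
    for problem in check_workspace():
        print("WARNING:", problem)
```

A missing `mask_rcnn` directory is not necessarily an error on a first run, since later cells create it with `mkdir -p`.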

*Note: Please make sure to remove any stray artifacts/files that previous experiments may have left in the `$USER_EXPERIMENT_DIR` or `$DATA_DOWNLOAD_DIR` paths mentioned below. Leftover checkpoint files and similar artifacts may interfere with creating the training graph for a new experiment.*
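
One way to spot such leftovers is sketched below. The file name patterns are taken from the experiment directory listing that appears later in this notebook; treating exactly these files as "stale" is an assumption, and `find_stale_artifacts` is a hypothetical helper, not a TLT function. On the host, the experiment directory is `$LOCAL_EXPERIMENT_DIR` (the `$USER_EXPERIMENT_DIR` path only exists inside the container).

```python
import fnmatch
import os

# Patterns based on the experiment directory listing shown later in this
# notebook; extend them to suit your own runs.
STALE_PATTERNS = ("model.step-*.tlt", "events.out.tfevents.*", "graph.pbtxt")


def find_stale_artifacts(experiment_dir, patterns=STALE_PATTERNS):
    """Return the sorted paths under experiment_dir that match any pattern."""
    matches = []
    for root, _dirs, files in os.walk(experiment_dir):
        for name in files:
            if any(fnmatch.fnmatch(name, pattern) for pattern in patterns):
                matches.append(os.path.join(root, name))
    return sorted(matches)
```

The function only lists candidates; deleting them is left as a deliberate manual step.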

*Note: By default, this notebook is set up to run training on 1 GPU. To use more GPUs, please update the env variable `$NUM_GPUS` accordingly.*

In [42]:
# Setting up env variables for cleaner command line commands.
import os

%env KEY=c29sMGZyZnVrZGdiOGk1aTExcjB0MHRobGY6NWI5OTI3MzQtYTgzNS00NTQyLTk0YWMtNDU4ODI1MzRjZTQ1
%env NUM_GPUS=1
%env USER_EXPERIMENT_DIR=/workspace/tlt-experiments/mask_rcnn
%env DATA_DOWNLOAD_DIR=/workspace/tlt-experiments/data

# Set this path if you don't run the notebook from the samples directory.
%env NOTEBOOK_ROOT=/home/luis/GitHub/rgbd-pepper-pose-estimation/Mask_RCNN/tlt/tlt-samples/mask_rcnn

# Please define this local project directory that needs to be mapped to the TLT docker session.
# The dataset expected to be present in $LOCAL_PROJECT_DIR/data, while the results for the steps
# in this notebook will be stored at $LOCAL_PROJECT_DIR/mask_rcnn
# !PLEASE MAKE SURE TO UPDATE THIS PATH!.
%env LOCAL_PROJECT_DIR=/home/luis/GitHub/rgbd-pepper-pose-estimation/Mask_RCNN/tlt/tlt-experiments

os.environ["LOCAL_DATA_DIR"] = os.path.join(
    os.getenv("LOCAL_PROJECT_DIR", os.getcwd()),
    "data"
)
os.environ["LOCAL_EXPERIMENT_DIR"] = os.path.join(
    os.getenv("LOCAL_PROJECT_DIR", os.getcwd()),
    "mask_rcnn"
)

# The sample spec files are present in the same path as the downloaded samples.
os.environ["LOCAL_SPECS_DIR"] = os.path.join(
    os.getenv("NOTEBOOK_ROOT", os.getcwd()),
    "specs"
)
%env SPECS_DIR=/workspace/tlt-experiments/mask_rcnn/specs

# Showing list of specification files.
!ls -rlt $LOCAL_SPECS_DIR

env: KEY=c29sMGZyZnVrZGdiOGk1aTExcjB0MHRobGY6NWI5OTI3MzQtYTgzNS00NTQyLTk0YWMtNDU4ODI1MzRjZTQ1
env: NUM_GPUS=1
env: USER_EXPERIMENT_DIR=/workspace/tlt-experiments/mask_rcnn
env: DATA_DOWNLOAD_DIR=/workspace/tlt-experiments/data
env: NOTEBOOK_ROOT=/home/luis/GitHub/rgbd-pepper-pose-estimation/Mask_RCNN/tlt/tlt-samples/mask_rcnn
env: LOCAL_PROJECT_DIR=/home/luis/GitHub/rgbd-pepper-pose-estimation/Mask_RCNN/tlt/tlt-experiments
env: SPECS_DIR=/workspace/tlt-experiments/mask_rcnn/specs
total 32
-rwxrwxrwx 1 luis luis   692 ago 24 23:17 coco_labels.txt
-rwxrwxrwx 1 luis luis  4347 ago 24 23:17 download_and_preprocess_coco.sh
-rwxrwxrwx 1 luis luis 12311 ago 24 23:17 create_coco_tf_record.py
-rwxrwxrwx 1 luis luis  2037 oct  7 12:18 maskrcnn_train_resnet50.txt


In [2]:
!pip3 show nvidia-tlt

Name: nvidia-tlt
Version: 0.1.19
Summary: NVIDIA's Launcher for TAO Toolkit.
Home-page: UNKNOWN
Author: Varun Praveen
Author-email: vpraveen@nvidia.com
License: NVIDIA Proprietary License
Location: /home/luis/.local/lib/python3.8/site-packages
Requires: certifi, docker-pycreds, requests, websocket-client, tabulate, chardet, idna, six, docker, urllib3
Required-by: 


The cell below maps the project directory on your local host to a workspace directory in the TLT docker instance, so that the data and the results are mapped from outside to inside of the docker instance.

In [43]:
# Mapping up the local directories to the TLT docker.
import json
mounts_file = os.path.expanduser("~/.tlt_mounts.json")

# Define the dictionary with the mapped drives
drive_map = {
    "Mounts": [
        # Mapping the data directory
        {
            "source": os.environ["LOCAL_PROJECT_DIR"],
            "destination": "/workspace/tlt-experiments"
        },
        # Mapping the specs directory.
        {
            "source": os.environ["LOCAL_SPECS_DIR"],
            "destination": os.environ["SPECS_DIR"]
        },
    ]
}

# Writing the mounts file.
with open(mounts_file, "w") as mfile:
    json.dump(drive_map, mfile, indent=4)

In [44]:
!cat ~/.tlt_mounts.json

{
    "Mounts": [
        {
            "source": "/home/luis/GitHub/rgbd-pepper-pose-estimation/Mask_RCNN/tlt/tlt-experiments",
            "destination": "/workspace/tlt-experiments"
        },
        {
            "source": "/home/luis/GitHub/rgbd-pepper-pose-estimation/Mask_RCNN/tlt/tlt-samples/mask_rcnn/specs",
            "destination": "/workspace/tlt-experiments/mask_rcnn/specs"
        }
    ]
}
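
Because of this mapping, every file has two names: a host path (used by plain `!ls`, `!mkdir`, `!rm` cells) and a container path (used in arguments to `tlt` commands and in the spec file). The sketch below translates one into the other from the `"Mounts"` list. `host_to_container` is a hypothetical helper for illustration, and the paths in the usage example are made up.

```python
import os


def host_to_container(host_path, mounts):
    """Translate a host path to its path inside the container.

    `mounts` is the list stored under "Mounts" in the mounts file.
    The longest matching source wins, so nested mounts resolve correctly.
    Returns None when the path is not under any mounted source.
    """
    host_path = os.path.normpath(host_path)
    best = None
    for mount in mounts:
        source = os.path.normpath(mount["source"])
        if host_path == source or host_path.startswith(source + os.sep):
            if best is None or len(source) > len(best[0]):
                best = (source, mount["destination"])
    if best is None:
        return None
    relative = os.path.relpath(host_path, best[0])
    if relative == ".":
        return best[1]
    # Container paths are POSIX paths regardless of the host OS.
    return best[1].rstrip("/") + "/" + relative.replace(os.sep, "/")


# Usage with illustrative (made-up) host paths.
example_mounts = [
    {"source": "/home/user/tlt-experiments",
     "destination": "/workspace/tlt-experiments"},
]
print(host_to_container("/home/user/tlt-experiments/data", example_mounts))
```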

## 1. Installing the TLT launcher <a class="anchor" id="head-1"></a>
The TLT launcher is a Python package distributed as a Python wheel listed in the `nvidia-pyindex` Python index. You may install the launcher by executing the following cell.

Please note that TLT recommends running the TLT launcher in a virtual env with Python 3.6.9. You may follow the instructions on this [page](https://virtualenvwrapper.readthedocs.io/en/latest/install.html) to set up a Python virtual env using the `virtualenv` and `virtualenvwrapper` packages. Once you have set up virtualenvwrapper, set the version of Python to be used in the virtual env through the `VIRTUALENVWRAPPER_PYTHON` variable. You may do so by running

```sh
export VIRTUALENVWRAPPER_PYTHON=/path/to/bin/python3.x
```
where x >= 6 and x <= 8.

We recommend performing this step first and then launching the notebook from the virtual environment. In addition to installing the TLT Python package, please make sure the following software requirements are met:
* python >=3.6.9 < 3.8.x
* docker-ce > 19.03.5
* docker-API 1.40
* nvidia-container-toolkit > 1.3.0-1
* nvidia-container-runtime > 3.4.0-1
* nvidia-docker2 > 2.5.0-1
* nvidia-driver > 455+
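
The Python requirement can be checked from inside the notebook with a sketch like the one below. The stated range is slightly ambiguous, so reading it as "3.6.9 up to and including the 3.8 series" is an assumption made here, and `python_version_supported` is a hypothetical helper.

```python
import sys


def python_version_supported(version=None, minimum=(3, 6, 9), below=(3, 9)):
    """Return True if `version` lies in the half-open range [minimum, below).

    The bounds are an interpretation of the requirement list above,
    not values confirmed by the toolkit.
    """
    if version is None:
        version = sys.version_info[:3]
    return tuple(minimum) <= tuple(version) < tuple(below)


if __name__ == "__main__":
    print("Python version OK:", python_version_supported())
```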

Once you have installed the prerequisites, please log in to the docker registry nvcr.io by running the command below

```sh
docker login nvcr.io
```

You will be prompted to enter a username and password. The username is `$oauthtoken` and the password is the API key generated from `ngc.nvidia.com`. Please follow the instructions in the [NGC setup guide](https://docs.nvidia.com/ngc/ngc-overview/index.html#generating-api-key) to generate your own API key.


After setting up your virtual environment with the above requirements, install the TLT pip package.

In [6]:
# SKIP this step IF you have already installed the tlt launcher.
!pip3 install nvidia-pyindex
!pip3 install nvidia-tlt

Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com


In [5]:
# View the versions of the TLT launcher
!tlt info

The `nvidia-tlt` package will be deprecated soon. Going forward please migrate to using the `nvidia-tao` package.

~/.tao_mounts.json wasn't found. Falling back to obtain mount points and docker configs from ~/.tlt_mounts.json.
Please note that this will be deprecated going forward.
Configuration of the TAO Toolkit Instance
dockers: ['nvidia/tao/tao-toolkit-tf', 'nvidia/tao/tao-toolkit-pyt', 'nvidia/tao/tao-toolkit-lm']
format_version: 1.0
toolkit_version: 3.21.08
published_date: 08/17/2021


## 2. Prepare dataset and pre-trained model <a class="anchor" id="head-2"></a>

We will be using the COCO dataset for the tutorial. The following script will download the COCO dataset automatically and convert it to TFRecords.

In [45]:
!tlt mask_rcnn run bash $SPECS_DIR/download_and_preprocess_coco.sh $DATA_DOWNLOAD_DIR

The `nvidia-tlt` package will be deprecated soon. Going forward please migrate to using the `nvidia-tao` package.

~/.tao_mounts.json wasn't found. Falling back to obtain mount points and docker configs from ~/.tlt_mounts.json.
Please note that this will be deprecated going forward.
2021-10-07 12:26:18,873 [INFO] root: Registry: ['nvcr.io']
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/luis/.tlt_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
+ '[' -z /workspace/tlt-experiments/data ']'
+ echo 'Cloning Tensorflow models directory (for conversion utilities)'
Cloning Tensorflow models directory (for conversion utilities)
+ '[' '!' -e tf-models ']'
+ git clone http://github.com/tensorflow/models tf-models
Cloning into 'tf-models'...
remote: Enumerating objects: 64176, done.
remote: Counting object
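
The launcher log above points out that Docker runs the commands as root and suggests adding `"user":"UID:GID"` to the `DockerOptions` portion of the mounts file. A minimal sketch of that change follows; `with_user_option` is a hypothetical helper, and it assumes `DockerOptions` is a top-level key next to `"Mounts"`.

```python
import copy


def with_user_option(mounts_config, uid, gid):
    """Return a copy of the mounts config with DockerOptions["user"] set.

    Assumes "DockerOptions" sits at the top level of the mounts file,
    alongside "Mounts". The input dictionary is left unmodified.
    """
    updated = copy.deepcopy(mounts_config)
    updated.setdefault("DockerOptions", {})["user"] = f"{uid}:{gid}"
    return updated
```

To apply it, load `~/.tlt_mounts.json` with `json.load`, pass the result together with `os.getuid()` and `os.getgid()` (the same values that `id -u` and `id -g` print), and write the returned dictionary back with `json.dump`.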

In [46]:
!echo $DATA_DOWNLOAD_DIR

/workspace/tlt-experiments/data


In [47]:
# verify
!ls -l $LOCAL_DATA_DIR

total 20571028
drwxrwxrwx 2 root root     4096 oct  6 22:09 annotations
drwxrwxrwx 6 root root     4096 oct  7 12:29 raw-data
-rwxrwxrwx 1 root root 75763948 oct  7 12:36 train-00000-of-00256.tfrecord
-rwxrwxrwx 1 root root 76083838 oct  7 12:36 train-00001-of-00256.tfrecord
-rwxrwxrwx 1 root root 76832268 oct  7 12:36 train-00002-of-00256.tfrecord
-rwxrwxrwx 1 root root 77138520 oct  7 12:36 train-00003-of-00256.tfrecord
-rwxrwxrwx 1 root root 75993126 oct  7 12:36 train-00004-of-00256.tfrecord
-rwxrwxrwx 1 root root 74183890 oct  7 12:36 train-00005-of-00256.tfrecord
-rwxrwxrwx 1 root root 76538425 oct  7 12:36 train-00006-of-00256.tfrecord
-rwxrwxrwx 1 root root 78167315 oct  7 12:36 train-00007-of-00256.tfrecord
-rwxrwxrwx 1 root root 77985892 oct  7 12:36 train-00008-of-00256.tfrecord
-rwxrwxrwx 1 root root 78332484 oct  7 12:36 train-00009-of-00256.tfrecord
-rwxrwxrwx 1 root root 78275302 oct  7 12:36 train-00010-of-00256.tfrecord
-rwxrwxrwx 1 root root 78505411 oct  7 12:36 trai
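
With 256 training shards, a listing like the one above is hard to check by eye. The sketch below reports which shard indices are missing, assuming the `<split>-<index>-of-<total>.tfrecord` naming visible in the listing; `missing_shards` is a hypothetical helper.

```python
import re

# Matches names such as train-00003-of-00256.tfrecord.
SHARD_RE = re.compile(
    r"^(?P<split>\w+)-(?P<index>\d{5})-of-(?P<total>\d{5})\.tfrecord$"
)


def missing_shards(filenames, split="train"):
    """Return the sorted shard indices of `split` absent from `filenames`."""
    seen, total = set(), None
    for name in filenames:
        match = SHARD_RE.match(name)
        if match and match.group("split") == split:
            seen.add(int(match.group("index")))
            total = int(match.group("total"))
    if total is None:
        raise ValueError(f"no {split} shards found")
    return sorted(set(range(total)) - seen)
```

For example, `missing_shards(os.listdir(os.environ["LOCAL_DATA_DIR"]))` returns an empty list when every training shard is present.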

### Download pretrained model from NGC

We will use the NGC CLI to get the pre-trained models. For more details, go to ngc.nvidia.com and click SETUP on the navigation bar.

In [22]:
# Installing NGC CLI on the local machine.
## Download and install
%env CLI=ngccli_reg_linux.zip
!mkdir -p $LOCAL_PROJECT_DIR/ngccli

# Remove any previously existing CLI installations
!rm -rf $LOCAL_PROJECT_DIR/ngccli/*
!wget "https://ngc.nvidia.com/downloads/$CLI" -P $LOCAL_PROJECT_DIR/ngccli
!unzip -u "$LOCAL_PROJECT_DIR/ngccli/$CLI" -d $LOCAL_PROJECT_DIR/ngccli/
!rm $LOCAL_PROJECT_DIR/ngccli/*.zip 
os.environ["PATH"]="{}/ngccli:{}".format(os.getenv("LOCAL_PROJECT_DIR", ""), os.getenv("PATH", ""))

env: CLI=ngccli_reg_linux.zip
--2021-10-07 11:59:23--  https://ngc.nvidia.com/downloads/ngccli_reg_linux.zip
Resolving ngc.nvidia.com (ngc.nvidia.com)... 143.204.166.129, 143.204.166.48, 143.204.166.43, ...
Connecting to ngc.nvidia.com (ngc.nvidia.com)|143.204.166.129|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 25097830 (24M) [application/zip]
Saving to: ‘/home/luis/GitHub/rgbd-pepper-pose-estimation/Mask_RCNN/tlt/tlt-experiments/ngccli/ngccli_reg_linux.zip’


2021-10-07 11:59:31 (3.48 MB/s) - ‘/home/luis/GitHub/rgbd-pepper-pose-estimation/Mask_RCNN/tlt/tlt-experiments/ngccli/ngccli_reg_linux.zip’ saved [25097830/25097830]

Archive:  /home/luis/GitHub/rgbd-pepper-pose-estimation/Mask_RCNN/tlt/tlt-experiments/ngccli/ngccli_reg_linux.zip
  inflating: /home/luis/GitHub/rgbd-pepper-pose-estimation/Mask_RCNN/tlt/tlt-experiments/ngccli/ngc  
 extracting: /home/luis/GitHub/rgbd-pepper-pose-estimation/Mask_RCNN/tlt/tlt-experiments/ngccli/ngc.md5  


In [26]:
!ngc registry model list nvidia/tlt_instance_segmentation:*

+----------+----------+--------+------------+-----------+------------------+-----------+-----------------+--------------+
| Version  | Accuracy | Epochs | Batch Size | GPU Model | Memory Footprint | File Size | Status          | Created Date |
+----------+----------+--------+------------+-----------+------------------+-----------+-----------------+--------------+
| resnet50 | 76.64    | 80     | 1          | V100      | 182.8            | 182.84 MB | UPLOAD_COMPLETE | Aug 03, 2020 |
| resnet34 | 76.5     | 80     | 1          | V100      | 163.6            | 163.55 MB | UPLOAD_COMPLETE | Aug 03, 2020 |
| resnet18 | 74.83    | 80     | 1          | V100      | 86.2             | 86.25

In [28]:
!mkdir -p $LOCAL_EXPERIMENT_DIR/pretrained_resnet50/

In [29]:
# Pull pretrained model from NGC
!ngc registry model download-version nvidia/tlt_instance_segmentation:resnet50 --dest $LOCAL_EXPERIMENT_DIR/pretrained_resnet50

Downloaded 169.2 MB in 54s, Download speed: 3.13 MB/s               
----------------------------------------------------
Transfer id: tlt_instance_segmentation_vresnet50 Download status: Completed.
Downloaded local path: /home/luis/GitHub/rgbd-pepper-pose-estimation/Mask_RCNN/tlt/tlt-experiments/mask_rcnn/pretrained_resnet50/tlt_instance_segmentation_vresnet50
Total files downloaded: 1 
Total downloaded size: 169.2 MB
Started at: 2021-10-07 12:05:57.016779
Completed at: 2021-10-07 12:06:51.088988
Duration taken: 54s
----------------------------------------------------


In [48]:
print("Check that model is downloaded into dir.")
!ls -l $LOCAL_EXPERIMENT_DIR/pretrained_resnet50/tlt_instance_segmentation_vresnet50

Check that model is downloaded into dir.
total 187232
-rw------- 1 luis luis 191719744 oct  7 12:06 resnet50.hdf5


## 3. Provide training specification <a class="anchor" id="head-3"></a>
* TFRecords for the train datasets
    * In order to use the newly generated TFRecords, update the `data_config` parameter in the spec file at `$SPECS_DIR/maskrcnn_train_resnet50.txt`
    * Note that the learning rate in the spec file is set for 4-GPU training. If you have N GPUs, you should divide the learning rate by 4/N.
* Pre-trained models
* Augmentation parameters for on-the-fly data augmentation
* Other training (hyper-)parameters such as batch size, number of epochs, learning rate, etc.
* **Note that the sample spec is not meant to produce SOTA accuracy on COCO. To reproduce SOTA, you might want to use TLT to train an ImageNet model first and change `total_steps` to 100K or above. In one experiment, we got 37+% AP and 34% mask_AP with 8-GPU training for 100K steps.**
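
The learning-rate rule in the list above (divide by 4/N for N GPUs) works out as in the sketch below. `scale_learning_rate` is a hypothetical helper, and the value `0.01` in the usage line is purely illustrative rather than taken from the spec file.

```python
def scale_learning_rate(reference_lr, num_gpus, reference_gpus=4):
    """Divide the reference learning rate by reference_gpus / num_gpus."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return reference_lr / (reference_gpus / num_gpus)


# Illustrative value only: a rate tuned for 4 GPUs, rescaled for 1 GPU.
print(scale_learning_rate(0.01, num_gpus=1))
```

Dividing by 4/N is the same as multiplying by N/4, so the rate shrinks on fewer than 4 GPUs and grows on more. Whether the warmup learning rate should be scaled the same way is not stated here.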

In [32]:
!cat $LOCAL_SPECS_DIR

cat: /home/luis/GitHub/rgbd-pepper-pose-estimation/Mask_RCNN/tlt/tlt-samples/mask_rcnn/specs: Is a directory


In [49]:
!cat $LOCAL_SPECS_DIR/maskrcnn_train_resnet50.txt

seed: 123
use_amp: False
warmup_steps: 1000
checkpoint: "/workspace/tlt-experiments/mask_rcnn/pretrained_resnet50/tlt_instance_segmentation_vresnet50/resnet50.hdf5"
learning_rate_steps: "[10000, 15000, 20000]"
learning_rate_decay_levels: "[0.1, 0.02, 0.01]"
total_steps: 25000
train_batch_size: 1
eval_batch_size: 1
num_steps_per_eval: 5000
momentum: 0.9
l2_weight_decay: 0.0001
warmup_learning_rate: 0.000025
init_learning_rate: 0.0025

data_config{
    image_size: "(768, 1152)"
    augment_input_data: False
    eval_samples: 500
    training_file_pattern: "/workspace/tlt-experiments/data/train*.tfrecord"
    validation_file_pattern: "/workspace/tlt-experiments/data/val*.tfrecord"
    val_json_file: "/workspace/tlt-experiments/data/annotations/instances_val2017.json"

    # dataset specific parameters
    num_classes: 91
    skip_crowd_during_training: True
}

maskrcnn_config {
    nlayers: 50
    arch: "resnet"
    freeze_bn: True
    freeze_blocks: "[0,1]"
    gt_mask_size: 112
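
To get a feel for the schedule fields at the top of the spec, the sketch below computes a learning rate per step. It assumes a linear warmup from `warmup_learning_rate` to `init_learning_rate` over `warmup_steps`, followed by step decay in which each entry of `learning_rate_decay_levels` multiplies the initial rate once the matching entry of `learning_rate_steps` is reached. This is one plausible reading of the field names; the notebook does not state how the toolkit applies them.

```python
import ast
from bisect import bisect_right


def learning_rate_at(step, init_lr, warmup_lr, warmup_steps, steps, levels):
    """Approximate learning rate at `step` under the assumed schedule."""
    if step < warmup_steps:
        # Linear ramp from warmup_lr to init_lr.
        fraction = step / warmup_steps
        return warmup_lr + fraction * (init_lr - warmup_lr)
    # The number of boundaries already reached selects the decay level.
    passed = bisect_right(steps, step)
    if passed == 0:
        return init_lr
    return init_lr * levels[passed - 1]


# Values copied from the spec output above; list fields are quoted strings.
steps = ast.literal_eval("[10000, 15000, 20000]")
levels = ast.literal_eval("[0.1, 0.02, 0.01]")
for step in (0, 500, 5000, 12000, 24000):
    lr = learning_rate_at(step, 0.0025, 0.000025, 1000, steps, levels)
    print(step, f"{lr:.6g}")
```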
        

## 4. Train a MaskRCNN model <a class="anchor" id="head-4"></a>
* Provide the sample spec file and the output directory location for models
* WARNING: training can take from several hours to a full day to complete

In [34]:
!mkdir -p $LOCAL_EXPERIMENT_DIR/experiment_dir_unpruned

In [55]:
print("For multi-GPU, change --gpus based on your machine.")
!tlt mask_rcnn train -e $SPECS_DIR/maskrcnn_train_resnet50.txt \
                 -d $USER_EXPERIMENT_DIR/experiment_dir_unpruned \
                 -k $KEY \
                 --gpus 1

For multi-GPU, change --gpus based on your machine.
The `nvidia-tlt` package will be deprecated soon. Going forward please migrate to using the `nvidia-tao` package.

~/.tao_mounts.json wasn't found. Falling back to obtain mount points and docker configs from ~/.tlt_mounts.json.
Please note that this will be deprecated going forward.
2021-10-07 13:41:50,482 [INFO] root: Registry: ['nvcr.io']
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/luis/.tlt_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
Using TensorFlow backend.
Using TensorFlow backend.


[MaskRCNN] INFO    : Loading weights from /workspace/tlt-experiments/mask_rcnn/experiment_dir_unpruned/model.step-10000.tlt
[MaskRCNN] INFO    : Loading weights from /workspace/tlt-experiments/mask_rcnn/experiment_dir_unpruned/model.step-10000.tlt
[MaskRC

In [39]:
!tlt mask_rcnn

The `nvidia-tlt` package will be deprecated soon. Going forward please migrate to using the `nvidia-tao` package.

~/.tao_mounts.json wasn't found. Falling back to obtain mount points and docker configs from ~/.tlt_mounts.json.
Please note that this will be deprecated going forward.
2021-10-07 12:21:53,064 [INFO] root: Registry: ['nvcr.io']
2021-10-07 12:21:53,138 [INFO] tlt.components.instance_handler.local_instance: No commands provided to the launcher
Kicking off an interactive docker session.
NOTE: This container instance will be terminated when you exit.
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/luis/.tlt_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
root@985d5aedf4f3:/workspace# ^C

root@985d5aedf4f3:/workspace# 

In [None]:
print("To resume training from a checkpoint, simply run the same training command. It will pick up from where it left off.")
!tlt mask_rcnn train -e $SPECS_DIR/maskrcnn_train_resnet50.txt \
                 -d $USER_EXPERIMENT_DIR/experiment_dir_unpruned \
                 -k $KEY \
                 --gpus 2

In [None]:
print('Model checkpoints:')
print('---------------------')
!ls -ltrh $LOCAL_EXPERIMENT_DIR/experiment_dir_unpruned/
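
The checkpoints are named `model.step-N.tlt`, and the evaluation, inference and export cells that follow select one of them through `$NUM_STEP`. The sketch below extracts the available step numbers from a directory listing and sorts them numerically; `checkpoint_steps` is a hypothetical helper, not a TLT function.

```python
import re

# Matches model.step-15000.tlt but not the exported model.step-15000.etlt.
CHECKPOINT_RE = re.compile(r"^model\.step-(\d+)\.tlt$")


def checkpoint_steps(filenames):
    """Return the sorted step numbers of all model.step-N.tlt files."""
    steps = []
    for name in filenames:
        match = CHECKPOINT_RE.match(name)
        if match:
            steps.append(int(match.group(1)))
    return sorted(steps)
```

For example, `checkpoint_steps(os.listdir(os.path.join(os.environ["LOCAL_EXPERIMENT_DIR"], "experiment_dir_unpruned")))[-1]` gives the most recent step, which could then be assigned to `NUM_STEP`.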

## 5. Evaluate trained models <a class="anchor" id="head-5"></a>

In [56]:
%env NUM_STEP=15000

env: NUM_STEP=15000


In [57]:
!tlt mask_rcnn evaluate -e $SPECS_DIR/maskrcnn_train_resnet50.txt \
                    -m $USER_EXPERIMENT_DIR/experiment_dir_unpruned/model.step-$NUM_STEP.tlt \
                    -k $KEY

The `nvidia-tlt` package will be deprecated soon. Going forward please migrate to using the `nvidia-tao` package.

~/.tao_mounts.json wasn't found. Falling back to obtain mount points and docker configs from ~/.tlt_mounts.json.
Please note that this will be deprecated going forward.
2021-10-07 14:10:20,985 [INFO] root: Registry: ['nvcr.io']
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/luis/.tlt_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
Using TensorFlow backend.
Using TensorFlow backend.


[MaskRCNN] INFO    : [eval] AMP is activated - Experiment Feature
[MaskRCNN] INFO    : Starting to evaluate.
[MaskRCNN] INFO    : Loading weights from /workspace/tlt-experiments/mask_rcnn/experiment_dir_unpruned/model.step-15000.tlt
loading annotations into memory...
Done (t=0.49s)
creating index...
index 

## 6. Visualize inferences <a class="anchor" id="head-6"></a>
In this section, we run the `tlt mask_rcnn inference` tool to generate inferences with the trained model and visualize the results. The tool produces annotated image outputs. You can choose to draw bounding boxes only, or to draw both bounding boxes and masks.

In [65]:
!echo $NUM_STEP

15000


In [67]:
# Running inference for detection on n images
!tlt mask_rcnn inference -i $DATA_DOWNLOAD_DIR/raw-data/test2017 \
                     -o $USER_EXPERIMENT_DIR/maskrcnn_annotated_images \
                     -e $SPECS_DIR/maskrcnn_train_resnet50.txt \
                     -m $USER_EXPERIMENT_DIR/experiment_dir_unpruned/model.step-$NUM_STEP.tlt \
                     -l $SPECS_DIR/coco_labels.txt \
                     -t 0.5 \
                     -k $KEY \
                     --include_mask

The `nvidia-tlt` package will be deprecated soon. Going forward please migrate to using the `nvidia-tao` package.

~/.tao_mounts.json wasn't found. Falling back to obtain mount points and docker configs from ~/.tlt_mounts.json.
Please note that this will be deprecated going forward.
2021-10-07 14:21:03,828 [INFO] root: Registry: ['nvcr.io']
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/luis/.tlt_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
Using TensorFlow backend.
Using TensorFlow backend.


Label file does not exist. Skipping...
[MaskRCNN] INFO    : [eval] AMP is activated - Experiment Feature
[MaskRCNN] INFO    : Running inference...
[MaskRCNN] INFO    : Loading weights from /workspace/tlt-experiments/mask_rcnn/experiment_dir_unpruned/model.step-15000.tlt

[MaskRCNN] INFO    : **************

In [None]:
# Simple grid visualizer
import matplotlib.pyplot as plt
import os
from math import ceil
valid_image_ext = ['.jpg']

def visualize_images(image_dir, num_cols=4, num_images=10):
    """Show up to num_images annotated images from image_dir in a grid."""
    output_path = os.path.join(os.environ['LOCAL_EXPERIMENT_DIR'], image_dir)
    num_rows = int(ceil(float(num_images) / float(num_cols)))
    # squeeze=False keeps axarr 2-D even when the grid has one row or column.
    f, axarr = plt.subplots(num_rows, num_cols, figsize=[80,30], squeeze=False)
    f.tight_layout()
    # Sort so that the same images are shown on every run.
    a = sorted(os.path.join(output_path, image) for image in os.listdir(output_path)
               if os.path.splitext(image)[1].lower() in valid_image_ext)
    for idx, img_path in enumerate(a[:num_images]):
        col_id = idx % num_cols
        row_id = idx // num_cols
        img = plt.imread(img_path)
        axarr[row_id, col_id].imshow(img)

In [None]:
# Visualizing the sample images.
OUTPUT_PATH = 'maskrcnn_annotated_images' # relative path from $LOCAL_EXPERIMENT_DIR.
COLS = 2 # number of columns in the visualizer grid.
IMAGES = 4 # number of images to visualize.

visualize_images(OUTPUT_PATH, num_cols=COLS, num_images=IMAGES)

## 7. Deploy! <a class="anchor" id="head-7"></a>

In [68]:
# Export in FP32 mode. 
!mkdir -p $LOCAL_EXPERIMENT_DIR/export 
!tlt mask_rcnn export -m $USER_EXPERIMENT_DIR/experiment_dir_unpruned/model.step-$NUM_STEP.tlt \
                  -k $KEY \
                  -e $SPECS_DIR/maskrcnn_train_resnet50.txt \
                  --batch_size 1 \
                  --data_type fp32 \
                  --engine_file $USER_EXPERIMENT_DIR/export/model.step-$NUM_STEP.engine

The `nvidia-tlt` package will be deprecated soon. Going forward please migrate to using the `nvidia-tao` package.

~/.tao_mounts.json wasn't found. Falling back to obtain mount points and docker configs from ~/.tlt_mounts.json.
Please note that this will be deprecated going forward.
2021-10-07 14:24:31,198 [INFO] root: Registry: ['nvcr.io']
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/luis/.tlt_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
Using TensorFlow backend.
Using TensorFlow backend.
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so.

For more information about alternatives visit: ('http://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
Environment variab

In [69]:
# Export in INT8 mode. 
!mkdir -p $LOCAL_EXPERIMENT_DIR/export
# Uncomment to remove an existing etlt file (this runs on the host, so use $LOCAL_EXPERIMENT_DIR)
# !rm $LOCAL_EXPERIMENT_DIR/experiment_dir_unpruned/model.step-$NUM_STEP.etlt
!tlt mask_rcnn export -m $USER_EXPERIMENT_DIR/experiment_dir_unpruned/model.step-$NUM_STEP.tlt \
                  -k $KEY \
                  -e $SPECS_DIR/maskrcnn_train_resnet50.txt \
                  --batch_size 1 \
                  --data_type int8 \
                  --cal_image_dir $DATA_DOWNLOAD_DIR/raw-data/val2017 \
                  --batches 10 \
                  --cal_cache_file $USER_EXPERIMENT_DIR/export/maskrcnn.cal \
                  --cal_data_file $USER_EXPERIMENT_DIR/export/maskrcnn.tensorfile

The `nvidia-tlt` package will be deprecated soon. Going forward please migrate to using the `nvidia-tao` package.

~/.tao_mounts.json wasn't found. Falling back to obtain mount points and docker configs from ~/.tlt_mounts.json.
Please note that this will be deprecated going forward.
2021-10-07 14:26:16,487 [INFO] root: Registry: ['nvcr.io']
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/luis/.tlt_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
Using TensorFlow backend.
Using TensorFlow backend.
Environment variables with the 'NUMBAPRO' prefix are deprecated and consequently ignored, found use of NUMBAPRO_NVVM=/usr/local/cuda/nvvm/lib64/libnvvm.so.

For more information about alternatives visit: ('http://numba.pydata.org/numba-doc/latest/cuda/overview.html', '#cudatoolkit-lookup')
Environment variab

In [70]:
# Check if etlt model is correctly saved.
!ls -l $LOCAL_EXPERIMENT_DIR/experiment_dir_unpruned/

total 1729696
drwxr-xr-x 2 root root      4096 oct  7 14:12 eval
-rw-r--r-- 1 root root    108696 oct  7 14:05 eval_graph.json
-rw-r--r-- 1 root root  45682791 oct  7 13:38 events.out.tfevents.1633629175.6cf7d83ac95c
-rw-r--r-- 1 root root  45745868 oct  7 14:05 events.out.tfevents.1633632135.b359256ab15a
-rw-r--r-- 1 root root  29004015 oct  7 13:42 graph.pbtxt
-rw-r--r-- 1 root root   1798251 oct  7 14:09 log.txt
-rw-r--r-- 1 root root 367638243 oct  7 12:53 model.step-0.tlt
-rw-r--r-- 1 root root 367654056 oct  7 13:42 model.step-10000.tlt
-rw-r--r-- 1 root root 178129839 oct  7 14:25 model.step-15000.etlt
-rw-r--r-- 1 root root 367654120 oct  7 14:05 model.step-15000.tlt
-rw-r--r-- 1 root root 367638327 oct  7 13:15 model.step-5000.tlt
-rw-r--r-- 1 root root    110490 oct  7 14:07 train_graph.json


Verify engine generation using the `tlt-converter` utility included with the docker.

The `tlt-converter` produces optimized TensorRT engines for the platform that it resides on. Therefore, to get maximum performance, please instantiate this docker and execute the `tlt-converter` command, with the exported `.etlt` file and calibration cache (for int8 mode), on your target device. The converter utility included in this docker only works for x86 devices with discrete NVIDIA GPUs.

For Jetson devices, please download the converter for Jetson from the dev zone link [here](https://developer.nvidia.com/tlt-converter).

If you choose to integrate your model into DeepStream directly, you may do so by simply copying the exported `.etlt` file along with the calibration cache to the target device and updating the spec file that configures the `gst-nvinfer` element to point to this newly exported model. Please refer to the [DeepStream dev guide](https://docs.nvidia.com/metropolis/deepstream/dev-guide/index.html) for more details.
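
The converter is given the model input dimensions through its `-d` flag as `C,H,W`. Assuming these should match the `image_size` the model was trained and exported with, and that `image_size` is written as `(height, width)`, the value can be derived from the spec file rather than typed by hand. This matters here because the spec shown earlier uses `(768, 1152)`. `converter_dims` is a hypothetical helper.

```python
import re

# Matches the quoted tuple form used in the spec, e.g. image_size: "(768, 1152)".
IMAGE_SIZE_RE = re.compile(r'image_size:\s*"\(\s*(\d+)\s*,\s*(\d+)\s*\)"')


def converter_dims(spec_text, channels=3):
    """Build a C,H,W string from the spec's image_size field.

    Assumes image_size is (height, width) and that the model takes
    `channels` input channels.
    """
    match = IMAGE_SIZE_RE.search(spec_text)
    if match is None:
        raise ValueError("image_size not found in spec")
    height, width = match.groups()
    return f"{channels},{height},{width}"


print(converter_dims('image_size: "(768, 1152)"'))  # prints 3,768,1152
```

In practice `spec_text` would be the contents of `maskrcnn_train_resnet50.txt` read from `$LOCAL_SPECS_DIR`.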

In [71]:
print('Exported model:')
print('------------')
!ls -lth $LOCAL_EXPERIMENT_DIR/export

Exported model:
------------
total 289M
-rw-r--r-- 1 root root 289M oct  7 14:25 model.step-15000.engine


In [72]:
# Convert to TensorRT engine(FP16).
!tlt tlt-converter -k $KEY  \
               -d 3,832,1344 \
               -o generate_detections,mask_head/mask_fcn_logits/BiasAdd \
               -e $USER_EXPERIMENT_DIR/export/trt.fp16.engine \
               -t fp16 \
               -i nchw \
               -m 1 \
               $USER_EXPERIMENT_DIR/experiment_dir_unpruned/model.step-$NUM_STEP.etlt

The `nvidia-tlt` package will be deprecated soon. Going forward please migrate to using the `nvidia-tao` package.

~/.tao_mounts.json wasn't found. Falling back to obtain mount points and docker configs from ~/.tlt_mounts.json.
Please note that this will be deprecated going forward.
usage: tao [-h]
           {list,stop,info,augment,bpnet,classification,converter,detectnet_v2,dssd,emotionnet,faster_rcnn,fpenet,gazenet,gesturenet,heartratenet,intent_slot_classification,lprnet,mask_rcnn,multitask_classification,n_gram,punctuation_and_capitalization,question_answering,retinanet,speech_to_text,speech_to_text_citrinet,ssd,text_classification,token_classification,unet,yolo_v3,yolo_v4}
           ...
tao: error: invalid choice: 'tlt-converter' (choose from 'list', 'stop', 'info', 'augment', 'bpnet', 'classification', 'converter', 'detectnet_v2', 'dssd', 'emotionnet', 'faster_rcnn', 'fpenet', 'gazenet', 'gesturenet', 'heartratenet', 'intent_slot_classification', 'lprnet', 'mask_rcnn', 'multita
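
The usage message above lists `converter`, not `tlt-converter`, among the subcommands the launcher accepts. The sketch below assembles the FP16 and INT8 conversions with that subcommand. It assumes that the remaining flags carry over unchanged and that `KEY`, `USER_EXPERIMENT_DIR`, and `NUM_STEP` are set as earlier in the notebook. The helper name `build_converter_command` is hypothetical, and the commands are only printed here, not executed.

```python
def build_converter_command(key, etlt_path, engine_path, precision="fp16",
                            calib_path=None, batch_size=None):
    """Assemble a `tlt converter` call with the flags used in this notebook."""
    cmd = [
        "tlt", "converter",
        "-k", key,
        "-d", "3,832,1344",
        "-o", "generate_detections,mask_head/mask_fcn_logits/BiasAdd",
        "-e", engine_path,
        "-t", precision,
        "-i", "nchw",
        "-m", "1",
    ]
    if precision == "int8":
        # INT8 engines need the calibration cache written during export.
        if calib_path is None:
            raise ValueError("INT8 mode needs the calibration cache")
        cmd += ["-c", calib_path]
        if batch_size is not None:
            cmd += ["-b", str(batch_size)]
    cmd.append(etlt_path)  # the .etlt model is the final positional argument
    return cmd


etlt = "$USER_EXPERIMENT_DIR/experiment_dir_unpruned/model.step-$NUM_STEP.etlt"

fp16_cmd = build_converter_command(
    "$KEY", etlt, "$USER_EXPERIMENT_DIR/export/trt.fp16.engine"
)
int8_cmd = build_converter_command(
    "$KEY", etlt, "$USER_EXPERIMENT_DIR/export/trt.int8.engine",
    precision="int8",
    calib_path="$USER_EXPERIMENT_DIR/export/maskrcnn.cal",
    batch_size=8,
)

# Joined with plain spaces so the shell still expands the $VARIABLES.
print(" ".join(fp16_cmd))
print(" ".join(int8_cmd))
```

Either printed line can be pasted into a notebook cell with a leading `!`.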

In [75]:
!tlt mask_rcnn

The `nvidia-tlt` package will be deprecated soon. Going forward please migrate to using the `nvidia-tao` package.

~/.tao_mounts.json wasn't found. Falling back to obtain mount points and docker configs from ~/.tlt_mounts.json.
Please note that this will be deprecated going forward.
2021-10-07 14:30:05,063 [INFO] root: Registry: ['nvcr.io']
2021-10-07 14:30:05,129 [INFO] tlt.components.instance_handler.local_instance: No commands provided to the launcher
Kicking off an interactive docker session.
NOTE: This container instance will be terminated when you exit.
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/luis/.tlt_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
root@f651ed3665be:/workspace# ^C

root@f651ed3665be:/workspace# 

In [73]:
# Convert to TensorRT engine (INT8).
!tlt tlt-converter -k $KEY  \
               -d 3,832,1344 \
               -o generate_detections,mask_head/mask_fcn_logits/BiasAdd \
               -c $USER_EXPERIMENT_DIR/export/maskrcnn.cal \
               -e $USER_EXPERIMENT_DIR/export/trt.int8.engine \
               -b 8 \
               -m 1 \
               -t int8 \
               -i nchw \
               $USER_EXPERIMENT_DIR/experiment_dir_unpruned/model.step-$NUM_STEP.etlt

The `nvidia-tlt` package will be deprecated soon. Going forward please migrate to using the `nvidia-tao` package.

~/.tao_mounts.json wasn't found. Falling back to obtain mount points and docker configs from ~/.tlt_mounts.json.
Please note that this will be deprecated going forward.
usage: tao [-h]
           {list,stop,info,augment,bpnet,classification,converter,detectnet_v2,dssd,emotionnet,faster_rcnn,fpenet,gazenet,gesturenet,heartratenet,intent_slot_classification,lprnet,mask_rcnn,multitask_classification,n_gram,punctuation_and_capitalization,question_answering,retinanet,speech_to_text,speech_to_text_citrinet,ssd,text_classification,token_classification,unet,yolo_v3,yolo_v4}
           ...
tao: error: invalid choice: 'tlt-converter' (choose from 'list', 'stop', 'info', 'augment', 'bpnet', 'classification', 'converter', 'detectnet_v2', 'dssd', 'emotionnet', 'faster_rcnn', 'fpenet', 'gazenet', 'gesturenet', 'heartratenet', 'intent_slot_classification', 'lprnet', 'mask_rcnn', 'multita

In [None]:
print('Exported engine:')
print('------------')
!ls -lh $LOCAL_EXPERIMENT_DIR/export/

## 8. Verify the deployed model <a class="anchor" id="head-8"></a>

Verify the converted engine by visualizing TensorRT inferences.

In [76]:
# Run TensorRT inference on a directory of images
!tlt mask_rcnn inference -i $DATA_DOWNLOAD_DIR/raw-data/test2017 \
                     -o $USER_EXPERIMENT_DIR/maskrcnn_annotated_images \
                     -e $SPECS_DIR/maskrcnn_train_resnet50.txt \
                     -m $USER_EXPERIMENT_DIR/export/model.step-$NUM_STEP.engine \
                     -l $USER_EXPERIMENT_DIR/maskrcnn_annotated_labels \
                     -c $SPECS_DIR/coco_labels.txt \
                     -t 0.5 \
                     --include_mask

The `nvidia-tlt` package will be deprecated soon. Going forward please migrate to using the `nvidia-tao` package.

~/.tao_mounts.json wasn't found. Falling back to obtain mount points and docker configs from ~/.tlt_mounts.json.
Please note that this will be deprecated going forward.
2021-10-07 14:38:42,161 [INFO] root: Registry: ['nvcr.io']
Docker will run the commands as root. If you would like to retain your
local host permissions, please add the "user":"UID:GID" in the
DockerOptions portion of the "/home/luis/.tlt_mounts.json" file. You can obtain your
users UID and GID by using the "id -u" and "id -g" commands on the
terminal.
Using TensorFlow backend.
Using TensorFlow backend.


  1%|▌                                    | 577/40670 [02:17<2:39:29,  4.19it/s]^C
Traceback (most recent call last):
  File "/usr/local/bin/mask_rcnn", line 8, in <module>
    sys.exit(main())
  File "/root/.cache/bazel/_bazel_root/ed34e6d125608f91724fda23656f1726/execroot/ai_infra/bazel-out/k8-fastbuild/

In [None]:
!ls -l $LOCAL_EXPERIMENT_DIR/maskrcnn_annotated_images
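
The inference run above was interrupted after 577 of 40670 images; at roughly 4 images per second, a full pass takes well over two hours. For a quick check of the engine, one option is to copy a small random sample into a separate directory and pass that directory to `-i` instead. The sketch below assumes host-side paths and uses the hypothetical helper name `sample_images`; the demo at the end runs on temporary directories, not on the real dataset. The sample directory must also be visible inside the container through the configured mounts.

```python
import random
import shutil
import tempfile
from pathlib import Path


def sample_images(src_dir, dst_dir, count, seed=0,
                  suffixes=(".jpg", ".jpeg", ".png")):
    """Copy up to `count` randomly chosen images from src_dir to dst_dir."""
    src, dst = Path(src_dir), Path(dst_dir)
    images = sorted(
        p for p in src.iterdir()
        if p.is_file() and p.suffix.lower() in suffixes
    )
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    chosen = rng.sample(images, min(count, len(images)))
    dst.mkdir(parents=True, exist_ok=True)
    for path in chosen:
        shutil.copy2(path, dst / path.name)
    return sorted(path.name for path in chosen)


# Demo on throwaway directories with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "test2017"
    src.mkdir()
    for i in range(20):
        (src / f"{i:012d}.jpg").write_bytes(b"")
    (src / "notes.txt").write_text("not an image")

    dst = Path(tmp) / "test2017_sample"
    picked = sample_images(src, dst, count=5)
    copied = sorted(p.name for p in dst.iterdir())
    print(len(picked), "images copied")
```

With a real dataset, `src_dir` would be the host directory that holds `test2017`, and the `-i` argument of the inference command would point to the container-side path of the sample directory.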