## Get the TensorRT tar file before running this Notebook

1. Visit https://developer.nvidia.com/tensorrt
2. Click `Download now`. This takes you to https://developer.nvidia.com/nvidia-tensorrt-download, where you must log in to (or join) the NVIDIA Developer Program
3. On the download page, choose TensorRT 8 from the available versions
4. Agree to Terms and Conditions
5. Click on TensorRT 8.6 GA to expand the available options
6. Click on 'TensorRT 8.6 GA for Linux x86_64 and CUDA 12.0 and 12.1 TAR Package' to download the TAR file
7. Upload the tar file to your Google Drive

## Connect to GPU Instance

1. Change the runtime type to GPU: Runtime (top-left tab) -> Change Runtime Type -> GPU (Hardware Accelerator)
1. Then click on Connect (Top Right)
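
Once connected, it can help to confirm that a GPU is actually attached before running the longer cells. The following is a minimal sketch, not part of the original notebook: `gpu_visible` is a hypothetical helper name, and it assumes the NVIDIA driver's `nvidia-smi` tool is on `PATH` whenever a GPU runtime is attached.

```python
import shutil
import subprocess

def gpu_visible():
    """Return True if `nvidia-smi -L` runs successfully and lists at least one GPU."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        # Assumption: no nvidia-smi on PATH means no GPU runtime is attached.
        return False
    try:
        result = subprocess.run([exe, "-L"], capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return False
    # `nvidia-smi -L` prints one line per device, each starting with "GPU <index>:".
    return result.returncode == 0 and "GPU " in result.stdout

print("GPU runtime attached:", gpu_visible())
```

If this prints `False` on Colab, re-check the runtime type and reconnect.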

## Mounting Google Drive
Mount your Google Drive storage to this Colab instance.

In [None]:
import sys
if 'google.colab' in sys.modules:
    %env GOOGLE_COLAB=1
    from google.colab import drive
    drive.mount('/content/drive', force_remount=True)
else:
    %env GOOGLE_COLAB=0
    print("Warning: Not a Colab Environment")

# Instance Segmentation using TAO MaskRCNN

Transfer learning is the process of transferring learned features from one application to another. It is a commonly used training technique in which you take a model trained on one task and retrain it for a different task.

Train Adapt Optimize (TAO) Toolkit is a simple and easy-to-use Python-based AI toolkit for taking purpose-built AI models and customizing them with users' own data.

<img align="center" src="https://developer.nvidia.com/sites/default/files/akamai/TAO/tlt-tao-toolkit-bring-your-own-model-diagram.png" width="1080"> 

## Learning Objectives
In this notebook, you will learn how to leverage the simplicity and convenience of TAO to:

* Take a pretrained resnet50 model and train a MaskRCNN model on COCO dataset
* Evaluate the trained model
* Run Inference with the trained model and visualize the result
* Export the trained model to a .etlt file for deployment to DeepStream

### Table of Contents
This notebook shows an example use case for instance segmentation using the Train Adapt Optimize (TAO) Toolkit.

0. [Set up env variables](#head-0)
1. [Prepare dataset and pre-trained model](#head-1)
2. [Setup GPU environment](#head-2) <br>
    2.1 [Setup Python environment](#head-2-1) <br>
3. [Generate tfrecords](#head-3)
4. [Provide training specification](#head-4)
5. [Run TAO training](#head-5)
6. [Evaluate trained models](#head-6)
7. [Prune trained model](#head-7)
8. [Retrain pruned models](#head-8)
9. [Evaluate retrained model](#head-9)
10. [Visualize inferences](#head-10)

#### Note
1. By default, this notebook is set up to run training on 1 GPU. To use more GPUs, update the env variable `$NUM_GPUS` accordingly
2. This notebook uses the COCO dataset by default, which is around 25 GB. If you are limited by Google Drive storage, we recommend that you:

    i. Download the dataset onto the local system

    ii. Run the utility script at $COLAB_NOTEBOOKS/tensorflow/utils/generate_coco_subset.py on your local system

    iii. This generates a subset of the COCO dataset with the number of sample images you choose

    iv. Upload this subset onto Google Drive

3. With the default config/spec file provided in this notebook, each MaskRCNN weight file created during training is ~354 MB
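
To illustrate what subsetting a COCO annotation file involves, here is a minimal sketch. It is not the repo's `generate_coco_subset.py`, whose interface is not shown here: `subset_coco` is a hypothetical helper, and it assumes only the standard COCO JSON layout (`images`, `annotations`, and `categories` lists, with each annotation pointing at its image through `image_id`).

```python
import random

def subset_coco(coco, num_images, seed=0):
    """Return a COCO-style dict restricted to `num_images` randomly chosen images."""
    rng = random.Random(seed)  # fixed seed so the same subset is picked every run
    images = rng.sample(coco["images"], min(num_images, len(coco["images"])))
    keep_ids = {img["id"] for img in images}
    # Keep only the annotations that belong to the selected images.
    annotations = [a for a in coco["annotations"] if a["image_id"] in keep_ids]
    subset = dict(coco)  # shallow copy; categories and other keys are reused as-is
    subset["images"] = images
    subset["annotations"] = annotations
    return subset

# Toy annotation dict standing in for instances_train2017.json.
toy = {
    "images": [{"id": i, "file_name": f"{i:012d}.jpg"} for i in range(1, 6)],
    "annotations": [{"id": 10 + i, "image_id": i, "category_id": 1} for i in range(1, 6)],
    "categories": [{"id": 1, "name": "person"}],
}
small = subset_coco(toy, num_images=2)
print(len(small["images"]), len(small["annotations"]))  # prints: 2 2
```

For a real subset you would also copy only the image files listed in `small["images"]` before uploading to Google Drive.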

## 0. Set up env variables and set FIXME parameters <a class="anchor" id="head-0"></a>

*Note: By default, this notebook is set up to run training on 1 GPU. To use more GPUs, update the env variable `$NUM_GPUS` accordingly.*

#### FIXME
1. NUM_GPUS - set this to <= the number of GPUs available on the instance
1. COLAB_NOTEBOOKS_PATH - for the Google Colab environment, set this to the path where you want to clone the repo; for a local system environment, set this to the path of the already cloned repo
1. EXPERIMENT_DIR - set this to the folder where pretrained models, checkpoints and log files from the different model actions will be saved
1. delete_existing_experiments - set to True to remove existing pretrained models, checkpoints and log files of a previous experiment
1. DATA_DIR - set this to the folder where you want the dataset to be stored
1. delete_existing_data - set this to True to remove existing preprocessed and original data
1. trt_tar_path - set this to the path of the uploaded TensorRT tar.gz file
1. trt_untar_folder_path - set this to the path of the folder into which the TensorRT tar.gz file should be extracted
1. trt_version - set this to the version of TRT you have downloaded
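
Because `trt_version` has to match the tar file you uploaded, one option is to derive it from the file name rather than typing it twice. The sketch below is an illustration only: `trt_version_from_tar` is a hypothetical helper, and it assumes the tar file keeps NVIDIA's `TensorRT-<version>.<platform>...tar.gz` naming, as in the default path used later in this notebook.

```python
import os
import re

def trt_version_from_tar(tar_path):
    """Extract the dotted version that follows 'TensorRT-' in the tar file name."""
    name = os.path.basename(tar_path)
    match = re.search(r"TensorRT-(\d+(?:\.\d+)+)", name)
    return match.group(1) if match else None  # None if the name does not match

example = "/content/drive/MyDrive/TensorRT-8.6.1.6.Linux.x86_64-gnu.cuda-12.0.tar.gz"
print(trt_version_from_tar(example))  # prints 8.6.1.6
```

A `None` result means the file was renamed, in which case set `trt_version` by hand.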

In [None]:
# Setting up env variables for cleaner command line commands.
import os

%env TAO_DOCKER_DISABLE=1

%env KEY=nvidia_tlt
#FIXME1
%env NUM_GPUS=1

#FIXME2
%env COLAB_NOTEBOOKS_PATH=/content/drive/MyDrive/nvidia-tao
if os.environ["GOOGLE_COLAB"] == "1":
    if not os.path.exists(os.environ["COLAB_NOTEBOOKS_PATH"]):
        !git clone https://github.com/NVIDIA-AI-IOT/nvidia-tao.git $COLAB_NOTEBOOKS_PATH
else:
    if not os.path.exists(os.environ["COLAB_NOTEBOOKS_PATH"]):
        raise Exception("Error, enter the path of the colab notebooks repo correctly")

#FIXME3
%env EXPERIMENT_DIR=/content/drive/MyDrive/results/mask_rcnn
#FIXME4
delete_existing_experiments = True
#FIXME5
%env DATA_DIR=/content/drive/MyDrive/coco_data/
#FIXME6
delete_existing_data = False

if delete_existing_experiments:
    !sudo rm -rf $EXPERIMENT_DIR
if delete_existing_data:
    !sudo rm -rf $DATA_DIR

SPECS_DIR=f"{os.environ['COLAB_NOTEBOOKS_PATH']}/tensorflow/mask_rcnn/specs"
%env SPECS_DIR={SPECS_DIR}
# Showing list of specification files.
!ls -rlt $SPECS_DIR

!sudo mkdir -p $DATA_DIR && sudo chmod -R 777 $DATA_DIR
!sudo mkdir -p $EXPERIMENT_DIR && sudo chmod -R 777 $EXPERIMENT_DIR

## 1. Prepare dataset and pre-trained model <a class="anchor" id="head-1"></a>

We will use the COCO dataset for this tutorial. The following script will download the COCO dataset automatically and convert it to TFRecords.

In [None]:
# Download and preprocess data
!bash $SPECS_DIR/download_coco.sh $DATA_DIR

Note that the dataset conversion scripts provided in `specs` are intended for the standard COCO dataset. If your data doesn't have `caption` ground truth or a test set, you can modify `download_and_preprocess_coco.sh` and `create_coco_tf_record.py` by commenting out the corresponding variables.

In [None]:
# verify
!ls -l $DATA_DIR/raw-data

### Download pretrained model from NGC

We will use the NGC CLI to get the pre-trained models. For more details, go to ngc.nvidia.com and click SETUP in the navigation bar.

In [None]:
# Installing NGC CLI on the local machine.
## Download and install
%env LOCAL_PROJECT_DIR=/ngc_content/
%env CLI=ngccli_cat_linux.zip
!sudo mkdir -p $LOCAL_PROJECT_DIR/ngccli && sudo chmod -R 777 $LOCAL_PROJECT_DIR

# Remove any previously existing CLI installations
!sudo rm -rf $LOCAL_PROJECT_DIR/ngccli/*
!wget --content-disposition 'https://api.ngc.nvidia.com/v2/resources/nvidia/ngc-apps/ngc_cli/versions/3.23.0/files/ngccli_linux.zip' -P $LOCAL_PROJECT_DIR/ngccli -O $LOCAL_PROJECT_DIR/ngccli/$CLI
!unzip -u -q "$LOCAL_PROJECT_DIR/ngccli/$CLI" -d $LOCAL_PROJECT_DIR/ngccli/
!rm $LOCAL_PROJECT_DIR/ngccli/*.zip
os.environ["PATH"]="{}/ngccli/ngc-cli:{}".format(os.getenv("LOCAL_PROJECT_DIR", ""), os.getenv("PATH", ""))
!cp /usr/lib/x86_64-linux-gnu/libstdc++.so.6 $LOCAL_PROJECT_DIR/ngccli/ngc-cli/libstdc++.so.6

In [None]:
!ngc registry model list nvidia/tao/pretrained_instance_segmentation:*

In [None]:
!mkdir -p $EXPERIMENT_DIR/pretrained_resnet50/

In [None]:
# Pull pretrained model from NGC
!ngc registry model download-version nvidia/tao/pretrained_instance_segmentation:resnet50 --dest $EXPERIMENT_DIR/pretrained_resnet50

In [None]:
print("Check that model is downloaded into dir.")
!ls -l $EXPERIMENT_DIR/pretrained_resnet50/pretrained_instance_segmentation_vresnet50

## 2. Setup GPU environment <a class="anchor" id="head-2"></a>


### 2.1 Setup Python environment <a class="anchor" id="head-2-1"></a>
Set up the environment needed to run the TAO networks by running the bash script.

In [None]:
# FIXME 7: set this path of the uploaded TensorRT tar.gz file after browser download
trt_tar_path="/content/drive/MyDrive/TensorRT-8.6.1.6.Linux.x86_64-gnu.cuda-12.0.tar.gz"

import os
if not os.path.exists(trt_tar_path):
  raise Exception("TAR file not found in the provided path")

# FIXME 8: set this to the path of the folder into which the TensorRT tar.gz file should be extracted
%env trt_untar_folder_path=/content/trt_untar
# FIXME 9: set this to the version of TRT you have downloaded
%env trt_version=8.6.1.6

!sudo mkdir -p $trt_untar_folder_path && sudo chmod -R 777 $trt_untar_folder_path/

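# Extract the tar file only if this TensorRT version has not been extracted already.
untar = True
for fname in os.listdir(os.environ.get("trt_untar_folder_path", None)):
  if fname.startswith("TensorRT-"+os.environ.get("trt_version")) and not fname.endswith(".tar.gz"):
    untar = False

if untar:
  # Extract into the folder set in FIXME 8 rather than a hardcoded path.
  !tar -xzf $trt_tar_path -C $trt_untar_folder_path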

if os.environ.get("LD_LIBRARY_PATH","") == "":
  os.environ["LD_LIBRARY_PATH"] = ""
trt_lib_path = f':{os.environ.get("trt_untar_folder_path")}/TensorRT-{os.environ.get("trt_version")}/lib'
os.environ["LD_LIBRARY_PATH"]+=trt_lib_path

In [None]:
import os
if os.environ["GOOGLE_COLAB"] == "1":
    os.environ["bash_script"] = "setup_env.sh"
else:
    os.environ["bash_script"] = "setup_env_desktop.sh"

!sed -i "s|PATH_TO_TRT|$trt_untar_folder_path|g" $COLAB_NOTEBOOKS_PATH/tensorflow/$bash_script
!sed -i "s|TRT_VERSION|$trt_version|g" $COLAB_NOTEBOOKS_PATH/tensorflow/$bash_script
!sed -i "s|PATH_TO_COLAB_NOTEBOOKS|$COLAB_NOTEBOOKS_PATH|g" $COLAB_NOTEBOOKS_PATH/tensorflow/$bash_script

!sh $COLAB_NOTEBOOKS_PATH/tensorflow/$bash_script

## 3. Generate tfrecords <a class="anchor" id="head-3"></a>

In [None]:
 # convert training data to TFRecords
!tao model mask_rcnn dataset_convert -i $DATA_DIR/raw-data/train2017 \
                               -a $DATA_DIR/raw-data/annotations/instances_train2017.json \
                               -o $DATA_DIR --include_masks -t train -s 256

In [None]:
 # convert validation data to TFRecords
!tao model mask_rcnn dataset_convert -i $DATA_DIR/raw-data/val2017 \
                               -a $DATA_DIR/raw-data/annotations/instances_val2017.json \
                               -o $DATA_DIR --include_masks -t val -s 32

## 4. Provide training specification <a class="anchor" id="head-4"></a>
* TFRecords for the training dataset
    * To use the newly generated TFRecords, update the dataset_config parameter in the spec file at `$SPECS_DIR/maskrcnn_train_resnet50.txt`
    * Note that the learning rate in the spec file is set for 4-GPU training. If you train on N GPUs, divide the learning rate by 4/N (equivalently, multiply it by N/4).
* Pre-trained models
* Augmentation parameters for on-the-fly data augmentation
* Other training (hyper-)parameters such as batch size, number of epochs, learning rate, etc.
* **Note that the sample spec is not meant to produce SOTA accuracy on COCO. To reproduce SOTA, you might want to use TAO to train an ImageNet model first and change total_steps to 100K or above. In one experiment, we got 37+% AP and 34% mask_AP with 8-GPU training for 100K steps.**
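
The learning-rate rule above (divide by 4/N when training on N GPUs) can be written out as a small calculation. This is a sketch for illustration: `scale_learning_rate` is a hypothetical helper, and the base value `0.02` is a placeholder, not the value in the spec file.

```python
def scale_learning_rate(base_lr, num_gpus, base_gpus=4):
    """Scale a learning rate tuned for `base_gpus` GPUs to `num_gpus` GPUs."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    # Dividing by (base_gpus / num_gpus) is the same as multiplying by num_gpus / base_gpus.
    return base_lr / (base_gpus / num_gpus)

base_lr = 0.02  # hypothetical value; read the real one from your spec file
for n in (1, 2, 4, 8):
    print(f"{n} GPU(s): learning rate {scale_learning_rate(base_lr, n):.4f}")
```

With 1 GPU the rate becomes a quarter of the 4-GPU value, and with 8 GPUs it doubles.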

In [None]:
!sed -i "s|TAO_DATA_PATH|$DATA_DIR/|g" $SPECS_DIR/maskrcnn_train_resnet50.txt
!sed -i "s|EXPERIMENT_DIR_PATH|$EXPERIMENT_DIR/|g" $SPECS_DIR/maskrcnn_train_resnet50.txt
!cat $SPECS_DIR/maskrcnn_train_resnet50.txt

## 5. Train a MaskRCNN model <a class="anchor" id="head-5"></a>
* Provide the sample spec file and the output directory location for models
* Evaluation uses COCO metrics. For more info, please refer to: https://cocodataset.org/#detection-eval
* WARNING: training can take anywhere from several hours to a full day to complete

In [None]:
!rm -rf $EXPERIMENT_DIR/experiment_dir_unpruned && mkdir -p $EXPERIMENT_DIR/experiment_dir_unpruned

In [None]:
print("For multi-GPU, change --gpus based on your machine.")
!tao model mask_rcnn train -e $SPECS_DIR/maskrcnn_train_resnet50.txt \
                     -d $EXPERIMENT_DIR/experiment_dir_unpruned\
                     -k $KEY \
                     --gpus 1

In [None]:
print("To resume training from a checkpoint, simply run the same training command again. It will pick up from where it left off.")
# !tao model mask_rcnn train -e $SPECS_DIR/maskrcnn_train_resnet50.txt \
#                      -d $EXPERIMENT_DIR/experiment_dir_unpruned\
#                      -k $KEY \
#                      --gpus 1

In [None]:
print('Model for each epoch:')
print('---------------------')
!ls -ltrh $EXPERIMENT_DIR/experiment_dir_unpruned/

## 6. Evaluate trained models <a class="anchor" id="head-6"></a>
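
The evaluation command needs `NUM_EPOCH` to point at a checkpoint that actually exists. As a convenience, the sketch below finds the highest epoch number in a checkpoint folder. It is an illustration, not part of TAO: `latest_epoch` is a hypothetical helper, and it assumes checkpoints follow the `model.epoch-<N>.tlt` naming used by the commands in this notebook.

```python
import os
import re
import tempfile

CKPT_PATTERN = re.compile(r"model\.epoch-(\d+)\.tlt$")

def latest_epoch(ckpt_dir):
    """Return the largest epoch number among model.epoch-<N>.tlt files, or None."""
    epochs = []
    for name in os.listdir(ckpt_dir):
        match = CKPT_PATTERN.match(name)
        if match:
            epochs.append(int(match.group(1)))
    return max(epochs) if epochs else None

# Demo on a throwaway directory with empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for epoch in (2, 5, 10):
        open(os.path.join(tmp, f"model.epoch-{epoch}.tlt"), "w").close()
    demo_latest = latest_epoch(tmp)

print(demo_latest)  # prints 10
```

In the notebook you could call it on `$EXPERIMENT_DIR/experiment_dir_unpruned` and assign the result, converted to a string, to `os.environ["NUM_EPOCH"]`.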

In [None]:
%env NUM_EPOCH=10

In [None]:
!tao model mask_rcnn evaluate -e $SPECS_DIR/maskrcnn_train_resnet50.txt \
                        -m $EXPERIMENT_DIR/experiment_dir_unpruned/model.epoch-$NUM_EPOCH.tlt \
                        -k $KEY

## 7. Prune <a class="anchor" id="head-7"></a>

- Specify pre-trained model
- Equalization criterion (applies only to ResNets and MobileNets, which have element-wise operations)
- Threshold for pruning.
- A key to save and load the model
- Output directory to store the model

Usually, you only need to adjust `-pth` (the threshold) to trade off accuracy against model size. A higher `pth` gives a smaller model (and thus faster inference) but lower accuracy. The right threshold depends on the dataset and the model; the 0.7 used in the block below is just a starting point. If the retrained accuracy is good, you can increase this value to get a smaller model. Otherwise, lower it to recover accuracy.
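
If you want to compare several thresholds, one approach is to generate one prune command per value and write each result to its own folder. The sketch below only builds and prints the command strings; it does not run them. `prune_commands` is a hypothetical helper, the `experiment_dir_pruned_pth<value>` folder names are made up for illustration, and the thresholds are arbitrary. The flags (`-m`, `-o`, `-pth`, `-k`) are the ones used in this notebook's prune cell.

```python
def prune_commands(thresholds, num_epoch, key="$KEY", experiment_dir="$EXPERIMENT_DIR"):
    """Build one `tao model mask_rcnn prune` command string per threshold."""
    cmds = []
    for pth in thresholds:
        # Hypothetical layout: a separate output folder per threshold.
        out_dir = f"{experiment_dir}/experiment_dir_pruned_pth{pth}"
        cmds.append(
            "tao model mask_rcnn prune "
            f"-m {experiment_dir}/experiment_dir_unpruned/model.epoch-{num_epoch}.tlt "
            f"-o {out_dir} -pth {pth} -k {key}"
        )
    return cmds

for cmd in prune_commands([0.3, 0.5, 0.7], num_epoch=10):
    print(cmd)
```

Each pruned model still has to be retrained before its accuracy can be compared.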

In [None]:
# Create an output directory to save the pruned model.
!mkdir -p $EXPERIMENT_DIR/experiment_dir_pruned

In [None]:
!tao model mask_rcnn prune -m $EXPERIMENT_DIR/experiment_dir_unpruned/model.epoch-$NUM_EPOCH.tlt \
                     -o $EXPERIMENT_DIR/experiment_dir_pruned \
                     -pth 0.7 \
                     -k $KEY

In [None]:
!ls -l $EXPERIMENT_DIR/experiment_dir_pruned

**Note** that you should retrain the pruned model first, as it cannot be directly used for evaluation or inference. 

## 8. Retrain pruned models <a class="anchor" id="head-8"></a>

- Model needs to be re-trained to bring back accuracy after pruning
- Specify re-training specification
- WARNING: training can take anywhere from several hours to a full day to complete

In [None]:
!sed -i "s|TAO_DATA_PATH|$DATA_DIR/|g" $SPECS_DIR/maskrcnn_retrain_resnet50.txt
!sed -i "s|EXPERIMENT_DIR_PATH|$EXPERIMENT_DIR/|g" $SPECS_DIR/maskrcnn_retrain_resnet50.txt
!cat $SPECS_DIR/maskrcnn_retrain_resnet50.txt

In [None]:
!mkdir -p $EXPERIMENT_DIR/experiment_dir_retrain

In [None]:
!tao model mask_rcnn train -e $SPECS_DIR/maskrcnn_retrain_resnet50.txt \
                     -d $EXPERIMENT_DIR/experiment_dir_retrain\
                     -k $KEY \
                     --gpus 1

## 9. Evaluate retrained model <a class="anchor" id="head-9"></a>

In [None]:
%env NUM_EPOCH=10

In [None]:
!tao model mask_rcnn evaluate -e $SPECS_DIR/maskrcnn_retrain_resnet50.txt \
                        -m $EXPERIMENT_DIR/experiment_dir_retrain/model.epoch-$NUM_EPOCH.tlt \
                        -k $KEY

## 10. Visualize inferences <a class="anchor" id="head-10"></a>
In this section, we run the `infer` tool to generate inferences on the trained models and visualize the results. The `infer` tool produces annotated image outputs. You can choose to draw bounding boxes only or draw both bboxes and masks.

In [None]:
# Running inference for detection on n images
!tao model mask_rcnn inference -i $DATA_DIR/val/images \
                         -r $EXPERIMENT_DIR/maskrcnn_annotated_images \
                         -e $SPECS_DIR/maskrcnn_train_resnet50.txt \
                         -m $EXPERIMENT_DIR/experiment_dir_unpruned/model.epoch-$NUM_EPOCH.tlt \
                         -t 0.5 \
                         -k $KEY \
                         --include_mask

In [None]:
# Simple grid visualizer
import matplotlib.pyplot as plt
import os
from math import ceil
valid_image_ext = ['.jpg']

def visualize_images(image_dir, num_cols=4, num_images=10):
    output_path = os.path.join(os.environ['EXPERIMENT_DIR'], image_dir)
    num_rows = int(ceil(float(num_images) / float(num_cols)))
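    # squeeze=False keeps axarr two-dimensional even when the grid has a single row or column.
    f, axarr = plt.subplots(num_rows, num_cols, figsize=[80,30], squeeze=False)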
    f.tight_layout()
    a = [os.path.join(output_path, image) for image in os.listdir(output_path) 
         if os.path.splitext(image)[1].lower() in valid_image_ext]
    for idx, img_path in enumerate(a[:num_images]):
        col_id = idx % num_cols
        row_id = idx // num_cols
        img = plt.imread(img_path)
        axarr[row_id, col_id].imshow(img) 

In [None]:
# Visualizing the sample images.
OUTPUT_PATH = 'maskrcnn_annotated_images/images_annotated' # relative path from $EXPERIMENT_DIR.
COLS = 2 # number of columns in the visualizer grid.
IMAGES = 4 # number of images to visualize.

visualize_images(OUTPUT_PATH, num_cols=COLS, num_images=IMAGES)