# Optical Character Detection using TAO OCDNet-ViT

Transfer learning is the process of transferring learned features from one application to another. It is a commonly used training technique where you use a model trained on one task and re-train to use it on a different task. 

Train Adapt Optimize (TAO) Toolkit is a simple, easy-to-use Python-based AI toolkit for taking purpose-built AI models and customizing them with users' own data.

<img align="center" src="https://d29g4g2dyqv443.cloudfront.net/sites/default/files/akamai/TAO/tlt-tao-toolkit-bring-your-own-model-diagram.png" width="1080">


## Learning Objectives

In this notebook, you will learn how to leverage the simplicity and convenience of TAO to:

* Take a pretrained model and train an OCDNet-ViT model on the ICDAR2015 dataset
* Evaluate the trained model
* Run inference with the trained model and visualize the result
* Export the trained model to an .onnx file for deployment to DeepStream
* Generate TensorRT engine using tao-deploy and verify the engine through evaluation

At the end of this notebook, you will have generated a trained `ocdnet-ViT` model
which you may deploy via [DeepStream](https://developer.nvidia.com/deepstream-sdk).

For more information about OCDNet-ViT, see the [OCDNet-ViT](https://docs.nvidia.com/tao/tao-toolkit/text/object_detection/ocd.html) documentation page.

The following is a sample prediction from an OCDNet-ViT model.
<!--- from img_5.jpg of ICDAR2015 test dataset -->
<img align="center" src="https://github.com/vpraveen-nv/model_card_images/blob/main/cv/notebook/ocdnet/img_5_result.jpg?raw=true" width="640">

## Table of Contents

This notebook shows an example use case of OCDNet-ViT using Train Adapt Optimize (TAO) Toolkit.

0. [Set up env variables and map drives](#head-0)
1. [Installing the TAO launcher](#head-1)
2. [Prepare dataset and pre-trained model](#head-2)
3. [Provide training specification](#head-3)
4. [Run TAO training](#head-4)
5. [Evaluate trained models](#head-5)
6. [Prune trained models](#head-6)
7. [Retrain pruned models](#head-7)
8. [Evaluate retrained models](#head-8)
9. [Inferences](#head-9)
10. [Deploy](#head-10)

## 0. Set up env variables and map drives <a class="anchor" id="head-0"></a>

This notebook requires the user to set an environment variable called `$LOCAL_PROJECT_DIR` to the path of the user's workspace. Please note that the dataset for this notebook is expected to reside in `$LOCAL_PROJECT_DIR/data/ocdnet_vit`, while the collaterals generated by the TAO experiment are written to `$LOCAL_PROJECT_DIR/ocdnet_vit/results`. More information on how to set up the dataset and the supported steps in the TAO workflow is provided in the subsequent cells.

The TAO launcher uses docker containers under the hood, and **for our data and results directory to be visible to the docker, they need to be mapped**. The launcher can be configured using the config file `~/.tao_mounts.json`. Apart from the mounts, you can also configure additional options like the Environment Variables and amount of Shared Memory available to the TAO launcher. <br>

`IMPORTANT NOTE:` The code below creates a sample `~/.tao_mounts.json`  file. Here, we can map directories in which we save the data, specs, results and cache. You should configure it for your specific case so these directories are correctly visible to the docker container.


In [None]:
import os

# Please define this local project directory that needs to be mapped to the TAO docker session.
%env LOCAL_PROJECT_DIR=/path/to/local/tao-experiments

os.environ["HOST_DATA_DIR"] = os.path.join(os.getenv("LOCAL_PROJECT_DIR", os.getcwd()), "data", "ocdnet_vit")
os.environ["HOST_RESULTS_DIR"] = os.path.join(os.getenv("LOCAL_PROJECT_DIR", os.getcwd()), "ocdnet_vit", "results")

# Set this path if you don't run the notebook from the samples directory.
# %env NOTEBOOK_ROOT=~/tao-samples/ocdnet

# The sample spec files are present in the same path as the downloaded samples.
os.environ["HOST_SPECS_DIR"] = os.path.join(
    os.getenv("NOTEBOOK_ROOT", os.getcwd()),
    "specs"
)


In [None]:
! mkdir -p $HOST_DATA_DIR
! mkdir -p $HOST_SPECS_DIR
! mkdir -p $HOST_RESULTS_DIR

In [None]:
# Mapping up the local directories to the TAO docker.
import json
import os
mounts_file = os.path.expanduser("~/.tao_mounts.json")
tlt_configs = {
   "Mounts":[
         # Mapping the Local project directory
        {
            "source": os.environ["LOCAL_PROJECT_DIR"],
            "destination": "/workspace/tao-experiments"
        },
        {
           "source": os.environ["HOST_DATA_DIR"],
           "destination": "/data/ocdnet_vit"
        },
        {
           "source": os.environ["HOST_SPECS_DIR"],
           "destination": "/specs"
        },
        {
           "source": os.environ["HOST_RESULTS_DIR"],
           "destination": "/results"
        }
   ],
   "DockerOptions": {
        "shm_size": "16G",
        "ulimits": {
            "memlock": -1,
            "stack": 67108864
         },
        "user": "{}:{}".format(os.getuid(), os.getgid()),
        "network": "host"
   }
}
# Writing the mounts file.
with open(mounts_file, "w") as mfile:
    json.dump(tlt_configs, mfile, indent=4)

In [None]:
!cat ~/.tao_mounts.json
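
Because the launcher only sees directories that are mapped, it can help to confirm that every `source` path in the mounts file exists on the host before launching a job. The following is a minimal sketch of such a check; the helper names `missing_mount_sources` and `load_mounts` are illustrative and are not part of the TAO launcher.

```python
import json
import os


def missing_mount_sources(config):
    """Return the "source" paths of a mounts config that are not existing directories."""
    return [
        mount["source"]
        for mount in config.get("Mounts", [])
        if not os.path.isdir(os.path.expanduser(mount["source"]))
    ]


def load_mounts(path="~/.tao_mounts.json"):
    """Read the launcher mounts file written by the cell above."""
    with open(os.path.expanduser(path)) as handle:
        return json.load(handle)
```

Calling `missing_mount_sources(load_mounts())` after the cells above should return an empty list if all mapped directories were created.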

## 1. Installing the TAO launcher <a class="anchor" id="head-1"></a>
The TAO launcher is a python package distributed as a python wheel listed in the `nvidia-pyindex` python index. You may install the launcher by executing the following cell.

Please note that TAO Toolkit recommends running the TAO launcher in a virtual environment with a supported Python version (see the software requirements listed below). You may follow the instructions on this [page](https://virtualenvwrapper.readthedocs.io/en/latest/install.html) to set up a Python virtual environment using the `virtualenv` and `virtualenvwrapper` packages. Once you have set up virtualenvwrapper, set the version of Python to be used in the virtual environment through the `VIRTUALENVWRAPPER_PYTHON` variable. You may do so by running

```sh
export VIRTUALENVWRAPPER_PYTHON=/path/to/bin/python3.x
```
where `x` is a minor version within the supported Python range listed below (3.7 to 3.10).

We recommend performing this step first and then launching the notebook from the virtual environment. In addition to installing TAO python package, please make sure of the following software requirements:
* python >=3.7, <=3.10.x
* docker-ce > 19.03.5
* docker-API 1.40
* nvidia-container-toolkit > 1.3.0-1
* nvidia-container-runtime > 3.4.0-1
* nvidia-docker2 > 2.5.0-1
* nvidia-driver > 525.81+
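
As a quick sanity check of the Python requirement above, the interpreter version can be compared against the listed bounds. This is a small sketch that only covers the Python entry of the list; the function name `python_supported` is illustrative.

```python
import sys

# Bounds copied from the requirement list above: python >=3.7, <=3.10.x
MIN_PYTHON = (3, 7)
MAX_PYTHON = (3, 10)


def python_supported(version=None):
    """Return True if the (major, minor) version lies within the listed range."""
    major_minor = tuple((version or sys.version_info)[:2])
    return MIN_PYTHON <= major_minor <= MAX_PYTHON


print("Python", sys.version.split()[0], "supported:", python_supported())
```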

Once you have installed the prerequisites, please log in to the docker registry nvcr.io by running the command below

```sh
docker login nvcr.io
```

You will be prompted to enter a username and password. The username is `$oauthtoken` and the password is the API key generated from `ngc.nvidia.com`. Please follow the instructions in the [NGC setup guide](https://docs.nvidia.com/ngc/ngc-overview/index.html#generating-api-key) to generate your own API key.


In [None]:
# SKIP this step IF you have already installed the TAO launcher.
!pip3 install nvidia-tao

In [None]:
# View the versions of the TAO launcher
!tao info

## 2. Prepare dataset and pre-trained models <a class="anchor" id="head-2"></a>

### 2.1 Prepare dataset

We will be using the `ICDAR2015` dataset for the OCDNet-ViT tutorial. Please visit https://rrc.cvc.uab.es/?ch=4&com=tasks to register and download the data for `Task 4.1: Text Localization`. Save the downloaded zip files to `$HOST_DATA_DIR/`; the cell below unzips them.

After extraction, the data will have the structure below.

```bash
├── train
│   ├── img
│   └── gt
└── test
    ├── img
    └── gt
```

In [None]:
# Create local dir
!mkdir -p $HOST_DATA_DIR/train/img
!mkdir -p $HOST_DATA_DIR/train/gt
!mkdir -p $HOST_DATA_DIR/test/img
!mkdir -p $HOST_DATA_DIR/test/gt
# unzip training data
!unzip $HOST_DATA_DIR/ch4_training_images.zip -d $HOST_DATA_DIR/train/img
!unzip $HOST_DATA_DIR/ch4_training_localization_transcription_gt.zip -d $HOST_DATA_DIR/train/gt
# unzip test data
!unzip $HOST_DATA_DIR/ch4_test_images.zip -d $HOST_DATA_DIR/test/img
!unzip $HOST_DATA_DIR/Challenge4_Test_Task1_GT.zip -d $HOST_DATA_DIR/test/gt

In [None]:
# Verification. Training dataset contains 1000 images. Test dataset contains 500 images.
!ls $HOST_DATA_DIR/train/img  |wc -l
!ls $HOST_DATA_DIR/train/gt  |wc -l
!ls $HOST_DATA_DIR/test/img  |wc -l
!ls $HOST_DATA_DIR/test/gt  |wc -l
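
Beyond counting files, it can be useful to confirm that every image has a matching label file. The sketch below assumes the ICDAR2015 naming convention in which `img_1.jpg` is paired with `gt_img_1.txt`; the prefix and extension are exposed as parameters in case your copy of the data differs. The helper name `unpaired_images` is illustrative.

```python
import os


def unpaired_images(img_dir, gt_dir, gt_prefix="gt_", gt_ext=".txt"):
    """Return the image file names that have no matching ground-truth file."""
    gt_files = set(os.listdir(gt_dir))
    missing = []
    for name in sorted(os.listdir(img_dir)):
        stem, _ = os.path.splitext(name)
        # Assumed pairing: <stem>.<ext> <-> gt_<stem>.txt
        if gt_prefix + stem + gt_ext not in gt_files:
            missing.append(name)
    return missing
```

For example, `unpaired_images(os.path.join(os.environ["HOST_DATA_DIR"], "train", "img"), os.path.join(os.environ["HOST_DATA_DIR"], "train", "gt"))` should return an empty list when the training set is complete.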

__[Optional]__ If your images are high resolution (such as 4000x4000) and you want to crop them to a smaller size for training, you can run the offline crop in the following cell. The offline crop requires you to specify the expected cropped image size (such as 800x800); it generates the cropped images and their labels in the `patch` folder under your dataset path.

In [None]:
# Offline crop for training dataset 
# !python3 offline_crop.py --dataset-path $HOST_DATA_DIR/train \
#                          --patch-height 640 \
#                          --patch-width 640 \
#                          --overlapRate 0.5 \
#                          --has-gt True \
#                          --img-ext jpg \
#                          --visible True
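
To give a feel for how a patch size and an overlap rate translate into crops, the sketch below computes patch start offsets along one image axis. It is an illustration of sliding-window cropping under the assumption that the stride is `patch * (1 - overlap_rate)` and that a final patch is aligned to the image border; it is not the implementation of `offline_crop.py`, whose exact behavior may differ.

```python
def patch_origins(length, patch, overlap_rate):
    """Return start offsets of patches of size `patch` along an axis of size `length`."""
    if not 0 <= overlap_rate < 1:
        raise ValueError("overlap_rate must be in [0, 1)")
    if length <= patch:
        # The image is not larger than one patch: a single crop at the origin.
        return [0]
    stride = max(1, int(patch * (1 - overlap_rate)))
    origins = list(range(0, length - patch + 1, stride))
    # Add a last patch flush with the border if the regular grid stops short of it.
    if origins[-1] + patch < length:
        origins.append(length - patch)
    return origins
```

Under these assumptions, a 4000-pixel axis with 640-pixel patches and an overlap rate of 0.5 yields 12 offsets per axis, so the number of crops grows quickly with image size and overlap.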

### 2.2 Prepare pre-trained models

We will use the NGC CLI to get the pre-trained models. For more details, go to [ngc.nvidia.com](https://ngc.nvidia.com) and click SETUP on the navigation bar.

In [None]:
# Installing NGC CLI on the local machine.
## Download and install
import os
import platform

if platform.machine() == "x86_64":
    os.environ["CLI"]="ngccli_linux.zip"
else:
    os.environ["CLI"]="ngccli_arm64.zip"


# Remove any previously existing CLI installations
!rm -rf $HOST_RESULTS_DIR/ngccli/*
!wget "https://ngc.nvidia.com/downloads/$CLI" -P $HOST_RESULTS_DIR/ngccli
!unzip -u "$HOST_RESULTS_DIR/ngccli/$CLI" -d $HOST_RESULTS_DIR/ngccli/
!rm $HOST_RESULTS_DIR/ngccli/*.zip 
os.environ["PATH"]="{}/ngccli/ngc-cli:{}".format(os.getenv("HOST_RESULTS_DIR", ""), os.getenv("PATH", ""))

In [None]:
!ngc registry model list nvidia/tao/ocdnet:*

In [None]:
!mkdir -p $HOST_RESULTS_DIR/pretrained_ocdnet/

In [None]:
# Pull pretrained model from NGC
!ngc registry model download-version nvidia/tao/ocdnet:trainable_ocdnet_vit_v1.0 --dest $HOST_RESULTS_DIR/pretrained_ocdnet
!ngc registry model download-version nvidia/tao/ocdnet:trainable_ocdnet_vit_v1.1 --dest $HOST_RESULTS_DIR/pretrained_ocdnet
!ngc registry model download-version nvidia/tao/ocdnet:trainable_ocdnet_vit_v1.2 --dest $HOST_RESULTS_DIR/pretrained_ocdnet
!ngc registry model download-version nvidia/tao/ocdnet:trainable_ocdnet_vit_v1.3 --dest $HOST_RESULTS_DIR/pretrained_ocdnet

In [None]:
print("Check the models are downloaded into dir.")
!ls -l $HOST_RESULTS_DIR/pretrained_ocdnet/ocdnet_vtrainable_ocdnet_vit_v1.0
!ls -l $HOST_RESULTS_DIR/pretrained_ocdnet/ocdnet_vtrainable_ocdnet_vit_v1.1
!ls -l $HOST_RESULTS_DIR/pretrained_ocdnet/ocdnet_vtrainable_ocdnet_vit_v1.2
!ls -l $HOST_RESULTS_DIR/pretrained_ocdnet/ocdnet_vtrainable_ocdnet_vit_v1.3

## 3. Provide training specification <a class="anchor" id="head-3"></a>

We provide specification files to configure the training parameters including:

* train: configure the training hyperparameters
    * num_gpus: number of gpus 
    * results_dir: Path to store the training results
    * resume_training_checkpoint_path: Path of the checkpoint to resume training from
    * num_epochs: The total number of training epochs
    * validation_interval: Interval between validation runs
    * checkpoint_interval: Interval between saved checkpoints
    * model_ema: Defaults to False. If set to True, model EMA is enabled during training
    * model_ema_decay: Defaults to 0.999. The decay of the model EMA; only used when model_ema is set to True
    * precision: Defaults to fp32. If set to 'fp16', AMP training is enabled
    * optimizer
        * type: Defaults to Adam.
        * lr: Initial learning rate 
    * lr_scheduler
        * type: Only supports WarmupPolyLR
        * warmup_epoch: The number of epochs used to warm up to the initial learning rate. It should be different from num_epochs.
    * post_processing
        * type: Only supports SegDetectorRepresenter
        * thresh: The threshold for binarization.
        * box_thresh: The threshold for bounding box.
        * unclip_ratio: Defaults to 1.5. A larger ratio produces larger boxes.
    * Metric
        * type: Only supports QuadMetric
        * is_output_polygon: Defaults to false. False for bounding box. True for polygon.
* dataset: configure the dataset and augmentation methods
    * train_dataset:
        * data_path: Path to the training images. For multiple sources, use a list such as ['/path/1', '/path/2']
        * pre_processes
            * size: Random crop size during training. Defaults to [640, 640].
        * loader
            * batch_size: batch size for dataloader
            * num_workers: number of workers to do data loading 
    * validate_dataset: 
        * data_path: Path to the validation images. For multiple sources, use a list such as ['/path/1', '/path/2']
        * pre_processes
            * short_size: Resize to width x height during evaluation. Defaults to [1280, 736].
            * resize_text_polys: Resize the coordinates of the text ground truth. Defaults to true.
        * loader
            * batch_size: batch size for dataloader
            * num_workers: number of workers to do data loading            
* model: configure the model setting
    * backbone: The backbone type. The deformable_resnet18 and deformable_resnet50 are supported; the commands in this notebook pass the FAN backbone `fan_tiny_8_p4_hybrid`.
    * load_pruned_graph: Defaults to False. Must be set to true when training a pruned model.
    * pruned_graph_path: The path to the pruned model graph.
    * pretrained_model_path: Path to a pretrained model to fine-tune from. The `.pth` format is supported.
    * enlarge_feature_map_size: Defaults to False. Set to True to enlarge the output feature map size of the FAN backbone for better accuracy.
    * activation_checkpoint: Defaults to False. Set to True to use activation checkpointing to save GPU memory.

Please refer to the TAO documentation about OCDNet-ViT to get more parameters that are configurable.
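
The nested parameters above map onto the dotted `key=value` overrides used on the command line in later cells (for example `train.num_gpus=2`). The sketch below flattens a nested dictionary into such overrides; the helper name `to_overrides` and the example values are illustrative, and the spec file itself remains the primary place to configure a run.

```python
def to_overrides(spec, prefix=""):
    """Flatten a nested dict into dotted key=value strings."""
    overrides = []
    for key, value in spec.items():
        dotted = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested sections such as train -> num_gpus.
            overrides.extend(to_overrides(value, dotted))
        else:
            overrides.append(f"{dotted}={value}")
    return overrides


# Example values taken from commands used elsewhere in this notebook.
example = {
    "train": {"num_gpus": 2},
    "model": {"backbone": "fan_tiny_8_p4_hybrid"},
}
print(" ".join(to_overrides(example)))
```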


In [None]:
!cat $HOST_SPECS_DIR/train_ocdnet_vit.yaml

## 4. Run TAO training <a class="anchor" id="head-4"></a>
* Provide the sample spec file and the output directory location for models

In [None]:
# NOTE: The following paths are set from the perspective of the TAO Docker.

# The data is saved here
%env DATA_DIR=/data/ocdnet_vit
%env SPECS_DIR=/specs
%env RESULTS_DIR=/results
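
Since commands run inside the container, any host path has to be translated through the mounts configured in section 0. The following sketch shows that translation for a list of `source`/`destination` pairs shaped like the `Mounts` entries of `~/.tao_mounts.json`; the function name `to_container_path` is illustrative and the launcher does not expose such a helper.

```python
import posixpath


def to_container_path(host_path, mounts):
    """Translate a host path to the path seen inside the container."""
    best_source, best_destination = None, None
    for mount in mounts:
        source = mount["source"].rstrip("/")
        inside = host_path == source or host_path.startswith(source + "/")
        # Prefer the most specific (longest) matching mount source.
        if inside and (best_source is None or len(source) > len(best_source)):
            best_source, best_destination = source, mount["destination"]
    if best_source is None:
        raise ValueError(f"{host_path} is not under any mounted directory")
    relative = host_path[len(best_source):].lstrip("/")
    return posixpath.join(best_destination, relative) if relative else best_destination
```

With the mounts from section 0, a file under `$HOST_RESULTS_DIR/train` would resolve to a path under `/results/train`, which is why the commands below use `$RESULTS_DIR` rather than `$HOST_RESULTS_DIR`.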

In [None]:
#print("Run training with ngc pretrained model. ")
#Please note that this training will take a while if using default specs (30 epochs).

!tao model ocdnet train \
          -e $SPECS_DIR/train_ocdnet_vit.yaml \
          results_dir=$RESULTS_DIR/train \
          model.pretrained_model_path=$RESULTS_DIR/pretrained_ocdnet/ocdnet_vtrainable_ocdnet_vit_v1.0/ocdnet_fan_tiny_2x_icdar.pth

In [None]:
#print("Resume training with the checkpoint corresponding to your set epoch.")
#For example, set NUM_EPOCH to the epoch of the checkpoint to resume from, as below.

# %env NUM_EPOCH=000
# !cp "`ls -rlt $HOST_RESULTS_DIR/train/ocd_model_epoch\=${NUM_EPOCH}*.pth |tail -1 |awk -F " " '{print $NF}'`" $HOST_RESULTS_DIR/train/resume.pth

# !tao model ocdnet train \
#          -e $SPECS_DIR/train_ocdnet_vit.yaml \
#          results_dir=$RESULTS_DIR/train \
#          train.resume_training_checkpoint_path=$RESULTS_DIR/train/resume.pth

In [None]:
print("For multi-GPU, change train.num_gpus in train_ocdnet_vit.yaml or set train.num_gpus on the command line based on your machine.")
#For example, run training with 2 GPUs.
# !tao model ocdnet train \
#          -e $SPECS_DIR/train_ocdnet_vit.yaml \
#          results_dir=$RESULTS_DIR/train \
#          model.pretrained_model_path=$RESULTS_DIR/pretrained_ocdnet/ocdnet_vtrainable_ocdnet_vit_v1.0/ocdnet_fan_tiny_2x_icdar.pth \
#          train.num_gpus=2

In [None]:
print('Trained checkpoints:')
print('---------------------')
!ls -ltrh $HOST_RESULTS_DIR/train

In [None]:
# You can set NUM_EPOCH to the epoch corresponding to any saved checkpoint
# %env NUM_EPOCH=029

# Get the name of the checkpoint corresponding to your set epoch
# tmp=!ls $HOST_RESULTS_DIR/train/*.pth | grep epoch_$NUM_EPOCH
# %env CHECKPOINT={tmp[0]}

# Or get the latest checkpoint
os.environ["CHECKPOINT"] = os.path.join(os.getenv("HOST_RESULTS_DIR"), "train/ocd_model_latest.pth")

print('Rename a trained model: ')
print('---------------------')
!cp $CHECKPOINT $HOST_RESULTS_DIR/train/ocd_vit_model.pth
!ls -ltrh $HOST_RESULTS_DIR/train/ocd_vit_model.pth
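
If the `ocd_model_latest.pth` file is not present in your results directory, one alternative is to pick the most recently modified `.pth` file. This is a minimal sketch using only the standard library; the function name `newest_checkpoint` is illustrative, and it assumes that modification time reflects training order.

```python
import glob
import os


def newest_checkpoint(directory, pattern="*.pth"):
    """Return the most recently modified file matching pattern, or None if there is none."""
    candidates = glob.glob(os.path.join(directory, pattern))
    if not candidates:
        return None
    return max(candidates, key=os.path.getmtime)
```

For example, `newest_checkpoint(os.path.join(os.environ["HOST_RESULTS_DIR"], "train"))` returns a host path that can be assigned to `os.environ["CHECKPOINT"]`.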

## 5. Evaluate a trained model <a class="anchor" id="head-5"></a>

In this section, we run the `evaluate` tool to evaluate the trained model and report the evaluation metrics. The output lists the hmean obtained at each binarization threshold, so you can select the threshold that gives the best hmean.
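
The hmean is the harmonic mean of precision and recall. The sketch below shows how a best threshold could be picked from a table of per-threshold results; the helper names are illustrative, and any numbers passed in should come from your own evaluation output.

```python
def hmean(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def best_threshold(results):
    """Return the threshold with the highest hmean.

    `results` maps a binarization threshold to a (precision, recall) pair.
    """
    return max(results, key=lambda thresh: hmean(*results[thresh]))
```

The selected value is what the later cells expect in the `best_thresh` environment variable.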

We provide specification files to configure the parameters including:

* evaluate: configure the evaluation parameters
    * results_dir: Path to store the evaluation results
    * checkpoint: checkpoint path for running evaluation
    * post_processing
        * type: Only supports SegDetectorRepresenter
        * box_thresh: The threshold for bounding box.
        * unclip_ratio: Defaults to 1.5. A larger ratio produces larger boxes.
    * Metric
        * type: Only supports QuadMetric
        * is_output_polygon: Defaults to false. False for bounding box. True for polygon.
* dataset: configure the dataset and augmentation methods
    * validate_dataset: 
        * data_path: Path to the validation images. For multiple sources, use a list such as ['/path/1', '/path/2']
        * pre_processes
            * short_size: Resize to width x height during evaluation. Defaults to [1280, 736].
            * resize_text_polys: Resize the coordinates of the text ground truth. Defaults to true.
        * loader
            * batch_size: batch size for dataloader
            * num_workers: number of workers to do data loading            
* model: configure the model setting
    * backbone: The backbone type. The deformable_resnet18 and deformable_resnet50 are supported; the command below passes the FAN backbone `fan_tiny_8_p4_hybrid`.
    * load_pruned_graph: Whether the model being evaluated has a pruned model graph. Defaults to False.
    * pruned_graph_path: The path to the pruned model graph.
    * enlarge_feature_map_size: Defaults to True. To get better accuracy, we enlarge the output feature map size of FAN backbone.

In [None]:
# Evaluate on model
!tao model ocdnet evaluate \
            -e $SPECS_DIR/evaluate_ocdnet_vit.yaml \
            evaluate.checkpoint=$RESULTS_DIR/train/ocd_vit_model.pth \
            model.backbone=fan_tiny_8_p4_hybrid \
            model.enlarge_feature_map_size=True


## 6. Prune trained models <a class="anchor" id="head-6"></a>

In this section, we run the `prune` tool to get a pruned model.

We provide specification files to configure as following:

* prune: configure the pruning hyperparameters
    * results_dir: Path to store the pruning results
    * checkpoint: checkpoint path for running pruning
    * ch_sparsity: Channel sparsity, also known as the pruning threshold. A higher value gives a smaller model.
    * p: The norm degree, Default: 2. By default, it calculates the group L2-norm for each channel/dim.
    * round_to: Round channels to the nearest multiple of round_to. E.g., round_to=8 means channels will be rounded to 8x. Default: 32.
* dataset: configure the dataset and augmentation methods
    * validate_dataset: 
        * data_path: Path to the validation images. For multiple sources, use a list such as ['/path/1', '/path/2']
        * pre_processes
            * short_size: Resize to width x height during evaluation. Defaults to [1280, 736].
            * resize_text_polys: Resize the coordinates of the text ground truth. Defaults to true.
        * loader
            * batch_size: batch size for dataloader
            * num_workers: number of workers to do data loading            
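
To build intuition for how `ch_sparsity` and `round_to` interact, the sketch below estimates how many channels a layer keeps. It assumes the kept count is `channels * (1 - ch_sparsity)` rounded to the nearest multiple of `round_to`, with at least one multiple kept; the actual pruner may round differently, so treat the result as an approximation. The function name `pruned_channels` is illustrative.

```python
def pruned_channels(channels, ch_sparsity, round_to=32):
    """Estimate the number of channels kept after pruning one layer."""
    kept = channels * (1.0 - ch_sparsity)
    # Round to the nearest multiple of round_to.
    rounded = int(round(kept / round_to)) * round_to
    # Keep at least one multiple, but never more than the original count.
    return min(channels, max(round_to, rounded))
```

Under these assumptions, a 256-channel layer with `ch_sparsity=0.2` keeps 192 channels when `round_to=32` and 208 channels when `round_to=8`, so a coarse `round_to` can prune more or less than the nominal sparsity suggests.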



In [None]:
%env ch_sparsity=0.2

!tao model ocdnet prune -e $SPECS_DIR/prune_ocdnet_vit.yaml \
                prune.checkpoint=$RESULTS_DIR/train/ocd_vit_model.pth \
                prune.ch_sparsity=$ch_sparsity \
                prune.results_dir=$RESULTS_DIR/prune

In [None]:
!ls -lht $HOST_RESULTS_DIR/prune/

## 7. Retrain pruned models <a class="anchor" id="head-7"></a>
* The model needs to be retrained to recover accuracy after pruning
* Set load_pruned_graph to true and set the path of the pruned graph

In [None]:
# Retraining using the pruned graph 
!tao model ocdnet train -e $SPECS_DIR/train_ocdnet_vit.yaml \
                  train.results_dir=$RESULTS_DIR/retrain \
                  model.load_pruned_graph=true \
                  model.pruned_graph_path=$RESULTS_DIR/prune/pruned_$ch_sparsity.pth


In [None]:
# Listing the newly retrained model.
!ls -lht $HOST_RESULTS_DIR/retrain

In [None]:
# You can set NUM_EPOCH to the epoch corresponding to any saved checkpoint
# %env NUM_EPOCH=029

# Get the name of the checkpoint corresponding to your set epoch
# tmp=!ls $HOST_RESULTS_DIR/retrain/*.pth | grep epoch_$NUM_EPOCH
# %env CHECKPOINT={tmp[0]}

# Or get the latest checkpoint
os.environ["CHECKPOINT"] = os.path.join(os.getenv("HOST_RESULTS_DIR"), "retrain/ocd_model_latest.pth")

print('Rename a trained model: ')
print('---------------------')
!cp $CHECKPOINT $HOST_RESULTS_DIR/retrain/ocd_vit_model.pth
!ls -ltrh $HOST_RESULTS_DIR/retrain/ocd_vit_model.pth

## 8. Evaluate retrained model <a class="anchor" id="head-8"></a>
* Need to set load_pruned_graph to true and set the path of the pruned graph.
* Model pruning reduces model parameters to improve inference frames per second (FPS) while maintaining nearly the same hmean.

In [None]:
!tao model ocdnet evaluate -e $SPECS_DIR/evaluate_ocdnet_vit.yaml \
                     evaluate.checkpoint=$RESULTS_DIR/retrain/ocd_vit_model.pth \
                     model.load_pruned_graph=true \
                     model.pruned_graph_path=$RESULTS_DIR/prune/pruned_$ch_sparsity.pth


In [None]:
# According to the evaluation results, set the threshold that gives the best hmean.
%env best_thresh=FIXME

## 9. Visualize Inferences <a class="anchor" id="head-9"></a>
In this section, we run the `inference` tool to generate inferences on the trained models and visualize the results. The `inference` tool produces annotated image outputs and txt files that contain prediction information.

We provide specification files to configure the inference parameters including:

* inference: configure the inference parameters
    * results_dir: Path to store the inference results
    * checkpoint: checkpoint path for running inference
    * input_folder: The input folder for inference
    * width: The width for resizing
    * height: The height for resizing
    * polygon: Produce polygon(true) or bounding box(false). Defaults to false.
    * post_processing
        * type: Only supports SegDetectorRepresenter
        * thresh: The threshold for binarization.
        * box_thresh: The threshold for bounding box.
        * unclip_ratio: Defaults to 1.5. A larger ratio produces larger boxes.
* model: configure the model setting
    * backbone: The backbone type. The deformable_resnet18 and deformable_resnet50 are supported; the command below passes the FAN backbone `fan_tiny_8_p4_hybrid`.
    * load_pruned_graph: Whether the model used for inference has a pruned model graph. Defaults to False.
    * pruned_graph_path: The path to the pruned model graph.
    * enlarge_feature_map_size: Defaults to True. To get better accuracy, we enlarge the output feature map size of FAN backbone.
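
The effect of `unclip_ratio` can be illustrated with the offset formula from the Differentiable Binarization (DB) formulation, where a detected region is expanded by a distance `D = A * r / L` (area `A`, perimeter `L`, unclip ratio `r`). The sketch below assumes the `SegDetectorRepresenter` post-processor follows that formulation, which this notebook does not state explicitly; the helper names are illustrative.

```python
import math


def polygon_area(points):
    """Area of a simple polygon given as a list of (x, y) vertices (shoelace formula)."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_perimeter(points):
    """Sum of the edge lengths of the polygon."""
    return sum(
        math.hypot(x2 - x1, y2 - y1)
        for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1])
    )


def unclip_distance(points, unclip_ratio=1.5):
    """Offset distance D = A * r / L used to expand a shrunk text region."""
    return polygon_area(points) * unclip_ratio / polygon_perimeter(points)
```

Because the distance scales linearly with the ratio, doubling `unclip_ratio` doubles the margin added around each detected region.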


In [None]:
!tao model ocdnet inference \
        -e $SPECS_DIR/inference_ocdnet_vit.yaml \
        inference.checkpoint=$RESULTS_DIR/train/ocd_vit_model.pth \
        inference.input_folder=$DATA_DIR/test/img \
        inference.results_dir=$RESULTS_DIR/infer/ \
        inference.post_processing.args.thresh=$best_thresh \
        model.backbone=fan_tiny_8_p4_hybrid \
        model.enlarge_feature_map_size=True

In [None]:
# Simple grid visualizer
!pip3 install "matplotlib>=3.3.3, <4.0"
import matplotlib.pyplot as plt
import os
from math import ceil
result_image = ['result.jpg']

def visualize_images(output_path, num_cols=4, num_images=10):
    num_rows = int(ceil(float(num_images) / float(num_cols)))
    # squeeze=False keeps axarr 2-D even when the grid has a single row or column.
    f, axarr = plt.subplots(num_rows, num_cols, figsize=[80,30], squeeze=False)
    f.tight_layout()
    # Keep only annotated outputs, i.e. files whose name ends with "_result.jpg".
    a = [os.path.join(output_path, image) for image in sorted(os.listdir(output_path)) 
         if image.split("_")[-1] in result_image]
    for idx, img_path in enumerate(a[:num_images]):
        col_id = idx % num_cols
        row_id = idx // num_cols
        img = plt.imread(img_path)
        axarr[row_id, col_id].imshow(img) 

In [None]:
# Visualizing the sample images.
IMAGE_DIR = os.path.join(os.environ['HOST_RESULTS_DIR'], "infer")
COLS = 2 # number of columns in the visualizer grid.
IMAGES = 4 # number of images to visualize.

visualize_images(IMAGE_DIR, num_cols=COLS, num_images=IMAGES)

## 10. Deploy <a class="anchor" id="head-10"></a>

In [None]:
# Export pth model to ONNX model
!tao model ocdnet export \
           -e $SPECS_DIR/export_ocdnet_vit.yaml \
           export.checkpoint=$RESULTS_DIR/train/ocd_vit_model.pth \
           export.onnx_file=$RESULTS_DIR/export/ocd_vit_model.onnx
           

In [None]:
# Generate TensorRT engine using tao-deploy
!tao deploy ocdnet gen_trt_engine -e $SPECS_DIR/gen_trt_engine_ocdnet_vit.yaml \
                               gen_trt_engine.onnx_file=$RESULTS_DIR/export/ocd_vit_model.onnx \
                               gen_trt_engine.tensorrt.calibration.cal_batch_size=8 \
                               gen_trt_engine.tensorrt.calibration.cal_batches=2 \
                               gen_trt_engine.tensorrt.data_type=int8 \
                               gen_trt_engine.trt_engine=$RESULTS_DIR/export/ocdnet_model.engine
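
With `cal_batch_size=8` and `cal_batches=2`, INT8 calibration would read 16 images in total, assuming each calibration batch is filled with distinct images. The sketch below checks that a calibration image directory holds enough files for a given setting; the helper names and the list of extensions are illustrative.

```python
import os

# Image extensions counted as calibration candidates (an assumption of this sketch).
IMAGE_EXTS = (".jpg", ".jpeg", ".png")


def calibration_images_needed(cal_batch_size, cal_batches):
    """Total number of images consumed by calibration."""
    return cal_batch_size * cal_batches


def has_enough_images(image_dir, cal_batch_size, cal_batches):
    """Return True if image_dir holds at least the number of images calibration needs."""
    available = sum(
        1 for name in os.listdir(image_dir) if name.lower().endswith(IMAGE_EXTS)
    )
    return available >= calibration_images_needed(cal_batch_size, cal_batches)
```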

In [None]:
# Evaluate with generated TensorRT engine
# %env stores the value verbatim, so the value is written without quotes.
%env CUDA_MODULE_LOADING=LAZY
!tao deploy ocdnet evaluate -e $SPECS_DIR/evaluate_ocdnet_vit.yaml \
                             evaluate.trt_engine=$RESULTS_DIR/export/ocdnet_model.engine \
                             evaluate.post_processing.args.thresh=$best_thresh

In [None]:
# Inference with generated TensorRT engine
!tao deploy ocdnet inference -e $SPECS_DIR/inference_ocdnet_vit.yaml \
                              inference.trt_engine=$RESULTS_DIR/export/ocdnet_model.engine \
                              inference.input_folder=$DATA_DIR/test/img \
                              inference.results_dir=$RESULTS_DIR/inference \
                              inference.post_processing.args.thresh=$best_thresh

This notebook has come to an end.