# Optical Character Detection using TAO OCDNet

Transfer learning is the process of transferring learned features from one application to another. It is a commonly used training technique where you use a model trained on one task and re-train to use it on a different task. 

Train Adapt Optimize (TAO) Toolkit is a simple and easy-to-use Python-based AI toolkit for taking purpose-built AI models and customizing them with users' own data.

<img align="center" src="https://developer.nvidia.com/sites/default/files/akamai/TAO/tlt-tao-toolkit-bring-your-own-model-diagram.png" width="1080">

## Learning Objectives
In this notebook, you will learn how to leverage the simplicity and convenience of TAO to:

* Take a pretrained OCDNet model and fine-tune it on the IAM Handwriting dataset

## Table of Contents

This notebook shows an example use case of OCDNet using Train Adapt Optimize (TAO) Toolkit.

0. [Set up env variables and map drives](#head-0)
1. [Installing the TAO launcher](#head-1)
2. [Prepare dataset and pre-trained model](#head-2) <br>
    2.2 [Prepare pretrained models](#head-2-2) <br>
3. [Provide training specification](#head-3)
4. [Run TAO training](#head-4)
5. [Evaluate](#head-5)
6. [Inference](#head-6)
7. [Deploy](#head-7)

## 0. Set up env variables and map drives <a class="anchor" id="head-0"></a>

When using the purpose-built pretrained models from NGC, please make sure to set the `$KEY` environment variable to the key mentioned in the model overview. Failing to do so can lead to errors when trying to load them as pretrained models.

The TAO launcher uses docker containers under the hood, and **for our data and results directory to be visible to the docker, they need to be mapped**. The launcher can be configured using the config file `~/.tao_mounts.json`. Apart from the mounts, you can also configure additional options like the Environment Variables and amount of Shared Memory available to the TAO launcher. <br>

`IMPORTANT NOTE:` The code below creates a sample `~/.tao_mounts.json`  file. Here, we can map directories in which we save the data, specs, results and cache. You should configure it for your specific case so these directories are correctly visible to the docker container.


In [40]:
import os

# Please define this local project directory that needs to be mapped to the TAO docker session.
%env LOCAL_PROJECT_DIR=/home/pdelafuente/tao/my_apps/ocdnet
%env NOTEBOOK_DIR=/home/pdelafuente/tao/my_apps/ocdnet
os.environ["HOST_DATA_DIR"] = os.path.join(os.getenv("LOCAL_PROJECT_DIR", os.getcwd()), "data", "ocdnet")
os.environ["HOST_RESULTS_DIR"] = os.path.join(os.getenv("LOCAL_PROJECT_DIR", os.getcwd()), "ocdnet", "results")
os.environ["PRE_DATA_DIR"] = os.path.join(os.getenv("LOCAL_PROJECT_DIR", os.getcwd()), "data", "iamdata")


# Set this path if you don't run the notebook from the samples directory.
# %env NOTEBOOK_DIR=~/tao-samples/ocdnet

# The sample spec files are present in the same path as the downloaded samples.
os.environ["HOST_SPECS_DIR"] = os.path.join(
    os.getenv("NOTEBOOK_DIR", os.getcwd()),
    "specs"
)

env: LOCAL_PROJECT_DIR=/home/pdelafuente/tao/my_apps/ocdnet
env: NOTEBOOK_DIR=/home/pdelafuente/tao/my_apps/ocdnet


In [None]:
! mkdir -p $HOST_DATA_DIR
! mkdir -p $HOST_SPECS_DIR
! mkdir -p $HOST_RESULTS_DIR
! mkdir -p $PRE_DATA_DIR

In [10]:
# Mapping up the local directories to the TAO docker.
import json
import os
mounts_file = os.path.expanduser("~/.tao_mounts.json")
tlt_configs = {
   "Mounts":[
       # Mapping the data directory
       {
           "source": os.environ["LOCAL_PROJECT_DIR"],
           "destination": "/workspace/tao/my_apps/ocdnet"
       },
       {
           "source": os.environ["HOST_DATA_DIR"],
           "destination": "/workspace/tao/my_apps/ocdnet/data"
       },
       {
           "source": os.environ["HOST_SPECS_DIR"],
           "destination": "/workspace/tao/my_apps/ocdnet/specs"
       },
       {
           "source": os.environ["HOST_RESULTS_DIR"],
           "destination": "/workspace/tao/my_apps/ocdnet/results"
       },
   ],
   "DockerOptions": {
        "shm_size": "16G",
        "ulimits": {
            "memlock": -1,
            "stack": 67108864
         }
   }
}
# Writing the mounts file.
with open(mounts_file, "w") as mfile:
    json.dump(tlt_configs, mfile, indent=4)

In [11]:
!cat ~/.tao_mounts.json

{
    "Mounts": [
        {
            "source": "/home/pdelafuente/tao/my_apps/ocdnet",
            "destination": "/workspace/tao/my_apps/ocdnet"
        },
        {
            "source": "/home/pdelafuente/tao/my_apps/ocdnet/data/ocdnet",
            "destination": "/workspace/tao/my_apps/ocdnet/data"
        },
        {
            "source": "/home/pdelafuente/tao/my_apps/ocdnet/specs",
            "destination": "/workspace/tao/my_apps/ocdnet/specs"
        },
        {
            "source": "/home/pdelafuente/tao/my_apps/ocdnet/ocdnet/results",
            "destination": "/workspace/tao/my_apps/ocdnet/results"
        }
    ],
    "DockerOptions": {
        "shm_size": "16G",
        "ulimits": {
            "memlock": -1,
            "stack": 67108864
        }
    }
}
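
Because spec files and command-line paths are resolved inside the container, it can help to check how a host path maps through these mounts. The helper below is a minimal sketch and not part of TAO: `host_to_container` is a hypothetical name, the host paths in the example are shortened placeholders, and it assumes that the most specific (longest) matching `source` wins when mounts are nested.

```python
import os

def host_to_container(host_path, mounts):
    """Translate a host path to its container path using the longest matching mount source."""
    host_path = os.path.normpath(host_path)
    best = None
    for mount in mounts:
        source = os.path.normpath(mount["source"])
        # A mount applies if the path is the source itself or lives underneath it.
        if host_path == source or host_path.startswith(source + os.sep):
            if best is None or len(source) > len(best[0]):
                best = (source, mount["destination"])
    if best is None:
        return None  # Not visible inside the container.
    relative = os.path.relpath(host_path, best[0])
    if relative == ".":
        return best[1]
    return best[1].rstrip("/") + "/" + relative.replace(os.sep, "/")


# Two mounts shaped like the ones above (placeholder host paths, for illustration only).
example_mounts = [
    {"source": "/home/user/ocdnet", "destination": "/workspace/tao/my_apps/ocdnet"},
    {"source": "/home/user/ocdnet/specs", "destination": "/workspace/tao/my_apps/ocdnet/specs"},
]
print(host_to_container("/home/user/ocdnet/specs/train.yaml", example_mounts))
```

A return value of `None` means the path is not covered by any mount, so the container would not be able to see it.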

## 1. Installing the TAO launcher <a class="anchor" id="head-1"></a>
The TAO launcher is a python package distributed as a python wheel listed in PyPI. You may install the launcher by executing the following cell.

Please note that TAO Toolkit recommends running the TAO launcher in a virtual env with python 3.6.9. You may follow the instructions on this [page](https://virtualenvwrapper.readthedocs.io/en/latest/install.html) to set up a python virtual env using the `virtualenv` and `virtualenvwrapper` packages. Once you have set up virtualenvwrapper, please set the version of python to be used in the virtual env by using the `VIRTUALENVWRAPPER_PYTHON` variable. You may do so by running

```sh
export VIRTUALENVWRAPPER_PYTHON=/path/to/bin/python3.x
```
where x >= 6 and <= 8

We recommend performing this step first and then launching the notebook from the virtual environment. In addition to installing TAO python package, please make sure of the following software requirements:
* python >=3.6.9 < 3.8.x
* docker-ce > 19.03.5
* docker-API 1.40
* nvidia-container-toolkit > 1.3.0-1
* nvidia-container-runtime > 3.4.0-1
* nvidia-docker2 > 2.5.0-1
* nvidia-driver > 455+

Once you have installed the pre-requisites, please log in to the docker registry nvcr.io by following the command below

```sh
docker login nvcr.io
```

You will be prompted to enter a username and password. The username is `$oauthtoken` and the password is the API key generated from `ngc.nvidia.com`. Please follow the instructions in the [NGC setup guide](https://docs.nvidia.com/ngc/ngc-overview/index.html#generating-api-key) to generate your own API key.


In [None]:
# SKIP this step IF you have already installed the TAO launcher or you plan to run directly from the container
!pip3 install nvidia-tao

In [2]:
# View the versions of the TAO launcher
!tao info

Configuration of the TAO Toolkit Instance
task_group: ['model', 'dataset', 'deploy']
format_version: 3.0
toolkit_version: 5.0.0
published_date: 05/08/2023


## 2. Prepare dataset and pre-trained model <a class="anchor" id="head-2"></a>

We will be using the IAM Handwriting dataset. For more details, please visit
https://fki.tic.heia-fr.ch/databases/iam-handwriting-database. You will need to register for a free account and manually download the required files. Please review the licensing requirements. These files are for research use only.

There are several archives to download: `ascii.tgz`, which holds the metadata, and the document images, which are split across three archives by document title.
Place these files in your data/iamdata folder (`$PRE_DATA_DIR`):

* ascii.tgz
* formsA-D.tgz
* formsE-H.tgz
* formsI-Z.tgz

Please download the IAMDATA (https://rrc.cvc.uab.es/?ch=4&com=downloads) to `$PRE_DATA_DIR`.  
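
Since the archives are downloaded by hand, a quick check before unpacking can save a failed cell later. This is a minimal sketch: `missing_archives` is a hypothetical helper, and the fallback path used when `PRE_DATA_DIR` is unset is an assumption that mirrors the directory layout defined earlier.

```python
import os

# The four archives this notebook expects in the data prep directory.
REQUIRED_ARCHIVES = ["ascii.tgz", "formsA-D.tgz", "formsE-H.tgz", "formsI-Z.tgz"]

def missing_archives(data_dir, required=REQUIRED_ARCHIVES):
    """Return the required archive names that are not present in data_dir."""
    return [name for name in required
            if not os.path.isfile(os.path.join(data_dir, name))]

# Assumed fallback: <current directory>/data/iamdata when PRE_DATA_DIR is not set.
data_dir = os.environ.get("PRE_DATA_DIR", os.path.join(os.getcwd(), "data", "iamdata"))
missing = missing_archives(data_dir)
if missing:
    print("Missing archives:", ", ".join(missing))
else:
    print("All IAM archives found in", data_dir)
```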

In [41]:
# The ascii archive holds several metadata files for the IAM dataset. Run the command in this block to list them.
# We only need words.txt, which contains each detected word and its x, y, w, h bounding box for each image.
!tar -tf $PRE_DATA_DIR/ascii.tgz

./
forms.txt
lines.txt
words.txt
sentences.txt


In [30]:
# Now let's extract just the words.txt file into our data/iamdata directory.
# words.txt contains each detected word and its x, y, w, h bounding box for each image.
!tar -xf $PRE_DATA_DIR/ascii.tgz --directory $PRE_DATA_DIR/ words.txt 

In [36]:
#let's see how many images are in each file. 
!tar -tzf $PRE_DATA_DIR/formsA-D.tgz | wc -l
!tar -tzf $PRE_DATA_DIR/formsE-H.tgz | wc -l
!tar -tzf $PRE_DATA_DIR/formsI-Z.tgz | wc -l

529
522
488


In [38]:
#create directories to hold image data
!mkdir -p $PRE_DATA_DIR/train/img
!mkdir -p $PRE_DATA_DIR/test/img
!mkdir -p $PRE_DATA_DIR/train/gt
!mkdir -p $PRE_DATA_DIR/test/gt

In [39]:
#unpack the images, let's use the first two groups of images for training and the last for validation

!tar -xzf $PRE_DATA_DIR/formsA-D.tgz --directory $PRE_DATA_DIR/train/img
!tar -xzf $PRE_DATA_DIR/formsE-H.tgz --directory $PRE_DATA_DIR/train/img 
!tar -xzf $PRE_DATA_DIR/formsI-Z.tgz --directory $PRE_DATA_DIR/test/img 

In [43]:
# verify
!ls -l $PRE_DATA_DIR/test

total 24
drwxrwxr-x 2 pdelafuente pdelafuente  4096 Jul  3 12:57 gt
drwxrwxr-x 2 pdelafuente pdelafuente 20480 Jul  3 12:59 img


In [56]:
import pandas as pd

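def extract_columns(line):
    """Split one words.txt record into a label filename, four box corners, and the word."""
    parts = line.split()
    word_id = parts[0]
    # Word IDs are 13 or 14 characters long (e.g. a01-000u-00-00);
    # the form ID is the first 7 or 8 characters respectively.
    if len(word_id) == 13:
        filename = 'gt_' + word_id[:word_id.index('-') + 4] + '.txt'
    else:
        filename = 'gt_' + word_id[:word_id.index('-') + 5] + '.txt'

    word = parts[-1]
    # The bounding box is stored as x, y, width, height.
    x, y, w, h = (int(v) for v in parts[3:7])
    # Corners run clockwise from the top-left: (x, y), (x2, y2), (x3, y3), (x4, y4).
    x2, y2 = x + w, y
    x3, y3 = x + w, y + h
    x4, y4 = x, y + h
    return filename, x, y, x2, y2, x3, y3, x4, y4, word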

# Tokens that are punctuation rather than words; records with these are skipped.
SKIP_WORDS = {', ,', '. .', '', '*- -', '; ;', '.', ','}

def process_text_file(file_path):
    data = []
    with open(file_path, 'r') as file:
        # Skip the metadata header at the top of words.txt (this slice drops the first 19 lines).
        lines = file.readlines()[19:]
        for line in lines:
            filename, x, y, x2, y2, x3, y3, x4, y4, word = extract_columns(line.strip())
            if word not in SKIP_WORDS:
                data.append([filename, x, y, x2, y2, x3, y3, x4, y4, word])
    # Create a DataFrame from the extracted data
    columns = ['filename', 'x', 'y', 'x2', 'y2', 'x3', 'y3', 'x4', 'y4', 'word']
    df = pd.DataFrame(data, columns=columns)
    return df

# Example usage:
pfile = os.environ["PRE_DATA_DIR"] + '/words.txt'
df = process_text_file(pfile)
        
df.head()

Unnamed: 0,filename,x,y,x2,y2,x3,y3,x4,y4,word
0,gt_a01-000u.txt,507,766,720,766,720,814,507,814,MOVE
1,gt_a01-000u.txt,796,764,866,764,866,814,796,814,to
2,gt_a01-000u.txt,919,757,1085,757,1085,835,919,835,stop
3,gt_a01-000u.txt,1185,754,1311,754,1311,815,1185,815,Mr.
4,gt_a01-000u.txt,1438,746,1820,746,1820,819,1438,819,Gaitskell
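
To make the corner arithmetic concrete, here is a worked example on a single record. The sample line is reconstructed to reproduce row 0 of the output above: the x, y, w, h values are chosen to match it, while the word index, status, gray-level, and tag fields are placeholders. `record_to_label` is a hypothetical helper; it derives the form ID by splitting on hyphens, which is an alternative to the length check used above rather than the notebook's own method.

```python
def record_to_label(line):
    """Convert one words.txt-style record into (label filename, corner list, word)."""
    parts = line.split()
    # Form ID = first two hyphen-separated fields of the word ID, e.g. a01-000u.
    form_id = "-".join(parts[0].split("-")[:2])
    x, y, w, h = (int(v) for v in parts[3:7])
    # Corners clockwise from the top-left: (x, y), (x+w, y), (x+w, y+h), (x, y+h).
    corners = [x, y, x + w, y, x + w, y + h, x, y + h]
    return "gt_" + form_id + ".txt", corners, parts[-1]

# Placeholder record: only the box (507 766 213 48) and the word are taken from the output above.
sample = "a01-000u-00-01 ok 154 507 766 213 48 NN MOVE"
filename, corners, word = record_to_label(sample)
print(filename, corners, word)
# gt_a01-000u.txt [507, 766, 720, 766, 720, 814, 507, 814] MOVE
```

The width 213 and height 48 give x2 = 507 + 213 = 720 and y3 = 766 + 48 = 814, matching the first row of the dataframe.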


The next cell groups the dataframe from the previous step by filename and writes one txt file per image. Each row holds the eight corner coordinates (four points) followed by the word.


In [76]:
# Write out the label file for each image and save it to the matching test or train gt folder.
groups = df.groupby('filename')
for filename, group in groups:
    pdf = group[['x', 'y', 'x2', 'y2', 'x3', 'y3', 'x4', 'y4', 'word']]
    # filename looks like gt_a01-000u.txt, so filename[3] is the first letter of the form ID.
    if filename[3] in ['j', 'k', 'l', 'm', 'n', 'p', 'r']:
        tefile = os.path.join(os.environ["PRE_DATA_DIR"], 'test', 'gt', filename)
        pdf.to_csv(tefile, index=False, header=False)
    else:
        trfile = os.path.join(os.environ["PRE_DATA_DIR"], 'train', 'gt', filename)
        pdf.to_csv(trfile, index=False, header=False)
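
The images are split by archive while the labels are split by the first letter of the form ID, so it is worth confirming that the two splits agree. The check below is a minimal sketch: `unmatched_pairs` is a hypothetical helper, and it assumes the images sit directly in each `img` folder and that a label named `gt_<name>.txt` belongs to the image whose file name (without extension) is `<name>`.

```python
import os

def unmatched_pairs(img_dir, gt_dir):
    """Return (image names without a label file, label names without an image)."""
    image_stems = {os.path.splitext(name)[0] for name in os.listdir(img_dir)}
    label_stems = {
        os.path.splitext(name)[0][len("gt_"):]
        for name in os.listdir(gt_dir)
        if name.startswith("gt_")
    }
    return sorted(image_stems - label_stems), sorted(label_stems - image_stems)

pre_data_dir = os.environ.get("PRE_DATA_DIR")
if pre_data_dir:
    for split in ("train", "test"):
        img_dir = os.path.join(pre_data_dir, split, "img")
        gt_dir = os.path.join(pre_data_dir, split, "gt")
        if os.path.isdir(img_dir) and os.path.isdir(gt_dir):
            no_label, no_image = unmatched_pairs(img_dir, gt_dir)
            print(split, "images without labels:", len(no_label),
                  "| labels without images:", len(no_image))
```

Non-zero counts would suggest that some forms landed in different splits for images and labels, or that the archives unpack into subfolders.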

In [79]:
# Let's copy our data from the data prep directory to the host data directory.
# You may wish to remove the original archives and images in the data prep directory once the copy is done.
!cp -a $PRE_DATA_DIR/train/. $HOST_DATA_DIR/train/
!cp -a $PRE_DATA_DIR/test/. $HOST_DATA_DIR/test/

Next we download a pretrained model

## 2.2 Prepare Pretrained Models <a class="anchor" id="head-2-2"></a>

In [None]:
#Let's download a pretrained model
!pip install gdown
#Download deformable_resnet18 pretrained model
!gdown https://drive.google.com/uc?id=16U4Smk6k3cFxxU8NhXC-VQmMQ220VgYH -O $HOST_DATA_DIR/ocdnet_deformable_resnet18.pth
#Download deformable_resnet50 pretrained model
!gdown https://drive.google.com/uc?id=1qkv6pDYopwlrLb9uAtI8n0gyVX0kn3GQ -O $HOST_DATA_DIR/ocdnet_deformable_resnet50.pth

## 3. Provide training specification <a class="anchor" id="head-3"></a>

We provide specification files to configure the training parameters including:

* num_gpus: number of gpus 
* train: configure the training hyperparameters
    * results_dir: Path to store training results
    * resume_training_checkpoint_path: Resume training from a checkpoint.
    * num_epochs: The total epochs for training
    * validation_interval: validation interval
    * checkpoint_interval: checkpoint interval
    * optimizer
        * type: Defaults to Adam.
        * lr: Initial learning rate 
    * lr_scheduler
        * type: Only supports WarmupPolyLR
        * warmup_epoch: The number of epochs used to warm up to the initial learning rate. It should be different from num_epochs.
    * post_processing
        * type: Only supports SegDetectorRepresenter
        * thresh: The threshold for binarization.
        * box_thresh: The threshold for bounding box.
        * unclip_ratio: Defaults to 1.5. A larger ratio produces larger boxes.
    * Metric
        * type: Only supports QuadMetric
        * is_output_polygon: Defaults to false. False for bounding box. True for polygon.
* dataset: configure the dataset and augmentation methods
    * train_dataset:
        * data_path: Path to train images. For multiple sources, use a list such as ['/path/1', '/path/2']
        * pre_processes
            * size: Random crop size during training. Defaults to [640, 640].
        * loader
            * batch_size: batch size for dataloader
            * num_workers: number of workers to do data loading 
    * validate_dataset: 
        * data_path: Path to validation images. For multiple sources, use a list such as ['/path/1', '/path/2']
        * pre_processes
            * short_size: Resize to width x height during evaluation. Defaults to [1280, 736].
            * resize_text_polys: Resize the coordinates of the text ground truth. Defaults to true.
        * loader
            * batch_size: batch size for dataloader
            * num_workers: number of workers to do data loading            
* model: configure the model setting
    * backbone: The backbone type. The deformable_resnet18 and deformable_resnet50 are supported.
    * load_pruned_graph: Defaults to False. Must be set to true when training a pruned model.
    * pruned_graph_path: The path to the pruned model graph.
    * pretrained_model_path: Finetune from a pretrained model. The `.pth` model is supported.

Please refer to the TAO documentation about OCDNet to get more parameters that are configurable.
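
A few of the constraints listed above can be checked before launching a long training run. The sketch below is not a TAO API: `check_train_spec` is a hypothetical helper, the dictionary nesting simply follows the bullet list above (the real `train.yaml` may nest some keys differently), and the epoch values are illustrative only.

```python
SUPPORTED_BACKBONES = {"deformable_resnet18", "deformable_resnet50"}

def check_train_spec(spec):
    """Return a list of human-readable problems found in a train spec dictionary."""
    problems = []
    train = spec.get("train", {})
    model = spec.get("model", {})

    # warmup_epoch should be different from num_epochs.
    warmup = train.get("lr_scheduler", {}).get("warmup_epoch")
    if warmup is not None and warmup == train.get("num_epochs"):
        problems.append("warmup_epoch should differ from num_epochs")

    # Only the two deformable ResNet backbones are listed as supported.
    if model.get("backbone") not in SUPPORTED_BACKBONES:
        problems.append("backbone must be one of " + ", ".join(sorted(SUPPORTED_BACKBONES)))

    # A pruned model needs the path to its pruned graph.
    if model.get("load_pruned_graph") and not model.get("pruned_graph_path"):
        problems.append("load_pruned_graph is true but pruned_graph_path is empty")
    return problems

# Illustrative values; in practice the dictionary would be loaded from train.yaml.
example_spec = {
    "train": {"num_epochs": 30, "lr_scheduler": {"type": "WarmupPolyLR", "warmup_epoch": 3}},
    "model": {"backbone": "deformable_resnet18", "load_pruned_graph": False},
}
print(check_train_spec(example_spec))  # []
```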



## 4. Run TAO training <a class="anchor" id="head-4"></a>
* Provide the sample spec file and the output directory location for models

Now we train the pretrained OCDNet model on the IAM dataset.

If you opt to run directly from the container instead of the TAO launcher, you need a machine with a GPU, Docker, and NVIDIA drivers.
These are the two TAO containers we are using in this notebook:
1. nvcr.io/ea-tlt/tao_ea/tao-toolkit:5.0.0-ea-pyt
2. nvcr.io/ea-tlt/tao_ea/tao-toolkit:5.0.0-ea-deploy

Container #1 is used to train, evaluate, and run inference on TAO models.
Container #2 is used to export models and generate TensorRT engines for deployment.

Double-check the files in the specs directory that are passed to the commands. Below we use train.yaml and evaluate.yaml. Make sure the directories in those files that point to the results and datasets are accurate.

In [None]:
# NOTE: The following paths are set from the perspective of the TAO Docker.

# The data is saved here
%env DATA_DIR=/workspace/tao/my_apps/ocdnet/data/ocdnet
%env SPECS_DIR=/workspace/tao/my_apps/ocdnet/specs
%env RESULTS_DIR=/workspace/tao/my_apps/ocdnet/results

In [None]:
#Train using TAO Launcher
#print("Run training with ngc pretrained model.")
!tao model ocdnet train \
          -e $SPECS_DIR/train.yaml \
          -r $RESULTS_DIR/train -k $KEY \
          model.pretrained_model_path=$DATA_DIR/ocdnet_deformable_resnet18.pth

In [None]:
# FYI: only needed if you use direct container calls. Do not run both the previous block and this one; pick one option.
# Train using a direct call to the docker container.

# Docker command to train the model with the labeled dataset.
# Better to run from a terminal; make sure to mount the project directory.
!docker run -it --rm --gpus all -v /home/pdelafuente/tao:/workspace/tao nvcr.io/ea-tlt/tao_ea/tao-toolkit:5.0.0-ea-pyt \
ocdnet train -e $SPECS_DIR/train.yaml -r $RESULTS_DIR/train -k $KEY \
model.pretrained_model_path=$DATA_DIR/ocdnet_deformable_resnet18.pth

## 5. Evaluate a trained model <a class="anchor" id="head-5"></a>

In this section, we run the `evaluate` tool to evaluate the trained model and produce the evaluation metric.

We provide specification files to configure the parameters including:

* evaluate: configure the evaluation parameters
    * results_dir: Path to store evaluation results
    * checkpoint: checkpoint path for running evaluation
    * post_processing
        * type: Only supports SegDetectorRepresenter
        * thresh: The threshold for binarization.
        * box_thresh: The threshold for bounding box.
        * unclip_ratio: Defaults to 1.5. A larger ratio produces larger boxes.
    * Metric
        * type: Only supports QuadMetric
        * is_output_polygon: Defaults to false. False for bounding box. True for polygon.
* dataset: configure the dataset and augmentation methods
    * validate_dataset: 
        * data_path: Path to validation images. For multiple sources, use a list such as ['/path/1', '/path/2']
        * pre_processes
            * short_size: Resize to width x height during evaluation. Defaults to [1280, 736].
            * resize_text_polys: Resize the coordinates of the text ground truth. Defaults to true.
        * loader
            * batch_size: batch size for dataloader
            * num_workers: number of workers to do data loading            
* model: configure the model setting
    * backbone: The backbone type. The deformable_resnet18 and deformable_resnet50 are supported.
    * load_pruned_graph: Whether the model being evaluated has a pruned model graph. Defaults to False.
    * pruned_graph_path: The path to the pruned model graph.

In [None]:
# Evaluate on model
!tao model ocdnet evaluate \
            -e $SPECS_DIR/evaluate.yaml \
            evaluate.checkpoint=$RESULTS_DIR/train/model_best.pth

In [None]:
# FYI: only needed if you use direct container calls. Do not run both the previous block and this one; pick one option.
#evaluate trained model
!docker run -it --rm --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 -v /home/pdelafuente/tao:/workspace/tao --gpus all nvcr.io/ea-tlt/tao_ea/tao-toolkit:5.0.0-ea-pyt \
ocdnet evaluate -e $SPECS_DIR/evaluate.yaml \
evaluate.checkpoint=$RESULTS_DIR/train/model_best.pth

## 6. Visualize Inferences <a class="anchor" id="head-6"></a>
In this section, we run the `inference` tool to generate inferences on the trained models and visualize the results. The `inference` tool produces annotated image outputs and txt files that contain prediction information.

We provide specification files to configure the inference parameters including:

* inference: configure the inference parameters
    * results_dir: Path to store inference results
    * checkpoint: checkpoint path for running inference
    * input_folder: The input folder for inference
    * width: The width for resizing
    * height: The height for resizing
    * polygon: Produce polygon(true) or bounding box(false). Defaults to false.
    * post_processing
        * type: Only supports SegDetectorRepresenter
        * thresh: The threshold for binarization.
        * box_thresh: The threshold for bounding box.
        * unclip_ratio: Defaults to 1.5. A larger ratio produces larger boxes.
* model: configure the model setting
    * backbone: The backbone type. The deformable_resnet18 and deformable_resnet50 are supported.
    * load_pruned_graph: Whether the model used for inference has a pruned model graph. Defaults to False.
    * pruned_graph_path: The path to the pruned model graph.


In [None]:
#run inference using TAO
!tao model ocdnet inference \
           -e $SPECS_DIR/inference.yaml \
           inference.checkpoint=$RESULTS_DIR/train/model_best.pth \
           inference.input_folder=$DATA_DIR/test/img \
           inference.results_dir=$RESULTS_DIR/inference

In [None]:
# FYI: only needed if you use direct container calls. Do not run both the previous block and this one; pick one option.
#run inference using direct TAO container call
!docker run -it --rm --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 -v /home/pdelafuente/tao:/workspace/tao --gpus all nvcr.io/ea-tlt/tao_ea/tao-toolkit:5.0.0-ea-pyt \
ocdnet inference -e /workspace/tao/my_apps/ocdnet/specs/inference.yaml \
inference.checkpoint=$RESULTS_DIR/train/model_best.pth \
inference.input_folder=$DATA_DIR/test/img \
inference.results_dir=$RESULTS_DIR/inference 

## 7. Deploy <a class="anchor" id="head-7"></a>

In [None]:
# Export pth model to ONNX model
!tao model ocdnet export \
           -e $SPECS_DIR/export.yaml \
           export.checkpoint=$RESULTS_DIR/train/model_best.pth \
           export.onnx_file=$RESULTS_DIR/export/model_best.onnx

In [1]:
# FYI: only needed if you use direct container calls. Do not run both the previous block and this one; pick one option.
#export model to onnx format
!docker run -it --rm --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 -v /home/pdelafuente/tao:/workspace/tao --gpus all nvcr.io/ea-tlt/tao_ea/tao-toolkit:5.0.0-ea-pyt \
ocdnet export -e $SPECS_DIR/export.yaml \
export.checkpoint=$RESULTS_DIR/train/model_best.pth \
export.onnx_file=$RESULTS_DIR/export/model_best.onnx


=== TAO Toolkit PyTorch ===

NVIDIA Release 4.0.0-PyTorch (build 53420872)
TAO Toolkit Version 4.0.0

Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES.  All rights reserved.

This container image and its contents are governed by the TAO Toolkit End User License Agreement.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/tao-toolkit-software-license-agreement

[2023-06-27 20:36:41,152 - TAO Toolkit - torch.distributed.nn.jit.instantiator - INFO] Created a temporary directory at /tmp/tmp56q0zajx
[2023-06-27 20:36:41,152 - TAO Toolkit - torch.distributed.nn.jit.instantiator - INFO] Writing /tmp/tmp56q0zajx/_remote_module_non_scriptable.py
[2023-06-27 20:36:41,613 - TAO Toolkit - matplotlib.font_manager - INFO] generated new fontManager
'export.yaml' is validated against ConfigStore schema with the same name.
This behavior is deprecated in Hydra 1.1 and will be removed in Hydra 1.2.
See https://hyd

In [None]:
# Generate TensorRT engine using tao-deploy
!tao deploy ocdnet gen_trt_engine -e $SPECS_DIR/gen_trt_engine.yaml \
                               gen_trt_engine.onnx_file=$RESULTS_DIR/export/model_best.onnx \
                               gen_trt_engine.tensorrt.min_batch_size=1 \
                               gen_trt_engine.tensorrt.opt_batch_size=3 \
                               gen_trt_engine.tensorrt.max_batch_size=3 \
                               gen_trt_engine.tensorrt.data_type=int8 \
                               gen_trt_engine.trt_engine=$RESULTS_DIR/export/ocdnet_model.engine

In [7]:
# FYI: only needed if you use direct container calls. Do not run both the previous block and this one; pick one option.
# Generate TensorRT engine using tao-deploy, direct container call
!docker run -it --rm --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 -v /home/pdelafuente/tao:/workspace/tao --gpus all nvcr.io/ea-tlt/tao_ea/tao-toolkit:5.0.0-ea-deploy \
ocdnet gen_trt_engine -e $SPECS_DIR/gen_trt_engine.yaml \
gen_trt_engine.onnx_file=$RESULTS_DIR/export/model_best.onnx \
                               gen_trt_engine.tensorrt.min_batch_size=1 \
                               gen_trt_engine.tensorrt.opt_batch_size=3 \
                               gen_trt_engine.tensorrt.max_batch_size=3 \
                               gen_trt_engine.tensorrt.data_type=int8 \
                               gen_trt_engine.trt_engine=$RESULTS_DIR/export/ocdnet_model.engine


=== TAO Toolkit Deploy ===

NVIDIA Release 4.0.0-Deploy (build 52693241)
TAO Toolkit Version 4.0.0

Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES.  All rights reserved.

This container image and its contents are governed by the TAO Toolkit End User License Agreement.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/tao-toolkit-software-license-agreement

2023-06-27 20:49:06,836 [INFO] matplotlib.font_manager: generated new fontManager
python /usr/local/lib/python3.8/dist-packages/nvidia_tao_deploy/cv/ocdnet/scripts/gen_trt_engine.py  --config-path /workspace/tao/my_apps/ocdnet/specs --config-name gen_trt_engine.yaml gen_trt_engine.onnx_file=/workspace/tao/my_apps/ocdnet/results/export/model_bestocdnet.onnx gen_trt_engine.tensorrt.min_batch_size=1 gen_trt_engine.tensorrt.opt_batch_size=3 gen_trt_engine.tensorrt.max_batch_size=3 gen_trt_engine.tensorrt.data_type=int8 gen_trt_engine.trt_engine=

In [None]:
# Evaluate with generated TensorRT engine
%env CUDA_MODULE_LOADING="LAZY"
!tao deploy ocdnet evaluate -e $SPECS_DIR/evaluate.yaml \
                             evaluate.trt_engine=$RESULTS_DIR/export/ocdnet_model.engine
                             

In [12]:
# FYI: only needed if you use direct container calls. Do not run both the previous block and this one; pick one option.
# Evaluate with generated TensorRT engine - direct docker call
%env CUDA_MODULE_LOADING="LAZY"
!docker run -it --rm --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 -v /home/pdelafuente/tao:/workspace/tao --gpus all nvcr.io/ea-tlt/tao_ea/tao-toolkit:5.0.0-ea-pyt \
ocdnet evaluate -e $SPECS_DIR/evaluate.yaml \
                             evaluate.trt_engine=$RESULTS_DIR/export/ocdnet_model.engine

env: CUDA_MODULE_LOADING="LAZY"

=== TAO Toolkit PyTorch ===

NVIDIA Release 4.0.0-PyTorch (build 53420872)
TAO Toolkit Version 4.0.0

Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES.  All rights reserved.

This container image and its contents are governed by the TAO Toolkit End User License Agreement.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/tao-toolkit-software-license-agreement

[2023-06-27 21:07:28,694 - TAO Toolkit - torch.distributed.nn.jit.instantiator - INFO] Created a temporary directory at /tmp/tmpkc_smjc6
[2023-06-27 21:07:28,694 - TAO Toolkit - torch.distributed.nn.jit.instantiator - INFO] Writing /tmp/tmpkc_smjc6/_remote_module_non_scriptable.py
[2023-06-27 21:07:29,152 - TAO Toolkit - matplotlib.font_manager - INFO] generated new fontManager
'evaluate.yaml' is validated against ConfigStore schema with the same name.
This behavior is deprecated in Hydra 1.1 and will be rem

In [None]:
# Inference with generated TensorRT engine
!tao deploy ocdnet inference -e $SPECS_DIR/inference.yaml \
                              inference.trt_engine=$RESULTS_DIR/export/ocdnet_model.engine \
                              inference.input_folder=$DATA_DIR/test/img \
                              inference.results_dir=$RESULTS_DIR/inference

In [15]:
# FYI: only needed if you use direct container calls. Do not run both the previous block and this one; pick one option.
# Inference with generated TensorRT engine - direct container call
!docker run -it --rm --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 -v /home/pdelafuente/tao:/workspace/tao --gpus all nvcr.io/ea-tlt/tao_ea/tao-toolkit:5.0.0-ea-pyt \
ocdnet inference -e $SPECS_DIR/inference.yaml \
inference.trt_engine=$RESULTS_DIR/export/ocdnet_model.engine \
                              inference.input_folder=$DATA_DIR/test/img \
                              inference.results_dir=$RESULTS_DIR/inference


=== TAO Toolkit PyTorch ===

NVIDIA Release 4.0.0-PyTorch (build 53420872)
TAO Toolkit Version 4.0.0

Various files include modifications (c) NVIDIA CORPORATION & AFFILIATES.  All rights reserved.

This container image and its contents are governed by the TAO Toolkit End User License Agreement.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/tao-toolkit-software-license-agreement

[2023-06-27 21:15:30,624 - TAO Toolkit - torch.distributed.nn.jit.instantiator - INFO] Created a temporary directory at /tmp/tmp1i1qpz0r
[2023-06-27 21:15:30,625 - TAO Toolkit - torch.distributed.nn.jit.instantiator - INFO] Writing /tmp/tmp1i1qpz0r/_remote_module_non_scriptable.py
[2023-06-27 21:15:31,084 - TAO Toolkit - matplotlib.font_manager - INFO] generated new fontManager
'inference.yaml' is validated against ConfigStore schema with the same name.
This behavior is deprecated in Hydra 1.1 and will be removed in Hydra 1.2.
See https://

This completes the OCDNet portion. Proceed to the OCRNet notebook to learn how to train the OCR model for handwriting recognition.

The OCDNet and OCRNet models are used together: OCDNet identifies the areas of a document that contain handwritten content, and OCRNet then identifies what is written there.