# Classification using TAO Classification PyT

Transfer learning is the process of transferring learned features from one application to another. It is a commonly used training technique in which you take a model trained on one task and retrain it for a different task.

Train Adapt Optimize (TAO) Toolkit is a simple and easy-to-use Python-based AI toolkit for taking purpose-built AI models and customizing them with users' own data.

<img align="center" src="https://d29g4g2dyqv443.cloudfront.net/sites/default/files/akamai/TAO/tlt-tao-toolkit-bring-your-own-model-diagram.png" width="1080">


## What is FAN?

FAN (Fully Attentional Network) is a transformer-based family of backbones from NVIDIA Research that achieves state-of-the-art robustness against various corruptions. Backbones in this family generalize well to new domains and are more robust to noise, blur, and similar corruptions. The key design element of the FAN block is the attentional channel processing module, which leads to robust representation learning. FAN can be used for image classification as well as downstream tasks such as object detection and segmentation.
FAN is useful when a domain gap exists between the training and testing datasets. For example, suppose a computer vision model is trained on high-resolution professional photographs taken in well-lit studio conditions with controlled environments. During testing, the model encounters low-resolution images captured by outdoor surveillance cameras under varying lighting conditions and weather effects. A robust backbone such as FAN can help bridge this gap by improving the model's adaptability to the distinct visual characteristics of the testing dataset.

## What is GCViT?

The model in this instance is an image classification model based on the [GCViT](https://arxiv.org/abs/2206.09959) architecture. Global Context Vision Transformer (GC ViT) enhances parameter and compute utilization for computer vision. It combines global context self-attention modules with standard local self-attention to model both long-range and short-range spatial interactions effectively and efficiently, without the need for expensive operations such as computing attention masks or shifting local windows.

### Sample prediction of Classification PyT model
<img align="center" src="https://github.com/vpraveen-nv/model_card_images/blob/main/cv/notebook/classification_pyt/sample.jpg?raw=true" width="960">

## Learning Objectives

In this notebook, you will learn how to leverage the simplicity and convenience of TAO to:

* Train a `fan_small_12_p4_hybrid` model on the Cats and Dogs dataset.
* Evaluate the trained model.
* Run inference on the trained model.
* Export the trained model to a .onnx file for deployment to DeepStream.

At the end of this notebook, you will have generated a trained and optimized `classification` model
which you may deploy via [Triton](https://github.com/NVIDIA-AI-IOT/tao-toolkit-triton-apps)
or [DeepStream](https://developer.nvidia.com/deepstream-sdk).

## Table of Contents

This notebook shows an example use case of Classification using Train Adapt Optimize (TAO) Toolkit.

0. [Set up env variables and map drives](#head-0)
1. [Installing the TAO launcher](#head-1)
2. [Prepare dataset and pre-trained model](#head-2)
3. [Provide training specification](#head-3)
4. [Run TAO training](#head-4)
5. [Evaluate trained models](#head-5)
6. [Inferences](#head-6)
7. [Deploy](#head-7)


## 0. Set up env variables and map drives <a class="anchor" id="head-0"></a>

When using the purpose-built pretrained models from NGC, please make sure to set the `$KEY` environment variable to the key mentioned in the model overview. Failing to do so can lead to errors when trying to load them as pretrained models.

The TAO launcher uses Docker containers under the hood, and **for our data and results directories to be visible to the container, they need to be mapped**. The launcher can be configured using the config file `~/.tao_mounts.json`. Apart from the mounts, you can also configure additional options such as environment variables and the amount of shared memory available to the TAO launcher. <br>

`IMPORTANT NOTE:` The code below creates a sample `~/.tao_mounts.json` file. Here, we map the directories in which we save the data, specs, results, and cache. You should configure it for your specific case so that these directories are correctly visible to the Docker container.


In [None]:
import os

# Please define this local project directory that needs to be mapped to the TAO docker session.
%env LOCAL_PROJECT_DIR=FIXME

os.environ["HOST_DATA_DIR"] = os.path.join(os.getenv("LOCAL_PROJECT_DIR", os.getcwd()), "data")
os.environ["HOST_RESULTS_DIR"] = os.path.join(os.getenv("LOCAL_PROJECT_DIR", os.getcwd()), "classification_pyt")

# Set this path if you don't run the notebook from the samples directory.
# %env NOTEBOOK_ROOT=/path/to/local/tao-experiments/classification
# The sample spec files are present in the same path as the downloaded samples.
os.environ["HOST_SPECS_DIR"] = os.path.join(
    os.getenv("NOTEBOOK_ROOT", os.getcwd()),
    "specs"
)
# Point to the samples directory that contains the 'deps' folder
# (the dataset preparation step installs $PROJECT_DIR/deps/requirements-pip.txt).
os.environ["PROJECT_DIR"]=FIXME
# Number of GPUs to use in the TAO commands below.
%env NUM_GPUS = 1

In [3]:
! mkdir -p $HOST_DATA_DIR
! mkdir -p $HOST_SPECS_DIR
! mkdir -p $HOST_RESULTS_DIR

In [39]:
# Mapping up the local directories to the TAO docker.
import json
import os
mounts_file = os.path.expanduser("~/.tao_mounts.json")
tao_configs = {
   "Mounts":[
       # Mapping the data directory
       {
           "source": os.environ["LOCAL_PROJECT_DIR"],
           "destination": "/workspace/tao-experiments"
       },
       {
           "source": os.environ["HOST_DATA_DIR"],
           "destination": "/data"
       },
       {
           "source": os.environ["HOST_SPECS_DIR"],
           "destination": "/specs"
       },
       {
           "source": os.environ["HOST_RESULTS_DIR"],
           "destination": "/results"
       },
   ],
   "DockerOptions": {
        "shm_size": "16G",
        "ulimits": {
            "memlock": -1,
            "stack": 67108864
         }
   }
}
# Writing the mounts file.
with open(mounts_file, "w") as mfile:
    json.dump(tao_configs, mfile, indent=4)

In [None]:
!cat ~/.tao_mounts.json

In [None]:
! realpath ~/.tao_mounts.json
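
Because the launcher can only see host directories that are listed under `Mounts`, it can help to confirm that every `source` path exists before running any TAO command. The following is a minimal sketch, not part of the TAO launcher: the helper name `find_missing_mount_sources` is hypothetical, and it assumes the mounts file has the same layout as the one written above.

```python
import json
import os


def find_missing_mount_sources(mounts_file):
    """Return the "source" paths in a mounts file that are not existing directories."""
    with open(os.path.expanduser(mounts_file)) as f:
        config = json.load(f)
    missing = []
    for mount in config.get("Mounts", []):
        source = mount["source"]
        if not os.path.isdir(source):
            missing.append(source)
    return missing
```

Calling `find_missing_mount_sources("~/.tao_mounts.json")` should return an empty list when all mapped host directories are in place.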

## 1. Installing the TAO launcher <a class="anchor" id="head-1"></a>
The TAO launcher is a Python package distributed as a Python wheel on PyPI. You may install the launcher by executing the following cell.

Please note that TAO Toolkit recommends running the TAO launcher in a virtual environment with Python 3.6.9. You may follow the instructions on this [page](https://virtualenvwrapper.readthedocs.io/en/latest/install.html) to set up a Python virtual environment using the `virtualenv` and `virtualenvwrapper` packages. Once you have set up virtualenvwrapper, set the version of Python to be used in the virtual environment through the `VIRTUALENVWRAPPER_PYTHON` variable. You may do so by running:

```sh
export VIRTUALENVWRAPPER_PYTHON=/path/to/bin/python3.x
```

where x >= 6 and <= 8

We recommend performing this step first and then launching the notebook from the virtual environment. In addition to installing the TAO Python package, please make sure the following software requirements are met:
* python >=3.7, <=3.10.x
* docker-ce > 19.03.5
* docker-API 1.40
* nvidia-container-toolkit > 1.3.0-1
* nvidia-container-runtime > 3.4.0-1
* nvidia-docker2 > 2.5.0-1
* nvidia-driver > 455+

Once you have installed the prerequisites, please log in to the Docker registry nvcr.io by running the command below:

```sh
docker login nvcr.io
```

You will be prompted to enter a username and password. The username is `$oauthtoken` and the password is the API key generated from `ngc.nvidia.com`. Please follow the instructions in the [NGC setup guide](https://docs.nvidia.com/ngc/ngc-overview/index.html#generating-api-key) to generate your own API key.


In [7]:
# SKIP this step IF you have already installed the TAO launcher.
# !pip3 install nvidia-tao
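
Before moving on, you may want to confirm that the command-line tools this notebook calls can be found on your `PATH`. The sketch below only checks that the executables exist; it does not verify the version requirements listed above. The helper name `missing_executables` is hypothetical.

```python
import shutil


def missing_executables(names):
    """Return the names that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


# "docker" and "tao" are the commands invoked in this notebook.
print(missing_executables(["docker", "tao"]))
```

An empty list means both commands were found.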

## 2. Prepare dataset and pre-trained model <a class="anchor" id="head-2"></a>

### 2.1 Prepare dataset

We will be using the `Cats and Dogs` dataset for the classification tutorial. Please run the following cells to download and extract it.

In [None]:
!wget https://www.dropbox.com/s/wml49yrtdo53mie/cats_dogs_dataset_reorg.zip?dl=0 -O cats_dogs_dataset.zip
!unzip -qo cats_dogs_dataset.zip -d $HOST_DATA_DIR/

In [None]:
# Install the following dependencies for running the dataset preparation scripts
!pip3 install Cython==0.29.36
!pip3 install -r $PROJECT_DIR/deps/requirements-pip.txt
!pip3 install --upgrade "six>=1.17.0,<2.0"

### A. Verify downloaded dataset <a class="anchor" id="head-1-1"></a>

In [None]:
!ls -l $HOST_DATA_DIR/cats_dogs_dataset

In [None]:
!ls -l $HOST_DATA_DIR
!if [ ! -f $HOST_DATA_DIR/cats_dogs_dataset/classes.txt ]; then echo 'Dataset Not Found, Please Download.'; else echo 'Successfully Found Cats Dogs Dataset.';fi
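
Beyond checking for `classes.txt`, a quick count of image files per directory can reveal an incomplete download or an unexpected layout. This is a minimal sketch that makes no assumption about how the dataset is split into sub-folders: it simply walks the tree and counts files with common image extensions. The helper name `count_images_per_dir` and the extension list are assumptions.

```python
import os

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def count_images_per_dir(root):
    """Count image files under root, keyed by directory path relative to root."""
    counts = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        n = sum(1 for name in filenames if name.lower().endswith(IMAGE_EXTENSIONS))
        if n:
            counts[os.path.relpath(dirpath, root)] = n
    return counts
```

For example, `count_images_per_dir(os.path.join(os.environ["HOST_DATA_DIR"], "cats_dogs_dataset"))` returns one entry per folder that contains images.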

## 3. Provide training specification <a class="anchor" id="head-3"></a>

We provide specification files to configure the training parameters including:

* checkpoint_config: configure the checkpoint setting
    * interval: number of iterations at which checkpoint needs to be saved
* train_config: configure the training hyperparameters
    * optim_config
    * epochs
    * checkpoint_interval
* dataset_config: configure the dataset and augmentation methods
    * train_img_dirs
    * train_ann_dirs
    * pallete: color and mapping class for each class
    * output_shape
    * batch_size
    * workers: number of workers to do data loading
    * clips_per_video: number of clips to be sampled from single video
    * augmentation_config

Please refer to the TAO documentation about Classification to get all the parameters that are configurable.

**Note:** If you are using the Logistic Regression head, the following parameters from the spec file model config should be used:

* model:
  * backbone:
    * freeze: true
    * pretrained: "/path/to/NV_DINOV2_518.pth"
  * head:
    * lr_head:
      * C: 0.316   # tunable
      * max_iter: 5000   # tunable
    * type: LogisticRegressionHead
    * num_classes: 1000
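
The TAO commands in this notebook pass settings as dotted `key=value` arguments, for example `train.num_epochs=$EPOCHS`. As an illustration of how the nested Logistic Regression head settings above relate to that dotted form, the sketch below flattens a Python dict into such strings. The helper name `to_overrides` is hypothetical, and whether every key can be overridden from the command line in this way is an assumption; editing the spec file directly is the documented route.

```python
def to_overrides(config, prefix=""):
    """Flatten a nested dict into a list of dotted "key=value" strings."""
    overrides = []
    for key, value in config.items():
        dotted = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            overrides.extend(to_overrides(value, dotted))
        elif isinstance(value, bool):
            # Write booleans in lowercase, as they appear in YAML.
            overrides.append(f"{dotted}={str(value).lower()}")
        else:
            overrides.append(f"{dotted}={value}")
    return overrides


# The Logistic Regression head settings from the note above.
lr_head_config = {
    "model": {
        "backbone": {"freeze": True, "pretrained": "/path/to/NV_DINOV2_518.pth"},
        "head": {
            "lr_head": {"C": 0.316, "max_iter": 5000},
            "type": "LogisticRegressionHead",
            "num_classes": 1000,
        },
    }
}

for override in to_overrides(lr_head_config):
    print(override)
```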

In [None]:
!cat $HOST_SPECS_DIR/train_cats_dogs.yaml

## 4. Run TAO training <a class="anchor" id="head-4"></a>
* Provide the sample spec file and the output directory location for models.
* WARNING: training may take several hours to a day to complete.

In [None]:
# NOTE: The following paths are set from the perspective of the TAO Docker.

# The data is saved here
%env DATA_DIR = /data
%env SPECS_DIR = /specs
%env RESULTS_DIR = /results
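
Since the TAO commands see container paths such as `/data` while the notebook itself runs on the host, it is sometimes necessary to translate one into the other. The sketch below shows one way to do this with a destination-to-source dictionary. The container destinations come from the mounts file created earlier; the host paths and the helper name `container_to_host` are hypothetical placeholders.

```python
# Container destination -> host source. The host paths are placeholders;
# use the values from your own ~/.tao_mounts.json.
MOUNTS = {
    "/workspace/tao-experiments": "/home/user/tao",
    "/data": "/home/user/tao/data",
    "/specs": "/home/user/tao/specs",
    "/results": "/home/user/tao/classification_pyt",
}


def container_to_host(path, mounts):
    """Translate a path inside the TAO container to the matching host path."""
    # Try the longest destination first so nested mounts resolve correctly.
    for destination in sorted(mounts, key=len, reverse=True):
        root = destination.rstrip("/")
        if path == root or path.startswith(root + "/"):
            return mounts[destination] + path[len(root):]
    raise ValueError(f"{path} is not under any mounted destination")


print(container_to_host("/data/cats_dogs_dataset/classes.txt", MOUNTS))
```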

### A. Download pre-trained model <a class="anchor" id="head-1-4"></a>

We will use the NGC CLI to get the pre-trained models. For more details, go to ngc.nvidia.com and click SETUP in the navigation bar.

In [None]:
# Installing NGC CLI on the local machine.
## Download and install
import os
%env CLI=ngccli_cat_linux.zip
!mkdir -p $HOST_RESULTS_DIR/ngccli

# # Remove any previously existing CLI installations
!rm -rf $HOST_RESULTS_DIR/ngccli/*
!wget "https://ngc.nvidia.com/downloads/$CLI" -P $HOST_RESULTS_DIR/ngccli
!unzip -u "$HOST_RESULTS_DIR/ngccli/$CLI" -d $HOST_RESULTS_DIR/ngccli/
!rm $HOST_RESULTS_DIR/ngccli/*.zip
os.environ["PATH"]="{}/ngccli/ngc-cli:{}".format(os.getenv("HOST_RESULTS_DIR", ""), os.getenv("PATH", ""))

In [None]:
!ngc registry model list nvidia/tao/pretrained_fan_classification_imagenet:*

In [16]:
!mkdir -p $LOCAL_PROJECT_DIR/pretrained_fan_hybrid_small/

In [None]:
# Pull pretrained model from NGC
!ngc registry model download-version nvidia/tao/pretrained_fan_classification_imagenet:fan_hybrid_small --dest $LOCAL_PROJECT_DIR/pretrained_fan_hybrid_small

In [None]:
print("Check that model is downloaded into dir.")
!ls -l $LOCAL_PROJECT_DIR/pretrained_fan_hybrid_small/pretrained_fan_classification_imagenet_vfan_hybrid_small

In [None]:
# A suitable number of epochs for this model with pretrained weights. Please change this value as needed.
%env EPOCHS = 3

print("Train Classification Model")
!tao model classification_pyt train \
                  -e $SPECS_DIR/train_cats_dogs.yaml \
                  results_dir=$RESULTS_DIR/classification_experiment \
                  train.num_gpus=$NUM_GPUS \
                  train.num_epochs=$EPOCHS

In [None]:
print("To resume from a checkpoint, use the below command. Update the epoch number accordingly")
!tao model classification_pyt train \
                  -e $SPECS_DIR/train_cats_dogs.yaml \
                  results_dir=$RESULTS_DIR/classification_experiment \
                  train.num_gpus=$NUM_GPUS \
                  train.resume_training_checkpoint_path=$RESULTS_DIR/classification_experiment/train/classifier_model_latest.pth \
                  train.num_epochs=$EPOCHS

In [None]:
print('PyTorch checkpoints:')
print('---------------------')
!ls -ltrh $HOST_RESULTS_DIR/classification_experiment/train

In [None]:
# You can set NUM_EPOCH to the epoch corresponding to any saved checkpoint
%env NUM_EPOCH=3

In [None]:
print('Rename a model: Note that the training is not deterministic, so you may change the model name accordingly.')
print('---------------------')
# NOTE: The following command may require `sudo`. You can run the command outside the notebook.
!ls -ltrh $HOST_RESULTS_DIR/classification_experiment/train/classifier_model_latest.pth
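
If you prefer to locate a checkpoint from Python rather than reading the `ls` output, the following sketch picks the most recently modified `.pth` file in a directory. It relies only on file modification times, not on any TAO naming scheme, and the helper name `newest_checkpoint` is hypothetical.

```python
from pathlib import Path


def newest_checkpoint(train_dir, pattern="*.pth"):
    """Return the most recently modified checkpoint in train_dir, or None."""
    checkpoints = list(Path(train_dir).glob(pattern))
    if not checkpoints:
        return None
    return max(checkpoints, key=lambda p: p.stat().st_mtime)
```

For example, `newest_checkpoint(os.path.join(os.environ["HOST_RESULTS_DIR"], "classification_experiment", "train"))` returns a host path; remember that the TAO commands expect the corresponding path under `/results` inside the container.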

## 5. Evaluate trained models <a class="anchor" id="head-5"></a>


Evaluate the Cats and Dogs classification model.

In [None]:
!tao model classification_pyt evaluate \
                    -e $SPECS_DIR/test_cats_dogs.yaml \
                    evaluate.checkpoint=$RESULTS_DIR/classification_experiment/train/classifier_model_latest.pth \
                    results_dir=$RESULTS_DIR/classification_experiment

## 6. Inferences <a class="anchor" id="head-6"></a>
In this section, we run the classification inference tool to generate inferences with the trained classification models and print the results. 


In [None]:
!tao model classification_pyt inference \
                    -e $SPECS_DIR/test_cats_dogs.yaml \
                    inference.checkpoint=$RESULTS_DIR/classification_experiment/train/classifier_model_latest.pth \
                    results_dir=$RESULTS_DIR/classification_experiment

In [None]:
# Visualize the results
!cat $HOST_RESULTS_DIR/classification_experiment/inference/result.csv

Visualize the inference results using the images listed in the CSV file. The file contains the following columns: Image Name, class_label, and class_confidence.
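
A short summary per predicted class can be easier to scan than the raw CSV. The sketch below counts predictions and averages the confidence for each label. It assumes each row has exactly the three columns named above and that the file has no header row; the helper name `summarize_predictions` and the sample rows are hypothetical.

```python
import csv
import io
from collections import defaultdict


def summarize_predictions(rows):
    """Return {class_label: (count, mean_confidence)} from result rows."""
    totals = defaultdict(lambda: [0, 0.0])
    for _image_name, class_label, class_confidence in rows:
        totals[class_label][0] += 1
        totals[class_label][1] += float(class_confidence)
    return {label: (count, total / count) for label, (count, total) in totals.items()}


# Made-up rows in the expected three-column layout.
sample = io.StringIO("/data/a.jpg,cat,0.75\n/data/b.jpg,cat,0.5\n/data/c.jpg,dog,1.0\n")
for label, (count, mean_conf) in summarize_predictions(csv.reader(sample)).items():
    print(label, count, round(mean_conf, 3))
```

To use it on real output, open `result.csv` and pass `csv.reader(csv_file)` to `summarize_predictions`.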

In [None]:
# Install Deps
!pip3 install pillow
!pip3 install "matplotlib>=3.3.3, <4.0"

In [None]:
import csv
import os
import random

import matplotlib.pyplot as plt
from PIL import Image

# Host-side and container-side data roots. The image paths in result.csv are
# expected to be container paths, so they are mapped back to the host below.
DATA_DIR = os.environ.get('HOST_DATA_DIR')
DATA_DOWNLOAD_DIR = os.environ.get('DATA_DIR')
RESULT_DIR = os.environ.get('HOST_RESULTS_DIR')
csv_path = os.path.join(RESULT_DIR, "classification_experiment", "inference", "result.csv")

# Collect (image path, predicted label) pairs and shuffle them.
results = []
with open(csv_path) as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    for row in csv_reader:
        results.append((row[0], row[1]))
random.shuffle(results)

# Show up to columns*rows randomly chosen predictions in a grid.
w, h = 200, 200
fig = plt.figure(figsize=(30, 30))
columns = 5
rows = 1
for i, (image_path, label) in enumerate(results[:columns * rows], start=1):
    ax = fig.add_subplot(rows, columns, i)
    img = Image.open(image_path.replace(DATA_DOWNLOAD_DIR, DATA_DIR))
    img = img.resize((w, h))
    plt.imshow(img)
    ax.set_title(label, fontsize=40)

## 7. Deploy! <a class="anchor" id="head-7"></a>

In [None]:
# Export the Classification model to ONNX model
# NOTE: Export is done on single GPU - GPU num need not be provided

!tao model classification_pyt export \
                   -e $SPECS_DIR/export_cats_dogs.yaml \
                   export.checkpoint=$RESULTS_DIR/classification_experiment/train/classifier_model_latest.pth \
                   export.onnx_file=$RESULTS_DIR/classification_experiment/export/classification_model_export.onnx \
                   results_dir=$RESULTS_DIR/classification_experiment/

In [None]:
# Generate a TensorRT Engine using TAO Deploy
!tao deploy classification_pyt gen_trt_engine \
                   -e $SPECS_DIR/export_cats_dogs.yaml \
                   gen_trt_engine.onnx_file=$RESULTS_DIR/classification_experiment/export/classification_model_export.onnx \
                   gen_trt_engine.trt_engine=$RESULTS_DIR/classification_experiment/gen_trt_engine/classification_model_export.engine \
                   results_dir=$RESULTS_DIR/classification_experiment/

In [None]:
# Run evaluation using the generated TensorRT Engine
!tao deploy classification_pyt evaluate \
                   -e $SPECS_DIR/export_cats_dogs.yaml \
                   evaluate.trt_engine=$RESULTS_DIR/classification_experiment/gen_trt_engine/classification_model_export.engine \
                   results_dir=$RESULTS_DIR/classification_experiment/

In [None]:
# Run inference using the generated TensorRT Engine
!tao deploy classification_pyt inference \
                   -e $SPECS_DIR/export_cats_dogs.yaml \
                   inference.trt_engine=$RESULTS_DIR/classification_experiment/gen_trt_engine/classification_model_export.engine \
                   results_dir=$RESULTS_DIR/classification_experiment/

In [None]:
# Visualize the results
!cat $HOST_RESULTS_DIR/classification_experiment/trt_inference/result.csv
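
After exporting to TensorRT, a useful sanity check is to compare the TensorRT predictions with those from the PyTorch checkpoint. The sketch below computes the fraction of images that received the same label in two result sets. It assumes both CSV files share the same layout (image path first, predicted label second) and identify images by the same path; the helper name `prediction_agreement` is hypothetical.

```python
def prediction_agreement(rows_a, rows_b):
    """Fraction of images present in both result sets that share a label."""
    labels_a = {row[0]: row[1] for row in rows_a}
    labels_b = {row[0]: row[1] for row in rows_b}
    shared = labels_a.keys() & labels_b.keys()
    if not shared:
        return None  # Nothing to compare.
    matches = sum(1 for name in shared if labels_a[name] == labels_b[name])
    return matches / len(shared)
```

Pass it the rows read with `csv.reader` from `inference/result.csv` and `trt_inference/result.csv`. A value well below 1.0 would suggest inspecting the export or engine settings.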

In [None]:
# Visualize Inference

import csv
import os
import random

import matplotlib.pyplot as plt
from PIL import Image

# Host-side and container-side data roots. The image paths in result.csv are
# expected to be container paths, so they are mapped back to the host below.
DATA_DIR = os.environ.get('HOST_DATA_DIR')
DATA_DOWNLOAD_DIR = os.environ.get('DATA_DIR')
RESULT_DIR = os.environ.get('HOST_RESULTS_DIR')
csv_path = os.path.join(RESULT_DIR, "classification_experiment", "trt_inference", "result.csv")

# Collect (image path, predicted label) pairs and shuffle them.
results = []
with open(csv_path) as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    for row in csv_reader:
        results.append((row[0], row[1]))
random.shuffle(results)

# Show up to columns*rows randomly chosen predictions in a grid.
w, h = 200, 200
fig = plt.figure(figsize=(30, 30))
columns = 5
rows = 1
for i, (image_path, label) in enumerate(results[:columns * rows], start=1):
    ax = fig.add_subplot(rows, columns, i)
    img = Image.open(image_path.replace(DATA_DOWNLOAD_DIR, DATA_DIR))
    img = img.resize((w, h))
    plt.imshow(img)
    ax.set_title(label, fontsize=40)

This notebook has come to an end.