# Multi-task classification using TAO

Transfer learning is the process of transferring learned features from one application to another. It is a commonly used training technique in which a model trained on one task is retrained for use on a different task.

Train Adapt Optimize (TAO) Toolkit is a simple, easy-to-use Python-based AI toolkit for taking purpose-built AI models and customizing them with your own data.

<img align="center" src="https://developer.nvidia.com/sites/default/files/akamai/embedded-transfer-learning-toolkit-software-stack-1200x670px.png" width="1080">

## Learning Objectives
In this notebook, you will learn how to leverage the simplicity and convenience of TAO to:

* Take a pretrained ResNet-10 model and train a ResNet-10 multi-task classification model on a fashion dataset
* Prune the trained model
* Retrain the pruned model to recover lost accuracy
* Run inference on the trained model
* Export the pruned and retrained model to a .etlt file for deployment to DeepStream

### Table of Contents
This notebook shows an example use case for multi-task classification using the Train Adapt Optimize (TAO) Toolkit.

0. [Set up env variables and map drives](#head-0)
1. [Installing the TAO Launcher](#head-1)
2. [Prepare dataset and pre-trained model](#head-2) <br>
     2.1 [Download the dataset](#head-2-1)<br>
     2.2 [Verify the downloaded dataset](#head-2-2)<br>
     2.3 [Data preprocessing](#head-2-3)<br>
     2.4 [Download pretrained model](#head-2-4)
3. [Provide training specification](#head-3)
4. [Run TAO training](#head-4)
5. [Evaluate trained models](#head-5)
6. [Prune trained models](#head-6)
7. [Retrain pruned models](#head-7)
8. [Testing the model](#head-8)
9. [Inferences](#head-9)
10. [Export and Deploy!](#head-10)

## 0. Set up env variables and map drives <a class="anchor" id="head-0"></a>
When using the purpose-built pretrained models from NGC, please make sure to set the `$KEY` environment variable to the key mentioned in the model overview. Failing to do so can lead to errors when trying to load them as pretrained models.

This notebook requires the user to set an env variable called `$LOCAL_PROJECT_DIR` as the path to the user's workspace. Please note that the dataset used by this notebook is expected to reside in `$LOCAL_PROJECT_DIR/data`, while the collateral generated by the TAO experiment will be output to `$LOCAL_PROJECT_DIR/multitask_classification`. More information on how to set up the dataset and the supported steps in the TAO workflow is provided in the subsequent cells.

*Note: Please make sure to remove any stray artifacts/files that may have been generated by previous experiments from the `$USER_EXPERIMENT_DIR` and `$DATA_DOWNLOAD_DIR` paths mentioned below. Leftover checkpoint files and similar artifacts may interfere with creating a training graph for a new experiment.*

*Note: By default, this notebook is set up to run training using 1 GPU. To use more GPUs, please update the env variable `$NUM_GPUS` accordingly.*

In [None]:
# Setting up env variables for cleaner command line commands.
import os

%env KEY=nvidia_tlt
%env NUM_GPUS=1
%env USER_EXPERIMENT_DIR=/workspace/tao-experiments/multitask_classification
%env DATA_DOWNLOAD_DIR=/workspace/tao-experiments/data

# Set this path if you don't run the notebook from the samples directory.
# %env NOTEBOOK_ROOT=~/tao-samples/classification

# Please define the local project directory that needs to be mapped to the TAO docker session.
# The dataset is expected to be present in $LOCAL_PROJECT_DIR/data, while the results for the steps
# in this notebook will be stored at $LOCAL_PROJECT_DIR/multitask_classification.
# !PLEASE MAKE SURE TO UPDATE THIS PATH!
os.environ["LOCAL_PROJECT_DIR"] = FIXME

os.environ["LOCAL_DATA_DIR"] = os.path.join(
    os.getenv("LOCAL_PROJECT_DIR", os.getcwd()),
    "data"
)
os.environ["LOCAL_EXPERIMENT_DIR"] = os.path.join(
    os.getenv("LOCAL_PROJECT_DIR", os.getcwd()),
    "multitask_classification"
)

# The sample spec files are present in the same path as the downloaded samples.
os.environ["LOCAL_SPECS_DIR"] = os.path.join(
    os.getenv("NOTEBOOK_ROOT", os.getcwd()),
    "specs"
)
%env SPECS_DIR=/workspace/tao-experiments/classification/specs

# Showing list of specification files.
!ls -rlt $LOCAL_SPECS_DIR

The cell below maps the project directory on your local host to a workspace directory in the TAO docker instance, so that the data and the results are mapped from outside to inside of the docker instance.

In [None]:
# Mapping up the local directories to the TAO docker.
import json
import os
mounts_file = os.path.expanduser("~/.tao_mounts.json")

# Define the dictionary with the mapped drives
drive_map = {
    "Mounts": [
        # Mapping the project directory (data and experiment results)
        {
            "source": os.environ["LOCAL_PROJECT_DIR"],
            "destination": "/workspace/tao-experiments"
        },
        # Mapping the specs directory.
        {
            "source": os.environ["LOCAL_SPECS_DIR"],
            "destination": os.environ["SPECS_DIR"]
        },
    ],
    "DockerOptions":{
        "user": "{}:{}".format(os.getuid(), os.getgid())
    }
}

# Writing the mounts file.
with open(mounts_file, "w") as mfile:
    json.dump(drive_map, mfile, indent=4)

In [None]:
!cat ~/.tao_mounts.json
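
Because the same files are visible under two different paths (the host path and the path inside the docker), it can help to translate between them. The following is a minimal sketch, not part of the TAO launcher: `to_host_path` is a hypothetical helper that maps a container path back to the host using entries shaped like the `Mounts` list written above. The host paths in the example are placeholders.

```python
import posixpath


def to_host_path(container_path, mounts):
    """Translate a container path to its host path using mount entries.

    `mounts` is a list of {"source": ..., "destination": ...} dicts, as in
    the "Mounts" list of ~/.tao_mounts.json. Returns None when no mount
    covers the path. This helper is hypothetical, not a TAO API.
    """
    best = None
    for mount in mounts:
        dest = mount["destination"].rstrip("/")
        if container_path == dest or container_path.startswith(dest + "/"):
            # Prefer the most specific (longest) destination when mounts nest.
            if best is None or len(dest) > len(best[0]):
                best = (dest, mount["source"])
    if best is None:
        return None
    dest, source = best
    relative = container_path[len(dest):].lstrip("/")
    return posixpath.join(source, relative) if relative else source


# Placeholder host paths, for illustration only.
example_mounts = [
    {"source": "/home/user/project",
     "destination": "/workspace/tao-experiments"},
    {"source": "/home/user/notebook/specs",
     "destination": "/workspace/tao-experiments/classification/specs"},
]
print(to_host_path("/workspace/tao-experiments/data/val.csv", example_mounts))
```

Choosing the longest matching destination matters here because the specs mount is nested inside the project mount.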

## 1. Installing the TAO launcher <a class="anchor" id="head-1"></a>
The TAO launcher is a Python package distributed as a Python wheel listed in the `nvidia-pyindex` Python index. You may install the launcher by executing the following cell.

Please note that TAO Toolkit recommends running the TAO launcher in a virtual env with Python 3.6.9. You may follow the instructions on this [page](https://virtualenvwrapper.readthedocs.io/en/latest/install.html) to set up a Python virtual env using the `virtualenv` and `virtualenvwrapper` packages. Once you have set up virtualenvwrapper, please set the version of Python to be used in the virtual env by using the `VIRTUALENVWRAPPER_PYTHON` variable. You may do so by running:

```sh
export VIRTUALENVWRAPPER_PYTHON=/path/to/bin/python3.x
```
where `x` >= 6 and `x` <= 8.

We recommend performing this step first and then launching the notebook from the virtual environment. In addition to installing the TAO Python package, please make sure the following software requirements are met:
* python >=3.6.9 < 3.8.x
* docker-ce > 19.03.5
* docker-API 1.40
* nvidia-container-toolkit > 1.3.0-1
* nvidia-container-runtime > 3.4.0-1
* nvidia-docker2 > 2.5.0-1
* nvidia-driver > 455+

Once you have installed the prerequisites, please log in to the docker registry nvcr.io using the command below:

```sh
docker login nvcr.io
```

You will be prompted to enter a username and password. The username is `$oauthtoken` and the password is the API key generated from `ngc.nvidia.com`. Please follow the instructions in the [NGC setup guide](https://docs.nvidia.com/ngc/ngc-overview/index.html#generating-api-key) to generate your own API key.

In [None]:
# SKIP this cell IF you have already installed the TAO launcher.
!pip3 install nvidia-pyindex
!pip3 install nvidia-tao

In [None]:
# View the versions of the TAO launcher
!tao info

## 2. Prepare datasets and pre-trained model <a class="anchor" id="head-2"></a>

We will be using the Fashion Product Images (Small) dataset for this tutorial. This dataset is available on Kaggle.

In this tutorial, our trained classification network will perform three tasks: article category classification, base color classification and target season classification.

### 2.1 Download the dataset <a class="anchor" id="head-2-1"></a>

In [None]:
import os
!mkdir -p $LOCAL_DATA_DIR
!echo "Your LOCAL_DATA_DIR is: $LOCAL_DATA_DIR"

To download the dataset, you will need a Kaggle account. After logging in, you can download the dataset zip file here: https://www.kaggle.com/paramaggarwal/fashion-product-images-small

The downloaded file is `archive.zip`, which contains a subfolder called `myntradataset`. Unzip the contents of this subfolder into the `LOCAL_DATA_DIR` created in the cell above. You should then have a folder called `images` and a CSV file called `styles.csv`.

### 2.2 Verify the downloaded dataset <a class="anchor" id="head-2-2"></a>

In [None]:
# Check the dataset is present
!mkdir -p $LOCAL_DATA_DIR
!if [ ! -d $LOCAL_DATA_DIR/images ]; then echo 'images folder NOT found.'; else echo 'Found images folder.';fi
!if [ ! -f $LOCAL_DATA_DIR/styles.csv ]; then echo 'CSV file NOT found.'; else echo 'Found CSV file.';fi

### 2.3 Data preprocessing <a class="anchor" id="head-2-3"></a>

In order to make the data trainable in TAO, we need to preprocess it and split it into training and validation sets.

TAO Multitask classification requires:
1. A training label CSV file containing labels for training images
2. A validation label CSV file containing labels for validation images
3. An image folder containing all train and val images (it may also contain other images; which images are used is controlled by the CSV files).

The CSV files for training / validation labels should follow this pattern:
1. The first column should always be `fname`, containing the file names of the images (without folder prefix)
2. The remaining columns should each be named after an individual task. There is no limit on the number of tasks

For example, if your validation set has 2 images, the CSV should look like this:

| fname     | base_color | category | season |
|-----------|------------|----------|--------|
| 10000.jpg | Blue       | Shoes    | Spring |
| 10001.jpg | White      | Bags     | Fall   |

We also need to do a train/val split. Here, we use 10% of the data (randomly chosen) as the validation set.

In [None]:
!pip3 install numpy
!pip3 install pandas
import os
import numpy as np
import pandas as pd

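# Skip malformed rows. `error_bad_lines` / `warn_bad_lines` were deprecated in pandas 1.3
# and removed in 2.0; on newer pandas, use `on_bad_lines='skip'` instead.
df = pd.read_csv(os.environ['LOCAL_DATA_DIR'] + '/styles.csv', error_bad_lines=False, warn_bad_lines=False)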
df = df[['id', 'baseColour', 'subCategory', 'season']]
df = df.dropna()
category_cls = df.subCategory.value_counts()[:10].index # 10-class classification
season_cls = ['Spring', 'Summer', 'Fall', 'Winter'] # 4-class classification
color_cls = df.baseColour.value_counts()[:11].index # 11-class classification

# Get all valid rows
df = df[df.subCategory.isin(category_cls) & df.season.isin(season_cls) & df.baseColour.isin(color_cls)]
df.columns = ['fname', 'base_color', 'category', 'season']
df.fname = df.fname.astype(str)
df.fname = df.fname + '.jpg'

# remove entries whose image file is missing
all_img_files = os.listdir(os.environ['LOCAL_DATA_DIR'] + '/images')
df = df[df.fname.isin(all_img_files)]

idx = np.arange(len(df))
np.random.shuffle(idx)
val_df = df.iloc[idx[:(len(df) // 10)]]
train_df = df.iloc[idx[(len(df) // 10):]]

# Add a simple sanity check
assert len(val_df.season.unique()) == 4 and len(val_df.base_color.unique()) == 11 and \
    len(val_df.category.unique()) == 10, 'Validation set misses some classes, re-run this cell!'
assert len(train_df.season.unique()) == 4 and len(train_df.base_color.unique()) == 11 and \
    len(train_df.category.unique()) == 10, 'Training set misses some classes, re-run this cell!'

# save processed data labels
train_df.to_csv(os.environ['LOCAL_DATA_DIR'] + '/train.csv', index=False)
val_df.to_csv(os.environ['LOCAL_DATA_DIR'] + '/val.csv', index=False)
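
The cell above shuffles without a fixed seed, so each run produces a different train/val split. If a repeatable split is preferred, a seeded variant could look like the sketch below. This is an optional variation, not what the notebook does; `split_indices` and its parameters are hypothetical names, and it uses the standard library `random` module rather than NumPy.

```python
import random


def split_indices(n_rows, val_fraction=0.1, seed=0):
    """Return (val_idx, train_idx): two disjoint lists covering range(n_rows)."""
    idx = list(range(n_rows))
    # A local random generator keeps the global random state untouched.
    random.Random(seed).shuffle(idx)
    n_val = int(n_rows * val_fraction)
    return idx[:n_val], idx[n_val:]


val_idx, train_idx = split_indices(1000)
print(len(val_idx), len(train_idx))
```

With pandas, the two lists can be passed to `df.iloc[...]` in place of the slices of `idx` used above.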

In [None]:
# verify
import pandas as pd

print("Number of images in the train set: {}".format(
    len(pd.read_csv(os.environ['LOCAL_DATA_DIR'] + '/train.csv'))
))
print("Number of images in the validation set: {}".format(
    len(pd.read_csv(os.environ['LOCAL_DATA_DIR'] + '/val.csv'))
))

In [None]:
# Sample label.
pd.read_csv(os.environ['LOCAL_DATA_DIR'] + '/val.csv').head()
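
The label files can also be checked against the layout rules from section 2.3 (first column `fname`, no folder prefix, one column per task). The sketch below is one possible check using only the standard library; `check_label_csv` is a hypothetical helper, and the missing-label check is an extra assumption that mirrors the `dropna()` call in the preprocessing cell.

```python
import csv
import io


def check_label_csv(csv_text):
    """Return a list of problems found in a multitask label CSV (empty if none)."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ["file is empty"]
    header, problems = rows[0], []
    if not header or header[0] != "fname":
        problems.append("first column must be 'fname'")
    if len(header) < 2:
        problems.append("at least one task column is required")
    for line_no, row in enumerate(rows[1:], start=2):
        if len(row) != len(header):
            problems.append("line {}: expected {} fields".format(line_no, len(header)))
        elif "/" in row[0] or "\\" in row[0]:
            problems.append("line {}: fname has a folder prefix".format(line_no))
        elif any(cell == "" for cell in row[1:]):
            problems.append("line {}: missing label".format(line_no))
    return problems


# Same layout as the example table in section 2.3.
sample = (
    "fname,base_color,category,season\n"
    "10000.jpg,Blue,Shoes,Spring\n"
    "10001.jpg,White,Bags,Fall\n"
)
print(check_label_csv(sample))
```

To check the real files, read `train.csv` or `val.csv` from `LOCAL_DATA_DIR` as text and pass the contents to the function.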

### 2.4 Download pre-trained model <a class="anchor" id="head-2-4"></a>

We will use the NGC CLI to get the pre-trained models. For more details, go to ngc.nvidia.com and click SETUP on the navigation bar.

In [None]:
# Installing NGC CLI on the local machine.
## Download and install
%env CLI=ngccli_cat_linux.zip
!mkdir -p $LOCAL_PROJECT_DIR/ngccli

# Remove any previously existing CLI installations
!rm -rf $LOCAL_PROJECT_DIR/ngccli/*
!wget "https://ngc.nvidia.com/downloads/$CLI" -P $LOCAL_PROJECT_DIR/ngccli
!unzip -u "$LOCAL_PROJECT_DIR/ngccli/$CLI" -d $LOCAL_PROJECT_DIR/ngccli/
!rm $LOCAL_PROJECT_DIR/ngccli/*.zip 
os.environ["PATH"]="{}/ngccli:{}".format(os.getenv("LOCAL_PROJECT_DIR", ""), os.getenv("PATH", ""))

In [None]:
!ngc registry model list nvidia/tao/pretrained_classification:*

In [None]:
!mkdir -p $LOCAL_EXPERIMENT_DIR/pretrained_resnet10/

In [None]:
# Pull pretrained model from NGC
!ngc registry model download-version nvidia/tao/pretrained_classification:resnet10 --dest $LOCAL_EXPERIMENT_DIR/pretrained_resnet10

In [None]:
print("Check that model is downloaded into dir.")
!ls -l $LOCAL_EXPERIMENT_DIR/pretrained_resnet10/pretrained_classification_vresnet10

## 3. Provide training specification <a class="anchor" id="head-3"></a>
The training spec file defines:
* Training dataset
* Validation dataset
* Pre-trained models
* Other training (hyper-)parameters such as batch size, number of epochs, learning rate etc.

In [None]:
!cat $LOCAL_SPECS_DIR/mclassification_spec.cfg

## 4. Run TAO training <a class="anchor" id="head-4"></a>
* Provide the sample spec file and the output directory location for models

In [None]:
!tao multitask_classification train -e $SPECS_DIR/mclassification_spec.cfg \
                                    -r $USER_EXPERIMENT_DIR \
                                    -k $KEY \
                                    --gpus $NUM_GPUS

In [None]:
print("To resume from checkpoint, please change pretrain_model_path to resume_model_path in config file.")

In [None]:
# Now check the evaluation stats in the csv file and pick the model with highest eval accuracy.
!cat $LOCAL_EXPERIMENT_DIR/multitask_cls_training_log_resnet10.csv
%set_env EPOCH=010
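
Instead of reading the log by eye, the best epoch could be picked programmatically. The sketch below is only an illustration: the column names of the training log are not shown in this notebook, so `epoch` and `val_accuracy` are hypothetical, as is the helper `best_epoch`. Check the header of your own log and whether its epoch numbering matches the checkpoint file names before relying on the result.

```python
import csv
import io


def best_epoch(csv_text, metric, higher_is_better=True, epoch_column="epoch"):
    """Return the epoch-column value (as int) of the row with the best metric."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        raise ValueError("log has no data rows")
    pick = max if higher_is_better else min
    best = pick(rows, key=lambda row: float(row[metric]))
    return int(best[epoch_column])


# Made-up log content; the real column names may differ.
example_log = (
    "epoch,val_accuracy\n"
    "8,0.81\n"
    "9,0.86\n"
    "10,0.84\n"
)
epoch = best_epoch(example_log, "val_accuracy")
# Zero-pad to three digits, matching the `EPOCH=010` style used above.
print("{:03d}".format(epoch))
```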

## 5. Evaluate trained models <a class="anchor" id="head-5"></a>


In [None]:
!tao multitask_classification evaluate -m $USER_EXPERIMENT_DIR/weights/multitask_cls_resnet10_epoch_$EPOCH.tlt \
                                       -e $SPECS_DIR/mclassification_spec.cfg \
                                       -k $KEY

## 6. Prune trained models <a class="anchor" id="head-6"></a>
* Specify pre-trained model
* Equalization criterion
* Threshold for pruning

Usually, you just need to adjust `-pth` (threshold) to trade off accuracy against model size. A higher `pth` gives you a smaller model (and thus higher inference speed) but worse accuracy. The right threshold depends on the dataset. A `pth` value of 0.65 is just a starting point. If the retrain accuracy is good, you can increase this value to get smaller models. Otherwise, lower this value to get better accuracy.
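
As a toy illustration of why a higher threshold yields a smaller model, assume each channel has an importance score (for example a weight norm), the scores are normalized by the largest one, and channels scoring below the threshold are dropped. This is a simplified sketch under that assumption, not TAO's pruning implementation; `channels_to_keep` and the scores are made up.

```python
def channels_to_keep(scores, threshold):
    """Return indices of channels whose normalized score is >= threshold."""
    largest = max(scores)
    if largest <= 0:
        raise ValueError("scores must contain a positive value")
    return [i for i, s in enumerate(scores) if s / largest >= threshold]


scores = [2.0, 1.0, 1.6, 0.4, 1.4]  # made-up per-channel importance scores
for pth in (0.3, 0.65, 0.9):
    kept = channels_to_keep(scores, pth)
    print("pth={}: keep {} of {} channels".format(pth, len(kept), len(scores)))
```

Raising the threshold removes more channels, which is why retraining is needed afterwards to recover accuracy.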

In [None]:
!mkdir -p $LOCAL_EXPERIMENT_DIR/resnet_pruned
!tao multitask_classification prune -m $USER_EXPERIMENT_DIR/weights/multitask_cls_resnet10_epoch_$EPOCH.tlt \
                                    -o $USER_EXPERIMENT_DIR/resnet_pruned/resnet10_pruned.tlt \
                                    -eq union \
                                    -pth 0.65 \
                                    -k $KEY \
                                    --results_dir $USER_EXPERIMENT_DIR/logs

In [None]:
print('Pruned model:')
print('------------')
!ls -rlt $LOCAL_EXPERIMENT_DIR/resnet_pruned

## 7. Retrain pruned models <a class="anchor" id="head-7"></a>
* Model needs to be re-trained to bring back accuracy after pruning
* Specify re-training specification

In [None]:
!cat $LOCAL_SPECS_DIR/mclassification_retrain_spec.cfg

In [None]:
!tao multitask_classification train -e $SPECS_DIR/mclassification_retrain_spec.cfg \
                                    -r $USER_EXPERIMENT_DIR/resnet_pruned \
                                    -k $KEY \
                                    --gpus $NUM_GPUS

## 8. Testing the model! <a class="anchor" id="head-8"></a>

In [None]:
# Now check the evaluation stats in the csv file and pick the model with highest eval accuracy.
!cat $LOCAL_EXPERIMENT_DIR/resnet_pruned/multitask_cls_training_log_resnet10.csv
%set_env EPOCH=010

In [None]:
!tao multitask_classification evaluate -m $USER_EXPERIMENT_DIR/resnet_pruned/weights/multitask_cls_resnet10_epoch_$EPOCH.tlt \
                                       -e $SPECS_DIR/mclassification_retrain_spec.cfg \
                                       -k $KEY

TAO also provides a `confmat` command to generate the confusion matrix of the model on an unseen dataset. Users need to provide the image folder and the dataset labels. Here, we use the validation dataset as a sample.

In [None]:
!tao multitask_classification confmat -m $USER_EXPERIMENT_DIR/resnet_pruned/weights/multitask_cls_resnet10_epoch_$EPOCH.tlt \
                                      -i $DATA_DOWNLOAD_DIR/images \
                                      -l $DATA_DOWNLOAD_DIR/val.csv \
                                      -k $KEY
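
For readers unfamiliar with confusion matrices, the sketch below shows how one is built for a single task from true and predicted labels. It illustrates the concept only and is not how TAO computes or prints its matrices; the helper name, the row/column convention, and the labels are all made up for this example. A multi-task model has one such matrix per task.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows are true classes, columns are predicted classes, in `classes` order."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, guess in zip(true_labels, predicted_labels):
        matrix[index[truth]][index[guess]] += 1
    return matrix


seasons = ["Fall", "Spring", "Summer", "Winter"]
truth = ["Fall", "Fall", "Summer", "Winter", "Spring"]    # made-up labels
guess = ["Fall", "Winter", "Summer", "Winter", "Summer"]  # made-up predictions
for name, row in zip(seasons, confusion_matrix(truth, guess, seasons)):
    print("{:>6}: {}".format(name, row))
```

Off-diagonal entries show which classes the model confuses with each other.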

## 9. Inferences <a class="anchor" id="head-9"></a>

TAO provides an `inference` command to run inference on a single image. You need to provide the class mapping JSON file generated during the training process.

In [None]:
!pip3 install matplotlib==3.3.3
import matplotlib.pyplot as plt
from PIL import Image 
import os

DEMO_IMAGE = '1654.jpg'
image_path = os.path.join(os.environ.get('LOCAL_DATA_DIR'), 'images', DEMO_IMAGE)
plt.imshow(Image.open(image_path))
os.environ['DEMO_IMG_PATH'] = os.path.join(os.environ.get('DATA_DOWNLOAD_DIR'), 'images/', DEMO_IMAGE)

In [None]:
!tao multitask_classification inference -m $USER_EXPERIMENT_DIR/resnet_pruned/weights/multitask_cls_resnet10_epoch_$EPOCH.tlt \
                                        -i $DEMO_IMG_PATH \
                                        -cm $USER_EXPERIMENT_DIR/class_mapping.json \
                                        -k $KEY

## 10. Export and Deploy! <a class="anchor" id="head-10"></a>

You may export in FP32, FP16 or INT8 mode using the code block below. For INT8, you need to provide a calibration image directory.

In [None]:
# tao <task> export will fail if .etlt already exists. So we clear the export folder before tao <task> export
!rm -rf $LOCAL_EXPERIMENT_DIR/export
!mkdir -p $LOCAL_EXPERIMENT_DIR/export
# Export in FP32 mode. Change --data_type to fp16 for FP16 mode
!tao multitask_classification export -m $USER_EXPERIMENT_DIR/resnet_pruned/weights/multitask_cls_resnet10_epoch_$EPOCH.tlt \
                                     -cm $USER_EXPERIMENT_DIR/class_mapping.json \
                                     -o $USER_EXPERIMENT_DIR/export/mcls_export.etlt \
                                     -k $KEY \
                                     --data_type fp32

# Uncomment to export in INT8 mode (generate calibration cache file). 
# !tao multitask_classification export -m $USER_EXPERIMENT_DIR/resnet_pruned/weights/multitask_cls_resnet10_epoch_$EPOCH.tlt \
#                                      -cm $USER_EXPERIMENT_DIR/class_mapping.json \
#                                      -o $USER_EXPERIMENT_DIR/export/mcls_export.etlt \
#                                      -k $KEY \
#                                      --cal_image_dir $DATA_DOWNLOAD_DIR/images \
#                                      --data_type int8 \
#                                      --batch_size 16 \
#                                      --batches 10 \
#                                      --cal_cache_file $USER_EXPERIMENT_DIR/export/cal.bin  \
#                                      --cal_data_file $USER_EXPERIMENT_DIR/export/cal.tensorfile

In [None]:
print('Exported model:')
print('------------')
!ls -lh $LOCAL_EXPERIMENT_DIR/export

You can now generate a TensorRT engine using `tao-converter`.

The `tao-converter` produces optimized TensorRT engines for the platform that it resides on. Therefore, to get maximum performance, please instantiate this docker and execute the `tao-converter` command, with the exported `.etlt` file and calibration cache (for int8 mode), on your target device. The `tao-converter` utility included in this docker only works for x86 devices with discrete NVIDIA GPUs.

For Jetson devices, please download the `tao-converter` for Jetson from the dev zone link [here](https://developer.nvidia.com/tao-converter).

If you choose to integrate your model into DeepStream directly, you may do so by simply copying the exported `.etlt` file along with the calibration cache to the target device and updating the spec file that configures the `gst-nvinfer` element to point to this newly exported model. Usually this file is called `config_infer_primary.txt` for detection models and `config_infer_secondary_*.txt` for classification models.

In [None]:
# For tao-converter, you need to provide input shape -d, which can be found in training spec file.
#   You also need to provide all output nodes -o. They are in the pattern `*task_name*/Softmax`,
#   the export command will also have it printed for your reference.

# Convert to TensorRT engine (FP32)
!tao converter -k $KEY \
                   -d 3,80,60 \
                   -o base_color/Softmax,category/Softmax,season/Softmax \
                   -e $USER_EXPERIMENT_DIR/export/trt.engine \
                   -m 16 \
                   -t fp32 \
                   -i nchw \
                   $USER_EXPERIMENT_DIR/export/mcls_export.etlt

# Convert to TensorRT engine (FP16)
# !tao converter -k $KEY \
#                    -d 3,80,60 \
#                    -o base_color/Softmax,category/Softmax,season/Softmax \
#                    -e $USER_EXPERIMENT_DIR/export/trt.engine \
#                    -m 16 \
#                    -t fp16 \
#                    -i nchw \
#                    $USER_EXPERIMENT_DIR/export/mcls_export.etlt

# Convert to TensorRT engine (INT8).
# !tao converter -k $KEY  \
#                    -d 3,80,60 \
#                    -o base_color/Softmax,category/Softmax,season/Softmax \
#                    -c $USER_EXPERIMENT_DIR/export/cal.bin \
#                    -e $USER_EXPERIMENT_DIR/export/trt.engine \
#                    -b 8 \
#                    -m 16 \
#                    -t int8 \
#                    -i nchw \
#                    $USER_EXPERIMENT_DIR/export/mcls_export.etlt
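
The `-o` value used above follows the `task_name/Softmax` pattern, so it can be derived from the task names (the label CSV columns after `fname`). A minimal sketch, with `output_nodes` as a hypothetical helper name:

```python
def output_nodes(task_names):
    """Join task names into the comma-separated `-o` value, one Softmax node per task."""
    return ",".join("{}/Softmax".format(name) for name in task_names)


# Task names are the label CSV columns after `fname`.
print(output_nodes(["base_color", "category", "season"]))
```

The export command also prints the output node names, so its output remains the reference if the two ever disagree.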