# TAO Image Multiclass Classification


This example was trained on the https://github.com/everguard-inc/dataset_ppe/tree/ppe_multilabel_crops dataset, starting from NVIDIA's pretrained ResNet18.

# Set up env variables and map drives 

In [None]:
# Setting up env variables for cleaner command line commands.
import os

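# SET YOUR_NGC_KEY. Keep this comment on its own line and leave the value unquoted:
# %env stores everything after '=' verbatim, including quotes and trailing comments.
%env KEY=cXU2NzU4bHNpNHBpMzN2Z21mcmsxcDQzcDE6MGIwNDFmMDYtNmFjYy00YjJiLTliYWMtMDdjN2NjZjgwMDYx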
%env NUM_GPUS=1
%env USER_EXPERIMENT_DIR=/workspace/tao-experiments/classification
%env DATA_DOWNLOAD_DIR=/workspace/tao-experiments/data

# Set this path if you don't run the notebook from the samples directory.
# %env NOTEBOOK_ROOT=~/tao-samples/classification

# Please define this local project directory that needs to be mapped to the TAO docker session.
# The dataset is expected to be present in $LOCAL_PROJECT_DIR/data, while the results for the steps
# in this notebook will be stored at $LOCAL_PROJECT_DIR/classification
# !PLEASE MAKE SURE TO UPDATE THIS PATH!.
os.environ["LOCAL_PROJECT_DIR"] = '/home/eg/auv/tao/tutorial/classification_v2' #SET YOUR LOCAL PATH

os.environ["LOCAL_DATA_DIR"] = os.path.join(
    os.getenv("LOCAL_PROJECT_DIR", os.getcwd()),
    "data"
)
os.environ["LOCAL_EXPERIMENT_DIR"] = os.path.join(
    os.getenv("LOCAL_PROJECT_DIR", os.getcwd()),
    "classification"
)

# The sample spec files are present in the same path as the downloaded samples.
os.environ["LOCAL_SPECS_DIR"] = os.path.join(
    os.getenv("NOTEBOOK_ROOT", os.getcwd()),
    "specs"
)
os.makedirs(os.environ["LOCAL_SPECS_DIR"], exist_ok=True)
os.makedirs(os.environ["LOCAL_DATA_DIR"], exist_ok=True)

%env SPECS_DIR=/workspace/tao-experiments/specs

# Showing list of specification files.
!ls -rlt $LOCAL_SPECS_DIR

In [None]:
os.getenv("NOTEBOOK_ROOT", os.getcwd())

In [None]:
!echo $LOCAL_PROJECT_DIR

**The directory above will be used for configs.**

The cell below maps the project directory on your local host to a workspace directory in the TAO docker instance, so that the data and the results are mapped from outside to inside of the docker instance.

In [None]:
# Mapping up the local directories to the TAO docker.
import json
import os
mounts_file = os.path.expanduser("~/.tao_mounts.json")

# Define the dictionary with the mapped drives
drive_map = {
    "Mounts": [
        # Mapping the data directory
        {
            "source": os.environ["LOCAL_PROJECT_DIR"],
            "destination": "/workspace/tao-experiments"
        },
        # Mapping the specs directory.
        {
            "source": os.environ["LOCAL_SPECS_DIR"],
            "destination": os.environ["SPECS_DIR"]
        },
    ],
    "DockerOptions":{
        "user": "{}:{}".format(os.getuid(), os.getgid())
    }
}

# Writing the mounts file.
with open(mounts_file, "w") as mfile:
    json.dump(drive_map, mfile, indent=4)

!cat ~/.tao_mounts.json
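
To make the mapping concrete, here is a minimal sketch of how a host path translates to a path inside the TAO docker, given the `Mounts` list written above. The helper `to_container_path` and the example host paths are hypothetical and only illustrate the idea; the TAO launcher does not expose such a function.

```python
import os

def to_container_path(local_path, mounts):
    """Translate a host path to the matching path inside the container.

    `mounts` is the "Mounts" list from ~/.tao_mounts.json. Returns None
    when the path is not under any mapped source directory.
    """
    local_path = os.path.abspath(local_path)
    # Check longer sources first so the most specific mount wins.
    for mount in sorted(mounts, key=lambda m: len(m["source"]), reverse=True):
        source = os.path.abspath(mount["source"])
        if local_path == source:
            return mount["destination"]
        if local_path.startswith(source + os.sep):
            relative = os.path.relpath(local_path, source)
            return mount["destination"].rstrip("/") + "/" + relative.replace(os.sep, "/")
    return None

# Hypothetical mounts that mirror the two entries written above.
example_mounts = [
    {"source": "/home/user/project", "destination": "/workspace/tao-experiments"},
    {"source": "/home/user/notebooks/specs", "destination": "/workspace/tao-experiments/specs"},
]
print(to_container_path("/home/user/project/data/train", example_mounts))
```

Any path passed to a `tao` command must be the container-side path, which is why the commands later in this notebook use `$USER_EXPERIMENT_DIR` and `$SPECS_DIR` rather than the `LOCAL_*` variables.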

In [None]:
# SKIP this cell IF you have already installed the TAO launcher.
!pip3 install nvidia-pyindex
!pip3 install nvidia-tao

If you followed the installation instructions on the [Confluence page](https://everguard.atlassian.net/wiki/spaces/EVERGUARD/pages/1644658744/Nvidia+TAO), your environment is ready.
Check it with the following command:

In [None]:
!tao info

Your data should be stored in `LOCAL_DATA_DIR`, which maps to `/workspace/tao-experiments/data` inside the TAO docker, and should have the following structure:
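
To inspect the layout that is actually present on disk, a small sketch like the one below can help. It assumes `LOCAL_DATA_DIR` was set by the first cell and falls back to the current directory otherwise; `describe_tree` is a hypothetical helper written for this notebook, not part of TAO.

```python
import os

def describe_tree(root, max_depth=3, max_files=3):
    """Return indented lines describing the directory tree under `root`.

    Directories deeper than `max_depth` levels below `root` are not
    visited, and at most `max_files` file names are listed per directory.
    """
    lines = []
    root = os.path.abspath(root)
    base_depth = root.rstrip(os.sep).count(os.sep)
    for current, dirs, files in os.walk(root):
        dirs.sort()
        depth = current.rstrip(os.sep).count(os.sep) - base_depth
        if depth >= max_depth:
            dirs[:] = []  # Do not descend any further.
        indent = "  " * depth
        lines.append(f"{indent}{os.path.basename(current)}/")
        for name in sorted(files)[:max_files]:
            lines.append(f"{indent}  {name}")
        if len(files) > max_files:
            lines.append(f"{indent}  ... ({len(files) - max_files} more files)")
    return lines

data_dir = os.environ.get("LOCAL_DATA_DIR", os.getcwd())
print("\n".join(describe_tree(data_dir)))
```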

# Download pretrained models

We will use the NGC CLI to get the pre-trained models. It should already be set up from the earlier installation steps.

List of models

In [None]:
!ngc registry model list nvidia/tao/pretrained_classification:*

Create a directory for the model

In [None]:
#mkdir -p $LOCAL_EXPERIMENT_DIR/<model_dir_name>/
!mkdir -p $LOCAL_EXPERIMENT_DIR/pretrained_resnet18/

In [None]:
# Pull pretrained model from NGC
#ngc registry model download-version nvidia/tao/pretrained_classification:<model_name> --dest $LOCAL_EXPERIMENT_DIR/<model_dir_name>
!ngc registry model download-version nvidia/tao/pretrained_classification:resnet18 --dest $LOCAL_EXPERIMENT_DIR/pretrained_resnet18


In [None]:
print("Check that model is downloaded into dir.")
!ls -l $LOCAL_EXPERIMENT_DIR/pretrained_resnet18/pretrained_classification_vresnet18

# Configuration file

See the [configuration file templates](https://docs.nvidia.com/tao/tao-toolkit/text/multitask_image_classification.html#preparing-the-input-data-structure) for a detailed explanation of the hyperparameters.

In [None]:
# The spec file should be saved in the specs directory
!echo $LOCAL_SPECS_DIR

In [None]:
!cat $LOCAL_SPECS_DIR/classification_spec.cfg

# Run TAO training
Provide the sample spec file and the output directory for the trained models.

In [None]:
!echo $SPECS_DIR

In [None]:
!tao multitask_classification train\
        -e $SPECS_DIR/classification_spec.cfg\
        -r $USER_EXPERIMENT_DIR/output\
        -k $KEY --gpu_index 1 --gpus 2

In [None]:
!tao multitask_classification inference\
                -m $USER_EXPERIMENT_DIR/output/weights/multitask_cls_resnet18_epoch_010.tlt\
                -i /workspace/tao-experiments/data/dataset_ppe/test/crops/val/rlg_f2b53921_2022-01-15_18-21-58_classes-in_harness-hardhat_unrecognized-in_vest-person_not_in_bucket_crop-01.jpg\
                -cm $USER_EXPERIMENT_DIR/output/class_mapping.json -k $KEY


In [None]:
import matplotlib.pyplot as plt
import cv2
img = cv2.imread('/home/eg/auv/tao/tutorial/classification_v2/data/dataset_ppe/test/crops/val/rlg_f2b53921_2022-01-15_18-21-58_classes-in_harness-hardhat_unrecognized-in_vest-person_not_in_bucket_crop-01.jpg')
# OpenCV loads images as BGR; convert to RGB so matplotlib shows the right colors.
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
plt.imshow(img)
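
The file name of the test crop appears to encode its ground-truth labels between `classes-` and `_crop-`. Assuming that naming convention holds across the dataset (it is inferred from this one file name, not documented here), a hypothetical helper like `labels_from_filename` can extract them for a quick comparison with the inference output.

```python
import re

def labels_from_filename(filename):
    """Return the hyphen-separated labels found between 'classes-' and '_crop-'.

    The naming convention is an assumption based on the example file name.
    Returns an empty list when the pattern is not present.
    """
    match = re.search(r"classes-(.+)_crop-\d+", filename)
    return match.group(1).split("-") if match else []

example_name = (
    "rlg_f2b53921_2022-01-15_18-21-58_classes-in_harness-hardhat_unrecognized"
    "-in_vest-person_not_in_bucket_crop-01.jpg"
)
print(labels_from_filename(example_name))
```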

In [None]:
!cat $LOCAL_PROJECT_DIR/classification/output/multitask_cls_training_log_resnet18.csv
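
Instead of reading the raw CSV, the log can be loaded with Python's `csv` module. The sketch below makes no assumption about the column names; it only prints the header and the last logged row. The log path matches the one used in the cell above, and `read_training_log` is a hypothetical helper.

```python
import csv
import os

def read_training_log(csv_path):
    """Return (header, rows) from a CSV training log; each row is a dict."""
    with open(csv_path, newline="") as handle:
        reader = csv.DictReader(handle)
        rows = list(reader)
        return list(reader.fieldnames or []), rows

log_path = os.path.join(
    os.environ.get("LOCAL_PROJECT_DIR", os.getcwd()),
    "classification", "output", "multitask_cls_training_log_resnet18.csv",
)
if os.path.exists(log_path):
    header, rows = read_training_log(log_path)
    print("Columns:", header)
    if rows:
        print("Last logged row:", rows[-1])
else:
    print("Log not found:", log_path)
```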

# Evaluate trained models
In this step, we assume that training is complete and the model from the final epoch (`multitask_cls_resnet18_epoch_<num>.tlt`) is available. To evaluate an earlier model, change the `-m` argument in the command below to point to the intended checkpoint.
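
To find out which checkpoints exist before choosing one, the weights directory can be scanned for the highest epoch number. This is a minimal sketch that assumes checkpoints follow the `multitask_cls_resnet18_epoch_<num>.tlt` naming seen in this notebook and live under `$LOCAL_EXPERIMENT_DIR/output/weights`; `latest_checkpoint` is a hypothetical helper.

```python
import os
import re

CHECKPOINT_PATTERN = re.compile(r"multitask_cls_resnet18_epoch_(\d+)\.tlt$")

def latest_checkpoint(weights_dir):
    """Return (epoch, filename) for the highest-numbered checkpoint, or None."""
    found = []
    for name in os.listdir(weights_dir):
        match = CHECKPOINT_PATTERN.match(name)
        if match:
            found.append((int(match.group(1)), name))
    return max(found) if found else None

weights_dir = os.path.join(
    os.environ.get("LOCAL_EXPERIMENT_DIR", os.getcwd()), "output", "weights"
)
if os.path.isdir(weights_dir):
    print(latest_checkpoint(weights_dir))
else:
    print("Weights directory not found:", weights_dir)
```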

In [None]:
!tao multitask_classification evaluate\
            -e $SPECS_DIR/classification_spec.cfg\
            -m $USER_EXPERIMENT_DIR/output/weights/multitask_cls_resnet18_epoch_010.tlt\
            -k $KEY 



# Prune trained models
* Specify the trained model to prune (`-m`)
* Equalization criterion (`-eq`)
* Threshold for pruning (`-pth`)
* Exclude any layers that you don't want pruned (e.g. predictions)

Usually, you only need to adjust `-pth` (threshold) to trade accuracy against model size. A higher `pth` gives a smaller model (and thus faster inference) but worse accuracy. The right threshold depends on the dataset. The value of 0.6 used below is just a starting point. If the retrain accuracy is good, you can increase it to get smaller models; otherwise, lower it to get better accuracy.

In [None]:
!echo $USER_EXPERIMENT_DIR

In [None]:
%env EPOCH=010
!mkdir -p $LOCAL_EXPERIMENT_DIR/output/resnet_pruned
!tao multitask_classification prune -m $USER_EXPERIMENT_DIR/output/weights/multitask_cls_resnet18_epoch_010.tlt \
                          -o $USER_EXPERIMENT_DIR/output/resnet_pruned/resnet18_nopool_bn_pruned.tlt \
                          -eq union \
                          -pth 0.6 \
                          -k $KEY \
                          --results_dir $USER_EXPERIMENT_DIR/logs
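
One quick way to see the effect of the chosen threshold is to compare the file sizes of the original and pruned models. File size is only a rough proxy for parameter count, so treat the sketch below as a sanity check rather than a measurement. The paths mirror the ones used in the prune command above, and `size_ratio` is a hypothetical helper.

```python
import os

def size_ratio(original_path, pruned_path):
    """Return the pruned file size as a fraction of the original file size."""
    original = os.path.getsize(original_path)
    pruned = os.path.getsize(pruned_path)
    return pruned / original

experiment_dir = os.environ.get("LOCAL_EXPERIMENT_DIR", os.getcwd())
original_model = os.path.join(
    experiment_dir, "output", "weights", "multitask_cls_resnet18_epoch_010.tlt"
)
pruned_model = os.path.join(
    experiment_dir, "output", "resnet_pruned", "resnet18_nopool_bn_pruned.tlt"
)
if os.path.exists(original_model) and os.path.exists(pruned_model):
    ratio = size_ratio(original_model, pruned_model)
    print(f"Pruned model is {ratio:.1%} of the original size")
else:
    print("Model files not found; run training and pruning first.")
```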

In [None]:
# Retrain the pruned model
!tao multitask_classification train\
        -e $SPECS_DIR/classification_spec_pruned.cfg\
        -r $USER_EXPERIMENT_DIR/output_retrain\
        -k $KEY --gpu_index 1 --gpus 2


# Evaluate Pruned

In [None]:
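# Evaluate the retrained pruned model, which the retraining step writes to output_retrain.
!tao multitask_classification evaluate \
        -e $SPECS_DIR/classification_spec_pruned.cfg \
        -m $USER_EXPERIMENT_DIR/output_retrain/weights/multitask_cls_resnet18_epoch_005.tlt \
        -k $KEY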
