## Switch to CPU Instance (Advisable only for non-Colab-Pro instances)

1. Switch to a CPU instance until Step 2 for tasks that do not depend on a GPU.
2. This increases the time available for GPU-dependent tasks on a Colab instance.
3. Change the runtime type to CPU via Runtime (top-left tab) -> Change Runtime Type -> None (Hardware Accelerator).
4. Then click Connect (top right).
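
To confirm which kind of runtime you are connected to, a small check along these lines may help. It is a minimal sketch, not part of the TAO tooling: `gpu_names` is a hypothetical helper, and it assumes that `nvidia-smi` is missing or fails on a CPU-only runtime.

```python
import shutil
import subprocess


def gpu_names():
    """Return the GPU names reported by nvidia-smi, or [] when none is visible."""
    # Assumption: a CPU-only runtime has no nvidia-smi on PATH.
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        # The tool exists but could not talk to a GPU/driver.
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


names = gpu_names()
print(f"GPUs visible: {len(names)}", names)
```

An empty list suggests a CPU runtime, which is what you want until Step 2.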



## Mounting Google drive
Mount your Google Drive storage to this Colab instance.

In [None]:
try:
    import google.colab
    %env GOOGLE_COLAB=1
    from google.colab import drive
    drive.mount('/content/drive', force_remount=True)
except ImportError:  # google.colab is only importable inside Colab
    %env GOOGLE_COLAB=0
    print("Warning: Not a Colab Environment")

# Emotion Classification using TAO EmotionNet

Transfer learning is the process of transferring learned features from one application to another. It is a commonly used training technique in which you take a model trained on one task and retrain it for a different task.

Train Adapt Optimize (TAO) Toolkit is a simple, easy-to-use, Python-based AI toolkit for taking purpose-built AI models and customizing them with users' own data.

<img align="center" src="https://developer.nvidia.com/sites/default/files/akamai/embedded-transfer-learning-toolkit-software-stack-1200x670px.png" width="1080"> 

## Learning Objectives
In this notebook, you will learn how to leverage the simplicity and convenience of TAO to:

* Take a pretrained model and train an EmotionNet model on a subset of the CK+ dataset
* Run inference on the trained model
* Export the retrained model to a .etlt file for deployment with the DeepStream SDK

### Table of Contents

This notebook shows an example of emotion classification in the Train Adapt Optimize (TAO) Toolkit.

0. [Set up env variables](#head-0)
1. [Prepare dataset and pre-trained model](#head-1) <br>
    1.1 [Verify downloaded dataset](#head-1-1) <br>
    1.2 [Convert dataset labels to required json format](#head-1-2) <br>
    1.3 [Verify dataset conversion](#head-1-3) <br>
    1.4 [Download pre-trained model](#head-1-4) <br>
2. [Setup GPU environment](#head-2) <br>
    2.1 [Connect to GPU Instance](#head-2-1) <br>
    2.2 [Mounting Google drive](#head-2-2) <br>
    2.3 [Setup Python environment](#head-2-3) <br>
    2.4 [Reset env variables](#head-2-4) <br>
3. [Generate tfrecords from labels in json format](#head-3)
4. [Provide training specification](#head-4)
5. [Run TAO training](#head-5)
6. [Evaluate the trained model](#head-6)
7. [Visualize inference](#head-7)


## 0. Set up env variables and set FIXME parameters <a class="anchor" id="head-0"></a>

*Note: By default, this notebook is set up to run training on 1 GPU. To use more GPUs, update the env variable `$NUM_GPUS` accordingly.*

#### FIXME
1. NUM_GPUS - set this to <= the number of GPUs available on the instance
1. EXPERIMENT_DIR - set this path to a folder where pretrained models, checkpoints, and log files from the different model actions will be saved
1. delete_existing_experiments - set this to True to remove existing pretrained models, checkpoints, and log files of a previous experiment
1. DATA_DIR - set this path to the folder where you want the dataset to be stored
1. delete_existing_data - set this to True to remove existing preprocessed and original data

In [None]:
# Setting up env variables for cleaner command line commands.
import os

%env TAO_DOCKER_DISABLE=1

%env KEY=nvidia_tlt
#FIXME1
%env NUM_GPUS=1

# Change the paths according to your directory structure, these are just examples
%env COLAB_NOTEBOOKS_PATH=/content/drive/MyDrive/ColabNotebooks
if not os.path.exists(os.environ["COLAB_NOTEBOOKS_PATH"]):
    raise FileNotFoundError("Error, enter the path of the colab notebooks repo correctly")

#FIXME2
%env EXPERIMENT_DIR=/content/drive/MyDrive/results/emotionnet
#FIXME3
delete_existing_experiments = True
#FIXME4
%env DATA_DIR=/content/drive/MyDrive/emotionnet_data
#FIXME5
delete_existing_data = False

if delete_existing_experiments:
    !rm -rf $EXPERIMENT_DIR
if delete_existing_data:
    !rm -rf $DATA_DIR

SPECS_DIR=f"{os.environ['COLAB_NOTEBOOKS_PATH']}/tensorflow/emotionnet/specs"
%env SPECS_DIR={SPECS_DIR}
DATASET_SPECS_DIR=f"{os.environ['COLAB_NOTEBOOKS_PATH']}/tensorflow/emotionnet/dataset_specs"
%env DATASET_SPECS_DIR={DATASET_SPECS_DIR}

# Showing list of specification files.
!ls -rlt $SPECS_DIR
!ls -rlt $DATASET_SPECS_DIR

!sudo mkdir -p $DATA_DIR && sudo chmod -R 777 $DATA_DIR
!sudo mkdir -p $EXPERIMENT_DIR && sudo chmod -R 777 $EXPERIMENT_DIR
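
Because `delete_existing_experiments` and `delete_existing_data` trigger `rm -rf` on Google Drive paths, a guard like the one below can reduce the risk of deleting the wrong folder when a path is mistyped. This is only a sketch: `remove_dir_if_inside` is a hypothetical helper, not part of the notebook or of TAO, and the allowed root in the usage comment is an assumption based on the example paths above.

```python
import shutil
from pathlib import Path


def remove_dir_if_inside(path, allowed_root):
    """Delete `path` only if it lies strictly inside `allowed_root`.

    Returns True when a directory was removed, False when there was nothing
    to remove. Raises ValueError when `path` is outside `allowed_root`
    (or is `allowed_root` itself).
    """
    target = Path(path).resolve()
    root = Path(allowed_root).resolve()
    if root not in target.parents:
        raise ValueError(f"Refusing to delete {target}: not inside {root}")
    if not target.is_dir():
        return False
    shutil.rmtree(target)
    return True


# Possible usage in place of `!rm -rf $EXPERIMENT_DIR` (root is an assumption):
# if delete_existing_experiments:
#     remove_dir_if_inside(os.environ["EXPERIMENT_DIR"], "/content/drive/MyDrive")
```

If you use it, run it before the `mkdir` commands, as the `rm -rf` lines do.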

## 1. Prepare dataset and pre-trained model <a class="anchor" id="head-1"></a>

Download the CK+ dataset from https://www.pitt.edu/~emotion/ck-spread.htm.
To get access, you need to sign the dataset user agreement and send it to the email address given on the agreement sheet.

After obtaining the dataset, place the files in `$DATA_DIR`. Rename the dataset folder to `ckplus`, because a `+` sign may not be valid in a folder name.
You will then have the following path for the CK+ dataset.
* Input data in `$DATA_DIR/ckplus`

Then unzip the ckplus dataset archives into the following folders.
* Image data: `$DATA_DIR/ckplus/cohn-kanade-images`
* Emotion label data: `$DATA_DIR/ckplus/Emotion`
* Landmarks label data: `$DATA_DIR/ckplus/Landmarks`

Note: make sure that the folder names are exactly as listed above.
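
The expected layout can also be checked from Python. The following is a minimal sketch that mirrors the folder names listed above; `missing_ckplus_dirs` is a hypothetical helper, and falling back to the current directory when `DATA_DIR` is unset is an assumption made only so the snippet runs anywhere.

```python
import os
from pathlib import Path

# Sub-folder names taken from the list above.
EXPECTED_SUBDIRS = ("cohn-kanade-images", "Emotion", "Landmarks")


def missing_ckplus_dirs(data_dir):
    """Return the expected CK+ sub-folders that are absent under data_dir/ckplus."""
    base = Path(data_dir) / "ckplus"
    return [name for name in EXPECTED_SUBDIRS if not (base / name).is_dir()]


missing = missing_ckplus_dirs(os.environ.get("DATA_DIR", "."))
if missing:
    print("Missing folders:", ", ".join(missing))
else:
    print("All CK+ folders found.")
```

The check is case-sensitive, so `emotion` would be reported as missing where `Emotion` is expected.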

### 1.1 Verify downloaded dataset <a class="anchor" id="head-1-1"></a>


In [None]:
# Check the dataset is present
!if [ ! -d $DATA_DIR/ckplus/cohn-kanade-images ]; then echo 'Image Data folder not found, please download.'; else echo 'Found Image Data folder.';fi
!if [ ! -d $DATA_DIR/ckplus/Emotion ]; then echo 'Emotion labels folder not found, please download.'; else echo 'Found Emotion Labels folder.';fi
!if [ ! -d $DATA_DIR/ckplus/Landmarks ]; then echo 'Landmarks labels folder not found, please download.'; else echo 'Found Landmarks Labels folder.';fi

### 1.2 Convert dataset labels to required json format <a class="anchor" id="head-1-2"></a>

In [None]:
!python3 $COLAB_NOTEBOOKS_PATH/tensorflow/emotionnet/ckplus_convert.py --root_path $DATA_DIR --dataset_folder_name ckplus --container_root_path $DATA_DIR

### 1.3 Verify dataset conversion <a class="anchor" id="head-1-3"></a>

The provided conversion script `ckplus_convert.py`, run in the previous step, converts the existing `Landmarks` and `Emotion` labels of the `CK+` dataset to the required json label format. The cell below prints a sample converted label.

Note: for other public datasets, use this script as a reference for converting the labels to the required format.

In [None]:
# Sample json label.
!sed -n 1,201p $DATA_DIR/ckplus/data_factory/fiducial/S052_004_00000031_happy.json
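
As a further sanity check on the conversion, the label files can be counted per emotion from their file names. This sketch assumes the names follow the pattern `S<subject>_<sequence>_<frame>_<emotion>.json`, which is inferred only from the sample names used in this notebook; `parse_label_name` and `count_emotions` are hypothetical helpers, and the json contents are not read.

```python
import re
from collections import Counter
from pathlib import Path

# Pattern inferred from the sample file names in this notebook (an assumption).
LABEL_NAME = re.compile(r"^(S\d+)_(\d+)_(\d+)_([A-Za-z]+)\.json$")


def parse_label_name(filename):
    """Split a label file name into (subject, sequence, frame, emotion), or None."""
    match = LABEL_NAME.match(Path(filename).name)
    if match is None:
        return None
    subject, sequence, frame, emotion = match.groups()
    return subject, sequence, int(frame), emotion


def count_emotions(label_dir):
    """Count label files per emotion in label_dir; unparseable names are skipped."""
    parsed = (parse_label_name(p.name) for p in Path(label_dir).glob("*.json"))
    return Counter(item[3] for item in parsed if item is not None)


print(parse_label_name("S052_004_00000031_happy.json"))
# -> ('S052', '004', 31, 'happy')
```

Calling `count_emotions` on `$DATA_DIR/ckplus/data_factory/fiducial` would then show how many labels each emotion has, which helps spot a class that is missing entirely.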

### 1.4 Download pre-trained model <a class="anchor" id="head-1-4"></a>

Follow the steps below to download and verify the pre-trained model for EmotionNet.

Download the model `nvidia/tao/emotionnet:trainable_v1.0`.

After downloading the pre-trained model, place the files in `$EXPERIMENT_DIR/pretrain_models`.
You will then have the following path:

* Pre-trained model in `$EXPERIMENT_DIR/pretrain_models/emotionnet_vtrainable_v1.0/model.tlt`

In [None]:
# Installing NGC CLI on the local machine.
## Download and install
%env LOCAL_PROJECT_DIR=/ngc_content/
%env CLI=ngccli_cat_linux.zip
!sudo mkdir -p $LOCAL_PROJECT_DIR/ngccli && sudo chmod -R 777 $LOCAL_PROJECT_DIR

# Remove any previously existing CLI installations
!rm -rf $LOCAL_PROJECT_DIR/ngccli/*
!wget "https://ngc.nvidia.com/downloads/$CLI" -P $LOCAL_PROJECT_DIR/ngccli
!unzip -u -q "$LOCAL_PROJECT_DIR/ngccli/$CLI" -d $LOCAL_PROJECT_DIR/ngccli/
!rm $LOCAL_PROJECT_DIR/ngccli/*.zip 
os.environ["PATH"]="{}/ngccli/ngc-cli:{}".format(os.getenv("LOCAL_PROJECT_DIR", ""), os.getenv("PATH", ""))
!cp /usr/lib/x86_64-linux-gnu/libstdc++.so.6 $LOCAL_PROJECT_DIR/ngccli/ngc-cli/libstdc++.so.6

In [None]:
# List models available in the model registry.
!ngc registry model list nvidia/tao/emotionnet:*

In [None]:
# Create the target destination to download the model.
!mkdir -p $EXPERIMENT_DIR/pretrain_models/

In [None]:
# Download the pretrained model from NGC
!ngc registry model download-version nvidia/tao/emotionnet:trainable_v1.0 \
    --dest $EXPERIMENT_DIR/pretrain_models/

In [None]:
!ls -rlt $EXPERIMENT_DIR/pretrain_models/emotionnet_vtrainable_v1.0

In [None]:
# Check that the pre-trained model file is present
!if [ ! -f $EXPERIMENT_DIR/pretrain_models/emotionnet_vtrainable_v1.0/model.tlt ]; then echo 'Pretrain model file not found, please download.'; else echo 'Found Pretrain model file.';fi

## 2. Setup GPU environment <a class="anchor" id="head-2"></a>


### 2.1 Connect to GPU Instance <a class="anchor" id="head-2-1"></a>

1. Move any data saved to the Colab instance storage to Google Drive.
2. Change the runtime type to GPU via Runtime (top-left tab) -> Change Runtime Type -> GPU (Hardware Accelerator).
3. Then click Connect (top right).



### 2.2 Mounting Google drive <a class="anchor" id="head-2-2"></a>
Mount your Google Drive storage to this Colab instance.

In [None]:
try:
    import google.colab
    %env GOOGLE_COLAB=1
    from google.colab import drive
    drive.mount('/content/drive', force_remount=True)
except ImportError:  # google.colab is only importable inside Colab
    %env GOOGLE_COLAB=0
    print("Warning: Not a Colab Environment")

### 2.3 Setup Python environment <a class="anchor" id="head-2-3"></a>
Set up the environment needed to run the TAO networks by running the bash script.

In [None]:
#FIXME
%env GENERAL_WHL_PATH=/content/drive/MyDrive/tf/general_whl
#FIXME
%env CODEBASE_WHL_PATH=/content/drive/MyDrive/tf/codebase_whl

import os
if os.path.exists(os.environ["GENERAL_WHL_PATH"]) and os.path.exists(os.environ["CODEBASE_WHL_PATH"]):
    if os.environ["GOOGLE_COLAB"] == "1":
        os.environ["bash_script"] = "setup_env.sh"
    else:
        os.environ["bash_script"] = "setup_env_desktop.sh"

    !sed -i "s|PATH_TO_GENERAL_WHL|$GENERAL_WHL_PATH|g" $COLAB_NOTEBOOKS_PATH/tensorflow/$bash_script
    !sed -i "s|PATH_TO_CODEBASE_WHL|$CODEBASE_WHL_PATH|g" $COLAB_NOTEBOOKS_PATH/tensorflow/$bash_script
    !sed -i "s|PATH_TO_COLAB_NOTEBOOKS|$COLAB_NOTEBOOKS_PATH|g" $COLAB_NOTEBOOKS_PATH/tensorflow/$bash_script

    !sh $COLAB_NOTEBOOKS_PATH/tensorflow/$bash_script
else:
    raise FileNotFoundError("Error, enter the whl paths correctly")

In [None]:
if os.environ.get("PYTHONPATH","") == "":
    os.environ["PYTHONPATH"] = ""
os.environ["PYTHONPATH"]+=":/opt/nvidia/"
if os.environ["GOOGLE_COLAB"] == "1":
    os.environ["PYTHONPATH"]+=":/usr/local/lib/python3.6/dist-packages/third_party/nvml"
else:
    os.environ["PYTHONPATH"]+=":/home_duplicate/rarunachalam/miniconda3/envs/tf_py_36/lib/python3.6/site-packages/third_party/nvml" # FIX MINICONDA PATH

### 2.4 Reset env variables <a class="anchor" id="head-2-4"></a>

In [None]:
# Setting up env variables for cleaner command line commands.
import os

%env TAO_DOCKER_DISABLE=1

%env KEY=nvidia_tlt
%env NUM_GPUS=1

# Change the paths according to your directory structure, these are just examples
%env COLAB_NOTEBOOKS_PATH=/content/drive/MyDrive/ColabNotebooks
if not os.path.exists(os.environ["COLAB_NOTEBOOKS_PATH"]):
    raise FileNotFoundError("Error, enter the path of the colab notebooks repo correctly")
%env EXPERIMENT_DIR=/content/drive/MyDrive/results/emotionnet
%env DATA_DIR=/content/drive/MyDrive/emotionnet_data

SPECS_DIR=f"{os.environ['COLAB_NOTEBOOKS_PATH']}/tensorflow/emotionnet/specs"
%env SPECS_DIR={SPECS_DIR}
DATASET_SPECS_DIR=f"{os.environ['COLAB_NOTEBOOKS_PATH']}/tensorflow/emotionnet/dataset_specs"
%env DATASET_SPECS_DIR={DATASET_SPECS_DIR}

# Showing list of specification files.
!ls -rlt $SPECS_DIR
!ls -rlt $DATASET_SPECS_DIR

## 3. Generate tfrecords from labels in json format <a class="anchor" id="head-3"></a>
* Create the tfrecords using the dataset_convert command


In [None]:
!rm -rf $DATA_DIR/post_data

In [None]:
!sed -i "s|TAO_DATA_PATH|$DATA_DIR/|g" $DATASET_SPECS_DIR/dataio_config_ckplus.json
!tao emotionnet dataset_convert -c $DATASET_SPECS_DIR/dataio_config_ckplus.json

In [None]:
!ls $DATA_DIR/ckplus/data_factory/fiducial/* | wc -l

In [None]:
# Check the result folder is present
!mkdir -p $EXPERIMENT_DIR
!if [ ! -d $DATA_DIR/post_data/ckplus/Ground_Truth_DataFactory ]; then echo 'Ground truth folder not found.'; else echo 'Found Ground truth folder.';fi
!if [ ! -d $DATA_DIR/post_data/ckplus/GT_user_json ]; then echo 'GT user json folder not found.'; else echo 'Found GT user json folder.';fi

## 4. Provide training specification <a class="anchor" id="head-4"></a>
* Tfrecords for the train datasets
    * In order to use the newly generated tfrecords for training, update the 'ground_truth_folder_name' and 'tfrecords_directory_path' parameters of 'dataset_info' section in the spec file at `$SPECS_DIR/emotionnet_tlt_pretrain.yaml`
* Pre-trained model path
    * Update "pretrained_model_path" in the spec file at `$SPECS_DIR/emotionnet_tlt_pretrain.yaml`
    * If you want to train from random weights with your own data, you can enter "null" for the "pretrained_model_path" parameter
* Augmentation parameters for on-the-fly data augmentation
* Other training (hyper-)parameters such as batch size, number of epochs, learning rate etc.

In [None]:
!sed -i "s|TAO_DATA_PATH|$DATA_DIR/|g" $SPECS_DIR/emotionnet_tlt_pretrain.yaml
!sed -i "s|EXPERIMENT_DIR_PATH|$EXPERIMENT_DIR/|g" $SPECS_DIR/emotionnet_tlt_pretrain.yaml
!cat $SPECS_DIR/emotionnet_tlt_pretrain.yaml
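
The `sed -i` commands above edit the spec file in place, so the placeholders are gone after the first run and a later change to `DATA_DIR` or `EXPERIMENT_DIR` no longer reaches the file. One way around this, sketched below, is to keep the original as a template and write a rendered copy. `render_spec` is a hypothetical helper, and the output file name in the usage comment is an assumption.

```python
from pathlib import Path


def render_spec(template_path, output_path, replacements):
    """Write a copy of template_path with each placeholder replaced by its value."""
    text = Path(template_path).read_text()
    for placeholder, value in replacements.items():
        text = text.replace(placeholder, value)
    Path(output_path).write_text(text)
    return text


# Possible usage with the placeholders from the sed commands above
# (the rendered file name is hypothetical):
# render_spec(
#     f"{SPECS_DIR}/emotionnet_tlt_pretrain.yaml",
#     f"{SPECS_DIR}/emotionnet_tlt_pretrain_rendered.yaml",
#     {
#         "TAO_DATA_PATH": os.environ["DATA_DIR"] + "/",
#         "EXPERIMENT_DIR_PATH": os.environ["EXPERIMENT_DIR"] + "/",
#     },
# )
```

If you take this route, pass the rendered file to the later `tao emotionnet` commands instead of the template.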

## 5. Run TAO training <a class="anchor" id="head-5"></a>
* Provide the sample spec file and the output directory location for models

*Note: Training may take hours to complete. The rest of the notebook assumes that training was done in single-GPU mode.*

In [None]:
!tao emotionnet train -e $SPECS_DIR/emotionnet_tlt_pretrain.yaml \
                      -r $EXPERIMENT_DIR/experiment_result/exp1 \
                      -k $KEY

In [None]:
!ls -lh $EXPERIMENT_DIR/experiment_result

## 6. Evaluate the trained model <a class="anchor" id="head-6"></a>


In [None]:
!tao emotionnet evaluate -m $EXPERIMENT_DIR/experiment_result/exp1/model.tlt \
                         -r $EXPERIMENT_DIR/experiment_result/exp1 \
                         -e $SPECS_DIR/emotionnet_tlt_pretrain.yaml \
                         -k $KEY

In [None]:
# Check that the evaluation result file and summary file are present
!if [ ! -f $EXPERIMENT_DIR/experiment_result/exp1/eval_results.txt ]; then echo 'Evaluation result summary file not found, please generate.'; else echo 'Found Evaluation result summary file.';fi
!if [ ! -f $EXPERIMENT_DIR/experiment_result/exp1/full_results.txt ]; then echo 'Evaluation result file not found, please generate.'; else echo 'Found Evaluation result file.';fi
!cat  $EXPERIMENT_DIR/experiment_result/exp1/eval_results.txt

## 7. Visualize Inference <a class="anchor" id="head-7"></a>

In this section, we run the inference tool on the trained model.

In [None]:
# Run inference on one sample json label file
!tao emotionnet inference -e $SPECS_DIR/emotionnet_tlt_pretrain.yaml \
                          -i $DATA_DIR/ckplus/data_factory/fiducial/S111_001_00000013_surprise.json \
                          -m $EXPERIMENT_DIR/experiment_result/exp1/model.tlt \
                          -o $EXPERIMENT_DIR \
                          -k $KEY 

In [None]:
!sed -n 1,1p $EXPERIMENT_DIR/result.txt