In [None]:
from google.colab import drive
drive.mount('/content/drive', force_remount=True)

# Heart Rate Estimation using TAO HeartRateNet

Transfer learning is the process of transferring learned features from one application to another. It is a commonly used training technique in which a model trained on one task is retrained for a different task.

Train Adapt Optimize (TAO) Toolkit is a simple and easy-to-use Python based AI toolkit for taking purpose-built AI models and customizing them with users' own data.

<img align="center" src="https://developer.nvidia.com/sites/default/files/akamai/embedded-transfer-learning-toolkit-software-stack-1200x670px.png" width="1080"> 

## Learning Objectives
In this notebook, you will learn how to leverage the simplicity and convenience of TAO to:

* Take a pretrained model and train a model on the COHFACE dataset
* Run Inference on the trained model
* Export the retrained model to a .etlt file for deployment to DeepStream SDK

### Table of Contents

This notebook shows an example of non-invasive heart rate estimation using the Train Adapt Optimize (TAO) Toolkit.

0. [Set up env variables](#head-0)
1. [Prepare dataset and pre-trained model](#head-1) <br>
    1.1 [Verify downloaded dataset](#head-1-1) <br>
    1.2 [Process the extracted data](#head-1-2) <br>
    1.3 [Download pre-trained model](#head-1-3) <br>
2. [Setup GPU environment](#head-2) <br>
    2.1 [Connect to GPU Instance](#head-2-1) <br>
    2.2 [Mounting Google drive](#head-2-2) <br>
    2.3 [Setup Python environment](#head-2-3) <br>
    2.4 [Reset env variables](#head-2-4) <br>    
3. [Generate tfrecords from RGB videos](#head-3) <br>
    3.1 [Download haarcascade classifier](#head-3-1) <br>
    3.2 [Generate tfrecords](#head-3-2) <br>
4. [Provide training specification](#head-4) <br>
5. [Run TAO training](#head-5) <br>
6. [Evaluate the trained model](#head-6) <br>
7. [Inference](#head-7) <br>

## 0. Set up env variables<a class="anchor" id="head-0"></a>

When using the purpose-built pretrained models from NGC, make sure to set the `$KEY` environment variable to the key mentioned in the model overview. Failing to do so can lead to errors when loading them as pretrained models.

*Note: By default, this notebook is set up to run training on 1 GPU. To use more GPUs, update the env variable `$NUM_GPUS` accordingly.*

In [None]:
# Setting up env variables for cleaner command-line commands.
import os

%env KEY=nvidia_tlt
%env NUM_GPUS=1
%env EXPERIMENT_DIR=/results/heartratenet
%env DATA_DIR=/content/drive/MyDrive/heartratenet_data
%env SPECS_DIR=/content/drive/MyDrive/ColabNotebooks/tensorflow/heartratenet/specs

# Showing list of specification files.
!ls -rlt $SPECS_DIR
!mkdir -p $EXPERIMENT_DIR/model/

The HeartRateNet API uses the `$DATAIO_SPEC` and `$TRAIN_SPEC` YAML files to set up directories.

* `$DATAIO_SPEC` is `$SPECS_DIR/heartratenet_data_generation.yaml`
* `$TRAIN_SPEC` is `$SPECS_DIR/heartratenet_tlt_pretrain.yaml`

In [None]:
# Set up dataio and train experiment spec path and check if they exist

os.environ["DATAIO_SPEC"] = os.path.join(os.environ["SPECS_DIR"], 'heartratenet_data_generation.yaml')
os.environ["TRAIN_SPEC"] = os.path.join(os.environ["SPECS_DIR"], 'heartratenet_tlt_pretrain.yaml')

!if [ ! -f $DATAIO_SPEC ]; then echo "Dataio spec file not found."; else echo "Found dataio spec file.";fi
!if [ ! -f $TRAIN_SPEC ]; then echo "Train spec file not found."; else echo "Found train spec file.";fi

Next, make sure the paths set above match the inputs to the API.
In the `$DATAIO_SPEC` file, set `input_directory_path` and `data_directory_output_path` to the path in `$DATA_DIR`.
In the `$TRAIN_SPEC` file, which is the input to the model, set `results_dir` to `$EXPERIMENT_DIR` and update `checkpoint_dir` accordingly. Then set `tfrecords_directory_path` to `$DATA_DIR`.

In [None]:
# Check that both spec files exist and point at the directories set above.
import os
from os.path import normpath

from yaml import load, SafeLoader

data_dir = normpath(os.environ["DATA_DIR"])
experiment_dir = normpath(os.environ["EXPERIMENT_DIR"])

try:
    with open(os.environ["DATAIO_SPEC"]) as f:
        dataio_args = load(f, Loader=SafeLoader)
    print('dataio spec found')
    if (normpath(dataio_args['input_directory_path']) != data_dir
            or normpath(dataio_args['data_directory_output_path']) != data_dir):
        print(normpath(dataio_args['input_directory_path']), data_dir)
        print('Please update input_directory_path and data_directory_output_path')
except FileNotFoundError:
    print('Dataio spec is not found, please ensure there is dataio spec in proper folder')

try:
    with open(os.environ["TRAIN_SPEC"]) as f:
        train_args = load(f, Loader=SafeLoader)
    print('train spec found')
    if normpath(train_args['results_dir']) != experiment_dir:
        print('Please update results_dir')
    if normpath(train_args['dataloader']['dataset_info']['tfrecords_directory_path']) != data_dir:
        print('Please update the tfrecords_directory_path')
except FileNotFoundError:
    print('Train spec is not found, please ensure there is train spec in proper folder')

## 1. Prepare dataset and pre-trained model <a class="anchor" id="head-1"></a>

Please download the COHFACE public dataset from the following website: https://www.idiap.ch/dataset/cohface

After downloading, extract the data into a `cohface` folder and place it under `$DATA_DIR/heartratenet/data`.

### 1.1 Verify downloaded dataset <a class="anchor" id="head-1-1"></a>

In [None]:
# Create the data directory if it does not already exist.
!mkdir -p $DATA_DIR

In [None]:
!if [ ! -d $DATA_DIR/heartratenet/data ]; then echo "Data folder not found, please download."; else echo "Data folder found.";fi

### 1.2 Process the extracted data <a class="anchor" id="head-1-2"></a>

The `dataio` module for HeartRateNet expects the data in a predefined format.

The `dataio` spec file lists the folders to read in three lists: `train_subjects`, `validation_subjects` and `test_subjects`.
Place the video file for each subject in a folder named `images`. The path is `$DATA_DIR/subject_folder`.

The ground truth is expected in the following format:

* For the RGB camera feed, `image_timestamps.csv` holds the frame ID and the corresponding timestamp in columns `ID`,`Time`.
* For the pulse readings, `ground_truth.csv` holds a timestamp and the corresponding PPG reading in columns `Time`,`PulseWaveform`.

The heart rate is predicted as the dominant frequency of the PPG signal. The API takes care of sampling differences between the RGB and PPG signals.

The COHFACE dataset has 40 subjects. Pass `start_subject_id` and `end_subject_id` to the following script to select the subjects to process; `process_cohface.py` processes subjects in the range `[start_subject_id, end_subject_id)`.
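
To make the dominant-frequency idea mentioned above concrete, here is a minimal sketch in plain Python. It is not TAO's implementation: the function name `dominant_heart_rate_bpm`, the 40 to 240 BPM search band, and the synthetic signal are all assumptions made for illustration.

```python
import math


def dominant_heart_rate_bpm(samples, sample_rate_hz, min_bpm=40.0, max_bpm=240.0):
    """Return the strongest frequency within [min_bpm, max_bpm], in beats per minute.

    Illustrative only: a direct DFT restricted to a plausible heart-rate band.
    """
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]  # remove the DC offset

    best_bpm, best_power = None, -1.0
    for k in range(1, n // 2 + 1):
        bpm = 60.0 * k * sample_rate_hz / n  # frequency of DFT bin k, in BPM
        if bpm < min_bpm or bpm > max_bpm:
            continue
        angle = 2.0 * math.pi * k / n
        re = sum(c * math.cos(angle * i) for i, c in enumerate(centered))
        im = sum(c * math.sin(angle * i) for i, c in enumerate(centered))
        power = re * re + im * im
        if power > best_power:
            best_bpm, best_power = bpm, power
    return best_bpm


# Synthetic 10-second "PPG" trace sampled at 20 Hz with a 1.2 Hz (72 BPM) pulse.
fs = 20.0
signal = [math.sin(2.0 * math.pi * 1.2 * i / fs) for i in range(200)]
estimated_bpm = dominant_heart_rate_bpm(signal, fs)
print(round(estimated_bpm, 1))
```

A direct DFT is quadratic in the number of samples, which is fine for a short trace like this one; a real pipeline would more likely use an FFT.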

The following block will process the COHFACE dataset into a format consistent with heartratenet api.

In [None]:
%cd /content/drive/MyDrive/ColabNotebooks/tensorflow/heartratenet/
!python3.6 process_cohface.py -i $DATA_DIR/heartratenet/data/cohface/ \
                           -o /content/cohface_processed \
                           -start_subject_id 1 \
                           -end_subject_id 2

In [None]:
# Move the processed subject to Google Drive; create the target folder first so `mv` does not fail.
!mkdir -p $DATA_DIR/cohface_processed/1/0
!mv /content/cohface_processed/1/0/* $DATA_DIR/cohface_processed/1/0/

### 1.3 Download pre-trained model <a class="anchor" id="head-1-3"></a>

Follow the steps below to download and verify the pretrained model for HeartRateNet.

The pretrained HeartRateNet model to download is `nvidia/tao/heartratenet:trainable_v2.0`.

After downloading the pretrained model, place the files in `$EXPERIMENT_DIR/pretrain_models`.
You will then have the following path:

* pretrained model in `$EXPERIMENT_DIR/pretrain_models/heartratenet_vtrainable_v2.0/model.tlt`


In [None]:
# Installing NGC CLI on the local machine.
## Download and install
%env LOCAL_PROJECT_DIR=/content/
%env CLI=ngccli_cat_linux.zip
!mkdir -p $LOCAL_PROJECT_DIR/ngccli

# Remove any previously existing CLI installations
!rm -rf $LOCAL_PROJECT_DIR/ngccli/*
!wget "https://ngc.nvidia.com/downloads/$CLI" -P $LOCAL_PROJECT_DIR/ngccli
!unzip -u -q "$LOCAL_PROJECT_DIR/ngccli/$CLI" -d $LOCAL_PROJECT_DIR/ngccli/
!rm $LOCAL_PROJECT_DIR/ngccli/*.zip 
os.environ["PATH"]="{}/ngccli/ngc-cli:{}".format(os.getenv("LOCAL_PROJECT_DIR", ""), os.getenv("PATH", ""))
!cp /usr/lib/x86_64-linux-gnu/libstdc++.so.6 /content/ngccli/ngc-cli/libstdc++.so.6

In [None]:
# List models available in the model registry.
!ngc registry model list nvidia/tao/heartratenet:*

In [None]:
# Create the target destination to download the model.
!mkdir -p $EXPERIMENT_DIR/pretrain_models/

In [None]:
# Download the pretrained model from NGC
!ngc registry model download-version nvidia/tao/heartratenet:trainable_v2.0 \
    --dest $EXPERIMENT_DIR/pretrain_models/

In [None]:
!ls -rlt $EXPERIMENT_DIR/pretrain_models/heartratenet_vtrainable_v2.0

In [None]:
# Check the pretrained model is present
!if [ ! -f $EXPERIMENT_DIR/pretrain_models/heartratenet_vtrainable_v2.0/model.tlt ]; then echo 'Pretrain model file not found, please download.'; else echo 'Found Pretrain model file.';fi

## 2. Setup GPU environment <a class="anchor" id="head-2"></a>


### 2.1 Connect to GPU Instance <a class="anchor" id="head-2-1"></a>

1. Move any data saved to the Colab instance storage to Google Drive.
2. Change the runtime type to GPU: Runtime (top-left tab) -> Change Runtime Type -> GPU (Hardware Accelerator).
3. Click Connect (top right).
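
Once connected, it can be useful to confirm that a GPU is actually visible before continuing. The following is a minimal sketch rather than part of the original notebook: the helper name `gpu_visible` is hypothetical, and it assumes the runtime exposes the `nvidia-smi` utility on `PATH`.

```python
import shutil
import subprocess


def gpu_visible():
    """Return True if `nvidia-smi` exists and lists at least one GPU."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False
    try:
        result = subprocess.run(
            [exe, "-L"],  # `-L` lists the GPUs attached to the machine
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
            timeout=30,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and "GPU" in result.stdout


print("GPU runtime detected." if gpu_visible() else "No GPU detected; check the runtime type.")
```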



### 2.2 Mounting Google Drive <a class="anchor" id="head-2-2"></a>
Mount your Google Drive storage to this Colab instance.

In [None]:
from google.colab import drive
drive.mount('/content/drive')

### 2.3 Setup Python environment <a class="anchor" id="head-2-3"></a>
Set up the environment needed to run the TAO networks by running the bash script.

In [None]:
!sh /content/drive/MyDrive/tf/setup_env.sh

### 2.4 Reset env variables <a class="anchor" id="head-2-4"></a>

In [None]:
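# Setting up env variables for cleaner command line commands.
# A new runtime starts with a clean environment, so restore the values from section 0.
import os

%env KEY=nvidia_tlt
%env NUM_GPUS=1
%env EXPERIMENT_DIR=/results/heartratenet
%env DATA_DIR=/content/drive/MyDrive/heartratenet_data

# Set this path if you don't run the notebook from the samples directory.
# %env NOTEBOOK_ROOT=~/tao-samples/heartratenet

%env SPECS_DIR=/content/drive/MyDrive/ColabNotebooks/tensorflow/heartratenet/specs

# Restore the spec paths used by the dataset_convert, train and evaluate commands.
os.environ["DATAIO_SPEC"] = os.path.join(os.environ["SPECS_DIR"], 'heartratenet_data_generation.yaml')
os.environ["TRAIN_SPEC"] = os.path.join(os.environ["SPECS_DIR"], 'heartratenet_tlt_pretrain.yaml')

# Showing list of specification files.
!ls -rlt $SPECS_DIR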

## 3. Generate tfrecords from RGB videos <a class="anchor" id="head-3"></a>
* Download haarcascade classifier and prepare directory
* Generate required motion and appearance maps for attention network.
* Create the tfrecords using the tao command

### 3.1 Download haarcascade classifier <a class="anchor" id="head-3-1"></a>
Obtain the haarcascade classifier from https://github.com/opencv/opencv/blob/master/data/haarcascades/haarcascade_frontalface_default.xml

After downloading the haarcascade classifier, place `haarcascade_frontalface_default.xml` in `$DATA_DIR`.

You will have the following path:

* haarcascade file in `$DATA_DIR/haarcascade_frontalface_default.xml`

In [None]:
!wget https://github.com/opencv/opencv/raw/master/data/haarcascades/haarcascade_frontalface_default.xml -P $DATA_DIR/

Note: Make sure the file has been downloaded successfully. Without it, the rest of the notebook will not work.

In [None]:
# Check to see if haar classifier is present.
!if [ ! -f $DATA_DIR/haarcascade_frontalface_default.xml ]; then echo "Classifier not found, please ensure classifier is in proper file"; else echo "Classifier found.";fi
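
A file can exist and still be unusable, for example when the download saved an HTML error page instead of the classifier. The sketch below goes one step further than the existence check. It is an illustration, not part of the original notebook: the helper name `looks_like_cascade` is hypothetical, and it relies on OpenCV cascade files being XML documents whose root element is `<opencv_storage>`.

```python
import os
import xml.etree.ElementTree as ET


def looks_like_cascade(path):
    """Return True if `path` parses as XML with an <opencv_storage> root element."""
    try:
        root = ET.parse(path).getroot()
    except (OSError, ET.ParseError):
        # Missing file, unreadable file, or content that is not well-formed XML.
        return False
    return root.tag == "opencv_storage"


cascade_path = os.path.join(
    os.environ.get("DATA_DIR", "."), "haarcascade_frontalface_default.xml"
)
if looks_like_cascade(cascade_path):
    print("Cascade file looks valid.")
else:
    print("Cascade file is missing or is not a valid OpenCV XML file.")
```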

### 3.2 Generate tfrecords <a class="anchor" id="head-3-2"></a>

In [None]:
!mkdir -p $DATA_DIR/processed

In [None]:
!tao heartratenet dataset_convert --experiment_spec_file $DATAIO_SPEC

In [None]:
# Check to see if tfrecords are present.
!if [ ! -f $DATA_DIR/processed/train.tfrecord ]; then echo "Did not find training file, please ensure training record is generated."; else echo "Found training record";fi
!if [ ! -f $DATA_DIR/processed/validation.tfrecord ]; then echo "Did not find validation file, please ensure validation record is generated."; else echo "Found validation record";fi
!if [ ! -f $DATA_DIR/processed/test.tfrecord ]; then echo "Did not find test file, please ensure test record is generated."; else echo "Found test record";fi

## 4. Provide training specification <a class="anchor" id="head-4"></a>
* Tfrecords for the training dataset
    * In order to use the newly generated tfrecords for training, update the `tfrecords_directory_path` parameter of `dataset_info` section in the spec file at `$TRAIN_SPEC`
* Pre-trained model path
    * Update `checkpoint_dir` in the spec file `$TRAIN_SPEC`
* Other training (hyper-)parameters such as batch size, number of epochs, learning rate, etc.
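
The path updates listed above can also be applied programmatically instead of by hand. The sketch below works on a spec that has already been loaded into a Python dict. The helper name `with_training_paths` and the trimmed example spec are hypothetical; the key layout (`results_dir` at the top level, `tfrecords_directory_path` under `dataloader.dataset_info`) follows the spec check earlier in this notebook. `checkpoint_dir` is left out because its location in the spec is not shown here.

```python
import copy


def with_training_paths(train_spec, tfrecords_dir, results_dir):
    """Return a copy of a train-spec dict with the dataset and results paths replaced."""
    updated = copy.deepcopy(train_spec)  # leave the caller's dict untouched
    updated["results_dir"] = results_dir
    dataset_info = updated.setdefault("dataloader", {}).setdefault("dataset_info", {})
    dataset_info["tfrecords_directory_path"] = tfrecords_dir
    return updated


# Hypothetical, heavily trimmed spec; a real spec has many more fields.
example_spec = {
    "results_dir": "/old/results",
    "dataloader": {"dataset_info": {"tfrecords_directory_path": "/old/data"}},
}
new_spec = with_training_paths(
    example_spec,
    tfrecords_dir="/path/to/tfrecords",
    results_dir="/path/to/results",
)
print(new_spec["dataloader"]["dataset_info"]["tfrecords_directory_path"])
```

To apply this to the real file, the spec could be read with `yaml.safe_load` and written back with `yaml.safe_dump`; note that a round trip through PyYAML drops any comments in the file.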


In [None]:
!cat $TRAIN_SPEC

## 5. Run TAO training <a class="anchor" id="head-5"></a>
* Provide the sample spec file and the output directory location for models

In [None]:
!tao heartratenet train -e $TRAIN_SPEC \
                        -k $KEY \
                        -r $EXPERIMENT_DIR/model

## 6. Evaluate the trained model <a class="anchor" id="head-6"></a>

In [None]:
!tao heartratenet evaluate -e $TRAIN_SPEC \
                           -k $KEY \
                           -m $EXPERIMENT_DIR/model/ \
                           -r $EXPERIMENT_DIR/eval_results

## 7. Inference <a class="anchor" id="head-7"></a>
* Ensure your data is in the required format, as indicated in the model card
* Set `m` to the full path of the model to use for inference
* Set `subject_infer_dir` and `subject` below to match your data
* Set `results_dir` to your desired result directory
* Set `fps` to the frame rate of the inference data; the COHFACE dataset was recorded at 20 fps
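
If the frame rate of your own recording is unknown, it can be estimated from the frame timestamps. The following is a minimal sketch, not part of the TAO tooling: the helper name `estimate_fps` is hypothetical, and it assumes `image_timestamps.csv` has a header row `ID,Time` with `Time` expressed in seconds.

```python
import csv
import io


def estimate_fps(timestamp_rows):
    """Estimate frames per second from rows carrying a `Time` column in seconds."""
    times = [float(row["Time"]) for row in timestamp_rows]
    if len(times) < 2 or times[-1] <= times[0]:
        raise ValueError("need at least two increasing timestamps")
    # N frames span N - 1 intervals between the first and last timestamp.
    return (len(times) - 1) / (times[-1] - times[0])


# Hypothetical excerpt of an image_timestamps.csv file, one frame every 0.05 s.
sample_csv = io.StringIO("ID,Time\n0,0.00\n1,0.05\n2,0.10\n3,0.15\n4,0.20\n")
fps = estimate_fps(csv.DictReader(sample_csv))
print(round(fps))
```

For a real file, pass `csv.DictReader(open(path, newline=""))` instead of the in-memory sample.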

In [None]:
!tao heartratenet inference -m $EXPERIMENT_DIR/model/model.tlt \
                            --subject_infer_dir $DATA_DIR \
                            --subject cohface_processed/1/0 \
                            --results_dir $EXPERIMENT_DIR \
                            --fps 20 \
                            -k $KEY \
                            -c channels_first

In [None]:
import os
import cv2
import IPython.display
import PIL.Image

# These must match the arguments passed to `tao heartratenet inference` above.
subject_infer_dir = os.environ['DATA_DIR']
subject = 'cohface_processed/1/0'
display_freq = 30  # show one frame out of every `display_freq`
results_file = os.path.join(os.environ['EXPERIMENT_DIR'], 'results.txt')
with open(results_file, 'r') as file:
    results = file.read()

subject_work_dir = os.path.join(subject_infer_dir, subject, 'images')
# The `%04d.bmp` pattern makes OpenCV read the numbered frames as an image sequence.
cap = cv2.VideoCapture(os.path.join(subject_work_dir, '%04d.bmp'))
frame_idx = 0
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    if frame_idx % display_freq == 0:
        frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        IPython.display.display(PIL.Image.fromarray(frame))
    frame_idx += 1
cap.release()
print(results)