## 0. Set up env variables <a class="anchor" id="head-0"></a>

In [None]:
# Setting up env variables for cleaner command line commands.
import os

%set_env KEY=nvidia_tlt
%set_env GPU_INDEX=0
%set_env USER_EXPERIMENT_DIR=/workspace/tlt-experiments/unet
%set_env DATA_DOWNLOAD_DIR=/workspace/tlt-experiments/Kvasir-SEG-TLT

# Set this path if you don't run the notebook from the samples directory.
# %env NOTEBOOK_ROOT=~/tlt-samples/unet

# Define the local project directory that needs to be mapped to the TLT docker session.
# The dataset is expected to be present in $LOCAL_PROJECT_DIR/Kvasir-SEG-TLT, while the results for the steps
# in this notebook will be stored at $LOCAL_PROJECT_DIR/unet.
# !PLEASE MAKE SURE TO UPDATE THIS PATH!.
%env LOCAL_PROJECT_DIR=/raid/home/eason/Unet_training

# !PLEASE MAKE SURE TO UPDATE THIS PATH!.
# Point to the 'deps' folder in the samples directory from which you launch this notebook (inside the unet folder).
%env PROJECT_DIR=/raid/home/eason/Unet_training

os.environ["LOCAL_DATA_DIR"] = os.path.join(
    os.getenv("LOCAL_PROJECT_DIR", os.getcwd()),
    "Kvasir-SEG-TLT"
)
os.environ["LOCAL_EXPERIMENT_DIR"] = os.path.join(
    os.getenv("LOCAL_PROJECT_DIR", os.getcwd()),
    "unet"
)

# The sample spec files are present in the same path as the downloaded samples.
os.environ["LOCAL_SPECS_DIR"] = os.path.join(
    os.getenv("NOTEBOOK_ROOT", os.getcwd()),
    "specs"
)
%set_env SPECS_DIR=/workspace/tlt-experiments/examples/unet/specs

! ls -l $LOCAL_DATA_DIR

In [None]:
# Map the local directories into the TLT docker container.
import json
mounts_file = os.path.expanduser("~/.tlt_mounts.json")

# Define the dictionary with the mapped drives
drive_map = {
    "Mounts": [
        # Mapping the data directory
        {
            "source": os.environ["LOCAL_PROJECT_DIR"],
            "destination": "/workspace/tlt-experiments"
        },
        # Mapping the specs directory.
        {
            "source": os.environ["LOCAL_SPECS_DIR"],
            "destination": os.environ["SPECS_DIR"]
        },
    ]
}

# Writing the mounts file.
with open(mounts_file, "w") as mfile:
    json.dump(drive_map, mfile, indent=4)
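
The mounts written above are why the notebook uses two sets of paths: `$USER_EXPERIMENT_DIR` and `$SPECS_DIR` are valid inside the container, while `$LOCAL_EXPERIMENT_DIR` and `$LOCAL_SPECS_DIR` are valid on the host. The sketch below shows one way to translate a container path back to a host path from a `"Mounts"` list. The helper name `container_to_host` and the example host paths are hypothetical, and the rule that the longest matching destination wins is an assumption based on how nested bind mounts shadow their parent directory.

```python
import posixpath


def container_to_host(path, mounts):
    """Translate a container path to a host path using a "Mounts" list.

    Returns None when no mount destination is a prefix of `path`.
    """
    best = None
    for mount in mounts:
        dest = mount["destination"].rstrip("/")
        # Match the destination itself or anything below it.
        if path == dest or path.startswith(dest + "/"):
            # Prefer the most specific (longest) destination.
            if best is None or len(dest) > len(best["destination"].rstrip("/")):
                best = mount
    if best is None:
        return None
    dest = best["destination"].rstrip("/")
    relative = path[len(dest):].lstrip("/")
    return posixpath.join(best["source"], relative) if relative else best["source"]


# Placeholder host paths; substitute the values from your own drive_map.
example_mounts = [
    {"source": "/home/user/project",
     "destination": "/workspace/tlt-experiments"},
    {"source": "/home/user/samples/unet/specs",
     "destination": "/workspace/tlt-experiments/examples/unet/specs"},
]

print(container_to_host("/workspace/tlt-experiments/unet", example_mounts))
print(container_to_host(
    "/workspace/tlt-experiments/examples/unet/specs/spec.txt", example_mounts))
```

With the real notebook values, passing `drive_map["Mounts"]` and `os.environ["USER_EXPERIMENT_DIR"]` should give the same directory as `LOCAL_EXPERIMENT_DIR`.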

## 1. Installing the TLT launcher <a class="anchor" id="head-1"></a>
The TLT launcher is a Python package distributed as a Python wheel listed in the `nvidia-pyindex` Python index. You can install the launcher by executing the following cell.

TLT recommends running the TLT launcher in a virtual environment with Python 3.6.9. You can follow the instructions on this [page](https://virtualenvwrapper.readthedocs.io/en/latest/install.html) to set up a Python virtual environment using the `virtualenv` and `virtualenvwrapper` packages. Once you have set up virtualenvwrapper, set the Python version to be used in the virtual environment through the `VIRTUALENVWRAPPER_PYTHON` variable by running:

```sh
export VIRTUALENVWRAPPER_PYTHON=/path/to/bin/python3.x
```
where `x` is >= 6 and <= 8.

We recommend performing this step first and then launching the notebook from the virtual environment. In addition to installing the TLT Python package, make sure the following software requirements are met:
* python >= 3.6.9, < 3.8.x
* docker-ce > 19.03.5
* docker-API 1.40
* nvidia-container-toolkit > 1.3.0-1
* nvidia-container-runtime > 3.4.0-1
* nvidia-docker2 > 2.5.0-1
* nvidia-driver > 455+
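
A quick way to check the first requirement from inside the notebook is sketched below. The bounds come from the list above, with `< 3.8.x` read as "below 3.8"; that reading is an assumption, so adjust `MAX_VERSION_EXCLUSIVE` if your TLT release documents a different upper bound. The function name `python_version_supported` is illustrative only.

```python
import sys

# Bounds taken from the requirement list (">= 3.6.9, < 3.8.x").
# Reading the upper bound as "below 3.8" is an assumption.
MIN_VERSION = (3, 6, 9)
MAX_VERSION_EXCLUSIVE = (3, 8, 0)


def python_version_supported(version=None):
    """Return True if `version` (default: the running interpreter) is in range."""
    version = tuple(sys.version_info[:3]) if version is None else tuple(version)
    return MIN_VERSION <= version < MAX_VERSION_EXCLUSIVE


current = ".".join(str(part) for part in sys.version_info[:3])
print("Python", current, "supported:", python_version_supported())
```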

Once you have installed the prerequisites, log in to the docker registry nvcr.io by running the command below:

```sh
docker login nvcr.io
```

You will be prompted to enter a username and password. The username is `$oauthtoken` and the password is the API key generated from `ngc.nvidia.com`. Follow the instructions in the [NGC setup guide](https://docs.nvidia.com/ngc/ngc-overview/index.html#generating-api-key) to generate your own API key.
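
If you prefer a non-interactive login, for example on a headless machine, the same step can be sketched in Python using the `--username` and `--password-stdin` options of `docker login`. The helper names below are hypothetical, and how you store the API key is up to you. Because the command is passed as an argument list rather than through a shell, the literal username `$oauthtoken` is not expanded as a variable.

```python
import subprocess


def docker_login_command(registry="nvcr.io", username="$oauthtoken"):
    """Build a `docker login` argument list that reads the password from stdin."""
    return ["docker", "login", registry,
            "--username", username, "--password-stdin"]


def docker_login(api_key, registry="nvcr.io"):
    """Run `docker login`, feeding the API key on stdin; return the exit code."""
    completed = subprocess.run(
        docker_login_command(registry),
        input=api_key.encode("utf-8"),
        check=False,
    )
    return completed.returncode


# Show the command without running it.
print(" ".join(docker_login_command()))
```

A call such as `docker_login(os.environ["NGC_API_KEY"])` would then perform the login, where `NGC_API_KEY` is an assumed environment variable name holding your key.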


In [None]:
# SKIP this step IF you have already installed the tlt launcher.
!pip3 install nvidia-pyindex
!pip3 install nvidia-tlt

In [None]:
# View the versions of the TLT launcher
!tlt info

## 2. Prepare pre-trained model <a class="anchor" id="head-1"></a>

We will use the NGC CLI to get the pre-trained models. For more details, go to ngc.nvidia.com and click SETUP in the navigation bar.

*Note: When using `vanilla_unet` as the arch for binary segmentation, this pre-trained model section can be skipped. Pre-trained weights are available only for the ResNet and VGG templates.*

In [None]:
# Installing NGC CLI on the local machine.
## Download and install
%env CLI=ngccli_reg_linux.zip
!mkdir -p $LOCAL_PROJECT_DIR/ngccli

# Remove any previously existing CLI installations
!rm -rf $LOCAL_PROJECT_DIR/ngccli/*
!wget "https://ngc.nvidia.com/downloads/$CLI" -P $LOCAL_PROJECT_DIR/ngccli
!unzip -u "$LOCAL_PROJECT_DIR/ngccli/$CLI" -d $LOCAL_PROJECT_DIR/ngccli/
!rm $LOCAL_PROJECT_DIR/ngccli/*.zip 
os.environ["PATH"]="{}/ngccli:{}".format(os.getenv("LOCAL_PROJECT_DIR", ""), os.getenv("PATH", ""))

In [None]:
!ngc registry model list nvidia/tlt_semantic_segmentation:*

In [None]:
!mkdir -p $LOCAL_EXPERIMENT_DIR/pretrained_resnet101/

In [None]:
# Pull pretrained model from NGC
!ngc registry model download-version nvidia/tlt_semantic_segmentation:resnet101 --dest $LOCAL_EXPERIMENT_DIR/pretrained_resnet101

In [None]:
print("Check that model is downloaded into dir.")
!ls -l $LOCAL_EXPERIMENT_DIR/pretrained_resnet101/tlt_semantic_segmentation_vresnet101

## 3. Provide training specification <a class="anchor" id="head-3"></a>

* Images and Masks path
    * To use the newly generated images and masks folders, update the `dataset_config` parameter in the spec file at `$SPECS_DIR/unet_train_resnet_unet_Kvasir_SEG.txt`.
    * Update the train and val image and mask paths. The test set requires only the images path.
* Pre-trained models
* Augmentation parameters for on-the-fly data augmentation
* Other training (hyper-)parameters such as batch size, number of epochs, learning rate, etc.
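
Before training, it can help to list every path the spec file refers to so that typos are caught early. The sketch below pulls all `*_path: "..."` entries out of the spec text with a regular expression. The field names and directory layout in `sample_spec` are assumptions made for illustration, not values taken from the actual spec file, and the helper name `spec_paths` is hypothetical.

```python
import re

# Matches lines such as:   some_field_path: "/some/dir"
PATH_FIELD = re.compile(r'^\s*(\w+_path)\s*:\s*"([^"]*)"', re.MULTILINE)


def spec_paths(spec_text):
    """Return {field_name: path} for every quoted *_path entry in the spec text."""
    return dict(PATH_FIELD.findall(spec_text))


# Illustrative snippet only; field names and folders are assumed.
sample_spec = '''
dataset_config {
  train_images_path: "/workspace/tlt-experiments/Kvasir-SEG-TLT/images/train"
  train_masks_path: "/workspace/tlt-experiments/Kvasir-SEG-TLT/masks/train"
  test_images_path: "/workspace/tlt-experiments/Kvasir-SEG-TLT/images/test"
}
'''

for field, path in spec_paths(sample_spec).items():
    print(field, "->", path)
```

The paths in the spec are container paths, so they need to be mapped back to host paths before checking them with `os.path.isdir` from the notebook.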


In [None]:
!cat $LOCAL_SPECS_DIR/unet_train_resnet_unet_Kvasir_SEG.txt

## 4. Run TLT training <a class="anchor" id="head-4"></a>
* Provide the sample spec file and the output directory location for models
* WARNING: training can take anywhere from several hours to a full day to complete

In [None]:
print("For multi-GPU, change --gpus based on your machine.")
!tlt unet train --gpus=1 --gpu_index=$GPU_INDEX \
              -e $SPECS_DIR/unet_train_resnet_unet_Kvasir_SEG.txt \
              -r $USER_EXPERIMENT_DIR/Kvasir_SEG_experiment_unpruned \
              -m $USER_EXPERIMENT_DIR/pretrained_resnet101/tlt_semantic_segmentation_vresnet101/resnet_101.hdf5 \
              -n model_Kvasir_SEG \
              -k $KEY 

UNet supports restarting from a checkpoint. If the training job is killed prematurely, you can resume training from the closest checkpoint by re-running the same command. Make sure to use the same number of GPUs when restarting the training.

In [None]:
print('Model for every epoch at checkpoint_interval mentioned in the spec file:')
print('---------------------')
!ls -ltrh $LOCAL_EXPERIMENT_DIR/Kvasir_SEG_experiment_unpruned/
!ls -ltrh $LOCAL_EXPERIMENT_DIR/Kvasir_SEG_experiment_unpruned/weights

## 5. Evaluate trained models <a class="anchor" id="head-5"></a>

The model saved at the last training step in the `$USER_EXPERIMENT_DIR/Kvasir_SEG_experiment_unpruned/weights` directory is used for evaluation, inference, and export. Evaluation also creates `$LOCAL_EXPERIMENT_DIR/Kvasir_SEG_experiment_unpruned/results_tlt.json`.

In [None]:
!tlt unet evaluate --gpu_index=$GPU_INDEX -e $SPECS_DIR/unet_train_resnet_unet_Kvasir_SEG.txt \
                 -m $USER_EXPERIMENT_DIR/Kvasir_SEG_experiment_unpruned/weights/model_Kvasir_SEG.tlt \
                 -o $USER_EXPERIMENT_DIR/Kvasir_SEG_experiment_unpruned/ \
                 -k $KEY

In [None]:
!cat $LOCAL_EXPERIMENT_DIR/Kvasir_SEG_experiment_unpruned/results_tlt.json

## 6. Model Export <a class="anchor" id="head-7"></a>

In [None]:
# tlt-export will fail if .etlt already exists. So we clear the export folder before tlt-export
!rm -rf $LOCAL_EXPERIMENT_DIR/export
# Export in FP32 mode. 
!mkdir -p $LOCAL_EXPERIMENT_DIR/export 
!tlt unet export --gpu_index=$GPU_INDEX -m $USER_EXPERIMENT_DIR/Kvasir_SEG_experiment_unpruned/weights/model_Kvasir_SEG.tlt \
               -k $KEY \
               -e $SPECS_DIR/unet_train_resnet_unet_Kvasir_SEG.txt \
               --data_type fp32 \
               --engine_file $USER_EXPERIMENT_DIR/export/trtfp32.Kvasir_SEG.engine

In [None]:
# Verify the TensorRT engine accuracy on the validation dataset. This generates the results file results_trt.json in
# the directory passed to the -o argument.

!tlt unet evaluate --gpu_index=$GPU_INDEX -e $SPECS_DIR/unet_train_resnet_unet_Kvasir_SEG.txt \
                 -m $USER_EXPERIMENT_DIR/export/trtfp32.Kvasir_SEG.engine \
                 -o $USER_EXPERIMENT_DIR/Kvasir_SEG_experiment_unpruned/ \
                 -k $KEY

In [None]:
!cat $LOCAL_EXPERIMENT_DIR/Kvasir_SEG_experiment_unpruned/results_trt.json
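
To see how much the TensorRT engine deviates from the `.tlt` model, the two result files can be compared key by key. The sketch below makes no assumption about which metrics the files contain: it flattens any nested JSON into dotted keys and reports the difference for every numeric value present in both. The helper names and the example dictionaries are made up for illustration.

```python
import json


def numeric_leaves(obj, prefix=""):
    """Flatten nested JSON into {dotted.key: float}, skipping non-numeric leaves."""
    leaves = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            leaves.update(numeric_leaves(value, f"{prefix}{key}."))
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            leaves.update(numeric_leaves(value, f"{prefix}{index}."))
    elif isinstance(obj, (int, float)) and not isinstance(obj, bool):
        leaves[prefix.rstrip(".")] = float(obj)
    return leaves


def compare_results(tlt_results, trt_results):
    """Return {key: trt_value - tlt_value} for numeric keys found in both inputs."""
    tlt_flat = numeric_leaves(tlt_results)
    trt_flat = numeric_leaves(trt_results)
    shared = sorted(tlt_flat.keys() & trt_flat.keys())
    return {key: trt_flat[key] - tlt_flat[key] for key in shared}


# Made-up values; real inputs would come from json.load() on the two files.
tlt_example = {"metric_a": 0.5, "per_class": {"x": 0.25}}
trt_example = {"metric_a": 0.75, "per_class": {"x": 0.25}}
print(json.dumps(compare_results(tlt_example, trt_example), indent=2))
```

In the notebook, the inputs would be loaded with `json.load` from `results_tlt.json` and `results_trt.json` under `$LOCAL_EXPERIMENT_DIR/Kvasir_SEG_experiment_unpruned/`.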

In [None]:
!rm -rf $LOCAL_EXPERIMENT_DIR/Kvasir_SEG_experiment_unpruned/weights/*.etlt

In [None]:
# Export in INT8 mode.
!mkdir -p $LOCAL_EXPERIMENT_DIR/export
!tlt unet export --gpu_index=$GPU_INDEX -m $USER_EXPERIMENT_DIR/Kvasir_SEG_experiment_unpruned/weights/model_Kvasir_SEG.tlt \
                -k $KEY \
                -e $SPECS_DIR/unet_train_resnet_unet_Kvasir_SEG.txt \
                --data_type int8 \
                --engine_file $USER_EXPERIMENT_DIR/export/int8.Kvasir_SEG.engine \
                --cal_data_file $USER_EXPERIMENT_DIR/export/Kvasir_SEG_cal_data_file.txt \
                --cal_cache_file $USER_EXPERIMENT_DIR/export/Kvasir_SEG_cal.bin

In [None]:
# Check if etlt model is correctly saved.
!ls -l $LOCAL_EXPERIMENT_DIR/Kvasir_SEG_experiment_unpruned/weights/