## Switch to CPU Instance (advisable only for non-Colab-Pro instances)

1. Switch to a CPU instance until Step 2 for tasks that do not depend on the GPU.
2. This increases the time available for GPU-dependent tasks on a Colab instance.
3. Change the runtime type to CPU via Runtime (top-left tab) -> Change Runtime Type -> None (Hardware Accelerator).
4. Then click Connect (top right).



## Mounting Google Drive
Mount your Google Drive storage to this Colab instance.

In [1]:
try:
    import google.colab
    %env GOOGLE_COLAB=1
    from google.colab import drive
    drive.mount('/content/drive', force_remount=True)
except ImportError:
    %env GOOGLE_COLAB=0
    print("Warning: Not a Colab Environment")

env: GOOGLE_COLAB=1
Mounted at /content/drive


# Object Detection using TAO YOLOv4

Transfer learning is the process of transferring learned features from one application to another. It is a commonly used training technique where you use a model trained on one task and re-train to use it on a different task. 

Train Adapt Optimize (TAO) Toolkit is a simple and easy-to-use Python-based AI toolkit for taking purpose-built AI models and customizing them with users' own data.

<img align="center" src="https://developer.nvidia.com/sites/default/files/akamai/TAO/tlt-tao-toolkit-bring-your-own-model-diagram.png" width="1080">


## Learning Objectives
In this notebook, you will learn how to leverage the simplicity and convenience of TAO to:

* Take a pretrained ResNet-18 model and train a ResNet-18 YOLOv4 model on the KITTI dataset
* Prune the trained YOLOv4 model
* Retrain the pruned model to recover lost accuracy
* Export the pruned model
* Quantize the pruned model using QAT
* Run inference on the trained model
* Export the pruned, quantized and retrained model to a .etlt file for deployment to DeepStream

## Table of Contents

This notebook shows an example use case of YOLOv4 object detection using the Train Adapt Optimize (TAO) Toolkit.

0. [Set up env variables](#head-0)
1. [Prepare dataset and pre-trained model](#head-1) <br>
     1.1 [Download the dataset](#head-1-1)<br>
     1.2 [Verify the downloaded dataset](#head-1-2)<br>
     1.3 [Download pretrained model](#head-1-3)
2. [Setup GPU environment](#head-2) <br>
    2.1 [Connect to GPU Instance](#head-2-1) <br>
    2.2 [Mounting Google drive](#head-2-2) <br>
    2.3 [Setup Python environment](#head-2-3) <br>
    2.4 [Reset env variables](#head-2-4) <br>
3. [Generate TF records](#head-3)
4. [Provide training specification](#head-4)
5. [Run TAO training](#head-5)
6. [Evaluate trained models](#head-6)
7. [Prune trained models](#head-7)
8. [Retrain pruned models](#head-8)
9. [Evaluate retrained model](#head-9)
10. [Visualize inferences](#head-10)


#### Note
1. By default, this notebook is set up to run training on 1 GPU. To use more GPUs, update the env variable `$NUM_GPUS` accordingly.
1. This notebook uses the KITTI dataset by default, which is about 12 GB. If you are limited by Google Drive storage, we recommend that you:

    i. Download the dataset onto the local system

    ii. Run the utility script at $COLAB_NOTEBOOKS_PATH/tensorflow/utils/generate_kitti_subset.py on your local system

    iii. This generates a subset of the KITTI dataset with the number of sample images you choose

    iv. Upload this subset onto Google Drive

1. Using the default config/spec file provided in this notebook, each weight file size of yolo_v4 created during training will be ~400 MB
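
The subset steps in the note above can be sketched as follows. This is a minimal illustration, not the repository's `generate_kitti_subset.py`: the function name `copy_kitti_subset`, its parameters, the `image_2`/`label_2` output folders, and the assumption that every image `<id>.png` has a matching label `<id>.txt` are all hypothetical.

```python
import random
import shutil
from pathlib import Path


def copy_kitti_subset(image_dir, label_dir, out_dir, num_samples, seed=42):
    """Copy a random sample of image/label pairs into out_dir.

    Assumes (hypothetically) that every image <id>.png has a label <id>.txt.
    Returns the sorted list of sample ids that were copied.
    """
    image_dir, label_dir, out_dir = Path(image_dir), Path(label_dir), Path(out_dir)
    # Keep only ids that have both an image and a label file.
    ids = sorted(p.stem for p in image_dir.glob("*.png")
                 if (label_dir / f"{p.stem}.txt").exists())
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    chosen = sorted(rng.sample(ids, min(num_samples, len(ids))))
    (out_dir / "image_2").mkdir(parents=True, exist_ok=True)
    (out_dir / "label_2").mkdir(parents=True, exist_ok=True)
    for sample_id in chosen:
        shutil.copy2(image_dir / f"{sample_id}.png", out_dir / "image_2")
        shutil.copy2(label_dir / f"{sample_id}.txt", out_dir / "label_2")
    return chosen
```

Run locally, the resulting `out_dir` is the folder you would then upload to Google Drive.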

## 0. Set up env variables and set FIXME parameters <a class="anchor" id="head-0"></a>

*Note: By default, this notebook is set up to run training on 1 GPU. To use more GPUs, update the env variable `$NUM_GPUS` accordingly.*

#### FIXME
1. NUM_GPUS - set this to <= the number of GPUs available on the instance
1. COLAB_NOTEBOOKS_PATH - for Google Colab environment, set this path where you want to clone the repo to; for local system environment, set this path to the already cloned repo
1. EXPERIMENT_DIR - set this path to a folder location where pretrained models, checkpoints and log files during different model actions will be saved
1. delete_existing_experiments - set to True to remove existing pretrained models, checkpoints and log files of a previous experiment
1. DATA_DIR - set this path to the folder location where you want the dataset to be stored
1. delete_existing_data - set this to True to remove existing preprocessed and original data

In [None]:
# Setting up env variables for cleaner command line commands.
import os

%env TAO_DOCKER_DISABLE=1

%env KEY=nvidia_tlt
#FIXME1
%env NUM_GPUS=1

#FIXME2
%env COLAB_NOTEBOOKS_PATH=/content/drive/MyDrive/Colab_Notebooks/benchmark_project/V7_reduced/nvidia-tao
if os.environ["GOOGLE_COLAB"] == "1":
    if not os.path.exists(os.environ["COLAB_NOTEBOOKS_PATH"]):
        !git clone https://github.com/NVIDIA-AI-IOT/nvidia-tao.git $COLAB_NOTEBOOKS_PATH
else:
    if not os.path.exists(os.environ["COLAB_NOTEBOOKS_PATH"]):
        raise Exception("Error, enter the path of the colab notebooks repo correctly")

#FIXME3
%env EXPERIMENT_DIR=/content/drive/MyDrive/Colab_Notebooks/benchmark_project/V7_reduced/TAO_V7_results/yolo_v4
#FIXME4
delete_existing_experiments = True
#FIXME5
%env DATA_DIR=/content/drive/MyDrive/Colab_Notebooks/benchmark_project/V7_reduced/
#FIXME6
# Keep this False: `rm -rf $DATA_DIR` would wipe the whole project folder, including the cloned repo and the experiment results.
delete_existing_data = False

if delete_existing_experiments:
    !sudo rm -rf $EXPERIMENT_DIR
if delete_existing_data:
    !sudo rm -rf $DATA_DIR

SPECS_DIR=f"{os.environ['COLAB_NOTEBOOKS_PATH']}/tensorflow/yolo_v4/specs"
%env SPECS_DIR={SPECS_DIR}
# Showing list of specification files.
!ls -rlt $SPECS_DIR

!sudo mkdir -p $DATA_DIR && sudo chmod -R 777 $DATA_DIR
!sudo mkdir -p $EXPERIMENT_DIR && sudo chmod -R 777 $EXPERIMENT_DIR

env: TAO_DOCKER_DISABLE=1
env: KEY=nvidia_tlt
env: NUM_GPUS=1
env: COLAB_NOTEBOOKS_PATH=/content/drive/MyDrive/Colab_Notebooks/benchmark_project/V7_reduced/nvidia-tao
env: EXPERIMENT_DIR=/content/drive/MyDrive/Colab_Notebooks/benchmark_project/V7_reduced/TAO_V7_results/yolo_v4
env: DATA_DIR=/content/drive/MyDrive/Colab_Notebooks/benchmark_project/V7_reduced/
env: SPECS_DIR=/content/drive/MyDrive/Colab_Notebooks/benchmark_project/V7_reduced/nvidia-tao/tensorflow/yolo_v4/specs
total 9
-rw------- 1 root root 2218 Apr 24 03:00 yolo_v4_train_resnet18_kitti.txt
-rw------- 1 root root  260 Apr 24 03:00 yolo_v4_tfrecords_kitti_val.txt
-rw------- 1 root root  274 Apr 24 03:00 yolo_v4_tfrecords_kitti_train.txt
-rw------- 1 root root 2197 Apr 24 03:00 yolo_v4_retrain_resnet18_kitti.txt
-rw------- 1 root root 2953 Apr 24 15:25 yolo_v4_train_resnet18_kitti_3.txt


## 1. Prepare dataset and pre-trained model <a class="anchor" id="head-1"></a>

 We will be using the KITTI detection dataset for the tutorial. To find more details please visit
 http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=2d. Please download the KITTI detection images (http://www.cvlibs.net/download.php?file=data_object_image_2.zip) and labels (http://www.cvlibs.net/download.php?file=data_object_label_2.zip) to $DATA_DIR.
 
 The data will then be extracted to have
 * training images in `$DATA_DIR/training/image_2`
 * training labels in `$DATA_DIR/training/label_2`
 * testing images in `$DATA_DIR/testing/image_2`
 
You may use this notebook with your own dataset as well. To use this example with your own dataset, please follow the same directory structure as described above.

*Note: There are no labels for the testing images, therefore we use it just to visualize inferences for the trained model.*
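
Before moving on, it can help to confirm that the extracted data really has the layout listed above. The following is a small sketch; the helper name `missing_dataset_dirs` and the `EXPECTED_SUBDIRS` constant are hypothetical and not part of TAO.

```python
import os

# Sub-directories named in the section above, relative to $DATA_DIR.
EXPECTED_SUBDIRS = ("training/image_2", "training/label_2", "testing/image_2")


def missing_dataset_dirs(data_dir, expected=EXPECTED_SUBDIRS):
    """Return the expected sub-directories that do not exist under data_dir."""
    return [sub for sub in expected
            if not os.path.isdir(os.path.join(data_dir, sub))]
```

Calling `missing_dataset_dirs(os.environ["DATA_DIR"])` returns an empty list when the layout is complete. With your own dataset, pass your own folder names as `expected`.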

### 1.1 Download the dataset <a class="anchor" id="head-1-1"></a>

Once you have received the download links by email, use them as the values of `URL_IMAGES` and `URL_LABELS` in the next cell, which downloads the archives into `$DATA_DIR`.

In [None]:
import os
!mkdir -p $DATA_DIR
os.environ["URL_IMAGES"]="https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_image_2.zip"
!if [ ! -f $DATA_DIR/data_object_image_2.zip ]; then wget $URL_IMAGES -O $DATA_DIR/data_object_image_2.zip; else echo "image archive already downloaded"; fi 
os.environ["URL_LABELS"]="https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_label_2.zip"
!if [ ! -f $DATA_DIR/data_object_label_2.zip ]; then wget $URL_LABELS -O $DATA_DIR/data_object_label_2.zip; else echo "label archive already downloaded"; fi 

--2023-04-22 15:43:02--  https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_image_2.zip
Resolving s3.eu-central-1.amazonaws.com (s3.eu-central-1.amazonaws.com)... 52.219.170.93
Connecting to s3.eu-central-1.amazonaws.com (s3.eu-central-1.amazonaws.com)|52.219.170.93|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 12569945557 (12G) [application/zip]
Saving to: ‘/content/drive/MyDrive/kitti_data//data_object_image_2.zip’


2023-04-22 15:51:48 (22.8 MB/s) - ‘/content/drive/MyDrive/kitti_data//data_object_image_2.zip’ saved [12569945557/12569945557]

--2023-04-22 15:51:48--  https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_label_2.zip
Resolving s3.eu-central-1.amazonaws.com (s3.eu-central-1.amazonaws.com)... 52.219.169.113
Connecting to s3.eu-central-1.amazonaws.com (s3.eu-central-1.amazonaws.com)|52.219.169.113|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 5601213 (5.3M) [application/zip]
Saving to: ‘/content/drive/

### 1.2 Verify the downloaded dataset <a class="anchor" id="head-1-2"></a>

In [None]:
# Check the dataset is present
!mkdir -p $DATA_DIR
!if [ ! -f $DATA_DIR/data_object_image_2.zip ]; then echo 'Image zip file not found, please download.'; else echo 'Found Image zip file.';fi
!if [ ! -f $DATA_DIR/data_object_label_2.zip ]; then echo 'Label zip file not found, please download.'; else echo 'Found Labels zip file.';fi

Found Image zip file.
Found Labels zip file.


In [None]:
# This may take a while: verify integrity of zip files 
!sha256sum $DATA_DIR/data_object_image_2.zip | cut -d ' ' -f 1 | grep -xq '^351c5a2aa0cd9238b50174a3a62b846bc5855da256b82a196431d60ff8d43617$' ; \
if test $? -eq 0; then echo "images OK"; else echo "images corrupt, re-download!" && rm -f $DATA_DIR/data_object_image_2.zip; fi 
!sha256sum $DATA_DIR/data_object_label_2.zip | cut -d ' ' -f 1 | grep -xq '^4efc76220d867e1c31bb980bbf8cbc02599f02a9cb4350effa98dbb04aaed880$' ; \
if test $? -eq 0; then echo "labels OK"; else echo "labels corrupt, re-download!" && rm -f $DATA_DIR/data_object_label_2.zip; fi 
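
The same integrity check can be done in Python, for example on a system without `sha256sum`. This is a sketch using only the standard library; the helper names `sha256_of_file` and `archive_is_intact` are hypothetical.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so a multi-gigabyte archive never sits in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_is_intact(path, expected_hex):
    """Compare the file's SHA-256 digest with the expected hex string."""
    return sha256_of_file(path) == expected_hex.lower()
```

Pass the archive path together with the checksum string used in the shell cell above.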

In [None]:
# unpack 
!unzip -u $DATA_DIR/data_object_image_2.zip -d $DATA_DIR
!unzip -u $DATA_DIR/data_object_label_2.zip -d $DATA_DIR

In [None]:
!ls /content/drive/MyDrive/Colab_Notebooks/benchmark_project/V7_reduced/

bulk_results.csv  dataset_test.csv  TAO_V7_results  V7_test   V7_validation
checkpoint	  nvidia-tao	    V7_detector.ag  V7_train


In [None]:
# verify
import os

DATA_DIR = os.environ.get('DATA_DIR')
num_training_images = len(os.listdir(os.path.join(DATA_DIR, "V7_train/V7/JPEGImages")))
num_training_labels = len(os.listdir(os.path.join(DATA_DIR, "V7_train/V7/train_annotation_KITTI")))
num_testing_images = len(os.listdir(os.path.join(DATA_DIR, "V7_test/V7/JPEGImages")))
print("Number of images in the train/val set. {}".format(num_training_images))
print("Number of labels in the train/val set. {}".format(num_training_labels))
print("Number of images in the test set. {}".format(num_testing_images))

Number of images in the train/val set. 5040
Number of labels in the train/val set. 5040
Number of images in the test set. 1080


In [None]:
!cat $DATA_DIR/V7_train/V7/train_annotation_KITTI/0000c035a08c3770.txt

Boat 0.0 0 0.0 0 0 679 995 0.0 0.0 0.0 0.0 0.0 0.0 0.0


In [None]:
# Sample kitti label.
!cat $DATA_DIR/training/label_2/000110.txt

Car 0.27 0 2.50 862.65 129.39 1241.00 304.96 1.73 1.74 4.71 5.50 1.30 8.19 3.07
Car 0.68 3 -0.76 1184.97 141.54 1241.00 187.84 1.52 1.60 4.42 22.39 0.48 24.57 -0.03
Car 0.00 1 1.73 346.64 175.63 449.93 248.90 1.58 1.76 4.18 -5.13 1.67 17.86 1.46
Car 0.00 0 1.75 420.44 170.72 540.83 256.12 1.65 1.88 4.45 -2.78 1.64 16.30 1.58
Car 0.00 0 -0.35 815.59 143.96 962.82 198.54 1.90 1.78 4.72 10.19 0.90 26.65 0.01
Car 0.00 1 -2.09 966.10 144.74 1039.76 182.96 1.80 1.65 3.55 19.49 0.49 35.99 -1.59
Van 0.00 2 -2.07 1084.26 132.74 1173.25 177.89 2.11 1.75 4.31 26.02 0.24 36.41 -1.45
Car 0.00 2 -2.13 1004.98 144.16 1087.13 178.96 1.64 1.70 3.91 21.91 0.30 36.47 -1.59
Car 0.00 2 1.77 407.73 178.44 487.07 230.28 1.55 1.71 4.50 -5.35 1.76 24.13 1.55
Car 0.00 1 1.45 657.19 166.33 702.65 198.71 1.50 1.71 4.44 3.39 1.22 35.96 1.55
Car 0.00 1 -1.46 599.30 171.76 631.96 197.12 1.58 1.71 3.75 0.39 1.54 47.31 -1.45
Car 0.00 0 -1.02 557.79 165.74 591.61 181.27 1.66 1.65 4.45 -3.89 0.91 80.12 -1.07


In [None]:
# Generate val dataset out of training dataset
!python3 $COLAB_NOTEBOOKS_PATH/tensorflow/ssd/generate_val_dataset.py --input_image_dir=$DATA_DIR/training/image_2 \
                                        --input_label_dir=$DATA_DIR/training/label_2 \
                                        --output_dir=$DATA_DIR/val
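
The idea behind carving a validation set out of the training data can be sketched as below. This is not the code of `generate_val_dataset.py`, whose split ratio and selection logic are not shown in this notebook; the function name `split_train_val`, the 10% default, and the fixed seed are assumptions made for illustration.

```python
import random


def split_train_val(sample_ids, val_fraction=0.1, seed=42):
    """Shuffle ids deterministically and split off a validation share.

    Returns (train_ids, val_ids), both sorted.
    """
    if not 0.0 < val_fraction < 1.0:
        raise ValueError("val_fraction must be between 0 and 1")
    ids = sorted(sample_ids)
    random.Random(seed).shuffle(ids)
    num_val = max(1, int(len(ids) * val_fraction))
    return sorted(ids[num_val:]), sorted(ids[:num_val])
```

Whatever the ratio, the two sets must not overlap, otherwise the validation metrics would be measured on images the model has already seen during training.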

### 1.3 Download pre-trained model <a class="anchor" id="head-1-3"></a>

We will use the NGC CLI to get the pre-trained models. For more details, go to [ngc.nvidia.com](https://ngc.nvidia.com) and click SETUP in the navigation bar.

In [None]:
# Installing NGC CLI on the local machine.
## Download and install
%env LOCAL_PROJECT_DIR=/ngc_content/
%env CLI=ngccli_cat_linux.zip
!sudo mkdir -p $LOCAL_PROJECT_DIR/ngccli && sudo chmod -R 777 $LOCAL_PROJECT_DIR

# Remove any previously existing CLI installations
!sudo rm -rf $LOCAL_PROJECT_DIR/ngccli/*
!wget "https://ngc.nvidia.com/downloads/$CLI" -P $LOCAL_PROJECT_DIR/ngccli
!unzip -u -q "$LOCAL_PROJECT_DIR/ngccli/$CLI" -d $LOCAL_PROJECT_DIR/ngccli/
!rm $LOCAL_PROJECT_DIR/ngccli/*.zip 
os.environ["PATH"]="{}/ngccli/ngc-cli:{}".format(os.getenv("LOCAL_PROJECT_DIR", ""), os.getenv("PATH", ""))
!cp /usr/lib/x86_64-linux-gnu/libstdc++.so.6 $LOCAL_PROJECT_DIR/ngccli/ngc-cli/libstdc++.so.6

env: LOCAL_PROJECT_DIR=/ngc_content/
env: CLI=ngccli_cat_linux.zip
--2023-04-24 15:35:23--  https://ngc.nvidia.com/downloads/ngccli_cat_linux.zip
Resolving ngc.nvidia.com (ngc.nvidia.com)... 13.35.8.80, 13.35.8.55, 13.35.8.121, ...
Connecting to ngc.nvidia.com (ngc.nvidia.com)|13.35.8.80|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 42742318 (41M) [application/zip]
Saving to: ‘/ngc_content//ngccli/ngccli_cat_linux.zip’


2023-04-24 15:35:23 (224 MB/s) - ‘/ngc_content//ngccli/ngccli_cat_linux.zip’ saved [42742318/42742318]



In [None]:
!ngc registry model list nvidia/tao/pretrained_object_detection:*

+-----------------------+----------+--------+------------+-----------+------------------+-----------+-----------------+--------------+
| Version               | Accuracy | Epochs | Batch Size | GPU Model | Memory Footprint | File Size | Status          | Created Date |
+-----------------------+----------+--------+------------+-----------+------------------+-----------+-----------------+--------------+
| vgg19                 | 77.56    | 80     | 1          | V100      | 153.7            | 153.72 MB | UPLOAD_COMPLETE | Aug 18, 2021 |
| vgg16                 | 77.17    | 80     | 1          | V100      | 113.2            | 113.16 MB | UPLOAD_COMPLETE | Aug 18, 2021 |
| squeezenet            | 65.13    | 80     | 1          | V100      | 6.5              | 6.46 MB   | UPLOAD_COMPLETE | Aug 18, 2021 |
| resnet50              | 77.91    | 80     | 1          | V100      | 294.2            | 294.2 MB  | UPLOAD_COMPLETE | Aug 18, 2021 |
| resnet34              | 77.04    | 80     | 1        

In [None]:
!mkdir -p $EXPERIMENT_DIR/pretrained_resnet18/

In [None]:
# Pull pretrained model from NGC
!ngc registry model download-version nvidia/tao/pretrained_object_detection:resnet18 \
                    --dest $EXPERIMENT_DIR/pretrained_resnet18

Downloaded 82.38 MB in 21s, Download speed: 3.92 MB/s               
--------------------------------------------------------------------------------
   Transfer id: pretrained_object_detection_vresnet18
   Download status: Completed
   Downloaded local path: /content/drive/MyDrive/Colab_Notebooks/benchmark_project/V7_reduced/TAO_V7_results/yolo_v4/pretrained_resnet18/pretrained_object_detection_vresnet18
   Total files downloaded: 1
   Total downloaded size: 82.38 MB
   Started at: 2023-04-24 15:36:42.880615
   Completed at: 2023-04-24 15:37:03.907746
   Duration taken: 21s
--------------------------------------------------------------------------------


In [None]:
print("Check that model is downloaded into dir.")
!ls -l $EXPERIMENT_DIR/pretrained_resnet18/pretrained_object_detection_vresnet18

Check that model is downloaded into dir.
total 91093
-rw------- 1 root root 93278448 Apr 24 15:37 resnet_18.hdf5


## 2. Setup GPU environment <a class="anchor" id="head-2"></a>


### 2.1 Connect to GPU Instance <a class="anchor" id="head-2-1"></a>

1. Move any data saved to the Colab instance storage to Google Drive.
2. Change the runtime type to GPU via Runtime (top-left tab) -> Change Runtime Type -> GPU (Hardware Accelerator).
3. Then click Connect (top right).



### 2.2 Mounting Google Drive <a class="anchor" id="head-2-2"></a>
Mount your Google Drive storage to this Colab instance.

In [None]:
try:
    import google.colab
    %env GOOGLE_COLAB=1
    from google.colab import drive
    drive.mount('/content/drive', force_remount=True)
except ImportError:
    %env GOOGLE_COLAB=0
    print("Warning: Not a Colab Environment")

env: GOOGLE_COLAB=1
Mounted at /content/drive


### 2.3 Setup Python environment <a class="anchor" id="head-2-3"></a>
Set up the environment needed to run the TAO networks by running the bash script.

In [None]:
import os
if os.environ["GOOGLE_COLAB"] == "1":
    os.environ["bash_script"] = "setup_env.sh"
else:
    os.environ["bash_script"] = "setup_env_desktop.sh"

!sed -i "s|PATH_TO_COLAB_NOTEBOOKS|$COLAB_NOTEBOOKS_PATH|g" $COLAB_NOTEBOOKS_PATH/tensorflow/$bash_script

!sh $COLAB_NOTEBOOKS_PATH/tensorflow/$bash_script

0% [Working]            Get:1 https://cloud.r-project.org/bin/linux/ubuntu focal-cran40/ InRelease [3,622 B]
0% [Connecting to archive.ubuntu.com (185.125.190.39)] [Connecting to security.0% [Connecting to archive.ubuntu.com (185.125.190.39)] [Connecting to security.                                                                               Get:2 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64  InRelease [1,581 B]
0% [Connecting to archive.ubuntu.com (185.125.190.39)] [Connecting to security.0% [Connecting to archive.ubuntu.com (185.125.190.39)] [Connecting to security.0% [Connecting to archive.ubuntu.com (185.125.190.39)] [Connecting to security.                                                                               Get:3 https://cloud.r-project.org/bin/linux/ubuntu focal-cran40/ Packages [76.4 kB]
Get:4 https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64  Packages [993 kB]
Get:5 http://security.ubuntu.com/ubuntu 

In [None]:
if os.environ.get("PYTHONPATH","") == "":
    os.environ["PYTHONPATH"] = ""
os.environ["PYTHONPATH"]+=":/opt/nvidia/"
if os.environ["GOOGLE_COLAB"] == "1":
    os.environ["PYTHONPATH"]+=":/usr/local/lib/python3.6/dist-packages/third_party/nvml"
else:
    os.environ["PYTHONPATH"]+=":/home_duplicate/rarunachalam/miniconda3/envs/tf_py_36/lib/python3.6/site-packages/third_party/nvml" # FIX MINICONDA PATH

In [None]:
#FIXME2
%env COLAB_NOTEBOOKS_PATH=/content/drive/MyDrive/Colab_Notebooks/benchmark_project/V7_reduced/nvidia-tao

%env EXPERIMENT_DIR=/content/drive/MyDrive/Colab_Notebooks/benchmark_project/V7_reduced/TAO_V7_results/yolo_v4
#FIXME4
delete_existing_experiments = False
#FIXME5
%env DATA_DIR=/content/drive/MyDrive/Colab_Notebooks/benchmark_project/V7_reduced/
#FIXME6
delete_existing_data = False


env: COLAB_NOTEBOOKS_PATH=/content/drive/MyDrive/Colab_Notebooks/benchmark_project/V7_reduced/nvidia-tao
env: EXPERIMENT_DIR=/content/drive/MyDrive/Colab_Notebooks/benchmark_project/V7_reduced/TAO_V7_results/yolo_v4
env: DATA_DIR=/content/drive/MyDrive/Colab_Notebooks/benchmark_project/V7_reduced/


### 2.4 Reset env variables (use the same paths that were set in Step 0) <a class="anchor" id="head-2-4"></a>

In [None]:
# Setting up env variables for cleaner command line commands.
import os

%env TAO_DOCKER_DISABLE=1

%env KEY=nvidia_tlt
%env NUM_GPUS=1

# Change the paths according to your directory structure, these are just examples
%env COLAB_NOTEBOOKS_PATH=/content/drive/MyDrive/Colab_Notebooks/benchmark_project/V7_reduced/nvidia-tao
if not os.path.exists(os.environ["COLAB_NOTEBOOKS_PATH"]):
    raise Exception("Error, enter the path of the colab notebooks repo correctly")
%env EXPERIMENT_DIR=/content/drive/MyDrive/Colab_Notebooks/benchmark_project/V7_reduced/TAO_V7_results/yolo_v4
%env DATA_DIR=/content/drive/MyDrive/Colab_Notebooks/benchmark_project/V7_reduced/

SPECS_DIR=f"{os.environ['COLAB_NOTEBOOKS_PATH']}/tensorflow/yolo_v4/specs"
%env SPECS_DIR={SPECS_DIR}
# Showing list of specification files.
!ls -rlt $SPECS_DIR

env: TAO_DOCKER_DISABLE=1
env: KEY=nvidia_tlt
env: NUM_GPUS=1
env: COLAB_NOTEBOOKS_PATH=/content/drive/MyDrive/Colab_Notebooks/benchmark_project/V7_reduced/nvidia-tao
env: EXPERIMENT_DIR=/content/drive/MyDrive/Colab_Notebooks/benchmark_project/V7_reduced/TAO_V7_results/yolo_v4
env: DATA_DIR=/content/drive/MyDrive/Colab_Notebooks/benchmark_project/V7_reduced/
env: SPECS_DIR=/content/drive/MyDrive/Colab_Notebooks/benchmark_project/V7_reduced/nvidia-tao/tensorflow/yolo_v4/specs
total 9
-rwx------ 1 root root 2218 Apr 24 03:00 yolo_v4_train_resnet18_kitti.txt
-rwx------ 1 root root  260 Apr 24 03:00 yolo_v4_tfrecords_kitti_val.txt
-rwx------ 1 root root  274 Apr 24 03:00 yolo_v4_tfrecords_kitti_train.txt
-rwx------ 1 root root 2197 Apr 24 03:00 yolo_v4_retrain_resnet18_kitti.txt
-rwx------ 1 root root 2953 Apr 24 15:33 yolo_v4_train_resnet18_kitti_3.txt


## 3. Generate tfrecords <a class="anchor" id="head-3"></a>

The default YOLOv4 data format requires generating TFRecords. The older sequence data format (image folders and label txt folders) is still supported; if you prefer it, you can skip this section and use the spec files `yolo_v4_train_resnet18_kitti_seq.txt` and `yolo_v4_retrain_resnet18_kitti_seq.txt` instead. See the [user guide](https://docs.nvidia.com/tao/tao-toolkit/text/object_detection/yolo_v4.html#dataset-config) for more details on TFRecords generation and the sequence data format.

Note: For YOLOv4, we observe that the sequence format trains faster when mosaic augmentation is turned on (`mosaic_prob` > 0).

Note: We observe that the TFRecords format sometimes causes a CUDA error during evaluation. Setting `force_on_cpu` to `true` in `nms_config` can help prevent this problem.

In [None]:
!sed -i "s|TAO_DATA_PATH|$DATA_DIR/|g" $SPECS_DIR/yolo_v4_train_resnet18_kitti.txt
!sed -i "s|TAO_DATA_PATH|$DATA_DIR/|g" $SPECS_DIR/yolo_v4_retrain_resnet18_kitti.txt
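
Because `sed -i` rewrites the spec file in place, the `TAO_DATA_PATH` placeholder is gone after the first run, and running the cell again with a different `$DATA_DIR` changes nothing. The sketch below mirrors the substitution in Python and reports how many replacements were made, so a no-op is visible. The helper name `fill_placeholder` and the sample spec line are hypothetical.

```python
def fill_placeholder(spec_text, placeholder, value):
    """Return (new_text, count) after replacing every placeholder with value."""
    count = spec_text.count(placeholder)
    return spec_text.replace(placeholder, value), count


# Hypothetical spec line, for illustration only.
spec = 'image_directory_path: "TAO_DATA_PATH/training"'
filled, count = fill_placeholder(spec, "TAO_DATA_PATH", "/data/kitti")
print(count, filled)

# A second pass finds nothing left to replace.
_, second_count = fill_placeholder(filled, "TAO_DATA_PATH", "/other/path")
print(second_count)
```

If the paths in a spec file need to change later, restore the original file from the repository first and then rerun the substitution.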

In [None]:
#!tao yolo_v4 dataset_convert -d $SPECS_DIR/yolo_v4_tfrecords_kitti_train.txt \
#                            -o $DATA_DIR/training/tfrecords/train

In [None]:
!tao yolo_v4 dataset_convert -d $SPECS_DIR/yolo_v4_tfrecords_kitti_train.txt \
                             -o $DATA_DIR/VOC12_reduced/VOC12_trai/VOC2012/

Using TensorFlow backend.
2023-04-23 16:35:41.436172: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
Using TensorFlow backend.
2023-04-23 16:35:45.701137: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2023-04-23 16:35:48,505 [INFO] iva.detectnet_v2.dataio.build_converter: Instantiating a kitti converter
2023-04-23 16:35:48,506 [INFO] iva.detectnet_v2.dataio.dataset_converter_lib: Creating output directory /content/drive/MyDrive/Colab_Notebooks/benchmark_project/VOC12_reduced//VOC12_reduced/VOC12_trai/VOC2012
Traceback (most recent call last):
  File "</usr/local/lib/python3.6/dist-packages/iva/yolo_v4/scripts/dataset_convert.py>", line 3, in <module>
  File "<frozen iva.yolo_v4.scripts.dataset_convert>", line 18, in <module>
  File "<frozen iva.detectnet_v2.scripts.dataset_convert>", line 119, in main
  File "<frozen iva.detectnet_v2.dataio.data

In [None]:
!tao yolo_v4 dataset_convert -d $SPECS_DIR/yolo_v4_tfrecords_kitti_val.txt \
                             -o $DATA_DIR/val/tfrecords/val

In [None]:
# If you use your own dataset, you will need to run the code below to generate the best anchor shape

# !tao yolo_v4 kmeans -l $DATA_DIR/training/label_2 \
#                     -i $DATA_DIR/training/image_2 \
#                     -n 9 \
#                     -x 1248 \
#                     -y 384

# The anchor shape generated by this script is sorted. Write the first 3 into small_anchor_shape in the config
# file. Write middle 3 into mid_anchor_shape. Write last 3 into big_anchor_shape.
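
The grouping rule in the comment above (first 3, middle 3, last 3) can be sketched as follows. The helper name `group_anchor_shapes` is hypothetical; it expects the 9 pairs in the order printed by the kmeans command and formats them like the anchor strings in the spec file. The sample values are the anchors from the spec file used in this notebook.

```python
def group_anchor_shapes(anchors):
    """Split 9 sorted anchor pairs into small, mid, and big groups of 3 each."""
    if len(anchors) != 9:
        raise ValueError("expected exactly 9 anchors")

    def fmt(group):
        # Two decimals per value, matching the style used in the spec file.
        return "[" + ", ".join(f"({a:.2f}, {b:.2f})" for a, b in group) + "]"

    return {
        "small_anchor_shape": fmt(anchors[0:3]),
        "mid_anchor_shape": fmt(anchors[3:6]),
        "big_anchor_shape": fmt(anchors[6:9]),
    }


anchors = [
    (15.60, 13.88), (30.25, 20.25), (20.67, 49.63),
    (42.99, 31.91), (79.57, 31.75), (56.80, 56.93),
    (114.94, 60.67), (159.06, 114.59), (297.59, 176.38),
]
for key, value in group_anchor_shapes(anchors).items():
    print(f'{key}: "{value}"')
```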

In [None]:
import os

DATA_DIR = os.environ.get('DATA_DIR')
num_training_images = len(os.listdir(os.path.join(DATA_DIR, "V7_train/V7/JPEGImages")))
num_training_labels = len(os.listdir(os.path.join(DATA_DIR, "V7_train/V7/train_annotation_KITTI")))
num_testing_images = len(os.listdir(os.path.join(DATA_DIR, "V7_test/V7/JPEGImages")))
print("Number of images in the train/val set. {}".format(num_training_images))
print("Number of labels in the train/val set. {}".format(num_training_labels))
print("Number of images in the test set. {}".format(num_testing_images))

Number of images in the train/val set. 5040
Number of labels in the train/val set. 5040
Number of images in the test set. 1080


In [None]:
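# Note: `!cd` runs in a temporary subshell, so the notebook's working directory is unchanged
# afterwards. Use `%cd` instead if a persistent change is intended.
!cd /content/drive/MyDrive/Colab_Notebooks/benchmark_project/V7_reduced/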

In [None]:
!tao yolo_v4 kmeans -l drive/MyDrive/ColabNotebooks/benchmark_project/VOC12_reduced/VOC2012/VOC12_trai/train_annotation_KITTI \
                     -i drive/MyDrive/ColabNotebooks/benchmark_project/VOC12_reduced/VOC2012/VOC12_trai/JPEGImages \
                     -n 9 \
                     -x 1248 \
                     -y 384

## 4. Provide training specification <a class="anchor" id="head-4"></a>
* Augmentation parameters for on-the-fly data augmentation
* Other training (hyper-)parameters such as batch size, number of epochs, learning rate etc.
* Whether to use quantization aware training (QAT)

In [None]:
!ls /content/drive/MyDrive/Colab_Notebooks/benchmark_project/V7_reduced/nvidia-tao/tensorflow/yolo_v4/specs/

yolo_v4_retrain_resnet18_kitti.txt  yolo_v4_train_resnet18_kitti_3.txt
yolo_v4_tfrecords_kitti_train.txt   yolo_v4_train_resnet18_kitti.txt
yolo_v4_tfrecords_kitti_val.txt


In [None]:
# Provide pretrained model path
!sed -i "s|TAO_DATA_PATH|$DATA_DIR/|g" $SPECS_DIR/yolo_v4_train_resnet18_kitti_3.txt
!sed -i "s|EXPERIMENT_DIR_PATH|$EXPERIMENT_DIR/|g" $SPECS_DIR/yolo_v4_train_resnet18_kitti_3.txt

# To enable QAT training on sample spec file, uncomment following lines
# !sed -i "s/enable_qat: false/enable_qat: true/g" $SPECS_DIR/yolo_v4_train_resnet18_kitti.txt
# !sed -i "s/enable_qat: false/enable_qat: true/g" $SPECS_DIR/yolo_v4_retrain_resnet18_kitti.txt

In [None]:
# By default, the sample spec file disables QAT training. You can force non-QAT training by running lines below
# !sed -i "s/enable_qat: true/enable_qat: false/g" $SPECS_DIR/yolo_v4_train_resnet18_kitti.txt
# !sed -i "s/enable_qat: true/enable_qat: false/g" $SPECS_DIR/yolo_v4_retrain_resnet18_kitti.txt

In [None]:
!cat $SPECS_DIR/yolo_v4_train_resnet18_kitti_3.txt

random_seed: 42
yolov4_config {
  big_anchor_shape: "[(114.94, 60.67), (159.06, 114.59), (297.59, 176.38)]"
  mid_anchor_shape: "[(42.99, 31.91), (79.57, 31.75), (56.80, 56.93)]"
  small_anchor_shape: "[(15.60, 13.88), (30.25, 20.25), (20.67, 49.63)]"
  box_matching_iou: 0.25
  matching_neutral_box_iou: 0.5
  arch: "resnet"
  nlayers: 18
  arch_conv_blocks: 2
  loss_loc_weight: 1.0
  loss_neg_obj_weights: 1.0
  loss_class_weights: 1.0
  label_smoothing: 0.0
  big_grid_xy_extend: 0.05
  mid_grid_xy_extend: 0.1
  small_grid_xy_extend: 0.2
  freeze_bn: false
  #freeze_blocks: 0
  force_relu: false
}
training_config {
  batch_size_per_gpu: 8
  num_epochs: 80
  enable_qat: false
  checkpoint_interval: 10
  learning_rate {
    soft_start_cosine_annealing_schedule {
      min_learning_rate: 1e-7
      max_learning_rate: 1e-4
      soft_start: 0.3
    }
  }
  regularizer {
    type: L1
    weight: 3e-5
  }
  optimizer {
    adam {
      epsilon: 1e-7
      beta1: 0.9
      beta2: 0.999
      a
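
To get a feel for the `soft_start_cosine_annealing_schedule` in the spec above, the sketch below computes the learning rate from the training progress. The linear ramp over the first `soft_start` share of training reproduces the `lr` values logged for the first epochs of the training run in this notebook (4.2625e-06 for epoch 1, 8.425e-06 for epoch 2). The half-cosine decay after the ramp is an assumption about the schedule's shape, not taken from the TAO source, and the function name `soft_start_cosine_lr` is hypothetical.

```python
import math


def soft_start_cosine_lr(progress, min_lr=1e-7, max_lr=1e-4, soft_start=0.3):
    """Learning rate at a training progress in [0, 1].

    Linear ramp from min_lr to max_lr during the first `soft_start` share of
    training, then an (assumed) half-cosine decay back down to min_lr.
    """
    if progress < soft_start:
        return min_lr + (max_lr - min_lr) * progress / soft_start
    decay = (progress - soft_start) / (1.0 - soft_start)
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * decay))


# 80 epochs, as in the spec file above.
for epoch in (1, 2, 3, 24, 80):
    print(epoch, soft_start_cosine_lr(epoch / 80))
```

With these settings the peak learning rate is reached at epoch 24 (30% of 80 epochs).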

## 5. Run TAO training <a class="anchor" id="head-5"></a>
* Provide the sample spec file and the output directory location for models
* WARNING: training can take several hours to a full day to complete

In [None]:
!mkdir -p $EXPERIMENT_DIR/experiment_dir_unpruned

#2 hours on VOC12

In [None]:
print("To run with multigpu, please change --gpus based on the number of available GPUs in your machine.")
!tao yolo_v4 train -e $SPECS_DIR/yolo_v4_train_resnet18_kitti_3.txt \
                   -r $EXPERIMENT_DIR/experiment_dir_unpruned \
                   -k $KEY \
                   --gpus $NUM_GPUS

To run with multigpu, please change --gpus based on the number of available GPUs in your machine.
Using TensorFlow backend.
2023-04-24 16:06:42.041315: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
Using TensorFlow backend.
2023-04-24 16:06:46.563586: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2023-04-24 16:06:50.016607: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2200175000 Hz
2023-04-24 16:06:50.017133: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x68168b0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2023-04-24 16:06:50.017166: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2023-04-24 16:06:50.019163: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic librar

In [None]:
print("To resume from checkpoint, please change pretrain_model_path to resume_model_path in config file.")

To resume from checkpoint, please change pretrain_model_path to resume_model_path in config file.


In [None]:
print('Model for each epoch:')
print('---------------------')
!ls -ltrh $EXPERIMENT_DIR/experiment_dir_unpruned/weights

Model for each epoch:
---------------------
total 3.2G
-rw------- 1 root root 400M Apr 24 17:05 yolov4_resnet18_epoch_010.tlt
-rw------- 1 root root 400M Apr 24 17:53 yolov4_resnet18_epoch_020.tlt
-rw------- 1 root root 400M Apr 24 18:41 yolov4_resnet18_epoch_030.tlt
-rw------- 1 root root 400M Apr 24 19:29 yolov4_resnet18_epoch_040.tlt
-rw------- 1 root root 400M Apr 24 20:18 yolov4_resnet18_epoch_050.tlt
-rw------- 1 root root 400M Apr 24 21:07 yolov4_resnet18_epoch_060.tlt
-rw------- 1 root root 400M Apr 24 21:55 yolov4_resnet18_epoch_070.tlt
-rw------- 1 root root 400M Apr 24 22:43 yolov4_resnet18_epoch_080.tlt
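
The listing above is a reminder to budget Google Drive space before training. A back-of-the-envelope sketch follows, using the values from this run (80 epochs, a checkpoint every 10 epochs, about 400 MB per file); the function name `checkpoint_storage_mb` is hypothetical.

```python
def checkpoint_storage_mb(num_epochs, checkpoint_interval, file_size_mb):
    """Total size in MB of all checkpoints written during one training run."""
    num_checkpoints = num_epochs // checkpoint_interval
    return num_checkpoints * file_size_mb


print(checkpoint_storage_mb(num_epochs=80, checkpoint_interval=10, file_size_mb=400))
```

This prints 3200, in line with the `total 3.2G` reported by `ls`. Lowering `checkpoint_interval` in the spec file increases the storage needed proportionally.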


In [None]:
# Now check the evaluation stats in the csv file and pick the model with highest eval accuracy.
!cat $EXPERIMENT_DIR/experiment_dir_unpruned/yolov4_training_log_resnet18.csv
%env EPOCH=080

epoch,AP_airplane,AP_bird,AP_boat,AP_bus,AP_cat,AP_dog,AP_horse,AP_person,AP_train,loss,lr,mAP,validation_loss
1,0.00014734050390452334,0.00016335865392469166,0.00011307100859339665,0.00020614306328592044,0.0,0.0,2.3875066551748015e-06,0.0,0.0,20125.867,4.2624997e-06,7.025563737374521e-05,11780.735438368056
2,0.002932551319648094,0.0003787878787878788,0.0,0.0003071253071253071,0.018181818181818184,0.0,0.0,0.0,0.0,8575.352,8.425e-06,0.0024222536319310514,6248.34556568287
3,0.000547645125958379,0.00043706293706293706,0.0,0.0,0.0,0.0001400756408460569,0.0,0.0,0.0,5820.517,1.2587499e-05,0.00012497596709637476,4996.951790364584
4,0.0,0.0002575328354365182,0.0,0.0,0.0,0.0,0.0002307337332718043,0.0,0.0,4497.0713,1.6749998e-05,5.425184096759139e-05,3851.875826461227
5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,3435.2524,2.09125e-05,0.0,2827.2514485677084
6,0.0,0.0,3.2677602771060715e-06,0.0,0.0,0.0,0.0,0.0,0.0,2503.7927,2.5074998e-05,3.6308447523400797e-07,2010.1436315465855
7,0.0,0.0,0.0,6.69
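
Instead of reading the log by eye, the best epoch can be picked programmatically. The following is a minimal sketch, assuming the log keeps the `epoch` and `mAP` columns shown above; the helper name `best_epoch` and the `saved_epochs` argument are hypothetical, and the sample values are made up. Restricting the search to epochs that actually have a saved `.tlt` file matters because checkpoints are written less often than log rows.

```python
import csv
import io
import math

def best_epoch(csv_text, saved_epochs=None):
    """Return (epoch, mAP) for the row with the highest mAP, or None.

    saved_epochs optionally restricts the search to epochs that have
    a checkpoint on disk (hypothetical argument for this sketch).
    """
    best = None
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            epoch, m_ap = int(row["epoch"]), float(row["mAP"])
        except (KeyError, TypeError, ValueError):
            continue  # skip truncated or malformed rows
        if math.isnan(m_ap):
            continue  # skip rows without a usable mAP
        if saved_epochs is not None and epoch not in saved_epochs:
            continue
        if best is None or m_ap > best[1]:
            best = (epoch, m_ap)
    return best

# Toy log with made-up values, only to show the call.
sample = "epoch,loss,mAP\n10,5.0,0.21\n15,4.1,0.40\n20,4.0,0.35\n"
epoch, m_ap = best_epoch(sample, saved_epochs={10, 20})
print(f"EPOCH={epoch:03d}")  # prints "EPOCH=020"
```

The zero-padded value matches the three-digit suffix used in the checkpoint file names, so it can be passed to `%env EPOCH=...` as is.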

## 6. Evaluate trained models <a class="anchor" id="head-6"></a>

In [None]:
!tao yolo_v4 evaluate -e $SPECS_DIR/yolo_v4_train_resnet18_kitti_3.txt \
                      -m $EXPERIMENT_DIR/experiment_dir_unpruned/weights/yolov4_resnet18_epoch_$EPOCH.tlt \
                      -k $KEY

Using TensorFlow backend.
2023-04-24 22:44:40.135817: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
Using TensorFlow backend.
2023-04-24 22:44:44.445480: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0

2023-04-24 22:44:52.603705: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2200175000 Hz
2023-04-24 22:44:52.604297: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0xacb59e0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2023-04-24 22:44:52.604332: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2023-04-24 22:44:52.606433: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2023-04-24 22:44:52.672486: I tensorflow/compiler/xla/service

## 7. Prune trained models <a class="anchor" id="head-7"></a>
* Specify pre-trained model
* Equalization criterion (only for ResNets and MobileNets, which contain element-wise operations)
* Threshold for pruning.
* A key to save and load the model
* Output directory to store the model

Usually, you only need to adjust `-pth` (the pruning threshold) to trade accuracy against model size. A higher `pth` gives a smaller model (and thus faster inference) but lower accuracy. The right threshold depends on the dataset and the model. The `0.1` used in the block below is just a starting point: if the retrained accuracy is good, you can increase this value to get a smaller model; otherwise, lower it to recover accuracy.
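
To build intuition for the threshold, here is a rough sketch of magnitude-based filter pruning. It assumes each filter is scored by its L2 norm divided by the largest norm in the layer, and that filters scoring below the threshold are dropped. This illustrates the general idea only and is not TAO's actual implementation; the helper `filters_to_keep` and the toy weights are hypothetical.

```python
import math

def filters_to_keep(filters, threshold):
    """Return indices of filters whose normalized L2 norm is >= threshold."""
    norms = [math.sqrt(sum(w * w for w in f)) for f in filters]
    max_norm = max(norms)
    if max_norm == 0:
        return []  # all-zero layer: nothing worth keeping
    return [i for i, n in enumerate(norms) if n / max_norm >= threshold]

# Four toy "filters" with norms 10, 5, 1 and 0.5 (made-up weights).
toy_filters = [[6.0, 8.0], [3.0, 4.0], [0.0, 1.0], [0.0, 0.5]]

for pth in (0.08, 0.3, 0.7):
    kept = filters_to_keep(toy_filters, pth)
    print(pth, kept)
# prints:
# 0.08 [0, 1, 2]
# 0.3 [0, 1]
# 0.7 [0]
```

Raising the threshold removes more filters, which is why a higher `-pth` shrinks the model and usually costs accuracy until the model is retrained.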

In [None]:
!mkdir -p $EXPERIMENT_DIR/experiment_dir_pruned

In [None]:
!tao yolo_v4 prune -m $EXPERIMENT_DIR/experiment_dir_unpruned/weights/yolov4_resnet18_epoch_$EPOCH.tlt \
                   -e $SPECS_DIR/yolo_v4_train_resnet18_kitti.txt \
                   -o $EXPERIMENT_DIR/experiment_dir_pruned/yolov4_resnet18_pruned.tlt \
                   -eq intersection \
                   -pth 0.1 \
                   -k $KEY

In [None]:
!ls -rlt $EXPERIMENT_DIR/experiment_dir_pruned/

## 8. Retrain pruned models <a class="anchor" id="head-8"></a>
* Model needs to be re-trained to bring back accuracy after pruning
* Specify re-training specification
* WARNING: retraining can take several hours to a full day to complete

In [None]:
# Print the retrain spec file.
# The spec file has been updated to use the newly pruned model as the pretrained weights.
!sed -i "s|TAO_DATA_PATH|$DATA_DIR/|g" $SPECS_DIR/yolo_v4_retrain_resnet18_kitti.txt
!sed -i "s|EXPERIMENT_DIR_PATH|$EXPERIMENT_DIR/|g" $SPECS_DIR/yolo_v4_retrain_resnet18_kitti.txt
!cat $SPECS_DIR/yolo_v4_retrain_resnet18_kitti.txt

In [None]:
!mkdir -p $EXPERIMENT_DIR/experiment_dir_retrain

In [None]:
# Retraining using the pruned model as pretrained weights 
!tao yolo_v4 train --gpus 1 \
                   -e $SPECS_DIR/yolo_v4_retrain_resnet18_kitti.txt \
                   -r $EXPERIMENT_DIR/experiment_dir_retrain \
                   -k $KEY

In [None]:
# Listing the newly retrained model.
!ls -rlt $EXPERIMENT_DIR/experiment_dir_retrain/weights

In [None]:
# Now check the evaluation stats in the csv file and pick the model with highest eval accuracy.
!cat $EXPERIMENT_DIR/experiment_dir_retrain/yolov4_training_log_resnet18.csv
%env EPOCH=160

## 9. Evaluate retrained model <a class="anchor" id="head-9"></a>

In [None]:
!tao yolo_v4 evaluate -e $SPECS_DIR/yolo_v4_retrain_resnet18_kitti.txt \
                      -m $EXPERIMENT_DIR/experiment_dir_retrain/weights/yolov4_resnet18_epoch_$EPOCH.tlt \
                      -k $KEY

2023-04-24 15:22:38.596562: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
Traceback (most recent call last):
  File "/usr/local/bin/yolo_v4", line 5, in <module>
    from iva.yolo_v4.entrypoint.yolo_v4 import main
  File "/usr/local/lib/python3.6/dist-packages/iva/__init__.py", line 10, in <module>
    import third_party.keras.mixed_precision as MP
  File "/usr/local/lib/python3.6/dist-packages/third_party/keras/mixed_precision.py", line 11, in <module>
    import keras
  File "/usr/local/lib/python3.6/dist-packages/keras/__init__.py", line 20, in <module>
    from keras import distribute
  File "/usr/local/lib/python3.6/dist-packages/keras/distribute/__init__.py", line 18, in <module>
    from keras.distribute import sidecar_evaluator
  File "/usr/local/lib/python3.6/dist-packages/keras/distribute/sidecar_evaluator.py", line 22, in <module>
    from keras.optimizers.optimizer_experimental import (
  File "/usr/loc

## 10. Visualize inferences <a class="anchor" id="head-10"></a>
In this section, we run the `infer` tool to generate inferences on the trained models and visualize the results.

In [None]:
# Copy some test images
!mkdir -p $DATA_DIR/test_samples
!cp $DATA_DIR/testing/image_2/000* $DATA_DIR/test_samples/

# Original

!tao yolo_v4 inference -i $DATA_DIR/test_samples \
                       -o $EXPERIMENT_DIR/yolo_infer_images \
                       -e $SPECS_DIR/yolo_v4_retrain_resnet18_kitti.txt \
                       -m $EXPERIMENT_DIR/experiment_dir_retrain/weights/yolov4_resnet18_epoch_$EPOCH.tlt \
                       -l $EXPERIMENT_DIR/yolo_infer_labels \
                       -k $KEY

In [None]:
!pip install keras

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/, https://pypi.ngc.nvidia.com
Collecting keras
  Downloading keras-2.10.0-py2.py3-none-any.whl (1.7 MB)
     |████████████████████████████████| 1.7 MB 4.0 MB/s            
[?25hInstalling collected packages: keras
Successfully installed keras-2.10.0


In [None]:
# Running inference for detection on n images
!tao yolo_v4 inference -i $DATA_DIR/V7_test/V7/JPEGImages \
                       -o $EXPERIMENT_DIR/yolo_infer_images \
                       -e $SPECS_DIR/yolo_v4_train_resnet18_kitti_3.txt \
                       -m $EXPERIMENT_DIR/experiment_dir_unpruned/weights/yolov4_resnet18_epoch_$EPOCH.tlt \
                       -l $EXPERIMENT_DIR/yolo_infer_labels \
                       -k $KEY

Using TensorFlow backend.
2023-04-24 22:49:17.029179: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
Using TensorFlow backend.
2023-04-24 22:49:21.291444: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0

2023-04-24 22:49:29.197514: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2200175000 Hz
2023-04-24 22:49:29.198084: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0xa8c52e0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2023-04-24 22:49:29.198118: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2023-04-24 22:49:29.200149: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2023-04-24 22:49:29.265737: I tensorflow/compiler/xla/service

The `inference` tool produces two outputs:
1. Images with the detected bounding boxes overlaid, in `$EXPERIMENT_DIR/yolo_infer_images`
2. Frame-by-frame bounding box labels in KITTI format, in `$EXPERIMENT_DIR/yolo_infer_labels`
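
The label files can be read back with a few lines of Python. The sketch below assumes the standard KITTI layout: the class name first, the 2D box (`left top right bottom`) in fields 5 to 8, and an optional confidence score as a 16th field. The helper `parse_kitti_line` is hypothetical and the sample line uses made-up values, so check a real file in `yolo_infer_labels` before relying on the field positions.

```python
def parse_kitti_line(line):
    """Parse one KITTI label line into class name, 2D box and optional score."""
    fields = line.split()
    if len(fields) < 8:
        raise ValueError(f"expected at least 8 fields, got {len(fields)}")
    left, top, right, bottom = (float(v) for v in fields[4:8])
    # Detection outputs may carry a confidence score as the 16th field.
    score = float(fields[15]) if len(fields) > 15 else None
    return {"label": fields[0], "bbox": (left, top, right, bottom), "score": score}

# Made-up example line, not taken from an actual inference run.
sample_line = ("car 0.00 0 0.00 100.0 120.0 300.0 260.0 "
               "0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.87")
detection = parse_kitti_line(sample_line)
print(detection["label"], detection["bbox"], detection["score"])
# prints: car (100.0, 120.0, 300.0, 260.0) 0.87
```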

In [None]:
# Simple grid visualizer
import matplotlib.pyplot as plt
import os
from math import ceil
valid_image_ext = ['.jpg', '.png', '.jpeg', '.ppm']

def visualize_images(image_dir, num_cols=4, num_images=10):
    """Show up to num_images images from $EXPERIMENT_DIR/image_dir in a grid."""
    output_path = os.path.join(os.environ['EXPERIMENT_DIR'], image_dir)
    num_rows = int(ceil(float(num_images) / float(num_cols)))
    # squeeze=False keeps axarr 2-D even when the grid has a single row or column.
    f, axarr = plt.subplots(num_rows, num_cols, figsize=[80,30], squeeze=False)
    f.tight_layout()
    # Sort the listing so the same images are shown on every run.
    a = [os.path.join(output_path, image) for image in sorted(os.listdir(output_path))
         if os.path.splitext(image)[1].lower() in valid_image_ext]
    for idx, img_path in enumerate(a[:num_images]):
        col_id = idx % num_cols
        row_id = idx // num_cols
        img = plt.imread(img_path)
        axarr[row_id, col_id].imshow(img)

In [None]:
# Visualizing the sample images.
!mkdir -p $EXPERIMENT_DIR/yolo_infer_images
OUTPUT_PATH = 'yolo_infer_images' # relative path from $EXPERIMENT_DIR.
COLS = 3 # number of columns in the visualizer grid.
IMAGES = 9 # number of images to visualize.

visualize_images(OUTPUT_PATH, num_cols=COLS, num_images=IMAGES)
