# Integrated Edge AI workflow using NVIDIA TAO toolkit and ZEDEDA Edge Orchestrator 

<img align="center" src="https://github.com/cshari-zededa/demo_files/releases/download/v1.0/nvidia_zededa.png" width="1080"> 

The NVIDIA TAO (Train, Adapt, Optimize) Toolkit provides tools to work with pre-trained NVIDIA NGC models: retrain them with additional datasets, optimize them, and convert them into TensorRT engines that run on NVIDIA GPUs.

ZEDEDA Edge Orchestrator is an enterprise-grade Edge management platform and orchestration engine that manages the lifecycle of Edge workloads, including Edge AI, across thousands of Edge devices distributed around the globe, from a centralized portal. This demo shows how these two powerful solutions complement each other to provide complete Edge AI lifecycle management, from model development through to the rollout of the Edge AI solution.

This demo uses NVIDIA NGC model DetectNet-V2 as an example.


# Object Detection using DetectNet-V2 

Transfer learning is the process of transferring learned features from one application to another. It is a commonly used training technique where you use a model trained on one task and re-train to use it on a different task. 

The Train, Adapt, Optimize (TAO) Toolkit is a simple, easy-to-use Python-based AI toolkit for taking purpose-built AI models and customizing them with users' own data.

<img align="center" src="https://developer.nvidia.com/sites/default/files/akamai/TAO/tlt-tao-toolkit-bring-your-own-model-diagram.png" width="1080"> 

## What is DetectNet-V2?

DetectNet_v2, also known as GridBox object detection, is a highly optimized CNN-based object detection model that performs bounding-box regression over a uniform grid on the input image. The GridBox system divides an input image into a uniform grid and, for each grid cell, predicts four normalized bounding-box parameters (xc, yc, w, h) and a confidence value per output class.

The raw normalized bounding-box and confidence detections are thresholded and then post-processed by a clustering algorithm such as DBSCAN, NMS, or a hybrid of the two (DBSCAN + NMS) to produce the final bounding-box coordinates and category labels.
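
To make the post-processing step more concrete, here is a minimal pure-Python sketch of confidence thresholding followed by greedy NMS. It is only an illustration of the idea, not the DetectNet_v2 implementation: the function names, the corner-format boxes, and both threshold values are assumptions chosen for the example.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two corner-format boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def threshold_and_nms(boxes: List[Box], scores: List[float],
                      conf_threshold: float = 0.5,
                      iou_threshold: float = 0.5) -> List[Tuple[float, Box]]:
    """Drop low-confidence boxes, then greedily keep the best non-overlapping ones."""
    candidates = [(s, b) for b, s in zip(boxes, scores) if s >= conf_threshold]
    candidates.sort(key=lambda sb: sb[0], reverse=True)
    kept: List[Tuple[float, Box]] = []
    for score, box in candidates:
        # Keep a box only if it does not overlap too much with a better one.
        if all(iou(box, kept_box) <= iou_threshold for _, kept_box in kept):
            kept.append((score, box))
    return kept


# Hypothetical raw detections for a single class.
raw_boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140), (200, 200, 220, 220)]
raw_scores = [0.9, 0.8, 0.75, 0.2]
detections = threshold_and_nms(raw_boxes, raw_scores)
print(detections)
```

In this toy input, the second box overlaps heavily with the first and is suppressed, and the last box falls below the confidence threshold, so two detections remain.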

### Sample output predictions from a trained DetectNet_v2 model

<img align="center" src="https://miro.medium.com/v2/resize:fit:720/0*YaQDIKR4gRbP2-by" width="960">

<img align="center" src="https://miro.medium.com/v2/resize:fit:720/0*eY1qluSyldYl9qDw" width="960">

## Learning Objectives
In this notebook, you will learn how to leverage the simplicity and convenience of TAO along with the versatility of ZEDEDA Edge Orchestrator to effectively develop and deploy Edge AI models. By the end of this exercise, you will have learned to:

* Take a pretrained resnet18 model and train a ResNet-18 DetectNet_v2 model on the KITTI dataset
* Prune the trained detectnet_v2 model
* Retrain the pruned model to recover lost accuracy
* Run Inference on the trained model
* Export the pruned, quantized and retrained model in .ONNX format (Open Neural Network Exchange)
* Convert the model from .onnx into a TensorRT engine format (.engine) to use Jetson iGPU for inference acceleration
* Add the model to model repository, and package the Edge AI application profile for deployment
* Deploy the Edge AI solution bundle across the fleet using ZEDEDA Edge AI orchestration Engine
* Observe the solution in ZEDEDA UI and a demo web interface

At the end of this notebook, you will have a trained and optimized `detectnet_v2` model that you
may deploy via [Triton](https://github.com/NVIDIA-AI-IOT/tao-toolkit-triton-apps) or [DeepStream](https://developer.nvidia.com/deepstream-sdk).


## 0. Set up env variables and map drives <a class="anchor" id="head-0"></a>

This notebook requires the user to set an env variable called `$LOCAL_PROJECT_DIR` to the path of the user's workspace. Please note that the dataset for this notebook is expected to reside in `$LOCAL_PROJECT_DIR/data`, while the artifacts generated by the TAO experiment will be written to `$LOCAL_PROJECT_DIR/detectnet_v2`. More information on how to set up the dataset and the supported steps in the TAO workflow is provided in the subsequent cells.

*Note: Please make sure to remove any stray artifacts/files from the `$USER_EXPERIMENT_DIR` or `$DATA_DOWNLOAD_DIR` paths as mentioned below, that may have been generated from previous experiments. Having checkpoint files etc may interfere with creating a training graph for a new experiment.*

*Note: By default, this notebook is set up to run training using 1 GPU. To use more GPUs, please update the env variable `$NUM_GPUS` accordingly.*

In [None]:
# Setting up env variables for cleaner command line commands.
import os

%env NUM_GPUS=1
%env USER_EXPERIMENT_DIR=/workspace/tao-experiments/detectnet_v2
%env DATA_DOWNLOAD_DIR=/workspace/tao-experiments/data
%env DOCKERHUB_USERNAME=csharizededa

# Set this path if you don't run the notebook from the samples directory.
# %env NOTEBOOK_ROOT=~/tao-samples/detectnet_v2

# Please define the local project directory that needs to be mapped to the TAO docker session.
# The dataset is expected to be present in $LOCAL_PROJECT_DIR/data, while the results for the steps
# in this notebook will be stored at $LOCAL_PROJECT_DIR/detectnet_v2.
# !PLEASE MAKE SURE TO UPDATE THIS PATH!

os.environ["LOCAL_PROJECT_DIR"] = "/home/ubuntu/tao"

os.environ["LOCAL_DATA_DIR"] = os.path.join(
    os.getenv("LOCAL_PROJECT_DIR", os.getcwd()),
    "data"
)
os.environ["LOCAL_EXPERIMENT_DIR"] = os.path.join(
    os.getenv("LOCAL_PROJECT_DIR", os.getcwd()),
    "detectnet_v2"
)

# Make the experiment directory 
! mkdir -p $LOCAL_EXPERIMENT_DIR

# The sample spec files are present in the same path as the downloaded samples.
os.environ["LOCAL_SPECS_DIR"] = os.path.join(
    os.getenv("NOTEBOOK_ROOT", os.getcwd()),
    "specs"
)
%env SPECS_DIR=/workspace/tao-experiments/detectnet_v2/specs
CLEARML_LOGGED_IN = False
WANDB_LOGGED_IN = False

# Showing list of specification files.
!ls -rlt $LOCAL_SPECS_DIR

The cell below maps the project directory on your local host to a workspace directory in the TAO docker instance, so that the data and the results are shared between the host and the container. For more information, please refer to the [launcher instance](https://docs.nvidia.com/tao/tao-toolkit/tao_launcher.html) in the user guide.

When running this cell on AWS, update the drive_map entry with the dictionary defined below, so that you don't have permission issues when writing data into folders created by the TAO docker.

```python
drive_map = {
    "Mounts": [
            # Mapping the data directory
            {
                "source": os.environ["LOCAL_PROJECT_DIR"],
                "destination": "/workspace/tao-experiments"
            },
            # Mapping the specs directory.
            {
                "source": os.environ["LOCAL_SPECS_DIR"],
                "destination": os.environ["SPECS_DIR"]
            },
        ],
    "DockerOptions": {
        "user": "{}:{}".format(os.getuid(), os.getgid())
    }
}
```

In [None]:
# Mapping up the local directories to the TAO docker.
import json
mounts_file = os.path.expanduser("~/.tao_mounts.json")

# Define the dictionary with the mapped drives
drive_map = {
    "Mounts": [
        # Mapping the data directory
        {
            "source": os.environ["LOCAL_PROJECT_DIR"],
            "destination": "/workspace/tao-experiments"
        },
        # Mapping the specs directory.
        {
            "source": os.environ["LOCAL_SPECS_DIR"],
            "destination": os.environ["SPECS_DIR"]
        },
    ],
    "DockerOptions":{
        "user": "{}:{}".format(os.getuid(), os.getgid())
    }
}

# Writing the mounts file.
with open(mounts_file, "w") as mfile:
    json.dump(drive_map, mfile, indent=4)


In [None]:
!cat ~/.tao_mounts.json
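
Before launching any TAO command, it can help to confirm that every host path in the mounts file actually exists. The sketch below is one possible check; the helper name `missing_mount_sources` is hypothetical and is not part of the TAO launcher.

```python
import json
import os


def missing_mount_sources(mounts_path):
    """Return the "source" paths in a mounts file that are not directories on the host."""
    with open(os.path.expanduser(mounts_path)) as mfile:
        config = json.load(mfile)
    return [
        mount["source"]
        for mount in config.get("Mounts", [])
        if not os.path.isdir(mount["source"])
    ]


# Only run the check if the mounts file has already been written.
mounts_file = os.path.expanduser("~/.tao_mounts.json")
if os.path.exists(mounts_file):
    print("Missing mount sources:", missing_mount_sources(mounts_file))
```

An empty list means every mapped source directory is present on the host.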

## 1. Install the TAO launcher <a class="anchor" id="head-1"></a>
The TAO launcher is a Python package distributed as a Python wheel listed on PyPI. You may install the launcher by executing the following cell.

Please note that TAO Toolkit recommends running the TAO launcher in a virtual env with Python 3.6.9. You may follow the instructions on this [page](https://virtualenvwrapper.readthedocs.io/en/latest/install.html) to set up a Python virtual env using the `virtualenv` and `virtualenvwrapper` packages. Once you have set up virtualenvwrapper, please set the version of Python to be used in the virtual env by using the `VIRTUALENVWRAPPER_PYTHON` variable. You may do so by running

```sh
export VIRTUALENVWRAPPER_PYTHON=/path/to/bin/python3.x
```
where x >= 6 and <= 8

We recommend performing this step first and then launching the notebook from the virtual environment. In addition to installing TAO python package, please make sure of the following software requirements:
* python >=3.6.9 < 3.8.x
* docker-ce > 19.03.5
* docker-API 1.40
* nvidia-container-toolkit > 1.3.0-1
* nvidia-container-runtime > 3.4.0-1
* nvidia-docker2 > 2.5.0-1
* nvidia-driver > 455+

Once you have installed the prerequisites, please log in to the docker registry nvcr.io using the command below

```sh
docker login nvcr.io
```

You will be prompted to enter a username and password. The username is `$oauthtoken` and the password is the API key generated from `ngc.nvidia.com`. Please follow the instructions in the [NGC setup guide](https://docs.nvidia.com/ngc/ngc-overview/index.html#generating-api-key) to generate your own API key.

In [None]:
# SKIP this step IF you have already installed the TAO launcher wheel.
!pip3 install nvidia-tao

In [None]:
# View the versions of the TAO launcher
!tao info --verbose

## 2. Prepare dataset and pre-trained model <a class="anchor" id="head-2"></a>

We will be using the KITTI object detection dataset for this example. To find more details, please visit http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=2d. Please download both the left color images of the object dataset from [here](http://www.cvlibs.net/download.php?file=data_object_image_2.zip) and the training labels for the object dataset from [here](http://www.cvlibs.net/download.php?file=data_object_label_2.zip), and place the zip files in `$LOCAL_DATA_DIR`.

The data will then be extracted to have
* training images in `$LOCAL_DATA_DIR/training/image_2`
* training labels in `$LOCAL_DATA_DIR/training/label_2`
* testing images in `$LOCAL_DATA_DIR/testing/image_2`

You may use this notebook with your own dataset as well. To do so, please follow the same directory structure as described above.

*Note: There are no labels for the testing images, therefore we use it just to visualize inferences for the trained model.*

### A. Download the dataset <a class="anchor" id="head-2-1"></a>
Once you have the download links, set them in the `URL_IMAGES` and `URL_LABELS` variables in the cell below. The cell downloads the data and places it in `$LOCAL_DATA_DIR`.

In [None]:
import os
!mkdir -p $LOCAL_DATA_DIR
os.environ["URL_IMAGES"]="https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_image_2.zip"
!if [ ! -f $LOCAL_DATA_DIR/data_object_image_2.zip ]; then wget $URL_IMAGES -O $LOCAL_DATA_DIR/data_object_image_2.zip; else echo "image archive already downloaded"; fi 
os.environ["URL_LABELS"]="https://s3.eu-central-1.amazonaws.com/avg-kitti/data_object_label_2.zip"
!if [ ! -f $LOCAL_DATA_DIR/data_object_label_2.zip ]; then wget $URL_LABELS -O $LOCAL_DATA_DIR/data_object_label_2.zip; else echo "label archive already downloaded"; fi 

### B. Verify downloaded dataset <a class="anchor" id="head-2-2"></a>

In [None]:
# Check the dataset is present
!if [ ! -f $LOCAL_DATA_DIR/data_object_image_2.zip ]; then echo 'Image zip file not found, please download.'; else echo 'Found Image zip file.';fi
!if [ ! -f $LOCAL_DATA_DIR/data_object_label_2.zip ]; then echo 'Label zip file not found, please download.'; else echo 'Found Labels zip file.';fi

In [None]:
# This may take a while: verify integrity of zip files 
!sha256sum $LOCAL_DATA_DIR/data_object_image_2.zip | cut -d ' ' -f 1 | grep -xq '^351c5a2aa0cd9238b50174a3a62b846bc5855da256b82a196431d60ff8d43617$' ; \
if test $? -eq 0; then echo "images OK"; else echo "images corrupt, redownload!" && rm -f $LOCAL_DATA_DIR/data_object_image_2.zip; fi 
!sha256sum $LOCAL_DATA_DIR/data_object_label_2.zip | cut -d ' ' -f 1 | grep -xq '^4efc76220d867e1c31bb980bbf8cbc02599f02a9cb4350effa98dbb04aaed880$' ; \
if test $? -eq 0; then echo "labels OK"; else echo "labels corrupt, redownload!" && rm -f $LOCAL_DATA_DIR/data_object_label_2.zip; fi 
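
The same integrity check can be done in Python without shell pipelines. This is a minimal sketch using the standard `hashlib` module; the helper names are assumptions made for the example, and the expected digests would be the ones used in the cell above.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in fixed-size chunks so large archives never sit fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Compare the file's SHA-256 digest with an expected hex string (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.lower()
```

For example, `matches_checksum(image_zip_path, expected_digest)` returns `True` only when the downloaded archive is intact, where `image_zip_path` and `expected_digest` are placeholders for your own values.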

In [None]:
# Unpack the downloaded datasets to $LOCAL_DATA_DIR (mapped to $DATA_DOWNLOAD_DIR inside the TAO docker).
# The training images will be under $LOCAL_DATA_DIR/training/image_2 and
# labels will be under $LOCAL_DATA_DIR/training/label_2.
# The testing images will be under $LOCAL_DATA_DIR/testing/image_2.
!unzip -u $LOCAL_DATA_DIR/data_object_image_2.zip -d $LOCAL_DATA_DIR
!unzip -u $LOCAL_DATA_DIR/data_object_label_2.zip -d $LOCAL_DATA_DIR

In [None]:
# verify
import os

DATA_DIR = os.environ.get('LOCAL_DATA_DIR')
num_training_images = len(os.listdir(os.path.join(DATA_DIR, "training/image_2")))
num_training_labels = len(os.listdir(os.path.join(DATA_DIR, "training/label_2")))
num_testing_images = len(os.listdir(os.path.join(DATA_DIR, "testing/image_2")))
print("Number of images in the train/val set. {}".format(num_training_images))
print("Number of labels in the train/val set. {}".format(num_training_labels))
print("Number of images in the test set. {}".format(num_testing_images))
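
Matching counts do not guarantee that every image has its own label file. The sketch below pairs files by their base name, assuming that an image and its label share the same name apart from the extension; the helper name `unmatched_samples` is hypothetical.

```python
import os


def unmatched_samples(image_dir, label_dir):
    """Return (images without labels, labels without images), compared by file stem."""
    image_ids = {os.path.splitext(name)[0] for name in os.listdir(image_dir)}
    label_ids = {os.path.splitext(name)[0] for name in os.listdir(label_dir)}
    return sorted(image_ids - label_ids), sorted(label_ids - image_ids)


# Run against the extracted training set only if it is present.
data_dir = os.environ.get("LOCAL_DATA_DIR", "")
image_dir = os.path.join(data_dir, "training", "image_2")
label_dir = os.path.join(data_dir, "training", "label_2")
if os.path.isdir(image_dir) and os.path.isdir(label_dir):
    no_label, no_image = unmatched_samples(image_dir, label_dir)
    print("Images without labels:", len(no_label))
    print("Labels without images:", len(no_image))
```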

### C. Prepare tf records from kitti format dataset <a class="anchor" id="head-2-3"></a>

* Update the TFRecords spec file to take in your KITTI format dataset
* Create the TFRecords using the detectnet_v2 dataset_convert

*Note: TFRecords only need to be generated once.*

In [None]:
print("TFrecords conversion spec file for kitti training")
!cat $LOCAL_SPECS_DIR/detectnet_v2_tfrecords_kitti_trainval.txt

In [None]:
# Creating a new directory for the output tfrecords dump.
print("Converting Tfrecords for kitti trainval dataset")
!mkdir -p $LOCAL_DATA_DIR/tfrecords && rm -rf $LOCAL_DATA_DIR/tfrecords/*
!tao model detectnet_v2 dataset_convert \
                  -d $SPECS_DIR/detectnet_v2_tfrecords_kitti_trainval.txt \
                  -o $DATA_DOWNLOAD_DIR/tfrecords/kitti_trainval/kitti_trainval \
                  -r $USER_EXPERIMENT_DIR/

In [None]:
!ls -rlt $LOCAL_DATA_DIR/tfrecords/kitti_trainval/

### D. Download pre-trained model <a class="anchor" id="head-2-4"></a>
Download the correct pretrained model from the NGC model registry for your experiment. Please note that for DetectNet_v2, the input is expected to be 0-1 normalized with input channels in RGB order. Therefore, for optimum results, please download model templates from `nvidia/tao/pretrained_detectnet_v2`. The templates are now organized as version strings. For example, to download a resnet18 model suitable for DetectNet_v2, use the NGC object `nvidia/tao/pretrained_detectnet_v2:resnet18`.

All other models expect input preprocessing with mean subtraction and input channels in BGR order. Using them as pretrained weights may result in suboptimal performance.
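
The difference between the two preprocessing conventions can be seen on a single pixel. The sketch below is illustrative only: the per-channel means are the commonly used ImageNet BGR means and are an assumption here, since this notebook does not state which values the other models use.

```python
def normalize_rgb_0_1(pixel_rgb):
    """DetectNet_v2-style input: keep RGB order and scale 8-bit values to [0, 1]."""
    return tuple(channel / 255.0 for channel in pixel_rgb)


# Assumed values: widely used ImageNet channel means in BGR order.
ASSUMED_BGR_MEANS = (103.939, 116.779, 123.68)


def mean_subtract_bgr(pixel_rgb, means_bgr=ASSUMED_BGR_MEANS):
    """Alternative convention: reorder to BGR and subtract a per-channel mean."""
    r, g, b = pixel_rgb
    return tuple(value - mean for value, mean in zip((b, g, r), means_bgr))


pixel = (255, 0, 51)  # a hypothetical 8-bit RGB pixel
print(normalize_rgb_0_1(pixel))
print(mean_subtract_bgr(pixel))
```

Weights trained under one convention see very different input statistics under the other, which is why mixing them may hurt accuracy.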

You may also use this notebook with the following purpose-built pretrained models 
* [PeopleNet](https://ngc.nvidia.com/catalog/models/nvidia:tao:peoplenet)
* [TrafficCamNet](https://ngc.nvidia.com/catalog/models/nvidia:tao:trafficcamnet)
* [DashCamNet](https://ngc.nvidia.com/catalog/models/nvidia:tao:dashcamnet)
* [FaceDetect-IR](https://ngc.nvidia.com/catalog/models/nvidia:tao:facedetectir) 

In [None]:
# Installing NGC CLI on the local machine.
## Download and install
%env CLI=ngccli_cat_linux.zip
!mkdir -p $LOCAL_PROJECT_DIR/ngccli

# Remove any previously existing CLI installations
!rm -rf $LOCAL_PROJECT_DIR/ngccli/*
!wget "https://ngc.nvidia.com/downloads/$CLI" -P $LOCAL_PROJECT_DIR/ngccli
!unzip -u "$LOCAL_PROJECT_DIR/ngccli/$CLI" -d $LOCAL_PROJECT_DIR/ngccli/
!rm $LOCAL_PROJECT_DIR/ngccli/*.zip 
os.environ["PATH"]="{}/ngccli/ngc-cli:{}".format(os.getenv("LOCAL_PROJECT_DIR", ""), os.getenv("PATH", ""))

In [None]:
# List models available in the model registry.
!ngc registry model list nvidia/tao/pretrained_detectnet_v2:*

In [None]:
# Create the target destination to download the model.
!mkdir -p $LOCAL_EXPERIMENT_DIR/pretrained_resnet18/

In [None]:
# Download the pretrained model from NGC
!ngc registry model download-version nvidia/tao/pretrained_detectnet_v2:resnet18 \
    --dest $LOCAL_EXPERIMENT_DIR/pretrained_resnet18

In [None]:
!ls -rlt $LOCAL_EXPERIMENT_DIR/pretrained_resnet18/pretrained_detectnet_v2_vresnet18

## 3. Provide training specification <a class="anchor" id="head-3"></a>
* Tfrecords for the train datasets
    * To use the newly generated tfrecords, update the dataset_config parameter in the spec file at `$SPECS_DIR/detectnet_v2_train_resnet18_kitti.txt` 
    * Update the fold number to use for evaluation. In case of random data split, please use fold `0` only
    * For sequence-wise split, you may use any fold generated from the dataset convert tool
* Pre-trained models
* Augmentation parameters for on the fly data augmentation
* Other training (hyper-)parameters such as batch size, number of epochs, learning rate etc.

In [None]:
!cat $LOCAL_SPECS_DIR/detectnet_v2_train_resnet18_kitti.txt

## 4. Run TAO training <a class="anchor" id="head-4"></a>
* Provide the sample spec file and the output directory location for models

*Note: The training may take hours to complete. Also, the remainder of the notebook assumes that the training was done in single-GPU mode. When running in multi-GPU mode, expect to update the pruning and inference steps with new pruning thresholds and updated parameters in the clusterfile.json accordingly for optimum performance.*

*Detectnet_v2 now supports restart from checkpoint. In case the training job is killed prematurely, you may resume training from the closest checkpoint by simply re-running the **same** command line. Please do make sure to use the <u>**same number of GPUs**</u> when restarting the training.*

*When running the training with `NUM_GPUS` > 1, you may need to modify the `batch_size_per_gpu` and `learning_rate` to get a similar mAP to a single-GPU training run. In most cases, scaling down the batch size by a factor of `NUM_GPUS` or scaling up the learning rate by a factor of `NUM_GPUS` is a good place to start.*
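
The scaling rule of thumb above amounts to simple arithmetic. The following sketch computes either adjustment; the function name, the `strategy` argument, and the sample values are assumptions for illustration, and the results still have to be written into the spec file by hand.

```python
def scale_for_multi_gpu(batch_size_per_gpu, learning_rate, num_gpus, strategy="batch"):
    """Return (batch_size_per_gpu, learning_rate) adjusted for a multi-GPU run.

    strategy="batch" divides the per-GPU batch size by num_gpus;
    strategy="lr" multiplies the learning rate by num_gpus.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    if strategy == "batch":
        return max(1, batch_size_per_gpu // num_gpus), learning_rate
    if strategy == "lr":
        return batch_size_per_gpu, learning_rate * num_gpus
    raise ValueError("strategy must be 'batch' or 'lr'")


# Hypothetical single-GPU settings scaled for 4 GPUs.
print(scale_for_multi_gpu(16, 5e-4, 4, strategy="batch"))
print(scale_for_multi_gpu(16, 5e-4, 4, strategy="lr"))
```

Treat either result as a starting point and confirm it against the mAP of the single-GPU run.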

In [None]:
!tao model detectnet_v2 train -e $SPECS_DIR/detectnet_v2_train_resnet18_kitti.txt \
                        -r $USER_EXPERIMENT_DIR/experiment_dir_unpruned \
                        -n resnet18_detector \
                        --gpus $NUM_GPUS

In [None]:
print('Model for each epoch:')
print('---------------------')
!ls -lh $LOCAL_EXPERIMENT_DIR/experiment_dir_unpruned/weights

## 5. Evaluate the trained model <a class="anchor" id="head-5"></a>

In [None]:
!tao model detectnet_v2 evaluate -e $SPECS_DIR/detectnet_v2_train_resnet18_kitti.txt\
                           -m $USER_EXPERIMENT_DIR/experiment_dir_unpruned/weights/resnet18_detector.hdf5

## 6. Prune the trained model <a class="anchor" id="head-6"></a>
* Specify pre-trained model
* Equalization criterion (`Applicable for resnets and mobilenets`)
* Threshold for pruning.
* Output directory to store the model

*Usually, you only need to adjust `-pth` (threshold) to trade off accuracy against model size. A higher `pth` gives you a smaller model (and thus higher inference speed) but worse accuracy. The threshold to use depends on the dataset. A `pth` value of `5.2e-6` is only a starting point. If the retrain accuracy is good, you can increase this value to get smaller models. Otherwise, lower this value to get better accuracy.*

*In some internal studies, we have noticed that a `pth` value of 0.01 is a good starting point for detectnet_v2 models.*
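
To build intuition for how the threshold trades size against accuracy, here is a toy sketch of magnitude-based channel pruning: each filter is scored by its L2 norm relative to the largest norm, and filters below the threshold are dropped. This is an illustrative stand-in under stated assumptions, not the TAO pruning algorithm, and the toy weights are made up.

```python
import math


def channels_to_keep(filters, threshold):
    """Return indices of filters whose L2 norm, relative to the largest norm, meets the threshold."""
    norms = [math.sqrt(sum(w * w for w in f)) for f in filters]
    largest = max(norms)
    if largest == 0:
        return []
    return [i for i, norm in enumerate(norms) if norm / largest >= threshold]


# Hypothetical layer with four tiny filters.
toy_filters = [[0.5, -0.5], [0.001, 0.002], [0.2, 0.1], [0.0, 0.00001]]
for pth in (5.2e-6, 0.01, 0.5):
    print(pth, channels_to_keep(toy_filters, pth))
```

Raising the threshold keeps fewer channels, which mirrors the behavior described above: a smaller model at the cost of accuracy that retraining then has to recover.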

In [None]:
# Create an output directory if it doesn't exist.
!mkdir -p $LOCAL_EXPERIMENT_DIR/experiment_dir_pruned

In [None]:
!tao model detectnet_v2 prune \
                  -m $USER_EXPERIMENT_DIR/experiment_dir_unpruned/weights/resnet18_detector.hdf5 \
                  -o $USER_EXPERIMENT_DIR/experiment_dir_pruned/resnet18_nopool_bn_detectnet_v2_pruned.hdf5 \
                  -eq union \
                  -pth 0.0000052

In [None]:
!ls -rlt $LOCAL_EXPERIMENT_DIR/experiment_dir_pruned/

## 7. Retrain the pruned model <a class="anchor" id="head-7"></a>
* Model needs to be re-trained to bring back accuracy after pruning
* Specify re-training specification with pretrained weights as pruned model.

*Note: For retraining, please set the `load_graph` option to `true` in the model_config to load the pruned model graph. Also, if after retraining, the model shows some decrease in mAP, it could be that the originally trained model was pruned a little too much. Please try reducing the pruning threshold (thereby reducing the pruning ratio) and use the new model to retrain.*

*Note: DetectNet_v2 now supports Quantization Aware Training (QAT) to help with optimizing the model. By default, the training in the cell below doesn't run the model with QAT enabled. For information on training a model with QAT, please refer to the TAO Toolkit DetectNet_v2 documentation.*

In [None]:
# Printing the retrain experiment file. 
# Note: We have updated the experiment file to include the 
# newly pruned model as a pretrained weights and, the
# load_graph option is set to true 
!cat $LOCAL_SPECS_DIR/detectnet_v2_retrain_resnet18_kitti.txt

In [None]:
# Retraining using the pruned model as pretrained weights 
!tao model detectnet_v2 train -e $SPECS_DIR/detectnet_v2_retrain_resnet18_kitti.txt \
                        -r $USER_EXPERIMENT_DIR/experiment_dir_retrain \
                        -n resnet18_detector_pruned \
                        --gpus $NUM_GPUS

In [None]:
# Listing the newly retrained model.
!ls -rlt $LOCAL_EXPERIMENT_DIR/experiment_dir_retrain/weights

## 8. Evaluate the retrained model <a class="anchor" id="head-8"></a>

This section evaluates the pruned and retrained model, using the `evaluate` command.

In [None]:
!tao model detectnet_v2 evaluate -e $SPECS_DIR/detectnet_v2_retrain_resnet18_kitti.txt \
                           -m $USER_EXPERIMENT_DIR/experiment_dir_retrain/weights/resnet18_detector_pruned.hdf5

## 9. Model Export <a class="anchor" id="head-10"></a>

In [None]:
!mkdir -p $LOCAL_EXPERIMENT_DIR/experiment_dir_final

!tao deploy detectnet_v2 

# Remove any pre-existing copy of the ONNX file.
import os
output_file = os.path.join(os.environ['LOCAL_EXPERIMENT_DIR'],
                           "experiment_dir_final/resnet18_detector.onnx")
if os.path.exists(output_file):
    os.remove(output_file)

!tao model detectnet_v2 export \
                  -m $USER_EXPERIMENT_DIR/experiment_dir_unpruned/weights/resnet18_detector.hdf5 \
                  -e $SPECS_DIR/detectnet_v2_train_resnet18_kitti.txt \
                  -o $USER_EXPERIMENT_DIR/experiment_dir_final/resnet18_detector.onnx \
                  --onnx_route tf2onnx \
                  --gen_ds_config

## 11. Use ZEDEDA Edge AI service to convert to TensorRT engine for Jetson 

We can host ONNX models directly on the Triton Inference Server app on ZEDEDA-supported Edge devices, and that would work. This is useful for Edge devices without any GPU. But where is the fun in that if we don't optimize the model to use GPUs? For this demo, we want to run the model on NVIDIA Jetson devices managed by the ZEDEDA Controller, so we need an optimized version of the ONNX model that can run inference on the Jetson iGPU.

There is one limitation: the conversion from ONNX to the TensorRT engine format needs to be done on the target hardware. This means we cannot perform the conversion inline in this notebook, because a TensorRT engine generated on this host will not work on Jetson. To address this need, ZEDEDA provides the conversion as a service through its Edge AI toolkit. This section passes the ONNX file to the ZEDEDA Edge AI service, which optimizes the model into a TensorRT engine and returns the engine file as output (saved here as `model.plan`).

In [1]:
from zededa_edge_ai_toolkit import ZededaTRTConverter

onnx_file_path = "/home/ubuntu/tao/detectnet_v2/experiment_dir_final/resnet18_detector.onnx"

ZededaTRTConverter(onnx_file_path, "model.plan")

✅ Output file saved as: model.plan


Verify that the model.plan is present in the target directory:

In [None]:
!ls -ltr model.plan
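
For comparison, when you do have shell access to the target device, a similar conversion is commonly performed with TensorRT's `trtexec` tool. The sketch below only assembles such a command; it makes no claim about what the ZEDEDA service runs internally, and the helper name and file names are assumptions.

```python
def build_trtexec_command(onnx_path, engine_path, fp16=True):
    """Assemble a trtexec invocation that builds a TensorRT engine from an ONNX file."""
    command = ["trtexec", f"--onnx={onnx_path}", f"--saveEngine={engine_path}"]
    if fp16:
        # Allow FP16 kernels, which Jetson GPUs support.
        command.append("--fp16")
    return command


command = build_trtexec_command("resnet18_detector.onnx", "model.plan")
print(" ".join(command))
```

The resulting engine is tied to the GPU and TensorRT version it was built with, which is why the command has to run on the target hardware.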

## 12. Use ZEDEDA Edge AI Orchestrator to build an Edge AI Application profile

### Using the retrained NGC model to build an Edge AI application profile 

Now that we have built the model that is ready to be used with Triton Inference Server, let's proceed to build an Edge AI application profile that has the following components:
* Triton Inference Server as a container built for the Jetson devices
* The model we just built packaged as the OCI volume to be mounted to the Triton Inference Server
* A metrics exporter container based on Grafana Alloy, to export metrics from the Triton Inference Server to Grafana Cloud
* An Edge AI use case demo application, which demonstrates AI business logic by passing a demo video file to Triton Inference Server and showing the output

We also demonstrate the capability of the ZEDEDA Orchestration Engine by deploying this profile across the fleet in one shot, using the application profile deployment feature. In this example, we match Edge devices based on device attributes called tags, which are similar to labels in Kubernetes. The overall workflow is shown below:

<img align="center" src="https://github.com/cshari-zededa/demo_files/releases/download/v1.0/Image.3-2-25.at.3.09.AM.jpg" width="1080"> 
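
Tag-based selection works much like a label selector: a device is chosen when all of the requested key/value pairs appear in its tags. The sketch below illustrates the idea on hypothetical in-memory device records; it does not call the ZEDEDA API, and the record layout is an assumption.

```python
def devices_matching_tags(devices, required_tags):
    """Return names of devices whose tags contain every required key/value pair."""
    return [
        device["name"]
        for device in devices
        if all(device.get("tags", {}).get(key) == value
               for key, value in required_tags.items())
    ]


# Hypothetical fleet inventory.
fleet = [
    {"name": "jetson-01", "tags": {"edgeDeviceGroup": "jetsons", "site": "lab"}},
    {"name": "x86-01", "tags": {"edgeDeviceGroup": "servers"}},
    {"name": "jetson-02", "tags": {"edgeDeviceGroup": "jetsons"}},
]
selected = devices_matching_tags(fleet, {"edgeDeviceGroup": "jetsons"})
print(selected)
```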

In [2]:
import getpass
import subprocess

username = input("Enter Docker Username: ")
password = getpass.getpass("Enter Docker Password: ")

# Pass the password on stdin so it never appears in a shell command line.
subprocess.run(["docker", "login", "-u", username, "--password-stdin"], input=password.encode(), check=True)

print("✅ Successfully logged into Docker!")
from zededa_controller import ZededaController
zededa = ZededaController()

Enter Docker Username:  csharizededa
Enter Docker Password:  ········


Login Succeeded
✅ Successfully logged into Docker!





In [3]:
import os
import subprocess

!rm -rf models
!mkdir -p models/peoplenet/1

!ls -ltr models 
# Define parameters
model_repository_path = "models"
docker_image_name = "modelrepository"

# Step 1: Create a Dockerfile that copies the model repository into an empty image
dockerfile_content = f"""
FROM scratch
COPY {model_repository_path}/ /models
"""

dockerfile_path = "./Dockerfile"

# Write Dockerfile
with open(dockerfile_path, "w") as f:
    f.write(dockerfile_content)

# Step 2: Build the OCI image
subprocess.run(["docker", "build", "-t", docker_image_name, "."], check=True)


total 4
drwx------ 3 ubuntu ubuntu 4096 Mar  3 01:19 peoplenet


#0 building with "default" instance using docker driver

#1 [internal] load build definition from Dockerfile
#1 transferring dockerfile: 72B done
#1 DONE 0.0s

#2 [internal] load .dockerignore
#2 transferring context: 2B done
#2 DONE 0.0s

#3 [internal] load build context
#3 transferring context: 102B done
#3 DONE 0.0s

#4 [1/1] COPY models/ /models
#4 DONE 0.0s

#5 exporting to image
#5 exporting layers 0.0s done
#5 writing image sha256:beaa6586838f861f28c48f0c7ad964826f684c47e3c2e687c25e2e0939bb40ca done
#5 naming to docker.io/library/modelrepository done
#5 DONE 0.1s


CompletedProcess(args=['docker', 'build', '-t', 'modelrepository', '.'], returncode=0)
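
Triton expects each model in the repository to follow a `<model_name>/<version>/` layout, so the engine file has to be staged into `models/` before the image is built. The sketch below shows one way to do that; the helper name is hypothetical, and the minimal `config.pbtxt` is an assumption, since a real deployment typically also declares the batch size and the input and output tensors.

```python
import os
import shutil


def stage_triton_model(plan_path, repository_dir, model_name, version=1):
    """Copy a TensorRT plan into <repository>/<model>/<version>/ and write a minimal config."""
    version_dir = os.path.join(repository_dir, model_name, str(version))
    os.makedirs(version_dir, exist_ok=True)
    shutil.copy(plan_path, os.path.join(version_dir, "model.plan"))
    config_path = os.path.join(repository_dir, model_name, "config.pbtxt")
    with open(config_path, "w") as config_file:
        # Minimal configuration: the model name and the TensorRT platform.
        config_file.write(f'name: "{model_name}"\nplatform: "tensorrt_plan"\n')
    return version_dir


# Stage the converted engine only if it exists in the working directory.
if os.path.exists("model.plan"):
    print(stage_triton_model("model.plan", "models", "peoplenet"))
```

Running this before the build step means the `COPY` instruction in the Dockerfile picks up the engine and its configuration.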

In [9]:
subprocess.run(["docker", "tag", docker_image_name, f"{username}/{docker_image_name}:1.0"], check=True)
subprocess.run(["docker", "push", f"{username}/{docker_image_name}:1.0"], check=True)

The push refers to repository [docker.io/csharizededa/modelrepository]
b73af414994c: Preparing
b73af414994c: Pushed
1.0: digest: sha256:1ac539244e19dd4ecd91301990841ee89c89c7145cbfe9236e026a891c9a44e7 size: 523


CompletedProcess(args=['docker', 'push', 'csharizededa/modelrepository:1.0'], returncode=0)

In [10]:
#Create Edge AI Application profile
profile_data = {
    "profile_name": "nvidia_triton_profile",
    "oci_triton_inference_server": "csharizededa/tritonserver:6.0",
    "oci_model_repository": "csharizededa/ngcmodels:1.0",
    "oci_prometheus_server": "csharizededa/zprometheus:1.0",
    "oci_edge_ai_app": "csharizededa/edgeapp:1.0"
}
profile = zededa.create_profile(profile_data)
print("Created Profile:", json.dumps(profile, indent=4, sort_keys=True))

# Test get_profile
all_profiles = zededa.get_profile()
#print("All Profiles:", all_profiles)



Creating image for: oci_triton_inference_server


FileNotFoundError: [Errno 2] No such file or directory: '/home/cshari/zededa/fleet/backend/payload_templates/image_create.json'

## 13. Deploy the profile on your Edge fleet using device tags

In [None]:
# Let's collect edge devices matching a given tag
devices = zededa.get_devices(tag='"edgeDeviceGroup":"jetsons"')
#print("Devices:", devices)

# Create a profile deployment with the profile we created and the devices we selected above
profile_deployment = zededa.create_profile_deployment("nvidia_triton_profile", '"edgeDeviceGroup":"jetsons"')
print("Profile Deployment:", json.dumps(profile_deployment, indent=4, sort_keys=True))
