## TAO remote client - Data-Services
### The workflow in a nutshell
TAO Data Services include 4 key pipelines:
1. Offline data augmentation using DALI
2. Auto labeling using TAO Mask Auto-labeler (MAL)
3. Annotation conversion
4. Groundtruth analytics

## Learning Objectives

In this notebook, you will learn how to leverage the simplicity and convenience of TAO to:

* Convert KITTI dataset to COCO format
* Run auto-labeling to generate pseudo masks for KITTI bounding boxes
* Apply data augmentation to the KITTI dataset with bounding box refinement
* Run data analytics to collect useful statistics on the original and augmented KITTI dataset

### Table of contents

1. [Create a cloud workspace](#head-2)
1. [Convert KITTI data to COCO format](#head-1)
1. [Generate pseudo-masks with the auto-labeler](#head-2)
1. [Apply data augmentation](#head-3)
1. [Perform data analytics](#head-4)
1. [Perform data validation](#head-5)


### Requirements
Please find the server requirements [here](https://docs.nvidia.com/tao/tao-toolkit/text/tao_toolkit_api/api_setup.html#)

### Install TAO remote client

In [None]:
# # SKIP this step IF you have already installed the TAO-Client wheel.
! pip3 install nvidia-tao-client

In [None]:
# # View the version of the TAO-Client
! tao --version

In [None]:
import os
import subprocess
import json
import time
from IPython.display import clear_output

In [None]:
# Restore variables after a Jupyter session restart so execution can resume where it left off
%store -r workspace_id
%store -r kitti_dataset_id
%store -r coco_dataset_id
%store -r coco_mask_dataset_id
%store -r convert_job_id
%store -r auto_labeling_job_id
%store -r coco_mask_augmented_dataset_id
%store -r analyze_job_id
%store -r validate_annotations_job_id
%store -r validate_images_job_id

In [None]:
namespace = 'default'
job_map = {}

### Common Functions used across the notebook

#### Function to parse logs

In [None]:
def my_tail(model_name_cli, job_id):
    """Poll a job every 10 seconds, printing its logs, until it reaches a terminal state."""
    status = None
    while True:
        time.sleep(10)
        clear_output(wait=True)
        response = subprocess.getoutput(f"tao {model_name_cli} get-job-metadata --job-id {job_id}")
        try:
            response = json.loads(response)
        except json.JSONDecodeError:
            # The CLI printed something other than JSON (for example an error message)
            print(response)
            raise
        # Stop as soon as the job metadata reports a terminal status
        if response and response.get("status") in ("Done", "Error", "Canceled", "Paused"):
            print(json.dumps(response.get("job_details", {}), indent=4))
            status = response.get("status")
            assert status == "Done", f"Status is not Done, it is {status}"
            break

        logs = subprocess.getoutput(f"tao {model_name_cli} get-job-logs --job-id {job_id}")
        if not logs:
            continue
        # The log stream ends with an "Error EOF" or "Done EOF" marker line
        for line in logs.split("\n"):
            print(line.strip())
            if line.strip() == "Error EOF":
                status = "Error"
                break
            elif line.strip() == "Done EOF":
                status = "Done"
                break
        if status is not None:
            break
    return status

#### Function to load login details from saved config

In [None]:
def load_tao_credentials_from_config():
    """Load TAO credentials from ~/.tao/config and set as environment variables"""
    from configparser import ConfigParser
    from pathlib import Path
    import os
    
    config_path = Path.home() / '.tao' / 'config'
    
    if not config_path.exists():
        print(f"Warning: Config file not found at {config_path}")
        print("Please run 'tao login' first")
        return False
    
    try:
        parser = ConfigParser()
        parser.read(config_path)
        
        # Read from [CURRENT] section
        if parser.has_section('CURRENT'):
            section = parser['CURRENT']
        else:
            print("Warning: No [CURRENT] section found in config file")
            return False
        
        # Set environment variables
        if 'tao_base_url' in section:
            os.environ['TAO_BASE_URL'] = section['tao_base_url']
            print(f"✓ TAO_BASE_URL set to: {section['tao_base_url']}")
        
        if 'tao_org' in section:
            os.environ['TAO_ORG'] = section['tao_org']
            print(f"✓ TAO_ORG set to: {section['tao_org']}")
        
        if 'tao_token' in section:
            os.environ['TAO_TOKEN'] = section['tao_token']
            print(f"✓ TAO_TOKEN set (expires: check token if auth fails)")
        
        return True
        
    except Exception as e:
        print(f"Error reading config file: {e}")
        return False
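
For reference, `load_tao_credentials_from_config` expects an INI-style file with a `[CURRENT]` section. The sketch below writes a sample file to a temporary directory and reads it back with `ConfigParser`. The key names come from the function above; the values are placeholders, `read_current_section` is a hypothetical helper, and the real file written by `tao login` may contain additional keys.

```python
import tempfile
from configparser import ConfigParser
from pathlib import Path

# Placeholder values for illustration only; `tao login` writes the real ones.
SAMPLE_CONFIG = """\
[CURRENT]
tao_base_url = https://your_tao_ip_address:port/api/v2
tao_org = your_ngc_org
tao_token = placeholder-token
"""

def read_current_section(config_path):
    """Return the [CURRENT] section as a plain dict, or {} if it is missing."""
    parser = ConfigParser()
    parser.read(config_path)
    if not parser.has_section("CURRENT"):
        return {}
    return dict(parser["CURRENT"])

# Write the sample to a throwaway location instead of touching ~/.tao/config.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = Path(tmp) / "config"
    sample_path.write_text(SAMPLE_CONFIG)
    current = read_current_section(sample_path)

print(sorted(current))
```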

### FIXME's <a class="anchor" id="head-2"></a>

The cells below contain `FIXME` comments that mark the values to set before running. The numbers in those comments are not sequential, so use the descriptions here as the checklist:

1. Set the TAO API endpoint, including the ip_address and port_number ([info](https://docs.nvidia.com/tao/tao-toolkit/text/tao_toolkit_api/api_rest_api.html))
1. Assign the ngc_key variable
1. Assign the ngc_org_name variable
1. Set the cloud storage details
1. Assign the path of the KITTI dataset relative to the bucket
1. Optionally, set the database backup/restore archive filename

#### Set API service's host information

In [None]:
# FIXME 4: Set TAO API environment variables

# Set to your TAO API endpoint
os.environ["TAO_BASE_URL"] = os.environ.get("TAO_BASE_URL", "https://your_tao_ip_address:port/api/v2")

#### Set NGC Personal key for authentication and NGC org to access API services

In [None]:
os.environ["NGC_KEY"] = ngc_key = os.environ.get("NGC_KEY", "your_ngc_key")  # FIXME6 example: (Add NGC Personal key)

In [None]:
os.environ["NGC_ORG"] = ngc_org_name = os.environ.get("NGC_ORG", "nvstaging")  # FIXME7 your NGC ORG

### Login <a class="anchor" id="head-3"></a>

In [None]:
# Exchange NGC_API_KEY for JWT
! tao login --ngc-org-name {ngc_org_name} --ngc-key {ngc_key} --enable-telemetry

# Load credentials when this cell runs
load_tao_credentials_from_config()

### Get NVCF gpu details <a class="anchor" id="head-2"></a>

One of the keys of the response JSON is to be used as the platform_id when you run each job.

In [None]:
# # Valid only for NVCF backend during TAO-API helm deployment currently
# response = json.loads(subprocess.getoutput("tao get-gpu-types"))
# print(json.dumps(response, indent=4))

### Create cloud workspace
This workspace will be the place where your datasets reside and your results of TAO API jobs will be pushed to.

If you want to have different workspaces for datasets and experiments, duplicate the workspace creation part and adjust the metadata accordingly.

In [None]:
# FIXME 7: Dataset Cloud bucket details to download dataset or push job artifacts for jobs

cloud_metadata = {}

# A Representative name for this cloud info
os.environ["TAO_WORKSPACE_NAME"] = cloud_metadata["name"] = os.environ.get("TAO_WORKSPACE_NAME", "AWS workspace info")

# Cloud specific details. Below is assuming AWS.
cloud_metadata["cloud_specific_details"] = {}

# Cloud type: AWS, HuggingFace or Azure
os.environ["TAO_WORKSPACE_CLOUD_TYPE"] = cloud_metadata["cloud_specific_details"]["cloud_type"] = os.environ.get("TAO_WORKSPACE_CLOUD_TYPE", "aws")

# Bucket region
os.environ["TAO_WORKSPACE_CLOUD_REGION"] = cloud_metadata["cloud_specific_details"]["cloud_region"] = os.environ.get("TAO_WORKSPACE_CLOUD_REGION", "us-west-1")

# Bucket name
os.environ["TAO_WORKSPACE_CLOUD_BUCKET_NAME"] = cloud_metadata["cloud_specific_details"]["cloud_bucket_name"] = os.environ.get("TAO_WORKSPACE_CLOUD_BUCKET_NAME", "bucket_name")

# Access and Secret keys
os.environ["TAO_WORKSPACE_CLOUD_ACCESS_KEY"] = cloud_metadata["cloud_specific_details"]["access_key"] = os.environ.get("TAO_WORKSPACE_CLOUD_ACCESS_KEY", "access_key")
os.environ["TAO_WORKSPACE_CLOUD_SECRET_KEY"] = cloud_metadata["cloud_specific_details"]["secret_key"] = os.environ.get("TAO_WORKSPACE_CLOUD_SECRET_KEY", "secret_key")
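
Because every value above falls back to a placeholder default, it is easy to create a workspace with credentials that were never filled in. A minimal sketch of a pre-flight check follows. `find_unset_fields` is a hypothetical helper (not part of the TAO client), and the placeholder strings are the defaults used in the cell above.

```python
# Placeholder defaults taken from the cell above.
PLACEHOLDERS = {
    "cloud_bucket_name": "bucket_name",
    "access_key": "access_key",
    "secret_key": "secret_key",
}

def find_unset_fields(details):
    """Return the keys of `details` that are empty or still hold a placeholder."""
    unset = []
    for key, placeholder in PLACEHOLDERS.items():
        value = details.get(key, "")
        if not value or value == placeholder:
            unset.append(key)
    return unset

# Example with a partially filled dictionary (values are made up).
example_details = {
    "cloud_type": "aws",
    "cloud_region": "us-west-1",
    "cloud_bucket_name": "my-tao-bucket",
    "access_key": "access_key",  # still the placeholder
    "secret_key": "",            # empty
}
print(find_unset_fields(example_details))
```

In the notebook, the same check could be run on `cloud_metadata["cloud_specific_details"]` before the workspace is created, raising an error if the returned list is not empty.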

In [None]:
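# Build the pieces first: reusing double quotes inside an f-string is a syntax error before Python 3.12
cloud_details = cloud_metadata["cloud_specific_details"]
cloud_type = cloud_details["cloud_type"]
cloud_details_json = json.dumps(cloud_details)
workspace_id = subprocess.getoutput(f"tao annotations create-workspace --name 'AWS Workspace' --cloud-type {cloud_type} --cloud-specific-details '{cloud_details_json}'")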
print(workspace_id)
%store workspace_id
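
The command above wraps the JSON payload in single quotes by hand, which breaks if any value (for example a secret key) contains a single quote. One option, sketched below, is to quote each argument with `shlex.quote` from the standard library. `build_command` is a hypothetical helper, the flag names simply mirror the cell above, and the credential values are made up.

```python
import json
import shlex

def build_command(base, options):
    """Build a shell-safe command string from a base command and --flag/value pairs."""
    parts = list(base)
    for flag, value in options.items():
        parts.append(flag)
        # Strings are passed through; anything else is serialized as JSON.
        parts.append(value if isinstance(value, str) else json.dumps(value))
    # shlex.quote escapes spaces, quotes and other shell metacharacters.
    return " ".join(shlex.quote(part) for part in parts)

details = {"cloud_type": "aws", "secret_key": "it's-a-secret"}  # made-up values
command = build_command(
    ["tao", "annotations", "create-workspace"],
    {
        "--name": "AWS Workspace",
        "--cloud-type": details["cloud_type"],
        "--cloud-specific-details": details,
    },
)
print(command)

# Round trip: a POSIX shell would split the string back into the original arguments.
assert json.loads(shlex.split(command)[-1]) == details
```

The resulting string can be passed to `subprocess.getoutput` in the same way as the hand-built f-string.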

In [None]:
# #Optional: Restore database with a mongodump file saved in workspace dump/archive/{backup_filename}
# backup_file_name = "mongodump.tar.gz" # FIXME 7
# response = subprocess.getoutput(f"tao annotations restore-workspace --workspace-id {workspace_id} --backup_file_name {backup_file_name}")
# print(response)

## 1. Convert KITTI data to COCO format <a class="anchor" id="head-1"></a>
We first convert the dataset from KITTI to COCO format.
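
To make the conversion concrete: a KITTI label line stores a box as `left top right bottom` pixel coordinates (fields 5 to 8), whereas a COCO annotation stores `[x, y, width, height]`. The sketch below converts one line by hand. It only illustrates the two formats and is not the code that the `annotation_format_convert` action runs; the `kitti_line_to_coco` helper and the sample values are made up for this example.

```python
def kitti_line_to_coco(line, image_id, annotation_id, category_ids):
    """Convert one KITTI label line into a COCO-style annotation dict."""
    fields = line.split()
    class_name = fields[0]
    # KITTI fields 5-8 (0-based 4..7): left, top, right, bottom in pixels.
    left, top, right, bottom = (float(v) for v in fields[4:8])
    width, height = right - left, bottom - top
    return {
        "id": annotation_id,
        "image_id": image_id,
        "category_id": category_ids[class_name],
        "bbox": [left, top, width, height],  # COCO uses x, y, width, height
        "area": width * height,
        "iscrowd": 0,
    }

# Made-up label line: class, truncation, occlusion, alpha, box, then 3D fields.
sample = "car 0.00 0 0.00 100.00 120.00 180.00 200.00 0 0 0 0 0 0 0"
annotation = kitti_line_to_coco(
    sample, image_id=1, annotation_id=1, category_ids={"car": 1}
)
print(annotation["bbox"])
```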

### Define the task and action

### Create dataset
We support both KITTI and COCO data formats.

A KITTI dataset follows the directory structure displayed below:
```
$DATA_DIR/dataset
├── images
│   ├── image_name_1.jpg
│   ├── image_name_2.jpg
│   └── ...
└── labels
    ├── image_name_1.txt
    ├── image_name_2.txt
    └── ...
```

A COCO dataset follows the directory structure displayed below:
```
$DATA_DIR/dataset
├── images
│   ├── image_name_1.jpg
│   ├── image_name_2.jpg
│   └── ...
└── annotations.json
```
This notebook uses the KITTI object detection dataset as its example. For more details, see the [KITTI object detection benchmark page](http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=2d).
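
Before uploading data to the bucket, it can help to confirm locally that it matches the KITTI layout shown above. The following is a minimal sketch, assuming the dataset is available on local disk. `check_kitti_layout` is a hypothetical helper that only compares file stems; it does not parse the label contents.

```python
import tempfile
from pathlib import Path

def check_kitti_layout(dataset_dir):
    """Return (images_without_labels, labels_without_images) as sorted lists of stems."""
    root = Path(dataset_dir)
    images_dir, labels_dir = root / "images", root / "labels"
    if not images_dir.is_dir() or not labels_dir.is_dir():
        raise FileNotFoundError("Expected 'images' and 'labels' directories")
    image_stems = {p.stem for p in images_dir.iterdir() if p.is_file()}
    label_stems = {p.stem for p in labels_dir.glob("*.txt")}
    return sorted(image_stems - label_stems), sorted(label_stems - image_stems)

# Build a tiny throwaway dataset to exercise the check.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "images").mkdir()
    (root / "labels").mkdir()
    for name in ("image_name_1.jpg", "image_name_2.jpg"):
        (root / "images" / name).touch()
    (root / "labels" / "image_name_1.txt").touch()
    missing_labels, missing_images = check_kitti_layout(root)

print(missing_labels, missing_images)
```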

### Create a KITTI dataset

In [None]:
# FIXME5 : Set path relative to cloud bucket
os.environ["TAO_KITTI_DATASET_PATH"] = kitti_dataset_path = os.environ.get("TAO_KITTI_DATASET_PATH", "/data/tao_od_synthetic_subset_train_convert_cleaned/")

In [None]:
# Create dataset
kitti_dataset_id = subprocess.getoutput(f"tao annotations create-dataset --dataset-type object_detection --dataset-format kitti --workspace-id {workspace_id} --cloud-file-path {kitti_dataset_path} --use-for '{json.dumps(['testing'])}'")
print(kitti_dataset_id)
%store kitti_dataset_id

In [None]:
# Check progress
while True:
    clear_output(wait=True)
    response = subprocess.getoutput(f"tao annotations get-dataset-metadata --dataset-id {kitti_dataset_id} ")
    try:
        response = json.loads(response)
    except Exception as e:
        print(response)
        raise e
    print(json.dumps(response, sort_keys=True, indent=4))
    if response.get("status") == "invalid_pull":
        raise ValueError("Dataset pull failed")
    if response.get("status") == "pull_complete":
        break
    time.sleep(5)
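
The loop above polls until the dataset pull completes or fails, so a stalled pull keeps the cell running indefinitely. A sketch of the same pattern with a timeout is shown below. `poll_until` and its parameters are hypothetical, and `fetch_status` stands in for a function that wraps the `get-dataset-metadata` call and returns the parsed `status` field.

```python
import time

def poll_until(fetch_status, done, failed, interval=5.0, timeout=600.0):
    """Call fetch_status() until it returns a value in `done`; raise on failure or timeout."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status in failed:
            raise ValueError(f"Polling ended with status {status!r}")
        if status in done:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Timed out after {timeout} s; last status {status!r}")
        time.sleep(interval)

# Stand-in for the CLI call. The intermediate status names are made up;
# only "pull_complete" and "invalid_pull" appear in the notebook.
scripted = iter(["starting", "in_progress", "pull_complete"])
result = poll_until(
    lambda: next(scripted),
    done={"pull_complete"},
    failed={"invalid_pull"},
    interval=0.0,
)
print(result)
```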

### Dataset format conversion action 


#### Get specs


In [None]:
# Default model specs
annotation_conversion_specs_response = subprocess.getoutput(f"tao annotations get-job-schema --action annotation_format_convert")
annotation_conversion_specs_schema = json.loads(annotation_conversion_specs_response)
annotation_conversion_specs = annotation_conversion_specs_schema.get("default", {})
print(json.dumps(annotation_conversion_specs, indent=4))

In [None]:
# Set specs
annotation_conversion_specs["data"]["input_format"] = "KITTI"
annotation_conversion_specs["data"]["output_format"] = "COCO"
print(json.dumps(annotation_conversion_specs, indent=4))
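
Note that `.get("default", {})` returns an empty dictionary if the schema has no `default` entry, in which case `annotation_conversion_specs["data"]` raises a `KeyError`. A small helper can create the intermediate dictionaries as needed. `set_nested` below is a hypothetical utility, not part of the TAO client.

```python
def set_nested(spec, dotted_key, value):
    """Set spec[a][b][c] = value for dotted_key 'a.b.c', creating dicts as needed."""
    *parents, leaf = dotted_key.split(".")
    node = spec
    for key in parents:
        # Reuse the existing sub-dictionary or create an empty one.
        node = node.setdefault(key, {})
    node[leaf] = value
    return spec

specs = {}  # what .get("default", {}) returns when the schema has no default
set_nested(specs, "data.input_format", "KITTI")
set_nested(specs, "data.output_format", "COCO")
print(specs)
```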

#### Run action 


In [None]:
# Add --platform_id uuid for NVCF backend, where the uuid is a key from output of tao gpu-types
# Run action
coco_dataset_id = kitti_dataset_id
convert_job_id = subprocess.getoutput(f"tao annotations create-job --kind dataset --dataset-id {kitti_dataset_id} --action annotation_format_convert --specs '{json.dumps(annotation_conversion_specs)}'")
print(convert_job_id)
%store coco_dataset_id
%store convert_job_id

In [None]:
# Check status (the file won't exist until the backend Toolkit container is running -- can take several minutes)
status = my_tail("annotations", convert_job_id)

In [None]:
# After the action is completed the format of dataset will be converted to coco from kitti
print(subprocess.getoutput(f"tao annotations get-dataset-metadata --dataset-id {kitti_dataset_id} "))

## 2. Generate pseudo-masks with the auto-labeler <a class="anchor" id="head-2"></a>
Here we will use a pretrained MAL model to generate pseudo-masks for the converted KITTI data. 

### Define the task and action

### Create a COCO dataset - if you already have data in COCO detection format (without masks) and skipped step 1

In [None]:
# # Create dataset
# coco_dataset_id = subprocess.getoutput(f"tao annotations create-dataset --dataset-type object_detection --dataset-format coco --workspace-id {workspace_id} --cloud-file-path {coco_dataset_path} --use-for '{json.dumps(['testing'])}'")
# print(coco_dataset_id)
# %store coco_dataset_id

In [None]:
# # Check progress
# while True:
#     clear_output(wait=True)
#     response = subprocess.getoutput(f"tao annotations get-dataset-metadata --dataset-id {coco_dataset_id} ")
#     try:
#         response = json.loads(response)
#     except Exception as e:
#         print(response)
#         raise e
#     print(json.dumps(response, sort_keys=True, indent=4))
#     if response.get("status") == "invalid_pull":
#         raise ValueError("Dataset pull failed")
#     if response.get("status") == "pull_complete":
#         break
#     time.sleep(5)

### Assign PTM

In [None]:
# List base experiments (PTMs) using TAO SDK  
filter_params = {"network_arch": "auto_label"}
message = subprocess.getoutput(f"tao auto_label list-base-experiments --filter-params '{json.dumps(filter_params)}'")
message = json.loads(message)
# Store base experiments list for reuse
base_experiments = message

print(f" Available base experiments (PTMs) for auto_label:")
print("name\t\t\t     model id\t\t\t     network architecture")
print("-" * 120)

for exp in base_experiments:
    exp_name = exp.get("name", "N/A")
    exp_id = exp.get("id", "N/A")
    exp_arch = exp.get("network_arch", "N/A")
    print(f"{exp_name}\t{exp_id}\t{exp_arch}")

In [None]:
pretrained_map = {"auto_label" : "mask_auto_label:trainable_v1.1"}

In [None]:
# Get pretrained model using TAO SDK
selected_ptm_id = None

# Search for PTM with given NGC path
for exp in base_experiments:
    ngc_path = exp.get("ngc_path", "")
    if ngc_path.endswith(pretrained_map["auto_label"]):
        selected_ptm_id = exp.get("id")
        print(" Selected PTM metadata:")
        print(json.dumps(exp, indent=4))
        break

if not selected_ptm_id:
    print(f" PTM with NGC path ending in '{pretrained_map['auto_label']}' not found!")

In [None]:
if selected_ptm_id:
    print(f"PTM ID {selected_ptm_id} will be used as base_experiment_id in job creation")
    update_data = json.dumps({"base_experiment_ids": [selected_ptm_id]})
    updated_dataset = subprocess.getoutput(f"tao auto_label update-dataset --dataset-id {coco_dataset_id} --update-data '{update_data}'")
    print(updated_dataset)
else:
    raise ValueError("No PTM found, auto-labeling can't be performed")

### Auto labeling action

#### Get specs

In [None]:
# Default model specs
auto_label_generate_specs_response = subprocess.getoutput(f"tao auto_label get-job-schema --action auto_label")
print(auto_label_generate_specs_response)
auto_label_generate_specs_schema = json.loads(auto_label_generate_specs_response)
auto_label_generate_specs = auto_label_generate_specs_schema.get("default", {})
print(json.dumps(auto_label_generate_specs, indent=4))

In [None]:
# Set specs
auto_label_generate_specs["gpu_ids"] = [0]
print(json.dumps(auto_label_generate_specs, indent=4))

### Run action

In [None]:
# Add --platform_id uuid for NVCF backend, where the uuid is a key from output of tao gpu-types
# Run action
coco_mask_dataset_id = kitti_dataset_id
parent = convert_job_id
auto_labeling_job_id = subprocess.getoutput(f"tao auto_label create-job --kind dataset --dataset-id {coco_dataset_id} --parent-job-id {parent} --action auto_label --specs '{json.dumps(auto_label_generate_specs)}'")
print(auto_labeling_job_id)
%store auto_labeling_job_id
%store coco_mask_dataset_id

In [None]:
# Check status (the file won't exist until the backend Toolkit container is running -- can take several minutes)
status = my_tail("auto_label", auto_labeling_job_id)

## 3. Apply data augmentation <a class="anchor" id="head-3"></a>
In this section, we run offline augmentation with the original dataset. During the augmentation process, we can use the pseudo-masks generated from the last step to refine the distorted or rotated bounding boxes.
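
The reason masks help is geometric: rotating a box and then taking the axis-aligned envelope of its four corners overestimates the object's extent, while the envelope of the rotated mask stays tight. The sketch below illustrates this with a diamond-shaped mask polygon inside a square box. It is a pure-Python illustration with made-up coordinates, not the DALI-based implementation used by the augmentation action.

```python
import math

def rotate(points, angle_deg, center):
    """Rotate (x, y) points by angle_deg around `center`."""
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    cx, cy = center
    return [
        (cx + (x - cx) * cos_t - (y - cy) * sin_t,
         cy + (x - cx) * sin_t + (y - cy) * cos_t)
        for x, y in points
    ]

def envelope(points):
    """Axis-aligned bounding box [x_min, y_min, x_max, y_max] of a point set."""
    xs, ys = zip(*points)
    return [min(xs), min(ys), max(xs), max(ys)]

box_corners = [(0, 0), (10, 0), (10, 10), (0, 10)]  # original bounding box
mask_polygon = [(5, 0), (10, 5), (5, 10), (0, 5)]   # object outline inside the box
center = (5, 5)

loose = envelope(rotate(box_corners, 45, center))   # box rotated, then re-boxed
tight = envelope(rotate(mask_polygon, 45, center))  # mask rotated, then re-boxed

print("from box corners:", [round(v, 2) for v in loose])
print("from mask:       ", [round(v, 2) for v in tight])
```

In this example the box derived from the mask is half as wide as the one derived from the rotated corners, which is the kind of refinement the pseudo-masks make possible.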

### Define the task and action

### Create a COCO mask dataset - if you already have data in COCO segmentation format and skipped steps 1 and 2

In [None]:
# # Create dataset
# coco_mask_dataset_id = subprocess.getoutput(f"tao annotations create-dataset --dataset-type object_detection --dataset-format coco  --workspace-id {workspace_id} --cloud-file-path {coco_mask_dataset_path} --use-for '{json.dumps(['testing'])}'")
# print(coco_mask_dataset_id)
# %store coco_mask_dataset_id

In [None]:
# # Check progress
# while True:
#     clear_output(wait=True)
#     response = subprocess.getoutput(f"tao annotations get-dataset-metadata --dataset-id {coco_mask_dataset_id} ")
#     try:
#         response = json.loads(response)
#     except Exception as e:
#         print(response)
#         raise e
#     print(json.dumps(response, sort_keys=True, indent=4))
#     if response.get("status") == "invalid_pull":
#         raise ValueError("Dataset pull failed")
#     if response.get("status") == "pull_complete":
#         break
#     time.sleep(5)

### Run data augmentation action


#### Get specs


In [None]:
# Default model specs
augmentation_generate_specs_response = subprocess.getoutput(f"tao augmentation get-job-schema --action augment")
augmentation_generate_specs_schema = json.loads(augmentation_generate_specs_response)
augmentation_generate_specs = augmentation_generate_specs_schema.get("default", {})
print(json.dumps(augmentation_generate_specs, indent=4))

In [None]:
# Change any spec key if required
print(json.dumps(augmentation_generate_specs, indent=4))

#### Run action


In [None]:
# Add --platform_id uuid for NVCF backend, where the uuid is a key from output of tao gpu-types
# Run action
parent = auto_labeling_job_id
coco_mask_augmented_dataset_id = subprocess.getoutput(f"tao augmentation create-job --kind dataset --dataset-id {coco_mask_dataset_id} --action augment --parent-job-id {parent} --specs '{json.dumps(augmentation_generate_specs)}'")
print(coco_mask_augmented_dataset_id)
%store coco_mask_augmented_dataset_id

In [None]:
# Check status (the file won't exist until the backend Toolkit container is running -- can take several minutes)
status = my_tail("augmentation", coco_mask_augmented_dataset_id)

In [None]:
# After the augment action you'll get a new dataset
print(subprocess.getoutput(f"tao augmentation get-dataset-metadata --dataset-id {coco_mask_augmented_dataset_id} "))

## 4. Perform data analytics  <a class="anchor" id="head-4"></a>
Next, we perform analytics with the KITTI dataset.
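
As a rough picture of the kind of statistics involved, the sketch below counts objects per category and averages box area from a COCO-style annotation dictionary. It is a local illustration with made-up data and a hypothetical `summarize_coco` helper; the metrics and output format of the TAO `analyze` action are defined by its job schema and are not reproduced here.

```python
from collections import Counter, defaultdict

def summarize_coco(coco):
    """Return {category_name: {"count": n, "mean_area": a}} from a COCO-style dict."""
    names = {cat["id"]: cat["name"] for cat in coco["categories"]}
    counts = Counter()
    total_area = defaultdict(float)
    for ann in coco["annotations"]:
        name = names[ann["category_id"]]
        _, _, width, height = ann["bbox"]  # COCO boxes are x, y, width, height
        counts[name] += 1
        total_area[name] += width * height
    return {
        name: {"count": counts[name], "mean_area": total_area[name] / counts[name]}
        for name in counts
    }

# Made-up example data.
coco = {
    "categories": [{"id": 1, "name": "car"}, {"id": 2, "name": "person"}],
    "annotations": [
        {"category_id": 1, "bbox": [0, 0, 10, 20]},
        {"category_id": 1, "bbox": [5, 5, 30, 10]},
        {"category_id": 2, "bbox": [2, 2, 4, 10]},
    ],
}
summary = summarize_coco(coco)
print(summary)
```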

### Run the annotation analytics action


#### Get specs


In [None]:
# Default model specs
analytics_analyze_specs_response = subprocess.getoutput(f"tao analytics get-job-schema --action analyze")
analytics_analyze_specs_schema = json.loads(analytics_analyze_specs_response)
analytics_analyze_specs = analytics_analyze_specs_schema.get("default", {})
print(json.dumps(analytics_analyze_specs, indent=4))

In [None]:
# Set specs
analytics_analyze_specs["data"]["input_format"] = "COCO"
print(json.dumps(analytics_analyze_specs, indent=4))

#### Run action


In [None]:
# Add --platform_id uuid for NVCF backend, where the uuid is a key from output of tao gpu-types
# Run action
parent = convert_job_id
analyze_job_id = subprocess.getoutput(f"tao analytics create-job --kind dataset --dataset-id {coco_dataset_id} --action analyze --parent-job-id {parent} --specs '{json.dumps(analytics_analyze_specs)}'")
print(analyze_job_id)
%store analyze_job_id

In [None]:
# Check status (the file won't exist until the backend Toolkit container is running -- can take several minutes)
status = my_tail("analytics", analyze_job_id)

## 5. Perform data validation  <a class="anchor" id="head-5"></a>
Next, we validate the annotations and images.
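
To illustrate what annotation validation can catch, the sketch below flags COCO boxes that have a non-positive size or extend beyond the image. These two checks are assumptions chosen for the example, and `find_invalid_boxes` is a hypothetical helper; the actual rules applied by the `validate_annotations` action may differ.

```python
def find_invalid_boxes(coco):
    """Return a list of (annotation_id, reason) for boxes that fail basic checks."""
    sizes = {img["id"]: (img["width"], img["height"]) for img in coco["images"]}
    problems = []
    for ann in coco["annotations"]:
        x, y, w, h = ann["bbox"]  # COCO boxes are x, y, width, height
        img_w, img_h = sizes[ann["image_id"]]
        if w <= 0 or h <= 0:
            problems.append((ann["id"], "non-positive size"))
        elif x < 0 or y < 0 or x + w > img_w or y + h > img_h:
            problems.append((ann["id"], "outside image"))
    return problems

# Made-up example data: one valid box and two broken ones.
coco = {
    "images": [{"id": 1, "width": 100, "height": 50}],
    "annotations": [
        {"id": 1, "image_id": 1, "bbox": [10, 10, 20, 20]},
        {"id": 2, "image_id": 1, "bbox": [90, 10, 20, 20]},
        {"id": 3, "image_id": 1, "bbox": [5, 5, 0, 10]},
    ],
}
problems = find_invalid_boxes(coco)
print(problems)
```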

### Run Data annotation validation action

#### Get specs


In [None]:
# Default model specs
validate_annotations_specs_response = subprocess.getoutput(f"tao analytics get-job-schema --action validate_annotations")
validate_annotations_specs_schema = json.loads(validate_annotations_specs_response)
validate_annotations_specs = validate_annotations_specs_schema.get("default", {})
print(json.dumps(validate_annotations_specs, indent=4))

In [None]:
# Set specs
validate_annotations_specs["data"]["input_format"] = "COCO"
print(json.dumps(validate_annotations_specs, indent=4))

#### Run action


In [None]:
# Add --platform_id uuid for NVCF backend, where the uuid is a key from output of tao gpu-types
# Run action
parent = convert_job_id
validate_annotations_job_id = subprocess.getoutput(f"tao analytics create-job --kind dataset --dataset-id {coco_dataset_id} --action validate_annotations --parent-job-id {parent} --specs '{json.dumps(validate_annotations_specs)}'")
print(validate_annotations_job_id)
%store validate_annotations_job_id

In [None]:
# Check status (the file won't exist until the backend Toolkit container is running -- can take several minutes)
status = my_tail("analytics", validate_annotations_job_id)

### Run Data image validation action - removes corrupted images and creates a new dataset

#### Get specs


In [None]:
# Default model specs
validate_images_specs_response = subprocess.getoutput(f"tao image get-job-schema --action validate_images")
validate_images_specs_schema = json.loads(validate_images_specs_response)
validate_images_specs = validate_images_specs_schema.get("default", {})
print(json.dumps(validate_images_specs, indent=4))

In [None]:
# Make changes to the specs if necessary

#### Run action


In [None]:
# Add --platform_id uuid for NVCF backend, where the uuid is a key from output of tao gpu-types
# Run action
validate_images_job_id = subprocess.getoutput(f"tao image create-job --kind dataset --dataset-id {kitti_dataset_id} --action validate_images --specs '{json.dumps(validate_images_specs)}'")
print(validate_images_job_id)
%store validate_images_job_id

In [None]:
# Check status (the file won't exist until the backend Toolkit container is running -- can take several minutes)
status = my_tail("image", validate_images_job_id)

In [None]:
# # Optional: Backup database with a mongodump file saved in workspace dump/archive/{backup_filename}
# backup_file_name = "mongodump.tar.gz" # FIXME 7
# subprocess.getoutput(f"tao image backup-workspace --workspace-id {workspace_id} --backup_file_name {backup_file_name}")

### Delete dataset <a class="anchor" id="head-21"></a>

#### Delete original kitti dataset <a class="anchor" id="head-21"></a>

In [None]:
subprocess.getoutput(f"tao image delete-dataset --dataset-id {kitti_dataset_id}")

#### Delete coco augment dataset <a class="anchor" id="head-21"></a>

In [None]:
subprocess.getoutput(f"tao image delete-dataset --dataset-id {coco_mask_augmented_dataset_id}")