# Tracing, Evaluating, and Profiling your Agent

In this notebook, we will walk through the advanced capabilities of NVIDIA NeMo Agent toolkit (NAT) for <a href="https://docs.nvidia.com/nemo/agent-toolkit/latest/workflows/observe/index.html">observability</a>, <a href="https://docs.nvidia.com/nemo/agent-toolkit/latest/workflows/evaluate.html">evaluation</a>, and <a href="https://docs.nvidia.com/nemo/agent-toolkit/latest/workflows/profiler.html">profiling</a>, from setting up Phoenix tracing to running comprehensive workflow assessments and performance analysis.

# Table of Contents

- [0.0) Setup](#setup)
  - [0.1) Prerequisites](#prereqs)
  - [0.2) API Keys](#api-keys)
  - [0.3) Data Sources](#data-sources)
  - [0.4) Installing NeMo Agent Toolkit](#installing-nat)
- [1.0) Creating a Tool-Calling Workflow](#creating-workflow)
  - [1.1) Total Product Sales Tool](#product-sales-tool)
  - [1.2) Sales Per Day Tool](#sales-per-day-tool)
  - [1.3) Detect Outliers Tool](#detect-outliers-tool)
  - [1.4) Technical Specs Retrieval Tool](#technical-specs-tool)
  - [1.5) Data Analysis/Plotting Tools](#plotting-tools)
  - [1.6) Register The Tools](#register-tools)
  - [1.7) Workflow Configuration File](#workflow-config)
  - [1.8) Testing/Verifying Workflow Installation](#verify-tools)
- [2.0) Observing a Workflow with Phoenix](#observe-workflow)
  - [2.1) Updating the Workflow Configuration For Telemetry](#update-config)
  - [2.2) Start Phoenix Server](#start-phoenix)
  - [2.3) Rerun the Workflow](#rerun-workflow)
  - [2.4) Viewing the Trace](#view-trace)
- [3.0) Evaluating a Workflow](#eval-workflow)
  - [3.1) Create an Evaluation Dataset](#eval-dataset)
  - [3.2) Updating the Workflow Configuration](#update-config-again)
  - [3.3) Running the Evaluation](#run-eval)
  - [3.4) Understanding Evaluation Results](#understand-eval)
- [4.0) Profiling a Workflow](#profile-workflow)
  - [4.1) Updating the Workflow Configuration](#update-profiling-workflow)
  - [4.2) Understanding the Profiler Configuration](#understand-profiler-config)
  - [4.3) Running the Profiler](#run-profiler)
  - [4.4) Understanding Profiler Output Files](#understand-profiler-output-files)
- [5.0) Notebook Summary](#summary)

<span style="color:rgb(0, 31, 153); font-style: italic;">Note: In Google Colab use the Table of Contents tab to navigate.</span>


<a id="setup"></a>
# 0.0) Setup


<a id="prereqs"></a>
## 0.1) Prerequisites

- **Platform:** Linux, macOS, or Windows
- **Python:** version 3.11, 3.12, or 3.13
- **Python Packages:** `pip`

<a id="api-keys"></a>
## 0.2) API Keys

For this notebook, you will need the following API keys to run all examples end-to-end:

- **NVIDIA Build:** You can obtain an NVIDIA Build API Key by creating an [NVIDIA Build](https://build.nvidia.com) account and generating a key at https://build.nvidia.com/settings/api-keys

Then you can run the cell below:

In [None]:
import getpass
import os

if "NVIDIA_API_KEY" not in os.environ:
    nvidia_api_key = getpass.getpass("Enter your NVIDIA API key: ")
    os.environ["NVIDIA_API_KEY"] = nvidia_api_key

<a id="data-sources"></a>
## 0.3) Data Sources

Several data files are required for this example. To keep this as a stand-alone example, the files are included here as cells which can be run to create them.

The following cell creates the `data` directory as well as a `rag` subdirectory.

In [None]:
!mkdir -p data/rag

The following cell writes the `data/retail_sales_data.csv` file.

In [None]:
%%writefile data/retail_sales_data.csv
Date,StoreID,Product,UnitsSold,Revenue,Promotion
2024-01-01,S001,Laptop,1,1000,No
2024-01-01,S001,Phone,9,4500,No
2024-01-01,S001,Tablet,2,600,No
2024-01-01,S002,Laptop,9,9000,No
2024-01-01,S002,Phone,10,5000,No
2024-01-01,S002,Tablet,5,1500,No
2024-01-02,S001,Laptop,4,4000,No
2024-01-02,S001,Phone,11,5500,No
2024-01-02,S001,Tablet,7,2100,No
2024-01-02,S002,Laptop,7,7000,No
2024-01-02,S002,Phone,6,3000,No
2024-01-02,S002,Tablet,9,2700,No
2024-01-03,S001,Laptop,6,6000,No
2024-01-03,S001,Phone,7,3500,No
2024-01-03,S001,Tablet,8,2400,No
2024-01-03,S002,Laptop,3,3000,No
2024-01-03,S002,Phone,16,8000,No
2024-01-03,S002,Tablet,5,1500,No
2024-01-04,S001,Laptop,5,5000,No
2024-01-04,S001,Phone,11,5500,No
2024-01-04,S001,Tablet,9,2700,No
2024-01-04,S002,Laptop,2,2000,No
2024-01-04,S002,Phone,12,6000,No
2024-01-04,S002,Tablet,7,2100,No
2024-01-05,S001,Laptop,8,8000,No
2024-01-05,S001,Phone,18,9000,No
2024-01-05,S001,Tablet,5,1500,No
2024-01-05,S002,Laptop,7,7000,No
2024-01-05,S002,Phone,10,5000,No
2024-01-05,S002,Tablet,10,3000,No
2024-01-06,S001,Laptop,9,9000,No
2024-01-06,S001,Phone,11,5500,No
2024-01-06,S001,Tablet,5,1500,No
2024-01-06,S002,Laptop,5,5000,No
2024-01-06,S002,Phone,14,7000,No
2024-01-06,S002,Tablet,10,3000,No
2024-01-07,S001,Laptop,2,2000,No
2024-01-07,S001,Phone,15,7500,No
2024-01-07,S001,Tablet,6,1800,No
2024-01-07,S002,Laptop,0,0,No
2024-01-07,S002,Phone,7,3500,No
2024-01-07,S002,Tablet,12,3600,No
2024-01-08,S001,Laptop,5,5000,No
2024-01-08,S001,Phone,8,4000,No
2024-01-08,S001,Tablet,5,1500,No
2024-01-08,S002,Laptop,4,4000,No
2024-01-08,S002,Phone,11,5500,No
2024-01-08,S002,Tablet,9,2700,No
2024-01-09,S001,Laptop,6,6000,No
2024-01-09,S001,Phone,9,4500,No
2024-01-09,S001,Tablet,8,2400,No
2024-01-09,S002,Laptop,7,7000,No
2024-01-09,S002,Phone,11,5500,No
2024-01-09,S002,Tablet,8,2400,No
2024-01-10,S001,Laptop,6,6000,No
2024-01-10,S001,Phone,11,5500,No
2024-01-10,S001,Tablet,5,1500,No
2024-01-10,S002,Laptop,8,8000,No
2024-01-10,S002,Phone,5,2500,No
2024-01-10,S002,Tablet,6,1800,No
2024-01-11,S001,Laptop,5,5000,No
2024-01-11,S001,Phone,7,3500,No
2024-01-11,S001,Tablet,5,1500,No
2024-01-11,S002,Laptop,4,4000,No
2024-01-11,S002,Phone,10,5000,No
2024-01-11,S002,Tablet,4,1200,No
2024-01-12,S001,Laptop,2,2000,No
2024-01-12,S001,Phone,10,5000,No
2024-01-12,S001,Tablet,9,2700,No
2024-01-12,S002,Laptop,8,8000,No
2024-01-12,S002,Phone,10,5000,No
2024-01-12,S002,Tablet,14,4200,No
2024-01-13,S001,Laptop,3,3000,No
2024-01-13,S001,Phone,6,3000,No
2024-01-13,S001,Tablet,9,2700,No
2024-01-13,S002,Laptop,1,1000,No
2024-01-13,S002,Phone,12,6000,No
2024-01-13,S002,Tablet,7,2100,No
2024-01-14,S001,Laptop,4,4000,Yes
2024-01-14,S001,Phone,16,8000,Yes
2024-01-14,S001,Tablet,4,1200,Yes
2024-01-14,S002,Laptop,5,5000,Yes
2024-01-14,S002,Phone,14,7000,Yes
2024-01-14,S002,Tablet,6,1800,Yes
2024-01-15,S001,Laptop,9,9000,No
2024-01-15,S001,Phone,6,3000,No
2024-01-15,S001,Tablet,11,3300,No
2024-01-15,S002,Laptop,5,5000,No
2024-01-15,S002,Phone,10,5000,No
2024-01-15,S002,Tablet,4,1200,No
2024-01-16,S001,Laptop,6,6000,No
2024-01-16,S001,Phone,11,5500,No
2024-01-16,S001,Tablet,5,1500,No
2024-01-16,S002,Laptop,4,4000,No
2024-01-16,S002,Phone,7,3500,No
2024-01-16,S002,Tablet,4,1200,No
2024-01-17,S001,Laptop,6,6000,No
2024-01-17,S001,Phone,14,7000,No
2024-01-17,S001,Tablet,7,2100,No
2024-01-17,S002,Laptop,3,3000,No
2024-01-17,S002,Phone,7,3500,No
2024-01-17,S002,Tablet,6,1800,No
2024-01-18,S001,Laptop,7,7000,Yes
2024-01-18,S001,Phone,10,5000,Yes
2024-01-18,S001,Tablet,6,1800,Yes
2024-01-18,S002,Laptop,5,5000,Yes
2024-01-18,S002,Phone,16,8000,Yes
2024-01-18,S002,Tablet,8,2400,Yes
2024-01-19,S001,Laptop,4,4000,No
2024-01-19,S001,Phone,12,6000,No
2024-01-19,S001,Tablet,7,2100,No
2024-01-19,S002,Laptop,3,3000,No
2024-01-19,S002,Phone,12,6000,No
2024-01-19,S002,Tablet,8,2400,No
2024-01-20,S001,Laptop,6,6000,No
2024-01-20,S001,Phone,8,4000,No
2024-01-20,S001,Tablet,6,1800,No
2024-01-20,S002,Laptop,8,8000,No
2024-01-20,S002,Phone,9,4500,No
2024-01-20,S002,Tablet,8,2400,No
2024-01-21,S001,Laptop,3,3000,No
2024-01-21,S001,Phone,9,4500,No
2024-01-21,S001,Tablet,5,1500,No
2024-01-21,S002,Laptop,8,8000,No
2024-01-21,S002,Phone,15,7500,No
2024-01-21,S002,Tablet,7,2100,No
2024-01-22,S001,Laptop,1,1000,No
2024-01-22,S001,Phone,15,7500,No
2024-01-22,S001,Tablet,5,1500,No
2024-01-22,S002,Laptop,11,11000,No
2024-01-22,S002,Phone,4,2000,No
2024-01-22,S002,Tablet,4,1200,No
2024-01-23,S001,Laptop,3,3000,No
2024-01-23,S001,Phone,8,4000,No
2024-01-23,S001,Tablet,8,2400,No
2024-01-23,S002,Laptop,6,6000,No
2024-01-23,S002,Phone,12,6000,No
2024-01-23,S002,Tablet,12,3600,No
2024-01-24,S001,Laptop,2,2000,No
2024-01-24,S001,Phone,14,7000,No
2024-01-24,S001,Tablet,6,1800,No
2024-01-24,S002,Laptop,1,1000,No
2024-01-24,S002,Phone,5,2500,No
2024-01-24,S002,Tablet,7,2100,No
2024-01-25,S001,Laptop,7,7000,No
2024-01-25,S001,Phone,11,5500,No
2024-01-25,S001,Tablet,11,3300,No
2024-01-25,S002,Laptop,6,6000,No
2024-01-25,S002,Phone,11,5500,No
2024-01-25,S002,Tablet,5,1500,No
2024-01-26,S001,Laptop,5,5000,Yes
2024-01-26,S001,Phone,22,11000,Yes
2024-01-26,S001,Tablet,7,2100,Yes
2024-01-26,S002,Laptop,6,6000,Yes
2024-01-26,S002,Phone,24,12000,Yes
2024-01-26,S002,Tablet,3,900,Yes
2024-01-27,S001,Laptop,7,7000,Yes
2024-01-27,S001,Phone,20,10000,Yes
2024-01-27,S001,Tablet,6,1800,Yes
2024-01-27,S002,Laptop,4,4000,Yes
2024-01-27,S002,Phone,8,4000,Yes
2024-01-27,S002,Tablet,6,1800,Yes
2024-01-28,S001,Laptop,10,10000,No
2024-01-28,S001,Phone,15,7500,No
2024-01-28,S001,Tablet,12,3600,No
2024-01-28,S002,Laptop,6,6000,No
2024-01-28,S002,Phone,11,5500,No
2024-01-28,S002,Tablet,10,3000,No
2024-01-29,S001,Laptop,3,3000,No
2024-01-29,S001,Phone,16,8000,No
2024-01-29,S001,Tablet,5,1500,No
2024-01-29,S002,Laptop,6,6000,No
2024-01-29,S002,Phone,17,8500,No
2024-01-29,S002,Tablet,2,600,No
2024-01-30,S001,Laptop,3,3000,No
2024-01-30,S001,Phone,11,5500,No
2024-01-30,S001,Tablet,2,600,No
2024-01-30,S002,Laptop,6,6000,No
2024-01-30,S002,Phone,16,8000,No
2024-01-30,S002,Tablet,8,2400,No
2024-01-31,S001,Laptop,5,5000,Yes
2024-01-31,S001,Phone,22,11000,Yes
2024-01-31,S001,Tablet,9,2700,Yes
2024-01-31,S002,Laptop,3,3000,Yes
2024-01-31,S002,Phone,14,7000,Yes
2024-01-31,S002,Tablet,4,1200,Yes
2024-02-01,S001,Laptop,2,2000,No
2024-02-01,S001,Phone,7,3500,No
2024-02-01,S001,Tablet,11,3300,No
2024-02-01,S002,Laptop,6,6000,No
2024-02-01,S002,Phone,11,5500,No
2024-02-01,S002,Tablet,5,1500,No
2024-02-02,S001,Laptop,2,2000,No
2024-02-02,S001,Phone,9,4500,No
2024-02-02,S001,Tablet,7,2100,No
2024-02-02,S002,Laptop,5,5000,No
2024-02-02,S002,Phone,9,4500,No
2024-02-02,S002,Tablet,12,3600,No
2024-02-03,S001,Laptop,9,9000,No
2024-02-03,S001,Phone,12,6000,No
2024-02-03,S001,Tablet,9,2700,No
2024-02-03,S002,Laptop,10,10000,No
2024-02-03,S002,Phone,6,3000,No
2024-02-03,S002,Tablet,10,3000,No
2024-02-04,S001,Laptop,6,6000,No
2024-02-04,S001,Phone,5,2500,No
2024-02-04,S001,Tablet,8,2400,No
2024-02-04,S002,Laptop,6,6000,No
2024-02-04,S002,Phone,10,5000,No
2024-02-04,S002,Tablet,10,3000,No
2024-02-05,S001,Laptop,7,7000,No
2024-02-05,S001,Phone,13,6500,No
2024-02-05,S001,Tablet,11,3300,No
2024-02-05,S002,Laptop,8,8000,No
2024-02-05,S002,Phone,11,5500,No
2024-02-05,S002,Tablet,8,2400,No
2024-02-06,S001,Laptop,5,5000,No
2024-02-06,S001,Phone,14,7000,No
2024-02-06,S001,Tablet,4,1200,No
2024-02-06,S002,Laptop,2,2000,No
2024-02-06,S002,Phone,11,5500,No
2024-02-06,S002,Tablet,7,2100,No
2024-02-07,S001,Laptop,6,6000,No
2024-02-07,S001,Phone,7,3500,No
2024-02-07,S001,Tablet,9,2700,No
2024-02-07,S002,Laptop,2,2000,No
2024-02-07,S002,Phone,8,4000,No
2024-02-07,S002,Tablet,9,2700,No
2024-02-08,S001,Laptop,5,5000,No
2024-02-08,S001,Phone,12,6000,No
2024-02-08,S001,Tablet,3,900,No
2024-02-08,S002,Laptop,8,8000,No
2024-02-08,S002,Phone,5,2500,No
2024-02-08,S002,Tablet,8,2400,No
2024-02-09,S001,Laptop,6,6000,Yes
2024-02-09,S001,Phone,18,9000,Yes
2024-02-09,S001,Tablet,5,1500,Yes
2024-02-09,S002,Laptop,7,7000,Yes
2024-02-09,S002,Phone,18,9000,Yes
2024-02-09,S002,Tablet,5,1500,Yes
2024-02-10,S001,Laptop,9,9000,No
2024-02-10,S001,Phone,6,3000,No
2024-02-10,S001,Tablet,8,2400,No
2024-02-10,S002,Laptop,7,7000,No
2024-02-10,S002,Phone,5,2500,No
2024-02-10,S002,Tablet,6,1800,No
2024-02-11,S001,Laptop,6,6000,No
2024-02-11,S001,Phone,11,5500,No
2024-02-11,S001,Tablet,2,600,No
2024-02-11,S002,Laptop,7,7000,No
2024-02-11,S002,Phone,5,2500,No
2024-02-11,S002,Tablet,9,2700,No
2024-02-12,S001,Laptop,5,5000,No
2024-02-12,S001,Phone,5,2500,No
2024-02-12,S001,Tablet,4,1200,No
2024-02-12,S002,Laptop,1,1000,No
2024-02-12,S002,Phone,14,7000,No
2024-02-12,S002,Tablet,15,4500,No
2024-02-13,S001,Laptop,3,3000,No
2024-02-13,S001,Phone,18,9000,No
2024-02-13,S001,Tablet,8,2400,No
2024-02-13,S002,Laptop,5,5000,No
2024-02-13,S002,Phone,8,4000,No
2024-02-13,S002,Tablet,6,1800,No
2024-02-14,S001,Laptop,4,4000,No
2024-02-14,S001,Phone,9,4500,No
2024-02-14,S001,Tablet,6,1800,No
2024-02-14,S002,Laptop,4,4000,No
2024-02-14,S002,Phone,6,3000,No
2024-02-14,S002,Tablet,7,2100,No
2024-02-15,S001,Laptop,4,4000,Yes
2024-02-15,S001,Phone,26,13000,Yes
2024-02-15,S001,Tablet,5,1500,Yes
2024-02-15,S002,Laptop,2,2000,Yes
2024-02-15,S002,Phone,14,7000,Yes
2024-02-15,S002,Tablet,6,1800,Yes
2024-02-16,S001,Laptop,7,7000,No
2024-02-16,S001,Phone,9,4500,No
2024-02-16,S001,Tablet,1,300,No
2024-02-16,S002,Laptop,6,6000,No
2024-02-16,S002,Phone,12,6000,No
2024-02-16,S002,Tablet,10,3000,No
2024-02-17,S001,Laptop,5,5000,No
2024-02-17,S001,Phone,8,4000,No
2024-02-17,S001,Tablet,14,4200,No
2024-02-17,S002,Laptop,4,4000,No
2024-02-17,S002,Phone,13,6500,No
2024-02-17,S002,Tablet,7,2100,No
2024-02-18,S001,Laptop,6,6000,Yes
2024-02-18,S001,Phone,22,11000,Yes
2024-02-18,S001,Tablet,9,2700,Yes
2024-02-18,S002,Laptop,2,2000,Yes
2024-02-18,S002,Phone,10,5000,Yes
2024-02-18,S002,Tablet,12,3600,Yes
2024-02-19,S001,Laptop,6,6000,No
2024-02-19,S001,Phone,12,6000,No
2024-02-19,S001,Tablet,3,900,No
2024-02-19,S002,Laptop,3,3000,No
2024-02-19,S002,Phone,4,2000,No
2024-02-19,S002,Tablet,7,2100,No

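Before building tools on top of this file, it can help to confirm that it parses with the expected columns. The sketch below is only an illustration: it reads a three-row excerpt of the data from an in-memory string so that it runs on its own, and the names `SAMPLE_CSV`, `sample_df`, and `expected_columns` are made up for this example. Pointing `pd.read_csv` at `data/retail_sales_data.csv` instead should give the same column check on the full file.

```python
import io

import pandas as pd

# Three rows copied from the CSV above; an inline excerpt keeps the sketch self-contained.
SAMPLE_CSV = """Date,StoreID,Product,UnitsSold,Revenue,Promotion
2024-01-01,S001,Laptop,1,1000,No
2024-01-01,S001,Phone,9,4500,No
2024-01-01,S002,Laptop,9,9000,No
"""

# Dates are left as plain strings, which is how the tools later compare them.
sample_df = pd.read_csv(io.StringIO(SAMPLE_CSV))

expected_columns = ["Date", "StoreID", "Product", "UnitsSold", "Revenue", "Promotion"]
missing = [col for col in expected_columns if col not in sample_df.columns]

print("Missing columns:", missing)
print(sample_df.dtypes)
```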

The following cell writes the RAG product catalog file, `data/rag/product_catalog.md`.

In [None]:
%%writefile data/rag/product_catalog.md
# Product Catalog: Smartphones, Laptops, and Tablets

## Smartphones

The Veltrix Solis Z9 is a flagship device in the premium smartphone segment. It builds on a decade of design iterations that prioritize screen-to-body ratio, minimal bezels, and high refresh rate displays. The 6.7-inch AMOLED panel with 120Hz refresh rate delivers immersive visual experiences, whether in gaming, video streaming, or augmented reality applications. The display's GorillaGlass Fusion coating provides scratch resistance and durability, and the thin form factor is engineered using a titanium-aluminum alloy chassis to reduce weight without compromising rigidity.

Internally, the Solis Z9 is powered by the OrionEdge V14 chipset, a 4nm process SoC designed for high-efficiency workloads. Its AI accelerator module handles on-device tasks such as voice transcription, camera optimization, and intelligent background app management. The inclusion of 12GB LPDDR5 RAM and a 256GB UFS 3.1 storage system allows for seamless multitasking, instant app launching, and rapid data access. The device supports eSIM and dual physical SIM configurations, catering to global travelers and hybrid network users.

Photography and videography are central to the Solis Z9 experience. The triple-camera system incorporates a periscope-style 8MP telephoto lens with 5x optical zoom, a 12MP ultra-wide sensor with macro capabilities, and a 64MP main sensor featuring optical image stabilization (OIS) and phase detection autofocus (PDAF). Night mode and HDRX+ processing enable high-fidelity image capture in challenging lighting conditions.

Software-wise, the device ships with LunOS 15, a lightweight Android fork optimized for modular updates and privacy compliance. The system supports secure containers for work profiles and AI-powered notifications that summarize app alerts across channels. Facial unlock is augmented by a 3D IR depth sensor, providing reliable biometric security alongside the ultrasonic in-display fingerprint scanner.

The Solis Z9 is a culmination of over a decade of design experimentation in mobile form factors, ranging from curved-edge screens to under-display camera arrays. Its balance of performance, battery efficiency, and user-centric software makes it an ideal daily driver for content creators, mobile gamers, and enterprise users.

## Laptops

The Cryon Vanta 16X represents the latest evolution of portable computing power tailored for professional-grade workloads.

The Vanta 16X features a unibody chassis milled from aircraft-grade aluminum using CNC machining. The thermal design integrates vapor chamber cooling and dual-fan exhaust architecture to support sustained performance under high computational loads. The 16-inch 4K UHD display is color-calibrated at the factory and supports HDR10+, making it suitable for cinematic video editing and high-fidelity CAD modeling.

Powering the device is Intel's Core i9-13900H processor, which includes 14 cores with a hybrid architecture combining performance and efficiency cores. This allows the system to dynamically balance power consumption and raw speed based on active workloads. The dedicated Zephira RTX 4700G GPU features 8GB of GDDR6 VRAM and is optimized for CUDA and Tensor Core operations, enabling applications in real-time ray tracing, AI inference, and 3D rendering.

The Vanta 16X includes a 2TB PCIe Gen 4 NVMe SSD, delivering sequential read/write speeds above 7GB/s, and 32GB of high-bandwidth DDR5 RAM. The machine supports hardware-accelerated virtualization and dual-booting, and ships with VireoOS Pro pre-installed, with official drivers available for Fedora, Ubuntu LTS, and NebulaOS.

Input options are expansive. The keyboard features per-key RGB lighting and programmable macros, while the haptic touchpad supports multi-gesture navigation and palm rejection. Port variety includes dual Thunderbolt 4 ports, a full-size SD Express card reader, HDMI 2.1, 2.5G Ethernet, three USB-A 3.2 ports, and a 3.5mm TRRS audio jack. A fingerprint reader is embedded in the power button and supports biometric logins via Windows Hello.

The history of the Cryon laptop line dates back to the early 2010s, when the company launched its first ultrabook aimed at mobile developers. Since then, successive generations have introduced carbon fiber lids, modular SSD bays, and convertible form factors. The Vanta 16X continues this tradition by integrating a customizable BIOS, a modular fan assembly, and a trackpad optimized for creative software like Blender and Adobe Creative Suite.

Designed for software engineers, data scientists, film editors, and 3D artists, the Cryon Vanta 16X is a workstation-class laptop in a portable shell.

## Tablets

The Nebulyn Ark S12 Ultra reflects the current apex of tablet technology, combining high-end hardware with software environments tailored for productivity and creativity.

The Ark S12 Ultra is built around a 12.9-inch OLED display that supports 144Hz refresh rate and HDR10+ dynamic range. With a resolution of 2800 x 1752 pixels and a contrast ratio of 1,000,000:1, the screen delivers vibrant color reproduction ideal for design and media consumption. The display supports true tone adaptation and low blue-light filtering for prolonged use.

Internally, the tablet uses Qualcomm's Snapdragon 8 Gen 3 SoC, which includes an Adreno 750 GPU and an NPU for on-device AI tasks. The device ships with 16GB LPDDR5X RAM and 512GB of storage with support for NVMe expansion via a proprietary magnetic dock. The 11200mAh battery enables up to 15 hours of typical use and recharges to 80 percent in 45 minutes via 45W USB-C PD.

The Ark's history traces back to the original Nebulyn Tab, which launched in 2014 as an e-reader and video streaming device. Since then, the line has evolved through multiple iterations that introduced stylus support, high-refresh screens, and multi-window desktop modes. The current model supports NebulynVerse, a DeX-like environment that allows external display mirroring and full multitasking with overlapping windows and keyboard shortcuts.

Input capabilities are central to the Ark S12 Ultra’s appeal. The Pluma Stylus 3 features magnetic charging, 4096 pressure levels, and tilt detection. It integrates haptic feedback to simulate traditional pen strokes and brush textures. The device also supports a SnapCover keyboard that includes a trackpad and programmable shortcut keys. With the stylus and keyboard, users can effectively transform the tablet into a mobile workstation or digital sketchbook.

Camera hardware includes a 13MP main sensor and a 12MP ultra-wide front camera with center-stage tracking and biometric unlock. Microphone arrays with beamforming enable studio-quality call audio. Connectivity includes Wi-Fi 7, Bluetooth 5.3, and optional LTE/5G with eSIM.

Software support is robust. The device runs NebulynOS 6.0, based on Android 14L, and supports app sandboxing, multi-user profiles, and remote device management. Integration with cloud services, including SketchNimbus and ThoughtSpace, allows for real-time collaboration and syncing of content across devices.

This tablet is targeted at professionals who require a balance between media consumption, creativity, and light productivity. Typical users include architects, consultants, university students, and UX designers.

## Comparative Summary

Each of these devices—the Veltrix Solis Z9, Cryon Vanta 16X, and Nebulyn Ark S12 Ultra—represents a best-in-class interpretation of its category. The Solis Z9 excels in mobile photography and everyday communication. The Vanta 16X is tailored for high-performance applications such as video production and AI prototyping. The Ark S12 Ultra provides a canvas for creativity, note-taking, and hybrid productivity use cases.

## Historical Trends and Design Evolution

Design across all three categories is converging toward modularity, longevity, and environmental sustainability. Recycled materials, reparability scores, and software longevity are becoming integral to brand reputation and product longevity. Future iterations are expected to feature tighter integration with wearable devices, ambient AI experiences, and cross-device workflows.

<a id="installing-nat"></a>
## 0.4) Installing NeMo Agent Toolkit

The recommended way to install NAT is through `pip` or `uv pip`.

First, we will install `uv`, which offers parallel downloads and faster dependency resolution.

In [None]:
!pip install uv

NeMo Agent toolkit can be installed through the PyPI `nvidia-nat` package.

There are several optional subpackages available for NAT. For this example, we will rely on four subpackages:
* The `langchain` subpackage contains useful components for integrating and running within [LangChain](https://python.langchain.com/docs/introduction/).
* The `llama-index` subpackage contains useful components for integrating and running within [LlamaIndex](https://developers.llamaindex.ai/python/framework/).
* The `phoenix` subpackage contains components for integrating with [Phoenix](https://phoenix.arize.com/).
* The `profiling` subpackage contains components common for profiling with NeMo Agent toolkit.

In [None]:
!uv pip install "nvidia-nat[langchain,llama-index,phoenix,profiling]"

<a id="creating-workflow"></a>
# 1.0) Creating a Tool-Calling Workflow

In the previous notebook we went through a complex multi-agent example with several new tools. If you already have the example installed, you can skip this section.

In [None]:
!nat workflow create retail_sales_agent

The following cells add additional tools to the workflow and register them.

* Sales Per Day Tool
* Detect Outliers Tool
* Total Product Sales Data Tool
* LlamaIndex RAG Tool
* Data Visualization Tools
* Tool Registration

<a id="product-sales-tool"></a>
## 1.1) Total Product Sales Tool

In [None]:
%%writefile retail_sales_agent/src/retail_sales_agent/total_product_sales_data_tool.py
from pydantic import Field

from nat.builder.builder import Builder
from nat.builder.framework_enum import LLMFrameworkEnum
from nat.builder.function_info import FunctionInfo
from nat.cli.register_workflow import register_function
from nat.data_models.function import FunctionBaseConfig


class GetTotalProductSalesDataConfig(FunctionBaseConfig, name="get_total_product_sales_data"):
    """Get total sales data by product."""
    data_path: str = Field(description="Path to the data file")


@register_function(config_type=GetTotalProductSalesDataConfig, framework_wrappers=[LLMFrameworkEnum.LANGCHAIN])
async def get_total_product_sales_data_function(config: GetTotalProductSalesDataConfig, _builder: Builder):
    """Get total sales data for a specific product."""
    import pandas as pd

    df = pd.read_csv(config.data_path)

    async def _get_total_product_sales_data(product_name: str) -> str:
        """
        Retrieve total sales data for a specific product.

        Args:
            product_name: Name of the product

        Returns:
            String message containing total sales data
        """
        # Match product names case-insensitively without mutating the shared DataFrame.
        matches = df[df["Product"].str.lower() == product_name.lower()]
        revenue = matches["Revenue"].sum()
        units_sold = matches["UnitsSold"].sum()

        return f"Revenue for {product_name} is {revenue} and total units sold are {units_sold}"

    yield FunctionInfo.from_fn(
        _get_total_product_sales_data,
        description=_get_total_product_sales_data.__doc__)

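The filter-and-sum pattern used by this tool can be tried in isolation. The sketch below is a standalone illustration rather than the tool itself: `total_product_sales` is a hypothetical helper, the rows are a small excerpt copied from `data/retail_sales_data.csv`, and names are normalized to lower case on both sides of the comparison.

```python
import pandas as pd

# A few rows copied from data/retail_sales_data.csv.
df = pd.DataFrame({
    "Product": ["Laptop", "Phone", "Tablet", "Laptop"],
    "UnitsSold": [1, 9, 2, 9],
    "Revenue": [1000, 4500, 600, 9000],
})


def total_product_sales(frame: pd.DataFrame, product_name: str) -> tuple[int, int]:
    """Return (revenue, units_sold) for one product, matching names case-insensitively."""
    mask = frame["Product"].str.lower() == product_name.lower()
    rows = frame[mask]
    return int(rows["Revenue"].sum()), int(rows["UnitsSold"].sum())


revenue, units_sold = total_product_sales(df, "LAPTOP")
print(revenue, units_sold)
```

Normalizing both the column and the query matters because the product name is supplied by the LLM and may not match the casing in the data.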
<a id="sales-per-day-tool"></a>
## 1.2) Sales Per Day Tool

In [None]:
%%writefile retail_sales_agent/src/retail_sales_agent/sales_per_day_tool.py
from pydantic import Field

from nat.builder.builder import Builder
from nat.builder.framework_enum import LLMFrameworkEnum
from nat.builder.function_info import FunctionInfo
from nat.cli.register_workflow import register_function
from nat.data_models.function import FunctionBaseConfig


class GetSalesPerDayConfig(FunctionBaseConfig, name="get_sales_per_day"):
    """Get total sales across all products per day."""
    data_path: str = Field(description="Path to the data file")


@register_function(config_type=GetSalesPerDayConfig, framework_wrappers=[LLMFrameworkEnum.LANGCHAIN])
async def sales_per_day_function(config: GetSalesPerDayConfig, builder: Builder):
    """Get total sales across all products per day."""
    import pandas as pd

    df = pd.read_csv(config.data_path)
    df['Product'] = df["Product"].apply(lambda x: x.lower())

    async def _get_sales_per_day(date: str, product: str) -> str:
        """
        Calculate total revenue and units sold for one product on a specific date.

        Args:
            date: Date in YYYY-MM-DD format
            product: Product name

        Returns:
            String message with the total sales for the day
        """
        if date == "None":
            return "Please provide a date in YYYY-MM-DD format."
        # The Product column is lowercased above, so normalize the query the same way.
        rows = df[(df['Date'] == date) & (df['Product'] == product.lower())]
        total_revenue = rows['Revenue'].sum()
        total_units_sold = rows['UnitsSold'].sum()

        return f"Total revenue for {date} is {total_revenue} and total units sold is {total_units_sold}"

    yield FunctionInfo.from_fn(
        _get_sales_per_day,
        description=_get_sales_per_day.__doc__)

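The tool combines two conditions, date and product, into one boolean mask. The sketch below shows that pattern on four rows copied from the dataset, with product names already lowercased as the tool does on load. The variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "Date": ["2024-01-01", "2024-01-01", "2024-01-01", "2024-01-02"],
    "StoreID": ["S001", "S002", "S001", "S001"],
    "Product": ["laptop", "laptop", "phone", "laptop"],
    "UnitsSold": [1, 9, 9, 4],
    "Revenue": [1000, 9000, 4500, 4000],
})

# Combine both conditions with `&`; each comparison needs its own parentheses.
mask = (df["Date"] == "2024-01-01") & (df["Product"] == "laptop")

# Sum both numeric columns over the matching rows in one pass.
selected = df.loc[mask, ["Revenue", "UnitsSold"]].sum()
total_revenue = int(selected["Revenue"])
total_units = int(selected["UnitsSold"])
print(total_revenue, total_units)
```

Because `Date` is compared as a plain string, the query must use the same `YYYY-MM-DD` format as the file.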
<a id="detect-outliers-tool"></a>
## 1.3) Detect Outliers Tool

In [None]:
%%writefile retail_sales_agent/src/retail_sales_agent/detect_outliers_tool.py
from pydantic import Field

from nat.builder.builder import Builder
from nat.builder.framework_enum import LLMFrameworkEnum
from nat.builder.function_info import FunctionInfo
from nat.cli.register_workflow import register_function
from nat.data_models.function import FunctionBaseConfig


class DetectOutliersIQRConfig(FunctionBaseConfig, name="detect_outliers_iqr"):
    """Detect outliers in sales data using IQR method."""
    data_path: str = Field(description="Path to the data file")


@register_function(config_type=DetectOutliersIQRConfig, framework_wrappers=[LLMFrameworkEnum.LANGCHAIN])
async def detect_outliers_iqr_function(config: DetectOutliersIQRConfig, _builder: Builder):
    """Detect outliers in sales data using the Interquartile Range (IQR) method."""
    import pandas as pd

    df = pd.read_csv(config.data_path)

    async def _detect_outliers_iqr(metric: str) -> str:
        """
        Detect outliers in retail data using the IQR method.

        Args:
            metric: Specific metric to check for outliers

        Returns:
            String message listing the outlier records
        """
        if metric == "None":
            column = "Revenue"
        else:
            column = metric

        q1 = df[column].quantile(0.25)
        q3 = df[column].quantile(0.75)
        iqr = q3 - q1
        outliers = df[(df[column] < q1 - 1.5 * iqr) | (df[column] > q3 + 1.5 * iqr)]

        return f"Outliers in {column} are {outliers.to_dict('records')}"

    yield FunctionInfo.from_fn(
        _detect_outliers_iqr,
        description=_detect_outliers_iqr.__doc__)

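To make the IQR rule concrete, here is a small worked example. The six values are made up for illustration and are not taken from the dataset; the calculation mirrors the quantile and fence logic in the tool above, using pandas' default linear interpolation.

```python
import pandas as pd

# Illustrative values only; one is deliberately far from the rest.
revenue = pd.Series([1000, 2000, 3000, 4000, 5000, 13000], name="Revenue")

q1 = revenue.quantile(0.25)  # 2250.0
q3 = revenue.quantile(0.75)  # 4750.0
iqr = q3 - q1                # 2500.0

# Points more than 1.5 * IQR outside the quartiles are flagged.
lower = q1 - 1.5 * iqr       # -1500.0
upper = q3 + 1.5 * iqr       # 8500.0

outliers = revenue[(revenue < lower) | (revenue > upper)]
print(outliers.tolist())
```

Only `13000` falls outside the fences here, so it is the single value reported as an outlier.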

<a id="technical-specs-tool"></a>
## 1.4) Technical Specs Retrieval Tool

In [None]:
%%writefile retail_sales_agent/src/retail_sales_agent/llama_index_rag_tool.py
import logging
import os

from pydantic import Field

from nat.builder.builder import Builder
from nat.builder.framework_enum import LLMFrameworkEnum
from nat.builder.function_info import FunctionInfo
from nat.cli.register_workflow import register_function
from nat.data_models.component_ref import EmbedderRef
from nat.data_models.component_ref import LLMRef
from nat.data_models.function import FunctionBaseConfig

logger = logging.getLogger(__name__)


class LlamaIndexRAGConfig(FunctionBaseConfig, name="llama_index_rag"):

    llm_name: LLMRef = Field(description="The name of the LLM to use for the RAG engine.")
    embedder_name: EmbedderRef = Field(description="The name of the embedder to use for the RAG engine.")
    data_dir: str = Field(description="The directory containing the data to use for the RAG engine.")
    description: str = Field(description="A description of the knowledge included in the RAG system.")
    collection_name: str = Field(default="context", description="The name of the collection to use for the RAG engine.")


def _walk_directory(root: str):
    # Use a distinct loop variable so the `root` argument is not shadowed.
    for dirpath, _dirs, files in os.walk(root):
        for file_name in files:
            yield os.path.join(dirpath, file_name)


@register_function(config_type=LlamaIndexRAGConfig, framework_wrappers=[LLMFrameworkEnum.LLAMA_INDEX])
async def llama_index_rag_tool(config: LlamaIndexRAGConfig, builder: Builder):
    from llama_index.core import Settings
    from llama_index.core import SimpleDirectoryReader
    from llama_index.core import StorageContext
    from llama_index.core import VectorStoreIndex
    from llama_index.core.node_parser import SentenceSplitter

    llm = await builder.get_llm(config.llm_name, wrapper_type=LLMFrameworkEnum.LLAMA_INDEX)
    embedder = await builder.get_embedder(config.embedder_name, wrapper_type=LLMFrameworkEnum.LLAMA_INDEX)

    Settings.embed_model = embedder
    Settings.llm = llm

    files = list(_walk_directory(config.data_dir))
    docs = SimpleDirectoryReader(input_files=files).load_data()
    logger.info("Loaded %s documents from %s", len(docs), config.data_dir)

    parser = SentenceSplitter(
        chunk_size=400,
        chunk_overlap=20,
        separator=" ",
    )
    nodes = parser.get_nodes_from_documents(docs)

    index = VectorStoreIndex(nodes)

    query_engine = index.as_query_engine(similarity_top_k=3, )

    async def _arun(inputs: str) -> str:
        """
        Search product catalog for information about tablets, laptops, and smartphones
        Args:
            inputs: user query about product specifications
        """
        try:
            response = query_engine.query(inputs)
            return str(response.response)

        except Exception as e:
            logger.error("RAG query failed: %s", e)
            return f"Sorry, I couldn't retrieve information about that product. Error: {str(e)}"

    yield FunctionInfo.from_fn(_arun, description=config.description)

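The `chunk_size` and `chunk_overlap` arguments control how the catalog is cut into pieces before indexing. The toy function below is an assumption-laden illustration of the overlap idea only: it splits a list of words into fixed-size windows, whereas LlamaIndex's `SentenceSplitter` works on tokens and tries to respect sentence boundaries. `chunk_words` is a hypothetical helper, not part of any library.

```python
def chunk_words(words: list[str], chunk_size: int, chunk_overlap: int) -> list[list[str]]:
    """Split words into windows of chunk_size that share chunk_overlap words."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be non-negative and smaller than chunk_size")

    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + chunk_size])
        # Stop once a window has reached the end of the input.
        if start + chunk_size >= len(words):
            break
    return chunks


words = [f"w{i}" for i in range(10)]
chunks = chunk_words(words, chunk_size=4, chunk_overlap=1)
for chunk in chunks:
    print(chunk)
```

Each window starts with the last word of the previous one, so text that straddles a boundary still appears whole in at least one chunk. The tool's settings of 400 and 20 apply the same idea at a larger scale.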
<a id="plotting-tools"></a>
## 1.5) Data Analysis/Plotting Tools

This new set of tools is registered to the data analysis and plotting agent and lets that agent plot the results of upstream data analysis tasks.

In [None]:
%%writefile retail_sales_agent/src/retail_sales_agent/data_visualization_tools.py
from pydantic import Field

from nat.builder.builder import Builder
from nat.builder.framework_enum import LLMFrameworkEnum
from nat.builder.function_info import FunctionInfo
from nat.cli.register_workflow import register_function
from nat.data_models.component_ref import LLMRef
from nat.data_models.function import FunctionBaseConfig


class PlotSalesTrendForStoresConfig(FunctionBaseConfig, name="plot_sales_trend_for_stores"):
    """Plot sales trend for a specific store."""
    data_path: str = Field(description="Path to the data file")


@register_function(config_type=PlotSalesTrendForStoresConfig, framework_wrappers=[LLMFrameworkEnum.LANGCHAIN])
async def plot_sales_trend_for_stores_function(config: PlotSalesTrendForStoresConfig, _builder: Builder):
    """Create a visualization of sales trends over time."""
    import matplotlib.pyplot as plt
    import pandas as pd

    df = pd.read_csv(config.data_path)

    async def _plot_sales_trend_for_stores(store_id: str) -> str:
        if store_id not in df["StoreID"].unique():
            data = df
            title = "Sales Trend for All Stores"
        else:
            data = df[df["StoreID"] == store_id]
            title = f"Sales Trend for Store {store_id}"

        plt.figure(figsize=(10, 5))
        trend = data.groupby("Date")["Revenue"].sum()
        trend.plot(title=title)
        plt.xlabel("Date")
        plt.ylabel("Revenue")
        plt.tight_layout()
        plt.savefig("sales_trend.png")
        plt.close()  # release the figure so repeated tool calls do not accumulate open figures

        return "Sales trend plot saved to sales_trend.png"

    yield FunctionInfo.from_fn(
        _plot_sales_trend_for_stores,
        description=(
            "This tool can be used to plot the sales trend for a specific store or all stores. "
            "It takes in a store ID, then creates and saves an image of a plot of the revenue trend for that store. "
            "If the store ID is not found in the data, it plots the trend for all stores."))


class PlotAndCompareRevenueAcrossStoresConfig(FunctionBaseConfig, name="plot_and_compare_revenue_across_stores"):
    """Plot and compare revenue across stores."""
    data_path: str = Field(description="Path to the data file")


@register_function(config_type=PlotAndCompareRevenueAcrossStoresConfig, framework_wrappers=[LLMFrameworkEnum.LANGCHAIN])
async def plot_revenue_across_stores_function(config: PlotAndCompareRevenueAcrossStoresConfig, _builder: Builder):
    """Create a visualization comparing sales trends between stores."""
    import matplotlib.pyplot as plt
    import pandas as pd

    df = pd.read_csv(config.data_path)

    async def _plot_revenue_across_stores(arg: str) -> str:
        pivot = df.pivot_table(index="Date", columns="StoreID", values="Revenue", aggfunc="sum")
        pivot.plot(figsize=(12, 6), title="Revenue Trends Across Stores")
        plt.xlabel("Date")
        plt.ylabel("Revenue")
        plt.legend(title="StoreID")
        plt.tight_layout()
        plt.savefig("revenue_across_stores.png")
        plt.close()  # release the figure so repeated tool calls do not accumulate open figures

        return "Revenue trends across stores plot saved to revenue_across_stores.png"

    yield FunctionInfo.from_fn(
        _plot_revenue_across_stores,
        description=(
            "This tool can be used to plot and compare the revenue trends across stores. Use this tool only if the "
            "user asks for a comparison of revenue trends across stores. "
            "It takes in a single string as input (which is ignored) and creates and saves an image of a plot "
            "of the revenue trends across stores."
        ))


class PlotAverageDailyRevenueConfig(FunctionBaseConfig, name="plot_average_daily_revenue"):
    """Plot average daily revenue for stores and products."""
    data_path: str = Field(description="Path to the data file")


@register_function(config_type=PlotAverageDailyRevenueConfig, framework_wrappers=[LLMFrameworkEnum.LANGCHAIN])
async def plot_average_daily_revenue_function(config: PlotAverageDailyRevenueConfig, _builder: Builder):
    """Create a grouped bar chart of average daily revenue per store, broken down by product."""
    import matplotlib.pyplot as plt
    import pandas as pd

    df = pd.read_csv(config.data_path)

    async def _plot_average_daily_revenue(arg: str) -> str:
        daily_revenue = df.groupby(["StoreID", "Product", "Date"])["Revenue"].sum().reset_index()

        avg_daily_revenue = daily_revenue.groupby(["StoreID", "Product"])["Revenue"].mean().unstack()

        avg_daily_revenue.plot(kind="bar", figsize=(12, 6), title="Average Daily Revenue per Store by Product")
        plt.ylabel("Average Revenue")
        plt.xlabel("Store ID")
        plt.xticks(rotation=0)
        plt.legend(title="Product", bbox_to_anchor=(1.05, 1), loc='upper left')
        plt.tight_layout()
        plt.savefig("average_daily_revenue.png")
        plt.close()  # release the figure so repeated tool calls do not accumulate open figures

        return "Average daily revenue plot saved to average_daily_revenue.png"

    yield FunctionInfo.from_fn(
        _plot_average_daily_revenue,
        description=("This tool can be used to plot the average daily revenue for stores and products. "
                     "It takes in a single string as input and creates and saves an image of a grouped bar chart "
                     "of the average daily revenue"))

<a id="register-tools"></a>
## 1.6) Register The Tools

In [None]:
%%writefile -a retail_sales_agent/src/retail_sales_agent/register.py

from . import sales_per_day_tool
from . import detect_outliers_tool
from . import total_product_sales_data_tool
from . import llama_index_rag_tool
from . import data_visualization_tools

<a id="workflow-config"></a>
## 1.7) Workflow Configuration File

The following cell creates a basic workflow configuration file:

In [None]:
%%writefile retail_sales_agent/configs/config.yml
llms:
  nim_llm:
    _type: nim
    model_name: meta/llama-3.3-70b-instruct
    temperature: 0.0
    max_tokens: 2048
    context_window: 32768
    api_key: $NVIDIA_API_KEY

embedders:
  nim_embedder:
    _type: nim
    model_name: nvidia/nv-embedqa-e5-v5
    truncate: END
    api_key: $NVIDIA_API_KEY

functions:
  total_product_sales_data:
    _type: get_total_product_sales_data
    data_path: data/retail_sales_data.csv
  sales_per_day:
    _type: get_sales_per_day
    data_path: data/retail_sales_data.csv
  detect_outliers:
    _type: detect_outliers_iqr
    data_path: data/retail_sales_data.csv

  data_analysis_agent:
    _type: tool_calling_agent
    tool_names:
      - total_product_sales_data
      - sales_per_day
      - detect_outliers
    llm_name: nim_llm
    max_history: 10
    max_iterations: 15
    description: |
      A helpful assistant that can answer questions about the retail sales CSV data.
      Use the tools to answer the questions.
      Input is a single string.
    verbose: false

  product_catalog_rag:
    _type: llama_index_rag
    llm_name: nim_llm
    embedder_name: nim_embedder
    collection_name: product_catalog_rag
    data_dir: data/rag/
    description: "Search product catalog for TabZen tablet, AeroBook laptop, NovaPhone specifications"

  rag_agent:
    _type: react_agent
    llm_name: nim_llm
    tool_names: [product_catalog_rag]
    max_history: 3
    max_iterations: 5
    max_retries: 2
    description: |
      An assistant that can only answer questions about products.
      Use the product_catalog_rag tool to answer questions about products.
      Do not make up any information.
    verbose: false

  plot_sales_trend_for_stores:
    _type: plot_sales_trend_for_stores
    data_path: data/retail_sales_data.csv
  plot_and_compare_revenue_across_stores:
    _type: plot_and_compare_revenue_across_stores
    data_path: data/retail_sales_data.csv
  plot_average_daily_revenue:
    _type: plot_average_daily_revenue
    data_path: data/retail_sales_data.csv

  data_visualization_agent:
    _type: react_agent
    llm_name: nim_llm
    tool_names:
      - plot_sales_trend_for_stores
      - plot_and_compare_revenue_across_stores
      - plot_average_daily_revenue
    max_history: 10
    max_iterations: 15
    description: |
      You are a data visualization expert.
      You can only create plots and visualizations based on user requests.
      Only use available tools to generate plots.
      You cannot analyze any data.
    verbose: false
    handle_parsing_errors: true
    max_retries: 2
    retry_parsing_errors: true

workflow:
  _type: react_agent
  tool_names: [data_analysis_agent, data_visualization_agent, rag_agent]
  llm_name: nim_llm
  verbose: true
  handle_parsing_errors: true
  max_retries: 2
  system_prompt: |
    Answer the following questions as best you can.
    You may communicate and collaborate with various experts to answer the questions.

    {tools}

    You may respond in one of two formats.
    Use the following format exactly to communicate with an expert:

    Question: the input question you must answer
    Thought: you should always think about what to do
    Action: the action to take, should be one of [{tool_names}]
    Action Input: the input to the action (if there is no required input, include "Action Input: None")
    Observation: wait for the expert to respond, do not assume the expert's response

    ... (this Thought/Action/Action Input/Observation can repeat N times.)
    Use the following format once you have the final answer:

    Thought: I now know the final answer
    Final Answer: the final answer to the original input question
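
The `system_prompt` above asks the LLM to reply in one of two formats: a tool call (`Action` plus `Action Input`) or a `Final Answer`. To make that contract concrete, here is a minimal sketch of how such a reply could be classified. `parse_react_output` is a hypothetical helper written for this notebook, not the parser the toolkit uses, and the sample replies are invented.

```python
import re

# The labels below mirror the ones used in the system prompt.
_FINAL_RE = re.compile(r"Final Answer:\s*(.*)", re.DOTALL)
_ACTION_RE = re.compile(
    r"Action:\s*(.*?)\s*\n\s*Action Input:\s*(.*?)(?=\n\s*Observation:|\Z)",
    re.DOTALL,
)


def parse_react_output(text: str) -> dict:
    """Classify one LLM reply as a final answer or a tool call."""
    final = _FINAL_RE.search(text)
    if final:
        return {"type": "final", "answer": final.group(1).strip()}
    action = _ACTION_RE.search(text)
    if action:
        return {
            "type": "action",
            "tool": action.group(1).strip(),
            "tool_input": action.group(2).strip(),
        }
    raise ValueError("Reply matches neither the action format nor the final-answer format")


step = parse_react_output(
    "Thought: I need sales data\n"
    "Action: data_analysis_agent\n"
    "Action Input: Compare laptop and phone sales"
)
done = parse_react_output("Thought: I now know the final answer\nFinal Answer: Phones outsold laptops.")
print(step)
print(done)
```

A reply that fits neither format raises `ValueError`, which is the kind of situation the `handle_parsing_errors` and `max_retries` settings in the configuration are meant to deal with.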

<a id="verify-tools"></a>
## 1.8) Testing/Verifying Workflow Installation

You can verify the workflow was successfully set up by running the following example:


In [None]:
!nat run --config_file retail_sales_agent/configs/config.yml \
  --input "What is the Ark S12 Ultra tablet and what are its specifications?" \
  --input "How do laptop sales compare to phone sales?" \
  --input "Plot average daily revenue"

<a id="observe-workflow"></a>
# 2.0) Observing a Workflow with Phoenix

> **Note:** _This portion of the example will only work when the notebook is run locally. It may not work through Google Colab and other online notebook environments._

Phoenix is an open-source observability platform designed for monitoring, debugging, and improving LLM applications and AI agents. It provides a web-based interface for visualizing and analyzing traces from LLM applications, agent workflows, and ML pipelines. Phoenix automatically captures key metrics such as latency, token usage, and costs, and displays the inputs and outputs at each step, making it invaluable for debugging complex agent behaviors and identifying performance bottlenecks in AI workflows.

<a id="update-config"></a>
## 2.1) Updating the Workflow Configuration For Telemetry

We will need to update the workflow configuration file to support telemetry tracing with Phoenix.

To do this, we will first copy the original configuration:

In [None]:
!cp retail_sales_agent/configs/config.yml retail_sales_agent/configs/phoenix_config.yml

Then we will append necessary configuration components to the `phoenix_config.yml` file:

In [None]:
%%writefile -a retail_sales_agent/configs/phoenix_config.yml

general:
  telemetry:
    logging:
      console:
        _type: console
        level: WARN
    tracing:
      phoenix:
        _type: phoenix
        endpoint: http://localhost:6006/v1/traces
        project: retail_sales_agent


<a id="start-phoenix"></a>
## 2.2) Start Phoenix Server

First, we will install Phoenix:

In [None]:
!uv pip install arize-phoenix

Then, we will bind the server to all network interfaces (`0.0.0.0`) so that it is reachable from outside the local host:

In [None]:
%env PHOENIX_HOST=0.0.0.0

Finally, we will start the server:

In [None]:
%%bash --bg
# phoenix will run on port 6006
phoenix serve

<a id="rerun-workflow"></a>
## 2.3) Rerun the Workflow

Instead of the original workflow configuration, we will run with the updated `phoenix_config.yml` file:

In [None]:
!nat run --config_file retail_sales_agent/configs/phoenix_config.yml \
  --input "What is the Ark S12 Ultra tablet and what are its specifications?" \
  --input "How do laptop sales compare to phone sales?" \
  --input "Plot average daily revenue"

<a id="view-trace"></a>
## 2.4) Viewing the Trace

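You can access the Phoenix UI at http://localhost:6006. Traces from the run above are recorded under the `retail_sales_agent` project set in `phoenix_config.yml`.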
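
If the page does not load, it can help to confirm that something is listening on the port before looking elsewhere. The sketch below is a plain TCP check using only the standard library. `is_port_open` is a hypothetical helper, and the check confirms only that the port accepts connections, not that the listener is Phoenix.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# 6006 is the port used by the `endpoint` in phoenix_config.yml.
print("Phoenix reachable:", is_port_open("localhost", 6006))
```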

<a id="eval-workflow"></a>
# 3.0) Evaluating a Workflow

After setting up observability, the next step is to evaluate your workflow's performance against a test dataset. NAT provides a powerful evaluation framework that can assess your agent's responses using various metrics and evaluators.

For detailed information on evaluation, refer to [Evaluating NVIDIA NeMo Agent Toolkit Workflows](https://docs.nvidia.com/nemo/agent-toolkit/latest/workflows/evaluate.html).


<a id="eval-dataset"></a>
## 3.1) Create an Evaluation Dataset

To evaluate this workflow, we will create a sample dataset.

The dataset will contain three test cases covering different query types. Each entry contains a question and the expected answer that the agent should provide.


In [None]:
%%writefile retail_sales_agent/data/eval_data.json
[
    {
        "id": "1",
        "question": "How do laptop sales compare to phone sales?",
        "answer": "Phone sales are higher than laptop sales in terms of both revenue and units sold. Phones generated a revenue of 561,000 with 1,122 units sold, whereas laptops generated a revenue of 512,000 with 512 units sold."
    },
    {
        "id": "2",
        "question": "What is the Ark S12 Ultra tablet and what are its specifications?",
        "answer": "The Ark S12 Ultra tablet features a 12.9-inch OLED display with a 144Hz refresh rate, HDR10+ dynamic range, and a resolution of 2800 x 1752 pixels. It has a contrast ratio of 1,000,000:1. The device is powered by Qualcomm's Snapdragon 8 Gen 3 SoC, which includes an Adreno 750 GPU and an NPU for on-device AI tasks. It comes with 16GB LPDDR5X RAM and 512GB of storage, with support for NVMe expansion via a proprietary magnetic dock. The tablet has an 11200mAh battery that enables up to 15 hours of typical use and recharges to 80 percent in 45 minutes via 45W USB-C PD. Additionally, it features a 13MP main sensor and a 12MP ultra-wide front camera, microphone arrays with beamforming, Wi-Fi 7, Bluetooth 5.3, and optional LTE/5G with eSIM. The device runs NebulynOS 6.0, based on Android 14L, and supports app sandboxing, multi-user profiles, and remote device management. It also includes the Pluma Stylus 3 with magnetic charging, 4096 pressure levels, and tilt detection, as well as a SnapCover keyboard with a trackpad and programmable shortcut keys."
    },
    {
        "id": "3",
        "question": "What were the laptop sales on Feb 16th 2024?",
        "answer": "On February 16th, 2024, the total laptop sales were 13 units, generating a total revenue of $13,000."
    }
]
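
Before running an evaluation, it can be worth checking that every entry carries the three keys used in the dataset above (`id`, `question`, `answer`). The following is a minimal sketch of such a check. `find_invalid_entries` is a hypothetical helper, and the inline sample is invented so that the cell runs on its own.

```python
import json

REQUIRED_KEYS = {"id", "question", "answer"}


def find_invalid_entries(entries: list[dict]) -> list[str]:
    """Describe every entry that lacks a key, repeats an id, or has an empty value."""
    problems = []
    seen_ids = set()
    for position, entry in enumerate(entries):
        missing = REQUIRED_KEYS - entry.keys()
        if missing:
            problems.append(f"entry {position}: missing {sorted(missing)}")
            continue
        if entry["id"] in seen_ids:
            problems.append(f"entry {position}: duplicate id {entry['id']!r}")
        seen_ids.add(entry["id"])
        if not entry["question"].strip() or not entry["answer"].strip():
            problems.append(f"entry {position}: empty question or answer")
    return problems


# Inline sample; the second entry is deliberately incomplete.
sample = json.loads('[{"id": "1", "question": "Q?", "answer": "A."}, {"id": "2", "question": "Q2?"}]')
print(find_invalid_entries(sample))
```

To check the real file, load `retail_sales_agent/data/eval_data.json` with `json.load` and pass the resulting list to the same function. An empty list means no problems were found.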

<a id="update-config-again"></a>
## 3.2) Updating the Workflow Configuration

Workflow configuration files can contain extra settings for evaluation and profiling.

To add the evaluation settings, we will first copy the original configuration:

In [None]:
!cp retail_sales_agent/configs/config.yml retail_sales_agent/configs/config_eval.yml

Then we will append the necessary configuration components to the `config_eval.yml` file:

In [None]:
%%writefile -a retail_sales_agent/configs/config_eval.yml

eval:
  general:
    output_dir: ./eval_output
    verbose: true
    dataset:
        _type: json
        file_path: ./retail_sales_agent/data/eval_data.json

  evaluators:
    rag_accuracy:
      _type: ragas
      metric: AnswerAccuracy
      llm_name: nim_llm
    rag_groundedness:
      _type: ragas
      metric: ResponseGroundedness
      llm_name: nim_llm
    rag_relevance:
      _type: ragas
      metric: ContextRelevance
      llm_name: nim_llm
    trajectory_accuracy:
      _type: trajectory
      llm_name: nim_llm


<a id="run-eval"></a>
## 3.3) Running the Evaluation

The `nat eval` command executes the workflow against all entries in the dataset and evaluates the results using configured evaluators. Run the cell below to evaluate the retail sales agent workflow.


In [None]:
!nat eval --config_file retail_sales_agent/configs/config_eval.yml

<a id="understand-eval"></a>
## 3.4) Understanding Evaluation Results

The `nat eval` command runs the workflow on all entries in the dataset and produces several output files:

- **`workflow_output.json`**: Contains the raw outputs from the workflow for each input in the dataset
- **Evaluator-specific files**: Each configured evaluator generates its own output file with scores and reasoning

#### Evaluation Scores

Each evaluator provides:
- An **average score** across all dataset entries (0-1 scale, where 1 is perfect)
- **Individual scores** for each entry with detailed reasoning
- **Performance metrics** to help identify areas for improvement

All evaluation results are stored in the `output_dir` specified in the configuration file.
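
To get a quick overview of a finished run, the JSON files in `output_dir` can be scanned for their average scores. The sketch below is hedged: the `average_score` key name is an assumption about the evaluator file layout, so files without it (or files that are not JSON objects, such as a list of workflow outputs) are reported as `None`. `collect_average_scores` is a hypothetical helper.

```python
import json
from pathlib import Path


def collect_average_scores(output_dir: str) -> dict:
    """Map each JSON file in output_dir to its `average_score`, or None if absent."""
    scores = {}
    for path in sorted(Path(output_dir).glob("*.json")):
        with path.open() as f:
            data = json.load(f)
        # ASSUMPTION: evaluator files are JSON objects with an `average_score` key.
        scores[path.name] = data.get("average_score") if isinstance(data, dict) else None
    return scores


# `eval_output` matches the `output_dir` in config_eval.yml; skip quietly if no run exists yet.
if Path("eval_output").is_dir():
    for name, score in collect_average_scores("eval_output").items():
        print(f"{name}: {score}")
```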


<a id="profile-workflow"></a>
# 4.0) Profiling a Workflow

Profiling provides deep insights into your workflow's performance characteristics, helping you identify bottlenecks, optimize resource usage, and improve overall efficiency.

For detailed information on profiling, refer to [Profiling and Performance Monitoring of NVIDIA NeMo Agent Toolkit Workflows](https://docs.nvidia.com/nemo/agent-toolkit/latest/workflows/profiler.html).


<a id="update-profiling-workflow"></a>
## 4.1) Updating the Workflow Configuration

Workflow configuration files can contain extra settings for evaluation and profiling.

To add the profiling settings, we will first copy the original configuration:

In [None]:
!cp retail_sales_agent/configs/config.yml retail_sales_agent/configs/config_profile.yml

Then we will append the necessary configuration components to the `config_profile.yml` file:

In [None]:
%%writefile -a retail_sales_agent/configs/config_profile.yml

eval:
  general:
    output_dir: ./profile_output
    verbose: true
    dataset:
        _type: json
        file_path: ./retail_sales_agent/data/eval_data.json

    profiler:
        token_uniqueness_forecast: true
        workflow_runtime_forecast: true
        compute_llm_metrics: true
        csv_exclude_io_text: true
        prompt_caching_prefixes:
          enable: true
          min_frequency: 0.1
        bottleneck_analysis:
          enable_nested_stack: true
        concurrency_spike_analysis:
          enable: true
          spike_threshold: 7


<a id="understand-profiler-config"></a>
## 4.2) Understanding the Profiler Configuration

The profiling configuration reuses the dataset from the evaluation step, writes to its own `output_dir`, and adds a `profiler` section under `eval.general`.

The profiler is configured through this `profiler` section of your workflow configuration file. It runs as part of the `nat eval` command and offers several analysis options:

#### Key Configuration Options:

- **`token_uniqueness_forecast`**: Computes the inter-query token uniqueness forecast, predicting the expected number of unique tokens in the next query based on tokens used in previous queries

- **`workflow_runtime_forecast`**: Calculates the expected workflow runtime based on historical query performance

- **`compute_llm_metrics`**: Computes inference optimization metrics including latency, throughput, and other performance indicators

- **`csv_exclude_io_text`**: Prevents large text from being dumped into output CSV files, preserving CSV structure and readability

- **`prompt_caching_prefixes`**: Identifies common prompt prefixes that can be pre-populated in KV caches for improved performance

- **`bottleneck_analysis`**: Analyzes workflow performance measures such as bottlenecks, latency, and concurrency spikes
  - `simple_stack`: Provides a high-level analysis
  - `nested_stack`: Offers detailed analysis of nested bottlenecks (e.g., tool calls inside other tool calls). The configuration above turns this on with `enable_nested_stack: true`

- **`concurrency_spike_analysis`**: Identifies concurrency spikes in your workflow. The `spike_threshold` parameter (e.g., 7) determines when to flag spikes based on the number of concurrent running functions
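
To illustrate what a concurrency spike means, the sketch below counts how many calls are running at each moment and flags the moments where that count reaches the threshold. It is a simplified sweep over start and end times, not the profiler's implementation. Treating "at or above the threshold" as a spike is an assumption, and both `find_concurrency_spikes` and the timings are hypothetical.

```python
def find_concurrency_spikes(intervals, spike_threshold):
    """Return (time, concurrency) for each call start that brings concurrency to the threshold or above.

    intervals: list of (start, end) times for function calls, with start < end.
    """
    events = []
    for start, end in intervals:
        events.append((start, 1))   # a call begins
        events.append((end, -1))    # a call finishes
    # Sort by time; at equal times, finishes (-1) are processed before starts (+1).
    events.sort()

    spikes = []
    running = 0
    for time, delta in events:
        running += delta
        if delta == 1 and running >= spike_threshold:
            spikes.append((time, running))
    return spikes


# Hypothetical call timings in seconds.
calls = [(0.0, 4.0), (1.0, 3.0), (2.0, 5.0), (4.0, 6.0)]
print(find_concurrency_spikes(calls, spike_threshold=3))
```

With these timings, three calls overlap only from 2.0 s onward, so a threshold of 3 flags a single spike at that moment.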

#### Output Directory

The `output_dir` parameter specifies where all profiler outputs will be stored for later analysis.


<a id="run-profiler"></a>
## 4.3) Running the Profiler

The profiler runs as part of the `nat eval` command. When properly configured, it will collect performance data across all evaluation runs and generate comprehensive profiling reports.


In [None]:
!nat eval --config_file retail_sales_agent/configs/config_profile.yml

<a id="understand-profiler-output-files"></a>
## 4.4) Understanding Profiler Output Files

Based on the profiler configuration, the following files will be generated in the `output_dir`:

**Core Output Files:**

1. **`all_requests_profiler_traces.json`**: Raw usage statistics collected by the profiler, including:
   - Raw traces of LLM interactions
   - Tool input and output data
   - Runtime measurements
   - Execution metadata

2. **`inference_optimization.json`**: Workflow-specific performance metrics with confidence intervals:
   - 90%, 95%, and 99% confidence intervals for latency
   - Throughput statistics
   - Workflow runtime predictions

3. **`standardized_data_all.csv`**: Standardized usage data in CSV format containing:
   - Prompt tokens and completion tokens
   - LLM input/output
   - Framework information
   - Additional metadata


**Advanced Analysis Files:**

4. **Analysis Reports**: JSON files and text reports for any advanced techniques enabled:
   - Concurrency analysis results
   - Bottleneck analysis reports
   - PrefixSpan pattern mining results

These files provide comprehensive insights into your workflow's performance and can be used for optimization and debugging.
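
To read the confidence intervals in `inference_optimization.json`, it helps to recall what such an interval expresses. The sketch below computes a normal-approximation interval for the mean of a latency sample. This is one common method and may differ from the one the profiler uses. The latencies are invented, and `mean_confidence_interval` is a hypothetical helper.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(samples, confidence=0.95):
    """Normal-approximation confidence interval for the mean of `samples`."""
    if len(samples) < 2:
        raise ValueError("Need at least two samples to estimate spread")
    center = mean(samples)
    standard_error = stdev(samples) / sqrt(len(samples))
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return center - z * standard_error, center + z * standard_error


# Hypothetical latencies in seconds.
latencies = [1.0, 2.0, 3.0, 4.0, 5.0]
for level in (0.90, 0.95, 0.99):
    low, high = mean_confidence_interval(latencies, level)
    print(f"{level:.0%}: [{low:.3f}, {high:.3f}]")
```

Higher confidence levels give wider intervals, and with only a few samples (such as the three entries in the evaluation dataset) the intervals are wide.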

**Gantt Chart**

We can also view a Gantt chart of the profile run:

In [None]:
from IPython.display import Image

Image("profile_output/gantt_chart.png")

<a id="summary"></a>
# 5.0) Notebook Summary

In this notebook, we covered the complete workflow for observability, evaluation, and profiling in NeMo Agent Toolkit:

**Observability with Phoenix**
- Configured tracing in the workflow configuration
- Started the Phoenix server for real-time monitoring
- Executed workflows with automatic trace capture
- Visualized agent execution flow and LLM interactions


**Evaluation with `nat eval`**
- Created a comprehensive evaluation dataset
- Ran automated evaluations across multiple test cases
- Reviewed evaluation metrics and scores
- Analyzed workflow performance against expected outputs



**Profiling for Performance Optimization**
- Configured advanced profiling options
- Collected performance metrics and usage statistics
- Generated detailed profiling reports
- Identified bottlenecks and optimization opportunities



These three pillars (observability, evaluation, and profiling) work together to provide a complete picture of your agent's behavior, accuracy, and performance, enabling you to build production-ready AI applications with confidence.