# Getting Started with NeMo Agent Toolkit

In this notebook, we walk through the basics of using NeMo Agent toolkit (NAT), from installation all the way to creating and running a simple workflow. The goal is to get new NAT users up and running with a high-level understanding of our YAML-first approach, while building some intuition for how NAT workflows can quickly be embedded into your projects.

## Table of Contents

- [0) Setup](#setup)
  - [0.1) Prerequisites](#prereqs)
  - [0.2) API Keys](#api-keys)
  - [0.3) Installing NeMo Agent Toolkit](#installing-nat)
- [1) Creating Your First Workflow](#creating-your-first-workflow)
  - [1.1) What is a NAT workflow?](#what-is-a-workflow)
  - [1.2) Create your first workflow](#create-first-workflow)
  - [1.3) Interpret your first workflow](#interpret-first-workflow)
    - [Interpreting Directory Structure](#directory-structure)
    - [Interpreting Configuration File](#configuration-file)
    - [Interpreting Workflow Functions](#workflow-functions)
    - [Tying It Together](#tying-it-together)
- [2) Running Your First Workflow](#run-first-workflow)
  - [2.1) Run with the CLI](#run-cli)
  - [2.2) Run as a NAT server](#run-server)
  - [2.3) Running NAT Embedded within Python](#run-embedded)
- [Next Steps](#next-steps)

<span style="color: #FFD600; font-style: italic;">Note: Table of Contents links may not work in Google Colab.</span>



<a id="setup"></a>
# 0) Setup

<a id="prereqs"></a>
### 0.1) Prerequisites

- **Platform:** Linux, macOS, or Windows
- **Python:** version 3.11, 3.12, or 3.13
- **Python Packages:** `pip`

<a id="api-keys"></a>
### 0.2) API Keys

For this notebook, you will need the following API keys to run all examples end-to-end:

- **NVIDIA Build:** You can obtain an NVIDIA Build API Key by creating an [NVIDIA Build](https://build.nvidia.com) account and generating a key at https://build.nvidia.com/settings/api-keys

Then you can run the cell below:

In [None]:
import getpass
import os

if "NVIDIA_API_KEY" not in os.environ:
    nvidia_api_key = getpass.getpass("Enter your NVIDIA API key: ")
    os.environ["NVIDIA_API_KEY"] = nvidia_api_key

<a id="installing-nat"></a>
### 0.3) Installing NeMo Agent Toolkit

The recommended way to install NAT is through `pip` or `uv pip`.

First, we will install `uv`, which offers parallel downloads and faster dependency resolution.

In [None]:
!pip install uv

NeMo Agent toolkit can be installed through the PyPI `nvidia-nat` package.

There are several optional subpackages available for NAT. The `langchain` subpackage contains useful components for integrating and running within [LangChain](https://python.langchain.com/docs/introduction/). Since LangChain will be used later in this notebook, let's install NAT with the optional `langchain` subpackage.

In [None]:
!uv pip install "nvidia-nat[langchain]"

<a id="creating-your-first-workflow"></a>
# 1) Creating Your First Workflow

<a id="what-is-a-workflow"></a>
### 1.1) What is a NAT workflow?

A [workflow](https://docs.nvidia.com/nemo/agent-toolkit/latest/workflows/about/index.html) in NeMo Agent Toolkit is a structured specification of how agents, models, tools (called functions), embedders, and other components are composed together to carry out a specific task. It defines which components are used, how they are connected, and how they behave when executing the task.

NAT provides a convenient command-line interface called `nat`, which is accessible in your active Python environment. It serves as the entry point to most toolkit functions.

The `nat workflow create` command allows us to create a new workflow.

<a id="create-first-workflow"></a>
### 1.2) Create your first workflow

In [None]:
!nat workflow create getting_started

<a id="interpret-first-workflow"></a>
### 1.3) Interpret your first workflow

<a id="directory-structure"></a>
#### Interpreting Directory Structure
We can inspect the structure of the created **workflow directory**, which we've named `getting_started`. It contains the configuration files, source code, and data needed to define and run the workflow.

In [None]:
!find getting_started/

A summary of the high-level components is outlined below.

* `configs` (symbolic link to `src/getting_started/configs`)
* `data` (symbolic link to `src/getting_started/data`)
* `pyproject.toml` Python project configuration file
* `src`
  * `getting_started`
    * `__init__.py` Module init file (empty)
    * `configs` Configuration directory for workflow specifications
      * `config.yml` Workflow configuration file
    * `data` Data directory for any dependent files
    * `getting_started.py` User-defined code for workflow execution
    * `register.py` Automatic registration of project components
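
For readers who prefer to inspect the layout from Python rather than with `find`, here is a minimal sketch using `pathlib`. The helper name `describe_tree` is hypothetical; it marks directories with a trailing slash and shows the target of each symbolic link, which makes the `configs` and `data` links easy to spot.

```python
from pathlib import Path


def describe_tree(root: str) -> list[str]:
    """List every entry under root, marking directories and symbolic links."""
    base = Path(root)
    lines = []
    for path in sorted(base.rglob("*")):
        label = str(path.relative_to(base))
        if path.is_symlink():
            # Show where the link points, as with `configs` and `data` above.
            label += f" -> {path.readlink()}"
        elif path.is_dir():
            label += "/"
        lines.append(label)
    return lines


# Only print when the workflow directory from the previous step exists.
if Path("getting_started").is_dir():
    print("\n".join(describe_tree("getting_started")))
```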


<a id="configuration-file"></a>
#### Interpreting Configuration File
The workflow configuration file, `getting_started/configs/config.yml`, describes the operational characteristics of the entire workflow. Let's load its contents in the next cell and understand what this first workflow can do out of the box.

In [None]:
%load getting_started/configs/config.yml

The above workflow configuration has the following components:
- a [built-in `current_datetime`](https://docs.nvidia.com/nemo/agent-toolkit/latest/api/nat/tool/datetime_tools/index.html#nat.tool.datetime_tools.current_datetime) function
- a workflow-defined `getting_started` function
- an LLM
- an entrypoint workflow of a [built-in ReAct agent](https://docs.nvidia.com/nemo/agent-toolkit/latest/workflows/about/react-agent.html)

By default, we create a [ReAct agent](https://docs.nvidia.com/nemo/agent-toolkit/latest/workflows/about/react-agent.html) equipped with both of the functions above. When called, the agent decides which functions to call (if any) based on the intent of the user input. The agent uses the LLM to help make reasoning decisions and then performs a subsequent action.

This workflow configuration file is a YAML-serialized version of the [`Config`](https://docs.nvidia.com/nemo/agent-toolkit/latest/api/nat/data_models/config/index.html#nat.data_models.config.Config) class. Each category within the high-level configuration specifies runtime configuration settings for its corresponding components. For instance, the `workflow` category contains all configuration settings for the workflow entry point. The configuration file is validated as typed Pydantic models and fields. All configuration classes have validation rules, default values, and [documentation](https://docs.nvidia.com/nemo/agent-toolkit/latest/workflows/workflow-configuration.html#workflow-configuration-file), which enable type-safe configuration management, automatic schema generation, and validation across the entire plugin ecosystem.

* `general` - General configuration section. Contains high-level configurations for front-end definitions.
* `authentication` - Authentication provides an interface for defining and interacting with various authentication providers.
* `llms` - LLMs provide an interface for interacting with LLM providers.
* `embedders` - Embedders provide an interface for interacting with embedding model providers.
* `retrievers` - Retrievers provide an interface for searching and retrieving documents.
* `memory` - Configurations for Memory. Memories provide an interface for storing and retrieving.
* `object_stores` - Object Stores provide a CRUD interface for objects and data.
* `eval` - The evaluation section provides configuration options related to the profiling and evaluation of NAT workflows.
* `ttc_strategies` (experimental) - Test Time Compute (TTC) strategy definitions.

#### Type Safety and Validation

Many components within the workflow configuration specify `_type`. This YAML key indicates the type of the component so NAT can properly validate and instantiate it within the workflow. For example, [`NIMModelConfig`](https://docs.nvidia.com/nemo/agent-toolkit/latest/api/nat/llm/nim_llm/index.html#nat.llm.nim_llm.NIMModelConfig) is a subclass of [`LLMBaseConfig`](https://docs.nvidia.com/nemo/agent-toolkit/latest/api/nat/data_models/llm/index.html#nat.data_models.llm.LLMBaseConfig), so when we specify `_type: nim` in the configuration, the toolkit knows to validate the configuration with `NIMModelConfig`.

<span style="color: green">**Note:** Not all configuration components are required. The simplest workflow configuration needs to only define <code>workflow</code>.</span>
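
To build intuition for how a `_type` key can select a validating class, here is a toy sketch that uses only the standard library. It is not how NAT implements this (NAT validates with Pydantic models); `LLM_CONFIG_TYPES`, `register_llm_config`, `ToyNIMConfig`, `build_config`, and the field names are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass, fields

# Hypothetical registry: maps a `_type` string to the class that validates it.
LLM_CONFIG_TYPES: dict[str, type] = {}


def register_llm_config(name: str):
    """Class decorator that records a config class under its `_type` name."""
    def decorator(cls):
        LLM_CONFIG_TYPES[name] = cls
        return cls
    return decorator


@register_llm_config("nim")
@dataclass
class ToyNIMConfig:
    model_name: str
    temperature: float = 0.0


def build_config(section: dict):
    """Pick the config class from `_type`, then reject unknown fields."""
    settings = dict(section)
    type_name = settings.pop("_type")
    if type_name not in LLM_CONFIG_TYPES:
        raise ValueError(f"unknown _type: {type_name!r}")
    config_cls = LLM_CONFIG_TYPES[type_name]
    allowed = {f.name for f in fields(config_cls)}
    unknown = set(settings) - allowed
    if unknown:
        raise ValueError(f"unexpected fields for {type_name!r}: {sorted(unknown)}")
    # The dataclass constructor raises TypeError if a required field is missing.
    return config_cls(**settings)


llm = build_config({"_type": "nim", "model_name": "example-model", "temperature": 0.0})
print(llm)
```

The point of the sketch is the lookup step: the string in `_type` decides which class checks the remaining keys, so a typo in either the type name or a field name fails early rather than at run time.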



<a id="workflow-functions"></a>
#### Interpreting Workflow Functions

Next, let's inspect the contents of the generated workflow function:

In [None]:
%load getting_started/src/getting_started/getting_started.py

##### Function Configuration

The `GettingStartedFunctionConfig` specifies `FunctionBaseConfig` as a base class and also declares a `name`. The toolkit uses this name to create a static mapping whenever `_type` is specified in a place where a `FunctionBaseConfig` is expected, such as `workflow` or under `functions`.

##### Function Registration

NeMo Agent toolkit relies on a configuration-with-builder pattern to define most components. For functions, the `@register_function` decorator is required so that the toolkit can look the function up by name wherever it is referenced. The decorator requires a `config_type`, which ensures type safety and validation.

The parameters to the decorated function are always:

1. the configuration type of the function/component (FunctionBaseConfig)
2. a Builder which can be used to dynamically query and get other workflow components (Builder)

##### Function Implementation

The core logic of the `getting_started` function is embedded as a function within the outer function registration. This is done for a few reasons:

* Enables dynamic importing of libraries and modules on an as-needed basis.
* Enables context-manager-like handling of resources, so they are closed automatically.
* Provides the most flexibility to users when defining their own functions.

Near the end of the function registration implementation, we `yield` a `FunctionInfo` object. `FunctionInfo` is a wrapper around any type of function. It is also possible to specify additional information such as schema and converters if your function relies on transformations.

NAT relies on `yield` rather than `return` so resources can stay alive during the lifetime of the function or workflow.
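
The effect of yielding instead of returning can be seen in a toy sketch built on the standard library's `contextlib.asynccontextmanager`. This is an illustration of the general pattern, not NAT's registration machinery; `toy_registered_function` and the `events` list are hypothetical names.

```python
import asyncio
from contextlib import asynccontextmanager

events = []  # records the order in which things happen


@asynccontextmanager
async def toy_registered_function(prefix: str):
    """Hypothetical stand-in for a registered function: set up, yield, clean up."""
    events.append("resource opened")  # e.g. a client or connection pool

    async def inner(message: str) -> str:
        events.append("inner called")
        return f"{prefix}{message}"

    try:
        # Yielding pauses here, so the resource stays open while callers use `inner`.
        yield inner
    finally:
        events.append("resource closed")


async def main() -> str:
    async with toy_registered_function("echo: ") as fn:
        return await fn("hello")


result = asyncio.run(main())
print(result)
print(events)
```

The cleanup in the `finally` block runs only after the caller is done with the yielded function, which a plain `return` could not guarantee. When trying this inside a notebook cell, where an event loop is already running, replace `asyncio.run(main())` with `await main()`.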

<a id="tying-it-together"></a>
##### Tying It Together

Looking back at the configuration file, the `getting_started` entry under `functions` has a `_type` of `getting_started`. This means that its configuration is validated against the `GettingStartedFunctionConfig` implementation.

The `register.py` file tells NAT what should automatically be imported so it is available when the toolkit is loaded.

In [None]:
%load getting_started/src/getting_started/register.py

<a id="run-first-workflow"></a>
# 2) Running Your First Workflow

<a id="run-cli"></a>
### 2.1) Run with the CLI

You can run a workflow by using the `nat run` CLI command:

In [None]:
!nat run --config_file getting_started/configs/config.yml \
         --input "Can you echo back my name, Will?"
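
If you want to trigger the same command from a Python script rather than a shell, a minimal sketch with `subprocess` follows. The helper names `build_nat_run_command` and `run_workflow` are hypothetical; the flags are the ones used in the cell above. How the CLI splits its output between stdout and stderr is not covered here, so the sketch simply captures both.

```python
import shlex
import subprocess


def build_nat_run_command(config_file: str, user_input: str) -> list[str]:
    """Assemble the same `nat run` invocation used in the cell above."""
    return ["nat", "run", "--config_file", config_file, "--input", user_input]


def run_workflow(config_file: str, user_input: str) -> subprocess.CompletedProcess:
    """Run the CLI and capture its output; raises CalledProcessError on failure."""
    command = build_nat_run_command(config_file, user_input)
    return subprocess.run(command, capture_output=True, text=True, check=True)


command = build_nat_run_command(
    "getting_started/configs/config.yml",
    "Can you echo back my name, Will?",
)
# Print the shell-quoted form to confirm it matches the cell above.
print(shlex.join(command))
```

Passing the arguments as a list avoids shell-quoting problems when the input contains quotes or other special characters.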

<a id="run-server"></a>
### 2.2) Run as a NAT server

NAT provides another mechanism for running workflows through `nat serve`. `nat serve` creates and launches a FastAPI-based REST web server for interfacing with the toolkit as though it were an OpenAI-compatible endpoint. To learn more about all endpoints served by `nat serve`, please consult [this documentation](https://docs.nvidia.com/nemo/agent-toolkit/latest/reference/api-server-endpoints.html).

<span style="color: red"><i>Note: If you are running this notebook on a cloud provider such as Google Colab, `dask` may already be installed. If it is, you will first have to uninstall it via:</i></span>

In [None]:
!pip uninstall -y dask

To start the FastAPI web server, issue the following command:

In [None]:
%%bash --bg
nat serve --config_file getting_started/configs/config.yml

It will take several seconds for the server to be reachable. The default port for the server is `8000` with `localhost` access.
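
Rather than guessing how long to wait, a script can poll the port until it accepts connections. The following is a minimal sketch using the standard library's `socket` module; the helper name `wait_for_port` and its default timeout values are assumptions chosen for illustration.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For the server started above, calling `wait_for_port("localhost", 8000)` before sending the first request should be enough. Note that an open port only shows the server is listening, not that the workflow has finished loading.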

Note that `--input` was not required for `nat serve`. To issue a request to the server, you can then do:

In [None]:
%%bash

# Issue a request to the background service
curl --request POST \
  --url http://localhost:8000/chat \
  --header 'Content-Type: application/json' \
  --data '{
    "messages": [
        {
          "role": "user",
          "content": "What is the current time?"
        }
      ]
    }' | jq
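
The same request can be issued from Python. Below is a minimal sketch using only the standard library's `urllib.request`, mirroring the URL, header, and payload of the `curl` call above. The helper names `build_chat_request` and `send_chat` are hypothetical, and the sketch assumes the response body is JSON, as the `jq` pipe above suggests, without assuming anything about its fields.

```python
import json
import urllib.request


def build_chat_request(content: str, base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build the POST request that the curl command above sends."""
    payload = {"messages": [{"role": "user", "content": content}]}
    return urllib.request.Request(
        url=f"{base_url}/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat(content: str, base_url: str = "http://localhost:8000", timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON response (requires a running server)."""
    request = build_chat_request(content, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server still running, `send_chat("What is the current time?")` should return the decoded response as a dictionary.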

In [None]:
# Terminate the process after completion
!pkill -9 -f "nat serve"

<a id="run-embedded"></a>
### 2.3) Running NAT Embedded within Python

The final way to run a NAT workflow is by embedding it into an existing Python application or library.

Consider the following code:

In [None]:
%%writefile nat_embedded.py
import asyncio
import sys

from nat.runtime.loader import load_workflow
from nat.utils.type_utils import StrPath


async def get_callable_for_workflow(config_file: StrPath):
    """
    Creates an end-to-end async callable which can run a NAT workflow.

    Note that this "yields" the callable so you have to access via an
    asynchronous generator:

      async for callable in get_callable_for_workflow(...):
          # use callable here

    Args:
        config_file (StrPath): a valid path to a NAT configuration file

    Yields:
        The callable
    """
    # load a given workflow from a configuration file
    async with load_workflow(config_file) as workflow:

        # create an async callable that runs the workflow
        async def single_call(input_str: str) -> str:

            # run the input through the workflow
            async with workflow.run(input_str) as runner:
                # wait for the result and cast it to a string
                return await runner.result(to_type=str)

        yield single_call


async def amain():
    async for callable in get_callable_for_workflow(sys.argv[1]):
        # read queries from stdin and process them serially
        query_num = 1
        try:
            while True:
                query = input()
                result = await callable(query)
                print(f"Query {query_num}: {query}")
                print(f"Result {query_num}: {result}")
                query_num += 1
        except EOFError:
            pass


asyncio.run(amain())

Then we can run it as a normal Python program as shown below or, better yet, integrate it with your existing services.

In [None]:
%%bash
python nat_embedded.py getting_started/configs/config.yml <<EOF
What are you capable of doing?
What does the 'current_datetime' tool do?
What does the 'getting_started' tool do?
What is the current time?
Can you echo back my name, Evan?
What is the current time?
Can you echo back my name, Will?
EOF
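
When integrating with an existing service, queries usually come from application code rather than stdin. The following minimal sketch shows a helper that sends a list of queries through any async callable of the same shape as `single_call` above. The names `run_queries` and `fake_workflow` are hypothetical; `fake_workflow` is only a stand-in so that the sketch runs without NAT.

```python
import asyncio
from collections.abc import Awaitable, Callable


async def run_queries(
    call: Callable[[str], Awaitable[str]],
    queries: list[str],
) -> list[tuple[str, str]]:
    """Send each query through `call` in order and pair it with its result."""
    results = []
    for query in queries:
        results.append((query, await call(query)))
    return results


async def fake_workflow(query: str) -> str:
    """Stand-in for the workflow callable so the sketch runs without NAT."""
    return f"echo: {query}"


pairs = asyncio.run(run_queries(fake_workflow, ["first", "second"]))
print(pairs)
```

Inside `nat_embedded.py`, the equivalent call would be `await run_queries(callable, [...])` within the `async for` loop, so the loaded workflow stays open while the queries are processed.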

<a id="next-steps"></a>
# Next Steps

If you already have agents implemented and don't need NAT to bring up your first agent, we also support bringing existing agents into the NAT framework. In the next notebook in this series, "Bringing Your Own Agent into NeMo Agent Toolkit" (2_bringing_your_own_agent.ipynb), we walk through adapting existing agents to NAT.