docs: add agent runtime primer #145
Merged
rapids-bot
merged 2 commits into
NVIDIA:main
from
mnajafian-nv:docs/add-agent-runtime-primer
May 22, 2026
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,102 @@ | ||
| <!-- | ||
| SPDX-FileCopyrightText: Copyright (c) 2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved. | ||
| SPDX-License-Identifier: Apache-2.0 | ||
| --> | ||
|
|
||
| # Agent Runtime Primer | ||
|
|
||
| NeMo Relay is a portable runtime layer for agent systems that already have an | ||
| application, framework, or model provider. Use this primer when you need to | ||
| understand what NeMo Relay adds before running [Quick Start](quick-start.md). | ||
|
|
||
| Agent applications usually cross several boundaries in one request: an entry | ||
| point starts work, the agent calls a model, the model asks for tools, tools call | ||
| services, and tracing or policy systems need to understand the result. Without a | ||
| shared runtime layer, each boundary tends to grow its own wrappers, callback | ||
| shape, trace vocabulary, and cleanup rules. | ||
|
|
||
| NeMo Relay gives those boundaries one execution model. | ||
|
|
||
| ## What NeMo Relay Adds | ||
|
|
||
| NeMo Relay does not decide what your agent should do. It describes and manages | ||
| what happens when your agent crosses runtime boundaries. | ||
|
|
||
| The core runtime model has five parts: | ||
|
|
||
| - **Scopes** describe where work belongs. They preserve parent-child | ||
| relationships across requests, agent runs, tools, LLM calls, background work, | ||
| and nested functions. | ||
| - **Managed tool and LLM calls** attach work to the active scope, run middleware | ||
| in a consistent order, and emit lifecycle events. The application result is | ||
| preserved unless registered intercepts or guardrails intentionally change the | ||
| execution path. | ||
| - **Middleware** runs around managed execution. Intercepts can transform or wrap | ||
| real calls. Guardrails can block execution or sanitize emitted observability | ||
| payloads. | ||
| - **Events** record what happened. NeMo Relay emits Agent Trajectory | ||
| Observability Format (ATOF) lifecycle records that subscribers and exporters | ||
| can consume. | ||
| - **Plugins** package reusable runtime behavior so teams can install middleware, | ||
| subscribers, exporters, or adaptive behavior from configuration instead of | ||
| repeating setup code in every application. | ||
|
|
||
| The simplest mental model is: | ||
|
|
||
| ```text | ||
| app or framework boundary | ||
| -> NeMo Relay scope | ||
| -> managed tool or LLM call | ||
| -> middleware | ||
| -> lifecycle event | ||
| -> subscriber or exporter | ||
| ``` | ||
|
|
||
| ## What NeMo Relay Does Not Replace | ||
|
|
||
| NeMo Relay sits below the choices your application already makes. | ||
|
|
||
| It does not replace: | ||
|
|
||
| - your agent framework or orchestration logic | ||
| - your model provider or provider SDK | ||
| - your application business logic | ||
| - your production observability backend | ||
| - NeMo Agent Toolkit | ||
|
|
||
| Instead, it gives those systems a shared runtime contract for call boundaries, | ||
| policy hooks, event emission, and export. | ||
|
|
||
| ## Choose The Boundary You Own | ||
|
|
||
| Where you start depends on who owns the call boundary. | ||
|
|
||
| If your application directly calls tools or model providers, start by | ||
| instrumenting the application boundary. Add scopes first, then wrap the tool and | ||
| LLM calls your code owns. | ||
|
|
||
| If a framework owns scheduling, retries, callbacks, or provider payloads, use a | ||
| framework integration. The integration should preserve framework behavior while | ||
| adding NeMo Relay scopes, managed calls, codecs, middleware, and events at stable | ||
| framework boundaries. | ||
|
|
||
| If you need the same behavior across multiple services or teams, package it as a | ||
| plugin. Plugins are the configuration-driven path for reusable middleware, | ||
| subscribers, exporters, and adaptive components. | ||
|
|
||
| ## Read Next | ||
|
|
||
| The following pages help you choose the next step for your integration. | ||
|
|
||
| - Use [Quick Start](quick-start.md) for the smallest binding-specific example. | ||
| - Use [Instrument Applications](../instrument-applications/about.md) when you | ||
| own the tool or LLM call site. | ||
| - Use [Integrate into Frameworks](../integrate-frameworks/about.md) when a | ||
| framework owns invocation, scheduling, retries, callbacks, or provider | ||
| payloads. | ||
| - Use [Build Plugins](../build-plugins/about.md) when behavior should be | ||
| reusable and activated from configuration. | ||
| - Use [Observability](../plugins/observability/about.md) when you need to export | ||
| runtime events to ATIF, OpenTelemetry, or OpenInference. | ||
| - Use [Adaptive](../plugins/adaptive/about.md) after baseline instrumentation is | ||
| working and you want to tune behavior from observed runtime signals. | ||
|
mnajafian-nv marked this conversation as resolved.
|
||
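The runtime model in the primer (scope, managed call, middleware, lifecycle event, subscriber) can be sketched as a small standalone toy. This is not NeMo Relay's API: `ToyRuntime`, `Scope`, `call_tool`, and the event field names are all hypothetical, chosen only to show how parent-child scopes, intercepts, guardrails, and subscribers could fit together under the assumptions the primer states.

```python
import contextvars
import itertools
from contextlib import contextmanager
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterator, List, Optional


@dataclass(frozen=True)
class Scope:
    """Hypothetical scope record: where a piece of work belongs."""
    name: str
    scope_id: int
    parent_id: Optional[int]


# The active scope travels with the execution context, so nested work
# can find its parent without passing it explicitly.
_active: contextvars.ContextVar = contextvars.ContextVar("active_scope", default=None)
_ids = itertools.count(1)


class ToyRuntime:
    """Illustrative stand-in for a runtime layer; not a real library API."""

    def __init__(self) -> None:
        self.guardrails: List[Callable[[str, tuple], bool]] = []  # may block a call
        self.intercepts: List[Callable[[Callable], Callable]] = []  # wrap the real call
        self.subscribers: List[Callable[[Dict[str, Any]], None]] = []  # consume events

    def _emit(self, kind: str, scope: Scope) -> None:
        event = {
            "kind": kind,
            "name": scope.name,
            "scope_id": scope.scope_id,
            "parent_id": scope.parent_id,
        }
        for subscriber in self.subscribers:
            subscriber(event)

    @contextmanager
    def scope(self, name: str) -> Iterator[Scope]:
        parent = _active.get()
        current = Scope(name, next(_ids), parent.scope_id if parent else None)
        token = _active.set(current)
        self._emit("start", current)
        try:
            yield current
        finally:
            # The end event fires even if the wrapped work raises.
            self._emit("end", current)
            _active.reset(token)

    def call_tool(self, name: str, fn: Callable[..., Any], *args: Any) -> Any:
        """Managed call: child scope, then guardrails, then intercepts, then fn."""
        with self.scope(name):
            if not all(allow(name, args) for allow in self.guardrails):
                raise PermissionError(f"guardrail blocked {name}")
            call = fn
            # The first registered intercept ends up outermost.
            for intercept in reversed(self.intercepts):
                call = intercept(call)
            return call(*args)


runtime = ToyRuntime()
events: List[Dict[str, Any]] = []
runtime.subscribers.append(events.append)
# Example intercept: transform the result of the real call.
runtime.intercepts.append(lambda nxt: lambda *a: nxt(*a).upper())

with runtime.scope("request") as request:
    result = runtime.call_tool("echo", lambda text: text, "hi")

print(result, [(e["kind"], e["name"]) for e in events])
```

In this sketch the tool's result passes through unchanged unless an intercept rewrites it or a guardrail raises, which mirrors the primer's statement that the application result is preserved unless middleware intentionally changes the execution path. A real integration would follow the Quick Start and binding-specific guides rather than these names.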