fix: preserve Hermes error-path export consistency by mnajafian-nv · Pull Request #227 · NVIDIA/NeMo-Relay

mnajafian-nv · 2026-06-04T22:53:59Z

Overview

This PR tightens Hermes error-path observability consistency by adding exporter-visible api_request_error coverage for ATOF, ATIF, and OpenInference, and by fixing OpenInference metadata so mixed-fidelity Hermes spans reflect the completed event rather than the request start.

I confirm this contribution is my own work, or I have the right to submit it under this project's license.
I searched existing issues and open pull requests, and this does not duplicate existing work.

Details

- Adds Hermes ATOF validation that api_request_error exports a lossy error-path LLM end event with the expected api_call_id, status, retry fields, error payload, and fidelity metadata.
- Adds Hermes ATIF validation that api_request_error produces an exportable agent step with the expected error response fields and observed-event fidelity markers.
- Adds Hermes OpenInference validation that api_request_error produces JSON output, JSON mime type, and final-event fidelity metadata on the finished LLM span.
- Fixes OpenInference LLM span metadata handling so finished spans reflect end-event metadata instead of retaining start-event metadata for mixed-fidelity Hermes flows.
- Keeps the change focused on Hermes error-path consistency without expanding into unrelated tool-result or routed-provider follow-up work.

Validated with:

cargo test -p nemo-relay-cli serve_listener_hermes_api_request_error_writes_lossy_atof_error_event -- --nocapture
cargo test -p nemo-relay-cli hermes_api_request_error_writes_atif_error_step_and_fidelity -- --nocapture
cargo test -p nemo-relay hermes_exact_api_payloads_emit_openinference_text_usage_and_metadata -- --nocapture
cargo test -p nemo-relay hermes_api_request_error_emits_openinference_json_output_and_metadata -- --nocapture
cargo test -p nemo-relay-cli hermes -- --nocapture
uv run pre-commit run --all-files

Where should the reviewer start?

Start in crates/core/src/observability/openinference.rs for the metadata handling fix, then review crates/cli/tests/coverage/server_tests.rs for the Hermes ATOF error-path regression, crates/cli/tests/coverage/session_tests.rs for the ATIF error-path regression, and crates/core/tests/unit/observability/openinference_tests.rs for the OpenInference mixed-fidelity error-path coverage.

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

Relates to Hermes observability consistency work.

Summary by CodeRabbit

Tests
- Added integration tests for gateway error handling with observability event tracking
- Added tests for routed upstream request tracking with usage and cost metrics
- Added test for session error reporting with fidelity metadata
Bug Fixes
- Improved metadata handling in observability event spans for accurate error tracking
Documentation
- Updated dependency license attributions

Signed-off-by: mnajafian-nv <mnajafian@nvidia.com>

coderabbitai · 2026-06-04T22:54:11Z

Walkthrough

OpenInference metadata attribute handling is updated to ensure LLM error events properly track fidelity: start spans strip oi::METADATA, while end spans conditionally append it when event metadata is serializable. This change is validated across unit tests, ATIF session integration, and ATOF server integration tests. License attributions for multiple Rust crates are updated to reflect Apache-2.0 as the primary license.
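
The start/end split described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in `crates/core/src/observability/openinference.rs`: `LlmEvent`, `start_span_attributes`, `end_span_attributes`, and the literal value of `METADATA_KEY` are hypothetical stand-ins, and metadata is modeled as an already-serialized JSON string so the sketch needs no external crates.

```rust
/// Stand-in for the `oi::METADATA` attribute key; the literal value is an assumption.
const METADATA_KEY: &str = "metadata";

/// Hypothetical, simplified event: flat string attributes plus optional
/// metadata that has already been serialized to JSON (None = not serializable).
pub struct LlmEvent {
    pub attributes: Vec<(String, String)>,
    pub metadata_json: Option<String>,
}

/// Attributes recorded when the span starts: everything except metadata,
/// so a request-time fidelity marker cannot stick to the finished span.
pub fn start_span_attributes(start: &LlmEvent) -> Vec<(String, String)> {
    start
        .attributes
        .iter()
        .filter(|(key, _)| key.as_str() != METADATA_KEY)
        .cloned()
        .collect()
}

/// Attributes on the finished span: the filtered start attributes, plus the
/// end event's metadata when (and only when) it serialized successfully.
pub fn end_span_attributes(start: &LlmEvent, end: &LlmEvent) -> Vec<(String, String)> {
    let mut attrs = start_span_attributes(start);
    if let Some(json) = &end.metadata_json {
        attrs.push((METADATA_KEY.to_string(), json.clone()));
    }
    attrs
}
```

With a start event marked `provider_payload_exact=true` and an error end event marked `provider_payload_exact=false`, the finished span in this sketch carries only the end event's value, which is the mixed-fidelity behavior the PR describes.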

Changes

Observability Error Path and Metadata Tracking

| Layer / File(s) | Summary |
| --- | --- |
| OpenInference metadata attribute logic `crates/core/src/observability/openinference.rs` | LLM start spans filter out `oi::METADATA`; end spans conditionally add it when event metadata serializes to JSON, enabling metadata on error-path responses. |
| OpenInference Hermes error event unit test `crates/core/tests/unit/observability/openinference_tests.rs` | New test validates Hermes LLM error spans emit lossy metadata (`provider_payload_exact=false`), JSON-formatted error output (status, retry count, retryable, reason, message), and `application/json` mime type. |
| ATOF observability integration tests `crates/cli/tests/coverage/server_tests.rs` | Two new server tests verify ATOF event export: one checks lossy fidelity markers on Hermes api_request_error; the other verifies routed gateway requests to mock Anthropic/OpenAI upstreams emit complete metadata (model, tool-call IDs, usage, cached tokens). |
| ATIF session integration test with error fidelity `crates/cli/tests/coverage/session_tests.rs` | New test validates ATIF step generation for Hermes api_request_error, asserting error detail fields and fidelity markers (exact payload for requests, lossy for errors) in observed events. |
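
To make the error-output row concrete, here is a rough sketch of an error payload rendered as a JSON string paired with the `application/json` mime type. The struct, its field names, and `error_output` are hypothetical and loosely mirror the fields listed in the table. The real exporter presumably serializes through `serde_json`; the hand-rolled formatting here only keeps the sketch dependency-free.

```rust
/// Hypothetical error payload; field names are assumptions based on the summary.
pub struct ApiRequestError {
    pub status: u16,
    pub retry_count: u32,
    pub retryable: bool,
    pub reason: String,
    pub message: String,
}

/// Escapes the characters that must be escaped inside a JSON string.
fn escape_json(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for ch in input.chars() {
        match ch {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Returns the span output value and its mime type for an error-path event.
pub fn error_output(err: &ApiRequestError) -> (String, &'static str) {
    let value = format!(
        "{{\"status\":{},\"retry_count\":{},\"retryable\":{},\"reason\":\"{}\",\"message\":\"{}\"}}",
        err.status,
        err.retry_count,
        err.retryable,
        escape_json(&err.reason),
        escape_json(&err.message)
    );
    (value, "application/json")
}
```

A test in the spirit of the one described would then assert on both halves of the returned pair: the parsed fields of the output value and the mime type.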

License Attribution Updates

| Layer / File(s) | Summary |
| --- | --- |
| Multiple crate license attribution updates `ATTRIBUTIONS-Rust.md` | Crates `block-buffer`, `crypto-common`, `digest`, `md-5`, and `version_check` transition from dual MIT/Apache to Apache-2.0 primary license with embedded MIT license sections removed or restructured. `generic-array` inline MIT license text replaces file reference. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

NVIDIA/NeMo-Relay#219: New ATIF Hermes error test directly extends existing Hermes ATIF tests in same file, both validating provider_payload_exact/fidelity_source observability contracts.
NVIDIA/NeMo-Relay#215: Both PRs extend Hermes observability testing and OpenInference metadata handling via same wrapped observability contract for error-path payloads and fidelity markers.
NVIDIA/NeMo-Relay#220: Both PRs modify crates/core/tests/unit/observability/openinference_tests.rs to validate OpenInference span attribute/metadata behavior in different error/fallback paths.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 62.50% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | Title follows Conventional Commits format with type 'fix', proper scope, and concise imperative summary under 72 characters. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description check | ✅ Passed | The PR description covers all required template sections with clear details: overview with confirmation checkboxes, comprehensive technical details of changes, explicit guidance on review order, and related issues context. |

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@ATTRIBUTIONS-Rust.md`:
- Line 3127: The generated crate attribution blocks emit headings like "### License: ..." without the required blank lines above and below, causing MD022 failures. Update the attribution renderer/template (e.g., the function or template that builds each crate block; look for names like render_crate_block / render_attribution_block or the license_heading/template fragment that produces "### License:") to always insert a blank line before the "### License: <url>" heading and a blank line after it (i.e., ensure the heading is surrounded by single empty lines in the emitted markdown for every crate), and apply this change globally so all occurrences (including the reported crate blocks) are fixed.
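
One way to apply the requested fix is to route every heading through a single helper that normalizes the surrounding blank lines. This is only a sketch: `push_heading` is a hypothetical name, the real attribution renderer is not shown in this PR, and the URL in the usage note is a placeholder.

```rust
/// Appends `heading` to `out` so that it is surrounded by exactly one blank
/// line on each side, which is what markdownlint's MD022 rule expects.
pub fn push_heading(out: &mut String, heading: &str) {
    // Drop any trailing newlines so the spacing above the heading is exact.
    let trimmed_len = out.trim_end_matches('\n').len();
    out.truncate(trimmed_len);
    if !out.is_empty() {
        // End the previous line, then leave one blank line above the heading.
        out.push_str("\n\n");
    }
    out.push_str(heading);
    // End the heading line, then leave one blank line below it.
    out.push_str("\n\n");
}
```

Calling `push_heading(&mut out, "### License: https://example.com/LICENSE")` on a buffer that ends with crate text, and then appending the license body, gives a heading with one empty line above and one below, however many newlines the buffer ended with.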

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: ddb72c54-84cc-431f-bc61-807de1b8bd52

📥 Commits

Reviewing files that changed from the base of the PR and between f5195ae and eb5dea1.

📒 Files selected for processing (5)

ATTRIBUTIONS-Rust.md
crates/cli/tests/coverage/server_tests.rs
crates/cli/tests/coverage/session_tests.rs
crates/core/src/observability/openinference.rs
crates/core/tests/unit/observability/openinference_tests.rs

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Check / Run
GitHub Check: Preview docs

🧰 Additional context used

📓 Path-based instructions (25)

**/*.rs

📄 CodeRabbit inference engine (.agents/skills/add-binding-feature/SKILL.md)

Use snake_case naming convention for Rust identifiers (e.g., nemo_relay_tool_call)

**/*.rs: Any Rust change must run just test-rust
Any Rust change must run cargo fmt --all
Any Rust change must run cargo clippy --workspace --all-targets -- -D warnings

**/*.rs: Run cargo fmt --all for all FFI work since it is Rust work
Run just test-rust to validate FFI changes
Run cargo clippy --workspace --all-targets -- -D warnings to enforce strict linting on FFI work

When Rust files changed as part of Go work, also run cargo fmt --all, just test-rust, and cargo clippy --workspace --all-targets -- -D warnings

**/*.rs: Run cargo fmt --all when Rust files are changed as part of Node work
Run cargo clippy --workspace --all-targets -- -D warnings when Rust files are changed as part of Node work
Run just test-rust when Rust files are changed as part of Node work

**/*.rs: Run cargo fmt --all to format all Rust code
Run cargo clippy --workspace --all-targets -- -D warnings to enforce all clippy lints as errors

**/*.rs: Run cargo fmt --all when Rust files changed as part of WebAssembly work
Run cargo clippy --workspace --all-targets -- -D warnings when Rust files changed as part of WebAssembly work

**/*.rs: If any Rust code changed, always run just test-rust
If any Rust code changed, also run cargo fmt --all
If any Rust code changed, also run cargo clippy --workspace --all-targets -- -D warnings
Run Rust formatting with cargo fmt --all
Run Rust linting with cargo clippy --workspace --all-targets -- -D warnings

**/*.rs: Use cargo fmt for Rust code formatting
Run cargo clippy -- -D warnings to lint Rust code and treat all warnings as errors
Use Rust snake_case naming convention for Rust identifiers
Include SPDX license header in all Rust source files using double-slash comment syntax
Validate Rust code with uv run pre-commit run --all-files to enforce cargo fmt formatting check, cargo clippy lints, and cargo deny aud...

Files:

crates/core/src/observability/openinference.rs
crates/cli/tests/coverage/session_tests.rs
crates/core/tests/unit/observability/openinference_tests.rs
crates/cli/tests/coverage/server_tests.rs

crates/core/src/observability/{atif,otel,openinference}.rs

📄 CodeRabbit inference engine (.agents/skills/maintain-observability/SKILL.md)

When changing event fields in ATIF, OpenTelemetry, or OpenInference observability surfaces, keep the core event model in crates/core/src/observability/atif.rs, crates/core/src/observability/otel.rs, and crates/core/src/observability/openinference.rs in sync

Files:

crates/core/src/observability/openinference.rs

**/{Cargo.toml,**/*.rs}

📄 CodeRabbit inference engine (.agents/skills/maintain-packaging/SKILL.md)

Maintain consistency between Rust package names in Cargo.toml and their actual usage across the codebase

Files:

crates/core/src/observability/openinference.rs
crates/cli/tests/coverage/session_tests.rs
crates/core/tests/unit/observability/openinference_tests.rs
crates/cli/tests/coverage/server_tests.rs

**/*.{h,hpp,c,cpp,rs}

📄 CodeRabbit inference engine (.agents/skills/maintain-packaging/SKILL.md)

Ensure FFI header and library naming follows consistent conventions across platform-specific builds

Files:

crates/core/src/observability/openinference.rs
crates/cli/tests/coverage/session_tests.rs
crates/core/tests/unit/observability/openinference_tests.rs
crates/cli/tests/coverage/server_tests.rs

{crates/core,crates/adaptive}/**/*

📄 CodeRabbit inference engine (.agents/skills/prepare-pr/SKILL.md)

Changes to crates/core or crates/adaptive must run the full language matrix

Files:

crates/core/src/observability/openinference.rs
crates/core/tests/unit/observability/openinference_tests.rs

**/*.{rs,toml}

📄 CodeRabbit inference engine (.agents/skills/rename-surfaces/SKILL.md)

Update Rust crate names and module prefixes during coordinated rename operations

Files:

crates/core/src/observability/openinference.rs
crates/cli/tests/coverage/session_tests.rs
crates/core/tests/unit/observability/openinference_tests.rs
crates/cli/tests/coverage/server_tests.rs

crates/core/**/*.rs

📄 CodeRabbit inference engine (.agents/skills/test-go-binding/SKILL.md)

If the change touched crates/core or shared runtime semantics, also use validate-change for broader validation

crates/core/**/*.rs: Use Json = serde_json::Value in Rust-facing runtime APIs where the existing code expects JSON payloads.
Use Result<T> with FlowError in core runtime paths. Keep errors explicit and binding-appropriate at the wrapper layer.

Files:

crates/core/src/observability/openinference.rs
crates/core/tests/unit/observability/openinference_tests.rs

crates/{core,adaptive}/**

📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)

If crates/core or crates/adaptive changed, run the full matrix across Rust, Python, Go, Node.js, and WebAssembly

Files:

crates/core/src/observability/openinference.rs
crates/core/tests/unit/observability/openinference_tests.rs

**/*.{rs,py,js,ts,tsx,jsx,go,sh,toml,yaml,yml,md}

📄 CodeRabbit inference engine (AGENTS.md)

Keep SPDX headers on source, docs, scripts, and configuration files. The project is Apache-2.0.

Files:

crates/core/src/observability/openinference.rs
crates/cli/tests/coverage/session_tests.rs
crates/core/tests/unit/observability/openinference_tests.rs
crates/cli/tests/coverage/server_tests.rs
ATTRIBUTIONS-Rust.md

**/*.{rs,py,go,js,ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

Follow binding naming conventions: Rust and Python use snake_case, C FFI exports prefixed nemo_relay_, Go uses PascalCase for public APIs, Node.js uses camelCase.

Files:

crates/core/src/observability/openinference.rs
crates/cli/tests/coverage/session_tests.rs
crates/core/tests/unit/observability/openinference_tests.rs
crates/cli/tests/coverage/server_tests.rs

crates/**/*.rs

📄 CodeRabbit inference engine (AGENTS.md)

crates/**/*.rs: Keep async behavior on the existing tokio-based model. Bindings should preserve callback and future lifetimes rather than blocking or hiding async work unexpectedly.
Use Json = serde_json::Value in Rust-facing runtime APIs for JSON payload handling.

Files:

crates/core/src/observability/openinference.rs
crates/cli/tests/coverage/session_tests.rs
crates/core/tests/unit/observability/openinference_tests.rs
crates/cli/tests/coverage/server_tests.rs

crates/{core,adaptive}/**/*.rs

⚙️ CodeRabbit configuration file

crates/{core,adaptive}/**/*.rs: Review the Rust runtime for async correctness, scope isolation, middleware ordering, and event lifecycle regressions.
Pay close attention to task-local/thread-local scope propagation, callback lifetimes, stream finalization, and root_uuid isolation.
Public API changes should preserve existing behavior unless tests and docs show the intended migration path.

Files:

crates/core/src/observability/openinference.rs
crates/core/tests/unit/observability/openinference_tests.rs

**

⚙️ CodeRabbit configuration file

**:

AGENTS.md

This file provides guidance to agents, including Claude Code and OpenAI Codex, when working in this repository.

Project Overview

NeMo Relay is a multi-language agent runtime framework for execution scopes, lifecycle events, middleware, plugins, and observability around tool and LLM calls. The core runtime is Rust. Primary supported bindings are Rust, Python, and Node.js. Go, WebAssembly, and the raw C FFI are experimental and source-first.

The shared runtime model is:

Scope stacks decide where work belongs and which scope-local behavior is visible.

Middleware registries decide what guardrails and intercepts run around managed calls.

Plugins install reusable runtime behavior from configuration.

Events record runtime behavior in ATOF form.

Subscribers and exporters consume events in-process or export them to ATIF, OpenTelemetry, OpenInference, or other backends.

Repository Structure

The repository layout separates the Rust runtime, language bindings, documentation,
integration patches, and agent-facing skills.
crates/
  core/       # Rust core runtime crate, published as nemo-relay
  adaptive/   # Adaptive runtime primitives and plugin components
  python/     # PyO3 native extension for the Python package
  ffi/        # Raw C ABI layer used by downstream bindings such as Go
  node/       # NAPI Node.js binding and JavaScript/TypeScript entry points
  wasm/       # wasm-bindgen WebAssembly binding and JS wrappers
python/
  nemo_relay/  # Python wrapper package: scopes, tools, LLM, middleware, typed helpers, plugins, adaptive helpers
  tests/      # Python tests
go/
  nemo_relay/  # Experimental Go CGo binding and tests
fern/         # Fern documentation site
scripts/      # Stable wrappers and helper scripts; build/test/docs entry points live in justfile
third_party/  # P...

Files:

crates/core/src/observability/openinference.rs
crates/cli/tests/coverage/session_tests.rs
crates/core/tests/unit/observability/openinference_tests.rs
crates/cli/tests/coverage/server_tests.rs
ATTRIBUTIONS-Rust.md

{crates/adaptive/**/*.rs,**/*test*.{rs,py,go,ts,js},**/*adaptive*test*.{rs,py,go,ts,js},docs/plugins/adaptive/**}

📄 CodeRabbit inference engine (.agents/skills/maintain-optimizer/SKILL.md)

Maintain documented and tested validation and report behavior for adaptive surfaces

Files:

crates/cli/tests/coverage/session_tests.rs
crates/core/tests/unit/observability/openinference_tests.rs
crates/cli/tests/coverage/server_tests.rs

{crates/**/tests/**,python/tests/**,go/nemo_relay/**/*_test.go}

⚙️ CodeRabbit configuration file

{crates/**/tests/**,python/tests/**,go/nemo_relay/**/*_test.go}: Tests should cover the behavior promised by the changed API surface, including error paths and cross-request isolation where relevant.
Prefer assertions on lifecycle events, scope stacks, middleware ordering, and binding parity over shallow smoke tests.

Files:

crates/cli/tests/coverage/session_tests.rs
crates/core/tests/unit/observability/openinference_tests.rs
crates/cli/tests/coverage/server_tests.rs

**/*.{md,rst,html,txt}

📄 CodeRabbit inference engine (.agents/skills/review-doc-style/assets/nvidia-style-brand-terminology.md)

**/*.{md,rst,html,txt}: Always spell NVIDIA in all caps. Do not use Nvidia, nvidia, nVidia, nVIDIA, or NV.
Use an NVIDIA before a noun because the name starts with an 'en' sound.
Do not add a registered trademark symbol after NVIDIA when referring to the company.
Use trademark symbols with product names only when the document type or legal guidance requires them.
Verify official capitalization, spacing, and hyphenation for product names.
Precede NVIDIA product names with NVIDIA on first mention when it is natural and accurate.
Do not rewrite product names for grammar or title-case rules.
Preserve third-party product names according to the owner's spelling.
Include the company name and full model qualifier on first use when it helps identify the model.
Preserve the official capitalization and punctuation of model names.
Use shorter family names only after the full name is established.
Spell out a term on first use and put the acronym in parentheses unless the acronym is widely understood by the intended audience.
Use the acronym on later mentions after it has been defined.
For long documents, reintroduce the full term if readers might lose context.
Form plurals of acronyms with s, not an apostrophe, such as GPUs.
In headings, common acronyms can remain abbreviated. Spell out the term in the first or second sentence of the body.
Common terms such as CPU, GPU, PC, API, and UI usually do not need to be spelled out for developer audiences.

Files:

ATTRIBUTIONS-Rust.md

**/*.{md,rst,html}

📄 CodeRabbit inference engine (.agents/skills/review-doc-style/assets/nvidia-style-brand-terminology.md)

Link the first mention of a product name when the destination helps the reader.

Files:

ATTRIBUTIONS-Rust.md

**/*.md

📄 CodeRabbit inference engine (.agents/skills/contribute-integration/SKILL.md)

Documentation must be updated if activation or usage changed

**/*.md: Use title case consistently in technical documentation headings
Avoid quotation marks, ampersands, and exclamation marks in headings
Keep product, event, research, and whitepaper names in their official title case
Use title case for table headers
Do not force social-media sentence case into technical docs
Format code elements, commands, parameters, package names, and expressions in monospace
Format directories, file names, and paths in monospace using backticks
Use angle brackets inside monospace for variables inside paths, such as /home/<username>/.login
Format error messages and strings in quotation marks, keeping literal code strings in code formatting when clearer
Format UI buttons, menus, fields, and labels in bold
Use angle brackets between UI labels for menu paths, such as File > Save As
Use italics for new terms on first use, sparingly and only when introducing the term
Use italics for publication titles
Format keyboard shortcuts in plain text, such as Press Ctrl+Alt+Delete
Use owner/repo link text for GitHub repositories, preferring [NVIDIA/NeMo](link) over prose references like 'the GitHub repo'
Introduce every code block with a complete sentence
Do not make a code block complete the grammar of the previous sentence
Do not continue a sentence after a code block
Use syntax highlighting when the format supports it for code blocks
Avoid the word 'snippet' unless the surrounding docs already use it as a term of art
Keep inline method, function, and class references consistent with nearby docs, omitting empty parentheses for prose readability when no call is shown
Use descriptive anchor text that matches the destination title when possible for links
Avoid raw URLs in running text
Avoid generic anchor text such as 'here,' 'this page,' and 'read more'
Include acronyms in link text when a linked term includes an acronym
Do not link long sentences or multiple sentences
Avoid links ...

Files:

ATTRIBUTIONS-Rust.md

**/{docs,examples,**/*.md,*.patch,*.diff,.github,*.sh,*.yaml,*.yml}

📄 CodeRabbit inference engine (.agents/skills/rename-surfaces/SKILL.md)

Update documentation, examples, CI configuration, and patch artifacts when performing rename operations

Files:

ATTRIBUTIONS-Rust.md

**/*.{md,rst,txt}

📄 CodeRabbit inference engine (.agents/skills/review-doc-style/assets/nvidia-style-guide.md)

Spell NVIDIA in all caps. Do not use Nvidia, nvidia, or NV.

Files:

ATTRIBUTIONS-Rust.md

**/*.{md,rst}

📄 CodeRabbit inference engine (.agents/skills/review-doc-style/assets/nvidia-style-guide.md)

**/*.{md,rst}: Format commands, code elements, expressions, package names, file names, and paths as inline code.
Use descriptive link text. Avoid raw URLs and weak anchors such as "here" or "read more."
Use title case consistently for technical documentation headings.
Introduce code blocks, lists, tables, and images with complete sentences.
Write procedures as imperative steps. Keep steps parallel and split long procedures into smaller tasks.
Prefer active voice, present tense, short sentences, contractions, and plain English.
Use can for possibility and reserve may for permission.
Use after for temporal relationships instead of once.
Prefer refer to over see when the wording points readers to another resource.
Avoid culture-specific idioms, unnecessary Latinisms, jokes, and marketing exaggeration in technical docs.
Spell out months in body text, avoid ordinal dates, and use clear time zones.
Spell out whole numbers from zero through nine unless they are technical values, parameters, versions, or UI values.
Use numerals for 10 or greater and include commas in thousands.
Do not add trademark symbols to learning-oriented docs unless the source, platform, or legal guidance explicitly requires them.

Files:

ATTRIBUTIONS-Rust.md

{docs/**,README.md,CONTRIBUTING.md,**/*.md}

📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)

Run docs link validation with just docs-linkcheck when links change

Files:

ATTRIBUTIONS-Rust.md

{docs/**,README.md,**/Cargo.toml,**/package.json,**/*.md}

📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)

Ensure renamed public surfaces are reflected consistently in manifests and docs for large or public-facing changes

Files:

ATTRIBUTIONS-Rust.md

**/*.{md,mdx,py,sh,yaml,yml,toml,json}

📄 CodeRabbit inference engine (.agents/skills/contribute-docs/SKILL.md)

Keep package names, repo references, and build commands current

Files:

ATTRIBUTIONS-Rust.md

**/*.{html,md,mdx}

📄 CodeRabbit inference engine (CONTRIBUTING.md)

Include SPDX license header in HTML and Markdown files using HTML comment syntax

Files:

ATTRIBUTIONS-Rust.md

🪛 markdownlint-cli2 (0.22.1)

ATTRIBUTIONS-Rust.md

[warning] 3127-3127: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Above
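
For a quick local check before rerunning markdownlint, the condition behind this warning can be approximated in a few lines. The function below is a simplified sketch, not markdownlint's implementation: it only looks at ATX headings (`#` through `######`) and does not skip fenced code blocks, so `#` comment lines inside fences would be reported too. Both function names are hypothetical.

```rust
/// Returns true for ATX headings: one to six '#' characters followed by a space.
fn is_atx_heading(line: &str) -> bool {
    let hashes = line.chars().take_while(|c| *c == '#').count();
    (1..=6).contains(&hashes) && line[hashes..].starts_with(' ')
}

/// Returns the 1-based line numbers of headings that lack a blank line
/// directly above or below. The first and last lines of the document count
/// as having a blank neighbor on their outer side.
pub fn headings_missing_blank_lines(markdown: &str) -> Vec<usize> {
    let lines: Vec<&str> = markdown.lines().collect();
    let mut offenders = Vec::new();
    for (idx, line) in lines.iter().enumerate() {
        if !is_atx_heading(line.trim_start()) {
            continue;
        }
        let blank_above = idx == 0 || lines[idx - 1].trim().is_empty();
        let blank_below = idx + 1 == lines.len() || lines[idx + 1].trim().is_empty();
        if !(blank_above && blank_below) {
            offenders.push(idx + 1);
        }
    }
    offenders
}
```

A heading that directly follows a line of text, as in the "Actual: 0; Above" case reported here, shows up in the returned list; once the blank line is added, the list is empty.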