Add E2E tests for Simple Calculator Observability example#1019
Signed-off-by: David Gardner <dagardner@nvidia.com>
…to david-observe-simple-calc-e2e Signed-off-by: David Gardner <dagardner@nvidia.com>
…roduction Monitoring Platforms' heading, implying that Phoenix is only for local dev Signed-off-by: David Gardner <dagardner@nvidia.com>
Walkthrough
Removed OpenAI LLM entries from observability example configs, expanded README platform headings/links, and added integration tests and fixtures for Weave, Phoenix, and OpenTelemetry observability backends plus test-support fixtures for WANDB/weave.
Changes
Sequence Diagram(s)
sequenceDiagram
autonumber
participant Test as Test Suite
participant Config as YAML Config Loader
participant Workflow as Calculator Workflow
participant Tracer as Observability Backend
Test->>Config: load config (weave/phoenix/otel)
Config-->>Test: config ready
Test->>Tracer: init tracer/project (if needed)
Test->>Workflow: run_workflow(question)
activate Workflow
Workflow->>Workflow: execute steps / call LLM (nim_llm)
Workflow-->>Test: return result
deactivate Workflow
Workflow->>Tracer: emit traces/events
Test->>Tracer: fetch/validate traces
Tracer-->>Test: traces validated
Test->>Tracer: cleanup (project/trace artifacts)
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
Actionable comments posted: 1
🧹 Nitpick comments (2)
examples/observability/simple_calculator_observability/README.md (1)
20-20: Use the correct product name capitalization
Prefer “NeMo Agent Toolkit” (capital T), not “NeMo Agent toolkit.”
Apply this diff:
-This example demonstrates how to implement **observability and tracing capabilities** using the NVIDIA NeMo Agent toolkit. You'll learn to monitor, trace, and analyze your AI agent's behavior in real-time using the Simple Calculator workflow.
+This example demonstrates how to implement **observability and tracing capabilities** using the NVIDIA NeMo Agent Toolkit. You'll learn to monitor, trace, and analyze your AI agent's behavior in real-time using the Simple Calculator workflow.
As per coding guidelines
packages/nvidia_nat_test/src/nat/test/plugin.py (1)
238-250: Move return statement to else block.
The return weave statement at line 245 should be in an else block to clarify that it only executes when the import succeeds, improving code structure and addressing the static analysis hint.
Apply this diff:
 @pytest.fixture(name="weave", scope='session')
 def require_weave_fixture(fail_missing: bool) -> types.ModuleType:
     """
     Use for integration tests that require Weave to be running.
     """
     try:
         import weave
-        return weave
     except Exception as e:
         reason = "Weave must be installed to run weave based tests"
         if fail_missing:
             raise RuntimeError(reason) from e
         pytest.skip(reason=reason)
+    else:
+        return weave
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
examples/observability/simple_calculator_observability/README.md (3 hunks)
examples/observability/simple_calculator_observability/configs/config-catalyst.yml (0 hunks)
examples/observability/simple_calculator_observability/configs/config-galileo.yml (0 hunks)
examples/observability/simple_calculator_observability/configs/config-langfuse.yml (0 hunks)
examples/observability/simple_calculator_observability/configs/config-langsmith.yml (0 hunks)
examples/observability/simple_calculator_observability/configs/config-patronus.yml (0 hunks)
examples/observability/simple_calculator_observability/configs/config-phoenix.yml (1 hunks)
examples/observability/simple_calculator_observability/configs/config-weave.yml (1 hunks)
examples/observability/simple_calculator_observability/tests/test_simple_calc_observability.py (1 hunks)
packages/nvidia_nat_test/src/nat/test/plugin.py (2 hunks)
💤 Files with no reviewable changes (5)
- examples/observability/simple_calculator_observability/configs/config-catalyst.yml
- examples/observability/simple_calculator_observability/configs/config-langsmith.yml
- examples/observability/simple_calculator_observability/configs/config-langfuse.yml
- examples/observability/simple_calculator_observability/configs/config-galileo.yml
- examples/observability/simple_calculator_observability/configs/config-patronus.yml
🧰 Additional context used
📓 Path-based instructions (10)
**/*.{yaml,yml}
📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)
In workflow/config YAML, set llms.._type: nat_test_llm to stub responses.
Files:
examples/observability/simple_calculator_observability/configs/config-phoenix.yml
examples/observability/simple_calculator_observability/configs/config-weave.yml
**/*.{py,yaml,yml}
📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)
**/*.{py,yaml,yml}: Configure response_seq as a list of strings; values cycle per call, and [] yields an empty string.
Configure delay_ms to inject per-call artificial latency in milliseconds for nat_test_llm.
Files:
examples/observability/simple_calculator_observability/configs/config-phoenix.yml
examples/observability/simple_calculator_observability/configs/config-weave.yml
packages/nvidia_nat_test/src/nat/test/plugin.py
examples/observability/simple_calculator_observability/tests/test_simple_calc_observability.py
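The cycling behavior described for response_seq can be pictured with a small stand-in. This is a sketch of the documented semantics only (values cycle per call, an empty list yields an empty string, delay_ms adds per-call latency); `CyclingResponder` is a hypothetical class and is not the toolkit's nat_test_llm implementation.

```python
import time
from itertools import cycle


class CyclingResponder:
    """Hypothetical stand-in that mimics the documented response_seq rules."""

    def __init__(self, response_seq: list[str], delay_ms: int = 0) -> None:
        # An empty sequence has nothing to cycle over, so remember that case.
        self._responses = cycle(response_seq) if response_seq else None
        self._delay_ms = delay_ms

    def next_response(self) -> str:
        """Return the next canned response, wrapping around at the end."""
        if self._delay_ms > 0:
            # Inject artificial per-call latency, expressed in milliseconds.
            time.sleep(self._delay_ms / 1000)
        if self._responses is None:
            return ""
        return next(self._responses)
```

With `CyclingResponder(["a", "b"])`, three consecutive calls return "a", "b", and then "a" again.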
**/configs/**
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Configuration files consumed by code must be stored next to that code in a configs/ folder
Files:
examples/observability/simple_calculator_observability/configs/config-phoenix.yml
examples/observability/simple_calculator_observability/configs/config-weave.yml
**/*
⚙️ CodeRabbit configuration file
**/*: # Code Review Instructions
- Ensure the code follows best practices and coding standards.
- For Python code, follow PEP 20 and PEP 8 for style guidelines.
- Check for security vulnerabilities and potential issues.
- Python methods should use type hints for all parameters and return values. Example: def my_function(param1: int, param2: str) -> bool: pass
- For Python exception handling, ensure proper stack trace preservation:
  - When re-raising exceptions: use bare raise statements to maintain the original stack trace, and use logger.error() (not logger.exception()) to avoid duplicate stack trace output.
  - When catching and logging exceptions without re-raising: always use logger.exception() to capture the full stack trace information.
Documentation Review Instructions
- Verify that documentation and comments are clear and comprehensive.
- Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum".
- Verify that the documentation doesn't contain any offensive or outdated terms.
- Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any words listed in the ci/vale/styles/config/vocabularies/nat/reject.txt file, words that might appear to be spelling mistakes but are listed in the ci/vale/styles/config/vocabularies/nat/accept.txt file are OK.
Misc.
- All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0, and should contain an Apache License 2.0 header comment at the top of each file.
- Confirm that copyright years are up to date whenever a file is changed.
Files:
examples/observability/simple_calculator_observability/configs/config-phoenix.yml
examples/observability/simple_calculator_observability/configs/config-weave.yml
packages/nvidia_nat_test/src/nat/test/plugin.py
examples/observability/simple_calculator_observability/README.md
examples/observability/simple_calculator_observability/tests/test_simple_calc_observability.py
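The two exception-handling rules in the instructions above can be contrasted side by side. This is a minimal sketch using only the standard logging module; the function names and the port-parsing scenario are hypothetical and do not come from the PR.

```python
import logging

logger = logging.getLogger(__name__)


def parse_port(raw: str) -> int:
    """Re-raising case: log with logger.error() and use a bare raise."""
    try:
        return int(raw)
    except ValueError:
        # logger.error() omits the traceback, so the caller that finally
        # handles the exception does not print it a second time.
        logger.error("Invalid port value: %r", raw)
        raise


def parse_port_or_default(raw: str, default: int = 8080) -> int:
    """Non-re-raising case: logger.exception() records the full traceback."""
    try:
        return int(raw)
    except ValueError:
        # The exception stops here, so this is the only place where the
        # stack trace can be captured.
        logger.exception("Invalid port value %r, using %d", raw, default)
        return default
```

The bare `raise` keeps the original traceback intact, whereas `raise e` would add the re-raise site to it.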
examples/**/*
⚙️ CodeRabbit configuration file
examples/**/*:
- This directory contains example code and usage scenarios for the toolkit, at a minimum an example should contain a README.md or file README.ipynb.
- If an example contains Python code, it should be placed in a subdirectory named src/ and should contain a pyproject.toml file. Optionally, it might also contain scripts in a scripts/ directory.
- If an example contains YAML files, they should be placed in a subdirectory named configs/.
- If an example contains sample data files, they should be placed in a subdirectory named data/, and should be checked into git-lfs.
Files:
examples/observability/simple_calculator_observability/configs/config-phoenix.yml
examples/observability/simple_calculator_observability/configs/config-weave.yml
examples/observability/simple_calculator_observability/README.md
examples/observability/simple_calculator_observability/tests/test_simple_calc_observability.py
**/*.py
📄 CodeRabbit inference engine (.cursor/rules/nat-test-llm.mdc)
**/*.py: Programmatic use: create TestLLMConfig(response_seq=[...], delay_ms=...), add with builder.add_llm("", cfg).
When retrieving the test LLM wrapper, use builder.get_llm(name, wrapper_type=LLMFrameworkEnum.) and call the framework’s method (e.g., ainvoke, achat, call).
**/*.py: In code comments/identifiers use NAT abbreviations as specified: nat for API namespace/CLI, nvidia-nat for package name, NAT for env var prefixes; do not use these abbreviations in documentation
Follow PEP 20 and PEP 8; run yapf with column_limit=120; use 4-space indentation; end files with a single trailing newline
Run ruff check --fix as linter (not formatter) using pyproject.toml config; fix warnings unless explicitly ignored
Respect naming: snake_case for functions/variables, PascalCase for classes, UPPER_CASE for constants
Treat pyright warnings as errors during development
Exception handling: use bare raise to re-raise; log with logger.error() when re-raising to avoid duplicate stack traces; use logger.exception() when catching without re-raising
Provide Google-style docstrings for every public module, class, function, and CLI command; first line concise and ending with a period; surround code entities with backticks
Validate and sanitize all user input, especially in web or CLI interfaces
Prefer httpx with SSL verification enabled by default and follow OWASP Top-10 recommendations
Use async/await for I/O-bound work; profile CPU-heavy paths with cProfile or mprof before optimizing; cache expensive computations with functools.lru_cache or external cache; leverage NumPy vectorized operations when beneficial
Files:
packages/nvidia_nat_test/src/nat/test/plugin.py
examples/observability/simple_calculator_observability/tests/test_simple_calc_observability.py
packages/*/src/**/*.py
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Importable Python code inside packages must live under packages//src/
Files:
packages/nvidia_nat_test/src/nat/test/plugin.py
{src/**/*.py,packages/*/src/**/*.py}
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
All public APIs must have Python 3.11+ type hints on parameters and return values; prefer typing/collections.abc abstractions; use typing.Annotated when useful
Files:
packages/nvidia_nat_test/src/nat/test/plugin.py
packages/**/*
⚙️ CodeRabbit configuration file
packages/**/*:
- This directory contains optional plugin packages for the toolkit, each should contain a pyproject.toml file.
- The pyproject.toml file should declare a dependency on nvidia-nat or another package with a name starting with nvidia-nat-. This dependency should be declared using ~=<version>, and the version should be a two digit version (ex: ~=1.0).
- Not all packages contain Python code, if they do they should also contain their own set of tests, in a tests/ directory at the same level as the pyproject.toml file.
Files:
packages/nvidia_nat_test/src/nat/test/plugin.py
**/README.@(md|ipynb)
📄 CodeRabbit inference engine (.cursor/rules/general.mdc)
Ensure READMEs follow the naming convention; avoid deprecated names; use “NeMo Agent Toolkit” (capital T) in headings
Files:
examples/observability/simple_calculator_observability/README.md
🪛 Ruff (0.14.0)
packages/nvidia_nat_test/src/nat/test/plugin.py
245-245: Consider moving this statement to an else block
(TRY300)
examples/observability/simple_calculator_observability/tests/test_simple_calc_observability.py
48-48: Unused function argument: wandb_api_key
(ARG001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: CI Pipeline / Check
🔇 Additional comments (12)
examples/observability/simple_calculator_observability/README.md (4)
56-77: Phoenix section rename looks good
Clear, concise, and consistent with the rest of the doc.
79-100: File-based tracing header update is good
Instructions are straightforward; no issues spotted.
103-103: Langfuse hyperlink update LGTM
Improves readability and navigability.
212-224: All referenced config files exist in examples/observability/simple_calculator_observability/configs
Verified that the directory contains exactly the eight YAML files listed in the table.
examples/observability/simple_calculator_observability/configs/config-weave.yml (1)
42-42: LGTM! Removal of unused OpenAI LLM configuration.
The removal of the openai_llm configuration aligns with the PR objectives. The workflow correctly references nim_llm (line 52), which remains configured.
packages/nvidia_nat_test/src/nat/test/plugin.py (1)
227-235: LGTM! Follows established fixture pattern.
The wandb_api_key fixture correctly follows the pattern established by other API key fixtures in this file, using require_env_variables to handle missing environment variables appropriately.
examples/observability/simple_calculator_observability/configs/config-phoenix.yml (1)
61-61: LGTM! Consistent removal of unused OpenAI LLM.
This change mirrors the cleanup in config-weave.yml. The workflow correctly uses nim_llm (line 71), which remains properly configured.
examples/observability/simple_calculator_observability/tests/test_simple_calc_observability.py (5)
16-24: LGTM! Standard test imports.
The imports are appropriate for the test scenarios and follow best practices.
27-44: LGTM! Well-structured test fixtures.
The fixtures for configuration directory, API key, question, and expected answer are appropriately scoped and follow pytest best practices.
64-71: LGTM! Well-structured Weave integration test.
The test correctly loads the config, overrides the project name with the test fixture, and validates the workflow. The use of the @pytest.mark.integration and @pytest.mark.usefixtures("wandb_api_key") decorators is appropriate.
74-80: LGTM! Phoenix integration test follows correct pattern.
The test appropriately loads the Phoenix configuration, overrides the trace endpoint URL with the fixture value, and validates the workflow execution.
84-107: LGTM! Comprehensive OTEL validation.
The test correctly:
- Creates a temporary output file for OTEL traces
- Loads and configures the OTEL file exporter
- Validates that traces were generated
- Verifies that the expected calculator_multiply function appears in the trace ancestry
The validation logic is thorough and appropriate for an E2E test.
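The validation steps listed above can be approximated with a short helper. This is only a sketch: the trace file schema is not shown in this conversation, so the one-JSON-object-per-line layout and the `function_ancestry` and `function_name` keys are assumptions, and `collect_function_names` is a hypothetical name rather than code from the PR.

```python
import json
from pathlib import Path


def collect_function_names(trace_file: Path) -> set[str]:
    """Collect function names from a JSON-lines trace file.

    Assumes one JSON object per line, each optionally carrying a
    "function_ancestry" object with a "function_name" key. Both key
    names are assumptions made for this sketch.
    """
    names: set[str] = set()
    for line in trace_file.read_text().splitlines():
        if not line.strip():
            continue  # Tolerate blank lines between records.
        span = json.loads(line)
        name = span.get("function_ancestry", {}).get("function_name")
        if name:
            names.add(name)
    return names
```

A test in this style would run the workflow with the file exporter pointed at a temporary path, then assert that the file is non-empty and that "calculator_multiply" is in the returned set.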
willkill07 left a comment
Approving, but changes need to be made to the README
/merge
Description
This example contains 8 distinct workflows, each requiring different keys/services. This PR adds E2E tests for the following workflows:
Remove unused LLM from configs
Miscellaneous documentation improvements
By submitting this PR I confirm:
Summary by CodeRabbit
Documentation
Changes
Tests