Add E2E test for RagaAI Catalyst #1194
Conversation
…ault, update instructions on generating an API key Signed-off-by: David Gardner <dagardner@nvidia.com>
Signed-off-by: David Gardner <dagardner@nvidia.com>
This reverts commit 275d000. Signed-off-by: David Gardner <dagardner@nvidia.com>
This reverts commit d7b8dbc. Signed-off-by: David Gardner <dagardner@nvidia.com>
…eep the traces won't be uploaded before the dataset is Signed-off-by: David Gardner <dagardner@nvidia.com>
Walkthrough
Updates add detailed Catalyst credential/project setup and a NAT_SPAN_PREFIX environment variable in the README, remove the Catalyst
Changes
Sequence Diagram(s)
sequenceDiagram
participant Test as Integration Test
participant Exporter as RAGATraceExporterOptWrite
participant Catalyst as Catalyst API / Dataset
participant FS as Local File (optional)
Note over Test,Exporter: Test runs workflow and emits OTEL spans
Test->>Exporter: invoke exporter (spans)
alt debug_mode = true
Exporter->>FS: write trace file (local debug artifact)
end
Exporter->>Catalyst: send spans / create dataset
Catalyst-->>Exporter: ack / dataset created
Exporter-->>Test: export result
Test->>Catalyst: poll Dataset API for dataset presence
Catalyst-->>Test: dataset found / not found
Estimated code review effort
🎯 3 (Moderate) | ⏱️ ~25 minutes
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
📜 Recent review detailsConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/nvidia_nat_ragaai/src/nat/plugins/ragaai/mixin/ragaai_catalyst_mixin.py (1)
36-49: Align `debug_mode` behavior with documented semantics for local trace files

The docstrings for `RAGATraceExporterOptWrite` (lines 42–43), `DynamicTraceExporterOptWrite` (lines 163–164), and `RagaAICatalystMixin.__init__` (lines 218–219) all consistently state:
- When `debug_mode` is False (default) → create local `rag_agent_traces.json`.
- When `debug_mode` is True → skip local file creation for cleaner operation.

However, the implementation at line 142 contradicts this:

    if self.debug_mode:
        with open(os.path.join(os.getcwd(), 'rag_agent_traces.json'), 'w', encoding="utf-8") as f:
            json.dump(ragaai_trace, f, cls=TracerJSONEncoder, indent=2)

This means:
- Default (`debug_mode=False`) does NOT create the file (contradicts docstring).
- Setting `debug_mode=True` DOES enable file creation (contradicts documented intent to skip).

To match the documented behavior, invert the condition:

    -if self.debug_mode:
    -    with open(os.path.join(os.getcwd(), 'rag_agent_traces.json'), 'w', encoding="utf-8") as f:
    -        json.dump(ragaai_trace, f, cls=TracerJSONEncoder, indent=2)
    +if not self.debug_mode:
    +    with open(os.path.join(os.getcwd(), 'rag_agent_traces.json'), 'w', encoding="utf-8") as f:
    +        json.dump(ragaai_trace, f, cls=TracerJSONEncoder, indent=2)

Alternatively, if the implementation intent is correct and `debug_mode=True` should enable local artifacts, update all three docstrings to reflect that `True` enables file creation and `False` disables it.

Also applies to: 163–164, 218–219
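As an illustration of the second option (keep the implementation, fix the docstrings), here is a minimal sketch of a helper whose docstring and behavior agree. The function name `maybe_write_trace` and its `directory` parameter are hypothetical and are not part of the plugin; only the file name `rag_agent_traces.json` comes from the code quoted above.

```python
import json
import os
from typing import Optional


def maybe_write_trace(trace: dict, debug_mode: bool, directory: str) -> Optional[str]:
    """Write the trace to a local JSON file only when debug_mode is True.

    Hypothetical helper for illustration. Returns the path of the written
    file, or None when debug_mode is False and nothing is written.
    """
    if not debug_mode:
        # Default: no local artifact is produced.
        return None
    path = os.path.join(directory, "rag_agent_traces.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(trace, f, indent=2)
    return path
```

Whichever direction is chosen, stating the default case first in the docstring makes a mismatch with the `if` condition easier to spot in review.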
🧹 Nitpick comments (4)
packages/nvidia_nat_test/src/nat/test/plugin.py (1)
350-379: Tighten Catalyst fixtures: unused dependency + teardown robustness

- `catalyst_project_name_fixture(catalyst_keys)` intentionally depends on `catalyst_keys` but never uses it, which Ruff flags (ARG001). To keep the dependency while making intent clear, consider renaming the parameter or marking it ignored, e.g.:

      -@pytest.fixture(name="catalyst_project_name")
      -def catalyst_project_name_fixture(catalyst_keys) -> str:
      +@pytest.fixture(name="catalyst_project_name")
      +def catalyst_project_name_fixture(_catalyst_keys) -> str:  # noqa: ARG001
           return os.environ.get("NAT_CI_CATALYST_PROJECT_NAME", "nat-e2e")

- In `catalyst_dataset_name_fixture`, any `ImportError` or API failure from `ragaai_catalyst.Dataset` during teardown will surface as a hard error rather than following the common `fail_missing`/skip convention used by other integration fixtures in this file (e.g., `galileo_project_fixture`, `weave`, `langsmith_client`). If you want consistent behavior, consider adding a `fail_missing: bool` parameter and handling missing `ragaai_catalyst` similarly:

      -@pytest.fixture(name="catalyst_dataset_name")
      -def catalyst_dataset_name_fixture(catalyst_project_name: str, project_name: str) -> str:
      +@pytest.fixture(name="catalyst_dataset_name")
      +def catalyst_dataset_name_fixture(catalyst_project_name: str,
      +                                  project_name: str,
      +                                  fail_missing: bool) -> str:
      @@
      -    from ragaai_catalyst import Dataset
      -    ds = Dataset(catalyst_project_name)
      -    if dataset_name in ds.list_datasets():
      -        ds.delete_dataset(dataset_name)
      +    try:
      +        from ragaai_catalyst import Dataset
      +        ds = Dataset(catalyst_project_name)
      +        if dataset_name in ds.list_datasets():
      +            ds.delete_dataset(dataset_name)
      +    except ImportError as e:
      +        reason = "Catalyst integration tests require the `ragaai_catalyst` package to be installed."
      +        if fail_missing:
      +            raise RuntimeError(reason) from e
      +        pytest.skip(reason=reason)

This keeps CI-friendly cleanup while aligning with the existing integration-fixture patterns.
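The `fail_missing`/skip convention can be sketched independently of pytest. The helper below is hypothetical (the name `import_optional` does not appear in the repository) and uses only the standard library; in a real fixture, the branch that returns `None` is where `pytest.skip` would be called instead.

```python
import importlib
from types import ModuleType
from typing import Optional


def import_optional(module_name: str, fail_missing: bool) -> Optional[ModuleType]:
    """Import an optional dependency, honoring a fail_missing flag.

    Hypothetical helper: raises RuntimeError when the module is missing and
    fail_missing is True; returns None when it is missing and fail_missing
    is False (a pytest fixture would skip the test at that point).
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as e:
        reason = f"Integration tests require the `{module_name}` package to be installed."
        if fail_missing:
            # Chain the original error so the missing-module traceback is kept.
            raise RuntimeError(reason) from e
        return None
```

Centralizing the decision in one place means every integration fixture reacts the same way to a missing package, which is the consistency the comment above is asking for.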
packages/nvidia_nat_ragaai/src/nat/plugins/ragaai/mixin/ragaai_catalyst_mixin.py (1)
59-152: Simplify `logger.exception` calls (drop redundant `e` and `exc_info=True`)

Functionally, the exception handling is fine and matches the guideline of using `logger.exception` when not re-raising. However, all of these calls:

    logger.exception("Error in convert_json_format function: %s: %s", trace_id, e, exc_info=True)
    ...
    logger.exception("Error converting trace %s: %s", trace_id, str(e), exc_info=True)
    ...
    logger.exception("Error exporting spans: %s", e, exc_info=True)

are more verbose than necessary:
- `logger.exception` already sets `exc_info=True` by default.
- Including `e` (or `str(e)`) in the message is usually redundant, since the stack trace will show the exception type and message.

To address Ruff TRY401 and keep logs clean, you can simplify to:

    -except Exception as e:
    -    logger.exception("Error in convert_json_format function: %s: %s", trace_id, e, exc_info=True)
    +except Exception:
    +    logger.exception("Error in convert_json_format function for trace %s", trace_id)

and similarly for the other blocks, including the final catch-all in `prepare_trace` and the one in `export_otel_spans`.

This keeps full stack traces while avoiding redundant arguments and lint warnings.
Also applies to: 248-248
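A small standard-library sketch can confirm the claim that the traceback survives the simplification. The logger name, the `convert_trace` function, and the error message are made up for the demonstration; only `logging.Logger.exception` itself is the real API under discussion.

```python
import io
import logging

logger = logging.getLogger("trace_export_demo")


def convert_trace(trace_id: str) -> None:
    """Hypothetical stand-in for a conversion step that fails."""
    try:
        raise ValueError("malformed span")
    except Exception:
        # No `e` argument and no exc_info=True: logger.exception attaches
        # the active exception's traceback on its own.
        logger.exception("Error converting trace %s", trace_id)


def capture_log_output() -> str:
    """Run convert_trace with a temporary in-memory handler and return the log text."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    logger.addHandler(handler)
    try:
        convert_trace("abc123")
    finally:
        logger.removeHandler(handler)
    return stream.getvalue()
```

The captured text contains the formatted message followed by the full traceback, ending with the exception type and message, so nothing is lost by dropping `e` from the format arguments.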
examples/observability/simple_calculator_observability/README.md (1)
212-237: Tidy ordered list numbering in Catalyst setup section

The new Catalyst instructions read cleanly, but the ordered list now has two `3.` items:

    3. Set the NAT_SPAN_PREFIX environment variable…
    3. Run the workflow:

Markdown will auto-renumber, but for readability (and to avoid docs linters complaining), consider either:
- Renumbering explicitly:

      3. Set the NAT_SPAN_PREFIX environment variable to `aiq` for RagaAI Catalyst compatibility:
      4. Run the workflow:

  or
- Making all list items `1.` and relying on Markdown’s auto-numbering.

Content-wise, the NAT_SPAN_PREFIX step is great and aligns with the new tests.
examples/observability/simple_calculator_observability/tests/test_simple_calc_observability.py (1)
240-267: Catalyst E2E test looks good; consider minor observability/robustness tweaks

The end-to-end Catalyst test wiring is sound:
- Uses `catalyst_project_name`/`catalyst_dataset_name` fixtures and `catalyst_keys` to ensure proper context.
- Waits for ingestion with an initial sleep plus a bounded polling loop.
- Verifies dataset existence via `Dataset.list_datasets()`.

Two optional improvements you might consider:

Reduce hard-coded initial sleep
You already have a polling loop with a deadline; you could drop or shorten the initial `await asyncio.sleep(5)` and rely more on the loop to keep the test a bit snappier when Catalyst is fast.

Handle Dataset import errors consistently
If you decide to make the `catalyst_dataset_name` fixture follow a `fail_missing`/skip pattern (as suggested in `plugin.py`), you might want to mirror that here as well (wrapping the `Dataset` import in `try`/`except ImportError` and skipping when `fail_missing` is false), for consistent behavior across all Catalyst integration tests.

If you’re happy with current CI timings and assumptions about Catalyst availability, the current implementation is acceptable as-is.
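The "rely on the polling loop" idea might look roughly like the following. This is a sketch, not the test's actual code: `wait_for_dataset` is a hypothetical name, and `list_datasets` is passed in as a plain callable so the example does not depend on `ragaai_catalyst`. Checking before the first sleep is what removes the need for a fixed initial delay.

```python
import asyncio
import time
from typing import Callable, Iterable


async def wait_for_dataset(list_datasets: Callable[[], Iterable[str]],
                           dataset_name: str,
                           timeout: float = 30.0,
                           interval: float = 1.0) -> bool:
    """Poll list_datasets() until dataset_name appears or the deadline passes.

    Hypothetical helper: the first check happens immediately, so a fast
    backend is detected without any up-front sleep.
    """
    deadline = time.monotonic() + timeout
    while True:
        if dataset_name in list_datasets():
            return True
        if time.monotonic() >= deadline:
            return False
        await asyncio.sleep(interval)
```

In the test, the callable would presumably be something like `ds.list_datasets` bound to the project's `Dataset` object. If that call blocks on network I/O, it holds the event loop for its duration, which is usually acceptable in a single-test context.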
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
- examples/observability/simple_calculator_observability/README.md (1 hunks)
- examples/observability/simple_calculator_observability/configs/config-catalyst.yml (0 hunks)
- examples/observability/simple_calculator_observability/tests/test_simple_calc_observability.py (3 hunks)
- packages/nvidia_nat_ragaai/src/nat/plugins/ragaai/mixin/ragaai_catalyst_mixin.py (5 hunks)
- packages/nvidia_nat_test/src/nat/test/plugin.py (1 hunks)
💤 Files with no reviewable changes (1)
- examples/observability/simple_calculator_observability/configs/config-catalyst.yml
🧰 Additional context used
📓 Path-based instructions (3)
**/*
⚙️ CodeRabbit configuration file
**/*: # Code Review Instructions
- Ensure the code follows best practices and coding standards.
- For Python code, follow PEP 20 and PEP 8 for style guidelines.
- Check for security vulnerabilities and potential issues.
- Python methods should use type hints for all parameters and return values. Example: `def my_function(param1: int, param2: str) -> bool: pass`
- For Python exception handling, ensure proper stack trace preservation:
  - When re-raising exceptions: use bare `raise` statements to maintain the original stack trace, and use `logger.error()` (not `logger.exception()`) to avoid duplicate stack trace output.
  - When catching and logging exceptions without re-raising: always use `logger.exception()` to capture the full stack trace information.

Documentation Review Instructions
- Verify that documentation and comments are clear and comprehensive.
- Verify that the documentation doesn't contain any TODOs, FIXMEs or placeholder text like "lorem ipsum".
- Verify that the documentation doesn't contain any offensive or outdated terms.
- Verify that documentation and comments are free of spelling mistakes, ensure the documentation doesn't contain any words listed in the `ci/vale/styles/config/vocabularies/nat/reject.txt` file, words that might appear to be spelling mistakes but are listed in the `ci/vale/styles/config/vocabularies/nat/accept.txt` file are OK.

Misc.
- All code (except .mdc files that contain Cursor rules) should be licensed under the Apache License 2.0, and should contain an Apache License 2.0 header comment at the top of each file.
- Confirm that copyright years are up-to date whenever a file is changed.
Referenced Documentation Contents
ci/vale/styles/config/vocabularies/nat/reject.txt:
Not directly related to PR objectives; no actionable changes for this PR. Reserved for broader policy checks....
Files:
- packages/nvidia_nat_ragaai/src/nat/plugins/ragaai/mixin/ragaai_catalyst_mixin.py
- examples/observability/simple_calculator_observability/README.md
- packages/nvidia_nat_test/src/nat/test/plugin.py
- examples/observability/simple_calculator_observability/tests/test_simple_calc_observability.py
packages/**/*
⚙️ CodeRabbit configuration file
packages/**/*:
- This directory contains optional plugin packages for the toolkit, each should contain a `pyproject.toml` file.
- The `pyproject.toml` file should declare a dependency on `nvidia-nat` or another package with a name starting with `nvidia-nat-`. This dependency should be declared using `~=<version>`, and the version should be a two digit version (ex: `~=1.0`).
- Not all packages contain Python code, if they do they should also contain their own set of tests, in a `tests/` directory at the same level as the `pyproject.toml` file.
Files:
- packages/nvidia_nat_ragaai/src/nat/plugins/ragaai/mixin/ragaai_catalyst_mixin.py
- packages/nvidia_nat_test/src/nat/test/plugin.py
examples/**/*
⚙️ CodeRabbit configuration file
examples/**/*:
- This directory contains example code and usage scenarios for the toolkit, at a minimum an example should contain a README.md or file README.ipynb.
- If an example contains Python code, it should be placed in a subdirectory named `src/` and should contain a `pyproject.toml` file. Optionally, it might also contain scripts in a `scripts/` directory.
- If an example contains YAML files, they should be placed in a subdirectory named `configs/`.
- If an example contains sample data files, they should be placed in a subdirectory named `data/`, and should be checked into git-lfs.
Files:
- examples/observability/simple_calculator_observability/README.md
- examples/observability/simple_calculator_observability/tests/test_simple_calc_observability.py
🪛 Ruff (0.14.5)
packages/nvidia_nat_ragaai/src/nat/plugins/ragaai/mixin/ragaai_catalyst_mixin.py
59-59: Redundant exception object included in logging.exception call
(TRY401)
67-67: Redundant exception object included in logging.exception call
(TRY401)
75-75: Redundant exception object included in logging.exception call
(TRY401)
84-84: Redundant exception object included in logging.exception call
(TRY401)
91-91: Redundant exception object included in logging.exception call
(TRY401)
101-101: Redundant exception object included in logging.exception call
(TRY401)
108-108: Redundant exception object included in logging.exception call
(TRY401)
116-116: Redundant exception object included in logging.exception call
(TRY401)
133-133: Redundant exception object included in logging.exception call
(TRY401)
146-146: Redundant exception object included in logging.exception call
(TRY401)
149-149: Consider moving this statement to an else block
(TRY300)
151-151: Redundant exception object included in logging.exception call
(TRY401)
248-248: Redundant exception object included in logging.exception call
(TRY401)
packages/nvidia_nat_test/src/nat/test/plugin.py
363-363: Unused function argument: catalyst_keys
(ARG001)
examples/observability/simple_calculator_observability/tests/test_simple_calc_observability.py
examples/observability/simple_calculator_observability/README.md
It seems that we might be missing otel mixin tests for pycls: RagaAICatalystMixin. Thoughts @dagardner-nv?
Signed-off-by: David Gardner <dagardner@nvidia.com>
…NeMo-Agent-Toolkit into david-test-ragaai-catalyst Signed-off-by: David Gardner <dagardner@nvidia.com>
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
examples/observability/simple_calculator_observability/README.md (1)
38-44: Update Prerequisites to include Catalyst.

Line 44 lists supported observability platforms but omits Catalyst, which is now fully documented in the Catalyst Integration section. The Prerequisites should reflect all major integrations to guide users correctly.

Apply this diff to add Catalyst to the Prerequisites:

    -3. **Observability platform**: Access to at least one of the supported platforms (Phoenix, Langfuse, LangSmith, Weave, or Patronus)
    +3. **Observability platform**: Access to at least one of the supported platforms (Phoenix, Langfuse, LangSmith, Weave, Patronus, or RagaAI Catalyst)
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
examples/observability/simple_calculator_observability/README.md(1 hunks)
🔇 Additional comments (2)
examples/observability/simple_calculator_observability/README.md (2)
212-217: Verify config-catalyst.yml supports project name configuration.

Line 217 instructs users to update the project name in `config-catalyst.yml`, but this documentation doesn't clarify the exact config field or structure. Given that the PR removed the `endpoint` field, ensure the config file documentation aligns with what fields are actually configurable.

Verify that:
- The config-catalyst.yml file contains a configurable project name field (or clarify how project name is set via environment variable)
- Update the documentation if project name is set via environment variable instead of config file

46-52: Ignore this review comment; the suggested changes are incorrect.

The current Installation section is correct and complete. The single command `uv pip install -e examples/observability/simple_calculator_observability` automatically installs all required dependencies, including `nat_simple_calculator`, which is already declared in `pyproject.toml`.

The suggested diff has two problems:
- `uv pip install -e ".[ragaai]"`: The `[ragaai]` optional extra does not exist in `pyproject.toml`. RagaAI Catalyst is optional and only required if users want to use that specific observability platform. The README already provides complete Catalyst setup instructions (including environment variable configuration) in the "RagaAI Catalyst Integration" section.
- `uv pip install -e examples/getting_started/simple_calculator`: This is redundant. The `nat_simple_calculator` package is already declared as a dependency and will be installed automatically.

Likely an incorrect or invalid review comment.
The This PR is part of an effort to add E2E test coverage for all of our examples, this PR adds that for the Catalyst section of the However it is lacking a unittest, which is outside the scope of this PR (although I did end up having to fix some stuff here and there to get the example working). |
Signed-off-by: David Gardner <dagardner@nvidia.com>
/ok to test fc22457
/merge
* Document the need to set `NAT_SPAN_PREFIX=aiq`
* Update the documentation to reflect Catalyst UI changes, and the need to create the project in the Catalyst UI prior to running the workflow.
* Remove the `endpoint` entry from `config-catalyst.yml`, `CatalystTelemetryExporter` has a reasonable default value for this, and overriding this with an environment variable shouldn't be required
* Replace calls to `print` with logging calls
* Add E2E test

## By Submitting this PR I confirm:
- I am familiar with the [Contributing Guidelines](https://github.com/NVIDIA/NeMo-Agent-Toolkit/blob/develop/docs/source/resources/contributing.md).
- We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
  - Any contribution which contains commits that are not Signed-Off will not be accepted.
- When the PR is ready for review, new or existing tests cover these changes.
- When the PR is ready for review, the documentation is up to date with these changes.

## Summary by CodeRabbit
* **Documentation**
  * Expanded Catalyst setup with combined credentials/project step, step-by-step API key and project guidance, optional endpoint note, NAT_SPAN_PREFIX instructions, workflow run steps, and dashboard/dataset trace viewing.
* **New Features**
  * Added optional debug mode for controlling local trace file writes.
  * Made Catalyst endpoint optional and documented customization.
* **Bug Fixes**
  * Improved error logging to include exception context during trace export.
* **Tests**
  * Added environment-aware fixtures and full Catalyst workflow integration tests with span-prefix compatibility and dataset polling/cleanup.

Authors:
- David Gardner (https://github.com/dagardner-nv)

Approvers:
- Bryan Bednarski (https://github.com/bbednarski9)

URL: NVIDIA#1194
Signed-off-by: Sangharsh Aglave <aglave@synopsys.com>