Add ATIF trajectory exporter for Phoenix visualization and debugging #1869
Conversation
Signed-off-by: Yuchen Zhang <yuchenz@nvidia.com>
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the settings. Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
🚥 Pre-merge checks: ✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
✨ Finishing Touches: 🧪 Generate unit tests (beta)
⚠️ Warning: Review ran into problems. 🔥 Problems: Timed out fetching pipeline failures after 30000ms.
Actionable comments posted: 8
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py`:
- Around line 684-692: In _trajectory_time_bounds(), prefer invocation
timestamps per-boundary instead of the current all-or-nothing check: compute
has_inv_first = (inv_first != float("inf")) and has_inv_last = (inv_last !=
float("inf")), set first_ts = inv_first if has_inv_first else iso_first and
last_ts = inv_last if has_inv_last else iso_last, then normalize missing values
(e.g., if first_ts == float("inf") set first_ts = 0.0; if last_ts is
missing/invalid — float("inf") or 0.0 — set last_ts = first_ts) so start and end
fallback independently while keeping the existing normalization logic.
- Around line 512-513: The current logic sets start_epoch = 0.0 when
start_timestamp is missing, producing huge bogus spans; change it so start_epoch
is not pinned to the Unix epoch: compute end_epoch = invocation.end_timestamp if
invocation and invocation.end_timestamp is not None else None, and set
start_epoch = invocation.start_timestamp if invocation and
invocation.start_timestamp is not None else (end_epoch if end_epoch is not None
else None) so that a missing start with a present end yields a zero-length or
end-anchored span instead of 0.0; update any downstream code that assumes
numeric start_epoch to handle None accordingly (references: start_epoch,
end_epoch, invocation in atif_trajectory_exporter.py).
- Around line 130-132: In convert(), validate and sanitize trajectory_data
before accessing fields: check that trajectory_data is a dict and contains a
non-empty string "session_id", an "agent" dict with a non-empty string "name",
and "steps" is a list (possibly empty) of dicts; if any validation fails, log a
clear error including the file identifier and skip/return early instead of
letting KeyError/TypeError propagate. Use the existing variable names
session_id, agent_name, steps and the convert() entry point to locate where to
add these checks and error logging.
- Around line 173-180: The try/except around
AtifStepExtra.model_validate(extra_raw) is too broad; replace the generic
"except Exception" with catching pydantic's ValidationError only: import
ValidationError from pydantic and use "except ValidationError:" so validation
failures are handled as intended while other unexpected errors still surface;
keep the existing handling (checking _is_terminal_agent_step, setting
last_agent_msg via _message_to_str, and logger.debug) unchanged and located in
the except block for AtifStepExtra.model_validate.
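For the two timestamp-handling comments on `atif_trajectory_exporter.py` above (per-boundary fallback in `_trajectory_time_bounds()` and not pinning a missing `start_epoch` to the Unix epoch), here is a minimal, self-contained sketch of the intended logic. The helper names, the `float("inf")` sentinels and the `Invocation` stand-in are taken from the comments, not from the module's actual code.

```python
from dataclasses import dataclass


def resolve_time_bounds(inv_first: float, inv_last: float,
                        iso_first: float, iso_last: float) -> tuple[float, float]:
    """Let the start and end boundaries fall back independently."""
    has_inv_first = inv_first != float("inf")
    has_inv_last = inv_last != float("inf")
    first_ts = inv_first if has_inv_first else iso_first
    last_ts = inv_last if has_inv_last else iso_last
    if first_ts == float("inf"):        # nothing usable at all
        first_ts = 0.0
    if last_ts in (float("inf"), 0.0):  # missing/invalid end anchors to the start
        last_ts = first_ts
    return first_ts, last_ts


@dataclass
class Invocation:
    """Hypothetical stand-in for the invocation object referenced in the comment."""
    start_timestamp: float | None = None
    end_timestamp: float | None = None


def span_epochs(invocation: Invocation | None) -> tuple[float | None, float | None]:
    """Never pin a missing start to the Unix epoch; anchor it to the end instead."""
    end_epoch = invocation.end_timestamp if invocation and invocation.end_timestamp is not None else None
    start_epoch = (invocation.start_timestamp
                   if invocation and invocation.start_timestamp is not None
                   else end_epoch)
    return start_epoch, end_epoch


if __name__ == "__main__":
    # Invocation start missing, end present: start falls back to the ISO bound.
    print(resolve_time_bounds(float("inf"), 1700000123.0, 1700000100.0, 1700000130.0))
    # Missing start with a present end yields an end-anchored, zero-length span.
    print(span_epochs(Invocation(end_timestamp=1700000123.0)))
```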
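For the `convert()` validation and `ValidationError` comments above, a hedged sketch might look like the following; `AtifStepExtra` is reduced to a one-field stand-in here, and the logger and field names simply mirror the comments.

```python
import logging

from pydantic import BaseModel, ValidationError

logger = logging.getLogger(__name__)


class AtifStepExtra(BaseModel):
    """Simplified stand-in for the module's real pydantic model."""
    step_type: str


def validate_trajectory(trajectory_data: object, file_id: str) -> dict | None:
    """Return the trajectory dict if it looks usable, otherwise log and skip."""
    if not isinstance(trajectory_data, dict):
        logger.error("%s: trajectory is not a dict, skipping", file_id)
        return None
    session_id = trajectory_data.get("session_id")
    agent = trajectory_data.get("agent")
    steps = trajectory_data.get("steps")
    if (not isinstance(session_id, str) or not session_id
            or not isinstance(agent, dict)
            or not isinstance(agent.get("name"), str) or not agent["name"]
            or not isinstance(steps, list)):
        logger.error("%s: missing or invalid session_id, agent.name or steps, skipping", file_id)
        return None
    return trajectory_data


def parse_extra(extra_raw: dict) -> AtifStepExtra | None:
    """Catch only pydantic validation failures; let real converter bugs surface."""
    try:
        return AtifStepExtra.model_validate(extra_raw)
    except ValidationError:
        logger.debug("No valid AtifStepExtra in %r", extra_raw)
        return None
```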
In
`@packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_phoenix_exporter.py`:
- Around line 81-91: The except block around the call to
self._http_exporter.export(...) currently logs and swallows exceptions; change
it to log with logger.error (not logger.exception) and then re-raise the caught
exception so the CLI fails properly: inside the try/except that wraps
dangerously_using_project(self._project) and the call to
self._http_exporter.export(otel_spans), replace the existing except Exception:
logger.exception(...) swallow with an except Exception: logger.error("Error
exporting trajectory spans to Phoenix: %s", e) (or similar) followed by a bare
raise to preserve the original traceback.
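A minimal sketch of the log-then-re-raise pattern requested above, assuming a generic HTTP exporter object rather than the real Phoenix client API:

```python
import logging

logger = logging.getLogger(__name__)


def export_spans(http_exporter, otel_spans) -> None:
    """Export spans, logging failures but letting them propagate to the CLI."""
    try:
        http_exporter.export(otel_spans)
    except Exception as e:
        # Log the failure, then re-raise so the caller sees the original
        # traceback and the CLI can exit non-zero instead of silently succeeding.
        logger.error("Error exporting trajectory spans to Phoenix: %s", e)
        raise
```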
In
`@packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/export_atif_trajectory_to_phoenix.py`:
- Around line 67-88: Replace the current per-file handling inside the for path
in args.files loop with a narrow try/except around each file read/parse/export
so one bad input doesn't abort the batch: for each path, first check
path.exists() and path.is_file(), then open and json.load the file and call
exporter.export(trajectory) inside a try block that catches FileNotFoundError,
PermissionError, json.JSONDecodeError and any exporter exceptions; log an error
for that specific path and continue to the next file, and after the loop exit
with a non-zero status if any file failed (track a failure flag) so the CLI
returns failure but still processes all inputs.
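As a rough sketch of the per-file error isolation described above (the `exporter` object and argument parsing are placeholders; only the error-handling shape is the point):

```python
import json
import logging
import sys
from pathlib import Path


def export_all(paths: list[Path], exporter) -> int:
    """Process every file and return a non-zero exit code if any of them failed."""
    has_failure = False
    for path in paths:
        if not path.exists() or not path.is_file():
            logging.error("File not found or not a regular file: %s", path)
            has_failure = True
            continue
        try:
            with path.open(encoding="utf-8") as f:
                trajectory = json.load(f)
        except (OSError, json.JSONDecodeError):
            logging.exception("Failed to read/parse %s", path)
            has_failure = True
            continue
        try:
            exporter.export(trajectory)
        except Exception:
            logging.exception("Failed to export %s", path)
            has_failure = True
            continue
    return 1 if has_failure else 0


if __name__ == "__main__":
    class _NoopExporter:
        def export(self, trajectory):
            print("would export", trajectory.get("session_id", "unknown"))

    sys.exit(export_all([Path(p) for p in sys.argv[1:]], _NoopExporter()))
```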
In
`@packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/README.md`:
- Around line 85-97: The fenced code block in README.md (the workflow ASCII
diagram) lacks a language tag which triggers MD040; update the fence opening
(the triple backticks before the WORKFLOW diagram) to include a language
identifier such as text (e.g., replace "```" with "```text") so the block is
explicitly annotated; ensure the closing fence remains "```" and no other
content changes in the WORKFLOW diagram.
- Around line 74-79: The fenced code block containing the ATIF Trajectory
diagram needs a language annotation to satisfy markdownlint; edit the README.md
fenced block that currently starts with ``` and change it to include a language
(for example, ```text) so the block becomes ```text followed by the diagram
lines (ATIF Trajectory (dict) -> ATIFTrajectorySpanExporter.convert() ->
convert_spans_to_otel_batch() -> HTTPSpanExporter.export()) and closing ```.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 9887c7d4-5532-4271-be3a-96bfe43dba78
📒 Files selected for processing (5)
- packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/README.md
- packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/__init__.py
- packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py
- packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_phoenix_exporter.py
- packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/export_atif_trajectory_to_phoenix.py
Signed-off-by: Yuchen Zhang <yuchenz@nvidia.com>
Actionable comments posted: 5
♻️ Duplicate comments (2)
packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/export_atif_trajectory_to_phoenix.py (1)
75-81: ⚠️ Potential issue | 🟠 Major — Handle `OSError` on file read to keep batch processing resilient.

Between Line 70 and Line 76 there is still a TOCTOU window; `open()` can fail with `FileNotFoundError`/other `OSError` and currently escape the handler.

Proposed fix:

```diff
-        try:
-            with open(path) as f:
+        try:
+            with path.open(encoding="utf-8") as f:
                 trajectory = json.load(f)
-        except (PermissionError, json.JSONDecodeError) as e:
+        except (OSError, json.JSONDecodeError) as e:
             logging.error("Failed to read/parse %s: %s", path, e)
             has_failure = True
             continue
```

As per coding guidelines, "Validate and sanitise all user input, especially in web or CLI interfaces".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/export_atif_trajectory_to_phoenix.py` around lines 75 - 81: The try/except around reading JSON only catches PermissionError and json.JSONDecodeError but misses other OSError subclasses (e.g., FileNotFoundError, race conditions) raised by open(); update the exception handling in the block that opens and json.loads the file (the try using open(path) and json.load) to also catch OSError (or FileNotFoundError) so the code logs the error via logging.error("Failed to read/parse %s: %s", path, e), sets has_failure = True, and continues, preserving batch resiliency.
packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py (1)
192-199: ⚠️ Potential issue | 🟠 Major — Narrow the catch at Line 194 to `ValidationError` only.

Catching `Exception` here can silently hide converter defects unrelated to payload validation and skip spans unexpectedly. Keep validation fallback behavior, but catch only Pydantic validation failures.

Proposed fix:

```diff
+from pydantic import ValidationError
 ...
-        except Exception:
+        except ValidationError:
             # Agent step without usable extra — capture output only
             if _is_terminal_agent_step(step):
                 last_agent_msg = _message_to_str(step.get("message", ""))
             logger.debug("Skipping step %s: no valid AtifStepExtra", step.get("step_id"))
             continue
```

```bash
#!/bin/bash
# Verify broad exception handling around model_validate remains/narrows correctly.
rg -n -C2 'AtifStepExtra\.model_validate|except Exception|ValidationError' \
  packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py
```

As per coding guidelines, "Use ruff via ruff check --fix as configured in pyproject.toml; fix warnings unless explicitly ignored in pyproject.toml".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py` around lines 192 - 199, Narrow the broad except in the AtifStepExtra.model_validate block to catch only Pydantic validation failures: replace the generic "except Exception" with "except ValidationError" (import ValidationError from pydantic) so that conversion errors are still handled the same way (preserve the fallback setting of last_agent_msg via _is_terminal_agent_step and the logger.debug skip) but other unrelated exceptions are not swallowed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py`:
- Around line 394-399: The code is indexing tool call payloads with
tc["function_name"] which can raise KeyError on malformed entries; update uses
of tc[...] (e.g., where tool_span is created via _create_tool_span and other
similar spots handling tool_calls) to safely access fields with
tc.get("function_name", "<unknown>") (or skip/continue when critical fields are
missing), log a warning when required fields are absent, and ensure the loop
continues rather than raising—apply the same defensive access pattern to
arguments, output, and any other tc["..."] occurrences in this module.
- Around line 696-705: The current truthy checks treat numeric epoch 0.0 as
missing; update the timestamp presence checks in atif_trajectory_exporter.py to
use explicit None checks: replace conditions like if inv.get("start_timestamp")
and if inv.get("end_timestamp") with if inv.get("start_timestamp") is not None
and if inv.get("end_timestamp") is not None, and similarly change the tool
invocation checks to if ti is not None and ti.get("start_timestamp") is not None
and if ti is not None and ti.get("end_timestamp") is not None so inv_first and
inv_last correctly include zero epochs for inv and extra["tool_invocations"].
- Around line 181-183: The loop iterating over steps (for step in steps) assumes
each element is a dict and calls step.get("source", ""), which will raise
AttributeError for malformed items; update the loop in the
export_trajectory_to_phoenix code (or the function containing the loop) to
validate each item before using .get — e.g., if not isinstance(step, dict):
optionally log or warn and continue, or coerce step = {} — then safely call
source = step.get("source", "") so non-dict entries are skipped/handled without
raising.
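For the defensive tool-call access suggested in the first comment above, a small illustrative sketch; the payload shape (`function_name`/`arguments` keys) is an assumption taken from the comment:

```python
import logging

logger = logging.getLogger(__name__)


def iter_valid_tool_calls(tool_calls: list) -> list[tuple[str, dict]]:
    """Yield (name, args) pairs, skipping malformed entries instead of raising."""
    valid = []
    for idx, tc in enumerate(tool_calls):
        if not isinstance(tc, dict):
            logger.warning("Skipping non-dict tool call at index %d", idx)
            continue
        name = tc.get("function_name")
        if not isinstance(name, str) or not name:
            logger.warning("Skipping tool call without function_name at index %d", idx)
            continue
        args = tc.get("arguments", {})
        valid.append((name, args if isinstance(args, dict) else {}))
    return valid


if __name__ == "__main__":
    print(iter_valid_tool_calls([
        {"function_name": "search", "arguments": {"q": "x"}},
        {"arguments": {}},  # skipped: missing function_name
        "garbage",          # skipped: not a dict
    ]))
```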
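And for the explicit `is not None` timestamp checks plus the non-dict step guard, a combined sketch under the same caveat that the step/invocation shape is assumed from the comments:

```python
def update_bounds(steps: list) -> tuple[float, float]:
    """Compute earliest/latest invocation epochs without dropping 0.0 timestamps."""
    inv_first, inv_last = float("inf"), 0.0
    for step in steps:
        if not isinstance(step, dict):  # skip malformed entries instead of raising
            continue
        inv = step.get("invocation") or {}
        # `is not None` keeps a legitimate 0.0 epoch from being treated as missing.
        if inv.get("start_timestamp") is not None:
            inv_first = min(inv_first, inv["start_timestamp"])
        if inv.get("end_timestamp") is not None:
            inv_last = max(inv_last, inv["end_timestamp"])
        for ti in step.get("tool_invocations") or []:
            if isinstance(ti, dict) and ti.get("start_timestamp") is not None:
                inv_first = min(inv_first, ti["start_timestamp"])
            if isinstance(ti, dict) and ti.get("end_timestamp") is not None:
                inv_last = max(inv_last, ti["end_timestamp"])
    return inv_first, inv_last


if __name__ == "__main__":
    print(update_bounds([{"invocation": {"start_timestamp": 0.0, "end_timestamp": 12.5}},
                         "not-a-dict"]))
```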
In
`@packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/export_atif_trajectory_to_phoenix.py`:
- Line 30: Add a Google-style docstring to the public CLI entrypoint function
main() that briefly describes its purpose (exports ATIF trajectories to
Phoenix), documents parameters (none) and return value (None), and notes any
side effects or exceptions (e.g., calls external services, may exit on failure);
place the docstring immediately under the def main(): signature in
export_atif_trajectory_to_phoenix.py so tools and users can discover the CLI
behavior.
- Around line 95-100: The current broad except block around
exporter.export(trajectory) logs errors with logging.error and loses the
traceback; change the handler to log the full exception context (use
logging.exception or logging.error(..., exc_info=True)) so stack traces are
preserved when catching exceptions in the export loop; update the except block
that references exporter.export, logging.error, has_failure and path to call
logging.exception("Failed to export %s", path) (or logging.error with
exc_info=True) and leave the has_failure and continue logic intact.
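A hedged sketch of what the documented `main()` entrypoint with traceback-preserving logging could look like; the per-file loop is reduced to a placeholder:

```python
import logging
import sys


def main() -> None:
    """Export ATIF trajectory JSON files to a Phoenix server.

    Parses command-line arguments (omitted here), converts each trajectory
    file into spans and exports them to Phoenix.

    Returns:
        None. The process exits with a non-zero status if any file failed.
    """
    has_failure = False
    for path, trajectory, exporter in []:  # placeholder for the real per-file loop
        try:
            exporter.export(trajectory)
        except Exception:
            # logging.exception == logging.error(..., exc_info=True): keeps the traceback.
            logging.exception("Failed to export %s", path)
            has_failure = True
            continue
    if has_failure:
        sys.exit(1)


if __name__ == "__main__":
    main()
```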
---
Duplicate comments:
In
`@packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py`:
- Around line 192-199: Narrow the broad except in the
AtifStepExtra.model_validate block to catch only Pydantic validation failures:
replace the generic "except Exception" with "except ValidationError" (import
ValidationError from pydantic) so that conversion errors are still handled the
same way (preserve the fallback setting of last_agent_msg via
_is_terminal_agent_step and the logger.debug skip) but other unrelated
exceptions are not swallowed.
In
`@packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/export_atif_trajectory_to_phoenix.py`:
- Around line 75-81: The try/except around reading JSON only catches
PermissionError and json.JSONDecodeError but misses other OSError subclasses
(e.g., FileNotFoundError, race conditions) raised by open(); update the
exception handling in the block that opens and json.loads the file in
export_atif_trajectory_to_phoenix.py (the try using open(path) and json.load) to
also catch OSError (or FileNotFoundError) so the code logs the error via
logging.error("Failed to read/parse %s: %s", path, e), sets has_failure = True,
and continues, preserving batch resiliency.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 31025cb3-627e-4b00-bbd2-bb7f5cbfadff
📒 Files selected for processing (4)
- packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/README.md
- packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py
- packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_phoenix_exporter.py
- packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/export_atif_trajectory_to_phoenix.py
✅ Files skipped from review due to trivial changes (1)
- packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/README.md
🚧 Files skipped from review as they are similar to previous changes (1)
- packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_phoenix_exporter.py
Actionable comments posted: 2
♻️ Duplicate comments (4)
packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py (4)
394-399: ⚠️ Potential issue | 🟠 Major — Avoid hard-indexing malformed tool call payloads.

`tc["function_name"]` can raise and stop export for the whole trajectory. Guard type + required field, then continue.

Suggested fix:

```diff
             tc = tool_calls[idx]
+            if not isinstance(tc, dict):
+                logger.debug("Skipping non-dict tool call at index %d: %s", idx, type(tc).__name__)
+                continue
+            tool_name = tc.get("function_name")
+            if not isinstance(tool_name, str) or not tool_name:
+                logger.debug("Skipping tool call with invalid function_name at index %d", idx)
+                continue
             t_anc = tool_ancestry[idx]
             t_inv = tool_invocations[idx] if idx < len(tool_invocations) else None
             ...
             tool_span = self._create_tool_span(
                 ancestry=t_anc,
                 invocation=t_inv,
-                tool_name=tc["function_name"],
+                tool_name=tool_name,
                 tool_args=tc.get("arguments", {}),
```

As per coding guidelines, "Validate and sanitise all user input, especially in web or CLI interfaces".
Also applies to: 505-510
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py` around lines 394 - 399, The code is hard-indexing tc["function_name"] when building tool_span which can raise if tc is malformed; update the call sites that pass tc to _create_tool_span (e.g., where tool_span = self._create_tool_span(... tool_name=tc["function_name"], ...) and the similar block at the later occurrence) to first validate that tc is a dict and contains a non-empty string "function_name" (and optionally that "arguments" is a dict), and if validation fails log a warning and skip/continue that tool call instead of indexing it; ensure you reference tc, _create_tool_span, tool_name and tool_args when adding the guard so malformed payloads do not abort the entire export.
192-199: ⚠️ Potential issue | 🟡 Minor — Catch `ValidationError` instead of `Exception` around `model_validate()`.

Line 194 swallows unexpected converter bugs and silently skips spans. Restrict this to Pydantic validation failures only.

Suggested fix:

```diff
+from pydantic import ValidationError
 ...
-        except Exception:
+        except ValidationError:
             # Agent step without usable extra — capture output only
             if _is_terminal_agent_step(step):
                 last_agent_msg = _message_to_str(step.get("message", ""))
             logger.debug("Skipping step %s: no valid AtifStepExtra", step.get("step_id"))
             continue
```

```bash
#!/bin/bash
# Verify broad exception handling around model_validate in this module.
rg -n -C2 "model_validate|except Exception|except ValidationError" packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py
# Confirm installed pydantic version in the environment used for checks.
python - <<'PY'
import pydantic
print("pydantic", pydantic.__version__)
PY
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py` around lines 192 - 199, Change the broad except Exception around AtifStepExtra.model_validate to catch only pydantic.ValidationError: import ValidationError from pydantic, replace "except Exception:" with "except ValidationError:" in the block that calls AtifStepExtra.model_validate(extra_raw); keep the existing handling (capture last_agent_msg via _is_terminal_agent_step/_message_to_str, logger.debug and continue) and allow other unexpected exceptions to propagate so converter bugs are not silently swallowed.
148-160: ⚠️ Potential issue | 🟠 Major — Validate `steps` entries before any `.get()` access.

`convert()` validates that `steps` is a list, but Line 159 passes it to `_trajectory_time_bounds()`, which assumes each item is a dict. A malformed element will raise `AttributeError` and abort the export.

Suggested fix:

```diff
-        steps = trajectory_data.get("steps")
-        if not isinstance(steps, list):
+        steps_raw = trajectory_data.get("steps")
+        if not isinstance(steps_raw, list):
             logger.error("Trajectory %s: 'steps' is not a list (got %r), skipping", session_id, type(steps).__name__)
             return []
+        steps = [step for step in steps_raw if isinstance(step, dict)]
+        if len(steps) != len(steps_raw):
+            logger.debug(
+                "Trajectory %s: skipped %d non-dict step entries",
+                session_id,
+                len(steps_raw) - len(steps),
+            )
```

As per coding guidelines, "Validate and sanitise all user input, especially in web or CLI interfaces".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py` around lines 148 - 160, The code calls self._trajectory_time_bounds(steps) assuming every element in steps is a dict; first validate and sanitize entries in convert() before any .get() access by checking that steps is a list of dicts (and optionally contains expected keys/timestamp fields) and either filter out non-dict/malformed entries or return [] and log an error with session_id; update the logic around the steps variable (used by _trajectory_time_bounds, span_lookup, delegation_refs, etc.) so only validated dict entries are passed to _trajectory_time_bounds and the rest of the export flow.
684-702: ⚠️ Potential issue | 🟡 Minor — Use explicit `is not None` checks for epoch fields.

Current truthy checks drop valid `0.0` timestamps, which can skew workflow boundaries.

Suggested fix:

```diff
-            if inv.get("start_timestamp"):
+            if inv.get("start_timestamp") is not None:
                 inv_first = min(inv_first, inv["start_timestamp"])
-            if inv.get("end_timestamp"):
+            if inv.get("end_timestamp") is not None:
                 inv_last = max(inv_last, inv["end_timestamp"])
 ...
-            if ti and ti.get("start_timestamp"):
+            if ti and ti.get("start_timestamp") is not None:
                 inv_first = min(inv_first, ti["start_timestamp"])
-            if ti and ti.get("end_timestamp"):
+            if ti and ti.get("end_timestamp") is not None:
                 inv_last = max(inv_last, ti["end_timestamp"])
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py` around lines 684 - 702, The code in atif_trajectory_exporter.py uses truthy checks that drop valid zero timestamps; change checks to explicit "is not None" for timestamp fields: verify ts is not None before calling _iso_to_epoch and comparing to iso_first/iso_last, and replace inv.get("start_timestamp")/end_timestamp and ti.get("start_timestamp")/end_timestamp truthy checks with explicit "is not None" checks when updating inv_first and inv_last (refer to variables ts, epoch, _iso_to_epoch, iso_first, iso_last, extra, inv, inv_first, inv_last, tool_invocations, ti).
🧹 Nitpick comments (1)
packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py (1)
96-104: Use Google-style docstrings for public API surfaces.

`ATIFTrajectorySpanExporter` and `convert()` docstrings are NumPy-style (`Parameters`/`Returns` sections), but repository guidance requires Google-style docstrings.

As per coding guidelines, "Provide Google-style docstrings for every public module, class, function and CLI command".
Also applies to: 116-129
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py` around lines 96 - 104, The ATIFTrajectorySpanExporter class and its convert() method use NumPy-style docstrings; change them to Google-style docstrings for public API surfaces per repository guidance. Update the class docstring for ATIFTrajectorySpanExporter to use a one-line summary, an optional longer description, and Google-style Args (span_prefix) and Attributes if needed; similarly replace the convert() docstring's Parameters/Returns sections with Google-style Args and Returns blocks, keeping types and descriptions intact. Ensure both docstrings are formatted consistently with other public classes/functions and that any references to NAT_SPAN_PREFIX or default values remain clear in the new docstrings.
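One possible Google-style shape for the `convert()` docstring flagged above; the signature shown is inferred from the surrounding discussion rather than copied from the module:

```python
class ATIFTrajectorySpanExporter:
    """Docstring-only illustration; not the real class body."""

    def convert(self, trajectory_data: dict) -> list:
        """Convert one ATIF trajectory dict into NAT span objects.

        Args:
            trajectory_data: Parsed ATIF trajectory JSON containing
                ``session_id``, ``agent`` and ``steps`` keys.

        Returns:
            list: Spans ready for OTel export; empty when the input is unusable.
        """
        return []
```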
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py`:
- Around line 590-594: You override span.context.trace_id for spans in sub_spans
but do not update the stored workflow trace attribute, causing inconsistent
metadata; inside the same loop that sets span.context.trace_id = parent_trace_id
(iterating sub_spans), also overwrite the span attribute keyed by
f"{span_prefix}.workflow.trace_id" (or the literal key used in the code) to
parent_trace_id so the span's attributes and context remain consistent — update
the attribute on the same span object (use span.attributes or span.tags
depending on implementation) after changing span.context.trace_id.
- Around line 47-49: The _iso_to_epoch function currently lets
datetime.fromisoformat(timestamp) raise ValueError and abort processing; wrap
the parse in a try/except that catches ValueError and TypeError, log a warning
that includes the offending timestamp (so you can trace malformed inputs) and
return a non-fatal sentinel (e.g. float('nan') or None) so the caller can
continue processing the rest of the file; update only _iso_to_epoch to handle
the error and include the timestamp in the log message for easier debugging.
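For the trace-id consistency comment above, a rough sketch using tiny stand-in span objects; the real NAT `Span` API (`span.context`, `set_attribute`) is only assumed from the comment:

```python
class _Context:
    def __init__(self, trace_id: int):
        self.trace_id = trace_id


class _Span:
    def __init__(self, trace_id: int, span_prefix: str = "nat"):
        self.context = _Context(trace_id)
        self.attributes = {f"{span_prefix}.workflow.trace_id": f"{trace_id:032x}"}

    def set_attribute(self, key: str, value: str) -> None:
        self.attributes[key] = value


def reparent(sub_spans: list[_Span], parent_trace_id: int, span_prefix: str = "nat") -> None:
    """Re-point sub-spans at the parent trace and keep their attributes in sync."""
    for span in sub_spans:
        if span.context:
            span.context.trace_id = parent_trace_id
        # Keep the stored attribute consistent with the overridden context.
        span.set_attribute(f"{span_prefix}.workflow.trace_id", f"{parent_trace_id:032x}")


if __name__ == "__main__":
    spans = [_Span(trace_id=1), _Span(trace_id=2)]
    reparent(spans, parent_trace_id=42)
    print([s.attributes for s in spans])
```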
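And for the non-fatal `_iso_to_epoch()` suggestion, a minimal sketch that returns `None` on bad input — an assumption callers would then need to handle:

```python
import logging
from datetime import datetime

logger = logging.getLogger(__name__)


def iso_to_epoch(timestamp: str) -> float | None:
    """Parse an ISO-8601 timestamp; return None instead of raising on bad input."""
    try:
        return datetime.fromisoformat(timestamp).timestamp()
    except (ValueError, TypeError):
        logger.warning("Skipping malformed timestamp %r", timestamp)
        return None


if __name__ == "__main__":
    print(iso_to_epoch("2024-05-01T12:00:00+00:00"))
    print(iso_to_epoch("not-a-timestamp"))  # logs a warning and returns None
```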
---
Duplicate comments:
In
`@packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py`:
- Around line 394-399: The code is hard-indexing tc["function_name"] when
building tool_span which can raise if tc is malformed; update the call sites
that pass tc to _create_tool_span (e.g., where tool_span =
self._create_tool_span(... tool_name=tc["function_name"], ...) and the similar
block at the later occurrence) to first validate that tc is a dict and contains
a non-empty string "function_name" (and optionally that "arguments" is a dict),
and if validation fails log a warning and skip/continue that tool call instead
of indexing it; ensure you reference tc, _create_tool_span, tool_name and
tool_args when adding the guard so malformed payloads do not abort the entire
export.
- Around line 192-199: Change the broad except Exception around
AtifStepExtra.model_validate to catch only pydantic.ValidationError: import
ValidationError from pydantic, replace "except Exception:" with "except
ValidationError:" in the block that calls
AtifStepExtra.model_validate(extra_raw); keep the existing handling (capture
last_agent_msg via _is_terminal_agent_step/_message_to_str, logger.debug and
continue) and allow other unexpected exceptions to propagate so converter bugs
are not silently swallowed.
- Around line 148-160: The code calls self._trajectory_time_bounds(steps)
assuming every element in steps is a dict; first validate and sanitize entries
in convert() before any .get() access by checking that steps is a list of dicts
(and optionally contains expected keys/timestamp fields) and either filter out
non-dict/malformed entries or return [] and log an error with session_id; update
the logic around the steps variable (used by _trajectory_time_bounds,
span_lookup, delegation_refs, etc.) so only validated dict entries are passed to
_trajectory_time_bounds and the rest of the export flow.
- Around line 684-702: The code in atif_trajectory_exporter.py uses truthy
checks that drop valid zero timestamps; change checks to explicit "is not None"
for timestamp fields: verify ts is not None before calling _iso_to_epoch and
comparing to iso_first/iso_last, and replace
inv.get("start_timestamp")/end_timestamp and
ti.get("start_timestamp")/end_timestamp truthy checks with explicit "is not
None" checks when updating inv_first and inv_last (refer to variables ts, epoch,
_iso_to_epoch, iso_first, iso_last, extra, inv, inv_first, inv_last,
tool_invocations, ti).
---
Nitpick comments:
In
`@packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py`:
- Around line 96-104: The ATIFTrajectorySpanExporter class and its convert()
method use NumPy-style docstrings; change them to Google-style docstrings for
public API surfaces per repository guidance. Update the class docstring for
ATIFTrajectorySpanExporter to use a one-line summary, an optional longer
description, and Google-style Args (span_prefix) and Attributes if needed;
similarly replace the convert() docstring's Parameters/Returns sections with
Google-style Args and Returns blocks, keeping types and descriptions intact.
Ensure both docstrings are formatted consistently with other public
classes/functions and that any references to NAT_SPAN_PREFIX or default values
remain clear in the new docstrings.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: d8367f5b-8caa-4e9c-bfcd-e9a94b02758c
📒 Files selected for processing (1)
packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py
Signed-off-by: Yuchen Zhang <yuchenz@nvidia.com>
Actionable comments posted: 1
♻️ Duplicate comments (8)
packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/export_atif_trajectory_to_phoenix.py (2)
95-100: ⚠️ Potential issue | 🟠 Major — Preserve the traceback for export failures.

This branch intentionally keeps the batch running, so `logging.error("... %s", e)` drops the one thing you need for debugging the failing file: the stack trace.

Suggested fix:

```diff
-        except Exception as e:
-            logging.error("Failed to export %s: %s", path, e)
+        except Exception:
+            logging.exception("Failed to export %s", path)
             has_failure = True
             continue
```

As per coding guidelines, "When catching and logging exceptions without re-raising, always use logger.exception() (equivalent to logger.error(exc_info=True)) to capture the full stack trace information".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/export_atif_trajectory_to_phoenix.py` around lines 95 - 100, The except block around exporter.export(trajectory) is dropping the stack trace; replace logging.error("Failed to export %s: %s", path, e) with a call that records the full exception info (e.g., logging.exception or logging.error(..., exc_info=True)) so the traceback is preserved while still setting has_failure = True and continuing; update the except block in the export_atif_trajectory_to_phoenix.py code where exporter.export(trajectory) is called.
69-85: ⚠️ Potential issue | 🟠 Major — Keep the full per-file load path inside the guarded block.

A bad input can still stop the batch here: `path.exists()`/`is_file()` happens outside the `try`, and a valid JSON payload that is not an object will hit `trajectory.get(...)` at Lines 83-85 with `AttributeError`. This should all degrade to a per-file failure and continue.

Suggested fix:

```diff
     for path in args.files:
-        if not path.exists() or not path.is_file():
-            logging.error("File not found or not a regular file: %s", path)
-            has_failure = True
-            continue
-        try:
-            with open(path) as f:
+        try:
+            if not path.exists() or not path.is_file():
+                raise ValueError(f"File not found or not a regular file: {path}")
+
+            with path.open(encoding="utf-8") as f:
                 trajectory = json.load(f)
-        except (PermissionError, json.JSONDecodeError) as e:
-            logging.error("Failed to read/parse %s: %s", path, e)
+
+            if not isinstance(trajectory, dict):
+                raise ValueError(f"Top-level JSON must be an object: {path}")
+
+            agent = trajectory.get("agent")
+            agent_name = agent.get("name", "unknown") if isinstance(agent, dict) else "unknown"
+            session_id = trajectory.get("session_id", "unknown")
+            steps = trajectory.get("steps", [])
+            num_steps = len(steps) if isinstance(steps, list) else 0
+        except (OSError, ValueError, json.JSONDecodeError):
+            logging.exception("Failed to load %s", path)
             has_failure = True
             continue
-
-        agent_name = trajectory.get("agent", {}).get("name", "unknown")
-        session_id = trajectory.get("session_id", "unknown")
-        num_steps = len(trajectory.get("steps", []))
```

As per coding guidelines, "Validate and sanitise all user input, especially in web or CLI interfaces".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/export_atif_trajectory_to_phoenix.py` around lines 69 - 85: The per-file processing currently checks path.exists()/is_file() outside the try, then parses JSON and accesses trajectory.get(...) which can raise AttributeError for bad payloads; move the full per-file load-and-validate logic into the try block so any file-check, open/read, json.load, and subsequent accesses (the code that computes agent_name, session_id, num_steps) are guarded, and catch PermissionError, json.JSONDecodeError, AttributeError (and optionally OSError) to log a per-file failure and continue; update the except to include these exceptions and ensure the loop for path in args.files still sets has_failure = True and continues on error.
packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py (6)
191-199: ⚠️ Potential issue | 🟡 Minor — Catch `ValidationError` here, not every exception.

`AtifStepExtra.model_validate()` should only suppress payload validation failures. Catching `Exception` also hides unrelated converter bugs as "bad ATIF input", which makes real regressions much harder to spot.

Suggested fix:

```diff
+from pydantic import ValidationError
 ...
-        except Exception:
+        except ValidationError:
             # Agent step without usable extra — capture output only
             if _is_terminal_agent_step(step):
                 last_agent_msg = _message_to_str(step.get("message", ""))
```

In Pydantic v2, what exception does `BaseModel.model_validate()` raise when validation fails?

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py` around lines 191 - 199, Replace the broad "except Exception" around AtifStepExtra.model_validate(extra_raw) with a targeted catch for Pydantic validation failures (catch pydantic.ValidationError from Pydantic v2) so converter bugs still surface; retain the existing handling inside the except (use _is_terminal_agent_step and _message_to_str to capture last_agent_msg, logger.debug("Skipping step %s: no valid AtifStepExtra", step.get("step_id")) and continue) and let any other exceptions propagate normally.
693-702: ⚠️ Potential issue | 🟡 Minor — Use explicit `is not None` checks for epoch fields.

These truthy checks treat valid `0.0` timestamps as missing, which can push the workflow bounds back to the ISO fallback even when authoritative invocation times are present.

Suggested fix:

```diff
-            if inv.get("start_timestamp"):
+            if inv.get("start_timestamp") is not None:
                 inv_first = min(inv_first, inv["start_timestamp"])
-            if inv.get("end_timestamp"):
+            if inv.get("end_timestamp") is not None:
                 inv_last = max(inv_last, inv["end_timestamp"])
@@
-            if ti and ti.get("start_timestamp"):
+            if ti and ti.get("start_timestamp") is not None:
                 inv_first = min(inv_first, ti["start_timestamp"])
-            if ti and ti.get("end_timestamp"):
+            if ti and ti.get("end_timestamp") is not None:
                 inv_last = max(inv_last, ti["end_timestamp"])
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py` around lines 693 - 702, The loops updating inv_first and inv_last use truthy checks that treat 0.0 as missing; change the conditions in the block that iterates over inv entries and the subsequent loop over extra.get("tool_invocations") so they explicitly check "is not None" for start_timestamp and end_timestamp (e.g., replace "if inv.get('start_timestamp')" and "if inv.get('end_timestamp')" and the "ti" checks with "if inv.get('start_timestamp') is not None", etc.), ensuring inv_first and inv_last are updated when timestamps are zero-valued.
394-399: ⚠️ Potential issue | 🟠 Major — Guard required tool-call fields before indexing them.

`tc["function_name"]` turns one malformed tool call into a `KeyError` that aborts the rest of the trajectory export. This path should skip bad entries and keep the rest of the trace.

Suggested fix:

```diff
             tc = tool_calls[idx]
+            if not isinstance(tc, dict):
+                logger.debug("Skipping non-dict tool call at index %d in trajectory %s", idx, session_id)
+                continue
             t_anc = tool_ancestry[idx]
             t_inv = tool_invocations[idx] if idx < len(tool_invocations) else None
+            tool_name = tc.get("function_name")
+            if not isinstance(tool_name, str) or not tool_name:
+                logger.debug("Skipping tool call without valid function_name at index %d in trajectory %s", idx, session_id)
+                continue
             obs_content: str | None = None
             if idx < len(obs_results):
@@
             tool_span = self._create_tool_span(
                 ancestry=t_anc,
                 invocation=t_inv,
-                tool_name=tc["function_name"],
+                tool_name=tool_name,
                 tool_args=tc.get("arguments", {}),
                 tool_output=obs_content,
                 trace_id=trace_id,
```

As per coding guidelines, "Validate and sanitise all user input, especially in web or CLI interfaces".
Also applies to: 505-510
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py` around lines 394 - 399, The code accesses tc["function_name"] and other required fields directly when building a tool span (see _create_tool_span call that passes tool_name=tc["function_name"]) which raises KeyError for malformed tool calls; update the exporter to validate/sanitize the tool-call dict (e.g., check 'function_name' exists and is a str, and validate any other required keys such as arguments/output) before calling _create_tool_span, and if a tool-call is missing required fields simply skip that entry (log a warning) so the rest of the trajectory export continues; apply the same guard logic to the other occurrence around lines 505-510 where tc fields are accessed.
181-183: ⚠️ Potential issue | 🟠 Major — Skip non-dict `steps` entries before dereferencing them.

`steps` is only validated as a list. A single malformed element still makes `step.get(...)` raise `AttributeError` and abort conversion for the file.

As per coding guidelines, "Validate and sanitise all user input, especially in web or CLI interfaces".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py` around lines 181 - 183, The loop over steps dereferences step with step.get("source", "") without ensuring each element is a dict; update the loop in atif_trajectory_exporter.py to skip non-dict entries by checking isinstance(step, dict) (or using a try/except AttributeError) before calling step.get, logging or continuing when an item is malformed so the conversion continues for the rest of the file; specifically modify the for step in steps: block to guard access to step.get("source", "") and similar dereferences.
591-603: ⚠️ Potential issue | 🟡 Minor — Keep trace metadata in sync when reusing the parent trace id.

These spans get `span.context.trace_id = parent_trace_id`, but their stored `${span_prefix}.workflow.trace_id` attribute still points at the old generated sub-trace. That leaves the same span carrying two different trace ids.

Suggested fix:

```diff
         for span in sub_spans:
             if span.context:
                 span.context.trace_id = parent_trace_id
+                span.set_attribute(
+                    f"{self._span_prefix}.workflow.trace_id",
+                    f"{parent_trace_id:032x}",
+                )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py` around lines 591 - 603, When you overwrite span.context.trace_id for sub_spans, also update the stored workflow trace id attribute so the span doesn't carry two different trace ids: find where sub_spans are modified (the loop that sets span.context.trace_id = parent_trace_id) and update the span's attribute that holds the generated sub-trace id (the `${span_prefix}.workflow.trace_id` entry) to the new parent_trace_id (or remove it) for each span; likewise when linking the root_sub_span to parent_tool_span (root_sub_span.parent = parent_tool_span.model_copy()), ensure you set root_sub_span.context.trace_id and update or replace the same `${span_prefix}.workflow.trace_id` attribute on root_sub_span to keep metadata in sync.
47-49: ⚠️ Potential issue | 🟠 Major — Make ISO timestamp parsing non-fatal.

`_iso_to_epoch()` is used by both root-bound calculation and individual span creation. One malformed `timestamp` currently raises `ValueError` and drops the whole trajectory instead of just skipping that timestamp.

As per coding guidelines, "Validate and sanitise all user input, especially in web or CLI interfaces".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py` around lines 47 - 49, The _iso_to_epoch function currently raises on malformed ISO strings; change it to be non-fatal by making it return Optional[float] (or float|None), wrapping datetime.fromisoformat(...) in a try/except catching ValueError/TypeError (and possibly OverflowError), and return None on parse failure; then update all callers (the root-bound calculation code and the individual span creation paths that call _iso_to_epoch) to check for None and skip that timestamp instead of aborting the whole trajectory. Ensure the function and its type hints/annotations are updated (refer to _iso_to_epoch) and add a small unit test or input validation comment to document the new behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/atif_trajectory_phoenix_exporter.py`:
- Around line 71-83: The conversion step (self._converter.convert) currently
runs outside the try/except so conversion errors bypass the export error
boundary; move or wrap the call to self._converter.convert(trajectory_data) (and
subsequent convert_spans_to_otel_batch and span resource-setting) inside the
existing try that uses dangerously_using_project and calls
self._http_exporter.export, so any exceptions from _converter.convert,
convert_spans_to_otel_batch, or span.set_resource are caught and logged by the
export error handling path (which surrounds dangerously_using_project and
_http_exporter.export); update references to nat_spans and otel_spans
accordingly so they are defined inside that try block and preserve the
early-return behavior when nat_spans is falsy by performing the check inside the
try.
---
Duplicate comments:
In
`@packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py`:
- Around line 191-199: Replace the broad "except Exception" around
AtifStepExtra.model_validate(extra_raw) with a targeted catch for Pydantic
validation failures (catch pydantic.ValidationError from Pydantic v2) so
converter bugs still surface; retain the existing handling inside the except
(use _is_terminal_agent_step and _message_to_str to capture last_agent_msg,
logger.debug("Skipping step %s: no valid AtifStepExtra", step.get("step_id"))
and continue) and let any other exceptions propagate normally.
- Around line 693-702: The loops updating inv_first and inv_last use truthy
checks that treat 0.0 as missing; change the conditions in the block that
iterates over inv entries and the subsequent loop over
extra.get("tool_invocations") so they explicitly check "is not None" for
start_timestamp and end_timestamp (e.g., replace "if inv.get('start_timestamp')"
and "if inv.get('end_timestamp')" and the "ti" checks with "if
inv.get('start_timestamp') is not None", etc.), ensuring inv_first and inv_last
are updated when timestamps are zero-valued.
- Around line 394-399: The code accesses tc["function_name"] and other required
fields directly when building a tool span (see _create_tool_span call that
passes tool_name=tc["function_name"]) which raises KeyError for malformed tool
calls; update the exporter to validate/sanitize the tool-call dict (e.g., check
'function_name' exists and is a str, and validate any other required keys such
as arguments/output) before calling _create_tool_span, and if a tool-call is
missing required fields simply skip that entry (log a warning) so the rest of
the trajectory export continues; apply the same guard logic to the other
occurrence around lines 505-510 where tc fields are accessed.
- Around line 181-183: The loop over steps dereferences step with
step.get("source", "") without ensuring each element is a dict; update the loop
in atif_trajectory_exporter.py to skip non-dict entries by checking
isinstance(step, dict) (or using a try/except AttributeError) before calling
step.get, logging or continuing when an item is malformed so the conversion
continues for the rest of the file; specifically modify the for step in steps:
block to guard access to step.get("source", "") and similar dereferences.
- Around line 591-603: When you overwrite span.context.trace_id for sub_spans,
also update the stored workflow trace id attribute so the span doesn't carry two
different trace ids: find where sub_spans are modified (the loop that sets
span.context.trace_id = parent_trace_id) and update the span's attribute that
holds the generated sub-trace id (the `${span_prefix}.workflow.trace_id` entry)
to the new parent_trace_id (or remove it) for each span; likewise when linking
the root_sub_span to parent_tool_span (root_sub_span.parent =
parent_tool_span.model_copy()), ensure you set root_sub_span.context.trace_id
and update or replace the same `${span_prefix}.workflow.trace_id` attribute on
root_sub_span to keep metadata in sync.
- Around line 47-49: The _iso_to_epoch function currently raises on malformed
ISO strings; change it to be non-fatal by making it return Optional[float] (or
float|None), wrapping datetime.fromisoformat(...) in a try/except catching
ValueError/TypeError (and possibly OverflowError), and return None on parse
failure; then update all callers (the root-bound calculation code and the
individual span creation paths that call _iso_to_epoch) to check for None and
skip that timestamp instead of aborting the whole trajectory. Ensure the
function and its type hints/annotations are updated (refer to _iso_to_epoch) and
add a small unit test or input validation comment to document the new behavior.
In
`@packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/export_atif_trajectory_to_phoenix.py`:
- Around line 95-100: The except block around exporter.export(trajectory) is
dropping the stack trace; replace logging.error("Failed to export %s: %s", path,
e) with a call that records the full exception info (e.g., logging.exception or
logging.error(..., exc_info=True)) so the traceback is preserved while still
setting has_failure = True and continuing; update the except block in the
export_atif_trajectory_to_phoenix.py code where exporter.export(trajectory) is
called.
- Around line 69-85: The per-file processing currently checks
path.exists()/is_file() outside the try, then parses JSON and accesses
trajectory.get(...) which can raise AttributeError for bad payloads; move the
full per-file load-and-validate logic into the try block so any file-check,
open/read, json.load, and subsequent accesses (the code that computes
agent_name, session_id, num_steps) are guarded, and catch PermissionError,
json.JSONDecodeError, AttributeError (and optionally OSError) to log a per-file
failure and continue; update the except to include these exceptions and ensure
the loop for path in args.files still sets has_failure = True and continues on
error.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 531a0aab-86c7-4a3a-b1f2-f7628477ad04
📒 Files selected for processing (6)
- packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/__init__.py
- packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/README.md
- packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/__init__.py
- packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py
- packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/atif_trajectory_phoenix_exporter.py
- packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/export_atif_trajectory_to_phoenix.py
✅ Files skipped from review due to trivial changes (1)
- packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/README.md
♻️ Duplicate comments (3)
packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/export_atif_trajectory_to_phoenix.py (3)
32-32: ⚠️ Potential issue | 🟡 Minor — Add a Google-style docstring to `main()`.

`main()` is a public CLI entrypoint and currently undocumented at Line 32.

As per coding guidelines, "Provide Google-style docstrings for every public module, class, function and CLI command".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/export_atif_trajectory_to_phoenix.py` at line 32, Add a Google-style docstring to the public CLI entrypoint function main(): describe what the command does (high-level purpose), document parameters (none) under Args, state the return value (None) under Returns, and note any side-effects or exceptions under Raises (e.g., SystemExit or IOErrors) as applicable; place the docstring immediately inside the main() function definition so tools and linters can pick it up and ensure it follows the project’s Google-style formatting conventions.
71-83: ⚠️ Potential issue | 🟠 Major — Harden file-read error handling to avoid mid-batch aborts.

At Line 78-80, `open()` can still raise `FileNotFoundError`/`OSError` after the pre-check at Line 72, which will terminate the run instead of continuing to the next file.

Proposed fix:

```diff
-        if not path.exists() or not path.is_file():
-            logging.error("File not found or not a regular file: %s", path)
-            has_failure = True
-            continue
-        try:
-            with open(path) as f:
+        try:
+            if not path.exists() or not path.is_file():
+                raise ValueError(f"File not found or not a regular file: {path}")
+
+            with path.open(encoding="utf-8") as f:
                 trajectory = json.load(f)
-        except (PermissionError, json.JSONDecodeError) as e:
-            logging.error("Failed to read/parse %s: %s", path, e)
+        except (OSError, ValueError, json.JSONDecodeError):
+            logging.exception("Failed to read/parse %s", path)
             has_failure = True
             continue
```

As per coding guidelines, "Validate and sanitise all user input, especially in web or CLI interfaces".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/export_atif_trajectory_to_phoenix.py` around lines 71 - 83, The file-read loop over args.files can still raise FileNotFoundError/OSError when calling open() even after the exists/is_file pre-check, causing the whole run to abort; modify the try/except around the open/json.load in export_atif_trajectory_to_phoenix.py so it catches OSError (and FileNotFoundError) in addition to PermissionError and json.JSONDecodeError, log the error with the path and exception, set has_failure = True, and continue to the next path to avoid mid-batch aborts; keep references to args.files, the with open(path) block, json.load, and has_failure when applying the change.
97-101: ⚠️ Potential issue | 🟠 Major — Narrow export exception handling and preserve traceback.

At Line 99-100, `except Exception` plus `logging.error(..., e)` drops stack traces and violates the project exception-logging rule. Use `logging.exception(...)` and prefer a narrower exception type from the exporter layer.

Proposed fix:

```diff
-        try:
-            exporter.export(trajectory)
-        except Exception as e:
-            logging.error("Failed to export %s: %s", path, e)
+        try:
+            exporter.export(trajectory)
+        except Exception:  # replace with concrete exporter exception type if available
+            logging.exception("Failed to export %s", path)
             has_failure = True
             continue
```

```bash
#!/bin/bash
# Verify the current violation and traceback-logging pattern.
rg -n -C2 'except Exception|logging\.error\("Failed to export' packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/export_atif_trajectory_to_phoenix.py
ruff check packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/export_atif_trajectory_to_phoenix.py --select BLE001
```

As per coding guidelines, "When catching and logging exceptions without re-raising, always use logger.exception() (equivalent to logger.error(exc_info=True)) to capture the full stack trace".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/export_atif_trajectory_to_phoenix.py` around lines 97 - 101, The current blanket except Exception around exporter.export(trajectory) swallows traceback; change it to catch the specific exporter-level exception (e.g., ExportError or ExporterError exported by the exporter module) and call logging.exception(...) to preserve the stack trace; specifically replace the block around exporter.export(trajectory) so it reads like: try: exporter.export(trajectory) except ExportError as e: logging.exception("Failed to export %s", path) has_failure = True — if the exporter has no specific exception type, add one (e.g., class ExportError(Exception)) in the exporter module and raise it from export so this narrower handler can be used.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In
`@packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/export_atif_trajectory_to_phoenix.py`:
- Line 32: Add a Google-style docstring to the public CLI entrypoint function
main(): describe what the command does (high-level purpose), document parameters
(none) under Args, state the return value (None) under Returns, and note any
side-effects or exceptions under Raises (e.g., SystemExit or IOErrors) as
applicable; place the docstring immediately inside the main() function
definition so tools and linters can pick it up and ensure it follows the
project’s Google-style formatting conventions.
- Around line 71-83: The file-read loop over args.files can still raise
FileNotFoundError/OSError when calling open() even after the exists/is_file
pre-check, causing the whole run to abort; modify the try/except around the
open/json.load in export_atif_trajectory_to_phoenix.py so it catches OSError
(and FileNotFoundError) in addition to PermissionError and json.JSONDecodeError,
log the error with the path and exception, set has_failure = True, and continue
to the next path to avoid mid-batch aborts; keep references to args.files, the
with open(path) block, json.load, and has_failure when applying the change.
- Around line 97-101: The current blanket except Exception around
exporter.export(trajectory) swallows traceback; change it to catch the specific
exporter-level exception (e.g., ExportError or ExporterError exported by the
exporter module) and call logging.exception(...) to preserve the stack trace;
specifically replace the block around exporter.export(trajectory) so it reads
like: try: exporter.export(trajectory) except ExportError as e:
logging.exception("Failed to export %s", path) has_failure = True — if the
exporter has no specific exception type, add one (e.g., class
ExportError(Exception)) in the exporter module and raise it from export so this
narrower handler can be used.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro Plus
Run ID: 78b32f0b-068e-4ffd-8f8a-b32b7aa25c32
📒 Files selected for processing (6)
- docs/source/improve-workflows/finetuning/dpo_with_nemo_customizer.md
- examples/finetuning/dpo_tic_tac_toe/README.md
- packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/__init__.py
- packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/__init__.py
- packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/atif_trajectory_phoenix_exporter.py
- packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/export_atif_trajectory_to_phoenix.py
✅ Files skipped from review due to trivial changes (4)
- docs/source/improve-workflows/finetuning/dpo_with_nemo_customizer.md
- packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/__init__.py
- examples/finetuning/dpo_tic_tac_toe/README.md
- packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/__init__.py
🚧 Files skipped from review as they are similar to previous changes (1)
- packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/atif_trajectory_phoenix_exporter.py
…ectory-exporter Signed-off-by: Yuchen Zhang <yuchenz@nvidia.com>
…#1874)
* This notebook was installing the `nvidia-nat-profiling` package which was dropped in v1.5, causing the notebook to install nat v1.3.
* Replace the model with a nano model to avoid being rate limited during the eval steps.
* Update `migration-guide.md` to fix profiler installation instructions.
* Replace broken documentation links (unrelated link check errors found in CI).
- I am familiar with the [Contributing Guidelines](https://github.com/NVIDIA/NeMo-Agent-Toolkit/blob/develop/docs/source/resources/contributing/index.md).
- We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
- Any contribution which contains commits that are not Signed-Off will not be accepted.
- When the PR is ready for review, new or existing tests cover these changes.
- When the PR is ready for review, the documentation is up to date with these changes.
* **Documentation**: Updated NeMo Customizer links in the finetuning guide; revised packaging/install guidance to replace the profiling extra with profiler and document the eval/profiler split.
* **Examples**: Updated Microservices setup link and cleaned README whitespace/formatting.
* **Notebooks**: Switched profiling extra name to profiler and added eval where applicable; updated generated model/config defaults (model selection and token limits) and bumped notebook kernel/Python metadata.
Authors:
- David Gardner (https://github.com/dagardner-nv)
Approvers:
- https://github.com/mnajafian-nv
- Bryan Bednarski (https://github.com/bbednarski9)
URL: NVIDIA#1874
Signed-off-by: Yuchen Zhang <yuchenz@nvidia.com>
Check out this pull request on ReviewNB to see visual diffs and provide feedback on Jupyter Notebooks. Powered by ReviewNB.
/merge
Description
Implement `ATIFTrajectoryExporter` to visualize the complete ATIF trajectory. This exporter/viewer is for debugging purposes only.
- `ATIFTrajectorySpanExporter` that converts complete ATIF trajectory JSON dicts into NAT Span objects, supporting agent steps (LLM), system/pipeline steps (FUNCTION), nested tool chains, and subagent trajectories
- `ATIFTrajectoryPhoenixExporter` that wraps the converter to batch-export spans to a Phoenix server via OpenTelemetry HTTP
- A CLI script (`export_atif_trajectory_to_phoenix.py`) for exporting one or more ATIF JSON files to Phoenix with configurable options

By submitting this PR I confirm:
Summary by CodeRabbit
New Features
Documentation