
Add ATIF trajectory exporter for Phoenix visualization and debugging#1869

Merged
rapids-bot[bot] merged 14 commits into NVIDIA:develop from
yczhang-nv:yuchen-atif-trajectory-exporter
Apr 16, 2026

Conversation

Contributor

@yczhang-nv yczhang-nv commented Apr 14, 2026

Description

Implement ATIFTrajectoryExporter to visualize the complete ATIF trajectory. This exporter/viewer is for debugging purposes only.

  • Add ATIFTrajectorySpanExporter that converts complete ATIF trajectory JSON dicts into NAT Span objects, supporting agent steps (LLM), system/pipeline steps (FUNCTION), nested tool chains, and subagent trajectories
  • Add ATIFTrajectoryPhoenixExporter that wraps the converter to batch-export spans to a Phoenix server via OpenTelemetry HTTP
  • Add CLI script (export_atif_trajectory_to_phoenix.py) for exporting one or more ATIF JSON files to Phoenix with configurable options

By submitting this PR I confirm:

  • I am familiar with the Contributing Guidelines.
  • We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
    • Any contribution which contains commits that are not Signed-Off will not be accepted.
  • When the PR is ready for review, new or existing tests cover these changes.
  • When the PR is ready for review, the documentation is up to date with these changes.

Summary by CodeRabbit

  • New Features

    • Added ATIF trajectory export to Phoenix, converting trajectories into OpenTelemetry-compatible spans.
    • New CLI to export one or multiple ATIF JSON files with endpoint, project, and verbose options.
  • Documentation

    • Added a comprehensive README documenting setup, running a Phoenix server, export CLI usage, span hierarchy expectations, and behavior notes for trajectory processing.

Signed-off-by: Yuchen Zhang <yuchenz@nvidia.com>
@yczhang-nv yczhang-nv self-assigned this Apr 14, 2026
@yczhang-nv yczhang-nv added feature request New feature or request non-breaking Non-breaking change labels Apr 14, 2026

coderabbitai Bot commented Apr 14, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
  • Description Check — ✅ Passed: Check skipped; CodeRabbit’s high-level summary is enabled.
  • Title Check — ✅ Passed: The title clearly and concisely describes the main change: adding ATIF trajectory export functionality for Phoenix visualization, matching the core objective of the PR.
  • Docstring Coverage — ✅ Passed: Docstring coverage is 82.86%, which meets the required threshold of 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Warning

Review ran into problems

🔥 Problems

Timed out fetching pipeline failures after 30000ms


Comment @coderabbitai help to get the list of available commands and usage tips.

@yczhang-nv yczhang-nv changed the title from "Implement ATIF trajectory exporter/viewer" to "Add batch ATIF trajectory exporter for Phoenix visualization and debugging" Apr 15, 2026
@yczhang-nv yczhang-nv changed the title from "Add batch ATIF trajectory exporter for Phoenix visualization and debugging" to "Add ATIF trajectory exporter for Phoenix visualization and debugging" Apr 15, 2026
@yczhang-nv yczhang-nv marked this pull request as ready for review April 15, 2026 20:42
@yczhang-nv yczhang-nv requested a review from a team as a code owner April 15, 2026 20:42

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 8

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py`:
- Around line 684-692: In _trajectory_time_bounds(), prefer invocation
timestamps per-boundary instead of the current all-or-nothing check: compute
has_inv_first = (inv_first != float("inf")) and has_inv_last = (inv_last !=
float("inf")), set first_ts = inv_first if has_inv_first else iso_first and
last_ts = inv_last if has_inv_last else iso_last, then normalize missing values
(e.g., if first_ts == float("inf") set first_ts = 0.0; if last_ts is
missing/invalid — float("inf") or 0.0 — set last_ts = first_ts) so start and end
fallback independently while keeping the existing normalization logic.
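The per-boundary fallback described above can be sketched as a standalone helper. The name `resolve_time_bounds` and the `float("inf")` sentinels for missing min/max accumulators follow the comment's wording; the real `_trajectory_time_bounds()` signature is an assumption:

```python
def resolve_time_bounds(inv_first: float, inv_last: float,
                        iso_first: float, iso_last: float) -> tuple[float, float]:
    """Prefer invocation timestamps per boundary, falling back independently.

    float("inf") marks a missing accumulator value, matching the sentinel
    convention described in the review comment.
    """
    has_inv_first = inv_first != float("inf")
    has_inv_last = inv_last != float("inf")
    first_ts = inv_first if has_inv_first else iso_first
    last_ts = inv_last if has_inv_last else iso_last
    # Normalize any still-missing values so callers always get finite bounds.
    if first_ts == float("inf"):
        first_ts = 0.0
    if last_ts in (float("inf"), 0.0):
        last_ts = first_ts
    return first_ts, last_ts
```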
- Around line 512-513: The current logic sets start_epoch = 0.0 when
start_timestamp is missing, producing huge bogus spans; change it so start_epoch
is not pinned to the Unix epoch: compute end_epoch = invocation.end_timestamp if
invocation and invocation.end_timestamp is not None else None, and set
start_epoch = invocation.start_timestamp if invocation and
invocation.start_timestamp is not None else (end_epoch if end_epoch is not None
else None) so that a missing start with a present end yields a zero-length or
end-anchored span instead of 0.0; update any downstream code that assumes
numeric start_epoch to handle None accordingly (references: start_epoch,
end_epoch, invocation in atif_trajectory_exporter.py).
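A minimal sketch of the end-anchoring logic, assuming the downstream code is updated to accept `None` as described (the helper name is illustrative, not the module's actual API):

```python
from typing import Optional


def resolve_span_epochs(
        start_timestamp: Optional[float],
        end_timestamp: Optional[float]) -> tuple[Optional[float], Optional[float]]:
    """Resolve span boundaries without pinning a missing start to the Unix epoch.

    A missing start with a present end yields an end-anchored, zero-length
    span; when both are missing, the caller gets (None, None) and can skip
    the span entirely.
    """
    end_epoch = end_timestamp
    if start_timestamp is not None:
        start_epoch = start_timestamp
    elif end_epoch is not None:
        start_epoch = end_epoch  # zero-length, end-anchored span
    else:
        start_epoch = None
    return start_epoch, end_epoch
```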
- Around line 130-132: In convert(), validate and sanitize trajectory_data
before accessing fields: check that trajectory_data is a dict and contains a
non-empty string "session_id", an "agent" dict with a non-empty string "name",
and "steps" is a list (possibly empty) of dicts; if any validation fails, log a
clear error including the file identifier and skip/return early instead of
letting KeyError/TypeError propagate. Use the existing variable names
session_id, agent_name, steps and the convert() entry point to locate where to
add these checks and error logging.
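The validation described above could look roughly like this (a hypothetical standalone helper; the real checks live inline in convert()):

```python
import logging

logger = logging.getLogger(__name__)


def validate_trajectory(trajectory_data: object, file_id: str) -> bool:
    """Return True only when the trajectory dict has the fields convert() needs."""
    if not isinstance(trajectory_data, dict):
        logger.error("%s: trajectory is not a dict", file_id)
        return False
    session_id = trajectory_data.get("session_id")
    if not (isinstance(session_id, str) and session_id):
        logger.error("%s: missing or empty 'session_id'", file_id)
        return False
    agent = trajectory_data.get("agent")
    if not (isinstance(agent, dict)
            and isinstance(agent.get("name"), str) and agent["name"]):
        logger.error("%s: 'agent' must be a dict with a non-empty 'name'", file_id)
        return False
    steps = trajectory_data.get("steps")
    if not (isinstance(steps, list) and all(isinstance(s, dict) for s in steps)):
        logger.error("%s: 'steps' must be a list of dicts", file_id)
        return False
    return True
```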
- Around line 173-180: The try/except around
AtifStepExtra.model_validate(extra_raw) is too broad; replace the generic
"except Exception" with catching pydantic's ValidationError only: import
ValidationError from pydantic and use "except ValidationError:" so validation
failures are handled as intended while other unexpected errors still surface;
keep the existing handling (checking _is_terminal_agent_step, setting
last_agent_msg via _message_to_str, and logger.debug) unchanged and located in
the except block for AtifStepExtra.model_validate.

In
`@packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_phoenix_exporter.py`:
- Around line 81-91: The except block around the call to
self._http_exporter.export(...) currently logs and swallows exceptions; change
it to log with logger.error (not logger.exception) and then re-raise the caught
exception so the CLI fails properly: inside the try/except that wraps
dangerously_using_project(self._project) and the call to
self._http_exporter.export(otel_spans), replace the existing except Exception:
logger.exception(...) swallow with an except Exception: logger.error("Error
exporting trajectory spans to Phoenix: %s", e) (or similar) followed by a bare
raise to preserve the original traceback.
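The log-then-re-raise pattern, sketched as a generic wrapper (the function and its signature are illustrative, not the exporter's actual API):

```python
import logging

logger = logging.getLogger(__name__)


def export_with_reraise(export_fn, spans):
    """Log export failures, then re-raise so the CLI exits non-zero.

    A bare `raise` preserves the original traceback, unlike raising a new
    exception or swallowing the error.
    """
    try:
        return export_fn(spans)
    except Exception as e:
        logger.error("Error exporting trajectory spans to Phoenix: %s", e)
        raise
```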

In
`@packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/export_atif_trajectory_to_phoenix.py`:
- Around line 67-88: Replace the current per-file handling inside the for path
in args.files loop with a narrow try/except around each file read/parse/export
so one bad input doesn't abort the batch: for each path, first check
path.exists() and path.is_file(), then open and json.load the file and call
exporter.export(trajectory) inside a try block that catches FileNotFoundError,
PermissionError, json.JSONDecodeError and any exporter exceptions; log an error
for that specific path and continue to the next file, and after the loop exit
with a non-zero status if any file failed (track a failure flag) so the CLI
returns failure but still processes all inputs.
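A sketch of that per-file loop, assuming the exporter call is injected (`export_files` and its return convention are illustrative, not the CLI's actual structure):

```python
import json
import logging
from pathlib import Path


def export_files(paths: list, export_fn) -> int:
    """Read, parse, and export each file; one bad input never aborts the batch.

    Returns 1 if any file failed, 0 otherwise, so the CLI can exit non-zero
    while still processing every input.
    """
    has_failure = False
    for path in paths:
        try:
            if not path.is_file():
                raise FileNotFoundError(f"not a regular file: {path}")
            with path.open(encoding="utf-8") as f:
                trajectory = json.load(f)
            export_fn(trajectory)
        except Exception as e:  # OSError, json.JSONDecodeError, exporter errors
            logging.error("Failed to process %s: %s", path, e)
            has_failure = True
            continue
    return 1 if has_failure else 0
```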

In
`@packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/README.md`:
- Around line 85-97: The fenced code block in README.md (the workflow ASCII
diagram) lacks a language tag which triggers MD040; update the fence opening
(the triple backticks before the WORKFLOW diagram) to include a language
identifier such as text (e.g., replace "```" with "```text") so the block is
explicitly annotated; ensure the closing fence remains "```" and no other
content changes in the WORKFLOW diagram.
- Around line 74-79: The fenced code block containing the ATIF Trajectory
diagram needs a language annotation to satisfy markdownlint; edit the README.md
fenced block that currently starts with ``` and change it to include a language
(for example, ```text) so the block becomes ```text followed by the diagram
lines (ATIF Trajectory (dict) -> ATIFTrajectorySpanExporter.convert() ->
convert_spans_to_otel_batch() -> HTTPSpanExporter.export()) and closing ```.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 9887c7d4-5532-4271-be3a-96bfe43dba78

📥 Commits

Reviewing files that changed from the base of the PR and between cbb5fed and ce748a4.

📒 Files selected for processing (5)
  • packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/README.md
  • packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/__init__.py
  • packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py
  • packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_phoenix_exporter.py
  • packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/export_atif_trajectory_to_phoenix.py


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 5

♻️ Duplicate comments (2)
packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/export_atif_trajectory_to_phoenix.py (1)

75-81: ⚠️ Potential issue | 🟠 Major

Handle OSError on file read to keep batch processing resilient.

Between Line 70 and Line 76 there is still a TOCTOU window; open() can fail with FileNotFoundError/other OSError and currently escape the handler.

Proposed fix
-        try:
-            with open(path) as f:
+        try:
+            with path.open(encoding="utf-8") as f:
                 trajectory = json.load(f)
-        except (PermissionError, json.JSONDecodeError) as e:
+        except (OSError, json.JSONDecodeError) as e:
             logging.error("Failed to read/parse %s: %s", path, e)
             has_failure = True
             continue

As per coding guidelines, "Validate and sanitise all user input, especially in web or CLI interfaces".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/export_atif_trajectory_to_phoenix.py`
around lines 75 - 81, The try/except around reading JSON only catches
PermissionError and json.JSONDecodeError but misses other OSError subclasses
(e.g., FileNotFoundError, race conditions) raised by open(); update the
exception handling in the block that opens and json.loads the file in
export_atif_trajectory_to_phoenix.py (the try using open(path) and json.load) to
also catch OSError (or FileNotFoundError) so the code logs the error via
logging.error("Failed to read/parse %s: %s", path, e), sets has_failure = True,
and continues, preserving batch resiliency.
packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py (1)

192-199: ⚠️ Potential issue | 🟠 Major

Narrow the catch at Line 194 to ValidationError only.

Catching Exception here can silently hide converter defects unrelated to payload validation and skip spans unexpectedly. Keep validation fallback behavior, but catch only Pydantic validation failures.

Proposed fix
+from pydantic import ValidationError
...
-            except Exception:
+            except ValidationError:
                 # Agent step without usable extra — capture output only
                 if _is_terminal_agent_step(step):
                     last_agent_msg = _message_to_str(step.get("message", ""))
                 logger.debug("Skipping step %s: no valid AtifStepExtra", step.get("step_id"))
                 continue
Verification script:

#!/bin/bash
# Verify broad exception handling around model_validate remains/narrows correctly.
rg -n -C2 'AtifStepExtra\.model_validate|except Exception|ValidationError' \
  packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py

As per coding guidelines, "Use ruff via ruff check --fix as configured in pyproject.toml; fix warnings unless explicitly ignored in pyproject.toml".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py`
around lines 192 - 199, Narrow the broad except in the
AtifStepExtra.model_validate block to catch only Pydantic validation failures:
replace the generic "except Exception" with "except ValidationError" (import
ValidationError from pydantic) so that conversion errors are still handled the
same way (preserve the fallback setting of last_agent_msg via
_is_terminal_agent_step and the logger.debug skip) but other unrelated
exceptions are not swallowed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py`:
- Around line 394-399: The code is indexing tool call payloads with
tc["function_name"] which can raise KeyError on malformed entries; update uses
of tc[...] (e.g., where tool_span is created via _create_tool_span and other
similar spots handling tool_calls) to safely access fields with
tc.get("function_name", "<unknown>") (or skip/continue when critical fields are
missing), log a warning when required fields are absent, and ensure the loop
continues rather than raising—apply the same defensive access pattern to
arguments, output, and any other tc["..."] occurrences in this module.
- Around line 696-705: The current truthy checks treat numeric epoch 0.0 as
missing; update the timestamp presence checks in atif_trajectory_exporter.py to
use explicit None checks: replace conditions like if inv.get("start_timestamp")
and if inv.get("end_timestamp") with if inv.get("start_timestamp") is not None
and if inv.get("end_timestamp") is not None, and similarly change the tool
invocation checks to if ti is not None and ti.get("start_timestamp") is not None
and if ti is not None and ti.get("end_timestamp") is not None so inv_first and
inv_last correctly include zero epochs for inv and extra["tool_invocations"].
- Around line 181-183: The loop iterating over steps (for step in steps) assumes
each element is a dict and calls step.get("source", ""), which will raise
AttributeError for malformed items; update the loop in the
export_trajectory_to_phoenix code (or the function containing the loop) to
validate each item before using .get — e.g., if not isinstance(step, dict):
optionally log or warn and continue, or coerce step = {} — then safely call
source = step.get("source", "") so non-dict entries are skipped/handled without
raising.

In
`@packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/export_atif_trajectory_to_phoenix.py`:
- Line 30: Add a Google-style docstring to the public CLI entrypoint function
main() that briefly describes its purpose (exports ATIF trajectories to
Phoenix), documents parameters (none) and return value (None), and notes any
side effects or exceptions (e.g., calls external services, may exit on failure);
place the docstring immediately under the def main(): signature in
export_atif_trajectory_to_phoenix.py so tools and users can discover the CLI
behavior.
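A minimal Google-style docstring for that entrypoint might look like the following (the body is elided; only the docstring shape is the point here):

```python
def main() -> None:
    """Export one or more ATIF trajectory JSON files to a Phoenix server.

    Takes no parameters; all configuration comes from CLI arguments. As a
    side effect, sends spans to an external Phoenix endpoint over HTTP.

    Returns:
        None.

    Raises:
        SystemExit: With a non-zero code when any input file fails to
            read, parse, or export.
    """
    # CLI body elided; this sketch only illustrates the docstring shape.
```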
- Around line 95-100: The current broad except block around
exporter.export(trajectory) logs errors with logging.error and loses the
traceback; change the handler to log the full exception context (use
logging.exception or logging.error(..., exc_info=True)) so stack traces are
preserved when catching exceptions in the export loop; update the except block
that references exporter.export, logging.error, has_failure and path to call
logging.exception("Failed to export %s", path) (or logging.error with
exc_info=True) and leave the has_failure and continue logic intact.
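A small self-contained demonstration of the difference: `logging.exception(...)` is equivalent to `logging.error(..., exc_info=True)` and attaches the full traceback to the log record (the simulated failure here is illustrative):

```python
import logging


def log_export_failure(path: str) -> None:
    """Log a simulated export failure with the full traceback attached."""
    try:
        raise ValueError("simulated export failure")
    except Exception:
        # logging.exception == logging.error(..., exc_info=True)
        logging.exception("Failed to export %s", path)
```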

---

Duplicate comments:
In
`@packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py`:
- Around line 192-199: Narrow the broad except in the
AtifStepExtra.model_validate block to catch only Pydantic validation failures:
replace the generic "except Exception" with "except ValidationError" (import
ValidationError from pydantic) so that conversion errors are still handled the
same way (preserve the fallback setting of last_agent_msg via
_is_terminal_agent_step and the logger.debug skip) but other unrelated
exceptions are not swallowed.

In
`@packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/export_atif_trajectory_to_phoenix.py`:
- Around line 75-81: The try/except around reading JSON only catches
PermissionError and json.JSONDecodeError but misses other OSError subclasses
(e.g., FileNotFoundError, race conditions) raised by open(); update the
exception handling in the block that opens and json.loads the file in
export_atif_trajectory_to_phoenix.py (the try using open(path) and json.load) to
also catch OSError (or FileNotFoundError) so the code logs the error via
logging.error("Failed to read/parse %s: %s", path, e), sets has_failure = True,
and continues, preserving batch resiliency.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 31025cb3-627e-4b00-bbd2-bb7f5cbfadff

📥 Commits

Reviewing files that changed from the base of the PR and between ce748a4 and 4823960.

📒 Files selected for processing (4)
  • packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/README.md
  • packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py
  • packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_phoenix_exporter.py
  • packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/export_atif_trajectory_to_phoenix.py
✅ Files skipped from review due to trivial changes (1)
  • packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/README.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_phoenix_exporter.py


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (4)
packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py (4)

394-399: ⚠️ Potential issue | 🟠 Major

Avoid hard-indexing malformed tool call payloads.

tc["function_name"] can raise and stop export for the whole trajectory. Guard type + required field, then continue.

Suggested fix
                 tc = tool_calls[idx]
+                if not isinstance(tc, dict):
+                    logger.debug("Skipping non-dict tool call at index %d: %s", idx, type(tc).__name__)
+                    continue
+                tool_name = tc.get("function_name")
+                if not isinstance(tool_name, str) or not tool_name:
+                    logger.debug("Skipping tool call with invalid function_name at index %d", idx)
+                    continue
                 t_anc = tool_ancestry[idx]
                 t_inv = tool_invocations[idx] if idx < len(tool_invocations) else None
...
                 tool_span = self._create_tool_span(
                     ancestry=t_anc,
                     invocation=t_inv,
-                    tool_name=tc["function_name"],
+                    tool_name=tool_name,
                     tool_args=tc.get("arguments", {}),

As per coding guidelines, "Validate and sanitise all user input, especially in web or CLI interfaces".

Also applies to: 505-510

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py`
around lines 394 - 399, The code is hard-indexing tc["function_name"] when
building tool_span which can raise if tc is malformed; update the call sites
that pass tc to _create_tool_span (e.g., where tool_span =
self._create_tool_span(... tool_name=tc["function_name"], ...) and the similar
block at the later occurrence) to first validate that tc is a dict and contains
a non-empty string "function_name" (and optionally that "arguments" is a dict),
and if validation fails log a warning and skip/continue that tool call instead
of indexing it; ensure you reference tc, _create_tool_span, tool_name and
tool_args when adding the guard so malformed payloads do not abort the entire
export.

192-199: ⚠️ Potential issue | 🟡 Minor

Catch ValidationError instead of Exception around model_validate().

Line 194 swallows unexpected converter bugs and silently skips spans. Restrict this to Pydantic validation failures only.

Suggested fix
+from pydantic import ValidationError
...
-            except Exception:
+            except ValidationError:
                 # Agent step without usable extra — capture output only
                 if _is_terminal_agent_step(step):
                     last_agent_msg = _message_to_str(step.get("message", ""))
                 logger.debug("Skipping step %s: no valid AtifStepExtra", step.get("step_id"))
                 continue
Verification script:

#!/bin/bash
# Verify broad exception handling around model_validate in this module.
rg -n -C2 "model_validate|except Exception|except ValidationError" packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py

# Confirm installed pydantic version in the environment used for checks.
python - <<'PY'
import pydantic
print("pydantic", pydantic.__version__)
PY
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py`
around lines 192 - 199, Change the broad except Exception around
AtifStepExtra.model_validate to catch only pydantic.ValidationError: import
ValidationError from pydantic, replace "except Exception:" with "except
ValidationError:" in the block that calls
AtifStepExtra.model_validate(extra_raw); keep the existing handling (capture
last_agent_msg via _is_terminal_agent_step/_message_to_str, logger.debug and
continue) and allow other unexpected exceptions to propagate so converter bugs
are not silently swallowed.

148-160: ⚠️ Potential issue | 🟠 Major

Validate steps entries before any .get() access.

convert() validates that steps is a list, but Line 159 passes it to _trajectory_time_bounds(), which assumes each item is a dict. A malformed element will raise AttributeError and abort the export.

Suggested fix
-        steps = trajectory_data.get("steps")
-        if not isinstance(steps, list):
+        steps_raw = trajectory_data.get("steps")
+        if not isinstance(steps_raw, list):
             logger.error("Trajectory %s: 'steps' is not a list (got %r), skipping", session_id, type(steps_raw).__name__)
             return []
+        steps = [step for step in steps_raw if isinstance(step, dict)]
+        if len(steps) != len(steps_raw):
+            logger.debug(
+                "Trajectory %s: skipped %d non-dict step entries",
+                session_id,
+                len(steps_raw) - len(steps),
+            )

As per coding guidelines, "Validate and sanitise all user input, especially in web or CLI interfaces".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py`
around lines 148 - 160, The code calls self._trajectory_time_bounds(steps)
assuming every element in steps is a dict; first validate and sanitize entries
in convert() before any .get() access by checking that steps is a list of dicts
(and optionally contains expected keys/timestamp fields) and either filter out
non-dict/malformed entries or return [] and log an error with session_id; update
the logic around the steps variable (used by _trajectory_time_bounds,
span_lookup, delegation_refs, etc.) so only validated dict entries are passed to
_trajectory_time_bounds and the rest of the export flow.

684-702: ⚠️ Potential issue | 🟡 Minor

Use explicit is not None checks for epoch fields.

Current truthy checks drop valid 0.0 timestamps, which can skew workflow boundaries.

Suggested fix
-                if inv.get("start_timestamp"):
+                if inv.get("start_timestamp") is not None:
                     inv_first = min(inv_first, inv["start_timestamp"])
-                if inv.get("end_timestamp"):
+                if inv.get("end_timestamp") is not None:
                     inv_last = max(inv_last, inv["end_timestamp"])
...
-                if ti and ti.get("start_timestamp"):
+                if ti and ti.get("start_timestamp") is not None:
                     inv_first = min(inv_first, ti["start_timestamp"])
-                if ti and ti.get("end_timestamp"):
+                if ti and ti.get("end_timestamp") is not None:
                     inv_last = max(inv_last, ti["end_timestamp"])
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py`
around lines 684 - 702, The code in atif_trajectory_exporter.py uses truthy
checks that drop valid zero timestamps; change checks to explicit "is not None"
for timestamp fields: verify ts is not None before calling _iso_to_epoch and
comparing to iso_first/iso_last, and replace
inv.get("start_timestamp")/end_timestamp and
ti.get("start_timestamp")/end_timestamp truthy checks with explicit "is not
None" checks when updating inv_first and inv_last (refer to variables ts, epoch,
_iso_to_epoch, iso_first, iso_last, extra, inv, inv_first, inv_last,
tool_invocations, ti).
🧹 Nitpick comments (1)
packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py (1)

96-104: Use Google-style docstrings for public API surfaces.

ATIFTrajectorySpanExporter and convert() docstrings are NumPy-style (Parameters / Returns sections), but repository guidance requires Google-style docstrings.

As per coding guidelines, "Provide Google-style docstrings for every public module, class, function and CLI command".

Also applies to: 116-129

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py`
around lines 96 - 104, The ATIFTrajectorySpanExporter class and its convert()
method use NumPy-style docstrings; change them to Google-style docstrings for
public API surfaces per repository guidance. Update the class docstring for
ATIFTrajectorySpanExporter to use a one-line summary, an optional longer
description, and Google-style Args (span_prefix) and Attributes if needed;
similarly replace the convert() docstring's Parameters/Returns sections with
Google-style Args and Returns blocks, keeping types and descriptions intact.
Ensure both docstrings are formatted consistently with other public
classes/functions and that any references to NAT_SPAN_PREFIX or default values
remain clear in the new docstrings.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py`:
- Around line 590-594: You override span.context.trace_id for spans in sub_spans
but do not update the stored workflow trace attribute, causing inconsistent
metadata; inside the same loop that sets span.context.trace_id = parent_trace_id
(iterating sub_spans), also overwrite the span attribute keyed by
f"{span_prefix}.workflow.trace_id" (or the literal key used in the code) to
parent_trace_id so the span's attributes and context remain consistent — update
the attribute on the same span object (use span.attributes or span.tags
depending on implementation) after changing span.context.trace_id.
- Around line 47-49: The _iso_to_epoch function currently lets
datetime.fromisoformat(timestamp) raise ValueError and abort processing; wrap
the parse in a try/except that catches ValueError and TypeError, log a warning
that includes the offending timestamp (so you can trace malformed inputs) and
return a non-fatal sentinel (e.g. float('nan') or None) so the caller can
continue processing the rest of the file; update only _iso_to_epoch to handle
the error and include the timestamp in the log message for easier debugging.
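A hedged sketch of that hardening, using NaN as the non-fatal sentinel (the function name mirrors `_iso_to_epoch`, but the real signature and sentinel choice are assumptions):

```python
import logging
from datetime import datetime

logger = logging.getLogger(__name__)


def iso_to_epoch(timestamp) -> float:
    """Parse an ISO-8601 timestamp to a Unix epoch; NaN on malformed input."""
    try:
        return datetime.fromisoformat(timestamp).timestamp()
    except (ValueError, TypeError):
        # Include the offending value so malformed inputs are traceable.
        logger.warning("Malformed ISO timestamp %r; skipping", timestamp)
        return float("nan")
```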

---

Duplicate comments:
In
`@packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py`:
- Around line 394-399: The code is hard-indexing tc["function_name"] when
building tool_span which can raise if tc is malformed; update the call sites
that pass tc to _create_tool_span (e.g., where tool_span =
self._create_tool_span(... tool_name=tc["function_name"], ...) and the similar
block at the later occurrence) to first validate that tc is a dict and contains
a non-empty string "function_name" (and optionally that "arguments" is a dict),
and if validation fails log a warning and skip/continue that tool call instead
of indexing it; ensure you reference tc, _create_tool_span, tool_name and
tool_args when adding the guard so malformed payloads do not abort the entire
export.
- Around line 192-199: Change the broad except Exception around
AtifStepExtra.model_validate to catch only pydantic.ValidationError: import
ValidationError from pydantic, replace "except Exception:" with "except
ValidationError:" in the block that calls
AtifStepExtra.model_validate(extra_raw); keep the existing handling (capture
last_agent_msg via _is_terminal_agent_step/_message_to_str, logger.debug and
continue) and allow other unexpected exceptions to propagate so converter bugs
are not silently swallowed.
- Around line 148-160: The code calls self._trajectory_time_bounds(steps)
assuming every element in steps is a dict; first validate and sanitize entries
in convert() before any .get() access by checking that steps is a list of dicts
(and optionally contains expected keys/timestamp fields) and either filter out
non-dict/malformed entries or return [] and log an error with session_id; update
the logic around the steps variable (used by _trajectory_time_bounds,
span_lookup, delegation_refs, etc.) so only validated dict entries are passed to
_trajectory_time_bounds and the rest of the export flow.
- Around line 684-702: The code in atif_trajectory_exporter.py uses truthy
checks that drop valid zero timestamps; change checks to explicit "is not None"
for timestamp fields: verify ts is not None before calling _iso_to_epoch and
comparing to iso_first/iso_last, and replace
inv.get("start_timestamp")/end_timestamp and
ti.get("start_timestamp")/end_timestamp truthy checks with explicit "is not
None" checks when updating inv_first and inv_last (refer to variables ts, epoch,
_iso_to_epoch, iso_first, iso_last, extra, inv, inv_first, inv_last,
tool_invocations, ti).
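The tool-call guard described above can be sketched as a small filter (the helper name is hypothetical; the real code inlines the check before `_create_tool_span`):

```python
import logging

logger = logging.getLogger(__name__)


def iter_valid_tool_calls(tool_calls):
    """Yield (index, name, args) only for tool calls with a usable function_name."""
    for idx, tc in enumerate(tool_calls):
        if not isinstance(tc, dict):
            logger.warning("Skipping non-dict tool call at index %d", idx)
            continue
        name = tc.get("function_name")
        if not isinstance(name, str) or not name:
            logger.warning("Skipping tool call without a function_name at index %d", idx)
            continue
        # Malformed entries are dropped instead of aborting the whole export.
        yield idx, name, tc.get("arguments", {})
```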
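The narrowed exception handling for the Pydantic validation step can be sketched as follows (assumes Pydantic v2; `AtifStepExtra` here is a stand-in with a made-up field — only the `except` clause is the point):

```python
from pydantic import BaseModel, ValidationError


class AtifStepExtra(BaseModel):
    """Stand-in for the real model; only the exception pattern matters."""
    step_type: str


def parse_step_extra(extra_raw):
    try:
        return AtifStepExtra.model_validate(extra_raw)
    except ValidationError:
        # Only bad ATIF payloads are suppressed; converter bugs still propagate.
        return None
```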
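The steps-sanitisation described above could look like this (helper name and log wording are illustrative):

```python
import logging

logger = logging.getLogger(__name__)


def sanitize_steps(steps, session_id="unknown"):
    """Return only the dict entries from steps; log and drop everything else."""
    if not isinstance(steps, list):
        logger.error("Trajectory %s has non-list steps; skipping export", session_id)
        return []
    valid = [s for s in steps if isinstance(s, dict)]
    dropped = len(steps) - len(valid)
    if dropped:
        logger.warning("Dropped %d malformed step(s) in trajectory %s",
                       dropped, session_id)
    return valid
```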
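The `is not None` fix matters because `0.0` is falsy in Python; a minimal sketch of the corrected bounds update (function name is illustrative):

```python
def update_bounds(inv_first: float, inv_last: float, inv: dict) -> tuple[float, float]:
    """Widen (inv_first, inv_last) using inv's timestamps, treating 0.0 as valid."""
    start = inv.get("start_timestamp")
    end = inv.get("end_timestamp")
    if start is not None:  # a truthy check would silently drop a 0.0 epoch
        inv_first = min(inv_first, start)
    if end is not None:
        inv_last = max(inv_last, end)
    return inv_first, inv_last
```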

---

Nitpick comments:
In
`@packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py`:
- Around line 96-104: The ATIFTrajectorySpanExporter class and its convert()
method use NumPy-style docstrings; change them to Google-style docstrings for
public API surfaces per repository guidance. Update the class docstring for
ATIFTrajectorySpanExporter to use a one-line summary, an optional longer
description, and Google-style Args (span_prefix) and Attributes if needed;
similarly replace the convert() docstring's Parameters/Returns sections with
Google-style Args and Returns blocks, keeping types and descriptions intact.
Ensure both docstrings are formatted consistently with other public
classes/functions and that any references to NAT_SPAN_PREFIX or default values
remain clear in the new docstrings.
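A sketch of the requested Google-style shape (class and method names come from the PR; the constant value and placeholder bodies are assumptions):

```python
NAT_SPAN_PREFIX = "nat"  # assumed default; the real constant lives in the package


class ATIFTrajectorySpanExporter:
    """Converts complete ATIF trajectory dicts into NAT Span objects.

    Args:
        span_prefix: Prefix used for span attribute keys. Defaults to
            ``NAT_SPAN_PREFIX``.
    """

    def __init__(self, span_prefix: str = NAT_SPAN_PREFIX) -> None:
        self._span_prefix = span_prefix

    def convert(self, trajectory: dict) -> list:
        """Convert one ATIF trajectory into spans.

        Args:
            trajectory: Parsed ATIF trajectory JSON dict.

        Returns:
            The list of converted spans; empty when the trajectory is invalid.
        """
        return []  # placeholder body for illustration only
```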
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: d8367f5b-8caa-4e9c-bfcd-e9a94b02758c

📥 Commits

Reviewing files that changed from the base of the PR and between 4823960 and 079b37d.

📒 Files selected for processing (1)
  • packages/nvidia_nat_atif/src/nat/atif/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py

@yczhang-nv yczhang-nv marked this pull request as draft April 15, 2026 23:34
@yczhang-nv yczhang-nv marked this pull request as ready for review April 15, 2026 23:48
Signed-off-by: Yuchen Zhang <yuchenz@nvidia.com>

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (8)
packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/export_atif_trajectory_to_phoenix.py (2)

95-100: ⚠️ Potential issue | 🟠 Major

Preserve the traceback for export failures.

This branch intentionally keeps the batch running, so logging.error("... %s", e) drops the one thing you need for debugging the failing file: the stack trace.

Suggested fix
-        except Exception as e:
-            logging.error("Failed to export %s: %s", path, e)
+        except Exception:
+            logging.exception("Failed to export %s", path)
             has_failure = True
             continue

As per coding guidelines, "When catching and logging exceptions without re-raising, always use logger.exception() (equivalent to logger.error(exc_info=True)) to capture the full stack trace information".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/export_atif_trajectory_to_phoenix.py`
around lines 95 - 100, The except block around exporter.export(trajectory) is
dropping the stack trace; replace logging.error("Failed to export %s: %s", path,
e) with a call that records the full exception info (e.g., logging.exception or
logging.error(..., exc_info=True)) so the traceback is preserved while still
setting has_failure = True and continuing; update the except block in the
export_atif_trajectory_to_phoenix.py code where exporter.export(trajectory) is
called.

69-85: ⚠️ Potential issue | 🟠 Major

Keep the full per-file load path inside the guarded block.

A bad input can still stop the batch here: path.exists()/is_file() happens outside the try, and a valid JSON payload that is not an object will hit trajectory.get(...) at Lines 83-85 with AttributeError. This should all degrade to a per-file failure and continue.

Suggested fix
     for path in args.files:
-        if not path.exists() or not path.is_file():
-            logging.error("File not found or not a regular file: %s", path)
-            has_failure = True
-            continue
-
         try:
-            with open(path) as f:
+            if not path.exists() or not path.is_file():
+                raise ValueError(f"File not found or not a regular file: {path}")
+
+            with path.open(encoding="utf-8") as f:
                 trajectory = json.load(f)
-        except (PermissionError, json.JSONDecodeError) as e:
-            logging.error("Failed to read/parse %s: %s", path, e)
+
+            if not isinstance(trajectory, dict):
+                raise ValueError(f"Top-level JSON must be an object: {path}")
+
+            agent = trajectory.get("agent")
+            agent_name = agent.get("name", "unknown") if isinstance(agent, dict) else "unknown"
+            session_id = trajectory.get("session_id", "unknown")
+            steps = trajectory.get("steps", [])
+            num_steps = len(steps) if isinstance(steps, list) else 0
+        except (OSError, ValueError, json.JSONDecodeError):
+            logging.exception("Failed to load %s", path)
             has_failure = True
             continue
-
-        agent_name = trajectory.get("agent", {}).get("name", "unknown")
-        session_id = trajectory.get("session_id", "unknown")
-        num_steps = len(trajectory.get("steps", []))

As per coding guidelines, "Validate and sanitise all user input, especially in web or CLI interfaces".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/export_atif_trajectory_to_phoenix.py`
around lines 69 - 85, The per-file processing currently checks
path.exists()/is_file() outside the try, then parses JSON and accesses
trajectory.get(...) which can raise AttributeError for bad payloads; move the
full per-file load-and-validate logic into the try block so any file-check,
open/read, json.load, and subsequent accesses (the code that computes
agent_name, session_id, num_steps) are guarded, and catch PermissionError,
json.JSONDecodeError, AttributeError (and optionally OSError) to log a per-file
failure and continue; update the except to include these exceptions and ensure
the loop for path in args.files still sets has_failure = True and continues on
error.
packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py (6)

191-199: ⚠️ Potential issue | 🟡 Minor

Catch ValidationError here, not every exception.

AtifStepExtra.model_validate() should only suppress payload validation failures. Catching Exception also hides unrelated converter bugs as “bad ATIF input”, which makes real regressions much harder to spot.

Suggested fix
+from pydantic import ValidationError
...
-            except Exception:
+            except ValidationError:
                 # Agent step without usable extra — capture output only
                 if _is_terminal_agent_step(step):
                     last_agent_msg = _message_to_str(step.get("message", ""))
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py`
around lines 191 - 199, Replace the broad "except Exception" around
AtifStepExtra.model_validate(extra_raw) with a targeted catch for Pydantic
validation failures (catch pydantic.ValidationError from Pydantic v2) so
converter bugs still surface; retain the existing handling inside the except
(use _is_terminal_agent_step and _message_to_str to capture last_agent_msg,
logger.debug("Skipping step %s: no valid AtifStepExtra", step.get("step_id"))
and continue) and let any other exceptions propagate normally.

693-702: ⚠️ Potential issue | 🟡 Minor

Use explicit is not None checks for epoch fields.

These truthy checks treat valid 0.0 timestamps as missing, which can push the workflow bounds back to the ISO fallback even when authoritative invocation times are present.

Suggested fix
-                if inv.get("start_timestamp"):
+                if inv.get("start_timestamp") is not None:
                     inv_first = min(inv_first, inv["start_timestamp"])
-                if inv.get("end_timestamp"):
+                if inv.get("end_timestamp") is not None:
                     inv_last = max(inv_last, inv["end_timestamp"])
@@
-                if ti and ti.get("start_timestamp"):
+                if ti and ti.get("start_timestamp") is not None:
                     inv_first = min(inv_first, ti["start_timestamp"])
-                if ti and ti.get("end_timestamp"):
+                if ti and ti.get("end_timestamp") is not None:
                     inv_last = max(inv_last, ti["end_timestamp"])
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py`
around lines 693 - 702, The loops updating inv_first and inv_last use truthy
checks that treat 0.0 as missing; change the conditions in the block that
iterates over inv entries and the subsequent loop over
extra.get("tool_invocations") so they explicitly check "is not None" for
start_timestamp and end_timestamp (e.g., replace "if inv.get('start_timestamp')"
and "if inv.get('end_timestamp')" and the "ti" checks with "if
inv.get('start_timestamp') is not None", etc.), ensuring inv_first and inv_last
are updated when timestamps are zero-valued.

394-399: ⚠️ Potential issue | 🟠 Major

Guard required tool-call fields before indexing them.

tc["function_name"] turns one malformed tool call into a KeyError that aborts the rest of the trajectory export. This path should skip bad entries and keep the rest of the trace.

Suggested fix
                 tc = tool_calls[idx]
+                if not isinstance(tc, dict):
+                    logger.debug("Skipping non-dict tool call at index %d in trajectory %s", idx, session_id)
+                    continue
                 t_anc = tool_ancestry[idx]
                 t_inv = tool_invocations[idx] if idx < len(tool_invocations) else None
+                tool_name = tc.get("function_name")
+                if not isinstance(tool_name, str) or not tool_name:
+                    logger.debug("Skipping tool call without valid function_name at index %d in trajectory %s", idx, session_id)
+                    continue
 
                 obs_content: str | None = None
                 if idx < len(obs_results):
@@
                 tool_span = self._create_tool_span(
                     ancestry=t_anc,
                     invocation=t_inv,
-                    tool_name=tc["function_name"],
+                    tool_name=tool_name,
                     tool_args=tc.get("arguments", {}),
                     tool_output=obs_content,
                     trace_id=trace_id,

As per coding guidelines, "Validate and sanitise all user input, especially in web or CLI interfaces".

Also applies to: 505-510

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py`
around lines 394 - 399, The code accesses tc["function_name"] and other required
fields directly when building a tool span (see _create_tool_span call that
passes tool_name=tc["function_name"]) which raises KeyError for malformed tool
calls; update the exporter to validate/sanitize the tool-call dict (e.g., check
'function_name' exists and is a str, and validate any other required keys such
as arguments/output) before calling _create_tool_span, and if a tool-call is
missing required fields simply skip that entry (log a warning) so the rest of
the trajectory export continues; apply the same guard logic to the other
occurrence around lines 505-510 where tc fields are accessed.

181-183: ⚠️ Potential issue | 🟠 Major

Skip non-dict steps entries before dereferencing them.

steps is only validated as a list. A single malformed element still makes step.get(...) raise AttributeError and abort conversion for the file.

As per coding guidelines, "Validate and sanitise all user input, especially in web or CLI interfaces".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py`
around lines 181 - 183, The loop over steps dereferences step with
step.get("source", "") without ensuring each element is a dict; update the loop
in atif_trajectory_exporter.py to skip non-dict entries by checking
isinstance(step, dict) (or using a try/except AttributeError) before calling
step.get, logging or continuing when an item is malformed so the conversion
continues for the rest of the file; specifically modify the for step in steps:
block to guard access to step.get("source", "") and similar dereferences.

591-603: ⚠️ Potential issue | 🟡 Minor

Keep trace metadata in sync when reusing the parent trace id.

These spans get span.context.trace_id = parent_trace_id, but their stored ${span_prefix}.workflow.trace_id attribute still points at the old generated sub-trace. That leaves the same span carrying two different trace ids.

Suggested fix
         for span in sub_spans:
             if span.context:
                 span.context.trace_id = parent_trace_id
+                span.set_attribute(
+                    f"{self._span_prefix}.workflow.trace_id",
+                    f"{parent_trace_id:032x}",
+                )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py`
around lines 591 - 603, When you overwrite span.context.trace_id for sub_spans,
also update the stored workflow trace id attribute so the span doesn't carry two
different trace ids: find where sub_spans are modified (the loop that sets
span.context.trace_id = parent_trace_id) and update the span's attribute that
holds the generated sub-trace id (the `${span_prefix}.workflow.trace_id` entry)
to the new parent_trace_id (or remove it) for each span; likewise when linking
the root_sub_span to parent_tool_span (root_sub_span.parent =
parent_tool_span.model_copy()), ensure you set root_sub_span.context.trace_id
and update or replace the same `${span_prefix}.workflow.trace_id` attribute on
root_sub_span to keep metadata in sync.

47-49: ⚠️ Potential issue | 🟠 Major

Make ISO timestamp parsing non-fatal.

_iso_to_epoch() is used by both root-bound calculation and individual span creation. One malformed timestamp currently raises ValueError and drops the whole trajectory instead of just skipping that timestamp.

As per coding guidelines, "Validate and sanitise all user input, especially in web or CLI interfaces".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py`
around lines 47 - 49, The _iso_to_epoch function currently raises on malformed
ISO strings; change it to be non-fatal by making it return Optional[float] (or
float|None), wrapping datetime.fromisoformat(...) in a try/except catching
ValueError/TypeError (and possibly OverflowError), and return None on parse
failure; then update all callers (the root-bound calculation code and the
individual span creation paths that call _iso_to_epoch) to check for None and
skip that timestamp instead of aborting the whole trajectory. Ensure the
function and its type hints/annotations are updated (refer to _iso_to_epoch) and
add a small unit test or input validation comment to document the new behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/atif_trajectory_phoenix_exporter.py`:
- Around line 71-83: The conversion step (self._converter.convert) currently
runs outside the try/except so conversion errors bypass the export error
boundary; move or wrap the call to self._converter.convert(trajectory_data) (and
subsequent convert_spans_to_otel_batch and span resource-setting) inside the
existing try that uses dangerously_using_project and calls
self._http_exporter.export, so any exceptions from _converter.convert,
convert_spans_to_otel_batch, or span.set_resource are caught and logged by the
export error handling path (which surrounds dangerously_using_project and
_http_exporter.export); update references to nat_spans and otel_spans
accordingly so they are defined inside that try block and preserve the
early-return behavior when nat_spans is falsy by performing the check inside the
try.

---

Duplicate comments:
In
`@packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py`:
- Around line 191-199: Replace the broad "except Exception" around
AtifStepExtra.model_validate(extra_raw) with a targeted catch for Pydantic
validation failures (catch pydantic.ValidationError from Pydantic v2) so
converter bugs still surface; retain the existing handling inside the except
(use _is_terminal_agent_step and _message_to_str to capture last_agent_msg,
logger.debug("Skipping step %s: no valid AtifStepExtra", step.get("step_id"))
and continue) and let any other exceptions propagate normally.
- Around line 693-702: The loops updating inv_first and inv_last use truthy
checks that treat 0.0 as missing; change the conditions in the block that
iterates over inv entries and the subsequent loop over
extra.get("tool_invocations") so they explicitly check "is not None" for
start_timestamp and end_timestamp (e.g., replace "if inv.get('start_timestamp')"
and "if inv.get('end_timestamp')" and the "ti" checks with "if
inv.get('start_timestamp') is not None", etc.), ensuring inv_first and inv_last
are updated when timestamps are zero-valued.
- Around line 394-399: The code accesses tc["function_name"] and other required
fields directly when building a tool span (see _create_tool_span call that
passes tool_name=tc["function_name"]) which raises KeyError for malformed tool
calls; update the exporter to validate/sanitize the tool-call dict (e.g., check
'function_name' exists and is a str, and validate any other required keys such
as arguments/output) before calling _create_tool_span, and if a tool-call is
missing required fields simply skip that entry (log a warning) so the rest of
the trajectory export continues; apply the same guard logic to the other
occurrence around lines 505-510 where tc fields are accessed.
- Around line 181-183: The loop over steps dereferences step with
step.get("source", "") without ensuring each element is a dict; update the loop
in atif_trajectory_exporter.py to skip non-dict entries by checking
isinstance(step, dict) (or using a try/except AttributeError) before calling
step.get, logging or continuing when an item is malformed so the conversion
continues for the rest of the file; specifically modify the for step in steps:
block to guard access to step.get("source", "") and similar dereferences.
- Around line 591-603: When you overwrite span.context.trace_id for sub_spans,
also update the stored workflow trace id attribute so the span doesn't carry two
different trace ids: find where sub_spans are modified (the loop that sets
span.context.trace_id = parent_trace_id) and update the span's attribute that
holds the generated sub-trace id (the `${span_prefix}.workflow.trace_id` entry)
to the new parent_trace_id (or remove it) for each span; likewise when linking
the root_sub_span to parent_tool_span (root_sub_span.parent =
parent_tool_span.model_copy()), ensure you set root_sub_span.context.trace_id
and update or replace the same `${span_prefix}.workflow.trace_id` attribute on
root_sub_span to keep metadata in sync.
- Around line 47-49: The _iso_to_epoch function currently raises on malformed
ISO strings; change it to be non-fatal by making it return Optional[float] (or
float|None), wrapping datetime.fromisoformat(...) in a try/except catching
ValueError/TypeError (and possibly OverflowError), and return None on parse
failure; then update all callers (the root-bound calculation code and the
individual span creation paths that call _iso_to_epoch) to check for None and
skip that timestamp instead of aborting the whole trajectory. Ensure the
function and its type hints/annotations are updated (refer to _iso_to_epoch) and
add a small unit test or input validation comment to document the new behavior.

In
`@packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/export_atif_trajectory_to_phoenix.py`:
- Around line 95-100: The except block around exporter.export(trajectory) is
dropping the stack trace; replace logging.error("Failed to export %s: %s", path,
e) with a call that records the full exception info (e.g., logging.exception or
logging.error(..., exc_info=True)) so the traceback is preserved while still
setting has_failure = True and continuing; update the except block in the
export_atif_trajectory_to_phoenix.py code where exporter.export(trajectory) is
called.
- Around line 69-85: The per-file processing currently checks
path.exists()/is_file() outside the try, then parses JSON and accesses
trajectory.get(...) which can raise AttributeError for bad payloads; move the
full per-file load-and-validate logic into the try block so any file-check,
open/read, json.load, and subsequent accesses (the code that computes
agent_name, session_id, num_steps) are guarded, and catch PermissionError,
json.JSONDecodeError, AttributeError (and optionally OSError) to log a per-file
failure and continue; update the except to include these exceptions and ensure
the loop for path in args.files still sets has_failure = True and continues on
error.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 531a0aab-86c7-4a3a-b1f2-f7628477ad04

📥 Commits

Reviewing files that changed from the base of the PR and between 079b37d and 58a5129.

📒 Files selected for processing (6)
  • packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/__init__.py
  • packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/README.md
  • packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/__init__.py
  • packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/atif_trajectory_exporter.py
  • packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/atif_trajectory_phoenix_exporter.py
  • packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/export_atif_trajectory_to_phoenix.py
✅ Files skipped from review due to trivial changes (1)
  • packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/README.md

Signed-off-by: Yuchen Zhang <yuchenz@nvidia.com>

@coderabbitai coderabbitai Bot left a comment


♻️ Duplicate comments (3)
packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/export_atif_trajectory_to_phoenix.py (3)

32-32: ⚠️ Potential issue | 🟡 Minor

Add a Google-style docstring to main().

main() is a public CLI entrypoint and currently undocumented at Line 32.

As per coding guidelines, "Provide Google-style docstrings for every public module, class, function and CLI command".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/export_atif_trajectory_to_phoenix.py`
at line 32, Add a Google-style docstring to the public CLI entrypoint function
main(): describe what the command does (high-level purpose), document parameters
(none) under Args, state the return value (None) under Returns, and note any
side-effects or exceptions under Raises (e.g., SystemExit or IOErrors) as
applicable; place the docstring immediately inside the main() function
definition so tools and linters can pick it up and ensure it follows the
project’s Google-style formatting conventions.
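One possible shape for that docstring, assuming the CLI behavior described elsewhere in this review (non-zero exit on failure is an assumption):

```python
def main() -> None:
    """Export ATIF trajectory JSON files to a Phoenix server.

    Parses command-line arguments (input files, Phoenix endpoint, project
    name, verbosity), loads each trajectory file, and exports its spans.

    Returns:
        None. The process exits with a non-zero status if any file fails
        to load or export.
    """
    # Real argument parsing and export logic would go here.
```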

71-83: ⚠️ Potential issue | 🟠 Major

Harden file-read error handling to avoid mid-batch aborts.

At Line 78-80, open() can still raise FileNotFoundError/OSError after the pre-check at Line 72, which will terminate the run instead of continuing to the next file.

Proposed fix
-        if not path.exists() or not path.is_file():
-            logging.error("File not found or not a regular file: %s", path)
-            has_failure = True
-            continue
-
         try:
-            with open(path) as f:
+            if not path.exists() or not path.is_file():
+                raise ValueError(f"File not found or not a regular file: {path}")
+
+            with path.open(encoding="utf-8") as f:
                 trajectory = json.load(f)
-        except (PermissionError, json.JSONDecodeError) as e:
-            logging.error("Failed to read/parse %s: %s", path, e)
+        except (OSError, ValueError, json.JSONDecodeError):
+            logging.exception("Failed to read/parse %s", path)
             has_failure = True
             continue

As per coding guidelines, "Validate and sanitise all user input, especially in web or CLI interfaces".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/export_atif_trajectory_to_phoenix.py`
around lines 71 - 83, The file-read loop over args.files can still raise
FileNotFoundError/OSError when calling open() even after the exists/is_file
pre-check, causing the whole run to abort; modify the try/except around the
open/json.load in export_atif_trajectory_to_phoenix.py so it catches OSError
(and FileNotFoundError) in addition to PermissionError and json.JSONDecodeError,
log the error with the path and exception, set has_failure = True, and continue
to the next path to avoid mid-batch aborts; keep references to args.files, the
with open(path) block, json.load, and has_failure when applying the change.

97-101: ⚠️ Potential issue | 🟠 Major

Narrow export exception handling and preserve traceback.

At Line 99-100, except Exception plus logging.error(..., e) drops stack traces and violates the project exception-logging rule. Use logging.exception(...) and prefer a narrower exception type from the exporter layer.

Proposed fix
-        try:
-            exporter.export(trajectory)
-        except Exception as e:
-            logging.error("Failed to export %s: %s", path, e)
+        try:
+            exporter.export(trajectory)
+        except Exception:  # replace with concrete exporter exception type if available
+            logging.exception("Failed to export %s", path)
             has_failure = True
             continue
#!/bin/bash
# Verify the current violation and traceback-logging pattern.
rg -n -C2 'except Exception|logging\.error\("Failed to export' packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/export_atif_trajectory_to_phoenix.py
ruff check packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/export_atif_trajectory_to_phoenix.py --select BLE001

As per coding guidelines, "When catching and logging exceptions without re-raising, always use logger.exception() (equivalent to logger.error(exc_info=True)) to capture the full stack trace".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/export_atif_trajectory_to_phoenix.py`
around lines 97 - 101, The current blanket except Exception around
exporter.export(trajectory) swallows traceback; change it to catch the specific
exporter-level exception (e.g., ExportError or ExporterError exported by the
exporter module) and call logging.exception(...) to preserve the stack trace;
specifically replace the block around exporter.export(trajectory) so it reads
like: try: exporter.export(trajectory) except ExportError as e:
logging.exception("Failed to export %s", path) has_failure = True — if the
exporter has no specific exception type, add one (e.g., class
ExportError(Exception)) in the exporter module and raise it from export so this
narrower handler can be used.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In
`@packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/export_atif_trajectory_to_phoenix.py`:
- Line 32: Add a Google-style docstring to the public CLI entrypoint function
main(): describe what the command does (high-level purpose), document parameters
(none) under Args, state the return value (None) under Returns, and note any
side-effects or exceptions under Raises (e.g., SystemExit or IOErrors) as
applicable; place the docstring immediately inside the main() function
definition so tools and linters can pick it up and ensure it follows the
project’s Google-style formatting conventions.
- Around lines 71-83: The file-read loop over args.files can still raise
FileNotFoundError/OSError when calling open() even after the exists/is_file
pre-check (the file can disappear between the check and the open), causing the
whole run to abort; modify the try/except around the open/json.load in
export_atif_trajectory_to_phoenix.py so it catches OSError (which already
covers FileNotFoundError and PermissionError) in addition to
json.JSONDecodeError, log the error with the path and exception, set
has_failure = True, and continue to the next path to avoid mid-batch aborts;
keep references to args.files, the with open(path) block, json.load, and
has_failure when applying the change.
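Taken together, the docstring and file-loop suggestions amount to something like the following sketch; `load_trajectories` is a hypothetical helper introduced here for illustration, not the actual CLI code, which inlines this loop in `main()`:

```python
import json
import logging

def load_trajectories(files: list[str]) -> tuple[list[dict], bool]:
    """Load ATIF trajectory JSON files, skipping any that fail to load.

    Args:
        files: Paths to ATIF trajectory JSON files.

    Returns:
        A tuple ``(trajectories, has_failure)``: the successfully parsed
        dicts, and True if at least one file could not be loaded.
    """
    trajectories: list[dict] = []
    has_failure = False
    for path in files:
        try:
            with open(path, encoding="utf-8") as f:
                trajectories.append(json.load(f))
        except (OSError, json.JSONDecodeError):
            # OSError covers FileNotFoundError and PermissionError, so one
            # unreadable file is logged and the batch continues.
            logging.exception("Failed to load %s", path)
            has_failure = True
    return trajectories, has_failure
```

A `Path.is_file()` pre-check cannot replace this handler: the file can vanish between the check and the `open()` call, so the except clause is what actually prevents a mid-batch abort.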

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 78b32f0b-068e-4ffd-8f8a-b32b7aa25c32

📥 Commits

Reviewing files that changed from the base of the PR and between 58a5129 and 3e9103f.

📒 Files selected for processing (6)
  • docs/source/improve-workflows/finetuning/dpo_with_nemo_customizer.md
  • examples/finetuning/dpo_tic_tac_toe/README.md
  • packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/__init__.py
  • packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/__init__.py
  • packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/atif_trajectory_phoenix_exporter.py
  • packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/export_atif_trajectory_to_phoenix.py
✅ Files skipped from review due to trivial changes (4)
  • docs/source/improve-workflows/finetuning/dpo_with_nemo_customizer.md
  • packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/__init__.py
  • examples/finetuning/dpo_tic_tac_toe/README.md
  • packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/__init__.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/nvidia_nat_phoenix/src/nat/plugins/phoenix/scripts/export_trajectory_to_phoenix/atif_trajectory_phoenix_exporter.py

yczhang-nv and others added 2 commits April 16, 2026 14:07
…ectory-exporter

Signed-off-by: Yuchen Zhang <yuchenz@nvidia.com>
…#1874)

* This notebook was installing the `nvidia-nat-profiling` package, which was dropped in v1.5, causing the notebook to install nat v1.3
* Replace the model with a nano model to avoid being rate limited during the eval steps.
* Update `migration-guide.md` to fix profiler installation instructions.
* Replace broken documentation links (unrelated link check errors found in CI)

- I am familiar with the [Contributing Guidelines](https://github.com/NVIDIA/NeMo-Agent-Toolkit/blob/develop/docs/source/resources/contributing/index.md).
- We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
  - Any contribution which contains commits that are not Signed-Off will not be accepted.
- When the PR is ready for review, new or existing tests cover these changes.
- When the PR is ready for review, the documentation is up to date with these changes.

* **Documentation**
  * Updated NeMo Customizer links in the finetuning guide; revised packaging/install guidance to replace the profiling extra with profiler and document the eval/profiler split.

* **Examples**
  * Updated Microservices setup link and cleaned README whitespace/formatting.

* **Notebooks**
  * Switched profiling extra name to profiler and added eval where applicable; updated generated model/config defaults (model selection and token limits) and bumped notebook kernel/Python metadata.

Authors:
  - David Gardner (https://github.com/dagardner-nv)

Approvers:
  - https://github.com/mnajafian-nv
  - Bryan Bednarski (https://github.com/bbednarski9)

URL: NVIDIA#1874

Signed-off-by: Yuchen Zhang <yuchenz@nvidia.com>
@review-notebook-app

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

Signed-off-by: Yuchen Zhang <yuchenz@nvidia.com>
@yczhang-nv
Contributor Author

/merge

@rapids-bot rapids-bot Bot merged commit a328e8a into NVIDIA:develop Apr 16, 2026
17 checks passed
@yczhang-nv yczhang-nv deleted the yuchen-atif-trajectory-exporter branch April 16, 2026 22:41

Labels

feature request New feature or request non-breaking Non-breaking change

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants