
feat: add ATIF rollout ingestion #495

Merged
eric-tramel merged 5 commits into main from codex/issue-493-atif
Apr 6, 2026

@eric-tramel
Contributor

📋 Summary

Add AgentRolloutFormat.ATIF support for standalone 1-file-1-rollout JSON trajectories in AgentRolloutSeedSource.

This keeps the existing Claude Code/Codex ingestion model intact: each imported rollout becomes one normalized seed row with messages, rollout metadata, and derived summary fields. For ATIF specifically, path is required and the default file pattern is *.json.

🔄 Changes

✨ Added

🔧 Changed

🧪 Tests

Usage

from data_designer import AgentRolloutFormat, AgentRolloutSeedSource, DataDesigner

seed_source = AgentRolloutSeedSource(
    format=AgentRolloutFormat.ATIF,
    path="/path/to/atif-traces",
)

uv run docs/assets/recipes/trace_ingestion/agent_rollout_distillation.py \
    --format atif \
    --trace-dir /path/to/atif-traces \
    --num-records 20

🔍 Attention Areas

⚠️ Reviewers: Please pay special attention to the following:

Test Plan

  • uv run pytest tests/config/test_seed_source.py in packages/data-designer-config
  • uv run pytest tests/engine/resources/agent_rollout/test_atif.py tests/engine/resources/test_seed_reader.py in packages/data-designer-engine
  • uv run pytest tests/interface/test_data_designer.py in packages/data-designer

Closes #493


🤖 Generated with AI

- add an ATIF rollout handler for standalone JSON trajectories
- require explicit paths for ATIF while keeping Claude and Codex defaults
- cover config, reader, interface, and recipe usage updates

Closes #493
@eric-tramel eric-tramel requested a review from a team as a code owner April 6, 2026 13:55
@eric-tramel eric-tramel self-assigned this Apr 6, 2026
@greptile-apps
Contributor

greptile-apps bot commented Apr 6, 2026

Greptile Summary

This PR adds AgentRolloutFormat.ATIF support to AgentRolloutSeedSource, enabling ingestion of standalone 1-file-1-rollout JSON trajectory files produced by ATIF-compatible agents. The implementation follows the same normalized message/tool-call semantics used by the existing Claude Code and Codex handlers, with the key difference that ATIF has no implicit home-directory default and requires an explicit path.

  • New AtifAgentRolloutFormatHandler in atif.py parses ATIF trajectories into NormalizedAgentRolloutRecord, handling message normalization, tool-call validation, observation-to-tool-message conversion, and subagent reference collection
  • AgentRolloutFormat.ATIF added to the enum; get_agent_rollout_format_defaults returns (None, "*.json") for ATIF (no default path, .json file pattern)
  • AgentRolloutSeedSource gains a model_validator that rejects construction without path when the format defines no default path, with a redundant guard in runtime_path as an additional safety net
  • ATIF handler registered in BUILTIN_AGENT_ROLLOUT_FORMAT_HANDLERS alongside Claude Code and Codex
  • Comprehensive tests cover config validation, sequential step-ID enforcement, tool-call/observation parsing, seed reader hydration, and an end-to-end dataset creation path
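The path rule described in the bullets above can be sketched with the standard library alone. This is a simplified stand-in, not the PR's code: the real AgentRolloutSeedSource uses a model_validator, while this sketch uses a dataclass `__post_init__`. The non-ATIF default path and the `*.jsonl` pattern are placeholders, because the actual Claude Code and Codex defaults are not shown in this conversation.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class AgentRolloutFormat(str, Enum):
    CLAUDE_CODE = "claude_code"
    CODEX = "codex"
    ATIF = "atif"


def get_agent_rollout_format_defaults(fmt: AgentRolloutFormat) -> tuple[str | None, str]:
    """Return (default_path, default_file_pattern) for a format."""
    if fmt is AgentRolloutFormat.ATIF:
        # Stated in the PR: no implicit default path, *.json file pattern.
        return None, "*.json"
    # Placeholder values (assumption): the real home-directory defaults
    # for claude_code and codex are not shown in this PR conversation.
    return f"~/.{fmt.value}", "*.jsonl"


@dataclass
class AgentRolloutSeedSource:
    """Simplified stand-in for the config model, for illustration only."""

    format: AgentRolloutFormat
    path: str | None = None

    def __post_init__(self) -> None:
        default_path, _ = get_agent_rollout_format_defaults(self.format)
        # Reject construction when neither an explicit nor a default path exists.
        if self.path is None and default_path is None:
            raise ValueError(f"path is required for format '{self.format.value}'")
```

Under these assumptions, `AgentRolloutSeedSource(format=AgentRolloutFormat.ATIF)` raises at construction time, while the same call with `path="/traces"` succeeds.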

Confidence Score: 5/5

This PR is safe to merge — the implementation is clean, well-scoped, and follows existing handler patterns exactly.

No logic errors, correctness issues, or bugs were found. The ATIF handler mirrors the established handler contract, validation is enforced at config construction time via model_validator, and the PR ships thorough unit, integration, and end-to-end tests covering happy paths and error cases.

No files require special attention. atif.py is the main new file and was reviewed thoroughly with no issues found.

Important Files Changed

| Filename | Overview |
| --- | --- |
| packages/data-designer-engine/src/data_designer/engine/resources/agent_rollout/atif.py | New ATIF format handler; parses standalone JSON trajectories into normalized rollout records with correct role mapping, sequential step_id validation, tool-call/observation linking, and subagent-ref collection |
| packages/data-designer-config/src/data_designer/config/seed_source.py | Adds ATIF to AgentRolloutFormat enum, returns (None, '*.json') defaults, and adds model_validator to enforce explicit path requirement at construction time |
| packages/data-designer-engine/src/data_designer/engine/resources/agent_rollout/registry.py | Registers AtifAgentRolloutFormatHandler alongside existing Claude Code and Codex handlers |
| packages/data-designer-engine/tests/engine/resources/agent_rollout/test_atif.py | Comprehensive handler tests covering happy path, non-object payload, non-sequential step IDs, assistant-only field rejection, and observation validation |
| packages/data-designer-config/tests/config/test_seed_source.py | Adds config-level tests for ATIF path requirement and default file pattern |
| packages/data-designer-engine/tests/engine/resources/test_seed_reader.py | Integration test verifying ATIF JSON files are hydrated correctly into seed reader records |
| packages/data-designer/tests/interface/test_data_designer.py | E2E test for dataset creation with ATIF seed source, verifying source_kind, cwd, and git_branch |
| docs/assets/recipes/trace_ingestion/agent_rollout_distillation.py | Adds early validation requiring --trace-dir when --format atif and updated help text |
| docs/concepts/seed-datasets.md | Documents ATIF usage, explicit-path requirement, and .json file pattern |
| docs/recipes/trace_ingestion/agent_rollout_distillation.md | Updates recipe description to include atif format and its --trace-dir requirement |
| docs/recipes/cards.md | Adds ATIF to the list of formats covered by the rollout distillation recipe |

Sequence Diagram

sequenceDiagram
    actor User
    participant Config as AgentRolloutSeedSource
    participant Engine as Engine / Seed Reader
    participant Handler as AtifAgentRolloutFormatHandler
    participant Utils as parse helpers

    User->>Config: AgentRolloutSeedSource(format=ATIF, path="/traces")
    Config->>Config: model_validator: get_agent_rollout_format_defaults(ATIF)<br/>→ default_path=None
    alt path is None AND default_path is None
        Config-->>User: raises ValueError "path is required for format 'atif'"
    else path provided
        Config-->>User: AgentRolloutSeedSource instance
    end

    User->>Engine: build dataset / ingest seed source
    loop for each *.json file under root_path
        Engine->>Handler: parse_file(root_path, relative_path)
        Handler->>Utils: load_atif_payload(file_path)
        Utils-->>Handler: validated payload dict
        Handler->>Handler: extract schema_version, session_id, agent, steps

        loop for each step in steps
            Handler->>Utils: normalize_atif_role(step.source)
            Utils-->>Handler: message_role (assistant/user/system)
            Handler->>Utils: validate_atif_step_fields(role, raw_step)
            Note over Utils: rejects tool_calls/observation/reasoning_content<br/>on non-assistant steps
            Handler->>Utils: normalize_atif_tool_calls(step.tool_calls)
            Utils-->>Handler: normalized tool-call list
            Handler->>Utils: build_message(role, content, tool_calls)
            Utils-->>Handler: message dict appended to messages[]

            alt step has observation
                Handler->>Utils: normalize_atif_observation_messages(obs, valid_tool_call_ids)
                loop for each observation result
                    Utils->>Utils: validate source_call_id in valid_tool_call_ids
                    Utils->>Utils: build tool message or collect subagent_ref
                end
                Utils-->>Handler: tool messages appended to messages[]
            end
        end

        Handler->>Utils: build_atif_source_meta(payload, agent, steps, refs)
        Utils-->>Handler: source_meta dict
        Handler-->>Engine: [NormalizedAgentRolloutRecord]
    end
    Engine-->>User: dataset / seed rows

Reviews (4): Last reviewed commit: "docs: clarify ATIF path requirement"
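The per-step loop in the sequence diagram can be approximated as follows. This is a rough sketch and not the handler's actual code: `normalize_steps` and `ROLE_BY_SOURCE` are hypothetical names, the output message shape is an assumed chat-style dict, and the field names `function_name`, `arguments`, `results`, and `subagent_trajectory_ref` are assumptions about the ATIF payload that this conversation does not confirm.

```python
from __future__ import annotations

from typing import Any

# Assumed mapping from ATIF step sources to chat roles.
ROLE_BY_SOURCE = {"agent": "assistant", "user": "user", "system": "system"}


def normalize_steps(steps: list[dict[str, Any]]) -> tuple[list[dict[str, Any]], list[Any]]:
    """Turn ATIF-like steps into chat messages plus collected subagent refs."""
    messages: list[dict[str, Any]] = []
    subagent_refs: list[Any] = []
    for step in steps:
        message: dict[str, Any] = {
            "role": ROLE_BY_SOURCE[step["source"]],
            "content": step.get("message", ""),
        }
        tool_calls = step.get("tool_calls") or []
        if tool_calls:
            message["tool_calls"] = [
                {
                    "id": call["tool_call_id"],
                    "name": call["function_name"],
                    "arguments": call.get("arguments", {}),
                }
                for call in tool_calls
            ]
        messages.append(message)

        # Each observation result becomes a tool message and/or a subagent ref.
        for result in (step.get("observation") or {}).get("results") or []:
            if result.get("subagent_trajectory_ref") is not None:
                subagent_refs.append(result["subagent_trajectory_ref"])
            if result.get("content") is not None:
                messages.append(
                    {
                        "role": "tool",
                        "tool_call_id": result.get("source_call_id"),
                        "content": result["content"],
                    }
                )
    return messages, subagent_refs
```

A user step followed by an agent step with one tool call and one observation result yields three messages in this sketch: user, assistant, tool.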

- reject non-sequential step ids before normalizing traces
- reject assistant-only fields on user/system ATIF steps
- require observation tool results to reference declared tool calls
@eric-tramel
Contributor Author

Updated this PR to tighten ATIF ingestion semantics around malformed rollout data.

Changes made:

  • Reject ATIF files whose step_ids are not sequential starting from 1, so we do not silently normalize scrambled conversations into training rows.
  • Reject reasoning_content, tool_calls, and observation on non-agent steps, which keeps assistant-only fields from leaking onto normalized user/system messages.
  • Require observation tool results with content to include a source_call_id, and validate any source_call_id against the tool calls declared on that step. This prevents orphaned tool outputs from being ingested with broken assistant/tool linkage.
  • Added regression coverage for each invalid shape above while keeping the valid happy path intact.
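The three rejection rules listed above can be illustrated with a small standalone validator. This is a sketch of the described behavior, not the code in atif.py: `validate_atif_steps` and `AtifValidationError` are hypothetical names, and the `results` key inside `observation` is an assumed payload shape.

```python
from __future__ import annotations

from typing import Any

ASSISTANT_ONLY_FIELDS = ("reasoning_content", "tool_calls", "observation")


class AtifValidationError(ValueError):
    """Hypothetical error type used only in this sketch."""


def validate_atif_steps(steps: list[dict[str, Any]]) -> None:
    for index, step in enumerate(steps, start=1):
        # Rule 1: step_id must equal the 1-based position in the file.
        if step.get("step_id") != index:
            raise AtifValidationError(
                f"expected step_id {index}, got {step.get('step_id')!r}"
            )

        # Rule 2: assistant-only fields may appear only on agent steps.
        if step.get("source") != "agent":
            present = [name for name in ASSISTANT_ONLY_FIELDS if step.get(name) is not None]
            if present:
                raise AtifValidationError(f"step {index}: {present} not allowed here")
            continue

        # Rule 3: observation results must link to tool calls declared on this step.
        declared = {call.get("tool_call_id") for call in step.get("tool_calls") or []}
        for result in (step.get("observation") or {}).get("results") or []:
            call_id = result.get("source_call_id")
            if call_id is None and result.get("content") is not None:
                raise AtifValidationError(f"step {index}: content without source_call_id")
            if call_id is not None and call_id not in declared:
                raise AtifValidationError(f"step {index}: unknown source_call_id {call_id!r}")
```

Raising on the first violation, rather than repairing the trajectory, matches the stated goal of rejecting malformed ATIF at the ingestion boundary.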

Why:
The ingestion boundary should reject malformed ATIF rather than normalize it permissively, because these traces become seed rows for downstream distillation and the message ordering/linkage is part of the training signal.

Verification:

  • uv run pytest tests/engine/resources/agent_rollout/test_atif.py tests/engine/resources/test_seed_reader.py
  • uv run pytest tests/interface/test_data_designer.py -k trace_seed_sources
  • uv run ruff check packages/data-designer-engine/src/data_designer/engine/resources/agent_rollout/atif.py packages/data-designer-engine/tests/engine/resources/agent_rollout/test_atif.py

@eric-tramel
Contributor Author

Manual ATIF smoke test I just ran against this branch (#495), using real trajectories from Hugging Face rather than synthetic fixtures.

What I used:

  • obaydata/mcp-agent-trajectory-benchmark on main
    • accountant/trajectory.json
  • harborframework/parity-experiments pinned to revision f832b477bd90d361e7372224b74411e76b2dc02a
    • adapters/skillsbench/harbor_parity/2026-02-20__10-00-00/jax-computing-basics__BMake9F/agent/trajectory.json

Procedure:

  1. Downloaded both ATIF files with huggingface_hub.
  2. Copied them into a temp directory as standalone .json trajectory files.
  3. Ran plain ATIF ingestion with AgentRolloutSeedSource(path=<tmpdir>, format=dd.AgentRolloutFormat.ATIF) and a simple expression column over final_assistant_message.
  4. Ran the published distillation recipe in preview mode against the same temp directory with rollout_format=ATIF, num_records=1, and model_alias='nvidia-super'.

Observed results:

  • Both downloaded files parsed successfully as ATIF (schema_version=ATIF-v1.2).
  • Plain ingestion preview succeeded with seed dataset size: 2 and produced 2 normalized rows.
  • The two imported traces were:
    • Harbor: trace_id=bf737818-4eb1-43dd-abf2-a8e9770cd891, message_count=16, tool_call_count=7
    • obaydata: trace_id=mt-fin2-accountant__4TAccTyw, message_count=16, tool_call_count=4
  • The full distillation recipe preview also completed successfully on the ATIF input.
  • For the 1-row recipe preview I got:
    • trace_id=bf737818-4eb1-43dd-abf2-a8e9770cd891
    • source_kind=atif
    • trace_training_value=high
    • recommended_for_sft=true
  • Model-backed preview completed with 3 successful requests and 0 failures.

One harness-specific note: because I imported the recipe module ad hoc via importlib instead of executing it normally, I had to register the module in sys.modules and call model_rebuild() on the recipe Pydantic models before building the config. That was only needed for the ad hoc harness path; the ATIF ingestion itself worked cleanly.
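For anyone reproducing the ad hoc harness, the usual standard-library pattern for loading a script by path and registering it in sys.modules looks roughly like this. `load_module_from_path` is a hypothetical helper name, and the model_rebuild() calls mentioned above are Pydantic-specific and are not shown here.

```python
import importlib.util
import sys
from pathlib import Path
from types import ModuleType


def load_module_from_path(name: str, path: Path) -> ModuleType:
    """Import a Python file by path and register it under `name`."""
    spec = importlib.util.spec_from_file_location(name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load module from {path}")
    module = importlib.util.module_from_spec(spec)
    # Register before executing so that lookups by module name
    # (for example, resolving type annotations) can find the module.
    sys.modules[name] = module
    spec.loader.exec_module(module)
    return module
```

Executing the recipe script normally avoids this step, which is consistent with the note that it was only needed for the harness path.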

Net: this branch handled real external ATIF trajectories successfully both through raw ingestion and through the published recipe workflow.

nabinchha previously approved these changes Apr 6, 2026
Contributor

@nabinchha nabinchha left a comment

Thanks for putting this together, @eric-tramel — here is a focused review of PR #495.

Summary

The PR adds AgentRolloutFormat.ATIF with a dedicated handler, config validation (explicit path, default *.json), registry wiring, tests at config/reader/interface layers, and recipe/docs updates. The implementation matches the stated intent: one JSON file → one normalized rollout row, reuse of shared message/tool semantics, and stricter ATIF validation (sequential step_id, role-specific fields, observation ↔ tool-call linkage). Ruff passes on all changed Python files in this branch.

Findings

Critical — Let's fix these before merge

(none)

Warnings — Worth addressing

packages/data-designer-config/src/data_designer/config/seed_source.py: AgentRolloutSeedSource.path field description

  • What: The path field description explains defaults for Claude Code and Codex when omitted, but does not state that ATIF requires an explicit path (even though validate_runtime_path_source enforces it).
  • Why: Anyone reading generated JSON Schema or docstrings from the model may assume omission is always OK if they only read the field text, not the validator error.
  • Suggestion: Extend the description with a short clause such as: “atif requires an explicit directory path; home-directory defaults apply only to claude_code and codex.”

Suggestions — Take it or leave it

packages/data-designer-engine/src/data_designer/engine/resources/agent_rollout/atif.py: started_at / ended_at from string timestamps

  • What: started_at and ended_at use min(timestamps) / max(timestamps) on raw strings.
  • Why: For ISO-8601 strings in a consistent format this behaves as expected; mixed formats or locale-specific timestamps could produce lexicographic min/max that do not match chronological order.
  • Suggestion: If you expect heterogeneous timestamp strings in the wild, consider documenting that assumption in the handler docstring, or parsing to datetime when normalization is required.
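A small illustration of the concern, using made-up timestamps: when two ISO-8601 strings carry different UTC offsets, the lexicographic minimum can differ from the chronological one. Parsing with `datetime.fromisoformat` before comparing avoids that; whether the handler needs this depends on the timestamps ATIF producers actually emit.

```python
from datetime import datetime

# Illustrative values only; both are valid ISO-8601 with explicit offsets.
timestamps = [
    "2026-04-06T13:55:00+00:00",
    "2026-04-06T09:00:00-05:00",  # 14:00 UTC, so chronologically later
]

# String comparison looks only at characters, so "T09" sorts before "T13".
lexicographic_start = min(timestamps)

# Parsing to timezone-aware datetimes compares actual instants.
chronological_start = min(timestamps, key=datetime.fromisoformat)

print(lexicographic_start)   # 2026-04-06T09:00:00-05:00
print(chronological_start)   # 2026-04-06T13:55:00+00:00
```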

packages/data-designer-engine/src/data_designer/engine/resources/agent_rollout/atif.py — Sequential step_id

  • What: Steps must satisfy step_id == 1-based list index.
  • Why: This is clearly intentional and tested; if some ATIF producers emit non-contiguous or differently ordered IDs while still being valid per the ATIF spec, those files would be rejected.
  • Suggestion: If the public ATIF spec guarantees 1-based contiguous IDs in file order, a one-line comment citing that would help future maintainers; if not, consider whether reordering/sorting is worth a follow-up.

What Looks Good

  • Layering: Config stays declarative; parsing lives in the engine handler; the reader already normalizes AgentRolloutSeedParseError / OSError consistently with other rollout formats.
  • Validation depth: The follow-up commit tightening observation ↔ tool_call_id linkage and rejecting assistant-only fields on non-agent steps materially reduces silent garbage in normalized rows.
  • Test coverage: Dedicated test_atif.py, config tests for path/pattern, seed reader hydration with multiple JSON files, and an E2E DataDesigner case give a sensible safety net for a new format.

Verdict

Ship it (with nits) — Addressing the path field description for ATIF would remove the only “worth fixing before merge” documentation gap; everything else is optional polish.


This review was generated by an AI assistant.

@eric-tramel eric-tramel merged commit 58870bb into main Apr 6, 2026
47 checks passed

Development

Successfully merging this pull request may close these issues.

Add ATIF (Agent Trace Interchange Format) support to AgentRolloutSeedSource

2 participants