
feat(cli): structured JSONL output for non-interactive mode (ROADMAP 7.7)#131

Merged
emal-avala merged 3 commits into main from feat/json-structured-output
Apr 16, 2026

Conversation

@emal-avala
Member

Summary

Adds --output-format json flag for CI/CD pipelines and tool-chaining. One-shot mode (-p) writes JSONL events to stdout with status messages on stderr.

agent -p "fix the tests" --output-format json | jq 'select(.type == "tool_call")'

Event types

| Event | Fields | When |
| --- | --- | --- |
| `session_start` | `session_id`, `model`, `timestamp` | Before first turn |
| `text_delta` | `content`, `turn` | LLM streams text |
| `thinking` | `content`, `turn` | Extended thinking |
| `tool_call` | `tool`, `input`, `turn` | Tool invocation |
| `tool_result` | `tool`, `output`, `is_error`, `turn` | Tool completes |
| `turn_complete` | `turn`, `input_tokens`, `output_tokens`, `cost_usd` | Turn ends |
| `error` | `message`, `turn` | Error raised |
| `session_end` | `turns`, `total_cost_usd`, `exit_code` | Session ends |
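For orientation, a stream matching the table above might look like the following. These lines are illustrative only; the field values (session id, model name, costs) are made up, not actual output:

```json
{"type":"session_start","session_id":"abc123","model":"example-model","timestamp":"2026-04-16T00:00:00Z"}
{"type":"tool_call","tool":"read_file","input":{"path":"src/main.rs"},"turn":1}
{"type":"tool_result","tool":"read_file","output":"...","is_error":false,"turn":1}
{"type":"session_end","turns":1,"total_cost_usd":0.01,"exit_code":0}
```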

Exit codes

0=success, 1=config, 2=input, 3=tool failure, 4=LLM error, 5=cost limit, 6=turn limit, 7=permission denied.

Architecture

  • crates/cli/src/output.rs: OutputFormat enum, ExitCode enum, JsonStreamSink implementing StreamSink
  • Wired into the one-shot path in main.rs (~40 lines)
  • Validates --output-format json requires --prompt (rejects interactive mode with a clear error)
  • Warnings/compact notifications go to stderr, not the JSONL stream

Test plan

  • 8 unit tests: event serialization (all types), format parsing, single-line-JSON invariant for multiline content
  • cargo fmt --check
  • cargo clippy --all-targets -D warnings
  • Full CLI test suite (28 integration + 3 smoke) ✓
  • Manual E2E: agent -p "echo hello" --output-format json produces parseable JSONL (requires API key)

…7.7)

Adds `--output-format json` for CI/CD and tool-chaining:

  agent -p "fix tests" --output-format json \
    | jq 'select(.type == "tool_call")'

Events written as single-line JSON objects to stdout:
- session_start — session_id, model, timestamp
- text_delta — streaming LLM text with turn index
- thinking — extended thinking content
- tool_call — tool name + input object
- tool_result — tool output + is_error flag
- turn_complete — tokens, cost per turn
- error — error message
- session_end — total turns, cost, exit code

Human-readable warnings/status go to stderr so stdout is
clean JSONL that pipes directly into jq/process_results.py.

Exit codes for non-interactive mode (ExitCode enum):
0=success, 1=config, 2=input, 3=tool, 4=llm, 5=cost,
6=turns, 7=permission.

Architecture: JsonStreamSink implements StreamSink trait,
same callback interface used by the REPL and schedule runner.
Minimal surface — one new file (output.rs) plus ~40 lines of
wiring in main.rs.

Validates --output-format json requires --prompt (rejects
interactive mode with a clear error).

Tests: 8 unit tests covering event serialization (all types),
format parsing, and the single-line-JSON invariant for
multiline content.
@chatgpt-codex-connector

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

…integration tests

Deep review found two bugs:

1. Turn tracking was wrong — state.turn started at 0 and only
   updated in on_turn_complete (which fires only on the FINAL turn).
   All intermediate-turn events emitted turn: 0. Fix: added
   on_turn_start(turn) to the StreamSink trait (default no-op) and
   call it at the top of each query-loop iteration. JsonStreamSink
   sets state.turn immediately so all events in a turn carry the
   correct 1-indexed turn number.

2. on_tool_result used a stale last_tool_name captured in
   on_tool_start, which could be overwritten if two tools fire in
   sequence (streaming tool execution). Fix: on_tool_result now
   uses its own tool_name parameter directly.

Also removed the now-unnecessary last_tool_name field from SinkState.

Integration tests (6 new, all pass without API key):
- output_format_appears_in_help — flag shows in --help
- output_format_invalid_value_fails — "xml" rejected
- output_format_json_without_prompt_fails — clear error
- output_format_text_is_default — no format error without flag
- output_format_json_no_api_key_emits_session_events — no panic,
  any stdout is valid JSONL
- output_format_case_insensitive — "JSON" uppercase accepted
@emal-avala
Member Author

Self-review: 2 bugs found and fixed (1499ae2)

🔴 Bug 1: Turn tracking was wrong for multi-turn agent runs

state.turn started at 0. on_turn_complete(n) sets it to n — but only fires on the FINAL turn (when there are no tool calls). All intermediate-turn events (text_delta, tool_call, tool_result) emitted turn: 0 throughout the entire multi-turn agent run.

Fix: added on_turn_start(turn: usize) to StreamSink trait (default no-op, zero impact on existing callers). Called at the top of each for turn in 0..max_turns iteration in the query loop. JsonStreamSink::on_turn_start sets state.turn immediately so all events carry the correct 1-indexed turn number.

🟠 Bug 2: on_tool_result used stale tool name

on_tool_start captured tool_name into state.last_tool_name. on_tool_result then emitted this captured name. Problem: with streaming tool execution (read-only tools fire during streaming), on_tool_start for tool B can overwrite last_tool_name before tool A's on_tool_result arrives. The tool_result event would incorrectly name tool B.

Fix: on_tool_result now uses its tool_name parameter directly. Removed the last_tool_name field entirely.

✅ Other review checks (all clean)

| Area | Finding |
| --- | --- |
| Text mode backward compat | Identical code path; error propagation via `?` unchanged |
| `std::process::exit` cleanup | One-shot mode has no post-run cleanup; acceptable |
| Session save | Not affected — session saves happen inside `run_turn_with_sink`, not after |
| Exit code coverage | Only `Success` (0) and `LlmError` (4) are currently returned; the other codes (1-7) are defined for future use, with no incorrect mapping |
| JSONL single-line invariant | Tested with multiline content — `serde_json::to_string` escapes `\n` to `\\n` |
| Mutex contention | Low risk — one-shot mode is single-turn; the lock is held briefly per event |
| schedule run / serve / ACP paths | Untouched; still use their own sinks |

New integration tests (6, no API key required)

| Test | What it verifies |
| --- | --- |
| `output_format_appears_in_help` | `--output-format` shows in `--help` output |
| `output_format_invalid_value_fails` | `--output-format xml` → error with "unknown output format" |
| `output_format_json_without_prompt_fails` | JSON mode without `--prompt` → error with "requires --prompt" |
| `output_format_text_is_default` | No `--output-format` flag → no format parsing error |
| `output_format_json_no_api_key_emits_session_events` | No panic; any stdout is valid JSONL |
| `output_format_case_insensitive` | `--output-format JSON` accepted |

Final verification

  • cargo fmt --check
  • cargo clippy --all-targets -D warnings
  • 8 unit tests (output module) ✓
  • 6 integration tests (output_format) ✓
  • 28 config_cli + 14 schedule + 3 smoke tests ✓

Verdict: safe to merge.

In CI (no API key, no TTY), the binary exits at the API key check
or panics in the setup wizard before reaching the output format
parse — so --output-format xml didn't return the expected
"unknown output format" error and --output-format json without
--prompt didn't return "requires --prompt".

Moved both checks (format parse + json-requires-prompt) to
immediately after CLI arg parsing, before config loading, setup
wizard, or API key validation. Now they fail fast regardless of
environment.
@emal-avala emal-avala merged commit a1ff9f4 into main Apr 16, 2026
13 of 14 checks passed
@emal-avala emal-avala deleted the feat/json-structured-output branch April 16, 2026 03:22