Skip to content

fix(providers): pretty-print tool args/results in dashboard events#161

Merged
jrob5756 merged 2 commits intomainfrom
fix/93-pretty-print-tool-results
May 6, 2026
Merged

fix(providers): pretty-print tool args/results in dashboard events#161
jrob5756 merged 2 commits intomainfrom
fix/93-pretty-print-tool-results

Conversation

@jrob5756
Copy link
Copy Markdown
Collaborator

@jrob5756 jrob5756 commented May 6, 2026

Closes #93

Problem

The web dashboard, JSONL event log, and console verbose mode all show tool-call activity by consuming agent_tool_start / agent_tool_complete events. Both providers were emitting raw str() output of structured SDK objects, which produced two readability problems:

  1. Copilot tool results showed the full Python repr of the SDK Result dataclass:

    Result(content='line one\\nline two', contents=None, detailed_content='...', kind=None)
    

    with literal \\n / \\r\\n escapes and doubled \\\\ on Windows paths.

  2. Tool arguments (both providers) rendered as Python dict repr ({'k': 'v'} with single quotes, doubled backslashes) instead of JSON.

Fix

New src/conductor/providers/_event_format.py module exposes two helpers used by both providers:

  • format_tool_arguments(args, max_length=500) — JSON-encodes dict-like arguments, falls back to str() for unusual shapes, truncates with a trailing .
  • extract_tool_result_text(result, max_length=500) — pulls .content (then .detailed_content) from structured Copilot Result objects, passes plain strings through unchanged (Claude provider), truncates with .

Applied at three sites:

File Site Change
copilot.py _forward_event (lines ~1317–1345) Primary fix — fixes the dashboard / JSONL display
copilot.py _log_event_verbose (lines ~1219–1252) Same fix for -v / -vv console output
claude.py agentic loop (lines ~1539–1605) Same str(dict(...)) arguments fix for provider parity

Before / after

A Windows tool result that previously rendered as:

✓ result  Result(content='Errors: MSB4276: The default SDK resolver failed...\\n"Microsoft.NET.SDK..." because directory "C:\\\\Program Files\\\\dotnet...\\\\Sdk" did not exist.\\n', contents=None, detailed_content='...', kind=None)

now renders as:

✓ result  Errors: MSB4276: The default SDK resolver failed...
          "Microsoft.NET.SDK..." because directory "C:\Program Files\dotnet...\Sdk" did not exist.

Tool arguments now serialise as JSON:

🔧 tool  {"command": "cd C:\\az\\Chaos\\services\\GW", "timeout": 60}

Truncation contract

Helpers truncate to max_length and append a single-character ellipsis when truncation occurs (so len(out) <= max_length + 1). Two existing Claude tests (test_tool_arguments_truncated, test_tool_result_truncated) updated to reflect this and to additionally assert the ellipsis is present.

Tests

  • New tests/test_providers/test_event_format.py covers: empty inputs, JSON encoding, Windows-path single-escaping, structured Result unwrapping (preferring content over detailed_content), plain-string passthrough, fallback to str() for unknown shapes, and truncation.
  • Existing tests/test_providers/test_claude_event_callback.py truncation tests updated.
  • Full suite: 2384 passed, 9 skipped (1 unrelated live Copilot integration test excluded — same as in fix(schema): forbid extra fields on AgentDef/ParallelGroup/ForEachDef/WorkflowConfig #159).
  • make lint clean.

Provider parity

Per the AGENTS.md Provider Parity rule, the agent_tool_start / agent_tool_complete event payloads are now consistent across both providers (JSON arguments, plain-text result content), so the dashboard, JSONL log, and console subscriber all render the same way regardless of provider.

Closes #93

The web dashboard, JSONL event log, and console verbose mode all show
tool-call activity by consuming `agent_tool_start` / `agent_tool_complete`
events. Both providers were emitting raw `str()` output of structured
SDK objects, producing two readability problems:

1. Copilot's tool results showed the full Python repr of the SDK
   `Result` dataclass:
       Result(content='line one\nline two', contents=None,
              detailed_content='...', kind=None)
   with literal \n / \r\n escapes and doubled \\ on Windows paths.

2. Tool arguments rendered as Python dict repr (`{'k': 'v'}` with single
   quotes, doubled backslashes) instead of JSON.

Add `providers/_event_format.py` with two helpers:

- `format_tool_arguments(args, max_length=500)` — JSON-encodes dict-like
  arguments, falls back to `str()` for unusual shapes, truncates with a
  trailing `…`.
- `extract_tool_result_text(result, max_length=500)` — pulls
  `.content` (then `.detailed_content`) from structured Copilot Result
  objects, passes plain strings through unchanged (Claude provider),
  truncates with `…`.

Apply to:

- `copilot.py` `_forward_event` — primary fix for dashboard / JSONL.
- `copilot.py` `_log_event_verbose` — same fix for the `-v`/`-vv`
  console output (tool args + tool results).
- `claude.py` agentic loop — fixes the same `str(dict(...))` arguments
  anti-pattern for provider parity.

After this change a Windows tool result like
`Result(content='Errors: ...\\nMSB4276 ...')` renders as plain
multi-line text in the dashboard activity panel.

Tests:
- New `tests/test_providers/test_event_format.py` covers JSON encoding,
  Result-object unwrapping, plain-string passthrough, truncation.
- Updated existing Claude truncation tests for the new
  `<= 500 + 1-char ellipsis` contract.
- 2384 tests pass (1 unrelated live Copilot integration test excluded).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…r formatters

Addresses review feedback on PR #161:

1. **Copilot `_forward_event` integration tests** (provider parity gap):
   New tests/test_providers/test_copilot_event_callback.py mirrors the
   Claude tests, verifying the agent_tool_start / agent_tool_complete
   payloads are JSON-formatted (arguments) and unwrapped from the
   structured Result object (result), with truncation + ellipsis. Also
   covers the .args/.output attribute aliases. This prevents a regression
   where a future change to copilot.py reverts to `str(result)[:500]`
   without any test failing.

2. **Boundary tests for both helpers** in test_event_format.py:
   - At max_length: returned unchanged, no ellipsis.
   - One char below max_length: returned unchanged.
   - One char above max_length: truncated with ellipsis (total = max_length + 1).
   Catches off-by-one errors in the truncation predicate.

3. **Non-string content tests** for extract_tool_result_text:
   Documents the actual contract — when result.content is a non-string
   non-None value, the helper falls back to str(result) (NOT to
   detailed_content), because the `text_attr is None` branch isn't
   taken. Locks the behavior so a future refactor is forced to think
   about it.

4. **Docstring clarifications** in _event_format.py:
   - Both helpers now explicitly state the truncated result is
     max_length + 1 chars (the previous wording "truncated to max_length
     characters with a trailing ellipsis" was ambiguous).
   - extract_tool_result_text docstring now describes the actual
     decision order (plain string → content → detailed_content →
     str(result)) instead of the misleading "falls back to str(result)
     for plain-string results (Claude provider)" wording, which made it
     sound like Claude's strings went through the str() fallback.

No production code changes — only tests and documentation.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@jrob5756 jrob5756 merged commit 13222fa into main May 6, 2026
7 checks passed
@jrob5756 jrob5756 deleted the fix/93-pretty-print-tool-results branch May 6, 2026 16:24
@jrob5756 jrob5756 mentioned this pull request May 6, 2026
jrob5756 added a commit that referenced this pull request May 6, 2026
- conductor resume flag parity with run (#158)
- reasoning effort displayed in dashboard (#160)
- iteration_limit_reached/resolved events for dashboard (#162)
- registry latest now means default branch HEAD, not newest tag (#157)
- forbid extra fields on Agent/Parallel/ForEach/Workflow schemas (#159)
- pretty-print tool args/results in dashboard events (#161)
- capture uv stdout+stderr on Windows install failure (#156)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Web Dashboard: Tool results display raw Python repr() with escaped newlines instead of readable text

1 participant