fix(copilot): consolidate agent_message_chunk events into single log lines #1046

@christso

Description

Problem

When running agentv eval with a Copilot target, the stream log files are hard to read because each agent_message_chunk ACP event is written as a separate log line. A single assistant response can produce dozens of fragmented lines:

```
[+00:02] [agent_message_chunk] Let me
[+00:02] [agent_message_chunk] analyze
[+00:02] [agent_message_chunk] this code
[+00:02] [agent_message_chunk] and fix the bug
```

Root Cause

CopilotStreamLogger.handleEvent() in copilot-utils.ts emits one log line per ACP event. The summarizeAcpEvent() function in copilot-cli.ts caps each chunk at 200 characters but does not merge consecutive chunks.

Best Practice (from code-insights research)

code-insights uses a flush-on-turn-boundary pattern:

  • Accumulate assistant.message_delta / chunk events into a text buffer
  • Flush the buffer as a single complete entry only at turn boundaries (tool_call, session.idle, session.shutdown, or close())
  • Result: [+00:05] [assistant_message] Let me analyze this code and fix the bug...
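
The flush-on-turn-boundary pattern above can be sketched as a pure function over an event list. This is a minimal illustration, not code-insights' actual implementation: the `StreamEvent` shape, the `consolidate` name, and the decision to skip events that are neither chunks nor boundaries are all assumptions made for the example, and timestamps are omitted.

```typescript
// Hypothetical event shape for illustration; real payloads may carry more fields.
type StreamEvent = { type: string; text?: string };

// Event types treated as text deltas (names taken from the issue text).
const CHUNK_TYPES = new Set(["assistant.message_delta", "agent_message_chunk"]);

// Event types treated as turn boundaries (also taken from the issue text).
const TURN_BOUNDARIES = new Set(["tool_call", "session.idle", "session.shutdown"]);

// Converts a raw event sequence into consolidated log lines.
function consolidate(events: StreamEvent[]): string[] {
  const lines: string[] = [];
  let buffer = "";

  // Emits the accumulated text as one entry, if there is any.
  const flush = (): void => {
    if (buffer.length > 0) {
      lines.push(`[assistant_message] ${buffer}`);
      buffer = "";
    }
  };

  for (const event of events) {
    if (CHUNK_TYPES.has(event.type)) {
      buffer += event.text ?? "";
    } else if (TURN_BOUNDARIES.has(event.type)) {
      flush();
      lines.push(`[${event.type}]`);
    }
    // Other event types are skipped in this sketch.
  }

  flush(); // plays the role of close(): nothing buffered is lost at end of stream
  return lines;
}

const demo = consolidate([
  { type: "agent_message_chunk", text: "Let me" },
  { type: "agent_message_chunk", text: " analyze this code" },
  { type: "tool_call" },
  { type: "agent_message_chunk", text: "Fixed." },
  { type: "session.idle" },
]);
console.log(demo.join("\n"));
// [assistant_message] Let me analyze this code
// [tool_call]
// [assistant_message] Fixed.
// [session.idle]
```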

Proposed Fix

Modify CopilotStreamLogger in copilot-utils.ts:

  • Add a pendingText buffer for chunk accumulation
  • In handleEvent(): buffer chunk events instead of writing immediately; flush on non-chunk events
  • In close(): flush any remaining buffered text before closing
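
A rough sketch of how the three changes above might fit together follows. The real CopilotStreamLogger is not shown in this issue, so the class name `BufferedStreamLogger`, the `AcpEvent` shape, the injected `write` sink, and the `LogFormat` values are hypothetical stand-ins. The json branch is included only to illustrate that per-event output can bypass the buffer.

```typescript
// Hypothetical types; the actual CopilotStreamLogger API may differ.
type AcpEvent = { type: string; text?: string };
type LogFormat = "summary" | "json";

class BufferedStreamLogger {
  private pendingText = ""; // accumulates agent_message_chunk text
  private closed = false;
  private readonly write: (line: string) => void;
  private readonly format: LogFormat;

  constructor(write: (line: string) => void, format: LogFormat = "summary") {
    this.write = write;
    this.format = format;
  }

  handleEvent(event: AcpEvent): void {
    if (this.closed) return;

    // json mode stays per-event, so buffering is bypassed entirely.
    if (this.format === "json") {
      this.write(JSON.stringify(event));
      return;
    }

    // Chunk events are buffered instead of being written immediately.
    if (event.type === "agent_message_chunk") {
      this.pendingText += event.text ?? "";
      return;
    }

    // Any non-chunk event flushes the buffer first, preserving order.
    this.flush();
    this.write(`[${event.type}]`);
  }

  close(): void {
    if (this.closed) return;
    this.flush(); // emit any remaining buffered text before closing
    this.closed = true;
  }

  private flush(): void {
    if (this.pendingText.length === 0) return;
    this.write(`[assistant_message] ${this.pendingText}`);
    this.pendingText = "";
  }
}

// Usage with an in-memory sink standing in for the log file.
const lines: string[] = [];
const logger = new BufferedStreamLogger((line) => { lines.push(line); });
logger.handleEvent({ type: "agent_message_chunk", text: "Let me" });
logger.handleEvent({ type: "agent_message_chunk", text: " analyze this code" });
logger.handleEvent({ type: "tool_call" });
logger.close();
console.log(lines.join("\n"));
```

Injecting the sink keeps the sketch free of file I/O and makes the flush behavior easy to assert in unit tests.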

This is purely a log-formatting improvement; eval execution behavior does not change.

Acceptance Criteria

  • A single assistant response produces one [assistant_message] log line, not dozens of [agent_message_chunk] lines
  • Tool calls and session start/end events still appear as separate lines
  • json format mode is unaffected (it keeps per-event output for full fidelity)
  • Both copilot-cli and copilot-sdk providers benefit (shared CopilotStreamLogger)
  • Unit tests updated

Metadata

Assignees

No one assigned

    Labels

    bug: Something isn't working
    in-progress: Claimed by an agent, do not duplicate work

    Type

    No type

    Projects

    Status

    Done

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests
