Description
When wrapping a Copilot AIAgent in a DelegatingAIAgent that buffers all streaming updates (e.g., to capture structured tool output), sessions with heavy built-in tool usage (file reads, shell commands, git operations) intermittently fail with:
Session error: Execution failed: Error: missing finish_reason for choice 0
The same sessions succeed when using a plain agent (no DelegatingAIAgent wrapper) — even with identical prompts, models, tools, and session configuration. The issue appears to be caused by the streaming buffering pattern breaking the SDK's internal message flow during long multi-turn sessions.
Potential Root Cause
The DelegatingAIAgent.RunCoreStreamingAsync override buffers all AgentResponseUpdate items before yielding them:
// This pattern causes the bug:
protected override async IAsyncEnumerable<AgentResponseUpdate> RunCoreStreamingAsync(...)
{
    List<AgentResponseUpdate> updates = [];
    await foreach (var update in base.RunCoreStreamingAsync(...))
    {
        updates.Add(update); // Buffer ALL updates
    }
    // ... yield updates after buffering
}
During long sessions (50+ built-in tool calls), this buffering appears to cause the Copilot CLI to mishandle the streaming response, resulting in a missing finish_reason on the final chat completion choice.
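To make the timing difference concrete, here is a minimal, SDK-independent sketch that contrasts the buffering pattern with a pass-through ("tee") variant on a plain `IAsyncEnumerable<int>`. The names `StreamingDemo`, `Source`, `Buffered`, and `Tee` are hypothetical and not part of any SDK. The sketch only shows when the consumer sees each item; whether a pass-through wrapper actually avoids the finish_reason error has not been verified.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class StreamingDemo
{
    // Hypothetical stand-in for the inner agent's update stream.
    public static async IAsyncEnumerable<int> Source(List<string> log)
    {
        for (int i = 0; i < 3; i++)
        {
            log.Add($"produced {i}");
            await Task.Yield();
            yield return i;
        }
    }

    // Mirrors the pattern from the issue: drain the whole stream, then yield.
    public static async IAsyncEnumerable<T> Buffered<T>(IAsyncEnumerable<T> source)
    {
        var items = new List<T>();
        await foreach (var item in source)
            items.Add(item);
        foreach (var item in items)
            yield return item;
    }

    // Pass-through variant: record each item, but hand it to the consumer immediately.
    public static async IAsyncEnumerable<T> Tee<T>(IAsyncEnumerable<T> source, List<T> captured)
    {
        await foreach (var item in source)
        {
            captured.Add(item);
            yield return item;
        }
    }

    // Runs the buffered pipeline and returns the interleaving of producer and consumer events.
    public static async Task<List<string>> RunBuffered()
    {
        var log = new List<string>();
        await foreach (var item in Buffered(Source(log)))
            log.Add($"consumed {item}");
        return log;
    }

    // Runs the pass-through pipeline and returns the interleaving of producer and consumer events.
    public static async Task<List<string>> RunTee()
    {
        var log = new List<string>();
        var captured = new List<int>();
        await foreach (var item in Tee(Source(log), captured))
            log.Add($"consumed {item}");
        return log;
    }
}
```

With `Buffered`, the consumer sees nothing until the producer has finished, so every "produced" entry precedes every "consumed" entry. With `Tee`, the entries alternate, and `captured` still holds every item once the stream ends.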
Evidence
Controlled comparison (same prompt, same model, same repo)
| Agent Type | DelegatingAIAgent? | Built-in tool calls | Result |
|---|---|---|---|
| Worker (plain agent) | ❌ No | 24 permission requests | ✅ Success |
| Planner (DelegatingAIAgent) | ✅ Yes | 0 permission requests | ✅ Success |
| Reviewer (DelegatingAIAgent) | ✅ Yes | 15-183 permission requests | ❌ Fails ~60-100% |
| Reviewer (plain agent, no wrapper) | ❌ No | 183 permission requests | ✅ Success |
The pattern is clear: DelegatingAIAgent + heavy built-in tool usage = failure. Either factor alone works fine.
Production sandbox validation
After removing the DelegatingAIAgent wrapper from the reviewer and switching to text-based structured output:
- Before (with wrapper): 3/3 failures in production, 3/5 failures locally
- After (plain agent): Success with 183 permission requests, 1,476 session events, 605 lifecycle events — the heaviest session we've tested
Local reproduction (5 runs each)
Reviewer with DelegatingAIAgent + file reading: 2/5 PASS (40%)
Reviewer as plain agent + file reading: 5/5 PASS (100%)
Steps to Reproduce
// 1. Create a DelegatingAIAgent that buffers streaming (mimics ToolCaptureAgent)
class BufferingAgent(AIAgent inner) : DelegatingAIAgent(inner)
{
    protected override async IAsyncEnumerable<AgentResponseUpdate> RunCoreStreamingAsync(
        IEnumerable<ChatMessage> messages, AgentSession? session = null,
        AgentRunOptions? options = null, CancellationToken ct = default)
    {
        // Drain the inner stream completely before yielding anything.
        List<AgentResponseUpdate> updates = [];
        await foreach (var update in base.RunCoreStreamingAsync(messages, session, options, ct))
            updates.Add(update);
        foreach (var update in updates)
            yield return update;
    }
}
// 2. Create session with any model
var client = new CopilotClient(new() { GithubToken = token });
var config = new SessionConfig { WorkingDirectory = "/path/to/repo", Model = "claude-opus-4.6" };
var inner = client.AsAIAgent(config, ownsClient: false, name: "test");
var agent = new BufferingAgent(inner); // ← Wrapping causes the bug
// 3. Send prompt that triggers heavy built-in tool usage
var session = await agent.CreateSessionAsync();
var response = await agent.RunAsync(
"Read all .cs files in src/ and summarize them.", session);
// ❌ Intermittently throws: Session error: Execution failed: Error: missing finish_reason for choice 0
Without the wrapper (using inner directly), the same prompt succeeds consistently.
Expected Behavior
DelegatingAIAgent subclasses that buffer streaming updates should work reliably regardless of session length or built-in tool usage count.
Actual Behavior
Sessions fail intermittently with missing finish_reason for choice 0 when a DelegatingAIAgent buffers streaming updates during long multi-turn sessions with heavy built-in tool usage. Failure rate increases with session length.
Environment
- SDK: GitHub.Copilot.SDK v0.1.23 (NuGet, .NET)
- Also uses: Microsoft.Agents.AI.GitHub.Copilot v1.0.0-preview.260225.1
- Runtime: .NET 10
- OS: Reproduced on both Windows (local) and Linux (ADC sandbox/Azure Linux 3.0)
- Models tested: claude-opus-4.6, gpt-5.1-codex (both exhibit the same behavior)
Workaround Used
Avoid DelegatingAIAgent / streaming buffering for agents that perform heavy built-in tool usage. Use text-based structured output (prompt the model to include a parseable JSON line in its response) instead of intercepting tool calls via a wrapper agent.
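For anyone adopting the same workaround, a rough sketch of the text-based approach follows. It assumes the prompt asks the model to end its reply with a single-line JSON object; the helper name `ExtractLastJsonLine` and the field names `verdict` and `issues` are hypothetical and not taken from our actual reviewer. It uses only `System.Text.Json`.

```csharp
using System;
using System.Text.Json;

public static class StructuredOutput
{
    // Returns the last line of responseText that parses as a JSON object,
    // or null if no line does.
    public static JsonElement? ExtractLastJsonLine(string responseText)
    {
        string[] lines = responseText.Split('\n');
        for (int i = lines.Length - 1; i >= 0; i--)
        {
            string line = lines[i].Trim(); // also strips a trailing '\r'
            if (!line.StartsWith('{') || !line.EndsWith('}'))
                continue;
            try
            {
                using JsonDocument doc = JsonDocument.Parse(line);
                if (doc.RootElement.ValueKind == JsonValueKind.Object)
                    return doc.RootElement.Clone(); // Clone so the element outlives the document
            }
            catch (JsonException)
            {
                // Not valid JSON; keep scanning earlier lines.
            }
        }
        return null;
    }
}
```

Usage would look something like `StructuredOutput.ExtractLastJsonLine(response.Text)?.GetProperty("verdict").GetString()`, assuming the response object exposes its text as `Text`. Scanning from the end means any JSON-looking lines the model quotes earlier in its reply are only used if the final line is missing or malformed.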