fix: Complete tool call handling fixes for SDK mode and conversation history by saxyguy81 · Pull Request #26 · CaddyGlow/ccproxy-api

saxyguy81 · 2025-12-19T05:32:56Z

Real-World Use Case

I'm using ccproxy-api to route requests from the superdesign.dev VSCode plugin to my Claude Max subscription. This setup works great for simple conversations, but breaks on longer conversations that trigger tool calls, especially after the client implicitly compacts the conversation history.

Symptoms I Encountered

- First few messages work fine, tool calls complete successfully
- After multiple tool-heavy exchanges, responses start failing:
  - Tool call results missing or incomplete
  - Error: "unexpected tool_use_id found in tool_result blocks"
- STDIO MCP tools (filesystem, chrome-extension) were particularly affected

Root Causes Found

After deep investigation, I found 6 bugs in the tool call handling code:

1. Tool calls only sent for SSE format, not dict format (SDK mode)
2. Race condition: messages discarded between listener check and broadcast
3. Race condition: worker starts before listener is registered
4. No error handling around streaming yields
5. Fire-and-forget cleanup loses in-flight messages
6. Orphaned tool_result blocks after conversation compaction

Summary

This PR fixes all the above bugs affecting tool call handling in ccproxy-api, particularly when using SDK mode with STDIO MCP tools and when conversation history contains compacted/summarized messages.

Issues Fixed:

- Tool calls not being sent in SDK mode (dict format)
- Race conditions causing message loss with fast STDIO tools
- Orphaned tool_result blocks causing API errors after conversation compaction

Changes

1. OpenAI Streaming Adapter (`adapters/openai/streaming.py`)

Bug: Tool calls were only yielded for SSE format, not dict format (SDK mode).

# Before (broken):
elif self.tool_calls and self.enable_tool_calls and self.output_format == "sse":

# After (fixed):
elif self.tool_calls and self.enable_tool_calls:

- Removed SSE-only condition so tool calls work in SDK mode
- Added proper tool call indexing for multiple simultaneous calls
- Added clearing of tool_calls dict after yielding to prevent duplicates
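
A minimal sketch of the behavior these three changes aim for, not the adapter's actual code. The helper `finalize_tool_calls`, its arguments, and the chunk layout are hypothetical and only loosely follow the OpenAI streaming delta shape. The point is that the output format decides how a chunk is encoded, not whether it is emitted, that each call gets its own index, and that the accumulator is cleared afterwards.

```python
import json
from typing import Any


def finalize_tool_calls(
    tool_calls: dict[str, dict[str, Any]],
    output_format: str,
) -> list[Any]:
    """Emit one chunk per accumulated tool call, then clear the accumulator.

    `tool_calls` maps a tool call id to {"name": ..., "arguments": ...}.
    Hypothetical helper; the real adapter's data layout may differ.
    """
    chunks: list[Any] = []
    for index, (call_id, call) in enumerate(tool_calls.items()):
        delta = {
            "tool_calls": [
                {
                    "index": index,  # distinct index per simultaneous call
                    "id": call_id,
                    "type": "function",
                    "function": {
                        "name": call["name"],
                        "arguments": json.dumps(call["arguments"]),
                    },
                }
            ]
        }
        chunk = {"choices": [{"index": 0, "delta": delta, "finish_reason": None}]}
        # The format only selects the encoding; dict format is no longer skipped.
        if output_format == "sse":
            chunks.append(f"data: {json.dumps(chunk)}\n\n")
        else:
            chunks.append(chunk)
    tool_calls.clear()  # a second call emits nothing, so no duplicates
    return chunks
```

Calling the helper twice on the same accumulator returns an empty list the second time, which is the duplicate protection described above.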

2. SDK Streaming Race Conditions (`claude_sdk/stream_worker.py`, `stream_handle.py`)

Bug #1: Non-atomic check-and-broadcast caused message loss

- Removed separate has_listeners() check
- Always call broadcast() which handles atomically
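
To illustrate why the separate check was racy, here is a small sketch under assumptions: the `Broadcaster` class and its lock are hypothetical and are not taken from `stream_worker.py`. If a worker checks for listeners, yields control, and only then sends, a listener registered in between can miss the message. Letting `broadcast()` look at the listener list and deliver under one lock removes that window.

```python
import asyncio


class Broadcaster:
    """Hypothetical broadcaster; the listener check and delivery share one lock."""

    def __init__(self) -> None:
        self._queues: list = []
        self._lock = asyncio.Lock()

    async def add_listener(self) -> asyncio.Queue:
        queue: asyncio.Queue = asyncio.Queue()
        async with self._lock:
            self._queues.append(queue)
        return queue

    async def broadcast(self, message: str) -> int:
        # No has_listeners() call beforehand: the listener list is read and
        # the message delivered inside the same critical section.
        async with self._lock:
            for queue in self._queues:
                queue.put_nowait(message)
            return len(self._queues)


async def demo() -> tuple:
    broadcaster = Broadcaster()
    queue = await broadcaster.add_listener()
    delivered_to = await broadcaster.broadcast("tool_use")
    return delivered_to, queue.get_nowait()
```

The worker side then reduces to an unconditional `await broadcaster.broadcast(message)` for every message.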

Bug #2: Worker started before listener was registered

- Split worker lifecycle into create/start phases
- Pre-register listener before starting worker
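
The create/start split can be sketched as follows. This is an illustration only: `Worker`, `add_listener`, `start`, and `open_stream` are hypothetical names, and the real worker reads from the Claude SDK rather than from a list. Because the listener is registered before `start()` schedules the task, even a worker that produces its first message immediately cannot outrun the listener.

```python
import asyncio
from typing import Optional


class Worker:
    """Hypothetical worker whose construction does not start any work."""

    def __init__(self, messages: list) -> None:
        self._messages = list(messages)
        self._listeners: list = []
        self._task: Optional[asyncio.Task] = None

    def add_listener(self) -> asyncio.Queue:
        queue: asyncio.Queue = asyncio.Queue()
        self._listeners.append(queue)
        return queue

    def start(self) -> None:
        # Keep a reference so the task is not garbage collected mid-run.
        self._task = asyncio.create_task(self._run())

    async def _run(self) -> None:
        for message in self._messages:
            for queue in self._listeners:
                queue.put_nowait(message)
        for queue in self._listeners:
            queue.put_nowait(None)  # sentinel: stream finished


async def open_stream(messages: list) -> list:
    worker = Worker(messages)      # create phase: nothing runs yet
    queue = worker.add_listener()  # listener exists before any message
    worker.start()                 # start phase
    received = []
    while (item := await queue.get()) is not None:
        received.append(item)
    return received
```

With the old ordering (start first, register second), the first messages of a fast tool could be broadcast to an empty listener list.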

Bug #3: Fire-and-forget cleanup lost in-flight messages

- Added 100ms drain period before interrupt
- Changed asyncio.create_task() to await
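
A rough sketch of the cleanup ordering described here, assuming the drain is a plain sleep; the PR text does not say how the drain is implemented, and `cleanup` and `interrupt` are hypothetical names. The interrupt is awaited rather than scheduled with `asyncio.create_task()`, so the caller only continues once cleanup has actually finished.

```python
import asyncio


async def cleanup(interrupt, drain_seconds: float = 0.1) -> None:
    """Wait briefly for in-flight messages, then interrupt and wait for it."""
    await asyncio.sleep(drain_seconds)  # drain period (100 ms by default)
    await interrupt()                   # awaited, not fire-and-forget


async def demo() -> list:
    events = []

    async def late_message() -> None:
        # Stands in for a message that is still in flight when cleanup begins.
        await asyncio.sleep(0.01)
        events.append("message delivered")

    async def interrupt() -> None:
        events.append("interrupted")

    producer = asyncio.create_task(late_message())
    await cleanup(interrupt)
    await producer
    return events
```

In the demo the late message lands inside the drain window, so it is recorded before the interrupt.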

3. SDK Streaming Yield Guarantees (`claude_sdk/streaming.py`)

Bug: No error handling around yields

- Added try-except for GeneratorExit at 5 locations
- Added completion tracking (chunks_sent/total_chunks)
- Added logging for interrupted blocks
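
The pattern can be sketched like this. It is an assumption-laden illustration, not the code in `claude_sdk/streaming.py`: `stream_block`, the `stats` dict, and the logger name are hypothetical. When a consumer closes an async generator, Python raises `GeneratorExit` at the suspended `yield`; catching it allows the completion ratio to be logged, and it must be re-raised so the generator still shuts down.

```python
import asyncio
import logging

logger = logging.getLogger("stream_sketch")  # hypothetical logger name


async def stream_block(chunks: list, stats: dict):
    """Yield chunks while tracking how many the consumer came back for."""
    total_chunks = len(chunks)
    chunks_sent = 0
    try:
        for chunk in chunks:
            yield chunk
            # Reached only when the consumer asks for the next chunk.
            chunks_sent += 1
    except GeneratorExit:
        logger.warning(
            "block interrupted: %d/%d chunks sent", chunks_sent, total_chunks
        )
        raise  # GeneratorExit must propagate so the generator can close
    finally:
        stats["chunks_sent"] = chunks_sent
        stats["total_chunks"] = total_chunks


async def demo() -> dict:
    stats: dict = {}
    gen = stream_block(["a", "b", "c"], stats)
    await gen.__anext__()  # receives "a"
    await gen.__anext__()  # receives "b"; "a" now counts as sent
    await gen.aclose()     # client disconnects mid-block
    return stats
```

Note that this sketch counts a chunk as sent only after the consumer resumes the generator, so the chunk in flight at disconnect time is not included.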

4. Tool Result Sanitization (`adapters/openai/adapter.py`)

Bug: Orphaned tool_result blocks caused API errors when conversation history was compacted.

Error: "unexpected tool_use_id found in tool_result blocks"

- Added _sanitize_tool_results() method
- Removes orphaned tool_result blocks that have no matching tool_use in the preceding assistant message
- Converts the orphaned results to text blocks so their information is preserved
- Logs warnings when sanitization occurs
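
As a hedged illustration of the sanitization step, the standalone function below follows the description above but is not the PR's `_sanitize_tool_results()` method. The function name, the placeholder text format, and the logger name are assumptions; the message shape follows Anthropic-style content blocks.

```python
import logging
from typing import Any

logger = logging.getLogger("sanitize_sketch")  # hypothetical logger name


def sanitize_tool_results(messages: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Replace tool_result blocks that lack a matching tool_use with text."""
    sanitized: list[dict[str, Any]] = []
    for i, message in enumerate(messages):
        content = message.get("content")
        if message.get("role") != "user" or not isinstance(content, list):
            sanitized.append(message)
            continue

        # Collect tool_use ids from the immediately preceding assistant message.
        valid_ids: set = set()
        previous = messages[i - 1] if i > 0 else None
        if (
            previous is not None
            and previous.get("role") == "assistant"
            and isinstance(previous.get("content"), list)
        ):
            valid_ids = {
                block.get("id")
                for block in previous["content"]
                if isinstance(block, dict) and block.get("type") == "tool_use"
            }

        new_content = []
        for block in content:
            orphaned = (
                isinstance(block, dict)
                and block.get("type") == "tool_result"
                and block.get("tool_use_id") not in valid_ids
            )
            if orphaned:
                logger.warning("orphaned tool_result %s", block.get("tool_use_id"))
                # Keep the information as plain text (format is an assumption).
                text = f"[Previous tool result: {block.get('content')}]"
                new_content.append({"type": "text", "text": text})
            else:
                new_content.append(block)
        sanitized.append({**message, "content": new_content})
    return sanitized
```

For a compacted history where the assistant message carrying the `tool_use` was dropped, the user message's `tool_result` block comes back as a text block, while histories with intact pairs pass through unchanged.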

Why These Bugs Occurred

- SDK mode uses dict format, but tool call yielding was SSE-only
- STDIO tools respond in <1ms, hitting race windows that slower API calls don't
- Conversation compaction (by clients like superdesign.dev) removes assistant messages with tool_use blocks while keeping user messages with tool_result references

Test Plan

New Test Files

- `tests/test_tool_call_streaming_fix.py` - OpenAI adapter tool call tests
- `tests/test_tool_result_sanitization.py` - Orphaned tool_result tests (17 test cases)

Run Tests

# Tool result sanitization (standalone)
python -m pytest tests/test_tool_result_sanitization.py --noconftest -c /dev/null -v

Manual Testing Done

- Started ccproxy with SDK mode on port 4000
- Connected superdesign.dev VSCode plugin
- Had extended conversations with multiple tool calls
- Verified no more "unexpected tool_use_id" errors
- Verified tool calls complete successfully in long conversations

Files Changed

| File | Lines | Description |
| --- | --- | --- |
| `ccproxy/adapters/openai/streaming.py` | +29/-8 | SSE-only fix, indexing, clearing |
| `ccproxy/adapters/openai/adapter.py` | +81 | Tool result sanitization |
| `ccproxy/claude_sdk/stream_worker.py` | +13 | Atomic broadcast |
| `ccproxy/claude_sdk/stream_handle.py` | +58/-22 | Listener pre-registration, drain |
| `ccproxy/claude_sdk/streaming.py` | +101/-10 | Yield guarantees |

Breaking Changes

None. All changes are backward compatible.

🤖 Generated with Claude Code

Tool calls were only being yielded for SSE format, not dict format (used by SDK mode). This caused SDK mode clients to miss tool call responses entirely.

Changes:
- Remove SSE-only condition from tool call yielding
- Add proper tool call indexing for multiple simultaneous calls
- Clear tool_calls dict after yielding to prevent duplicates
- Add debug logging for tool call processing

🤖 Generated with [Claude Code](https://claude.ai/code)

Fixes multiple race conditions affecting fast STDIO MCP tools:

1. Message discarding race (stream_worker.py)
   - Removed separate has_listeners() check
   - Always call broadcast() which handles atomically
2. Listener setup race (stream_handle.py)
   - Split worker lifecycle into create/start phases
   - Pre-register listener before starting worker
3. Interrupt timing issues (stream_handle.py)
   - Added 100ms drain period before interrupt
   - Changed asyncio.create_task() to await

These race conditions were particularly problematic for STDIO tools (filesystem, chrome-extension) which respond in <1ms, hitting the race windows that slower API calls don't.

🤖 Generated with [Claude Code](https://claude.ai/code)

Adds error handling and completion tracking around yield statements in the SDK streaming processor:

- Added try-except for GeneratorExit at 5 locations
- Added completion tracking (chunks_sent/total_chunks)
- Added logging for interrupted blocks with completion ratio
- Ensures partial block delivery is tracked and logged

This helps diagnose issues when clients disconnect mid-stream and ensures proper cleanup of streaming resources.

🤖 Generated with [Claude Code](https://claude.ai/code)

Adds _sanitize_tool_results() to handle cases where conversation history is compacted and tool_result blocks become orphaned (their matching tool_use blocks are removed).

The Anthropic API requires each tool_result to have a matching tool_use in the immediately preceding assistant message. When clients compact history, this invariant can be violated.

Fix:
- Scan for valid tool_use IDs in preceding assistant message
- Remove orphaned tool_result blocks that don't match
- Convert orphaned results to text blocks to preserve information
- Log warnings when sanitization occurs

Error fixed: "unexpected tool_use_id found in tool_result blocks"

🤖 Generated with [Claude Code](https://claude.ai/code)

Adds comprehensive test coverage for the tool call fixes:

1. test_tool_call_streaming_fix.py
   - Tests tool calls work in dict format (SDK mode)
   - Tests multiple tool calls are properly indexed
   - Tests SSE format still works (no regression)
   - Tests complex tool call arguments
2. test_tool_result_sanitization.py (17 test cases)
   - Tests valid tool_results are preserved
   - Tests orphaned tool_results are removed
   - Tests mixed valid/orphaned scenarios
   - Tests conversation compaction scenario
   - Tests edge cases (empty, complex content, etc.)

🤖 Generated with [Claude Code](https://claude.ai/code)

CaddyGlow · 2026-01-03T21:01:18Z

Hello, did you try v0.2 first? https://github.com/CaddyGlow/ccproxy-api/tree/dev/v0.2 I have to switch main to this version but didn't get proper time to finish cleaning up the log output. It should be ready to use though. I put a lot of effort into improving the handling of transformations between APIs. I'll happily apply any fixes to this branch.

uvx --with "ccproxy-api[all]==0.2.0" ccproxy serve --port 8000

saxyguy81 · 2026-01-04T02:51:45Z

This PR has been superseded by a new PR targeting dev/v0.2 branch. Please see #31 for the updated implementation that works with the dev/v0.2 architecture.

CaddyGlow · 2026-01-05T21:10:20Z

Fixed in v0.2.0

saxyguy81 · 2026-01-07T08:35:57Z

Superseded by the dev/v0.2 PR: #31 (merged).

smhanan added 5 commits December 18, 2025 23:31

CaddyGlow force-pushed the main branch 2 times, most recently from c982f22 to e8d882b Compare January 4, 2026 23:41

CaddyGlow closed this Jan 5, 2026

saxyguy81 wants to merge 5 commits into CaddyGlow:main from saxyguy81:fix/tool-call-streaming-and-validation
