fix(history): atomic pair-aware truncation for tool_call blocks #487
TatsuKo-Tsukimi wants to merge 1 commit into dataelement:main
Conversation
Conversation
Replace naive `conversation[-ctx_size:]` slicing with a walker that treats `assistant.tool_calls` and its matching `role="tool"` messages as one atomic block. Naive slicing can leave an orphan `role="tool"` at the head when the cut lands mid-pair, which OpenAI rejects with "No tool call found for function call output" (issue dataelement#446).

New helper `services/history_window.truncate_by_message_count` walks the tail backward, identifies blocks, and includes each block whole or not at all. Orphan tool messages are silently dropped regardless of budget.

Replaces the head-only pop guard in:
- app/api/websocket.py (web chat)
- app/api/feishu.py (feishu channel)

Leaves app/services/llm/caller.py:626 untouched (call_agent_llm short-reply path's hardcoded `[-10:]` is intentional).

Tests: 15 covering empty input, budget edges, parallel tool_calls, multi-pair, head/mid orphans, and a realistic 60-message invariant.

Addresses the dataelement#446 failure mode.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
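The walker described above could look roughly like the following minimal sketch. This is not the PR's actual implementation (which lives in `services/history_window.py`); it assumes OpenAI-style message dicts (`role`, `tool_calls`, `tool_call_id`) and illustrates the backward walk, the all-or-nothing block rule, and the unconditional orphan drop:

```python
from typing import Any

Message = dict[str, Any]


def truncate_by_message_count(conversation: list[Message], ctx_size: int) -> list[Message]:
    if ctx_size <= 0 or not conversation:
        return []

    # Every tool_call id issued by an assistant message; a role="tool"
    # message whose tool_call_id is not in this set is an orphan.
    issued = {
        tc["id"]
        for m in conversation
        if m.get("role") == "assistant"
        for tc in (m.get("tool_calls") or [])
    }

    kept: list[Message] = []  # built back-to-front
    i = len(conversation) - 1
    while i >= 0:
        msg = conversation[i]
        if msg.get("role") == "tool":
            # Gather the contiguous run of tool replies plus the assistant
            # message that issued them (if adjacent): one atomic block.
            j = i
            while j >= 0 and conversation[j].get("role") == "tool":
                j -= 1
            if j >= 0 and conversation[j].get("tool_calls"):
                block, i = conversation[j : i + 1], j - 1
            else:
                block, i = conversation[j + 1 : i + 1], j
        else:
            block, i = [msg], i - 1

        # Orphan tool replies are dropped regardless of budget.
        block = [
            m for m in block
            if m.get("role") != "tool" or m.get("tool_call_id") in issued
        ]
        if len(kept) + len(block) > ctx_size:
            break  # all-or-nothing: never split a block across the cut
        kept = block + kept  # prepend to preserve chronological order

    return kept
```

With this shape, a cut that would land between an assistant's `tool_calls` and its replies instead stops one block earlier, so the window never starts with a stranded `role="tool"` message.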
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 114be24657
```python
# Naive [-ctx_size:] slicing can leave orphan tool messages at the
# head when the cut lands mid-pair, which OpenAI rejects with
# "No tool call found for function call output" (issue #446).
_truncated = truncate_by_message_count(conversation, ctx_size)
```
Bound pair-aware truncation to recent history
`truncate_by_message_count(conversation, ctx_size)` now runs over the full in-memory conversation on every turn, but this websocket loop keeps appending messages and never prunes that list, so truncation cost grows with session length instead of being effectively bounded by `ctx_size` like the previous `conversation[-ctx_size:]` slice. In long-lived chats this introduces steadily increasing per-turn latency and CPU usage; pre-limiting the input (for example to a bounded tail window) before pair-aware truncation would avoid this regression while preserving the pairing fix.
Summary
Replace naive `conversation[-ctx_size:]` slicing with a walker that treats `assistant.tool_calls` and its matching `role="tool"` messages as one atomic block. Naive slicing can leave an orphan `role="tool"` at the head when the cut lands mid-pair; OpenAI rejects this with `No tool call found for function call output` (issue #446).

New helper `services/history_window.truncate_by_message_count` detects orphan tool messages by `tool_call_id` matching, not adjacency: a tool message inserted between a valid pair and other messages (from malformed persistence or upstream truncation) is dropped, not folded into a block. Robust against orphans at any position.

Changes
- `backend/app/services/history_window.py`: pair-aware truncation helper
- `backend/tests/test_history_window.py`: 16 unit tests (empty / budget edges / parallel tool_calls / multi-pair / head and mid orphans / orphan-adjacent-to-pair / 60-message synthetic invariant)
- `backend/app/api/websocket.py`: replace the head-only pop guard at the tool-call truncation site
- `backend/app/api/feishu.py`: same pattern for the IM channel path
- `app/services/llm/caller.py:626` untouched (call_agent_llm short-reply path's hardcoded `[-10:]` is intentional)

Stack
This is PR 1/4 of a context-budget hygiene series. The full stack:
Each follow-up is its own PR, posted separately. Recommend reviewing in order; merging in order.
Test plan
- `pytest backend/tests/test_history_window.py`: 16/16 pass
- confirmed the `caller.py:626` `[-10:]` short-reply path was intentionally left untouched

Addresses the failure mode reported in #446. Reproduce: a long enough conversation with tool calls; the legacy slice plus head-pop leaves an orphan that reaches the OpenAI Responses API. The new helper drops it.
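The reproduction described above can be sketched as a unit-test-sized demo. The message shapes are assumed OpenAI-style and the code is illustrative, not taken from the PR:

```python
# Demonstrate the #446 failure mode: a naive tail slice that cuts between an
# assistant tool_call and its reply strands an orphan role="tool" message at
# the head of the window, which the API then rejects.
conversation = [
    {"role": "user", "content": "weather?"},
    {"role": "assistant", "content": None,
     "tool_calls": [{"id": "call_1", "type": "function",
                     "function": {"name": "get_weather", "arguments": "{}"}}]},
    {"role": "tool", "tool_call_id": "call_1", "content": "sunny"},
    {"role": "assistant", "content": "It is sunny."},
    {"role": "user", "content": "thanks"},
]

window = conversation[-3:]          # naive slice: cut lands mid-pair
assert window[0]["role"] == "tool"  # window now starts with a tool reply

# No assistant message inside the window issued call_1, so the tool
# reply is an orphan; sending `window` upstream fails with
# "No tool call found for function call output".
issued_in_window = {
    tc["id"] for m in window if m.get("role") == "assistant"
    for tc in (m.get("tool_calls") or [])
}
orphans = [m for m in window
           if m.get("role") == "tool"
           and m["tool_call_id"] not in issued_in_window]
assert orphans
```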
🤖 Generated with Claude Code