fix(esc): override cancelled tool_result with REJECT_MESSAGE in production path#150

Merged
ericleepi314 merged 1 commit into main from fix/esc-cancel-reject-message-production-path
May 16, 2026

@ericleepi314
Collaborator

Summary

  • When ESC fires during a Bash command, the production tool-dispatch path was passing the bash tool's <error>Command was aborted before completion</error> payload through to the model. On the resume turn, the model read this as a generic failure and retried the command instead of honouring the cancel.
  • The TS reference at StreamingToolExecutor.ts:153-205 overrides the tool_result with REJECT_MESSAGE when the abort reason is user_interrupted, but the Python production REPL bypasses StreamingToolExecutor (it dispatches via _run_tools_partitioned → _dispatch_single_tool), so that override never fired in production.
  • This PR adds the override at four sites in _dispatch_single_tool (pre-tool gate, post-tool override, AbortError catch, late-abort tail) — all funneling through two helpers (_is_user_cancelled_abort, _build_user_cancelled_result). The sibling_error reason is carved out so the streaming-executor's parallel-tool cascade isn't mislabelled as a user rejection. The except AbortError branch re-gates on the same user-cancel check so future tools repurposing AbortError for their own internal cancellation aren't silently relabelled.

Reproduction (before the fix)

User runs a long Bash command, presses ESC, then types please resume. The model sees the bash tool_result content as <error>Command was aborted before completion</error> and treats it as a transient failure — it retries the command instead of honouring the user's cancel.

After the fix

The tool_result for the ESC-cancelled call now contains REJECT_MESSAGE ("The user doesn't want to proceed with this tool use. The tool use was rejected (eg. if it was a file edit, the new_string was NOT written to the file). STOP what you are doing and wait for the user to tell you how to proceed."). On the resume turn the model sees an unambiguous "user rejected" signal in the conversation history.

Test plan

  • New regression file tests/test_esc_reject_message_dispatch.py — 13 tests covering each override site, the sibling_error carve-out, normal completion, and the defensive AbortError-without-signal-aborted case
  • Wider sweep: 217/217 tests pass across test_esc_reject_message_dispatch + test_esc_cancel_propagation + test_abort_controller* + test_streaming_executor_interruptible + test_tool_execution_integration + test_fast_path_dispatch + test_tool_result_budget + test_query_loop + test_query_engine + test_query_error_recovery + test_query_hook_stopped + test_query_terminal + test_streaming_query_loop + test_bash_parser + test_bash_security
  • Critic review: APPROVE after one revision round (tightened the except AbortError gate and reworked the sibling_error test to exercise a real failure payload)

TS parity

Mirrors typescript/src/services/tools/StreamingToolExecutor.ts:153-205 (createSyntheticErrorMessage for user_interrupted) and :278-292/:332-345 (the initial-abort branch + per-iteration abort check).

Divergence noted in _is_user_cancelled_abort docstring: TS distinguishes 'interrupt' (mid-stream submit) from 'user_interrupted' (ESC) via a per-tool interruptBehavior() === 'cancel' gate. Python today emits neither 'interrupt' nor any per-tool interrupt_behavior, so the collapsed check is sound. Any future 'interrupt' wire-up must land the per-tool gate first.
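
For reference, the per-tool gate that a future 'interrupt' wire-up would need might look something like the sketch below. This is a hypothetical illustration of the TS behaviour described above and does not exist in the Python code today: the function name, the `interrupt_behavior` parameter, and the `"block"` value used in the usage lines are assumptions; only the `'cancel'` value and the two reason strings come from this PR.

```python
from typing import Optional


def should_override_with_reject(reason: Optional[str],
                                interrupt_behavior: str) -> bool:
    """Hypothetical un-collapsed check, mirroring the TS distinction."""
    if reason == "user_interrupted":
        # ESC: always treated as a user rejection.
        return True
    if reason == "interrupt":
        # Mid-stream submit: only tools that opt in with 'cancel' are rejected.
        return interrupt_behavior == "cancel"
    # Any other reason (for example sibling_error) is not a user rejection.
    return False


# Usage ("block" is an assumed name for the non-cancel behaviour).
esc_case = should_override_with_reject("user_interrupted", "block")
submit_on_blocking_tool = should_override_with_reject("interrupt", "block")
submit_on_cancellable_tool = should_override_with_reject("interrupt", "cancel")
```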

Follow-ups (out of scope)

  • Bash timeout produces the same <error>Command was aborted before completion</error> payload but with signal.aborted == False, so the post-tool override doesn't fire. The model may still retry on timeout — pre-existing behavior.
  • Subagent contexts spawned mid-turn hold their own ToolContext snapshot; if ESC trips before reset_abort_controller, a delayed wake-up could see the previously-aborted controller and trigger REJECT_MESSAGE spuriously.
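
The timeout follow-up can be illustrated with a minimal sketch of the post-tool check. The function name and the placeholder string are assumptions for the example. The point is that the payload is identical in both cases and only the signal state differs.

```python
from typing import Optional

ABORTED_PAYLOAD = "<error>Command was aborted before completion</error>"
REJECT_PLACEHOLDER = "<REJECT_MESSAGE>"  # placeholder for the real REJECT_MESSAGE text


def post_tool_override(content: str, signal_aborted: bool,
                       reason: Optional[str] = None) -> str:
    """Hypothetical post-tool override keyed on the abort signal, not the payload."""
    if signal_aborted and reason != "sibling_error":
        return REJECT_PLACEHOLDER
    return content


# ESC: the signal is aborted, so the payload is replaced.
esc_result = post_tool_override(ABORTED_PAYLOAD, True, "user_interrupted")

# Timeout: same payload, but signal.aborted == False, so it passes through.
timeout_result = post_tool_override(ABORTED_PAYLOAD, False)
```

Because the check never inspects the payload text, a timeout still reaches the model as the generic `<error>` message, which is why a retry remains possible in that case.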

🤖 Generated with Claude Code

…ction path

When ESC fires during a Bash command, the production tool-dispatch path
(`_dispatch_single_tool` in `src/query/query.py`) was passing the bash
tool's `<error>Command was aborted before completion</error>` payload
through to the model. On the resume turn, the model read this as a
generic failure and retried the bash command instead of honouring the
user's cancel.

The TS reference at `StreamingToolExecutor.ts:153-205` overrides the
tool_result with REJECT_MESSAGE when the abort reason is
`user_interrupted`, but the Python production REPL bypasses
`StreamingToolExecutor` (it dispatches via `_run_tools_partitioned` →
`_dispatch_single_tool`), so the override never fired in production.

Add the override at four sites in `_dispatch_single_tool` — pre-tool
gate, post-tool override, `AbortError` catch, and a late-abort tail in
the generic exception handler — all funneling through two helpers
(`_is_user_cancelled_abort`, `_build_user_cancelled_result`). The
`sibling_error` reason is carved out so the streaming-executor's
parallel-tool cascade doesn't get mislabelled as a user rejection.

The `except AbortError` branch re-gates on the same user-cancel check
so future tools that repurpose `AbortError` for their own internal
cancellation aren't silently relabelled as "user rejected".

Pinned by 13 tests in `tests/test_esc_reject_message_dispatch.py`
covering each override site, the `sibling_error` carve-out, normal
completion, and the defensive `AbortError`-without-signal-aborted
case. 217/217 tests pass across the abort/ESC/query/bash domain.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>