feat(grep,glob): abort ripgrep mid-search instead of waiting out timeout #142
Merged
ericleepi314 merged 1 commit on May 15, 2026
Before this fix, the ripgrep wrapper used ``subprocess.run(timeout=20)``. A SIGINT/ESC that tripped the abort controller mid-search had to wait out the full 20-second timeout before the subprocess returned and the agent loop could observe the cancellation. On a large repo Glob and Grep felt exactly like the pre-PR-#135 Bash supervisor: "ESC is ignored for 20+ seconds."

Add ``_run_rg_with_abort`` that mirrors bash_tool's pattern: Popen + 50ms poll loop watching both the deadline and an ``AbortSignal``, SIGTERM → 2s grace → SIGKILL. New ``RipgrepAbortedError`` is distinct from ``RipgrepTimeoutError`` because the two outcomes warrant different downstream handling: timeout surfaces partial results (useful to the agent), abort drops them (the user already cancelled).

Glob and Grep pass ``context.abort_controller.signal`` to the helper and re-raise ``AbortError`` on ``RipgrepAbortedError`` so the agent loop's ``except AbortError: raise`` branch from PR #135 unwinds to the outer cancel boundary. ``ripgrep()``'s ``abort_signal=None`` default keeps SDK consumers of the bare helper working without changes.

Six regression tests pin the contract:

* bare helper sanity
* pre-call trip
* abort-tripped-subprocess-returns-promptly (deterministic event handshake against ``sleep``; also exercises the supervisor without requiring ripgrep installed)
* Glob abort → AbortError
* Grep abort → AbortError
* unavailable-rg path unchanged

Read, WebFetch, WebSearch and MCP are deliberately deferred:

* Read: ``open().read()`` returns in <1s for typical files; the agent loop's pre-dispatch ``_check_cancel`` already short-circuits an abort fired between tools.
* WebFetch/WebSearch: ``urllib.request.urlopen(timeout=15)`` is bounded at 15s; a watchdog-thread approach is bigger surgery best done as a separate PR.
* MCP: out-of-process JSON-RPC needs transport-level cancellation, a cross-cutting change for a separate PR.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
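The supervisor pattern described in the commit message (Popen, a short poll interval, then SIGTERM followed by a grace period and SIGKILL) can be sketched roughly as below. This is an illustration under assumptions, not the code from this PR: ``run_with_abort``, ``AbortedError`` and ``TimedOutError`` are hypothetical names, a ``threading.Event`` stands in for the project's ``AbortSignal``, and the process-group teardown (``start_new_session=True`` / ``CREATE_NEW_PROCESS_GROUP``) is left out for brevity.

```python
import subprocess
import sys
import threading
import time


class AbortedError(Exception):
    """Hypothetical stand-in for RipgrepAbortedError."""


class TimedOutError(Exception):
    """Hypothetical stand-in for RipgrepTimeoutError; keeps partial output."""

    def __init__(self, message, partial_output=""):
        super().__init__(message)
        self.partial_output = partial_output


def _stop(proc, grace_s):
    """Ask the child to exit (SIGTERM on POSIX), then force-kill after the grace period."""
    proc.terminate()
    try:
        out, _ = proc.communicate(timeout=grace_s)
    except subprocess.TimeoutExpired:
        proc.kill()
        out, _ = proc.communicate()
    return out or ""


def run_with_abort(argv, *, timeout_s, abort_event=None, poll_s=0.05, grace_s=2.0):
    """Run argv, waking every poll_s to check the abort event and the deadline."""
    if abort_event is not None and abort_event.is_set():
        raise AbortedError("aborted before the subprocess started")

    proc = subprocess.Popen(
        argv, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL, text=True
    )
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # communicate() keeps draining stdout, so a chatty child cannot
            # block on a full pipe; retrying after a timeout loses no output.
            out, _ = proc.communicate(timeout=poll_s)
            return proc.returncode, out
        except subprocess.TimeoutExpired:
            pass
        if abort_event is not None and abort_event.is_set():
            _stop(proc, grace_s)  # output is discarded: the user cancelled
            raise AbortedError("aborted mid-run")
        if time.monotonic() >= deadline:
            partial = _stop(proc, grace_s)
            raise TimedOutError("deadline exceeded", partial_output=partial)
```

With a 50ms poll, the worst-case latency between the abort firing and the child receiving SIGTERM is about one poll interval, which is why the cancel is observed promptly instead of after the full timeout.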
ericleepi314 added a commit that referenced this pull request on May 15, 2026
…out (#142)
ericleepi314 added a commit that referenced this pull request on May 15, 2026
…out (#142) (#143)
Summary
* ``Glob``/``Grep`` used ``subprocess.run(timeout=20)`` for ripgrep: a SIGINT/ESC that tripped the abort controller had to wait out the full 20s timeout before the agent loop noticed
* ``RipgrepAbortedError`` is distinct from ``RipgrepTimeoutError``: timeout surfaces partial results (useful to the agent), abort drops them (the user already cancelled, so surfacing them would be noise)

Why
PR #135 wired ``tool_context.abort_controller`` so subagents and Bash honor ESC. Other long-running tools (Read, Glob, Grep, WebFetch, WebSearch, MCP) still don't poll the signal, so they return only at natural completion. This PR closes the gap for the search tools, the most visible offenders: searching a large monorepo can pin the user's terminal for 20s after they press ESC.

Changes
* ``src/tool_system/utils/ripgrep.py``:
  * ``_run_rg_with_abort(argv, *, timeout_s, abort_signal)``: Popen-based supervisor mirroring ``bash_tool._run_bash_with_abort`` (50ms poll, SIGTERM → 2s grace → SIGKILL, ``start_new_session=True`` / Windows ``CREATE_NEW_PROCESS_GROUP`` for process-group teardown)
  * ``RipgrepAbortedError(message, partial_results=...)`` raised when the signal trips
  * ``ripgrep(...)`` grows an ``abort_signal: AbortSignal | None = None`` keyword param; default-None preserves previous semantics exactly for SDK consumers
* ``src/tool_system/tools/glob.py`` + ``src/tool_system/tools/grep.py``:
  * Pass ``context.abort_controller.signal`` to ``ripgrep(...)``
  * Catch ``RipgrepAbortedError`` and re-raise ``AbortError(...)`` so the agent loop's ``except AbortError: raise`` branch (PR #135, fix(esc): propagate cancel signal into tool_context for subagents) unwinds to the outer cancel boundary instead of synthesizing a partial tool result
* ``tests/test_ripgrep_abort.py`` (new), 6 regression tests:
  * bare helper sanity
  * pre-call trip → ``RipgrepAbortedError`` fast
  * abort-tripped subprocess returns promptly, via a deterministic event handshake against ``sleep 60`` (no real ripgrep required; also serves as CI coverage on machines without ``rg`` installed)
  * Glob abort → ``AbortError``
  * Grep abort → ``AbortError``
  * unavailable-rg path: ``RipgrepUnavailableError`` still raised correctly

Test plan
* Run ``Grep`` against a large monorepo and press ESC: it should unwind within ~1s instead of 20s

Out of scope (deferred to separate follow-ups)
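* Read: ``open().read()`` returns in <1s for typical files; the agent loop's pre-dispatch ``_check_cancel`` already short-circuits an abort fired between tools.
* WebFetch/WebSearch: bounded at 15s by ``urllib.request.urlopen(timeout=15)``. The cleanest abort requires a watchdog thread that closes the socket; separate PR.
* MCP: out-of-process JSON-RPC needs transport-level cancellation; separate PR.

Dependencies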
* ``main`` once #141 (feat(headless): SIGINT mid-tool cancels via AbortController) merges. Originally PR #139 (feat(grep,glob): abort ripgrep mid-search instead of waiting out timeout), stacked on PR #138 (feat(headless): SIGINT mid-tool cancels via AbortController).

🤖 Generated with Claude Code
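The different downstream handling of the two outcomes (timeout surfaces partial results, abort re-raises ``AbortError``) might look roughly like the sketch below at the tool layer. Everything here is a stand-in defined for illustration: the three exception classes mimic the names in this PR but are not the project's actual definitions, the ``partial_results`` attribute on the timeout error is an assumption, and ``run_search_tool`` and its result dictionary are hypothetical.

```python
class AbortError(Exception):
    """Stand-in for the cancellation exception the agent loop re-raises."""


class RipgrepTimeoutError(Exception):
    """Stand-in: the search hit its deadline; partial results are kept."""

    def __init__(self, message, partial_results=()):
        super().__init__(message)
        self.partial_results = list(partial_results)


class RipgrepAbortedError(Exception):
    """Stand-in: the abort signal tripped while the search was running."""

    def __init__(self, message, partial_results=()):
        super().__init__(message)
        self.partial_results = list(partial_results)


def run_search_tool(search, abort_signal=None):
    """Call search(abort_signal=...) and map its failure modes to tool outcomes.

    `search` is any callable returning a list of matches (hypothetical).
    """
    try:
        matches = search(abort_signal=abort_signal)
        return {"matches": matches, "truncated": False}
    except RipgrepTimeoutError as exc:
        # Timeout: partial results are still useful to the agent.
        return {"matches": exc.partial_results, "truncated": True}
    except RipgrepAbortedError as exc:
        # Abort: the user cancelled, so drop any partial results and let
        # the cancellation unwind instead of returning a tool result.
        raise AbortError(str(exc)) from exc
```

Keeping the two exception types separate means the tool never has to guess from a message string whether a truncated search should be reported or discarded.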