fix(llm): propagate cancel token into provider streaming task #126

Merged: emal-avala merged 1 commit into main from `fix/esc-interrupt-http-stream` on Apr 15, 2026
Conversation

@emal-avala (Member)

Problem

The Escape-key interrupt added in #106 never actually worked against a real LLM. The key press was detected, `cancel_token.cancel()` fired, and the query engine's outer select loop at `query/mod.rs:531` exited cleanly — but the provider's own streaming task (spawned in `anthropic.rs`, `openai.rs`, and `azure_openai.rs`) kept polling `byte_stream.next().await` because it had no knowledge of the token. The reqwest response stayed open and the task kept emitting events into a receiver nobody was reading, so from the user's seat the turn looked uninterrupted until the LLM finished writing on its own.
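For illustration, the pre-fix task looked roughly like this (identifiers such as `tx` and `parse_sse` are placeholders, not the exact provider code):

```rust
// Sketch of the buggy provider task: the spawned loop only ever awaits
// the byte stream, so cancelling the query engine's token has no effect
// here and the HTTP connection stays open.
tokio::spawn(async move {
    let mut byte_stream = response.bytes_stream();
    while let Some(chunk) = byte_stream.next().await {
        // Keeps parsing SSE chunks and sending events even after the
        // receiver on the other end has stopped listening.
        let _ = tx.send(parse_sse(chunk));
    }
});
```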

The existing `HangingProvider` cancel tests kept passing throughout because that mock ignores the token and the query loop exits on its own — they were testing the select loop, not the actual bug.

Fix

  1. Add `cancel: CancellationToken` to `ProviderRequest` (`crates/lib/src/llm/provider.rs`).
  2. In each provider's spawned task, race `byte_stream.next().await` against `cancel.cancelled()` via `tokio::select!` (biased on cancel). On cancel the task `return`s, which drops the byte stream, drops the `reqwest::Response`, and aborts the underlying HTTP connection immediately. Applied to all three providers:
    • `crates/lib/src/llm/anthropic.rs`
    • `crates/lib/src/llm/openai.rs`
    • `crates/lib/src/llm/azure_openai.rs`
  3. Thread the token through all four `ProviderRequest` call sites:
    • `query/mod.rs` — passes `self.cancel.clone()` so Esc interrupts the main turn.
    • `services/compact.rs` — new `cancel` parameter on `compact_with_llm`; query loop passes `self.cancel.clone()` so Esc also interrupts inline compaction summaries.
    • `memory/extraction.rs` and `memory/consolidation.rs` — pass a fresh `CancellationToken::new()` since these are background tasks that should not be user-cancellable.
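The per-provider change in step 2 looks roughly like the following sketch (assuming `ProviderRequest` now carries `cancel: CancellationToken`; `tx` and `parse_sse` are placeholders):

```rust
// Sketch of the fixed task: race the stream read against cancellation,
// biased so the cancel branch wins a tie.
tokio::spawn(async move {
    let mut byte_stream = response.bytes_stream();
    loop {
        tokio::select! {
            biased;
            _ = cancel.cancelled() => {
                // Returning drops `byte_stream` and the reqwest::Response,
                // which aborts the underlying HTTP connection immediately.
                return;
            }
            chunk = byte_stream.next() => match chunk {
                Some(chunk) => { let _ = tx.send(parse_sse(chunk)); }
                None => return, // stream finished normally
            },
        }
    }
});
```

The `biased;` prefix makes `tokio::select!` poll the cancel branch first on every iteration, so a pending cancel is never starved by a steadily arriving stream.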

Test

Added `provider_stream_task_observes_cancellation` in `crates/lib/src/query/mod.rs`. It introduces a new `CancelAwareHangingProvider` whose spawned task mirrors the real providers — it races a `pending` future (standing in for `byte_stream.next().await`) against `ProviderRequest::cancel` and flips an `exit_flag` when the token fires. The test runs a turn, schedules a cancel at 50 ms, and asserts the flag flipped. This test fails if the token is dropped anywhere between `query/mod.rs` and the provider's spawn.
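The mock's spawned task is essentially this (a sketch of the assumed shape, not the test verbatim):

```rust
// CancelAwareHangingProvider's task: race a never-ready future against
// the request's token and record that the cancel branch actually ran.
let exit_flag = Arc::new(AtomicBool::new(false));
let flag = exit_flag.clone();
tokio::spawn(async move {
    tokio::select! {
        biased;
        _ = request.cancel.cancelled() => flag.store(true, Ordering::SeqCst),
        _ = std::future::pending::<()>() => unreachable!(),
    }
});
```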

Full workspace:

```
test query::tests::cancel_shared_propagates_to_current_token ... ok
test query::tests::cancel_before_first_event_interrupts_cleanly ... ok
test query::tests::stream_loop_responds_to_cancellation ... ok
test query::tests::cancel_does_not_poison_next_turn ... ok
test query::tests::cancelled_turn_emits_warning_to_sink ... ok
test query::tests::run_turn_with_sink_interrupts_on_cancel ... ok
test query::tests::provider_stream_task_observes_cancellation ... ok
test query::tests::cancel_works_across_multiple_turns ... ok
test result: ok. 8 passed

lib unit total

test result: ok. 554 passed
```

All integration suites (message, permissions, provider, skills, config, sandbox, shell passthrough) green.

Manual test plan

  • `cargo run -p agent-code`, ask for a long reply ("write a 2000-word explanation of the ownership model"), press Esc mid-stream → stream should stop within one SSE chunk.
  • Same with Ctrl+C → same result.
  • Confirm normal completion still works end-to-end (Done event arrives, turn summary renders).
  • Repeat across providers if you have the keys: Anthropic (`claude-sonnet-4`), OpenAI (`gpt-5.4`), and one OpenAI-compat endpoint (`ollama` or `groq`) to cover all three files touched.
  • Trigger compaction (long session) and press Esc during the compaction step — should also interrupt.

emal-avala merged commit 29a49cb into main on Apr 15, 2026 (12 of 14 checks passed).
emal-avala deleted the fix/esc-interrupt-http-stream branch April 15, 2026 09:24.
emal-avala added a commit that referenced this pull request Apr 15, 2026
Highlights since v0.15.2:
- feat(sandbox): Linux bwrap strategy for the Bash tool (#124)
- fix(cli): translate LF->CRLF in streaming sink to stop rendered drift (#125)
- fix(llm): propagate cancel token into provider streaming task (#126)