fix(llm): propagate cancel token into provider streaming task #126

Merged: emal-avala merged 1 commit into main from `fix/esc-interrupt-http-stream` on Apr 15, 2026
Conversation

@emal-avala (Member)

Problem

The Escape-key interrupt added in #106 never actually worked against a real LLM. The key press was detected, `cancel_token.cancel()` fired, and the query engine's outer select loop at `query/mod.rs:531` exited cleanly — but the provider's own streaming task (spawned in `anthropic.rs`, `openai.rs`, and `azure_openai.rs`) kept polling `byte_stream.next().await` because it had no knowledge of the token. The reqwest response stayed open and the task kept emitting events into a receiver nobody was reading, so from the user's seat the turn looked uninterrupted until the LLM finished writing on its own.
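For illustration, the pre-fix task looked roughly like this (identifiers such as `tx` and `parse_sse` are placeholders, not the exact provider code):

```rust
// Sketch of the buggy provider task: the spawned loop only ever awaits
// the byte stream, so cancelling the query engine's token has no effect
// here and the HTTP connection stays open.
tokio::spawn(async move {
    let mut byte_stream = response.bytes_stream();
    while let Some(chunk) = byte_stream.next().await {
        // Keeps parsing SSE chunks and sending events even after the
        // receiver on the other end has stopped listening.
        let _ = tx.send(parse_sse(chunk));
    }
});
```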

The existing `HangingProvider` cancel tests kept passing throughout because that mock ignores the token and the query loop exits on its own — they were testing the select loop, not the actual bug.

Fix

  1. Add `cancel: CancellationToken` to `ProviderRequest` (`crates/lib/src/llm/provider.rs`).
  2. In each provider's spawned task, race `byte_stream.next().await` against `cancel.cancelled()` via `tokio::select!` (biased on cancel). On cancel the task `return`s, which drops the byte stream, drops the `reqwest::Response`, and aborts the underlying HTTP connection immediately. Applied to all three providers:
    • `crates/lib/src/llm/anthropic.rs`
    • `crates/lib/src/llm/openai.rs`
    • `crates/lib/src/llm/azure_openai.rs`
  3. Thread the token through all four `ProviderRequest` call sites:
    • `query/mod.rs` — passes `self.cancel.clone()` so Esc interrupts the main turn.
    • `services/compact.rs` — new `cancel` parameter on `compact_with_llm`; query loop passes `self.cancel.clone()` so Esc also interrupts inline compaction summaries.
    • `memory/extraction.rs` and `memory/consolidation.rs` — pass a fresh `CancellationToken::new()` since these are background tasks that should not be user-cancellable.
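The per-provider change in step 2 looks roughly like the following sketch (assuming `ProviderRequest` now carries `cancel: CancellationToken`; `tx` and `parse_sse` are placeholders):

```rust
// Sketch of the fixed task: race the stream read against cancellation,
// biased so the cancel branch wins a tie.
tokio::spawn(async move {
    let mut byte_stream = response.bytes_stream();
    loop {
        tokio::select! {
            biased;
            _ = cancel.cancelled() => {
                // Returning drops `byte_stream` and the reqwest::Response,
                // which aborts the underlying HTTP connection immediately.
                return;
            }
            chunk = byte_stream.next() => match chunk {
                Some(chunk) => { let _ = tx.send(parse_sse(chunk)); }
                None => return, // stream finished normally
            },
        }
    }
});
```

The `biased;` prefix makes `tokio::select!` poll the cancel branch first on every iteration, so a pending cancel is never starved by a steadily arriving stream.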

Test

Added `provider_stream_task_observes_cancellation` in `crates/lib/src/query/mod.rs`. It introduces a new `CancelAwareHangingProvider` whose spawned task mirrors the real providers — it races a `pending` future (standing in for `byte_stream.next().await`) against `ProviderRequest::cancel` and flips an `exit_flag` when the token fires. The test runs a turn, schedules a cancel at 50 ms, and asserts the flag flipped. This test fails if the token is dropped anywhere between `query/mod.rs` and the provider's spawn.
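The mock's spawned task is essentially this (a sketch of the assumed shape, not the test verbatim):

```rust
// CancelAwareHangingProvider's task: race a never-ready future against
// the request's token and record that the cancel branch actually ran.
let exit_flag = Arc::new(AtomicBool::new(false));
let flag = exit_flag.clone();
tokio::spawn(async move {
    tokio::select! {
        biased;
        _ = request.cancel.cancelled() => flag.store(true, Ordering::SeqCst),
        _ = std::future::pending::<()>() => unreachable!(),
    }
});
```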

Full workspace:

```
test query::tests::cancel_shared_propagates_to_current_token ... ok
test query::tests::cancel_before_first_event_interrupts_cleanly ... ok
test query::tests::stream_loop_responds_to_cancellation ... ok
test query::tests::cancel_does_not_poison_next_turn ... ok
test query::tests::cancelled_turn_emits_warning_to_sink ... ok
test query::tests::run_turn_with_sink_interrupts_on_cancel ... ok
test query::tests::provider_stream_task_observes_cancellation ... ok
test query::tests::cancel_works_across_multiple_turns ... ok
test result: ok. 8 passed

lib unit total

test result: ok. 554 passed
```

All integration suites (message, permissions, provider, skills, config, sandbox, shell passthrough) green.

Manual test plan

  • `cargo run -p agent-code`, ask for a long reply ("write a 2000-word explanation of the ownership model"), press Esc mid-stream → stream should stop within one SSE chunk.
  • Same with Ctrl+C → same result.
  • Confirm normal completion still works end-to-end (Done event arrives, turn summary renders).
  • Repeat across providers if you have the keys: Anthropic (`claude-sonnet-4`), OpenAI (`gpt-5.4`), and one OpenAI-compat endpoint (`ollama` or `groq`) to cover all three files touched.
  • Trigger compaction (long session) and press Esc during the compaction step — should also interrupt.

emal-avala merged commit 29a49cb into main on Apr 15, 2026 (12 of 14 checks passed).
emal-avala deleted the fix/esc-interrupt-http-stream branch April 15, 2026 09:24.
emal-avala added a commit that referenced this pull request Apr 15, 2026
Highlights since v0.15.2:
- feat(sandbox): Linux bwrap strategy for the Bash tool (#124)
- fix(cli): translate LF->CRLF in streaming sink to stop rendered drift (#125)
- fix(llm): propagate cancel token into provider streaming task (#126)