
Stop → Continue: resume truncated or aborted assistant turns #49

Merged
vahid-ahmadi merged 2 commits into main from feat/stop-continue on May 22, 2026

Conversation

@vahid-ahmadi
Collaborator

Summary

A long policy analysis that hits the 16k `max_tokens` cap, or one the user kills with Stop, currently dies there — they have to start a new chat and re-explain context. This PR adds a Continue affordance below those messages that resumes from exactly where the answer stopped.

Closes #44.

Behaviour

  • Truncation detection (backend): `backend/routes/chatbot.py` captures `final.stop_reason` from each Anthropic stream and propagates it on the `done` SSE event. `stop_reason === "max_tokens"` means truncated; `"end_turn"` / `"stop_sequence"` mean the model finished naturally.
  • Manual stop detection (frontend): existing `AbortController` flow now also tags the in-progress message with `stopped: true` before flushing.
  • Continue button: renders below the cost line on assistant messages where `stop_reason === "max_tokens" || stopped`, hidden if any tool in the message is still pending (avoids orphan tool calls).
  • Resume in place: `continueMessage(idx)` posts the conversation up to and including the partial assistant turn back to `/chat/message`. Anthropic's assistant-prefill behaviour means the model continues the same logical turn — no extra "continue from where you stopped" user nudge needed. Streamed deltas append into the same bubble. Cost is summed onto the existing `cost_gbp`.
  • Persistence: `stop_reason` / `stopped` / `cost_gbp` are now serialised by `saveConversation` and restored by `loadConversation`, so the Continue affordance survives a page reload.
  • Per the issue's "out of scope": re-truncation is allowed (button reappears), but no auto-looping — every continuation requires a user click.
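
The visibility rule for the Continue button can be sketched as a small predicate. This is a minimal illustration rather than the code in `ChatPage.tsx`: the `ToolCall` shape, the `tools` field name, and the `shouldShowContinue` helper are assumptions made for the example; only `stop_reason`, `stopped`, and `cost_gbp` are named in this PR.

```typescript
// Hypothetical shapes: only stop_reason, stopped and cost_gbp come from the PR text.
type ToolCall = { name: string; pending: boolean };

interface Message {
  role: "user" | "assistant";
  content: string;
  stop_reason?: string; // copied from the `done` SSE event
  stopped?: boolean; // set when the user aborts the stream
  cost_gbp?: number;
  tools?: ToolCall[]; // assumed field name for tool calls attached to the message
}

// Show Continue only on interrupted assistant turns with no tool still pending.
function shouldShowContinue(msg: Message): boolean {
  if (msg.role !== "assistant") return false;
  const interrupted = msg.stop_reason === "max_tokens" || msg.stopped === true;
  const hasPendingTool = (msg.tools ?? []).some((t) => t.pending);
  return interrupted && !hasPendingTool;
}
```

Keeping the rule in one pure function would make the orphan-tool guard easy to unit test without rendering the page.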

Implementation notes

  • No structural backend changes — same loop, same tools, same prompt cache.
  • `continueMessage` deliberately reuses the streaming protocol but skips the typewriter drain animation: resumed text dumps directly into the message rather than animating, which felt right (the user already read the prefix).
  • Tools complication: the partial assistant might have triggered tool calls in the original turn. The new request only carries serialized text (existing pattern for `apiMessages`); the model continues from the text alone. That's lossy but consistent with how the project already handles multi-turn conversations.
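
As a rough sketch of the request shape described above, the continuation payload is the conversation sliced up to and including the partial assistant turn, reduced to role and text. The `ChatTurn` and `ApiTurn` types and the `buildContinuationPayload` helper are hypothetical names for illustration; the real `continueMessage` may build `apiMessages` differently.

```typescript
// Hypothetical types for illustration; not the project's actual definitions.
interface ChatTurn {
  role: "user" | "assistant";
  content: string;
  stop_reason?: string;
  stopped?: boolean;
  cost_gbp?: number;
}

type ApiTurn = { role: "user" | "assistant"; content: string };

// Keep everything up to and including the partial assistant turn at `idx`,
// and send text only. Tool context is not reconstructed (the lossy part).
function buildContinuationPayload(messages: ChatTurn[], idx: number): ApiTurn[] {
  const target = messages[idx];
  if (!target || target.role !== "assistant") {
    throw new Error("continuation must target an assistant message");
  }
  return messages.slice(0, idx + 1).map(({ role, content }) => ({ role, content }));
}
```

Because the last element is an assistant message, the model treats it as a prefill and picks up the same turn, which is why no extra user nudge is needed.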

Test plan

  • `docker-compose up`. Ask a question that produces a long answer (e.g. "Walk through the entire UK income tax code with examples"). Watch for max_tokens truncation → Continue appears → click → answer resumes inline, cost sums up.
  • Mid-stream click Stop → partial text preserved, Continue appears, click → resumes from exactly where it stopped.
  • Click Stop while a tool is pending → Continue button is hidden for that message (orphan-tool guard).
  • Reload the page (or open from history) → Continue affordance is still there on truncated/stopped messages.
  • Plan-mode replies don't show Continue (they end with `stop_reason: "end_turn"`).
  • Existing chat tests (`pytest backend/tests/test_api.py`) still pass — the only backend change is an additive field on the `done` event.

Out of scope

  • "Continue indefinitely" auto-looping — explicitly user-driven only.
  • Reconstructing tool_use / tool_result blocks across the request boundary (would let the model continue with full tool context but is a bigger refactor).

Backend (chatbot.py):
- Capture `final.stop_reason` from each Anthropic stream and propagate it
  on the `done` SSE event. The frontend uses "max_tokens" to detect
  truncation; "end_turn" / "stop_sequence" mean the model finished cleanly.

Frontend (ChatPage.tsx):
- Extend Message with `stop_reason` and `stopped` flags. The `done` handler
  stores `stop_reason`; the AbortError catch (user clicks Stop) sets
  `stopped: true`. Both flags survive saveConversation/loadConversation
  via the untyped messages JSON column on the backend.
- Render a Continue affordance below any message where
  `stop_reason === "max_tokens" || stopped`, hidden if a tool is still
  pending in the message (no orphan tool calls).
- New `continueMessage(idx)` posts the conversation up to and including
  the partial assistant turn back to /chat/message. Anthropic's
  assistant-prefill behaviour means the model continues the same logical
  turn — no "Continue from where you stopped" nudge needed. Streamed
  content appends into the SAME message bubble; cost is summed onto the
  existing `cost_gbp`. If continue itself truncates or is stopped, the
  affordance comes back (user-driven, not auto-loop).

Acceptance criteria:
- max_tokens → Continue button appears.
- User clicks Stop mid-stream → partial preserved + Continue appears.
- Continue resumes in-place, single bubble, summed cost.
- Out of scope: indefinitely auto-continuing — kept user-triggered.

Closes #44

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@vercel

vercel Bot commented May 6, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| policyengine-uk-chat | Ready | Preview, Comment | May 22, 2026 10:04am |


@github-actions

github-actions Bot commented May 6, 2026

Beta preview has been cleaned up because this PR was closed.

@vahid-ahmadi vahid-ahmadi self-assigned this May 6, 2026
@vahid-ahmadi vahid-ahmadi requested a review from SakshiKekre May 6, 2026 12:00
Resolve the ChatPage.tsx conflict: the no-tool `done` branch now keeps
both main's `isComplete: true` (from #47) and this branch's `stop_reason`.

Also fix two issues raised in review:

- continueMessage persisted the resumed turn with stale metadata: the
  saveConversation copy was built from the pre-continuation `messages`
  closure, so `stop_reason`/`stopped`/`cost_gbp` kept their old values.
  A reload then restored the Continue button and the pre-continuation
  cost. The saved copy now carries the fresh post-continuation values.

- The partial turn is sent to Anthropic as an assistant prefill, which
  is rejected if it is empty or ends with whitespace — common for a
  max_tokens / Stop truncation. Trim the prefill content and bail early
  when nothing remains to continue from.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@vahid-ahmadi
Collaborator Author

Pushed 8fd9f75 — conflict resolved and the two happy-path bugs fixed.

Merge conflict: merged main in. The only conflict was the no-tool `done` branch, which #47 (now in main) and this branch both touched; resolved to keep both fields: `{ ...lastIdx, isComplete: true, cost_gbp: msgCost, stop_reason: stopReason }`. The cost-line region auto-merged cleanly, so the Copy button (#47) and the Continue button (this PR) now coexist. PR is MERGEABLE again.

Bug 1: resumed turn persisted as still-truncated. In `continueMessage`'s `done` handler, the `saveConversation` copy was built from the stale pre-continuation `messages` closure, so `stop_reason` / `stopped` / `cost_gbp` kept their old values. A reload restored the Continue button and the pre-continuation cost, so the resume looked like it never happened. The saved copy now carries the fresh `cost_gbp: baseCost + newCost`, `stop_reason: stopReason`, `stopped: false`.
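
One way to picture the fix: build the copy to persist from the post-continuation values instead of reusing the stale closure. The sketch below is illustrative only; `StoredMessage` and `applyContinuation` are hypothetical names, and the real handler in `continueMessage` may be structured differently.

```typescript
// Hypothetical message shape; field names follow the PR description.
interface StoredMessage {
  role: "user" | "assistant";
  content: string;
  stop_reason?: string;
  stopped?: boolean;
  cost_gbp?: number;
}

// Return a new array in which the resumed message carries fresh metadata.
// The input array is left untouched, so a stale copy cannot leak into the save.
function applyContinuation(
  messages: StoredMessage[],
  idx: number,
  resumedText: string,
  newCost: number,
  stopReason: string,
): StoredMessage[] {
  return messages.map((m, i) =>
    i !== idx
      ? m
      : {
          ...m,
          content: m.content + resumedText,
          cost_gbp: (m.cost_gbp ?? 0) + newCost, // baseCost + newCost
          stop_reason: stopReason,
          stopped: false,
        },
  );
}
```

Passing the returned array to the save call, rather than the `messages` value captured before streaming started, is what keeps a reload from resurrecting the old truncation state.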

Bug 2: trailing-whitespace assistant prefill. The partial turn is sent to Anthropic as a prefill, which is rejected if it's empty or ends with whitespace, a common outcome of a `max_tokens` cut or a mid-stream Stop. The fix now calls `.trimEnd()` on the prefill content and bails early if nothing remains to continue from.
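
A minimal sketch of that guard, assuming a hypothetical `preparePrefill` helper (the name and the `null` return convention are illustrative, not taken from the diff):

```typescript
// Trim trailing whitespace from the partial assistant text and signal
// "nothing to continue from" with null so the caller can bail early.
function preparePrefill(partial: string): string | null {
  const trimmed = partial.trimEnd();
  return trimmed.length > 0 ? trimmed : null;
}

// Example caller: skip the request entirely when there is no usable prefill.
function canContinueFrom(partial: string): boolean {
  return preparePrefill(partial) !== null;
}
```

Only trailing whitespace is removed, so leading indentation and interior formatting of the partial answer are preserved.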

Verification: `tsc --noEmit` clean; `next build` passes (compiled, linted, all pages generated) on the merged tree. Backend untouched by the merge.

Not changed — flagging for a follow-up, not this PR:

  • continueMessage re-implements ~120 lines of the SSE streaming loop from sendMessage. Worth extracting a shared handleStreamEvent so streaming fixes only land once — but that's a refactor better done separately than bundled into a merge.
  • Edge cases left as-is: a stopped plan-mode reply continues in non-plan mode (tools enabled); the empty-prefill case now no-ops the button silently rather than hiding it. Both rare; happy to address if you'd like them in scope.

@vahid-ahmadi
Collaborator Author

Merge-readiness audit

Ran a full check on the post-fix branch (8fd9f75).

Verified green:

  • Conflict resolved — mergeStateStatus: CLEAN, mergeable: MERGEABLE.
  • Frontend: tsc --noEmit clean, next build passes (compiled, linted, all pages generated).
  • Backend: compiles on Python 3.13; the change is 4 additive lines.
  • #47's Copy / Download / export helpers (per-message copy and whole-conversation markdown export) all survived the merge intact.
  • CI: Deploy beta preview ✅, Vercel ✅.

Confirmed the two fixes are sound, not just plausible:

  • Persistence: `conversations.py` calls `json.dumps` on the message list verbatim with no per-message schema, so `stop_reason` / `stopped` / `cost_gbp` genuinely round-trip. A resumed turn now reloads as completed, not truncated.
  • Prefill: `/chat/message` sends the conversation straight to Anthropic with no trailing user message, so an assistant-final message correctly triggers prefill continuation. `trimEnd()` + the empty-guard prevent the 400 on whitespace/empty prefill.
  • Backend tests: `test_api.py` asserts the `done` event with key-presence checks (`"done" in types`, `"usage" in done`), never exact shape, so the new `stop_reason` key can't break them. (Verified by inspection only: they need a live API key + DB and aren't a CI gate here.)

Non-blocking items left for follow-up (not fixed in this PR):

  1. continueMessage re-implements ~120 lines of sendMessage's SSE loop — worth extracting a shared handler later so streaming fixes land once.
  2. A hard failure mid-continuation (network error, not Stop) drops the Continue button in-memory; recoverable via reload since the persisted record keeps the truncation flag. Left as-is deliberately — forcing the button back would feed the appended "Continuation failed" text into the next prefill.
  3. Continuation flattens prior tool output into the assistant's own text (author flagged as out-of-scope).
  4. A stopped plan-mode reply resumes in non-plan mode. Rare edge.

Verdict: technically safe and ready — correct on all happy paths, builds clean, conflict resolved, CI green. The remaining items are non-blocking. The one thing outstanding is a human sign-off on the conflict resolution + the two fixes — @SakshiKekre, a second look at 8fd9f75 would be good before merge.

@vahid-ahmadi vahid-ahmadi merged commit 3cddd20 into main May 22, 2026
6 checks passed


Development

Successfully merging this pull request may close these issues.

Stop → continue: resume aborted or truncated assistant turns

1 participant