
feat(chat): regenerate last response (CLI, WebSocket, Web UI)#349

Merged
pancacake merged 2 commits into HKUDS:dev from DarkGenius:feat/chat-regenerate-response
Apr 20, 2026

Conversation

@DarkGenius
Contributor

Summary

Adds a "regenerate last response" capability that re-runs the prior user message in a session as a fresh turn without duplicating the user row. Exposed across all three entry points:

  • WebSocket — new type: "regenerate" message accepting an optional overrides field; mirrors the start_turn flow including subscribe_turn. Errors regenerate_busy / nothing_to_regenerate surface as non-fatal error events.
  • CLI — /regenerate (alias /retry) inside deeptutor chat, backed by a shared regenerate_and_render helper.
  • Web UI — per-message Regenerate button (RefreshCcw icon) sits next to Copy on the last assistant message. Uses a new regenerateLastMessage() action that pops the trailing assistant optimistically and sends the new WS message. Replaces the previous Retry button (which depended on an in-memory requestSnapshot and silently disappeared after page reload), so regeneration now also works for historical sessions hydrated from the store.
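As a rough illustration of the WebSocket entry point, the sketch below serializes a `regenerate` request. Only `type`, `session_id`, and the optional `overrides` field come from this description; the helper name `build_regenerate_message` and the example override key are hypothetical and not part of the PR.

```python
import json
from typing import Any, Optional


def build_regenerate_message(
    session_id: str, overrides: Optional[dict[str, Any]] = None
) -> str:
    """Serialize a client-side `regenerate` request (hypothetical helper)."""
    message: dict[str, Any] = {"type": "regenerate", "session_id": session_id}
    # `overrides` is optional, so the key is omitted when nothing is overridden.
    if overrides:
        message["overrides"] = overrides
    return json.dumps(message)


if __name__ == "__main__":
    # The override key "language" is only an example; the accepted keys are
    # defined by the server and are not listed in this description.
    print(build_regenerate_message("session-123", {"language": "en"}))
```

A client would send the resulting string over its existing WebSocket connection, then follow the usual subscription flow for the new turn.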

Design

  • SQLiteSessionStore.delete_message and get_last_message helpers added for tail rollback (append-only invariant preserved — only the trailing row is ever removed, so summary_up_to_msg_id is unaffected).
  • TurnRuntimeManager.regenerate_last_turn(session_id, overrides=None):
    • rejects when an active turn is running (regenerate_busy),
    • rejects when no prior user message exists (nothing_to_regenerate),
    • deletes the trailing assistant (if any),
    • rebuilds the payload from session preferences + the stored user message (including attachments),
    • sets _persist_user_message=False and _regenerate=True, then delegates to the regular start_turn to keep a single execution path.
  • New runtime_only_keys plus SESSION-event metadata (regenerated_from_message_id, superseded_turn_id, regenerate=true) so clients can reconcile the UI without a schema migration. Old turn rows are kept untouched (no destructive history mutation).
  • _run_turn skips memory_service.refresh_from_turn on regeneration to avoid double-updating long-term memory for the same user prompt.
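The ordering of the checks in the list above can be sketched with an in-memory stand-in. This is a simplified model under stated assumptions, not the actual `TurnRuntimeManager` code: `InMemorySession`, `Message`, `RegenerateError`, and `build_regenerate_payload` are hypothetical names, and the real method delegates to `start_turn` rather than returning the payload.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


class RegenerateError(Exception):
    """Carries a machine-readable reason such as `regenerate_busy`."""

    def __init__(self, reason: str) -> None:
        super().__init__(reason)
        self.reason = reason


@dataclass
class Message:
    id: int
    role: str  # "user" or "assistant"
    content: str
    attachments: list[str] = field(default_factory=list)


@dataclass
class InMemorySession:
    """Hypothetical stand-in for the session store plus runtime state."""

    messages: list[Message] = field(default_factory=list)
    preferences: dict[str, Any] = field(default_factory=dict)
    turn_running: bool = False


def build_regenerate_payload(
    session: InMemorySession, overrides: Optional[dict[str, Any]] = None
) -> dict[str, Any]:
    # 1. Reject while another turn is still running.
    if session.turn_running:
        raise RegenerateError("regenerate_busy")
    # 2. Reject when there is no prior user message to replay.
    last_user = next(
        (m for m in reversed(session.messages) if m.role == "user"), None
    )
    if last_user is None:
        raise RegenerateError("nothing_to_regenerate")
    # 3. Tail rollback: only the trailing assistant row is removed.
    if session.messages[-1].role == "assistant":
        session.messages.pop()
    # 4. Rebuild the payload from preferences plus the stored user message.
    payload: dict[str, Any] = {
        **session.preferences,
        "content": last_user.content,
        "attachments": list(last_user.attachments),
    }
    payload.update(overrides or {})
    # 5. Runtime flags go in last so overrides cannot clobber them.
    payload["_persist_user_message"] = False
    payload["_regenerate"] = True
    return payload
```

Injecting the runtime flags after the overrides mirrors the tested behaviour that overrides take precedence while the flags are still set.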

Operational improvement bundled in

AgenticChatPipeline._stage_responding now warns and emits a non-terminal error event when the LLM returns an empty stream (zero non-empty chunks). The warning includes model, chunk count, prompt char size, and observation size so empty assistant messages can be diagnosed from logs and the Trace panel without further code instrumentation. This is independent of regenerate but materially improves UX when a model returns nothing — the user now sees a clear hint to hit Regenerate instead of staring at a silent empty bubble.
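The empty-stream check could look roughly like the following. This is a minimal sketch assuming a synchronous iterable of text chunks; the function name `stream_with_empty_check`, the event dictionaries, and the hint text are hypothetical, and the real `_stage_responding` also logs observation size, which is omitted here.

```python
import logging
from typing import Any, Iterable, Iterator

logger = logging.getLogger("empty_stream_sketch")


def stream_with_empty_check(
    chunks: Iterable[str], model: str, prompt_chars: int
) -> Iterator[dict[str, Any]]:
    """Forward non-empty chunks; flag a stream that produced none."""
    total = 0
    non_empty = 0
    for chunk in chunks:
        total += 1
        if chunk:
            non_empty += 1
            yield {"type": "content", "text": chunk}
    if non_empty == 0:
        # Log enough context to diagnose the empty reply after the fact.
        logger.warning(
            "empty LLM stream: model=%s chunks=%d prompt_chars=%d",
            model,
            total,
            prompt_chars,
        )
        # Non-terminal: the turn still finishes normally, the user just
        # gets a visible hint instead of a silent empty bubble.
        yield {
            "type": "error",
            "terminal": False,
            "message": "The model returned an empty response. Try Regenerate.",
        }
```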

Test plan

tests/services/session/test_regenerate.py (18 tests):

  • _extract_regenerate_flag parsing.
  • Store: delete_message removes only the targeted row; get_last_message with/without role filter and on empty sessions.
  • Runtime unit tests with patched start_turn:
    • last is assistant → assistant deleted, payload replays user content / capability / tools / kb / language / attachments;
    • last is user → no deletion, new turn proceeds;
    • empty / missing session → nothing_to_regenerate;
    • active running turn → regenerate_busy;
    • overrides take precedence yet runtime flags are still injected.
  • End-to-end integration test wiring a fake orchestrator + tracking memory service: original user is preserved, prior assistant is replaced, the new SESSION event carries regenerate=true and regenerated_from_message_id, and refresh_from_turn is not invoked for the regenerated turn.
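To make the store-level cases concrete, here is a small self-contained sketch of tail-rollback helpers on an in-memory SQLite database. The table layout, the `TinyStore` class, and the method signatures are assumptions for illustration; the real `SQLiteSessionStore` schema and signatures are not shown in this PR description.

```python
import sqlite3
from typing import Optional


class TinyStore:
    """Hypothetical miniature store with the two helpers described above."""

    def __init__(self) -> None:
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE messages ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "session_id TEXT NOT NULL, "
            "role TEXT NOT NULL, "
            "content TEXT NOT NULL)"
        )

    def add(self, session_id: str, role: str, content: str) -> int:
        cur = self.db.execute(
            "INSERT INTO messages (session_id, role, content) VALUES (?, ?, ?)",
            (session_id, role, content),
        )
        return cur.lastrowid

    def get_last_message(
        self, session_id: str, role: Optional[str] = None
    ) -> Optional[tuple]:
        sql = "SELECT id, role, content FROM messages WHERE session_id = ?"
        params: list = [session_id]
        if role is not None:
            sql += " AND role = ?"
            params.append(role)
        # Highest id is the most recently appended row.
        sql += " ORDER BY id DESC LIMIT 1"
        return self.db.execute(sql, params).fetchone()

    def delete_message(self, message_id: int) -> bool:
        cur = self.db.execute("DELETE FROM messages WHERE id = ?", (message_id,))
        # True only when exactly the targeted row was removed.
        return cur.rowcount == 1


if __name__ == "__main__":
    store = TinyStore()
    store.add("s", "user", "hi")
    tail = store.add("s", "assistant", "hello")
    store.delete_message(tail)
    print(store.get_last_message("s"))
```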

All 18 + neighbouring tests in tests/services/session/, tests/api/test_unified_ws_turn_runtime.py, tests/cli/test_chat_cli.py, tests/services/test_app_facade.py pass locally.

Manual smoke

  • Web UI: open any chat session (live or historical), click Regenerate under the last assistant message → optimistic deletion + new generation; user message preserved in the DB.
  • CLI: deeptutor chat, send a message, then /regenerate → re-runs the previous prompt with the same preferences; /retry is an alias.
  • WS: client sends { "type": "regenerate", "session_id": "..." } → subscribe_turn for the new turn id; regenerate_busy / nothing_to_regenerate are returned as an error event with metadata.reason.
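On the client side, the two regenerate rejections can be told apart from other errors by inspecting `metadata.reason`. The sketch below assumes events arrive as decoded JSON dictionaries with `type` and `metadata` keys; the function name `classify_event` and the returned labels are hypothetical.

```python
from typing import Any

# Reasons named in this PR that should not tear down the session.
RECOVERABLE_REASONS = {"regenerate_busy", "nothing_to_regenerate"}


def classify_event(event: dict[str, Any]) -> str:
    """Return "ok", "recoverable", or "error" for a decoded WS event."""
    if event.get("type") != "error":
        return "ok"
    # `metadata` may be missing or null on unrelated error events.
    reason = (event.get("metadata") or {}).get("reason")
    return "recoverable" if reason in RECOVERABLE_REASONS else "error"
```

A UI could show a transient notice for "recoverable" results and keep the existing conversation state untouched.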

Notes

  • No schema migration required.
  • No new Python or npm dependencies.
  • AGENTS.md and the unified_ws.py module docstring updated to document the new CLI command and WS message type.

@pancacake
Collaborator

Thanks for your contribution! Just done what i'm going to do lol

…-response diagnostics

Add a "regenerate last response" capability that re-runs the prior user
message in a session as a fresh turn without duplicating the user row.
Exposed across all three entry points:

- WebSocket: new `type: "regenerate"` message accepting an optional
  `overrides` field; mirrors the `start_turn` flow including
  `subscribe_turn`. Errors `regenerate_busy` / `nothing_to_regenerate`
  surface as non-fatal `error` events.
- CLI: `/regenerate` (alias `/retry`) inside `deeptutor chat`, backed by
  a shared `regenerate_and_render` helper.
- Web UI: per-message Regenerate button (RefreshCcw icon) sits next to
  Copy on the last assistant message; uses a new
  `regenerateLastMessage()` action that pops the trailing assistant
  optimistically and sends the new WS message. Replaces the previous
  Retry button (which depended on an in-memory `requestSnapshot` and
  silently disappeared after page reload), so regeneration now also
  works for historical sessions hydrated from the store.

Backend mechanics:

- `SQLiteSessionStore.delete_message` and `get_last_message` helpers for
  tail rollback (append-only invariant preserved).
- `TurnRuntimeManager.regenerate_last_turn` deletes the trailing
  assistant (if any), rebuilds the payload from session preferences and
  the stored user message (including attachments), sets
  `_persist_user_message=False` and `_regenerate=True`, then delegates
  to `start_turn`. New `runtime_only_keys` and SESSION-event metadata
  carry `regenerated_from_message_id` and `superseded_turn_id` so
  clients can reconcile the UI without a schema migration.
- `_run_turn` skips `memory_service.refresh_from_turn` on regeneration
  to avoid double-updating long-term memory for the same user prompt.
- Rejects when an active turn is already running (`regenerate_busy`)
  and when no prior user message exists (`nothing_to_regenerate`).

Operational improvement bundled in:

- `_stage_responding` now warns and emits a non-terminal `error` event
  when the LLM returns an empty stream (zero non-empty chunks). The
  warning includes model, chunk count, prompt char size, and observation
  size so empty assistant messages can be diagnosed from logs and the
  Trace panel without code instrumentation.

Tests: `tests/services/session/test_regenerate.py` (18 tests) covers
store deletion helpers, runtime regeneration paths (assistant tail,
user tail, empty/missing session, busy guard, override precedence) and
an end-to-end flow asserting the user row is preserved, the assistant
is replaced, the new SESSION event carries `regenerate=true` /
`regenerated_from_message_id`, and `refresh_from_turn` is not called
again on regeneration.

Collapses single-line test signatures that fit within the project's
100-char line length, matching what `uvx pre-commit run` would do on
this file. No behavioural change.
pancacake force-pushed the feat/chat-regenerate-response branch from 1233ebe to f37c385 on April 20, 2026 16:12
pancacake merged commit 0f6c81b into HKUDS:dev on Apr 20, 2026
4 checks passed
pancacake added a commit that referenced this pull request Apr 20, 2026
- New `assets/releases/ver1-2-1.md` covering #348 (per-stage chat token
  limits), #349 (Regenerate across CLI/WS/Web UI), the regenerate UI
  harmony polish, and bug fixes #347 / #345 / #352.
- README release-notes block updated to surface v1.2.1 above v1.2.0.

Made-with: Cursor