feat(chat): regenerate last response (CLI, WebSocket, Web UI) by DarkGenius · Pull Request #349 · HKUDS/DeepTutor

DarkGenius · 2026-04-20T07:00:26Z

Summary

Adds a "regenerate last response" capability that re-runs the prior user message in a session as a fresh turn without duplicating the user row. Exposed across all three entry points:

WebSocket — new type: "regenerate" message accepting an optional overrides field; mirrors the start_turn flow including subscribe_turn. Errors regenerate_busy / nothing_to_regenerate surface as non-fatal error events.
CLI — /regenerate (alias /retry) inside deeptutor chat, backed by a shared regenerate_and_render helper.
Web UI — per-message Regenerate button (RefreshCcw icon) sits next to Copy on the last assistant message. Uses a new regenerateLastMessage() action that pops the trailing assistant optimistically and sends the new WS message. Replaces the previous Retry button (which depended on an in-memory requestSnapshot and silently disappeared after page reload), so regeneration now also works for historical sessions hydrated from the store.
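As a rough illustration of the WebSocket entry point, the client message could be assembled as below. This is a minimal sketch, not the project's client code: only `type`, `session_id`, and the optional `overrides` field come from this PR; the helper name `build_regenerate_message` and the example override key are assumptions.

```python
import json
from typing import Any, Optional


def build_regenerate_message(
    session_id: str, overrides: Optional[dict[str, Any]] = None
) -> str:
    """Serialize a hypothetical client-side "regenerate" WebSocket message."""
    message: dict[str, Any] = {"type": "regenerate", "session_id": session_id}
    # `overrides` is optional, so it is omitted entirely when not supplied.
    if overrides:
        message["overrides"] = overrides
    return json.dumps(message)


# The override key below is illustrative only; the accepted keys are not
# specified in this PR description.
example = build_regenerate_message("session-123", {"language": "en"})
```

Leaving `overrides` out when it is empty keeps the plain case identical to the message shown in the manual smoke section.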

Design

SQLiteSessionStore.delete_message and get_last_message helpers added for tail rollback (append-only invariant preserved — only the trailing row is ever removed, so summary_up_to_msg_id is unaffected).
TurnRuntimeManager.regenerate_last_turn(session_id, overrides=None):
- rejects when an active turn is running (regenerate_busy),
- rejects when no prior user message exists (nothing_to_regenerate),
- deletes the trailing assistant (if any),
- rebuilds the payload from session preferences + the stored user message (including attachments),
- sets _persist_user_message=False and _regenerate=True, then delegates to the regular start_turn to keep a single execution path.
New runtime_only_keys plus SESSION-event metadata (regenerated_from_message_id, superseded_turn_id, regenerate=true) so clients can reconcile the UI without a schema migration. Old turn rows are kept untouched (no destructive history mutation).
_run_turn skips memory_service.refresh_from_turn on regeneration to avoid double-updating long-term memory for the same user prompt.
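The regeneration steps listed above can be sketched against a toy in-memory session. This is an illustration under stated assumptions, not the code in `TurnRuntimeManager`: the `ToySession`, `Message`, and `RegenerateError` types, the `build_regenerate_payload` name, and the `content` / `attachments` payload keys are hypothetical. Only the two rejection reasons and the `_persist_user_message` / `_regenerate` flags are taken from this PR.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


class RegenerateError(Exception):
    """Hypothetical error type carrying the rejection reason."""

    def __init__(self, reason: str) -> None:
        super().__init__(reason)
        self.reason = reason


@dataclass
class Message:
    id: int
    role: str
    content: str
    attachments: list[str] = field(default_factory=list)


@dataclass
class ToySession:
    preferences: dict[str, Any] = field(default_factory=dict)
    messages: list[Message] = field(default_factory=list)
    turn_running: bool = False


def build_regenerate_payload(
    session: ToySession, overrides: Optional[dict[str, Any]] = None
) -> dict[str, Any]:
    # 1. Reject while a turn is still running.
    if session.turn_running:
        raise RegenerateError("regenerate_busy")
    # 2. Reject when there is no prior user message to replay.
    last_user = next(
        (m for m in reversed(session.messages) if m.role == "user"), None
    )
    if last_user is None:
        raise RegenerateError("nothing_to_regenerate")
    # 3. Tail rollback: only a trailing assistant row is ever removed.
    if session.messages[-1].role == "assistant":
        session.messages.pop()
    # 4. Rebuild the payload from preferences plus the stored user message.
    payload: dict[str, Any] = {
        **session.preferences,
        "content": last_user.content,
        "attachments": list(last_user.attachments),
    }
    payload.update(overrides or {})
    # 5. Runtime flags are set last, so overrides cannot clear them.
    payload["_persist_user_message"] = False
    payload["_regenerate"] = True
    return payload
```

In the real runtime the resulting payload is handed to `start_turn`; the sketch stops at the payload so it stays self-contained.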

Operational improvement bundled in

AgenticChatPipeline._stage_responding now warns and emits a non-terminal error event when the LLM returns an empty stream (zero non-empty chunks). The warning includes model, chunk count, prompt char size, and observation size so empty assistant messages can be diagnosed from logs and the Trace panel without further code instrumentation. This is independent of regenerate but materially improves UX when a model returns nothing — the user now sees a clear hint to hit Regenerate instead of staring at a silent empty bubble.
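The empty-stream check described above might look roughly like the following. This is a sketch assuming a synchronous iterable of text chunks; the function name `collect_response`, its parameters, and the event dictionary fields are hypothetical and do not reflect the actual `_stage_responding` signature.

```python
import logging
from typing import Any, Iterable

logger = logging.getLogger("empty_stream_sketch")


def collect_response(
    chunks: Iterable[str],
    *,
    model: str,
    prompt_chars: int,
    observation_chars: int,
) -> tuple[str, list[dict[str, Any]]]:
    """Join streamed chunks and flag a stream with zero non-empty chunks."""
    parts: list[str] = []
    chunk_count = 0
    for chunk in chunks:
        chunk_count += 1
        if chunk:  # empty strings count as chunks but carry no text
            parts.append(chunk)

    events: list[dict[str, Any]] = []
    if not parts:
        # Log enough context to diagnose the empty reply afterwards.
        logger.warning(
            "empty LLM stream: model=%s chunks=%d prompt_chars=%d observation_chars=%d",
            model,
            chunk_count,
            prompt_chars,
            observation_chars,
        )
        # Non-terminal: the turn continues, the client just shows a hint.
        events.append(
            {
                "type": "error",
                "terminal": False,
                "message": "The model returned an empty response. Try Regenerate.",
            }
        )
    return "".join(parts), events
```

Counting every chunk while keeping only the non-empty ones separates "the model sent nothing at all" from "the model sent only empty deltas" in the log line.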

Test plan

tests/services/session/test_regenerate.py (18 tests):

_extract_regenerate_flag parsing.
Store: delete_message removes only the targeted row; get_last_message with/without role filter and on empty sessions.
Runtime unit tests with patched start_turn:
- last is assistant → assistant deleted, payload replays user content / capability / tools / kb / language / attachments;
- last is user → no deletion, new turn proceeds;
- empty / missing session → nothing_to_regenerate;
- active running turn → regenerate_busy;
- overrides take precedence yet runtime flags are still injected.
End-to-end integration test wiring a fake orchestrator + tracking memory service: original user is preserved, prior assistant is replaced, the new SESSION event carries regenerate=true and regenerated_from_message_id, and refresh_from_turn is not invoked for the regenerated turn.

All 18 + neighbouring tests in tests/services/session/, tests/api/test_unified_ws_turn_runtime.py, tests/cli/test_chat_cli.py, tests/services/test_app_facade.py pass locally.
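The store-level cases in the test plan can be pictured with a small `sqlite3` sketch. The table schema and the free-function form are assumptions made for illustration; the real `SQLiteSessionStore` methods and their signatures may differ.

```python
import sqlite3
from typing import Optional


def open_store() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical messages table."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row
    conn.execute(
        "CREATE TABLE messages ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "session_id TEXT NOT NULL, "
        "role TEXT NOT NULL, "
        "content TEXT NOT NULL)"
    )
    return conn


def add_message(conn: sqlite3.Connection, session_id: str, role: str, content: str) -> int:
    cur = conn.execute(
        "INSERT INTO messages (session_id, role, content) VALUES (?, ?, ?)",
        (session_id, role, content),
    )
    return cur.lastrowid


def get_last_message(
    conn: sqlite3.Connection, session_id: str, role: Optional[str] = None
) -> Optional[sqlite3.Row]:
    """Return the newest row for a session, optionally filtered by role."""
    sql = "SELECT id, role, content FROM messages WHERE session_id = ?"
    params: list[str] = [session_id]
    if role is not None:
        sql += " AND role = ?"
        params.append(role)
    sql += " ORDER BY id DESC LIMIT 1"
    return conn.execute(sql, params).fetchone()


def delete_message(conn: sqlite3.Connection, message_id: int) -> bool:
    """Delete exactly one row by id; report whether a row was removed."""
    cur = conn.execute("DELETE FROM messages WHERE id = ?", (message_id,))
    return cur.rowcount == 1


conn = open_store()
user_id = add_message(conn, "s1", "user", "Explain recursion")
assistant_id = add_message(conn, "s1", "assistant", "")
# Roll back only the trailing assistant row; the user row stays in place.
delete_message(conn, assistant_id)
```

After the rollback, `get_last_message(conn, "s1")` returns the user row again, which is the state a regenerated turn would start from.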

Manual smoke

Web UI: open any chat session (live or historical), click Regenerate under the last assistant message → optimistic deletion + new generation; user message preserved in the DB.
CLI: deeptutor chat, send a message, then /regenerate → re-runs the previous prompt with the same preferences; /retry is an alias.
WS: client sends { "type": "regenerate", "session_id": "..." } → subscribe_turn for the new turn id; regenerate_busy / nothing_to_regenerate returned as error event with metadata.reason.
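On the client side, the two rejection reasons could be mapped to user-facing hints along these lines. The event shape (a dictionary with `type` and `metadata.reason`) follows the wording above, but the exact field layout, the helper name, and the hint texts are assumptions.

```python
from typing import Any, Optional

# Reasons this PR describes as non-fatal regenerate rejections.
REGENERATE_HINTS = {
    "regenerate_busy": "A turn is still running. Wait for it to finish.",
    "nothing_to_regenerate": "There is no earlier user message to re-run.",
}


def describe_regenerate_error(event: dict[str, Any]) -> Optional[str]:
    """Return a hint for a regenerate rejection, or None for other events."""
    if event.get("type") != "error":
        return None
    reason = (event.get("metadata") or {}).get("reason")
    return REGENERATE_HINTS.get(reason)


hint = describe_regenerate_error(
    {"type": "error", "metadata": {"reason": "regenerate_busy"}}
)
```

Returning `None` for anything else lets the caller fall through to its generic error handling without closing the connection.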

Notes

No schema migration required.
No new Python or npm dependencies.
AGENTS.md and the unified_ws.py module docstring updated to document the new CLI command and WS message type.

pancacake · 2026-04-20T15:52:49Z

Thanks for your contribution! Just done what I'm going to do lol

…-response diagnostics

Add a "regenerate last response" capability that re-runs the prior user message in a session as a fresh turn without duplicating the user row. Exposed across all three entry points:

- WebSocket: new `type: "regenerate"` message accepting an optional `overrides` field; mirrors the `start_turn` flow including `subscribe_turn`. Errors `regenerate_busy` / `nothing_to_regenerate` surface as non-fatal `error` events.
- CLI: `/regenerate` (alias `/retry`) inside `deeptutor chat`, backed by a shared `regenerate_and_render` helper.
- Web UI: per-message Regenerate button (RefreshCcw icon) sits next to Copy on the last assistant message; uses a new `regenerateLastMessage()` action that pops the trailing assistant optimistically and sends the new WS message. Replaces the previous Retry button (which depended on an in-memory `requestSnapshot` and silently disappeared after page reload), so regeneration now also works for historical sessions hydrated from the store.

Backend mechanics:

- `SQLiteSessionStore.delete_message` and `get_last_message` helpers for tail rollback (append-only invariant preserved).
- `TurnRuntimeManager.regenerate_last_turn` deletes the trailing assistant (if any), rebuilds the payload from session preferences and the stored user message (including attachments), sets `_persist_user_message=False` and `_regenerate=True`, then delegates to `start_turn`. New `runtime_only_keys` and SESSION-event metadata carry `regenerated_from_message_id` and `superseded_turn_id` so clients can reconcile the UI without a schema migration.
- `_run_turn` skips `memory_service.refresh_from_turn` on regeneration to avoid double-updating long-term memory for the same user prompt.
- Rejects when an active turn is already running (`regenerate_busy`) and when no prior user message exists (`nothing_to_regenerate`).

Operational improvement bundled in:

- `_stage_responding` now warns and emits a non-terminal `error` event when the LLM returns an empty stream (zero non-empty chunks). The warning includes model, chunk count, prompt char size, and observation size so empty assistant messages can be diagnosed from logs and the Trace panel without code instrumentation.

Tests: `tests/services/session/test_regenerate.py` (18 tests) covers store deletion helpers, runtime regeneration paths (assistant tail, user tail, empty/missing session, busy guard, override precedence) and an end-to-end flow asserting the user row is preserved, the assistant is replaced, the new SESSION event carries `regenerate=true` / `regenerated_from_message_id`, and `refresh_from_turn` is not called again on regeneration.


- New `assets/releases/ver1-2-1.md` covering #348 (per-stage chat token limits), #349 (Regenerate across CLI/WS/Web UI), the regenerate UI harmony polish, and bug fixes #347 / #345 / #352.
- README release-notes block updated to surface v1.2.1 above v1.2.0.

Made-with: Cursor

DarkGenius added 2 commits April 21, 2026 00:12

style(tests): apply ruff-format to test_regenerate.py

f37c385

Collapses single-line test signatures that fit within the project's 100-char line length, matching what `uvx pre-commit run` would do on this file. No behavioural change.

pancacake force-pushed the feat/chat-regenerate-response branch from 1233ebe to f37c385 · April 20, 2026 16:12

pancacake merged commit 0f6c81b into HKUDS:dev Apr 20, 2026
4 checks passed
