Conversation

@ThomasK33
Member

Fixes a restart edge case where ask_user_question was treated as an interrupted stream.

When Mux is closed while the agent is blocked on ask_user_question, we now treat that tool call as a durable waiting-for-input state:

  • No Retry/Interrupted UI on restart
  • No auto-resume that re-runs the LLM call and re-asks the questions
  • Interrupt keybinds (Esc/Ctrl+C) + command palette interrupt are disabled while awaiting questions

On restart, answering the questions now works even though the in-memory pending tool call is gone:

  • Backend persists the tool result into partial.json (or chat history) and emits a synthetic tool-call-end
  • Frontend triggers a manual resume check so the assistant continues immediately after answers

📋 Implementation Plan

🤖 Plan: Persist ask_user_question as a true “waiting for input” state

Goal

When a workspace is blocked on ask_user_question and the user closes/reopens Mux, do not treat that as an interrupted stream that must be auto-resumed. Instead:

  • Restore the existing question UI
  • Allow answering the questions
  • Resume the assistant only after answers are submitted
  • Do not let Esc cancel/interrupt while we’re awaiting an ask_user_question

Recommended approach (minimal + consistent)

Make ask_user_question “resume-safe” by treating it as a special waiting state (not an interruption).

What changes, behavior-wise

  1. After app restart with a partial message whose last part is an unfinished ask_user_question tool call:

    • Tool UI shows as executing (answerable), not interrupted.
    • We do not show the “Interrupted” chat barrier.
    • We do not show the RetryBarrier.
    • We do not auto-call resumeStream().
  2. While actively awaiting ask_user_question (stream is still “running” but blocked on user input):

    • Esc / Ctrl+C interrupt keybind becomes a no-op for this state.
    • UI hints should not advertise interrupting; they should point to answering/cancel-by-chat.
  3. When the user submits answers after a restart (no active stream exists anymore):

    • Backend updates the persisted partial (or history) message to mark the tool call as output-available with { questions, answers }.
    • Backend emits a synthetic tool-call-end event so the renderer updates immediately.
    • Frontend triggers a resume check (manual) so the assistant continues promptly.

Why this works with the current architecture

  • On restart, we usually have partial.json with the assistant message containing the tool call.
  • Today, we mark unfinished tools in partial messages as interrupted, which triggers:
    • Retry UI + auto-resume manager → extra LLM call → new tool call
  • By keeping the tool answerable and suppressing the “interrupted” UX + auto-resume, we avoid re-running the LLM just to re-create the questions.

Implementation steps

1) Frontend: classify ask_user_question in partial messages as “executing”

Files:

  • src/browser/utils/messages/StreamingMessageAggregator.ts

Change: In getDisplayedMessages() tool status mapping:

  • Current: input-available && message.metadata.partial → status = "interrupted"
  • New: if toolName === "ask_user_question", treat input-available as "executing" even when partial (see the sketch after this step).

Also tighten hasAwaitingUserQuestion():

  • Only consider the latest displayed message (or latest tool message) to avoid “stale waiting” if the user continues the chat.
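
A minimal sketch of both changes, assuming simplified part and status shapes (the real types live in StreamingMessageAggregator.ts):

```ts
// Simplified shapes; the real types live in StreamingMessageAggregator.ts.
type ToolStatus = "executing" | "interrupted" | "completed";

interface ToolPart {
  toolName: string;
  state: "input-available" | "output-available";
}

function toolStatus(part: ToolPart, isPartialMessage: boolean): ToolStatus {
  if (part.state === "output-available") return "completed";
  // New: ask_user_question stays answerable across restarts instead of
  // being flagged as an interrupted stream.
  if (part.toolName === "ask_user_question") return "executing";
  return isPartialMessage ? "interrupted" : "executing";
}

// Tightened: only the latest displayed tool part counts, so a question the
// user has already moved past never registers as "still waiting".
function hasAwaitingUserQuestion(latestToolPart: ToolPart | undefined): boolean {
  return (
    latestToolPart !== undefined &&
    latestToolPart.toolName === "ask_user_question" &&
    latestToolPart.state === "input-available"
  );
}
```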

2) Frontend: suppress “Interrupted” + Retry + auto-resume for that state

Files:

  • src/browser/utils/messages/retryEligibility.ts
  • src/browser/utils/messages/messageUtils.ts
  • src/browser/components/AIView.tsx (optional defense-in-depth)

Change:

  • hasInterruptedStream(...): if the last message is a tool message with toolName === "ask_user_question" and status === "executing", return false (sketched after this list).
    • This automatically disables:
      • RetryBarrier
      • useResumeManager auto-resume
  • shouldShowInterruptedBarrier(msg): return false for the same tool message type.
  • (Optional) In AIView, also gate showRetryBarrier by !awaitingUserQuestion for extra safety.
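
A sketch of the eligibility gate under an assumed DisplayedMessage shape; only the ask_user_question check is new, the rest of hasInterruptedStream stays as-is:

```ts
// Assumed simplified message shape, standing in for the real type in
// src/browser/utils/messages/messageUtils.ts.
type DisplayedMessage =
  | { kind: "tool"; toolName: string; status: "executing" | "interrupted" | "completed" }
  | { kind: "user" | "assistant" };

function isAwaitingUserQuestion(msg: DisplayedMessage | undefined): boolean {
  return (
    msg !== undefined &&
    msg.kind === "tool" &&
    msg.toolName === "ask_user_question" &&
    msg.status === "executing"
  );
}

function hasInterruptedStream(messages: DisplayedMessage[]): boolean {
  const last = messages[messages.length - 1];
  // Waiting on answers is not an interruption: returning false here keeps
  // both the RetryBarrier and useResumeManager's auto-resume switched off.
  if (isAwaitingUserQuestion(last)) return false;
  // ... existing interruption checks continue below ...
  return false;
}
```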

3) Frontend: disable interrupt keybind while awaiting questions

File:

  • src/browser/hooks/useAIViewKeybinds.ts

Change:

  • When the interrupt keybind is pressed:
    • If aggregator?.hasAwaitingUserQuestion() is true, do not call workspace.interruptStream and do not toggle autoRetry.
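
A sketch of the gate, with the hook's dependencies reduced to the pieces named above (the real hook wiring differs):

```ts
// Assumed dependency shapes; the real hook lives in useAIViewKeybinds.ts.
interface InterruptDeps {
  aggregator?: { hasAwaitingUserQuestion(): boolean };
  workspace: { interruptStream(): void };
  setAutoRetry: (enabled: boolean) => void;
}

function handleInterruptKeybind({ aggregator, workspace, setAutoRetry }: InterruptDeps): void {
  // While blocked on ask_user_question the "stream" is just waiting for
  // answers, so Esc / Ctrl+C must not cancel it or toggle autoRetry.
  if (aggregator?.hasAwaitingUserQuestion()) return;
  workspace.interruptStream();
  setAutoRetry(false);
}
```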

4) Frontend: stop advertising Esc for this state

Files:

  • src/browser/components/ChatInput/index.tsx
  • src/browser/components/AIView.tsx

Change (small UX polish):

  • Add awaitingUserQuestion as a prop to ChatInput so the placeholder/hints avoid “Esc to interrupt” and instead reflect “Answer above / type a message to respond”.
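
A sketch of the hint selection; awaitingUserQuestion is the new prop from the plan, the other inputs and the exact strings are illustrative:

```ts
function chatInputPlaceholder(awaitingUserQuestion: boolean, isStreaming: boolean): string {
  if (awaitingUserQuestion) {
    // Do not advertise "Esc to interrupt" while answers are pending.
    return "Answer the questions above, or type a message to respond";
  }
  return isStreaming ? "Esc to interrupt" : "Type a message…";
}
```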

5) Backend: allow answering after restart (no active stream)

Files:

  • src/node/services/workspaceService.ts

Change: make answerAskUserQuestion(...) async and implement a fallback (see the sketch after the guardrails below):

  1. Try the current in-memory path:
    • If the tool is actually pending in askUserQuestionManager, resolve it (existing behavior).
  2. Otherwise (restart case):
    • Read partial.json via partialService.readPartial(workspaceId).
      • If the partial message contains the ask_user_question toolCallId, update that tool part to output-available with { questions, answers } and write back via partialService.writePartial.
    • Else: locate the message in chat.jsonl via historyService.getHistory and update via historyService.updateHistory.
    • Emit a synthetic tool-call-end chat event using session.emitChatEvent(...) so the UI updates immediately.

Important guardrails (defensive programming):

  • Validate the tool part’s input matches AskUserQuestionToolArgs shape before using it.
  • Refuse to answer if the tool call is stale (e.g., the message is not the latest assistant message in history), returning a clear error.
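
A sketch of the fallback flow covering the guardrails above; every service interface here is a simplified assumption rather than the real API surface, and the chat-history branch is only stubbed:

```ts
// Simplified stand-ins for the real services in workspaceService.ts.
interface ToolPart {
  toolCallId: string;
  state: "input-available" | "output-available";
  input: unknown;
  output?: unknown;
}
interface PartialMessage {
  parts: ToolPart[];
}

declare const askUserQuestionManager: {
  // Existing in-memory path: resolves the pending call if it is still live.
  resolveIfPending(toolCallId: string, answers: Record<string, string>): boolean;
};
declare const partialService: {
  readPartial(workspaceId: string): Promise<PartialMessage | undefined>;
  writePartial(workspaceId: string, message: PartialMessage): Promise<void>;
};
declare const session: {
  emitChatEvent(event: { type: "tool-call-end"; toolCallId: string; output: unknown }): void;
};

// Guardrail: validate the stored input against the expected args shape.
function isAskUserQuestionArgs(input: unknown): input is { questions: unknown[] } {
  return (
    typeof input === "object" &&
    input !== null &&
    Array.isArray((input as { questions?: unknown }).questions)
  );
}

async function answerAskUserQuestion(
  workspaceId: string,
  toolCallId: string,
  answers: Record<string, string>
): Promise<void> {
  // 1) Live path: the stream is still running and the tool call is pending.
  if (askUserQuestionManager.resolveIfPending(toolCallId, answers)) return;

  // 2) Restart path: persist the tool result into partial.json.
  const partial = await partialService.readPartial(workspaceId);
  const part = partial?.parts.find((p) => p.toolCallId === toolCallId);
  if (partial && part) {
    if (!isAskUserQuestionArgs(part.input)) {
      throw new Error("stored input does not match AskUserQuestionToolArgs");
    }
    part.state = "output-available";
    part.output = { questions: part.input.questions, answers };
    await partialService.writePartial(workspaceId, partial);
    // Synthetic event so the renderer flips the tool UI immediately.
    session.emitChatEvent({ type: "tool-call-end", toolCallId, output: part.output });
    return;
  }

  // 3) Fall back to chat.jsonl via the history service (same update + event;
  //    omitted here), refusing stale calls that are not the latest message.
  throw new Error(`tool call ${toolCallId} not found in partial state`);
}
```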

6) Frontend: after successful submit, request resume immediately

File:

  • src/browser/components/tools/AskUserQuestionToolCall.tsx

Change: after answerAskUserQuestion succeeds:

  • Dispatch CUSTOM_EVENTS.RESUME_CHECK_REQUESTED with { workspaceId, isManual: true }.
    • This bypasses autoRetry=false state and resumes promptly.
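
A sketch of the submit handler tail; CUSTOM_EVENTS.RESUME_CHECK_REQUESTED is named in the plan, while the dispatch mechanism (a DOM CustomEvent) and the submit callback are assumptions:

```ts
declare const CUSTOM_EVENTS: { RESUME_CHECK_REQUESTED: string };

async function onAnswersSubmitted(
  workspaceId: string,
  submitAnswers: () => Promise<void> // wraps the answerAskUserQuestion IPC call
): Promise<void> {
  await submitAnswers();
  // isManual: true bypasses the autoRetry=false gate in the resume manager,
  // so the assistant continues promptly instead of waiting for a retry tick.
  window.dispatchEvent(
    new CustomEvent(CUSTOM_EVENTS.RESUME_CHECK_REQUESTED, {
      detail: { workspaceId, isManual: true },
    })
  );
}
```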

Tests

Frontend unit tests

  • src/browser/utils/messages/StreamingMessageAggregator.status.test.ts
    • New case: partial message with unfinished ask_user_question → tool status is executing and hasAwaitingUserQuestion() is true.
  • src/browser/utils/messages/retryEligibility.test.ts
    • New case: last message is partial ask_user_question executing → hasInterruptedStream is false.

Backend unit tests

  • Add a small pure helper (new file or colocated) that:
    • finds a tool part by toolCallId,
    • builds { questions, answers },
    • returns the updated message (one possible shape is sketched after the test list below).
  • Test:
    • success path (input has questions)
    • failure path (toolCallId missing)
    • stale-guard path (message not last)
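
One possible shape for that pure helper, with illustrative names (applyAnswers is hypothetical, not the actual function); it exercises all three test paths above:

```ts
interface ToolPart {
  toolCallId: string;
  state: string;
  input: { questions: unknown[] };
  output?: unknown;
}
interface ChatMessage {
  parts: ToolPart[];
}

function applyAnswers(
  message: ChatMessage,
  isLatestAssistantMessage: boolean,
  toolCallId: string,
  answers: Record<string, string>
): ChatMessage {
  // Stale-guard path: refuse to answer a question that is no longer current.
  if (!isLatestAssistantMessage) {
    throw new Error("refusing to answer a stale ask_user_question");
  }
  // Failure path: the toolCallId must exist in the message.
  const part = message.parts.find((p) => p.toolCallId === toolCallId);
  if (!part) throw new Error(`tool call ${toolCallId} not found`);
  // Success path: mark the part output-available with { questions, answers }.
  return {
    ...message,
    parts: message.parts.map((p) =>
      p.toolCallId === toolCallId
        ? { ...p, state: "output-available", output: { questions: p.input.questions, answers } }
        : p
    ),
  };
}
```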

Rollout / validation

  1. Manual repro:
    • Trigger ask_user_question.
    • Quit Mux completely.
    • Relaunch → ensure questions are still answerable, no retry UI, no auto-resume.
    • Submit answers → ensure Mux resumes and continues.
  2. Confirm Esc does not interrupt while awaiting questions.
  3. Confirm “real” interrupted streams (non-ask_user_question) still show RetryBarrier + auto-resume.

Net LoC estimate (product code only)

  • Recommended approach: ~180–260 LoC
    • Frontend classification + eligibility + keybind gating: ~60–100
    • Backend fallback answer persistence + event emission: ~120–160

Alternatives considered

A) Persist/resume the actual in-flight model request

  • Would require provider-specific “resume from tool call” / response-id continuation.
  • High complexity, brittle across providers.
  • Not recommended.

B) Pass ignoreIncompleteToolCalls=false to convertToModelMessages for ask_user_question

  • Risks sending incomplete tool calls to providers (API validation failures).
  • Still doesn’t solve “answer after restart” unless we persist the tool result.
  • Not recommended.

Generated with mux • Model: openai:gpt-5.2 • Thinking: xhigh

@chatgpt-codex-connector

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Repo admins can enable using credits for code reviews in their settings.

Change-Id: I0ae14c18a3380a752e787012e4b3a3ee88429b54
Signed-off-by: Thomas Kosiewski <tk@coder.com>
Change-Id: Id57c2cf78994d5713020f7922bb414e5af773410
Signed-off-by: Thomas Kosiewski <tk@coder.com>
@ThomasK33 force-pushed the ask-user-questions-interruption branch from 26ae06e to 8420997 on December 14, 2025 20:24
@ThomasK33 added this pull request to the merge queue on December 14, 2025
@github-merge-queue bot removed this pull request from the merge queue due to failed status checks on December 14, 2025
@ThomasK33 merged commit f50fb3a into main on December 14, 2025
20 checks passed
@ThomasK33 deleted the ask-user-questions-interruption branch on December 14, 2025 20:52