bug: AskUserQuestion multi-question flow crashes Untether with TypeError after answering question 1 of N #488

@nathanschram

Description

Summary

An unhandled TypeError in the AskUserQuestion multi-question flow crashes the entire Untether process when the user answers question 1 of N via inline option buttons. systemd restarts the service in ~10 seconds, but the in-flight Claude session and any other active runs are killed.

Observed live on 2026-05-08 06:43:11 UTC (4:43 PM AEST) on @hetz_lba1_bot running v0.35.2 from PyPI. No existing issue covers this bug, and the same code path is present on the current dev branch (verified at feature/289-loop-scheduler HEAD).

Stack trace

File "untether/telegram/loop.py", line 2376, in route_message
    await reply(text=next_msg)        # <-- next_msg is RenderedMessage, not str
File "untether/telegram/bridge.py", line 425, in send_plain
    rendered_text, entities = prepare_telegram(MarkdownParts(header=text))
File "untether/telegram/render.py", line 293, in prepare_telegram
    return render_markdown(assemble_markdown_parts(trimmed))
File "untether/markdown.py", line 31, in assemble_markdown_parts
    return "\n\n".join(
TypeError: sequence item 0: expected str instance, RenderedMessage found

Root cause

src/untether/telegram/loop.py:2362-2376 builds a RenderedMessage for the next question (with HTML parse_mode + inline-keyboard option buttons) then passes it to reply(text=...):

buttons = get_question_option_buttons(flow)
from ..transport import RenderedMessage as _RM

next_msg = _RM(
    text=msg_text,
    extra={
        "parse_mode": "HTML",
        "reply_markup": {"inline_keyboard": buttons},
    },
)
await reply(text=next_msg)  # 💥 reply is bound to send_plain, expects str

Here, reply is the send_plain partial used for AskUserQuestion confirmations; its text keyword argument is typed str, not RenderedMessage. send_plain wraps the value in MarkdownParts(header=text), so markdown.assemble_markdown_parts ends up calling "\n\n".join((rendered_message, body, footer)), which raises the TypeError shown in the stack trace because the first item is not a str.
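The failure can be reproduced in isolation. The following is a minimal sketch using stand-in dataclasses: RenderedMessage and MarkdownParts here are simplified assumptions rather than the real Untether types, and assemble_markdown_parts is reduced to the join that appears in the stack trace.

```python
from dataclasses import dataclass, field


@dataclass
class RenderedMessage:
    """Stand-in for Untether's RenderedMessage (assumed shape)."""
    text: str
    extra: dict = field(default_factory=dict)


@dataclass
class MarkdownParts:
    """Stand-in for MarkdownParts; the real class may differ."""
    header: object = ""
    body: str = ""
    footer: str = ""


def assemble_markdown_parts(parts: MarkdownParts) -> str:
    # Reduced to the join from the stack trace; str.join rejects non-str items.
    return "\n\n".join((parts.header, parts.body, parts.footer))


def reproduce() -> str:
    next_msg = RenderedMessage(text="Question 2?", extra={"parse_mode": "HTML"})
    try:
        # Mirrors send_plain wrapping its text argument in MarkdownParts(header=text).
        assemble_markdown_parts(MarkdownParts(header=next_msg))
    except TypeError as exc:
        return str(exc)
    return "no error"


print(reproduce())
# prints: sequence item 0: expected str instance, RenderedMessage found
```

The message matches the last line of the stack trace, which supports the reading that the RenderedMessage reaches the join unmodified.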

Impact

  • High — crashes the entire Untether process, killing ALL active runs across all chats (not just the one in the multi-question flow).
  • systemd Restart=on-failure recovers in ~10s thanks to RestartSec=2 + child cleanup; offset_persistence.py (#287, "research: graceful restart improvements") prevents Telegram update loss. So the visible damage is "your runs got killed and the bot restarted with a startup message": manageable but bad UX.
  • Trigger: any AskUserQuestion call with ≥2 questions where the user answers the first via inline option buttons. Single-question flows are not affected.

Reproduction

  1. Send a prompt to a Claude session via Untether that causes Claude to call AskUserQuestion with at least 2 questions, each with options.
  2. Answer the first question by clicking an option button.
  3. Untether tries to send question 2 with its option buttons → crashes.

The bug only triggers on the rendered code path that includes inline-keyboard buttons for option-based questions. Open-ended (text-reply) questions wouldn't go through this branch.

Why no telemetry caught it earlier

untether-issue-watcher (the local error-log → GitHub issue forwarder) only watches specific named structlog events (handle.worker_failed, handle.runner_failed, etc.). An unhandled exception in route_message propagates up the asyncio task and kills the process before any structured logging can emit, so the watcher misses it. The crash is only visible in the systemd journal as a Rich-formatted Python traceback.
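To illustrate the gap described above, here is a rough sketch of a guard that turns an escaping exception into a named log event. It is not part of the proposed fix. It uses the standard library logging module rather than structlog, and the event name handle.route_failed, the guarded helper, and failing_route_message are all hypothetical names chosen for this example.

```python
import asyncio
import logging

log = logging.getLogger("untether.sketch")


async def guarded(handler, *args, event="handle.route_failed"):
    """Run handler; on failure, log a named event instead of propagating."""
    try:
        await handler(*args)
        return True
    except Exception:
        # `event` is a hypothetical name; a watcher would need to match on it.
        log.exception(event)
        return False


async def failing_route_message(_update):
    # Simulates the TypeError raised inside route_message.
    raise TypeError("sequence item 0: expected str instance, RenderedMessage found")


async def main():
    return await guarded(failing_route_message, {"update_id": 1})


handled_ok = asyncio.run(main())
print(handled_ok)  # False: the handler failed, but the event loop kept running
```

Whether swallowing the exception at this level is desirable is a separate design question; the sketch only shows where a named event could be emitted before the task dies.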

Proposed fix

Replace the reply(text=next_msg) path with a direct transport.send carrying the RenderedMessage and inline keyboard, modelled on the same flow's first-question send helper in commands/ask_question.py:

# src/untether/telegram/loop.py around line 2367-2376
buttons = get_question_option_buttons(flow)
await transport.send(
    channel_id=chat_id,
    message=RenderedMessage(
        text=msg_text,
        extra={
            "parse_mode": "HTML",
            "reply_markup": {"inline_keyboard": buttons},
        },
    ),
    options=SendOptions(
        reply_to=MessageRef(channel_id=chat_id, message_id=user_msg_id),
        thread_id=thread_id,
    ),
)
return

Test coverage

Add a regression test in tests/test_ask_user_question.py that drives a 2-question flow through route_message, asserts the second question is sent successfully, and asserts no exception is raised. Existing tests in this file cover single-question flows and registry lifecycle but do not exercise the multi-question continuation path.
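A possible shape for that regression test is sketched below. It does not drive the real route_message: FakeTransport, the local RenderedMessage, and send_next_question are hypothetical stand-ins for the fixed branch, and the callback_data value is a placeholder. The real test would replace them with the project's own fixtures.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class RenderedMessage:
    """Stand-in for Untether's RenderedMessage (assumed shape)."""
    text: str
    extra: dict = field(default_factory=dict)


class FakeTransport:
    """Records every send call instead of talking to Telegram."""

    def __init__(self):
        self.sent = []

    async def send(self, *, channel_id, message, options=None):
        self.sent.append((channel_id, message))


async def send_next_question(transport, chat_id, msg_text, buttons):
    # Hypothetical helper standing in for the fixed branch of route_message.
    await transport.send(
        channel_id=chat_id,
        message=RenderedMessage(
            text=msg_text,
            extra={
                "parse_mode": "HTML",
                "reply_markup": {"inline_keyboard": buttons},
            },
        ),
    )


def test_second_question_is_sent_with_buttons():
    transport = FakeTransport()
    buttons = [[{"text": "Yes", "callback_data": "q2:0"}]]  # placeholder data
    # Any exception raised here fails the test, covering "no exception is raised".
    asyncio.run(send_next_question(transport, 42, "Question 2?", buttons))

    assert len(transport.sent) == 1
    channel_id, message = transport.sent[0]
    assert channel_id == 42
    assert isinstance(message, RenderedMessage)
    assert message.extra["reply_markup"]["inline_keyboard"] == buttons


test_second_question_is_sent_with_buttons()
```

Asserting on the recorded message type guards against a regression to the str-only send_plain path.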

Affected files

  • src/untether/telegram/loop.py (lines ~2362-2376)
  • tests/test_ask_user_question.py (regression test to add)
  • CHANGELOG.md (entry under v0.35.3rc10)

Target

v0.35.3rc10 — should ship in the next staging rc since the bug breaks a user-visible Claude-only feature and is trivially reproducible.

Metadata

    Labels

    bug (Something isn't working)
    engine:claude (Claude Code CLI (Anthropic))
