fix: bot resilience + card/parser UI fixes #96
Merged
…cess-gate-first supervisor/restart

A sustained "Conflict: terminated by other getUpdates request" was swallowed by the PTB error handler while the bot kept long-polling, leaving it permanently deaf until a manual kill. The instance now exits after a short Conflict streak (or >15s) so the singleton flock + supervisor converge to a single live bot.

Supervisor reworked to be idempotent and fault-tolerant: it probes the process-level lock FIRST and backs off (never preempts) while a healthy instance holds it, only then waiting for network and launching. main.py yields cleanly (exit 0) when another healthy instance owns the lock instead of looking like a crash.

restart.sh is the explicit, deliberate, graceful restart path: SIGTERM -> wait for full exit + lock release -> SIGKILL only after timeout.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
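The "exit after a short Conflict streak (or >15s)" rule described in this commit could be tracked roughly as below. This is a minimal sketch, not the code from the PR: the class name `ConflictTracker`, its methods, and the streak limit of 5 are hypothetical; only the 15-second bound comes from the commit message.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed thresholds. The commit message only says "a short Conflict
# streak (or >15s)"; the count of 5 is an illustrative guess.
MAX_CONFLICT_STREAK = 5
MAX_CONFLICT_SECONDS = 15.0


@dataclass
class ConflictTracker:
    """Tracks consecutive Conflict errors seen while long-polling (hypothetical helper)."""

    streak: int = 0
    first_seen: Optional[float] = None

    def record_conflict(self, now: float) -> bool:
        """Register one Conflict error; return True when the instance should exit."""
        if self.first_seen is None:
            self.first_seen = now
        self.streak += 1
        too_many = self.streak >= MAX_CONFLICT_STREAK
        too_long = (now - self.first_seen) > MAX_CONFLICT_SECONDS
        return too_many or too_long

    def record_success(self) -> None:
        """Any successful poll ends the streak."""
        self.streak = 0
        self.first_seen = None
```

In a sketch like this, the error handler would call `record_conflict(time.monotonic())` on each Conflict and exit the process when it returns True, leaving the flock and the supervisor to decide which instance survives.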
…stion AskUserQuestion

Add a UIPattern for Claude Code's "Resume from summary" select shown when resuming a large/old session, so ccbot renders the arrow/Enter/Esc keyboard instead of leaving the resumed session hung.

Add a footer-scrolled-off fallback pattern for tall multi-question AskUserQuestion prompts, guarded against colliding with Permission / ResumeSummary / Settings.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…psing to 1

When the upstream Claude process stalls/exits mid-turn (no final assistant turn, only filtered metadata entries), the live card stayed frozen on its last frame. An active session whose spinner has been idle past a threshold with no new content is now finalized with a "went idle" note.

The footer page counter collapsed to 1 during a card repost / stale-reset because state.events was wiped and not re-seeded before the footer was built; re-seed inline so the turn-based total stays correct.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
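The idle-finalization check from this commit might look something like the following. It is a sketch under stated assumptions: `CardState`, `should_finalize_idle`, and the 120-second threshold are hypothetical names and values, and the `interactive_waiting` guard is inferred from the later #98 commit message rather than confirmed by this one.

```python
from dataclasses import dataclass

# Hypothetical threshold; the commit message does not state the real value.
IDLE_FINALIZE_SECONDS = 120.0


@dataclass
class CardState:
    """Minimal stand-in for the live card's per-session state (assumed shape)."""

    active: bool
    last_content_ts: float
    interactive_waiting: bool = False


def should_finalize_idle(
    state: CardState, now: float, threshold: float = IDLE_FINALIZE_SECONDS
) -> bool:
    """Return True when an active card has had no new content for `threshold` seconds."""
    if not state.active:
        return False
    # A prompt that is waiting on the user is not a stall (assumption).
    if state.interactive_waiting:
        return False
    return (now - state.last_content_ts) >= threshold
```

Passing `now` in explicitly keeps the check deterministic in tests; a caller would typically supply `time.monotonic()`.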
Time4Mind added a commit that referenced this pull request May 23, 2026
…select stall (#98)

* fix(card): stop double-repost flicker + duplicate user message

Two coupled live-card regressions, both surfacing on the active session's card; neither introduced by the #97 refactor.

Bug 1 (delete+resend flicker): repost_card refreshed last_rendered / last_edit_ts but not last_event_ts, so a card idle >= STALE_CARD_SECONDS was misjudged stale by the first event after the repost and a second card was spawned ~1-2s later. Stamp last_event_ts on repost (a repost is itself user activity).

Bug 2 (user message rendered twice): the stale-reset and release_card_message wipe sites re-seed events from JSONL, which already holds the just-submitted prompt, then append the same live event again with no dedup. Guard both append sites with _duplicate_of_seeded (matches type/started_at/text; distinct turns never collide).

Adds tests/test_card_dup_repost.py (5 regression tests).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(kb-mode): detect multi-select AskUserQuestion + debounce teardown

A parked multi-select AskUserQuestion (numbered bracketed checkboxes N. [✔] / N. [ ] with the cursor ❯ on a separate Submit line) was not classified by any AskUserQuestion UIPattern once the user moved the cursor onto Submit and the ☐ header had scrolled off: none of the existing top anchors (bare ☐ glyph, or ❯ N. on a numbered option) match that frame. Detection dropped mid-prompt: the kb-mode keyboard vanished and, with all interactive_waiting signals gone, the A4 stall-rescue (PR #96) misfired the 'session went idle' note.

terminal_parser: add a multi-select AUQ pattern anchored on the stable signatures (N. [✔]/[ ] checkbox lines, or the ❯ Submit line) framed by the Enter-to-select footer, plus the same anchors on the bottom-less last-resort pattern for the footer-also-scrolled-off case.

status_polling: debounce kb-mode teardown: require KB_CLEAR_CONFIRM_POLLS consecutive no-UI polls before exit_kb_mode(clear_pending=True), matching exit_kb_mode's stated 'double-poll confirm' intent, so a single flickered detection frame can't wipe a prompt that is still on screen. Streak resets when the prompt is re-detected.

Adds tests/test_askuser_multiselect.py + tests/test_kb_mode_debounce.py.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
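The debounce described for status_polling can be illustrated with a small counter. This is a sketch only: `KbModeDebouncer` and `observe` are hypothetical names, and the value 2 for KB_CLEAR_CONFIRM_POLLS is an assumption based on the 'double-poll confirm' wording, not a value read from the code.

```python
# Assumed value, suggested by the "double-poll confirm" wording in the commit.
KB_CLEAR_CONFIRM_POLLS = 2


class KbModeDebouncer:
    """Counts consecutive polls with no interactive UI detected (hypothetical helper)."""

    def __init__(self, confirm_polls: int = KB_CLEAR_CONFIRM_POLLS) -> None:
        self.confirm_polls = confirm_polls
        self.no_ui_streak = 0

    def observe(self, ui_detected: bool) -> bool:
        """Feed one poll result; return True when kb-mode should be torn down."""
        if ui_detected:
            # Prompt re-detected: the streak resets, as the commit describes.
            self.no_ui_streak = 0
            return False
        self.no_ui_streak += 1
        return self.no_ui_streak >= self.confirm_polls
```

With `confirm_polls=2`, the poll sequence detected, missing, detected, missing, missing triggers teardown only on the final poll, so a single flickered frame never clears the prompt.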
Time4Mind added a commit that referenced this pull request May 23, 2026
Audit (3 agents) of tests + docs + config vs the current code (post #96/#97/#98). Findings + fixes:

Tests (suite was already healthy; no stale tests to delete):
- Drop duplicate test_session.py::TestActiveSessions::test_set_and_clear (covered by test_session_dm.py create+delete); keep the empty-initial guard.
- Strengthen 3 'is None or != AskUserQuestion' poach-guards to 'is None' (verified the parser returns None for those degraded-capture panes).

Docs:
- architecture.md: fix 'background sessions render their own cards' (they emit none, panel only); correct commands/ inventory (no /list /use /rename; add /health /help); add missing modules (logging_setup, metrics, voice_install, local_terminal, card_model, kb_mode, response_builder, context_poll [disabled], callbacks/help).
- dm-architecture.md: rewrite the slash-command block to match setMyCommands reality (published: menu/help/history/done + forwarded CC pickers; hidden: new/kill/stop/archive/screenshot/usage/health).
- dm-multisession-spec.md: BOT_TOKEN -> TELEGRAM_BOT_TOKEN, CLAUDE_BIN -> CLAUDE_COMMAND, MODEL_PATH -> WHISPER_MODEL_PATH; /history is published; flag /restore-file as not-yet-implemented.
- dm-multisession-plan.md: banner marking it COMPLETED & SUPERSEDED (its file:line hotspot map targets the removed monolithic bot.py).

Config:
- .env.example: remove dead SESSION_TOKEN_BUDGET_5H / MAX_5H_TOKENS / MAX_WEEKLY_TOKENS (retired local token aggregator); document CCBOT_RESUME_SETTLE_TIMEOUT / CARD_EDIT_LAG / BG_STATUS_MAX / BG_STATUS_QUOTA_THRESHOLDS / LOG_LEVEL + a supervisor-knobs pointer.
- session.py: stale '/status' -> 'Menu -> Status' in a comment.

No behavior change (only a comment edit in src). 560 tests pass.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Summary
Resilience + UI fixes surfaced by auditing prod logs (Conflict storms, stranded supervisors, frozen live cards, missing inline keyboards).
- restart.sh (SIGTERM → wait for clean lock release → SIGKILL only on timeout).

Test plan
- ruff check + ruff format --check clean
- pyright src/ccbot/: 0 errors
- shellcheck on supervisor/restart scripts: clean
- pytest: 528 passed (28 new tests across 6 files)

🤖 Generated with Claude Code