0.35.3rc19 — /monitor campaign issue sweep (10 commits) by nathanschram · Pull Request #555 · littlebearapps/untether

Nathan Schram (nathanschram) · 2026-05-17T08:33:57Z

Summary

Bundles 9 OPEN issues from the recent /monitor campaign (lba-1 staging, runs 2026-05-13 through 2026-05-16) into one rc19 staged for fleet rollout, plus the daemon-filed #533 duplicate.

User-confirmed scope decisions: all in-scope issues moved into v0.35.3 milestone; #527 (umbrella stall predicate refactor) explicitly deferred to v0.35.4.

Already fixed in master (closed out separately)

#544 (bug, #507 redux: ScheduleWakeup state cleared on tool_result before _post_result_idle_watchdog reads it): ScheduleWakeup state-lifetime; verified fixed in rc16 by PR #545; 3 new + 2 existing tests pass
#497 (monitor: catalog.refresh_sent storm in 'scout', count=183): verified fixed in rc14 by PR #541, which debounces catalog.refresh_sent to prevent storms

Bundled here

| # | Title | Commit |
|---|---|---|
| #528 | "↩️ Answered:" echo no longer truncated to 100 chars | `d175be9` |
| #525 | dedup cancel.requested triple-fire (1s TTL on chat/msg id) | `f558235` |
| #532 | consolidate per-engine setup.warning to one setup.summary INFO | `a9dc234` |
| #547 axis 1 | preamble warns agents against `systemctl restart untether` | `12cf4ca` |
| #523 | recognise leading-dot slash-command typos (`.new`, `.usage`, …) | `b02c20d` |
| #524 | surface skipped outbox entries (directories etc.) in chat | `fde1bd8` |
| #526 (+ #533) | demote stall WARN to INFO during approval-pending | `66ba78f` |
| #547 axes 2+3 + #548 | hot-reload Telegram confirmation message | `40dc6b7` |
| #546 | bypass outbox for answer_callback_query (latency fix) | `ee03b32` |
| version bump | 0.35.3rc18 → 0.35.3rc19 + uv.lock | `12c2d71` |

Deferred to v0.35.4

#547 axis 3 (ENH-PATCH: agent self-restart pattern after editing untether.toml; hot-reload ignored, drain timeout, outbox message dropped): drain self-restart heuristic. Needs a bigger refactor; axes 1+2 break the pattern at its source.
#527 (tracking: unified predicate-based stall detector, superseding the whack-a-mole pattern across releases): umbrella refactor, deferred per user decision.

Pre-PR validation

uv run pytest — 2737 passed / 2 skipped (1 skipped is pre-existing)
Coverage gate — 82.58% (above 80%)
uv run ruff check src/ tests/ — clean
uv run ruff format --check src/ tests/ — clean
uv lock --check — clean

Test plan

CI green on TestPyPI publish path
After merge to dev: integration tests on @untether_dev_bot (claude chat -5284581592):
- U2: AskUserQuestion → "Other (type it out)" → 250-char reply → echo not truncated ("↩️ Answered:" confirmation truncated to 100 chars after AskUserQuestion text reply (agent receives full text) #528)
- U6: ExitPlanMode → triple-tap Cancel within 1s → journal shows ONE cancel.requested (TRIVIAL: cancel.requested fires 3× within 512 ms for a single user cancel (button not disabled after first click or duplicate dispatch) #525)
- U7: /new then any prompt → assistant transcript contains the new preamble sentence (ENH-PATCH: agent self-restart pattern after editing untether.toml — hot-reload ignored, drain timeout, outbox message dropped #547 axis 1)
- send .new → reply offers /new correction (ENH-PATCH: Untether bot dispatches an agent run for slash-command typos like .new (no recognition, full Claude cost incurred) #523)
- edit ~/.untether-dev/untether.toml → confirm "♻️ Hot-reloaded ... No restart needed." message appears in chat (ENH-PATCH: agent self-restart pattern after editing untether.toml — hot-reload ignored, drain timeout, outbox message dropped #547 axis 2 / ENH-PATCH: hot-reload success Telegram notification with explicit "no restart needed" framing #548); journal shows ONE setup.summary INFO (no per-engine WARN, ENH-PATCH: setup.warning fires per-engine on every config.reload — 5×N noise when host has fewer engines than fleet #532)
- prompt Claude to write .untether-outbox/guides/ (dir) + .untether-outbox/x.md (file) → final message includes "📎 Outbox skipped: guides/ — directory" + delivers x.md (bug: outbox silently drops directory entries (e.g. guides/) — silent loss of intended deliveries on every session #524)
- trigger ExitPlanMode and leave pending 15+ min → no subprocess.liveness_stall WARNs; subprocess.approval_pending INFO fires once per 30 min (ENH-PATCH: differentiate approval-pending stalls from genuine hangs in stall warnings + Telegram messaging #526)
- rapid-tap 3 approvals → all callback.answered.latency_ms stay near the ~220ms baseline (ENH-PATCH: callback-answer latency escalates 6-10× under rapid-click clusters (220ms baseline → 1.4-2.9s on 2nd/3rd click) #546)
scripts/run-integration-tests.sh 0.35.3rc19 --manual attestation written
scripts/fleet-rollout.sh 0.35.3rc19 to all 4 hosts (lba-1 + nsd + channelo + mac); /ping smoke on each
Comment on each closed issue with rc19 rollout confirmation

🤖 Generated with Claude Code

The `↩️ Answered:` confirmation after an AskUserQuestion text reply was hard-sliced at [:100] with no ellipsis, so users couldn't see whether their full message reached the agent (the agent path was unaffected and always received the complete text). Replace the slice with a 300-char soft cap + ellipsis via the new `_format_answered_echo` helper. Regression tests in tests/test_loop_coverage.py. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
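A minimal sketch of the soft-cap behaviour described above, assuming the echo is built from a plain string. The name `format_answered_echo_sketch` and its signature are hypothetical stand-ins; the real `_format_answered_echo` helper in loop.py may differ.

```python
ECHO_SOFT_CAP = 300  # soft cap quoted in the commit message


def format_answered_echo_sketch(text: str, cap: int = ECHO_SOFT_CAP) -> str:
    """Return the echo text, appending an ellipsis only when it is shortened.

    Hypothetical stand-in for the `_format_answered_echo` helper.
    """
    cleaned = text.strip()
    if len(cleaned) <= cap:
        # Short replies (including the 250-char case in the test plan) pass through whole.
        return cleaned
    # Cut at the cap, drop trailing whitespace, and mark the truncation visibly.
    return cleaned[:cap].rstrip() + "…"
```

Unlike a bare `[:100]` slice, the ellipsis tells the user the echo was shortened rather than the message itself.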

Rapid double/triple-tap of the inline Cancel button delivered three `cancel.requested` events for one user intent (Telegram delivered duplicate callbacks before the keyboard cleared). Repeat `cancel_requested.set()` was benign today, but log noise + future side-effectful cancel actions would inherit the 3x fan-out. Add a 1-second TTL dedup keyed on (chat_id, progress_message_id) in all three cancel entry points (text-reply, text-fallback, callback). Per-test autouse fixture clears the module-level dict so tests that reuse (chat_id, msg_id) aren't surprised by silent drops. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
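The dedup described above could look roughly like the following. This is a sketch under assumptions: the function names, the module-level dict name, and the injectable `now` parameter are illustrative and not taken from the Untether source.

```python
import time
from typing import Optional

CANCEL_DEDUP_TTL_S = 1.0  # 1-second TTL from the commit message

# Module-level state keyed on (chat_id, progress_message_id); name is hypothetical.
_recent_cancels: dict[tuple[int, int], float] = {}


def should_process_cancel(
    chat_id: int, progress_message_id: int, now: Optional[float] = None
) -> bool:
    """Return True for the first cancel in a TTL window, False for duplicates."""
    current = time.monotonic() if now is None else now
    key = (chat_id, progress_message_id)
    last = _recent_cancels.get(key)
    if last is not None and current - last < CANCEL_DEDUP_TTL_S:
        return False  # duplicate tap inside the window: drop silently
    _recent_cancels[key] = current
    return True


def reset_cancel_dedup() -> None:
    """Clear the dict; an autouse test fixture could call this between tests."""
    _recent_cancels.clear()
```

Each of the three cancel entry points would call `should_process_cancel` before logging `cancel.requested`. The reset function mirrors the per-test fixture mentioned above, which keeps tests that reuse the same `(chat_id, msg_id)` pair from being dropped.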

…ummary Previously every `config.reload.applied` emitted one [warning] setup.warning per engine not on PATH. On a single-engine host (e.g. channelo runs only claude) that's 5 WARNs per reload, padding warn filters in untether-issue-watcher, /monitor, and Grafana with intentional install state. Replace with one INFO `setup.summary` line per reload that lists found/missing_on_path/bad_config engines. Loud WARN now reserved for engines the user actively configured (non-empty [engines.<id>] block) but that aren't on PATH — those are noteworthy. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
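To illustrate the consolidation, here is a hedged sketch that classifies engines once per reload and returns a single summary plus the short list that still deserves a WARN. The function name, the `configured` mapping shape, and the injectable `on_path` check are assumptions; the `bad_config` bucket from the commit message is omitted for brevity.

```python
import shutil
from typing import Callable, Iterable, Mapping, Optional


def summarise_engines(
    engine_ids: Iterable[str],
    configured: Mapping[str, dict],
    on_path: Optional[Callable[[str], bool]] = None,
) -> tuple[dict, list]:
    """Return (summary fields for one INFO line, engine ids that warrant a WARN)."""
    check = on_path or (lambda name: shutil.which(name) is not None)
    found, missing, warn = [], [], []
    for engine_id in engine_ids:
        if check(engine_id):
            found.append(engine_id)
            continue
        missing.append(engine_id)
        # Only a non-empty [engines.<id>] block makes a missing binary noteworthy.
        if configured.get(engine_id):
            warn.append(engine_id)
    return {"found": found, "missing_on_path": missing}, warn
```

On a host that runs only claude, this yields one summary and an empty WARN list unless the user has actively configured one of the absent engines.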

…tether Agents routinely follow `Edit untether.toml` with `Bash systemctl --user restart untether` because their training data is full of "restart the service after config changes". Untether already hot-reloads the file; the restart shuts down the very session issuing the command, drain hits the 120s timeout, and the agent's final answer to the user is silently dropped via outbox.fail_pending. Add a dedicated "Configuration changes (`untether.toml`)" section to the default preamble explicitly telling agents NOT to restart after editing config, with the consequence spelled out and the restart-only key list (`bot_token`, `chat_id`, `session_mode`, `topics`, `message_overflow`) provided as the genuine exception. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

`.new`, `.cancel`, etc. previously dispatched as fresh agent prompts, so the full Claude cold-start cost (OAuth handshake, MCP catalog probe, preamble injection) was paid before the user could cancel. `.` and `/` are adjacent on iOS/Android punctuation rows, and several mobile keyboards auto-replace a leading `/` with `.` on autocorrect. Add `parse_dot_typo` helper that recognises `.<cmd>` and `.<cmd> args` shapes where <cmd> matches a registered slash command (case-insensitive, ellipsis/path-prefix safe). Wired into route_message in a follow-up commit so detection happens before agent dispatch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
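One way the recognition rule could be expressed is sketched below. The regex, the return shape, and the sample command set are assumptions made for illustration; the actual `parse_dot_typo` signature is not shown in this PR.

```python
import re
from typing import Optional

# Hypothetical sample of registered slash commands.
SAMPLE_COMMANDS = frozenset({"new", "cancel", "usage", "ping"})

# A dot, a command word, then either end of text or whitespace plus arguments.
_DOT_CMD = re.compile(r"^\.([A-Za-z][A-Za-z0-9_]*)(?:\s+(.*))?$", re.DOTALL)


def parse_dot_typo_sketch(
    text: str, commands: frozenset = SAMPLE_COMMANDS
) -> Optional[tuple[str, str]]:
    """Return (command, args) when text looks like a mistyped slash command."""
    match = _DOT_CMD.match(text.strip())
    if match is None:
        return None  # covers "...", "./path" and ".name.ext" shapes
    name = match.group(1).lower()  # case-insensitive match
    if name not in commands:
        return None
    return name, (match.group(2) or "").strip()
```

Requiring the command word to be followed by whitespace or end of text is what keeps inputs such as `.new.txt` or `...new` from being treated as typos.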

When an agent writes a directory (e.g. `guides/`) into `.untether-outbox/`, the scanner logs `outbox.skipped` and drops it without surfacing anything to the user. The agent's "I've prepared the guides folder for you" final message becomes a silent lie. Wire the skipped tuples through to a new `_format_outbox_skipped_notice` helper in runner_bridge.py (added in the #547 axis 1 commit alongside the preamble update) that composes a brief 📎 Outbox skipped block and sends it as a follow-up message in the same chat. Gated by new `[transports.telegram.files].outbox_notify_skipped` config flag (default true so the surface fires automatically on upgrade). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
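A rough sketch of the notice formatting follows, assuming each skipped tuple is `(name, reason)` and that the block is one line per entry in the shape quoted in the test plan. The function name and tuple shape are hypothetical; the real `_format_outbox_skipped_notice` may compose the block differently.

```python
from typing import Iterable, Optional

SEPARATOR = " \u2014 "  # dash used in the test-plan message text


def format_outbox_skipped_notice_sketch(
    skipped: Iterable[tuple[str, str]], notify_skipped: bool = True
) -> Optional[str]:
    """Build the follow-up chat message, or None when nothing should be sent."""
    if not notify_skipped:
        return None  # mirrors outbox_notify_skipped = false
    lines = [
        f"📎 Outbox skipped: {name}{SEPARATOR}{reason}" for name, reason in skipped
    ]
    return "\n".join(lines) if lines else None
```

Returning `None` rather than an empty string lets the caller skip the follow-up send entirely when the outbox had no skipped entries.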

The runner_bridge.py change shipped in the #547 axis 1 commit (same file edited for both fixes); this commit adds the regression tests. When threshold_reason == "pending_approval", emit a paced `subprocess.approval_pending` INFO (max once per 30 min) instead of the `progress_edits.stall_detected` WARN. The chat-side "⏳ Awaiting your approval (N min)" message (#494-C) is unchanged; only the log-side WARN is suppressed, so warn-filter dashboards and the untether-issue-watcher daemon stop spamming on legitimate approval waits. Also closes #533 as a duplicate (daemon-filed subprocess.liveness_stall on nsd, same root cause).

Tests in test_exec_bridge.py assert:
- progress_edits.stall_detected WARN is NOT emitted when approval_pending is true
- subprocess.approval_pending INFO IS emitted with approval_pending=True
- The INFO fires at most once per 30-min window even with rapid ticks

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
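The pacing rule above can be sketched as a small state holder. The class name, the `classify` method, and the `"idle"` reason used in the usage note are assumptions for illustration; only the event names and the 30-minute window come from the commit message.

```python
from typing import Optional

APPROVAL_PENDING_REFIRE_S = 30 * 60  # at most one INFO per 30 minutes


class ApprovalPendingPacer:
    """Decide which log event, if any, a stall tick should emit."""

    def __init__(self, refire_s: float = APPROVAL_PENDING_REFIRE_S) -> None:
        self._refire_s = refire_s
        self._last_info_at: Optional[float] = None

    def classify(self, threshold_reason: str, now: float) -> Optional[tuple[str, str]]:
        """Return (level, event) to log, or None when the tick is suppressed."""
        if threshold_reason != "pending_approval":
            # Anything other than an approval wait keeps the loud WARN.
            return ("warning", "progress_edits.stall_detected")
        if self._last_info_at is not None and now - self._last_info_at < self._refire_s:
            return None  # inside the pacing window: stay quiet
        self._last_info_at = now
        return ("info", "subprocess.approval_pending")
```

For example, ticks at 0 s, 60 s and 1800 s with `"pending_approval"` would log the INFO twice and suppress the middle tick, while a tick with a hypothetical `"idle"` reason would still produce the WARN.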

…g.reload The loop.py wiring (`_notify_reload_applied` + handle_reload changes) shipped in the #528 commit (same file edited for both fixes); this commit adds the formatting module, tests, and FAQ entry. New module `src/untether/config_reload_notification.py` exposes three message shapes (hot-reload-only / restart-required / partial-reload) with literal "**No restart needed.**" / "**Restart required**" headlines. Agents read these messages in next-turn context, and the framing flips the trained-in reflex to `systemctl restart` after editing config. Broadcast follows the same project-chat + admin-DM fan-out pattern as `_notify_restart_required` (#318) so the affirmation reaches whoever edited the file even in project-routed deployments. FAQ entry "Do I need to restart Untether after editing untether.toml?" documents the hot-reload behaviour, restart-only key list, and the agent-don't-restart guidance from #547 axis 1. Axis 3 (drain self-restart heuristic) deferred to v0.35.4: the obvious "detect the active session ran systemctl" heuristic is fragile and inverts cleanly to false-positive on legitimate sibling-unit restarts. The robust path needs a bigger refactor. Axes 1+2 together break the recurring pattern at its source. Closes #548. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Rapid taps (e.g. approving plans in two chats inside ~2s) saw callback-answer latency escalate 6-10x: 1st click ~220ms HTTP baseline, 2nd/3rd clicks 1.4-2.9s. Root cause was that `answer_callback_query` went through `enqueue_op(chat_id=None)` and stacked on the shared `_next_at[None]` per-chat pacing bucket (private_interval=1.0s) even though Telegram doesn't rate-limit callback-answers per chat; they're keyed off callback-query-id. Route `answer_callback_query` directly through `self._client.answer_callback_query`, bypassing the outbox semaphore + per-chat pacing. Retry-after handling preserved (one retry on TelegramRetryAfter then fail-fast, which is better than silent retry loops since the spinner expires after 30s anyway). Add `queue_wait_ms=0.0` field to `callback.answered` instrumentation so monitoring dashboards can confirm the bypass survived future refactors. Regression test asserts the outbox.enqueue path is never reached during answer_callback_query. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
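The stacking effect can be reproduced with a deliberately simplified model of a single pacing bucket. This is not the Untether outbox; the function and its arguments are hypothetical, and only the 1.0s interval is taken from the commit message.

```python
def queue_waits(arrivals: list, interval: float = 1.0) -> list:
    """Return the queue wait (seconds) each op spends behind one shared bucket.

    Simplified model: every op must start at least `interval` seconds after
    the previous op started, regardless of which chat it belongs to.
    """
    next_at = 0.0
    waits = []
    for arrived in arrivals:
        start = max(arrived, next_at)  # wait for the bucket to reopen
        waits.append(start - arrived)
        next_at = start + interval
    return waits
```

With three taps arriving half a second apart, the second and third wait progressively longer, which matches the shape of the escalation reported above. Calling the client directly removes this queue wait, which is what the `queue_wait_ms=0.0` field records.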

Bundles 9 issues from the recent /monitor campaign (lba-1 staging, 2026-05-13 through 2026-05-16 runs) + the daemon-filed #533 dup:
- #528: Answered: echo no longer truncated to 100 chars
- #525: dedup cancel.requested triple-fire
- #532: consolidate per-engine setup.warning to one summary
- #547 axis 1: preamble warns agents against systemctl restart
- #523: recognise leading-dot slash-command typos (`.new` etc.)
- #524: surface skipped outbox entries (directories etc.) in chat
- #526: demote stall WARN to INFO during approval-pending (+ #533)
- #547 axes 2+3 + #548: hot-reload Telegram notification
- #546: bypass outbox for answer_callback_query (latency fix)

Plus housekeeping: closed #544 / #497 (verified already fixed in rc16 / rc14), closed #531 (label + monitor TOML config drift fixed out-of-tree). Axis 3 of #547 (drain self-restart heuristic) deferred to v0.35.4; it needs a bigger refactor than rc19 wants. #527 (umbrella predicate refactor) deferred to v0.35.4 per user decision. 2737 tests pass / 82.58% coverage / ruff format + check clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-05-17T08:34:05Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.



Both issues shipped rc19 fixes (PR #555) but /monitor audits on 2026-05-18 showed each regression still firing in production because the rc19 patch landed in only one of two code paths.

#524: outbox silently drops directory entries

rc19 surfaced 📎 Outbox skipped notices on the normal-completion path in handle_message but missed two adjacent paths: the pre-auto-continue delivery (subprocess 1 stuck-after-tool-result recovery) and the run_ok=False failed-run branch. Both still silently dropped the agent's intended deliverable. This commit extracts the surfacing logic into _surface_outbox_skipped in runner_bridge.py and wires it into both gap paths. On a failed run the code still skips the actual file send (preserving the original gating) but does a cheap scan_outbox() to collect skipped items and surface them, so the user always learns what the agent intended to ship. Honours the existing outbox_notify_skipped config flag and filters the "..." overflow pseudo-entry from the user-facing block.

#526: approval-pending stalls misclassified

rc19 demoted the bridge-side WARN (progress_edits.stall_detected) to a paced INFO (subprocess.approval_pending) when _has_pending_approval() returned true. The watchdog-side detector in runner.py (which emits subprocess.liveness_stall and is the actual signal untether-issue-watcher auto-files on) was untouched, so the daemon kept filing GitHub issues on routine approval-pending sessions and the nsd audit (2026-05-18) showed a user cancelling a productive 15-minute investigation because the chat-side reassurance came too late (1800s threshold).

This commit:
- Adds _recent_event_is_control_request helper in runner.py. It uses the stream's recent_events ring buffer as the approval-pending signal, consistent with the bridge's inline-keyboard predicate but accessible to runner-scope code.
- Plumbs the predicate into _watchdog_loop: when the last JSONL event is control_request, emit subprocess.approval_pending INFO instead of liveness_stall WARN. Skip the auto-kill branch entirely. Pace INFO emission to once per 30 min via the shared _APPROVAL_PENDING_REFIRE_S constant (now defined once in runner.py and imported by the bridge).
- Splits _STALL_THRESHOLD_APPROVAL into _STALL_THRESHOLD_APPROVAL_FIRST (600s) and the existing 1800s refire so the user gets a reassuring "tap a button above" chat message at 10 min on first occurrence, matching the watchdog's liveness threshold and avoiding the nsd-style early cancellation.
- Rewords the chat-side approval reminder copy to make the "tap a button above to proceed (no action needed otherwise)" affordance explicit, directly quoting the audit's recommended text.

Tests cover both code paths:
- tests/test_outbox_delivery.py (existing): format helper + settings default unchanged; no new file-level tests needed.
- tests/test_exec_bridge.py: failed-run surfacing, notify_skipped=false suppression, only-overflow filter, two-tier first-reminder threshold, reworded copy.
- tests/test_exec_runner.py: predicate truth-table coverage, watchdog demotion via integration with a fake codex script emitting control_request, watchdog WARN still fires when no control_request is recent.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
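The watchdog-side predicate and the two-tier reminder threshold might be sketched as follows. The event shape (a dict with a `"type"` key), the function names, and the `reminders_sent` counter are assumptions for illustration; only the event names and the 600s / 1800s values come from the commit message.

```python
from collections import deque
from typing import Deque

STALL_THRESHOLD_APPROVAL_FIRST_S = 600  # first reminder at 10 min
APPROVAL_PENDING_REFIRE_S = 1800        # later reminders every 30 min


def recent_event_is_control_request(recent_events: Deque[dict]) -> bool:
    """True when the newest event in the ring buffer is a control_request."""
    if not recent_events:
        return False
    return recent_events[-1].get("type") == "control_request"


def watchdog_event(recent_events: Deque[dict]) -> str:
    """Pick the log event for a stalled stream; approval waits are demoted."""
    if recent_event_is_control_request(recent_events):
        return "subprocess.approval_pending"  # INFO, and no auto-kill
    return "subprocess.liveness_stall"        # WARN, genuine hang


def reminder_due(seconds_waiting: float, reminders_sent: int) -> bool:
    """Two-tier threshold: 600s for the first reminder, 1800s afterwards."""
    threshold = (
        STALL_THRESHOLD_APPROVAL_FIRST_S
        if reminders_sent == 0
        else APPROVAL_PENDING_REFIRE_S
    )
    return seconds_waiting >= threshold
```

Only the newest event is checked, so a stream that has moved on after a `control_request` is treated as a genuine stall again.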

Nathan Schram (nathanschram) and others added 10 commits May 17, 2026 18:32

Nathan Schram (nathanschram) merged commit b23d1f8 into dev May 17, 2026
21 checks passed

Nathan Schram (nathanschram) deleted the fix/rc19-monitor-campaign-bundle branch May 17, 2026 08:35

This was referenced May 20, 2026

ENH: drain-timeout heuristic for self-restart from active session (follow-up to #547 axis 3) #559

Open

release: v0.35.3 #560

Merged
