0.35.3rc19 — /monitor campaign issue sweep (10 commits)#555
Merged
Nathan Schram (nathanschram) merged 10 commits on May 17, 2026
The `↩️ Answered:` confirmation after an AskUserQuestion text reply was hard-sliced at [:100] with no ellipsis, so users couldn't see whether their full message reached the agent (the agent path was unaffected and always received the complete text). Replace the slice with a 300-char soft cap + ellipsis via the new `_format_answered_echo` helper. Regression tests in tests/test_loop_coverage.py. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
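A minimal sketch of the soft-cap behaviour described above. The real `_format_answered_echo` signature is not shown in this PR, so the function name, parameter names, and trailing-whitespace handling here are assumptions for illustration only.

```python
ANSWERED_ECHO_SOFT_CAP = 300  # soft cap taken from the commit message


def format_answered_echo_sketch(text: str, cap: int = ANSWERED_ECHO_SOFT_CAP) -> str:
    """Build the confirmation line, marking any truncation with an ellipsis.

    Hypothetical stand-in for the `_format_answered_echo` helper.
    """
    prefix = "↩️ Answered: "
    if len(text) <= cap:
        # Short replies are echoed in full, with no ellipsis.
        return prefix + text
    # Long replies keep the first `cap` characters; the ellipsis signals
    # that the echo (not the message sent to the agent) was shortened.
    return prefix + text[:cap].rstrip() + "…"
```

The point of the ellipsis is that a user can tell a shortened echo apart from a message that really ended at the cut-off point.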
Rapid double/triple-tap of the inline Cancel button delivered three `cancel.requested` events for one user intent (Telegram delivered duplicate callbacks before the keyboard cleared). Repeat `cancel_requested.set()` was benign today, but log noise + future side-effectful cancel actions would inherit the 3x fan-out. Add a 1-second TTL dedup keyed on (chat_id, progress_message_id) in all three cancel entry points (text-reply, text-fallback, callback). Per-test autouse fixture clears the module-level dict so tests that reuse (chat_id, msg_id) aren't surprised by silent drops. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ummary Previously every `config.reload.applied` emitted one [warning] setup.warning per engine not on PATH. On a single-engine host (e.g. channelo runs only claude) that's 5 WARNs per reload, padding warn filters in untether-issue-watcher, /monitor, and Grafana with intentional install state. Replace with one INFO `setup.summary` line per reload that lists found/missing_on_path/bad_config engines. Loud WARN now reserved for engines the user actively configured (non-empty [engines.<id>] block) but that aren't on PATH — those are noteworthy. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
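One way to picture the classification behind the single `setup.summary` line is the sketch below. The function name, argument shapes, and the returned dict are assumptions; only the `found` / `missing_on_path` / `bad_config` buckets and the "warn only for actively configured engines" rule come from the commit message.

```python
def summarize_engines_sketch(engine_ids, on_path, configured, bad_config=frozenset()):
    """Bucket engines for one summary line and pick the ones worth a loud WARN.

    engine_ids: every engine the fleet knows about
    on_path:    engines whose binary was found on PATH
    configured: engines with a non-empty [engines.<id>] block
    bad_config: engines whose config failed validation
    """
    bad = sorted(e for e in engine_ids if e in bad_config)
    found = sorted(e for e in engine_ids if e in on_path and e not in bad_config)
    missing = sorted(e for e in engine_ids if e not in on_path and e not in bad_config)
    # Missing engines the user never configured are intentional install state,
    # so only configured-but-missing engines are flagged for a WARN.
    warn = [e for e in missing if e in configured]
    return {"found": found, "missing_on_path": missing, "bad_config": bad, "warn": warn}
```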
…tether Agents routinely follow `Edit untether.toml` with `Bash systemctl --user restart untether` because their training data is full of "restart the service after config changes". Untether already hot-reloads the file; the restart shuts down the very session issuing the command, drain hits the 120s timeout, and the agent's final answer to the user is silently dropped via outbox.fail_pending. Add a dedicated "Configuration changes (`untether.toml`)" section to the default preamble explicitly telling agents NOT to restart after editing config, with the consequence spelled out and the restart-only key list (`bot_token`, `chat_id`, `session_mode`, `topics`, `message_overflow`) provided as the genuine exception. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`.new`, `.cancel`, etc. previously dispatched as fresh agent prompts, with the full Claude cold-start cost (OAuth handshake, MCP catalog probe, preamble injection) paid before the user could cancel. `.` and `/` are adjacent on iOS/Android punctuation rows, and several mobile keyboards auto-replace a leading `/` with `.` on autocorrect. Add `parse_dot_typo` helper that recognises `.<cmd>` and `.<cmd> args` shapes where <cmd> matches a registered slash command (case-insensitive, ellipsis/path-prefix safe). Wired into route_message in a follow-up commit so detection happens before agent dispatch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When an agent writes a directory (e.g. `guides/`) into `.untether-outbox/`, the scanner logs `outbox.skipped` and drops it without surfacing anything to the user. The agent's "I've prepared the guides folder for you" final message becomes a silent lie. Wire the skipped tuples through to a new `_format_outbox_skipped_notice` helper in runner_bridge.py (added in the #547 axis 1 commit alongside the preamble update) that composes a brief 📎 Outbox skipped block and sends it as a follow-up message in the same chat. Gated by new `[transports.telegram.files].outbox_notify_skipped` config flag (default true so the surface fires automatically on upgrade). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The runner_bridge.py change shipped in the #547 axis 1 commit (same file edited for both fixes); this commit adds the regression tests. When threshold_reason == "pending_approval", emit a paced `subprocess.approval_pending` INFO (max once per 30 min) instead of the `progress_edits.stall_detected` WARN. The chat-side "⏳ Awaiting your approval (N min)" message (#494-C) is unchanged; only the log-side WARN is suppressed, so warn-filter dashboards and the untether-issue-watcher daemon stop spamming on legitimate approval waits. Also closes #533 as a duplicate (daemon-filed subprocess.liveness_stall on nsd, same root cause).

Tests in test_exec_bridge.py assert:
- progress_edits.stall_detected WARN is NOT emitted when approval_pending is true
- subprocess.approval_pending INFO IS emitted with approval_pending=True
- The INFO fires at most once per 30-min window even with rapid ticks

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
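The demotion and pacing described above can be sketched as follows. The class and function names are hypothetical, and the bridge's real control flow is not shown in this PR; only the event names and the 30-minute window come from the commit message.

```python
APPROVAL_PENDING_REFIRE_S = 30 * 60  # "max once per 30 min"


class PacedEmitter:
    """Allow one emission per interval, however often the caller ticks."""

    def __init__(self, interval_s: float = APPROVAL_PENDING_REFIRE_S) -> None:
        self.interval_s = interval_s
        self._last_emit: float | None = None

    def should_emit(self, now: float) -> bool:
        if self._last_emit is None or now - self._last_emit >= self.interval_s:
            self._last_emit = now
            return True
        return False


def classify_stall(approval_pending: bool, pacer: PacedEmitter, now: float):
    """Return (level, event) to log for a stall tick, or None to stay quiet."""
    if not approval_pending:
        return ("warning", "progress_edits.stall_detected")
    if pacer.should_emit(now):
        return ("info", "subprocess.approval_pending")
    return None  # approval still pending, INFO already logged in this window
```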
…g.reload The loop.py wiring (`_notify_reload_applied` + handle_reload changes) shipped in the #528 commit (same file edited for both fixes); this commit adds the formatting module, tests, and FAQ entry. New module `src/untether/config_reload_notification.py` exposes three message shapes (hot-reload-only / restart-required / partial-reload) with literal "**No restart needed.**" / "**Restart required**" headlines; agents read these messages in next-turn context and the framing flips the trained-in reflex to `systemctl restart` after editing config. Broadcast follows the same project-chat + admin-DM fan-out pattern as `_notify_restart_required` (#318) so the affirmation reaches whoever edited the file even in project-routed deployments. FAQ entry "Do I need to restart Untether after editing untether.toml?" documents the hot-reload behaviour, restart-only key list, and the agent-don't-restart guidance from #547 axis 1. Axis 3 (drain self-restart heuristic) deferred to v0.35.4: the obvious "detect the active session ran systemctl" heuristic is fragile and inverts cleanly to false-positive on legitimate sibling-unit restarts. The robust path needs a bigger refactor. Axes 1+2 together break the recurring pattern at its source. Closes #548. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Rapid taps (e.g. approving plans in two chats inside ~2s) saw callback-answer latency escalate 6-10x: 1st click ~220ms HTTP baseline, 2nd/3rd clicks 1.4-2.9s. Root cause was that `answer_callback_query` went through `enqueue_op(chat_id=None)` and stacked on the shared `_next_at[None]` per-chat pacing bucket (private_interval=1.0s) even though Telegram doesn't rate-limit callback-answers per chat; they're keyed off callback-query-id. Route `answer_callback_query` directly through `self._client.answer_callback_query`, bypassing the outbox semaphore + per-chat pacing. Retry-after handling preserved (one retry on TelegramRetryAfter then fail-fast, which is better than silent retry loops since the spinner expires after 30s anyway). Add `queue_wait_ms=0.0` field to `callback.answered` instrumentation so monitoring dashboards can confirm the bypass survived future refactors. Regression test asserts the outbox.enqueue path is never reached during answer_callback_query. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
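The "one retry, then fail fast" policy described above might be sketched like this. `RetryAfterError` is a hypothetical stand-in for the client's `TelegramRetryAfter` exception, and `answer` stands in for a bound call to `self._client.answer_callback_query`; neither reflects the project's actual types.

```python
import asyncio
from typing import Awaitable, Callable


class RetryAfterError(Exception):
    """Hypothetical stand-in for a rate-limit error carrying a wait time."""

    def __init__(self, retry_after: float) -> None:
        super().__init__(f"retry after {retry_after}s")
        self.retry_after = retry_after


async def answer_callback_fast(answer: Callable[[], Awaitable[None]]) -> bool:
    """Call `answer` directly (no queue), retrying once on a rate limit.

    Returns True if the answer went through, False if it was abandoned.
    """
    try:
        await answer()
        return True
    except RetryAfterError as exc:
        await asyncio.sleep(exc.retry_after)
    try:
        await answer()
        return True
    except RetryAfterError:
        # Fail fast: the button spinner expires on its own, so looping adds nothing.
        return False
```

Nothing here touches a shared pacing bucket, so one chat's callback answer cannot be delayed by another chat's queued operations.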
Bundles 9 issues from the recent /monitor campaign (lba-1 staging, 2026-05-13 through 2026-05-16 runs) + the daemon-filed #533 dup:
- #528: `Answered:` echo no longer truncated to 100 chars
- #525: dedup cancel.requested triple-fire
- #532: consolidate per-engine setup.warning to one summary
- #547 axis 1: preamble warns agents against systemctl restart
- #523: recognise leading-dot slash-command typos (`.new` etc.)
- #524: surface skipped outbox entries (directories etc.) in chat
- #526: demote stall WARN to INFO during approval-pending (+ #533)
- #547 axes 2+3 + #548: hot-reload Telegram notification
- #546: bypass outbox for answer_callback_query (latency fix)

Plus housekeeping: closed #544 / #497 (verified already fixed in rc16 / rc14), closed #531 (label + monitor TOML config drift fixed out-of-tree). Axis 3 of #547 (drain self-restart heuristic) deferred to v0.35.4; it needs a bigger refactor than rc19 wants. #527 (umbrella predicate refactor) deferred to v0.35.4 per user decision.

2737 tests pass / 82.58% coverage / ruff format + check clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Nathan Schram (nathanschram) added a commit that referenced this pull request on May 19, 2026
) Both issues shipped rc19 fixes (PR #555) but /monitor audits on 2026-05-18 showed each regression still firing in production because the rc19 patch landed in only one of two code paths.

#524: outbox silently drops directory entries

rc19 surfaced 📎 Outbox skipped notices on the normal-completion path in handle_message but missed two adjacent paths: the pre-auto-continue delivery (subprocess 1 stuck-after-tool-result recovery) and the run_ok=False failed-run branch. Both still silently dropped the agent's intended deliverable. This commit extracts the surfacing logic into _surface_outbox_skipped in runner_bridge.py and wires it into both gap paths. On a failed run the code still skips the actual file send (preserving the original gating) but does a cheap scan_outbox() to collect skipped items and surface them, so the user always learns what the agent intended to ship. Honours the existing outbox_notify_skipped config flag and filters the "..." overflow pseudo-entry from the user-facing block.

#526: approval-pending stalls misclassified

rc19 demoted the bridge-side WARN (progress_edits.stall_detected) to a paced INFO (subprocess.approval_pending) when _has_pending_approval() returned true. The watchdog-side detector in runner.py (which emits subprocess.liveness_stall and is the actual signal untether-issue-watcher auto-files on) was untouched, so the daemon kept filing GitHub issues on routine approval-pending sessions and the nsd audit (2026-05-18) showed a user cancelling a productive 15-minute investigation because the chat-side reassurance came too late (1800s threshold). This commit:
- Adds _recent_event_is_control_request helper in runner.py, which uses the stream's recent_events ring buffer as the approval-pending signal, consistent with the bridge's inline-keyboard predicate but accessible to runner-scope code.
- Plumbs the predicate into _watchdog_loop: when the last JSONL event is control_request, emit subprocess.approval_pending INFO instead of liveness_stall WARN. Skip the auto-kill branch entirely. Pace INFO emission to once per 30 min via the shared _APPROVAL_PENDING_REFIRE_S constant (now defined once in runner.py and imported by the bridge).
- Splits _STALL_THRESHOLD_APPROVAL into _STALL_THRESHOLD_APPROVAL_FIRST (600s) and the existing 1800s refire so the user gets a reassuring "tap a button above" chat message at 10 min on first occurrence, matching the watchdog's liveness threshold and avoiding the nsd-style early cancellation.
- Rewords the chat-side approval reminder copy to make the "tap a button above to proceed (no action needed otherwise)" affordance explicit, directly quoting the audit's recommended text.

Tests cover both code paths:
- tests/test_outbox_delivery.py (existing): format helper + settings default unchanged; no new file-level tests needed.
- tests/test_exec_bridge.py: failed-run surfacing, notify_skipped=false suppression, only-overflow filter, two-tier first-reminder threshold, reworded copy.
- tests/test_exec_runner.py: predicate truth-table coverage, watchdog demotion via integration with a fake codex script emitting control_request, watchdog WARN still fires when no control_request is recent.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
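The watchdog predicate described in the commit above could be approximated as below. The event shape (dicts with a `"type"` key) and the use of a bounded `deque` as the ring buffer are assumptions; the real `_recent_event_is_control_request` is not shown in this PR.

```python
from collections import deque
from typing import Any, Deque, Dict

Event = Dict[str, Any]  # assumed: one parsed JSONL event per dict


def recent_event_is_control_request_sketch(recent_events: Deque[Event]) -> bool:
    """True when the newest buffered event is a control_request.

    A session whose last event is a control_request is waiting on the user,
    not hung, so the watchdog can log an INFO instead of a liveness WARN.
    """
    if not recent_events:
        return False
    return recent_events[-1].get("type") == "control_request"


# A bounded deque behaves as a ring buffer: old events fall off the left.
events: Deque[Event] = deque(maxlen=3)
events.append({"type": "tool_result"})
events.append({"type": "control_request"})
```

Looking only at the newest event means any later activity (for example a tool result after approval) immediately restores normal stall detection.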
Summary
Bundles 9 OPEN issues from the recent `/monitor` campaign (lba-1 staging, runs 2026-05-13 through 2026-05-16) into one rc19 staged for fleet rollout, plus the daemon-filed #533 duplicate.

User-confirmed scope decisions: all in-scope issues moved into v0.35.3 milestone; #527 (umbrella stall predicate refactor) explicitly deferred to v0.35.4.
Already fixed in master (closed out separately)
Bundled here
Commits: d175be9, f558235, a9dc234, 12cf4ca, b02c20d, fde1bd8, 66ba78f, 40dc6b7, ee03b32, 12c2d71 (surviving description fragments: `systemctl restart untether`; `.new`, `.usage`, …)

Deferred to v0.35.4
Pre-PR validation
- `uv run pytest`: 2737 passed / 2 skipped (1 skipped is pre-existing)
- `uv run ruff check src/ tests/`: clean
- `uv run ruff format --check src/ tests/`: clean
- `uv lock --check`: clean

Test plan
- dev: integration tests on `@untether_dev_bot` (claude chat-5284581592):
  - `cancel.requested` (#525)
  - `/new` then any prompt → assistant transcript contains the new preamble sentence (#547 axis 1)
  - `.new` → reply offers `/new` correction (#523)
  - `~/.untether-dev/untether.toml` → confirm "♻️ Hot-reloaded ... No restart needed." message appears in chat (#547 axis 2 / #548); journal shows ONE `setup.summary` INFO (no per-engine WARN, #532)
  - `.untether-outbox/guides/` (dir) + `.untether-outbox/x.md` (file) → final message includes "📎 Outbox skipped: guides/ (directory)" + delivers x.md (#524)
  - `subprocess.liveness_stall` WARNs; `subprocess.approval_pending` INFO fires once per 30 min (#526)
  - `callback.answered.latency_ms` stay near the ~220ms baseline (#546)
- `scripts/run-integration-tests.sh 0.35.3rc19 --manual` attestation written
- `scripts/fleet-rollout.sh 0.35.3rc19` to all 4 hosts (lba-1 + nsd + channelo + mac); /ping smoke on each

🤖 Generated with Claude Code