fix: #515 — rc13 plan-summary over-fire (CLI-style brevity restored) by nathanschram · Pull Request #516 · littlebearapps/untether

Nathan Schram (nathanschram) · 2026-05-12T04:16:25Z

Summary

Closes #515 (rc11/rc12 #508 fix over-fires: 11-message Telegram finals on research/audit runs (plan body + post-approval text both bloated + Layer E concatenates)). The rc11/rc12 fix for #508 (final Telegram message becomes too short for research/investigation tasks under plan mode) over-fired, producing 25k–42k char (~8–12 Telegram message) finals on plan-mode research/audit runs. rc13 retunes the preamble and Layer E to CLI-style brevity (3–7 bullets, ~500–1500 chars).
Three stacked over-shoots in rc11/rc12: A1 told Claude to "expand bullets into substantive summary"; A2 told Claude to "MUST repeat substantive findings"; Layer E's substring-skip check failed on every paraphrased run, concatenating the full plan body.
Live evidence (staging @hetz_lba1_bot, 48h log window): aushistory 42,634-char final = the user's 11-message repro; scout 27,558 chars; Telegram MCP search_messages "📋 Plan (approved):" hit on every recent plan-mode completion in both chats.
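
As a rough cross-check of the character counts above against message counts, the arithmetic can be sketched as follows. This assumes a naive split at Telegram's 4096-character per-message text limit; the helper name is hypothetical, and Untether's actual splitter may break at paragraph boundaries or use a smaller chunk size, so real counts can be higher.

```python
import math

# Telegram's per-message text limit; a naive fixed-size split is assumed here.
TELEGRAM_MESSAGE_LIMIT = 4096


def estimated_message_count(chars: int, limit: int = TELEGRAM_MESSAGE_LIMIT) -> int:
    """Lower-bound estimate of how many messages a final of `chars` characters needs."""
    return max(1, math.ceil(chars / limit))


if __name__ == "__main__":
    # Character counts quoted in this PR.
    for label, chars in [
        ("aushistory (rc12)", 42_634),
        ("scout (rc12)", 27_558),
        ("primed test (rc13)", 882),
        ("unprimed test (rc13)", 1_019),
    ]:
        print(f"{label}: {chars} chars -> {estimated_message_count(chars)} message(s)")
```

Under this assumption the 42,634-char final works out to 11 messages, consistent with the user's 11-message repro, while both rc13 test runs fit in a single message.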

Changes

src/untether/runner_bridge.py _DEFAULT_PREAMBLE: A1 → "concise 3–5 bullets; not as final deliverable"; A2 → "brief CLI-style summary, ~500–1500 chars, do NOT re-paste full plan content"; A3 → "Path AND a 3–5 bullet headline summary, not a re-paste".
src/untether/runners/claude.py _prepend_exitplanmode_plan: substring-skip replaced with len(final_answer) < 600 length gate (substring check kept as cheap secondary skip); plan body capped at 1500 chars + truncation marker when Layer E fires.
Version bump 0.35.3rc12 → rc13; CHANGELOG entry; uv lock synced.
7 new/updated tests in tests/test_preamble.py (regression-locks rc11 verbosity-driving phrases out, plus length-gate / body-cap / substring-skip cases); 2 new tests in tests/test_claude_runner.py.
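
The Layer E behaviour described above can be sketched roughly as below. This is not the code from src/untether/runners/claude.py: the function name, the constant names, the truncation-marker text, and the order of the two skip checks are assumptions for illustration. Only the 600-char gate, the 1500-char cap, and the "📋 Plan (approved):" literal come from this PR.

```python
PLAN_HEADER = "📋 Plan (approved):"        # literal quoted in this PR
MIN_SUBSTANTIVE_CHARS = 600                 # length gate from this PR
MAX_PLAN_BODY_CHARS = 1500                  # body cap from this PR
TRUNCATION_MARKER = "\n[plan truncated]"    # assumed wording, not the real marker


def prepend_plan_sketch(final_answer: str, plan_body: str) -> str:
    """Prepend the approved plan only when the final answer is too short to stand alone."""
    body = plan_body.strip()
    if not body:
        return final_answer

    # Primary skip: an answer of 600+ chars is treated as substantive.
    if len(final_answer) >= MIN_SUBSTANTIVE_CHARS:
        return final_answer

    # Secondary skip: the plan is already present verbatim in the answer.
    if body in final_answer:
        return final_answer

    # Cap the body so a runaway plan cannot dominate the final message.
    if len(body) > MAX_PLAN_BODY_CHARS:
        body = body[:MAX_PLAN_BODY_CHARS] + TRUNCATION_MARKER

    return f"{PLAN_HEADER}\n{body}\n\n{final_answer}"
```

The length gate is the important design change: a substring test only matches when Claude copies the plan literally, so paraphrased post-approval text never triggered the skip, whereas a length check does not depend on the wording at all.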

Wider blast radius note

A3's reworded ## Summary ### Plan/Document Created bullet affects all completed Claude runs, not just plan-mode. This is intentional — the user explicitly asked for "summaries like we have here in command line", which is the broader goal. If a future case needs the older verbose shape, the right answer is per-chat /verbose (already exists) rather than reverting A3.

Test plan

uv run pytest — 2652 passed, 2 skipped, 82.38% coverage
uv run ruff format --check src/ tests/ — clean
uv run ruff check src/ tests/ — clean
uv lock --check (lockfile synced)
Live integration test (primed) on @untether_dev_bot: research-task prompt with "keep it short" → 882 chars, no 📋 Plan (approved): literal, clean CLI summary
Live integration test (unprimed) on @untether_dev_bot: default research-task prompt (no brevity hint) → 1019 chars; preamble does its job on its own
Fallback path (Claude exits with brief post-approval text → Layer E fires with capped plan body): unit-tested only — not live-verified because the new preamble makes the empty post-approval path almost impossible to repro intentionally; test_prepend_exitplanmode_plan_when_final_answer_short + test_translate_result_caps_long_plan_body_when_prepending cover it
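
For readers unfamiliar with the "regression-lock" style of test mentioned in the Changes section, a minimal sketch follows. It does not reproduce tests/test_preamble.py: SAMPLE_PREAMBLE is a stand-in assembled from the phrases quoted in this PR rather than the real _DEFAULT_PREAMBLE, and the helper and test names are hypothetical.

```python
# Phrases from the rc11/rc12 preamble that this PR identifies as driving verbosity.
BANNED_RC11_PHRASES = (
    "expand the bullets into a substantive summary",
    "MUST repeat the substantive findings",
)

# Stand-in text for illustration only; NOT the real _DEFAULT_PREAMBLE.
SAMPLE_PREAMBLE = (
    "When presenting a plan, give a concise 3-5 bullet summary; "
    "the plan is shown for approval, not as the final deliverable. "
    "After approval, reply with a brief CLI-style summary "
    "(~500-1500 chars); do NOT re-paste the full plan content."
)


def find_banned_phrases(preamble: str, banned=BANNED_RC11_PHRASES) -> list[str]:
    """Return every banned phrase that appears in the preamble (case-insensitive)."""
    lowered = preamble.lower()
    return [phrase for phrase in banned if phrase.lower() in lowered]


def test_preamble_has_no_rc11_verbosity_phrases():
    # Fails loudly if a later edit reintroduces one of the rc11 phrases.
    assert find_banned_phrases(SAMPLE_PREAMBLE) == []
```

In the real suite the assertion would presumably run against the preamble imported from src/untether/runner_bridge.py, so that reverting the wording breaks the build rather than resurfacing on staging.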

Targets

v0.35.3rc13 (TestPyPI on dev merge, then staging via scripts/staging.sh install 0.35.3rc13)
Aimed to land alongside the other rc12/rc13 fixes on the dev branch

🤖 Generated with Claude Code

Closes the rc11/rc12 over-correction on #508 that produced 25k–42k char (~8–12 Telegram message) finals on staging plan-mode research/audit runs.

User report (Nathan, 2026-05-12): "I had a summary from Claude Code yesterday which was 11 Telegram messages long!! What I really want back is to have Claude Code provide summaries like we have here in command line - summaries of plans (not the entire plan), summaries of recommendations and/or findings and/or next steps (where relevant)."

Three stacked over-shoots in rc11/rc12:
1. A1 preamble: "expand the bullets into a substantive summary" for research/audit → plan body ballooned to 2–5k chars.
2. A2 preamble: "your next assistant message ... MUST repeat the substantive findings" → post-approval text ballooned to 0.5–2k chars AND was paraphrased rather than literal-copied.
3. Layer E: substring-skip rule (body in final_answer) failed on every paraphrased run, so the plan body was unconditionally concatenated in front of the post-approval text.

Evidence from `journalctl --user -u untether.service` (last 48h on staging @hetz_lba1_bot v0.35.3rc12): aushistory finals at 14k / 16k / 28k / 35k / 42k chars; scout finals at 26k / 27k chars. The 42k case matches the 11-message user repro. Telegram MCP `search_messages` for the literal "📋 Plan (approved):" returned hits on every recent plan-mode completion in both chats, confirming Layer E was the load-bearing over-firer.

rc13 retuning:
- A1 → "concise 3–5 bullet summary; plan is shown for approval, not as the final deliverable" (drops the substantive-expansion license).
- A2 → "brief CLI-style summary, 3–7 bullets or 1–2 short paragraphs, ~500–1500 chars, do NOT re-paste the full plan content".
- A3 (## Summary Plan/Document Created bullet) → "Path AND a 3–5 bullet headline summary, not a re-paste of the full content". Note: A3 affects the ## Summary block on ALL completed work, not just plan-mode runs. This is intentional and matches the user's stated goal.
- _prepend_exitplanmode_plan: substring check replaced with a length gate (`len(final_answer) < 600`). Substring check stays as a cheap belt-and-braces second skip. Plan body is capped at 1500 chars + truncation marker so a runaway body can't ship 30k chars even when Layer E does fire (preserves original #508 UX for genuinely empty post-approval results without re-introducing concatenation).

Live verification on @untether_dev_bot (test chat -5284581592):
- Primed test (with "keep it short" instruction): answer_len=882 chars (~1 Telegram message), no "📋 Plan (approved):" literal.
- Unprimed test (default research-task prompt): answer_len=1019 chars; the preamble is doing its job without user help. Layer E correctly skipped (1019 > 600). Quality verified: 3 substantive bullets + ## Summary block with Completed / Next Steps.

The original #508 fallback path (Claude exits with very short post-approval text → Layer E fires with capped plan body) is unit-tested only; not live-verified because the new preamble makes it almost impossible to repro intentionally.

Tests: 7 new/updated in tests/test_preamble.py (regression-locks the rc11 verbosity-driving phrases out of _DEFAULT_PREAMBLE, plus length-gate / body-cap / substring-skip cases) and 2 in tests/test_claude_runner.py (`test_translate_result_skips_prepend_when_answer_substantive`, `test_translate_result_caps_long_plan_body_when_prepending`).

Full suite: 2652 passed, 2 skipped, 82.38% coverage. ruff format + check clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-05-12T04:16:32Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 47d3127b-ab55-462b-935b-df840e6bb526


Nathan Schram (nathanschram) mentioned this pull request May 12, 2026

rc11/rc12 #508 fix over-fires: 11-message Telegram finals on research/audit runs (plan body + post-approval text both bloated + Layer E concatenates) #515

Open

Nathan Schram (nathanschram) merged commit 1b192f1 into dev May 12, 2026
20 of 21 checks passed

Nathan Schram (nathanschram) deleted the fix/v0.35.3rc13-plan-summary-overfire branch May 12, 2026 04:24

Nathan Schram (nathanschram) merged 1 commit into dev from fix/v0.35.3rc13-plan-summary-overfire
