fix: #515 — rc13 plan-summary over-fire (CLI-style brevity restored)#516
Merged
Nathan Schram (nathanschram) merged 1 commit on May 12, 2026
Closes the rc11/rc12 over-correction on #508 that produced 25k–42k char (~8–12 Telegram message) finals on staging plan-mode research/audit runs.

User report (Nathan, 2026-05-12): "I had a summary from Claude Code yesterday which was 11 Telegram messages long!! What I really want back is to have Claude Code provide summaries like we have here in command line - summaries of plans (not the entire plan), summaries of recommendations and/or findings and/or next steps (where relevant)."

Three stacked over-shoots in rc11/rc12:

1. A1 preamble: "expand the bullets into a substantive summary" for research/audit → plan body ballooned to 2–5k chars.
2. A2 preamble: "your next assistant message ... MUST repeat the substantive findings" → post-approval text ballooned to 0.5–2k chars AND was paraphrased rather than literal-copied.
3. Layer E: substring-skip rule (body in final_answer) failed on every paraphrased run, so the plan body was unconditionally concatenated in front of the post-approval text.

Evidence from `journalctl --user -u untether.service` (last 48h on staging @hetz_lba1_bot v0.35.3rc12): aushistory finals at 14k / 16k / 28k / 35k / 42k chars; scout finals at 26k / 27k chars. The 42k case matches the 11-message user repro. Telegram MCP `search_messages` for the literal "📋 Plan (approved):" returned hits on every recent plan-mode completion in both chats, confirming Layer E was the load-bearing over-firer.

rc13 retuning:

- A1 → "concise 3–5 bullet summary; plan is shown for approval, not as the final deliverable" (drops the substantive-expansion license).
- A2 → "brief CLI-style summary, 3–7 bullets or 1–2 short paragraphs, ~500–1500 chars, do NOT re-paste the full plan content".
- A3 (## Summary Plan/Document Created bullet) → "Path AND a 3–5 bullet headline summary, not a re-paste of the full content". Note: A3 affects the ## Summary block on ALL completed work, not just plan-mode runs; this is intentional and matches the user's stated goal.
- _prepend_exitplanmode_plan: substring check replaced with a length gate (`len(final_answer) < 600`). Substring check stays as a cheap belt-and-braces second skip. Plan body is capped at 1500 chars + truncation marker so a runaway body can't ship 30k chars even when Layer E does fire (preserves original #508 UX for genuinely empty post-approval results without re-introducing concatenation).

Live verification on @untether_dev_bot (test chat -5284581592):

- Primed test (with "keep it short" instruction): answer_len=882 chars (~1 Telegram message), no "📋 Plan (approved):" literal.
- Unprimed test (default research-task prompt): answer_len=1019 chars; the preamble is doing its job without user help. Layer E correctly skipped (1019 > 600). Quality verified: 3 substantive bullets + ## Summary block with Completed / Next Steps.

The original #508 fallback path (Claude exits with very short post-approval text → Layer E fires with capped plan body) is unit-tested only; not live-verified because the new preamble makes it almost impossible to repro intentionally.

Tests: 7 new/updated in tests/test_preamble.py (regression-locks the rc11 verbosity-driving phrases out of _DEFAULT_PREAMBLE, plus length-gate / body-cap / substring-skip cases) and 2 in tests/test_claude_runner.py (`test_translate_result_skips_prepend_when_answer_substantive`, `test_translate_result_caps_long_plan_body_when_prepending`).

Full suite: 2652 passed, 2 skipped, 82.38% coverage. ruff format + check clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
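The Layer E retuning described above (length gate, secondary substring skip, capped plan body) can be sketched roughly as follows. This is a minimal illustration, not the actual `_prepend_exitplanmode_plan` code: the function name `prepend_plan_sketch`, its signature, the constant names, the truncation marker wording, and the order of the checks are all assumptions. Only the 600 and 1500 thresholds and the "📋 Plan (approved):" literal come from the commit message.

```python
from typing import Optional

PLAN_HEADER = "📋 Plan (approved):"         # literal quoted in the commit message
SHORT_ANSWER_LIMIT = 600                    # from the PR: len(final_answer) < 600
PLAN_BODY_CAP = 1500                        # from the PR: body capped at 1500 chars
TRUNCATION_MARKER = "\n… (plan truncated)"  # hypothetical marker wording


def prepend_plan_sketch(final_answer: str, plan_body: Optional[str]) -> str:
    """Hypothetical stand-in for the gating logic, not the real function."""
    if not plan_body:
        return final_answer
    # Length gate: a substantive post-approval answer is shipped as-is.
    if len(final_answer) >= SHORT_ANSWER_LIMIT:
        return final_answer
    # Cheap secondary skip: the plan is already contained in the answer.
    if plan_body in final_answer:
        return final_answer
    # Cap the body so a runaway plan cannot dominate the final message.
    body = plan_body
    if len(body) > PLAN_BODY_CAP:
        body = body[:PLAN_BODY_CAP] + TRUNCATION_MARKER
    block = f"{PLAN_HEADER}\n{body}"
    return f"{block}\n\n{final_answer}" if final_answer else block
```

Under these assumptions, the 1019-char unprimed answer from the live test would pass through untouched, while a near-empty post-approval answer would still get the (capped) plan in front of it.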
Summary

- `search_messages` for "📋 Plan (approved):" hit on every recent plan-mode completion in both chats.

Changes

- `src/untether/runner_bridge.py` `_DEFAULT_PREAMBLE`: A1 → "concise 3–5 bullets; not as final deliverable"; A2 → "brief CLI-style summary, ~500–1500 chars, do NOT re-paste full plan content"; A3 → "Path AND a 3–5 bullet headline summary, not a re-paste".
- `src/untether/runners/claude.py` `_prepend_exitplanmode_plan`: substring-skip replaced with `len(final_answer) < 600` length gate (substring check kept as cheap secondary skip); plan body capped at 1500 chars + truncation marker when Layer E fires.
- `uv lock` synced.
- `tests/test_preamble.py` (regression-locks rc11 verbosity-driving phrases out, plus length-gate / body-cap / substring-skip cases); 2 new tests in `tests/test_claude_runner.py`.

Wider blast radius note

A3's reworded `## Summary` `### Plan/Document Created` bullet affects all completed Claude runs, not just plan-mode. This is intentional: the user explicitly asked for "summaries like we have here in command line", which is the broader goal. If a future case needs the older verbose shape, the right answer is per-chat `/verbose` (already exists) rather than reverting A3.

Test plan

- `uv run pytest`: 2652 passed, 2 skipped, 82.38% coverage
- `uv run ruff format --check src/ tests/`: clean
- `uv run ruff check src/ tests/`: clean
- `uv lock --check` (lockfile synced)
- `@untether_dev_bot`: research-task prompt with "keep it short" → 882 chars, no `📋 Plan (approved):` literal, clean CLI summary
- `@untether_dev_bot`: default research-task prompt (no brevity hint) → 1019 chars; preamble does its job on its own
- `test_prepend_exitplanmode_plan_when_final_answer_short` + `test_translate_result_caps_long_plan_body_when_prepending` cover it

Targets

- `scripts/staging.sh install 0.35.3rc13`)
- `dev` branch

🤖 Generated with Claude Code
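The "regression-lock" idea from the Changes list (keeping the rc11 verbosity-driving phrases out of the preamble) might look something like the sketch below. The helper `find_banned_phrases` and the sample text `RC13_STYLE_SAMPLE` are hypothetical and stand in for the real `_DEFAULT_PREAMBLE` and the real tests in `tests/test_preamble.py`, which are not shown in this PR page. The two banned phrases are the ones quoted in the commit message.

```python
# Phrases the commit message identifies as driving rc11/rc12 verbosity.
BANNED_PHRASES = (
    "expand the bullets into a substantive summary",
    "MUST repeat the substantive findings",
)

# Hypothetical stand-in for the preamble text; the real _DEFAULT_PREAMBLE
# lives in src/untether/runner_bridge.py and is not reproduced here.
RC13_STYLE_SAMPLE = (
    "Write a concise 3-5 bullet summary; the plan is shown for approval, "
    "not as the final deliverable. Do NOT re-paste the full plan content."
)


def find_banned_phrases(preamble: str, banned=BANNED_PHRASES) -> list:
    """Return every banned phrase found in the preamble (case-insensitive)."""
    lowered = preamble.lower()
    return [phrase for phrase in banned if phrase.lower() in lowered]


def test_preamble_has_no_rc11_phrases():
    # A pytest-style check: fails if a verbosity-driving phrase creeps back in.
    assert find_banned_phrases(RC13_STYLE_SAMPLE) == []
```

In a real suite the test would presumably import the actual preamble constant instead of a sample string; that wiring is left out because the module layout is only partially visible here.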