final Telegram message becomes too short for research/investigation tasks under plan mode #508

## Symptom

Claude Code research/audit/investigation runs in plan mode are producing very short final Telegram messages — typically just a file path pointer to `~/.claude/plans/*.md` instead of the actual findings. The user is on a phone and cannot easily open the file from Telegram.

User report (Nathan, 2026-05-10): _"more & more recently when I ask Claude Code to investigate/look into/research something, the final response from Claude Code is quite short & does not outline Claude Code's findings/recommendations/plan at all … this part of Untether has been working well and still does sometimes, so it just needs investigation & fine-tuning/updates."_

Reference screenshot: `incoming/file_7604.jpg` (5m30s scout-project research run, final chat message = 584 chars).

**Investigation plan file (local):** `/home/nathan/.claude/plans/hello-claude-please-cuddly-kitten.md` — full diagnosis, `pal` consensus reasoning (gpt-5.2 + gemini-3.1-pro), advisor pass, fix design with file paths and line numbers, tests, verification steps.

## Repro (live)

- Staging `@hetz_lba1_bot`, v0.35.3rc10
- Plan mode ON for the chat
- Send any research/investigation/audit prompt (e.g. "check for ALL X and tell me Y")
- Claude does the research, saves findings to `~/.claude/plans/<topic>.md`, calls `ExitPlanMode` with a brief plan body, user clicks **Approve**
- Final message in chat is ~500–700 chars and just points to the file

## Concrete evidence

Session `dde3c528-45e9-4dfb-9111-c3943f802eb8` (2026-05-10 14:11–14:16 UTC, scout project):

- 16 turns / 5m30s of work
- Plan mode ON (footer showed `plan`)
- Claude saved findings to `~/.claude/plans/untether-you-are-running-inherited-hummingbird.md`
- `ExitPlanMode` plan body was a brief acknowledgement (≈580 chars: _"Plan approved — research is complete. The audit lives at … No code work is implied …"_)
- User clicked Approve at 14:16:12; session exited rc=0 at 14:16:32 (20 s post-approval, no further work)
- `session.summary … last_event_type=control_response ok=True num_turns=16`
- `runner.completed … action_count=18 answer_len=584 ok=True`
- The 584-char answer was extracted via the empty-`result` fallback at `src/untether/runners/claude.py:1149-1153` (Claude emitted a `result` event with an empty `result` field, so Untether fell back to `state.last_assistant_text`).
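
To make the fallback in the last bullet concrete, here is a minimal sketch of how an empty `result` field leads to the last assistant text being surfaced. It is not the code at `claude.py:1149-1153`: `StreamState` and `extract_final_answer` are hypothetical names, and the event is modelled as a plain dict for illustration only.

```python
from dataclasses import dataclass


@dataclass
class StreamState:
    """Hypothetical stand-in for the runner's per-session stream state."""

    last_assistant_text: str = ""


def extract_final_answer(result_event: dict, state: StreamState) -> str:
    """Prefer the `result` field; fall back to the last assistant text."""
    result_text = (result_event.get("result") or "").strip()
    if result_text:
        return result_text
    # Empty `result`: surface whatever the assistant said last, which in the
    # research case is only the brief post-approval acknowledgement.
    return state.last_assistant_text


state = StreamState(last_assistant_text="Plan approved. Research is complete.")
print(extract_final_answer({"type": "result", "result": ""}, state))
```

Under this model the fallback itself behaves correctly; the answer is short only because the last assistant text is short.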

### Comparison sessions in the same time window

| Session | Task type | Plan mode | Post-approval work | `last_event_type` | `answer_len` |
|---|---|---|---|---|---|
| `dde3c528…` (scout) | **Research** | yes | 20 s, none | `control_response` | **584** ❌ |
| `76bad108…` (auditor-toolkit) | Code | yes | ~2 min code work | `assistant` | 1683 ✅ |
| `dde3c528…` resumed by Nathan saying _"outline findings here"_ | Direct ask | no | single turn | `result` | **4466** ✅ |

The third row shows that Claude is fully capable of writing detailed findings to chat. The bug is not in extraction or delivery; it is that **plan-mode + research = no post-approval substantive turn**.
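
The pattern in the table (a `control_response` as the last event plus a short `answer_len`) could be used to flag affected sessions in logs. The sketch below assumes the log lines carry `key=value` pairs as in the excerpts above; `parse_fields`, `looks_truncated`, and the 1000-char cut-off are hypothetical and chosen only to separate the three rows shown.

```python
import re

# Matches key=value pairs such as `answer_len=584` or `ok=True`.
_PAIR = re.compile(r"(\w+)=(\S+)")


def parse_fields(log_line: str) -> dict[str, str]:
    """Collect every key=value pair found in a log line."""
    return dict(_PAIR.findall(log_line))


def looks_truncated(summary_line: str, completed_line: str, min_len: int = 1000) -> bool:
    """Flag a session whose last event was the approval and whose answer is short."""
    summary = parse_fields(summary_line)
    completed = parse_fields(completed_line)
    return (
        summary.get("last_event_type") == "control_response"
        and int(completed.get("answer_len", "0")) < min_len
    )


print(
    looks_truncated(
        "session.summary … last_event_type=control_response ok=True num_turns=16",
        "runner.completed … action_count=18 answer_len=584 ok=True",
    )
)
```

A check like this would also help measure whether the fixes below close the gap once they ship.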

## Root cause

Plan-mode workflow assumes the post-approval phase produces the substantive final assistant text. For research tasks where the deliverable IS the research (no code work follows), Claude:

1. Performs the research (16 turns over 5m30s)
2. Writes findings to `~/.claude/plans/<topic>.md`
3. Calls `ExitPlanMode` with a brief plan body that just confirms research is complete and references the file
4. After approval, has nothing left to do — exits with empty `result`
5. Untether's fallback to `last_assistant_text` (claude.py:1149-1153) surfaces only the brief plan-body acknowledgement

This is **expected upstream Claude Code design**, not an upstream bug — the issue is the Untether/Telegram environment where files cannot easily be opened by the user.

### Why this seems "more recent"

The likely cause is model-behaviour drift (newer Claude Opus is more inclined to organise findings into well-structured plan files), amplified by Untether's preamble, which explicitly mentions `~/.claude/plans/` paths and a "concise summary". Claude reads that as "file path + concise note is enough." No specific Untether regression commit has been identified.

### Critical mechanism finding (advisor pass)

The `ExitPlanMode` plan body is **ephemeral**: as confirmed at `src/untether/runner_bridge.py:2153-2172`, outline messages are deleted from Telegram when approval is resolved. A preamble fix that only targets the plan body therefore cannot reach the user's permanent timeline; the only thing the user retains post-approval is whatever arrives via the final `runner.completed` answer.

The dead `_outline_prefix = "Plan outline:\n"` matcher at `runner_bridge.py:2952-2964` was intended to surface the plan body in the final answer but matches no action title produced anywhere in the codebase. It needs reimplementing, not just restoring.

## Fix (consensus from gpt-5.2 + gemini-3.1-pro via `pal` MCP, plus advisor)

Both consulted models agreed the diagnosis is correct. Options C (synthetic post-approval stdin nudge) and D (prompt-shape intent detection) were rejected as too risky/fragile. Option B (auto-attach plan files to outbox) was reordered behind A+E after the advisor identified the plan-body ephemerality.

### Fix A — Preamble revision (low risk, fast)

Update `_DEFAULT_PREAMBLE` at `src/untether/runner_bridge.py:291-318` with two targeted clauses:

**(A1) ExitPlanMode plan body shape:**
> When calling `ExitPlanMode`, your `plan` parameter MUST include a 3–5 bullet point summary of your findings, decisions, or proposed changes — never just a file path. The user is on Telegram and cannot easily open files. For code-change tasks keep it concise; for research/audit tasks where no further work is expected after approval, expand the bullets into a substantive summary.

**(A2) Post-approval assistant text shape:**
> After `ExitPlanMode` is approved, your next assistant message — which becomes the user's final Telegram message — MUST repeat the substantive findings or decisions. Do not just write "Plan approved" or "research complete, see file X". The plan-body messages on Telegram disappear after approval, so your post-approval text is the only thing the user retains.

Also reword the existing Summary section's `### Plan/Document Created (if applicable)` bullet to say "include the key findings inline; do not require the user to open the file."

### Fix E — Capture & re-emit the ExitPlanMode plan body (medium risk, the load-bearing fix)

Replace the dead-code prepend at `runner_bridge.py:2952-2964` with a working implementation:

1. **Capture in the runner** (`src/untether/runners/claude.py`): in the `StreamAssistantMessage` translation arm where tool_use blocks are observed, when `tool_name == "ExitPlanMode"`, persist `tool_input.get("plan")` onto a new `ClaudeStreamState.last_exitplanmode_plan: str | None` field. (Mirrors the existing `state.outline_text` pattern at `claude.py:261`/`1084`, which is `_OUTLINE_PENDING`-gated and so unsuitable for the regular Approve flow.)
2. **Surface in the bridge** (`src/untether/runner_bridge.py:2946-2964`): replace the dead `_outline_prefix` matcher with a read of `engine_state.last_exitplanmode_plan` (via duck-typed `getattr`, runner-agnostic). If `final_answer` is empty or shorter than a threshold (~200 chars), and the plan body is substantive, prepend the plan body to `final_answer` with a separator, e.g.:
   ```
   📋 Plan (approved):

   <plan body>

   ---

   <final_answer>
   ```
3. **Suppress duplicate display:** if `final_answer` already contains the plan body (substring match), skip the prepend. This covers the case where Fix A2 caused Claude's post-approval text to repeat the plan content.
4. **Cleanup:** clear `last_exitplanmode_plan` in the existing `claude_runner.session_cleanup` step.
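
The four steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not the final implementation: `ClaudeStreamState` is trimmed to the one proposed field, `capture_plan`, `merge_plan`, and `clear_plan` are hypothetical helper names, and `MIN_PLAN_LEN` is an invented cut-off for "substantive" (the issue does not define one).

```python
from dataclasses import dataclass
from typing import Any, Optional

SHORT_ANSWER_THRESHOLD = 200  # chars; the issue suggests roughly 200
MIN_PLAN_LEN = 80  # hypothetical cut-off for a "substantive" plan body


@dataclass
class ClaudeStreamState:
    """Trimmed stand-in; only the field proposed in step 1 is shown."""

    last_exitplanmode_plan: Optional[str] = None


def capture_plan(state: ClaudeStreamState, tool_name: str, tool_input: dict[str, Any]) -> None:
    """Step 1: remember the plan body when an ExitPlanMode tool_use is seen."""
    if tool_name != "ExitPlanMode":
        return
    plan = tool_input.get("plan")
    if isinstance(plan, str) and plan.strip():
        state.last_exitplanmode_plan = plan.strip()


def merge_plan(final_answer: str, engine_state: object) -> str:
    """Steps 2 and 3: prepend the plan body to an empty or short final answer."""
    # Duck-typed read so runners without the field are unaffected.
    plan = getattr(engine_state, "last_exitplanmode_plan", None)
    if not plan or len(plan) < MIN_PLAN_LEN:
        return final_answer
    answer = final_answer.strip()
    if len(answer) >= SHORT_ANSWER_THRESHOLD:
        return final_answer  # the answer is already substantial
    if plan in answer:
        return final_answer  # the answer already repeats the plan body
    header = f"📋 Plan (approved):\n\n{plan}"
    return f"{header}\n\n---\n\n{answer}" if answer else header


def clear_plan(state: ClaudeStreamState) -> None:
    """Step 4: drop the captured plan when the session is cleaned up."""
    state.last_exitplanmode_plan = None
```

Checking the length threshold before the substring match keeps the common code-task path (a long post-approval answer) cheap; the substring check then only matters for short answers that merely repeat the plan.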

**Why A and E both ship together:** A alone is insufficient because the plan body is ephemeral and never reaches the user's permanent timeline; E alone is insufficient unless A1 makes the plan body substantive in the first place.

## Deferred / out of scope

- **Fix B** (auto-attach `~/.claude/plans/*.md` files written during a session to `.untether-outbox/`): genuinely useful safety net but adds non-trivial scope (outbox integration, size limits, deny-glob, opt-out config) and Untether already has the manual `/file get` path. With A+E shipping the substantive content into the chat directly, B becomes nice-to-have rather than load-bearing. Track separately for v0.35.4 once we measure whether A+E alone closes the gap.
- **Option C** (synthetic post-approval stdin nudge): too risky — alters multi-turn behaviour, may confuse Claude when there IS post-approval code work to do.
- **Option D** (prompt-shape intent detection): too fragile — heuristics on user prompt shape are not reliable.
- Filing an upstream issue with Claude Code: both consulted models agreed the empty-`result`-after-approval behaviour is expected design for a CLI tool (no terminal spam after a saved plan file). The bug exists purely in the translation to a headless Telegram environment.

## Tests

- `tests/test_preamble.py` — assert A1 and A2 clauses present in the default preamble.
- `tests/test_claude_runner.py` — `test_exitplanmode_plan_captured_to_state`, `test_exitplanmode_plan_cleared_on_session_end`.
- Bridge prepend tests (location TBD by implementer): `test_exitplanmode_plan_prepended_when_answer_short`, `test_exitplanmode_plan_skipped_when_answer_already_contains_it`, `test_exitplanmode_plan_skipped_when_answer_substantial`.
- Integration tests via `@untether_dev_bot`: U2 (Claude plan-mode interactive) + add a new research-task scenario to `docs/reference/integration-testing.md`.

## Critical files

- `src/untether/runner_bridge.py:291-318` — `_DEFAULT_PREAMBLE` (Fix A target)
- `src/untether/runners/claude.py:1149-1153` — empty-`result` fallback (already correct, no change)
- `src/untether/runners/claude.py:261` — `ClaudeStreamState` (Fix E adds `last_exitplanmode_plan` field)
- `src/untether/runners/claude.py` `StreamAssistantMessage` arm (around line 1078) — Fix E capture point
- `src/untether/runner_bridge.py:2952-2964` — dead-code outline prepend (Fix E rewrites this)
- `src/untether/runner_bridge.py:2153-2172` — outline-deletion-on-approve mechanism (background; explains why plan body is ephemeral)

## References

- **Investigation plan file (local, lba-1):** `/home/nathan/.claude/plans/hello-claude-please-cuddly-kitten.md`
- Reference screenshot: `incoming/file_7604.jpg` (in untether repo, untracked)
- Live session: `dde3c528-45e9-4dfb-9111-c3943f802eb8` — scout project, 2026-05-10 14:11–14:16 UTC, staging `@hetz_lba1_bot` v0.35.3rc10

## Target release

v0.35.3rc11 (current rc10 will not include this fix; rc11 should).
