Skip to content

feat(code): add plan view with threaded inline comments#2266

Draft
mcoll-posthog wants to merge 21 commits into
mainfrom
posthog-code/plan-view-with-threads
Draft

feat(code): add plan view with threaded inline comments#2266
mcoll-posthog wants to merge 21 commits into
mainfrom
posthog-code/plan-view-with-threads

Conversation

@mcoll-posthog
Copy link
Copy Markdown

Summary

  • Adds a spacious Plan tab that renders the agent's plan markdown and lets you attach threaded [H]: / [A]: comments anchored to specific blocks (paragraphs, headings, lists, code).
  • Threads live as plain markdown blockquotes in the plan file (~/.claude/plans/<slug>.md), so the agent reads and edits the conversation the same way it edits any file. Posting a [H]: comment auto-prompts the agent to reply on that thread; resolving a thread asks the agent to incorporate the feedback and remove the blockquote.
  • The Plan tab registers itself the first time the agent writes a plan file in the session (detected from session tool calls). The cramped approval preview gets an "Open in Plan view" button.

How the data layer works

A thread is a contiguous markdown blockquote of > [H]: … / > [A]: … / > [resolved] lines placed immediately after its anchor block. No JSON sidecar, no opaque IDs — anchor identity is the verbatim source text of the preceding block, which both the renderer and the new PlansWatcherService agree on.

Example after a back-and-forth:

## Step 1: Refactor auth middleware

Move session validation from middleware to a dedicated service.

> [H]: Won't this break the legacy /api/v1 routes?
> [A]: Those routes don't use middleware — they validate inline.

What's new

Main process (apps/code/src/main/)

  • services/plans-watcher/service.ts + schemas.ts — new PlansWatcherService with read / append-thread / resolve-thread + atomic writes and a debounced parcel watcher on ~/.claude/plans/
  • trpc/routers/plans.tsplans.read, plans.appendThreadMessage, plans.resolveThread, plans.ensureWatching, plans.onChanged / onDeleted
  • 13 unit tests for the parser / insertion helpers

Renderer (apps/code/src/renderer/features/plans/)

  • remark/remarkPlanThreads.ts — remark plugin that rewrites thread blockquotes into <plan-thread> nodes and annotates every other top-level block with data-plan-block
  • components/PlanView.tsx — full-pane reader wiring MarkdownRenderer + watcher subscription
  • components/PlanThread.tsx / PlanBlockGutter.tsx / PlanComposePopover.tsx — thread UI (adapted from the existing PrCommentThread / CommentAnnotation / EnrichmentPopover patterns)
  • hooks/usePlanFilePath.ts + usePlanTab.ts — derive the active plan path from session tool calls and register the Plan tab
  • utils/planPrompts.ts — auto-prompt copy for "reply on this thread" and "incorporate resolved threads"

Wiring

  • DEFAULT_TAB_IDS.PLAN, new { type: "plan"; filePath } TabData, dispatch in TabContentRenderer
  • usePanelLayoutStore.ensurePlanTab(taskId, filePath) opens/refreshes the tab
  • TaskDetail calls usePlanTab(taskId) so the tab appears the first time the agent writes a plan
  • PlanContent (the cramped approval box) gains an "Open in Plan view" icon button when the tab is registered

Test plan

  • pnpm dev, open a task, send a prompt that triggers plan mode (e.g. "Plan a refactor of <X>")
  • Confirm the Plan tab appears when the agent writes ~/.claude/plans/<slug>.md and auto-activates
  • Hover a paragraph in the Plan view — the gutter + button appears. Click it, type a comment, submit.
  • cat ~/.claude/plans/<slug>.md — confirm > [H]: … was appended directly under the paragraph
  • Confirm the Plan view immediately re-renders the new [H] message and the agent gets a follow-up prompt (visible in the Chat tab)
  • Wait for the agent to append > [A]: …; confirm the Plan view updates without a refresh
  • Click Resolve — confirm > [resolved] is appended, a second follow-up prompt is sent, and the agent rewrites the surrounding content and removes the blockquote
  • Trigger ExitPlanMode; confirm the existing approval flow still works unchanged
  • Restart the app; threads persist (they're plain markdown on disk)
  • pnpm --filter code typecheck && pnpm lint && pnpm --filter code test src/main/services/plans-watcher

Generated-By: PostHog Code
Task-Id: c2ee090b-484a-4c47-aae0-0876862b1ebf

Add a spacious Plan tab that renders the agent's plan markdown and lets
the user attach threaded `[H]:` / `[A]:` comments anchored to specific
blocks. Threads live as plain markdown blockquotes in the plan file,
so the agent can read and edit them as part of the same document.

- PlansWatcherService watches `~/.claude/plans/` and exposes read /
  append-thread / resolve-thread mutations via tRPC
- remarkPlanThreads rewrites `> [H]:` / `> [A]:` / `> [resolved]`
  blockquotes into custom `<plan-thread>` nodes and annotates every
  other top-level block with a `data-plan-block` source snippet
- PlanView renders the plan through the existing MarkdownRenderer
  with a hover `+` gutter button per block; the compose popover and
  PlanThread reuse the InputGroup / Avatar patterns from the PR
  review thread UI
- Submitting a comment writes to the file and auto-prompts the agent
  to reply on the same thread; resolving asks the agent to integrate
  the feedback and remove the thread block
- The Plan tab is registered the first time the agent writes a plan
  file (detected from session tool calls), and the cramped approval
  preview gets an "Open in Plan view" button

Generated-By: PostHog Code
Task-Id: c2ee090b-484a-4c47-aae0-0876862b1ebf
…scription

Address PR #2266 review findings via red→green TDD:

- P1+P3: move plan-file detection to main process. New
  `plan-file-detector.ts` reads `CLAUDE_CONFIG_DIR` (the same env var
  `env.ts` sets at boot) so renderer and watcher agree on the plans
  directory — fixes the desktop case where the app uses `<userData>/claude`
  rather than `~/.claude`. `AgentService` caches `planFilePath` per
  `ManagedSession` and emits a typed `PlanFileChanged` event; renderer
  uses `agent.getPlanFilePath` query + `agent.onPlanFileChanged`
  subscription, replacing the renderer-side regex on `rawInput`.

- P2 anchor uniqueness: thread mutations now take an `occurrence` index
  so repeated headings/bullets attach replies and resolutions to the
  block the user actually clicked. `findBlockInsertionLine` advances
  past matched windows; the remark plugin counts per-block occurrences
  and tags every anchor + plan-thread node with `data-occurrence`.

- P2 deletion: `PlanView` subscribes to `plans.onDeleted` and clears
  the cached `plans.read` response when the deleted file is the one
  being viewed, via a small `handlePlanDeletion` helper.

36 new unit tests cover the new helpers (parser, detector, deletion
gate, end-to-end watcher round-trip on a tmp dir).

Generated-By: PostHog Code
Task-Id: c2ee090b-484a-4c47-aae0-0876862b1ebf
Address remaining PR #2266 review findings:

- P2 (still): anchor lookup now does an *exact* source-block match
  rather than substring containment. Previously `## Step 1` could match
  `## Step 10`, and thread reply text mentioning the snippet could be
  counted as an occurrence. The watcher now iterates the file in
  paragraph-delimited blocks (split on blank lines), skips thread
  blockquotes, and only accepts blocks whose trimmed source equals
  `blockText` — aligning with how the renderer computes occurrence.

- P3 (still): plan-file detection now reads the typed ACP
  `tool_call.locations[0].path` field instead of `rawInput.file_path`.
  Per the repo guidance we don't build agent-facing contracts on
  rawInput. The Claude adapter already populates `locations` for
  Write / Edit / MultiEdit / NotebookEdit (see
  packages/agent/.../conversion/tool-use-to-acp.ts:216, :253). The
  detector drops the rawInput path entirely.

Generated-By: PostHog Code
Task-Id: c2ee090b-484a-4c47-aae0-0876862b1ebf
Address PR #2266 review v3 findings:

- P2 (still): exact line-based matching split blocks on every blank
  line, which broke for fenced code blocks containing blank lines and
  for loose lists (lists with blank lines between items). Both are ONE
  block in the renderer (single gutter button) but were N blocks to the
  old watcher logic. The watcher now parses the file with `remark-parse`
  the same way the renderer does, iterates top-level mdast children,
  skips thread blockquotes, and matches by verbatim source slice. Tests
  added for fenced code with internal blank, loose list, GFM table.

- P3 (still): MultiEdit is dropped from the supported set. The Claude
  adapter (packages/agent/.../tool-use-to-acp.ts) has no MultiEdit case
  and falls through to the default branch which never populates
  `locations`. Including it would only have produced false negatives in
  testing — and worse, future regressions if the adapter is changed to
  populate locations without us auditing the contract. Restore it once
  the adapter has a typed `locations` for MultiEdit.

Generated-By: PostHog Code
Task-Id: c2ee090b-484a-4c47-aae0-0876862b1ebf
Address PR #2266 review v4 finding:

The remark plugin annotated `code` and `table` mdast nodes with
`data-plan-block`, but the renderer's `PlanView` couldn't surface a
usable gutter on either:

- For fenced code, mdast-util-to-hast moves a `code` node's hProperties
  onto the inner `<code>` element (wrapped in `<pre>`). We were
  wrapping `<pre>`, which never received the data attribute. The
  gutter button never appeared on fenced code blocks.
- For tables, the base `table` component dropped arbitrary props, so
  `data-plan-block` reached `<table>` but had no effect.

Tighten `ANCHORABLE_TYPES` to the set the renderer can actually wrap:
`heading`, `paragraph`, `list`. Drop the no-op `pre` wrap in PlanView.
The watcher still parses code blocks / tables correctly (the parser
tests stay), so re-enabling them later only requires wrapping the
right components.

Generated-By: PostHog Code
Task-Id: c2ee090b-484a-4c47-aae0-0876862b1ebf
Address PR #2266 review v5 finding:

CommonMark treats a non-blank line immediately after `> [H]: …` as lazy
continuation of the blockquote. So inserting a thread right after a
heading that's followed by an adjacent paragraph (`## Heading\nNext
paragraph`) produced `## Heading\n\n> [H]: msg\nNext paragraph`, and
`remarkPlanThreads` then failed to recognise the thread because the
blockquote's paragraph child contained `[H]: msg\nNext paragraph`.

Introduce `ensureBlankAfterThread(lines, threadEnd)` and call it from
both `appendThreadMessage` (new + extend) and `resolveThread`, so every
thread we write is bounded by blank lines (or end of file) on the
trailing side too.

Generated-By: PostHog Code
Task-Id: c2ee090b-484a-4c47-aae0-0876862b1ebf
… off)

This PR ships the Plan view + threads as opt-in only, so we can iterate
on the UX without forcing it on everyone in the rollout.

- Add `planThreadsEnabled: boolean` (default `false`) to
  `useSettingsStore`, persisted alongside other prefs.
- Surface a toggle in Settings → General → Experimental (above the
  existing Fun / Hedgehog row), with the standard `SETTING_CHANGED`
  analytics event.
- Gate `usePlanTab` so the Plan tab never auto-registers when the
  setting is off — no tab, no `PlanView`, no watcher start, no gutter
  UI. Single load-bearing check; the rest of the feature plumbing
  stays unconditional.
- Gate the `Open in Plan view` button on the approval card.

Generated-By: PostHog Code
Task-Id: c2ee090b-484a-4c47-aae0-0876862b1ebf
…ct bar

Three UX fixes against the testing session feedback:

- Tab no longer jumps to Chat when sending a comment. The compose and
  reply paths used `sendPromptToAgent`, which calls `setActiveTab(LOGS)`
  as a side-effect — fine from the Chat tab, wrong from the Plan tab.
  Both paths now call `getSessionService().sendPrompt(taskId, ...)`
  directly so the user stays in Plan view; the file watcher invalidates
  the cached read and the new `[H]:` (then `[A]:`) appear in place.

- Composer is inline, not a floating popover. `PlanBlockGutter` now
  owns local "composing" state and renders an inline composer card
  directly below the anchor block. `PlanComposePopover` and
  `planComposeStore` are deleted — no portal, no right-side floater.

- Approve / Reject the plan from inside the Plan view. A sticky
  `PlanApprovalBar` appears at the top whenever there's a pending
  `switch_mode` permission (the agent's `ExitPlanMode` request). The
  first allow option (the user's previous mode) drives Approve;
  Reject opens an inline textarea for optional feedback that maps to
  `reject_with_feedback`. The original approval card in the chat keeps
  working unchanged — both surfaces resolve the same permission.

Generated-By: PostHog Code
Task-Id: c2ee090b-484a-4c47-aae0-0876862b1ebf
CommonMark parses `[H]: short_value` as a *link reference definition*.
Once such a definition exists in the document, every subsequent `[H]`
is parsed as a `linkReference` node that consumes the brackets — so a
multi-line thread ends up as a mix of `definition`, `paragraph`,
`linkReference`, and `text` nodes. Reconstructing the original line
text from those children mangled it (e.g. `[H]: Got it.` became
`H: Got it.`), and `parseThreadBlockquote` rejected the whole block,
which is what the user saw: the second comment rendered as a plain
blockquote, not a thread.

Switch to parsing the verbatim *source slice* between the blockquote's
position offsets. Strip the leading `>` (plus optional space) from each
line and match against `THREAD_LINE_RE`. The mdast children are
ignored entirely — they're an unreliable channel for this case. Two
new tests cover the mixed `[H]`/`[A]` round-trip and consecutive
single-word definitions.

Generated-By: PostHog Code
Task-Id: c2ee090b-484a-4c47-aae0-0876862b1ebf
Surface what the agent is working on inside the Plan view. When a user
submits a comment or hits Resolve from a thread, the thread is added
to a FIFO queue. The head of the queue is rendered with a spinning
robot avatar and "Responding…" label; everything behind it shows a
muted "Queued behind earlier comments" hint. The thread is automatically
dequeued when its last message flips to `[A]:` (the agent's reply
arrives via the file watcher), and as a safety net when the thread
block unmounts (the typical end of a Resolve flow, where the agent
rewrites the plan and removes the resolved blockquote).

For resolved threads still in the queue, the label switches to
"Incorporating feedback…" — same status, more accurate verb for the
"agent rewrites surrounding content" phase.

Store and dequeue/promote semantics are covered by 8 new unit tests.

Generated-By: PostHog Code
Task-Id: c2ee090b-484a-4c47-aae0-0876862b1ebf
The "Responding…" / "Queued" indicator never appeared in dev because
the per-thread useEffect cleanup that dequeued the thread on unmount
also fired on React StrictMode's synthetic dev-only mount→unmount→
remount cycle, clearing the queue right after the user's submit-time
enqueue.

Replace the unmount-time cleanup with a content-driven sweep:

- New `syncQueue(activeKeys)` action on `planAgentActivityStore`:
  keep only keys still present in the parsed plan.
- New `extractThreadKeys(content, filePath)` helper: runs the same
  remark pipeline `PlanView` uses, walks the planThread nodes, and
  returns the set of currently-anchored thread keys.
- `PlanView` calls `syncQueue(extractThreadKeys(content, filePath))`
  in a `useEffect` whenever the plan content changes. This handles
  the "Resolve → agent rewrites the plan and removes the block"
  case (where no `[A]:` reply ever lands in the thread) without
  racing against StrictMode.
- `PlanThread` no longer carries an unmount-time `dequeue`. The
  `lastSpeaker === "A"` effect still fires for the common path
  (agent replied inside the thread).

Adds two store tests for `syncQueue`; 58 plan-feature tests pass.

Generated-By: PostHog Code
Task-Id: c2ee090b-484a-4c47-aae0-0876862b1ebf
Adds a component-level test that mounts <PlanThread> inside
<StrictMode> and asserts the activity indicator survives the
double-mount cycle. Proves the cb3cb37 fix is correct at the
component level — the indicator renders when the thread key is
in the activity queue.

Generated-By: PostHog Code
Task-Id: c2ee090b-484a-4c47-aae0-0876862b1ebf
End-to-end test that renders content through ReactMarkdown + remarkGfm +
remarkPlanThreads (the same pipeline PlanView uses) and asserts the
indicator appears when the threadKey computed from the rendered
<plan-thread> element matches what InlineComposer would have enqueued.

Covers: single-anchor matching, occurrence indexing for duplicate
anchor text, and heading anchors.

This passes against the current build, which means the React/store/
remark prop flow is correct. If the indicator still doesn't appear in
the running app, the issue is elsewhere (likely a fast-agent race
where [A]: lands before PlanThread first mounts).

Generated-By: PostHog Code
Task-Id: c2ee090b-484a-4c47-aae0-0876862b1ebf
…send failure

P1 — Approve plan used to silently pick whichever allow_* option the agent
put first, which on fresh tasks (no previousMode to promote a safer choice)
meant clicking "Approve plan" could move the session into bypassPermissions
mode. Now the approval bar exposes every allow_* option as a Select,
defaults to `default` (manual approval) when available — never bypassPermissions
unless it's the only option — and shows an inline warning when bypassPermissions
is selected.

P2a — Existing persisted Plan tabs used to render PlanView regardless of
the planThreadsEnabled experimental setting, mounting the watcher even
after the setting was turned off. PlanView now short-circuits to a
disabled placeholder when the setting is off, so neither the watcher
nor the gutter UI mount for tabs that pre-existed the toggle.

P2b — sendPrompt() rejections (offline, disconnected session, no active
session) were unhandled and left the activity indicator stuck. The submit/
reply/resolve handlers now await sendPrompt inside a nested try and
dequeue the thread key on failure so the indicator doesn't get stuck.

Added 15 new tests across 3 files (planApprovalPermission, PlanView gating,
PlanComposerActions). 71 plan-feature tests pass; typecheck and lint clean.

Generated-By: PostHog Code
Task-Id: c2ee090b-484a-4c47-aae0-0876862b1ebf
Adds a regression test that mounts PlanView with a pending switch_mode
permission and asserts the Approve plan, Reject, Mode label, and the
default option (manually approve edits) are all in the DOM.

Generated-By: PostHog Code
Task-Id: c2ee090b-484a-4c47-aae0-0876862b1ebf
The bar previously required a pending switch_mode permission, but
during an active comment loop the agent edits the plan file via
handlePlanFileException (which permits Edit/Write on plan files in
plan mode) and never calls ExitPlanMode — so no permission is pending
and the bar (incorrectly) disappeared.

New buildPlanApprovalState helper drives the bar from EITHER:
  - A pending switch_mode permission (unchanged: Approve resolves it
    with the chosen option, Reject responds with the reject option).
  - The session's `mode` configOption when currentValue === "plan"
    (Approve calls setSessionConfigOption("mode", chosen); Reject
    sends a rejection prompt and mode stays at "plan" so the agent
    keeps iterating).

Reject is now always available in both flows. In mode flow it sends
a buildPlanRejectionPrompt with optional feedback.

15 new tests across the helper and PlanView cover both code paths,
precedence, and the bar's interaction wiring.

Generated-By: PostHog Code
Task-Id: c2ee090b-484a-4c47-aae0-0876862b1ebf
…ent to implement

Clicking Approve previously only changed the mode (or resolved the
permission). Now it also brings the user to the Chat tab and — in the
mode-driven flow where the agent is idle in plan mode — sends a
buildPlanImplementationPrompt() so the agent actually starts working
instead of just sitting idle.

Permission flow: respondToPermission resolves it (agent auto-continues),
then activateChatTab switches to Chat.

Mode flow: setSessionConfigOption -> sendPrompt(buildPlanImplementationPrompt)
-> activateChatTab.

Generated-By: PostHog Code
Task-Id: c2ee090b-484a-4c47-aae0-0876862b1ebf
…ding

When the agent finishes a plan it issues an ExitPlanMode (switch_mode)
permission. While that is pending, sessions are isPromptPending, so
calling sendPrompt for a plan comment silently enqueues the message and
the agent never reacts.

New dispatchPlanComment helper: if a pending switch_mode permission has
a reject_once/reject_always option, resolve it with the comment prompt
as customInput — the agent processes the rejection feedback, stays in
plan mode, and reacts to the new [H]: thread. Otherwise sendPrompt as
before.

Wired into PlanBlockGutter (new comment), PlanThread (reply, resolve).
5 helper tests + 1 component-level integration test cover the path.

Generated-By: PostHog Code
Task-Id: c2ee090b-484a-4c47-aae0-0876862b1ebf
Previously the entire <ul>/<ol> was one anchor and a single gutter
button. Each list item now carries its own gutter so users can comment
on items independently.

- remarkPlanThreads: ANCHORABLE_TYPES drops `list` and adds `listItem`.
  Pre-walks the tree (descending into lists) to annotate each item in
  document order. Thread anchor resolution: when the preceding sibling
  of a thread is a list, descend to its LAST listItem.
- plans-watcher: parseAstBlocks descends into lists and exposes each
  listItem as its own AstBlock with its own endLine. Lists themselves
  are no longer anchorable blocks.
- PlanView: wraps <li> instead of <ul>/<ol>.

5 new tests on the remark plugin (each item annotated, ordered lists
covered, thread anchored to last item of preceding list), 4 new tests
on the watcher (per-item match, ordered lists, occurrence across
identical items, end-to-end append). All 99 plan-feature tests pass.

Generated-By: PostHog Code
Task-Id: c2ee090b-484a-4c47-aae0-0876862b1ebf
…orts

Three issues raised in review:

- P1: PlanContent no longer imports useSettingsStore. The setting check
  here was redundant (usePlanTab is the load-bearing gate), and pulling
  electronStorage -> trpcClient into PlanContent broke jsdom for any
  ancestor test that doesn't mock the tRPC client.
- P2a: Composer no longer waits for the agent's full turn. sendPrompt
  resolves only when the turn ends (seconds); awaiting it stuck the
  composer in "Sending…". Fire-and-forget with a .catch() handler that
  dequeues the activity indicator on failure.
- P2b: <li> is now wrapped by a dedicated PlanListItemGutter that
  renders <li> directly (with the gutter button as a positioned child)
  and the composer as a sibling <li class="list-none">. The previous
  PlanBlockGutter wrap produced <ul><div><li>…</li></div></ul>, which
  browsers hoist to invalid DOM.

Generated-By: PostHog Code
Task-Id: c2ee090b-484a-4c47-aae0-0876862b1ebf
…ling

The previous fix moved the composer to a sibling <li class="list-none">
to keep the <ul>/<ol> DOM valid. That hides the marker but the element
still participates in <ol> numbering, so opening the composer shifted
the next step's number by one while the user was typing.

A <div> inside <li> is valid HTML and doesn't increment the list
counter. Render the composer as a child of the anchor <li>.

Generated-By: PostHog Code
Task-Id: c2ee090b-484a-4c47-aae0-0876862b1ebf
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant