EP: Hierarchical interactive components — expandable parts in chat #199

@dugshub

Vision

Each rendered chat part (thinking, tool call, error, code block, etc.) is a container that holds its full data and renders one of two views:

  • Summary view (default): compact one-liner or truncated body
  • Detail view (on interaction): full content rendered from the same underlying data

The data is already streamed end-to-end from the backend; we just choose which view to render. Toggling expansion is a pure view-state flip on the part, with no fetching and no loading state.

Why

Today the chat view is a flat string blob in a viewport. Each `MessagePart` is rendered once into a `string` and concatenated. There's no way to:

  • Expand a thinking part to see the full reasoning (currently shows only the first 60 chars)
  • Reveal full tool call args + duration + tool_call_id under a tool call header
  • Show stack traces under errors
  • Expand truncated code/diff blocks

The client already has all the data; it is stored on `MessagePart.ToolCall`, `Content`, etc. We just don't surface it.

Conceptual model

```
ThinkingPart (container)
├─ summary: "thinking: <first 60 chars>"
└─ detail: full reasoning markdown

ToolCallPart (container)
├─ summary: ✓ [read_file] (current header)
└─ detail: + full args JSON, + duration, + tool_call_id, + raw result

ErrorPart (container)
├─ summary: error message
└─ detail: + stack trace, + recoverable, + context

CodePart / DiffPart (containers)
├─ summary: first N lines or hunk count
└─ detail: full file / full diff
```

The component itself owns both views. The renderer reads an `Expanded bool` and dispatches.

The interaction problem

Bubble Tea has zero hit-testing. The view function returns a single `string`; the framework writes it to the terminal. There's no widget tree, no spatial registry. Mouse clicks arrive as raw `(X, Y)` coordinates with no awareness of what was rendered there.

To make parts clickable, we need to build the bridge ourselves: track which content lines belong to which part during render, then translate click coordinates → content line → part identity → toggle action.

Approaches considered

A. Region map (pragmatic, ~100 lines)
During render, accumulate `[]Region{startLine, endLine, msgIdx, partIdx}` as a side effect. On click: convert screen Y → viewport Y → content Y, walk the regions, find the part, and toggle `Expanded`. The region map is built in the same pass as the render, so the two cannot drift apart. Doesn't restructure the model.
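The click path for approach A might be sketched as follows. This is an illustration under assumptions: `contentLine`, `hitTest`, and the `viewportTop` parameter are hypothetical names, line ranges are taken to be 0-based and inclusive, and in the real model `yOffset` would come from `viewport.YOffset()`.

```go
package main

import "fmt"

// Region records which content lines a rendered part occupies.
// Line numbers are 0-based and inclusive (an assumption of this sketch).
type Region struct {
	StartLine, EndLine int
	MsgIdx, PartIdx    int
}

// contentLine converts a screen row into a content row.
// viewportTop is the screen row where the viewport begins (hypothetical);
// yOffset is the current scroll position of the viewport.
func contentLine(screenY, viewportTop, yOffset int) int {
	return screenY - viewportTop + yOffset
}

// hitTest walks the regions and returns the one containing line, if any.
func hitTest(regions []Region, line int) (Region, bool) {
	for _, r := range regions {
		if line >= r.StartLine && line <= r.EndLine {
			return r, true
		}
	}
	return Region{}, false
}

func main() {
	regions := []Region{
		{StartLine: 0, EndLine: 0, MsgIdx: 0, PartIdx: 0}, // one-line summary
		{StartLine: 1, EndLine: 3, MsgIdx: 0, PartIdx: 1}, // three-line part
	}

	// Click on screen row 4, viewport starts at row 2, not scrolled.
	r, ok := hitTest(regions, contentLine(4, 2, 0))
	fmt.Println(r.PartIdx, ok) // prints "1 true"

	// Same click after scrolling down 5 lines lands past all regions.
	_, ok = hitTest(regions, contentLine(4, 2, 5))
	fmt.Println(ok) // prints "false"
}
```

A linear walk is likely enough for a chat transcript; because regions are appended in line order during render, a binary search would also work if the list grows large.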

B. Component-per-part (the right shape, ~500+ lines)
Each part type becomes a Bubble Tea sub-model with its own `Init/Update/View/LineCount`. Chat model holds `[]Message{Parts: []Interactive}`. Renderer walks parts accumulating Y offsets. Updates are localized — clicking a thinking part calls its own `Update`, doesn't touch siblings. Still needs a region map for dispatch routing.

C. Renderable type with embedded metadata (~150 lines)
Render functions return `Renderable{Content string, Regions []Region}` instead of `string`. Chat model concatenates them, regions stitch together with running line offsets. Cleaner separation than A — spatial tracking factored into a small library.
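A rough sketch of the stitching step in approach C, assuming each chunk reports regions relative to its own first line. The `Join` and `lineCount` helpers are hypothetical; `lineCount` stands in for `lipgloss.Height` so the example has no external dependencies.

```go
package main

import (
	"fmt"
	"strings"
)

// Region is a 0-based, inclusive line range owned by one part.
type Region struct {
	StartLine, EndLine int
	PartIdx            int
}

// Renderable pairs rendered text with regions relative to that text.
type Renderable struct {
	Content string
	Regions []Region
}

// lineCount returns the number of lines in s (stand-in for lipgloss.Height).
func lineCount(s string) int { return strings.Count(s, "\n") + 1 }

// Join concatenates chunks with newlines and shifts each chunk's regions
// by the number of lines emitted before it.
func Join(parts ...Renderable) Renderable {
	var out Renderable
	chunks := make([]string, 0, len(parts))
	offset := 0
	for _, p := range parts {
		for _, r := range p.Regions {
			r.StartLine += offset // r is a copy, so p.Regions is untouched
			r.EndLine += offset
			out.Regions = append(out.Regions, r)
		}
		chunks = append(chunks, p.Content)
		offset += lineCount(p.Content)
	}
	out.Content = strings.Join(chunks, "\n")
	return out
}

func main() {
	header := Renderable{
		Content: "tool call header",
		Regions: []Region{{StartLine: 0, EndLine: 0, PartIdx: 0}},
	}
	body := Renderable{
		Content: "line one\nline two",
		Regions: []Region{{StartLine: 0, EndLine: 1, PartIdx: 1}},
	}
	joined := Join(header, body)
	fmt.Println(joined.Regions) // prints "[{0 0 0} {1 2 1}]"
}
```

Render functions stay pure in this shape: they never need to know where in the transcript their output will land.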

D. Cards pattern
Each addressable thing is a Card with `{ID, Render(expanded), Click()}`. Sharper unit of interactivity than full sub-models, same dispatch problem.

Recommendation

Long-term shape is B (hierarchical sub-components) because the conceptual model maps cleanly: each part owns its data and its render variants, expansion state is local. But this is a real refactor.

Short-term, we could ship A to get expandable parts working with minimal disruption, then graduate to B when we want richer per-part behavior (animations, scrolling within parts, nested interactivity).

Building blocks already available

  • `tea.MouseClickMsg` / `tea.MouseReleaseMsg` / `tea.MouseMotionMsg` — Bubble Tea v2 dispatches all of these
  • `lipgloss.Height(rendered)` — line count of any rendered chunk
  • `viewport.YOffset()` — current scroll position
  • Mouse handling is already wired: the chat model routes `MouseWheelMsg` for scrolling, so the program is mouse-enabled

Out of scope

  • Hover highlighting (phase 2)
  • Visual focus cursor for keyboard navigation (phase 2)
  • Animated expand/collapse (terminal animation is limited)
  • Nested clickables within an expanded part (deferred)

Acceptance criteria (for the eventual implementation)

  • Each part type has `summary` and `detail` view variants
  • `MessagePart` has an `Expanded bool` (or equivalent state holder)
  • Click on any part toggles its expansion
  • Keyboard fallback: `tab`/`shift+tab` to focus parts, `enter` to toggle
  • Existing flat-rendering tests still pass
  • Demo fixture exercises expanded states
  • Architecture doesn't prevent future hierarchical refactor
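
The keyboard fallback in the criteria above could be sketched like this. The `Focus` type and `HandleKey` method are hypothetical, the key names are taken as plain strings as written in this issue, and wrapping at either end of the part list is an assumption rather than a stated requirement.

```go
package main

import "fmt"

// Part carries only the view state relevant to this sketch.
type Part struct{ Expanded bool }

// Focus tracks which part currently has keyboard focus (hypothetical type).
type Focus struct {
	Parts []Part
	Index int
}

// HandleKey moves focus on tab/shift+tab and toggles expansion on enter.
// Focus wraps around at both ends of the list.
func (f *Focus) HandleKey(key string) {
	n := len(f.Parts)
	if n == 0 {
		return
	}
	switch key {
	case "tab":
		f.Index = (f.Index + 1) % n
	case "shift+tab":
		f.Index = (f.Index - 1 + n) % n
	case "enter":
		f.Parts[f.Index].Expanded = !f.Parts[f.Index].Expanded
	}
}

func main() {
	f := Focus{Parts: make([]Part, 3)}
	f.HandleKey("shift+tab") // wraps from 0 to the last part
	f.HandleKey("enter")     // expands the focused part
	fmt.Println(f.Index, f.Parts[2].Expanded) // prints "2 true"
}
```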

References

  • Discussion in EP-008 implementation session
  • Spec: `docs/specs/2026-03-22-message-parts.md`
  • Bubble Tea is on the "no built-in hit-testing" side, like Ratatui — Textual / Flutter / DOM all maintain widget trees with bounding boxes
