Native macOS terminal pane (Variant 2: direct broker WebSocket, libghostty NSView) #34

@willwashburn

Summary

Ship the macOS terminal pane as a native libghostty NSView that opens its own WebSocket to the broker, bypassing Electron IPC + the Chromium compositor for the typing path. Keep the rest of the app (sidebar, chat, graph, diff, dialogs, all control-plane RPCs) in React/TS as today. Windows and Linux keep the existing xterm.js path unchanged.

Sister-issue to #33 (the full native rewrite). This is the surgical version: ~6–8 weeks of focused work that fixes the typing-latency complaint without committing to the full Mac-only rewrite.

Why

We've done what's possible inside Electron + xterm.js (#29, #30, #31, #32). The remaining ~25–35ms typing gap vs Ghostty is structural: Electron's renderer↔main IPC (~5–10ms round trip) and the Chromium compositor frame (~16ms). Both of those go away on the Mac if the terminal:

  1. Renders natively to Metal via libghostty (no Chromium compositor)
  2. Talks directly to the broker over its own WebSocket (no Electron IPC for chunks)

Variant 2 does both. Expected p50 keystroke-to-paint: ~10–15ms — essentially Ghostty parity for local agents.

Goals

  • Typing latency on Mac: p50 ≤ 15ms for a shell prompt, within 5ms of Ghostty on the same hardware
  • No regression on Windows/Linux: xterm.js path stays exactly as it is today
  • No churn for the rest of the React UI: the change is contained to <TerminalInstance> and a new native module
  • Reuse existing broker investment: the @agent-relay/sdk TS package stays in the renderer for everything except the PTY chunk path

Non-goals

  • Not the full native rewrite. React UI stays.
  • Not killing the broker. Broker is still a separate process; we just talk to it directly from the native module instead of round-tripping through Electron's main process.
  • Not the memory/battery/bundle wins from #33 ("Epic: Native macOS rewrite (drop Electron, Mac-only)"). Still Electron, still Chromium.
  • Not a path to Mac-native product positioning. This is a perf fix, not a strategy change.

Architecture

┌─────────────────────────────────────────────────────────┐
│ Pear (Electron, all platforms)                          │
│                                                         │
│  React UI (sidebar, chat, graph, diff, dialogs)         │
│  - control-plane RPCs via @agent-relay/sdk through      │
│    existing main-process broker manager                 │
│                                                         │
│  <TerminalInstance>                                     │
│    if (process.platform === 'darwin'):                  │
│      ┌───────────────────────────────────────────┐      │
│      │ Native NSView (libghostty)                │──────┼──► broker WS (direct, PTY in)
│      │  - Metal paint                            │      │
│      │  - own WebSocket client (PTY subset only) │◄─────┼── PTY chunks (direct, no IPC)
│      └───────────────────────────────────────────┘      │
│    else:                                                │
│      xterm.js (current path, unchanged)                 │
└─────────────────────────────────────────────────────────┘

Split of responsibilities:

| Operation | Where it lives |
| --- | --- |
| Spawn agent, list agents, channels, personas, broker logs, status, auth | TS / @agent-relay/sdk (renderer → main → broker, as today) |
| Resolve broker URL + auth token for a given project | TS (renderer → main → broker manager) |
| Attach to PTY for a specific agent | native module's own WS |
| Send keystroke bytes | native module's own WS |
| Resize PTY | native module's own WS |
| Receive worker_stream chunks for this agent | native module's own WS |
| Render to screen | libghostty → Metal |
| Selection / copy / paste / find / OSC8 links | libghostty + JS bridge for keybindings |
| Theme | TS passes color config into native module on mount + on theme change |
| Snapshot replay on attach | native module asks broker for snapshot via its WS, writes it into libghostty |
| AgentNode preview tile (graph view) | native module mirrors recent chunks back into JS pty-buffer-store (low-frequency, OK to IPC) |

The native module needs only a PTY-subset client for the broker: roughly 4 message types, listed under "What we need from the broker (agent-relay) side" below.

What we need from the broker (agent-relay) side

A documented stable contract for the PTY subset of the broker WebSocket protocol, and ideally an official Swift package for it. We'll be filing a sister-issue on the agent-relay repo with this ask. Until that lands, we'll hand-roll a ~200-line Swift mini-client against the existing protocol (reverse-engineered from the TS SDK).

Required messages:

  1. Attach: open a session against an agent's PTY, optionally with rows/cols and mode (view / drive / passthrough). Returns: { snapshot: { rows, cols, cursor, screen }, mode, pending }.
  2. Send input: { name, data }. Fire-and-forget.
  3. Resize: { name, rows, cols }.
  4. Subscribe events for one agent: receive worker_stream messages { name, chunk } for the attached agent only.

Plus the bootstrap: auth handshake (API key in header) and connection URL discovery.
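
To make the four messages concrete before the Swift mini-client exists, the PTY subset could be modelled as a TypeScript discriminated union. This is only a sketch: the `type` tag values and the one-JSON-object-per-frame framing are assumptions for illustration, not the actual agent-relay wire format. Only the payload fields ({ name, data }, { name, rows, cols }, { name, chunk }) and the mode names come from the list above.

```typescript
// Hypothetical shapes for the PTY subset. The `type` tags and JSON framing are
// assumptions; only the payload field names come from this issue.
type AttachMode = "view" | "drive" | "passthrough";

type ClientMessage =
  | { type: "attach"; name: string; rows?: number; cols?: number; mode?: AttachMode }
  | { type: "send_input"; name: string; data: string }
  | { type: "resize"; name: string; rows: number; cols: number };

interface WorkerStream {
  type: "worker_stream";
  name: string;
  chunk: string;
}

// Serialize an outgoing message as one JSON text frame.
function encode(msg: ClientMessage): string {
  return JSON.stringify(msg);
}

// Parse an incoming frame and return its chunk only if it is a worker_stream
// message for the attached agent. Anything else is dropped by this client.
function decodeChunkFor(agent: string, frame: string): string | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(frame);
  } catch {
    return null; // not JSON: ignore
  }
  if (typeof parsed !== "object" || parsed === null) return null;
  const m = parsed as Partial<WorkerStream>;
  if (m.type !== "worker_stream" || m.name !== agent) return null;
  return typeof m.chunk === "string" ? m.chunk : null;
}
```

Filtering by agent name on receive keeps the client correct even if the broker's subscription turns out to be broader than "this agent only".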

If/when an official Swift SDK is published, the mini-client gets replaced in a single PR.

Phased plan

Phase 0 — Spikes (1 week)

  • Spike A: embed libghostty in an NSView via a node-addon-api / N-API native module loaded by the Electron renderer. Get one PTY's output painting.
  • Spike B: parent the NSView into an Electron window's content view; position it from JS using ResizeObserver on a placeholder <div>. Confirm hit-testing, keyboard focus, copy/paste work.
  • Spike C: Swift/ObjC++ WebSocket client (URLSessionWebSocketTask or similar) connects to a local broker and round-trips one keystroke (send input → receive echo chunk). Measure p50 latency vs Ghostty.
  • Spike D: confirm libghostty's API exposes everything we need: PTY input, resize, mode queries, scrollback, selection, copy, OSC8, theming, font handling.

Exit: all four spikes green with measured latency hitting ≤ 15ms p50. Decision point: ship with libghostty, or fall back to SwiftTerm?
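
For the spike measurements, the exit check can be written down as a small helper. The sketch below assumes latency samples are collected as plain arrays of milliseconds and uses the nearest-rank definition of a percentile; the function names and the sample-collection method are illustrative, not existing Pear code.

```typescript
// Nearest-rank percentile over latency samples in milliseconds.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) throw new Error("no samples");
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length); // 1-based rank
  return sorted[Math.max(rank, 1) - 1];
}

// Checks both halves of the latency goal: absolute p50, and gap to Ghostty
// measured on the same hardware.
function meetsLatencyGoal(
  nativeMs: number[],
  ghosttyMs: number[],
  maxP50Ms = 15,
  maxGapMs = 5,
): boolean {
  const ours = percentile(nativeMs, 50);
  const theirs = percentile(ghosttyMs, 50);
  return ours <= maxP50Ms && ours - theirs <= maxGapMs;
}
```

Example: `meetsLatencyGoal([12, 9, 14, 11, 30], [8, 9, 10])` compares a native p50 of 12ms against a Ghostty p50 of 9ms, which passes both thresholds.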

Phase 1 — Basic native terminal (3 weeks)

  • Native module skeleton (build, codesign, ship for arm64 + x64 in CI)
  • libghostty wrapped in an NSView, parented into Electron window
  • Position tracking from React placeholder (ResizeObserver + bounds → native)
  • Mac-only WS client speaking attach / send input / resize / subscribe
  • TS-side bridge:
    • <TerminalInstance> branches on process.platform
    • On Mac: render a <div ref={placeholderRef} />, mount native view at its bounds, hand the native module broker URL + auth + agent name
    • On other platforms: existing xterm.js path
  • Keystroke path: libghostty's input callback → native WS send (NOT JS → Electron IPC → broker)
  • Chunk path: native WS receive → libghostty write → Metal paint
  • Resize on window resize / split-pane drag
  • Theme: pipe DARK_THEME / LIGHT_THEME into libghostty at mount + on theme change
  • Mount/unmount lifecycle (no orphan WS connections, no leaked NSViews)

Exit: Mac user can type into a running agent and see output paint with measured p50 ≤ 15ms. Win/Linux still works via xterm.js.
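
The position-tracking step (ResizeObserver + bounds → native) needs a conversion from the placeholder's CSS rectangle to a frame the native module can apply. A minimal sketch follows, assuming the native side accepts integer, top-left-origin coordinates and that CSS pixels scale to points by the page zoom factor; any flip to AppKit's default bottom-left origin is assumed to happen inside the native module. The type and function names are hypothetical.

```typescript
// Rectangle as reported by getBoundingClientRect(), in CSS pixels.
interface CssRect { left: number; top: number; width: number; height: number }

// Frame handed to the native module: integer units, top-left origin.
interface NativeFrame { x: number; y: number; width: number; height: number }

// Convert placeholder bounds to a native frame. `zoomFactor` is the page zoom
// (1 = 100%). Edges are rounded rather than sizes, so two panes that share an
// edge do not end up with a one-unit gap or overlap.
function toNativeFrame(rect: CssRect, zoomFactor = 1): NativeFrame {
  const x0 = Math.round(rect.left * zoomFactor);
  const y0 = Math.round(rect.top * zoomFactor);
  const x1 = Math.round((rect.left + rect.width) * zoomFactor);
  const y1 = Math.round((rect.top + rect.height) * zoomFactor);
  return { x: x0, y: y0, width: Math.max(0, x1 - x0), height: Math.max(0, y1 - y0) };
}

// Lets the caller skip redundant native calls when ResizeObserver fires
// without an actual change in the rounded frame.
function sameFrame(a: NativeFrame, b: NativeFrame): boolean {
  return a.x === b.x && a.y === b.y && a.width === b.width && a.height === b.height;
}
```

In the renderer this would typically be driven from a ResizeObserver callback on the placeholder `<div>`, calling into the native module only when `sameFrame` returns false.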

Phase 2 — Parity with xterm features (3 weeks)

  • Selection + ⌘C copy
  • ⌘V paste (TS reads from clipboard, sends to native module which forwards to PTY)
  • ⌘F find-in-terminal
  • OSC8 hyperlinks (libghostty supports natively, wire to shell.openExternal)
  • Cursor blink / cursor styles consistent with current xterm config
  • Snapshot replay on attach (broker returns screen snapshot → libghostty writes it before subscribing)
  • View vs drive vs passthrough modes (mode passed to broker on attach; drive shows "Holding messages" UI via existing TS sidebar code, no native change needed)
  • Port the resize-bounce / SIGWINCH dance currently in use-terminal.ts:317
  • Click-to-focus, blur handling
  • Bell + title-change callbacks back to JS (route through existing UI hooks)
  • AgentNode graph preview: native module emits "recent chunks" snapshots back into the JS pty-buffer-store every ~200ms so the preview tile keeps working

Exit: no "I have to use the Win/Linux path to do X" cases on Mac.
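
The AgentNode preview mirroring is the one place where chunks still cross into JS, so it is worth bounding its cost. The sketch below models the coalescing logic in TypeScript for readability; in practice it would live in the native module. The class name, the 4096-character tail, and the polling style are assumptions. Only the ~200ms cadence comes from the checklist above.

```typescript
// Accumulates PTY chunks and releases at most one snapshot per interval, so
// the preview tile costs a few IPC messages per second regardless of how fast
// the agent is producing output.
class PreviewMirror {
  private tail = "";
  private dirty = false;
  private lastFlushMs = -Infinity;
  private intervalMs: number;
  private maxChars: number;

  constructor(intervalMs = 200, maxChars = 4096) {
    this.intervalMs = intervalMs;
    this.maxChars = maxChars;
  }

  // Called for every chunk received on the native WS.
  push(chunk: string): void {
    this.tail = (this.tail + chunk).slice(-this.maxChars);
    this.dirty = true;
  }

  // Called from a timer. Returns the text to mirror into pty-buffer-store,
  // or null when nothing new arrived or the interval has not elapsed yet.
  poll(nowMs: number): string | null {
    if (!this.dirty || nowMs - this.lastFlushMs < this.intervalMs) return null;
    this.dirty = false;
    this.lastFlushMs = nowMs;
    return this.tail;
  }
}
```

Passing the clock in as `nowMs` keeps the logic deterministic and easy to test without real timers.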

Phase 3 — Overlays, polish, ship (2 weeks)

  • Z-order strategy for overlays (dropdown menus, command menu ⌘K, dialogs, pending-messages popover, theme picker):
    • Default: native view auto-hides when a modal/popover opens, restores on close
    • Catalog every overlay that can appear over the terminal; verify each
  • Tab switching, split-pane page flips — terminal NSView gets attached/detached appropriately, scrollback preserved
  • Accessibility: VoiceOver, dynamic type for terminal font, keyboard-only navigation
  • CI: build native module for arm64 + x64, codesign, attach to release artifacts
  • Telemetry: capture typing-latency metric on Mac builds, gated by user opt-in (use the existing __pearTypingStats plumbing, reported via crash/analytics path if any)
  • Migration: no data migration needed; native terminal just attaches to existing broker sessions
  • Internal dogfood: 1-week shadow run on Mac, compare against the xterm.js path side-by-side
  • Ship behind a feature flag (PEAR_NATIVE_TERMINAL_MAC=1) for one release, then default-on, then remove the flag

Exit: flag flipped on for all Mac users; no regressions vs xterm.js path; typing-latency target met in real-world telemetry.
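
The default overlay strategy (hide the native view while any modal or popover is open, restore on close) comes down to tracking the set of open overlays. A minimal sketch, assuming each overlay can report open/close with a stable id and that the native module exposes some way to toggle visibility; `OverlayGate` and the `setHidden` callback are hypothetical names.

```typescript
// Hides the native terminal view while at least one overlay is open and
// restores it when the last one closes.
class OverlayGate {
  private openIds = new Set<string>();
  private setHidden: (hidden: boolean) => void;

  // `setHidden` stands in for whatever call toggles the NSView's visibility.
  constructor(setHidden: (hidden: boolean) => void) {
    this.setHidden = setHidden;
  }

  opened(id: string): void {
    const wasEmpty = this.openIds.size === 0;
    this.openIds.add(id);
    if (wasEmpty) this.setHidden(true); // first overlay: hide once
  }

  closed(id: string): void {
    if (!this.openIds.delete(id)) return; // unknown or already closed
    if (this.openIds.size === 0) this.setHidden(false); // last overlay gone
  }
}
```

Tracking ids rather than a bare counter makes duplicate open or close notifications harmless, which helps with the long tail of overlay edge cases noted under Risks.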

Total estimate

| Team size | Estimated duration |
| --- | --- |
| 1 dev with Electron native module + AppKit experience | ~6–8 weeks |
| 2 devs | ~4 weeks |

Risks

Ranked by severity:

  1. libghostty embedding limitations. If the C bindings don't expose something we need (e.g., custom mouse handling, programmatic scroll position), we either patch upstream or fall back to SwiftTerm. Mitigate: Spike D explicitly checks the API surface before committing.
  2. Z-order / overlay edge cases. The hide-on-overlay strategy is workable but has a long tail. Some menu animations will poke through. Mitigate: catalog every overlay in Phase 3 and test each; document any known quirks; accept some.
  3. Broker protocol stability for the PTY subset. If agent-relay refactors the WS protocol without warning, our mini-client breaks. Mitigate: pin the protocol version we support; assert version on connect; file the sister-issue asking for a documented contract or official Swift SDK.
  4. Two terminal codepaths to maintain. Mac native vs xterm.js for Win/Linux. Bug fixes diverge. Mitigate: shared TS abstractions where possible; consider deprecating Win/Linux terminal features that don't have libghostty parity rather than back-porting.
  5. CI complexity. Native module build, codesign, notarization on every release. Mitigate: do this in Phase 1, not at the end; cache builds aggressively.
  6. Native module crashes take down the renderer. A bug in libghostty or our binding could segfault the renderer. Mitigate: defensive bounds-checking at the FFI boundary; consider isolating in a Node worker if instability becomes a problem (would add IPC tax back for a subset of users).
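
The "assert version on connect" mitigation for risk 3 could look like the check below. It assumes the broker reports a `major.minor` protocol version string during the handshake, which this issue does not confirm; the pinned value and all names are hypothetical. The logic is shown in TypeScript for readability and would be mirrored in the native mini-client.

```typescript
// Major protocol version this build was written against (hypothetical value).
const SUPPORTED_MAJOR = 1;

type VersionCheck = { ok: true } | { ok: false; reason: string };

// Accept any minor version within the pinned major; reject everything else
// with a reason that can be logged or surfaced to the user.
function checkProtocolVersion(
  reported: string | undefined,
  supportedMajor: number = SUPPORTED_MAJOR,
): VersionCheck {
  if (reported === undefined) {
    return { ok: false, reason: "broker did not report a protocol version" };
  }
  const match = /^(\d+)\.(\d+)/.exec(reported);
  if (!match) {
    return { ok: false, reason: `unparseable version: ${reported}` };
  }
  const major = Number(match[1]);
  if (major !== supportedMajor) {
    return { ok: false, reason: `unsupported major version ${major}, expected ${supportedMajor}` };
  }
  return { ok: true };
}
```

What the client does on a failed check is left open here; the point is that a protocol change fails loudly at connect time rather than as corrupted terminal output.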

Open questions

  • Which libghostty version do we pin to? Vendor or git-submodule?
  • Do we run libghostty in the renderer process (current plan) or in a separate Electron utility process? Renderer is simpler and faster, but a utility process would sandbox crashes.
  • Telemetry/opt-in: do we have a reporting channel for the typing-latency metric, or just rely on user-reported numbers via the __pearTypingStats console helper?
  • Feature flag mechanism — env var, settings.json, runtime toggle? (Probably settings.json with a nativeTerminalMac boolean.)
  • Minimum supported macOS version? libghostty's requirements likely set the floor.
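
One way the feature-flag question could resolve is sketched below: a settings.json boolean as the default, with the PEAR_NATIVE_TERMINAL_MAC environment variable as an override. The precedence (env over settings), the "0" opt-out value, and the function and type names are assumptions; only the variable name and the nativeTerminalMac key come from this issue.

```typescript
type TerminalBackend = "native-libghostty" | "xterm";

interface FlagInputs {
  platform: string;                          // e.g. process.platform
  env: Record<string, string | undefined>;   // e.g. process.env
  settings: { nativeTerminalMac?: boolean }; // parsed settings.json
}

// Decide which terminal implementation <TerminalInstance> should mount.
function selectTerminalBackend({ platform, env, settings }: FlagInputs): TerminalBackend {
  // Windows and Linux never take the native path.
  if (platform !== "darwin") return "xterm";

  // Assumed precedence: an explicit env var wins over settings.json.
  const envFlag = env["PEAR_NATIVE_TERMINAL_MAC"];
  if (envFlag === "1") return "native-libghostty";
  if (envFlag === "0") return "xterm"; // assumed opt-out value

  return settings.nativeTerminalMac === true ? "native-libghostty" : "xterm";
}
```

Keeping the decision in one pure function means flipping to default-on, and later removing the flag, touches a single place.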

Success criteria

  • p50 keystroke-to-paint ≤ 15ms on Mac for a shell prompt, within 5ms of Ghostty on the same hardware
  • No regression on Windows/Linux (xterm.js path untouched)
  • No regression in any feature listed in Phase 2 parity checklist
  • CI builds and codesigns the native module on every release
  • Feature flag removed; default-on for Mac users in a stable release

Relationship to other issues

Sister-issue on agent-relay repo

We'll file a separate issue on the agent-relay repository asking for:

  1. A documented stable contract for the PTY subset of the broker WebSocket protocol (or an explicit "this is what the TS SDK does, treat it as the spec").
  2. An official Swift package exposing that subset. Until it lands we'll hand-roll a mini-client; once available we'll swap.

The agent-relay request links back to this issue.
