Skip to content

feat(plugin): add M4 pane plugin system with typed panes and session …#4

Merged
artyomsv merged 1 commit into
masterfrom
feature/m4-plugin-system
Mar 16, 2026
Merged

feat(plugin): add M4 pane plugin system with typed panes and session …#4
artyomsv merged 1 commit into
masterfrom
feature/m4-plugin-system

Conversation

@artyomsv

Copy link
Copy Markdown
Owner

…resume

Complete M3 (Resume Engine) and M4 (Plugin System) milestones:

  • Plugin system with registry, TOML loading, 4 built-in plugins (terminal, claude-code, ssh, stripe)
  • Pane creation dialog (Ctrl+N) with category/plugin/split selection
  • Session resume via pre-assigned UUIDs for Claude Code
  • Error handler pattern matching with help dialogs
  • Atomic pane replacement (ReplacePane)
  • Window size persistence (Win32 MoveWindow / xterm sequence)
  • Resuming/preparing spinner on pane border
  • Ghost buffer replay skipped for TUI app panes
  • Platform-specific cursor handling for non-terminal panes

…resume

Complete M3 (Resume Engine) and M4 (Plugin System) milestones:

- Plugin system with registry, TOML loading, 4 built-in plugins
  (terminal, claude-code, ssh, stripe)
- Pane creation dialog (Ctrl+N) with category/plugin/split selection
- Session resume via pre-assigned UUIDs for Claude Code
- Error handler pattern matching with help dialogs
- Atomic pane replacement (ReplacePane)
- Window size persistence (Win32 MoveWindow / xterm sequence)
- Resuming/preparing spinner on pane border
- Ghost buffer replay skipped for TUI app panes
- Platform-specific cursor handling for non-terminal panes

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@artyomsv artyomsv merged commit 9477725 into master Mar 16, 2026
1 check passed
@artyomsv artyomsv deleted the feature/m4-plugin-system branch March 16, 2026 00:40
artyomsv added a commit that referenced this pull request Jun 8, 2026
Apply all Critical / High / Medium / Low items surfaced by the code
review.

Source changes:

- internal/ipc/server.go
  * (C1) Broadcast now clones the marshaled frame via slices.Clone
    before fan-out. Today no callsite mutates it, but a future
    sync.Pool of bytes.Buffer would silently corrupt frames still
    being read by per-conn sendLoops; the clone makes that contract
    bullet-proof.
  * (L4 + M3) Overflow path is CAS-guarded — the warning log fires
    exactly once per slow conn and only one Close goroutine spawns,
    instead of N redundant log lines and N redundant goroutines while
    the conn is being reaped.
  * (H1 belt-and-suspenders) sendLoop sets a 30 s SetWriteDeadline per
    frame so a kernel-buffer wedge that doesn't error has a
    deterministic cleanup ceiling.
  * (M4) log.Printf migrated to logger.Warn / logger.Error so the
    project's leveled logger controls visibility.
  * (M2) Document why Send and sendFrame both check closed/overflow —
    outer check is the marshal-cost fast path, inner check is the
    race-safe gate next to the channel send.
  * (H2) Close() doc comment clarifies pending frames are intentionally
    discarded.
  * (L6) Guard comment on the conns snapshot copy in Broadcast warns
    against future "optimizations" that would reintroduce the wedge.
  * (M6) Server.Stop comment clarifies Daemon shutdown does not rely
    on a final IPC broadcast (durability lives in the on-disk
    workspace.json path).
  * New: Server.ConnCount() for test-friendly connect/disconnect sync.

- internal/daemon/event.go
  * (H4) Reserved Data key renamed to "_quil_truncated" — the new
    _quil_ namespace prevents accidental collision with caller-supplied
    keys (including a literal "truncated" key that emitters may use).
  * (H3) Compile-time invariant block asserts both caps strictly exceed
    len(truncationMarker); a future contributor lowering either cap
    below the marker length will fail to build instead of panic at
    runtime.
  * (Rules #1) const block reflowed; gofmt-clean.
  * (L1) Marker length explained (… is 3-byte UTF-8 + "[truncated]" =
    14 bytes, not 12).
  * (L2) Truncation strategy doc comment names the tail-keep rationale
    and points future emitters at TruncationStrategy if they need head.
  * (L3) TODO comment on sync.Pool trade-off (defer until profiling
    demands it; clone-then-pool is the correct sequence).

Test changes:

- internal/ipc/conn_internal_test.go (NEW, package ipc white-box)
  * TestSendLoop_ExitsOnWriteError — uses net.Pipe to force a write
    error mid-frame; asserts no goroutine leak (QA #1 finding).
  * TestConn_CloseIdempotent — 16 concurrent Close calls, none panic,
    closed flag set exactly once.
  * TestConn_SendFrameAfterCloseShortCircuits — sendFrame and Send
    after Close return ErrSendOverflow without touching the socket.

- internal/ipc/broadcast_resilience_test.go
  * (M5) Replaced time.Sleep-after-connect with waitForConnCount
    polling on the new ConnCount() accessor. No more CI flake risk
    from racing the connect-registration window.
  * (Rules #4) Added t.Parallel() to both tests.
  * (L7) TestBroadcast_MarshalErrorLogsAndReturns covers the
    previously-untested marshal-error path.

- internal/daemon/event_sizecap_test.go
  * Updated existing 4 tests for the renamed _quil_truncated key.
  * Added t.Parallel() to all 7 tests.
  * (QA #2) TestToPaneEventPayload_ExactCapBoundary verifies the > cap
    condition rejects an off-by-one refactor to >=.
  * (L8) TestToPaneEventPayload_BothOverCapSetsFlagOnce — when both
    Message and Data value exceed cap, the flag is set exactly once.
  * NEW TestToPaneEventPayload_ReservedKeyDoesNotClobberCaller proves
    the _quil_ namespace fix: a caller's literal "truncated" key
    survives untouched alongside the daemon's _quil_truncated flag.

CLAUDE.md: `internal/ipc/` bullet expanded with the resilience design
(sendCh, sendLoop, CAS overflow, ConnCount, size caps) per
documentation-maintenance rule.

NewClient audit (cmd/quil/main.go, mcp.go, version_gate.go): all
callsites either defer Close or explicitly close. SetWriteDeadline in
sendLoop is the runtime safety net for future regressions.

All green under -race.
artyomsv added a commit that referenced this pull request Jun 8, 2026
All Critical / High / Medium / Low items from the final review, plus
the test-coverage gaps QA flagged. Every fix has a one-liner in the
review notes for traceability.

## Critical (1)

C1 Ingester pending timer leak past pane destroy
  internal/hookevents/ingest.go — new Ingester.Cancel(paneID) stops
  every AfterFunc timer keyed by paneID and removes the per-pane rate
  bookkeeping. Wired into both handleDestroyPane / handleDestroyPaneReq
  paths in daemon.go so a destroyed pane cannot fire one final stale
  event ~50 ms later via the coalescer's natural window close.

## High (4)

H1 Path traversal via Spool.Cleanup(paneID)
  daemon.go — both destroy IPC handlers now validate payload.PaneID
  with isValidHexID before any FS-touching call. Defense-in-depth:
  spool.go safePaneID rejects /, \, .., NUL etc and Cleanup uses
  filepath.Clean + HasPrefix to ensure the resolved path stays under
  the spool dir. New tests TestSafePaneID and
  TestSpool_Cleanup_RejectsTraversalPaneID cover the guard.

H1 (CR) FlushAll shutdown race
  ingest.go — Ingester.closed flag set by FlushAll; subsequent Submit
  is a no-op. FlushAll now Stops every pending timer before draining
  so AfterFunc cannot fire after FlushAll returns. New tests
  TestIngester_Submit_AfterFlushAll_IsNoOp and
  TestIngester_FlushAll_StopsPendingTimers pin the invariants.

H2 QUIL_HOOK_MODE was a dead knob
  All three hook script producers (sh, ps1, js) now read $QUIL_HOOK_MODE
  and short-circuit the spool emission when mode == "off". Daemon-side
  backstop in emitHookEvent drops events when the resolved
  d.cfg.Notification.Hooks.<Source> is "off" — covers the case where
  the script-side gate didn't fire (older script on disk, env stripped).
  Storm diagnostics always pass.

H3 First hook event being storm flips HookHealthy=true
  daemon.go — emitHookEvent only sets pane.HookHealthy / LastHookEventAt
  when p.HookEvent != hookevents.EventStorm. The rate limiter's own
  self-emitted storm payload would otherwise mark a pane "hook-healthy"
  exactly when real events have stopped flowing, silencing the legacy
  idle excerpt fallback during the 10 s penalty window.

## Medium (10)

Sec M1 Cross-pane spoofing via payload.PaneID
  spool.go parseAndValidate now drops payloads whose pane_id field does
  not match the filename-derived paneID. A compromised plugin in pane A
  can no longer write events attributed to pane B.

Sec M2 Unbounded spool growth
  spool.go new rotation: when readPaneFile observes size == offset
  (fully drained) AND size >= 16 MiB, truncate the file and reset the
  offset. Rotation runs only on idle ticks so it never collides with
  in-flight reads.

CR M2 io.ReadAll OOM on multi-MB single line
  spool.go switches from io.ReadAll + bytes.Split to bufio.Reader.
  ReadBytes('\n') with per-line size enforcement. A 100 MB single line
  from a misbehaving producer no longer allocates into memory; it's
  detected at MaxTotalBytes and dropped after advancing past it.

CR M4 PowerShell JsonEscape backslash + C0 control bytes
  quil-session-hook.ps1 — backslash replacement now produces exactly
  two chars (the prior pattern produced four backslashes per input
  backslash, breaking every Windows path). Added a [regex]::Replace
  pass that converts [\x00-\x1f] to \uXXXX so ESC and other ANSI
  control bytes from Claude tool output don't break the JSON line.

CR M5 OpenCode JS clock-jump-backward
  quil-session-tracker.js — consumeToken clamps elapsed to ≥ 0 and
  resets the refill anchor on backward jumps. Without this, an NTP
  correction would freeze the rate limiter until the clock caught up.

CR M6 OpenCode JS UTF-8 mid-codepoint truncation
  quil-session-tracker.js — truncate walks backward over UTF-8
  continuation bytes (0x80-0xBF) so the cut lands on a codepoint
  boundary. The trailing "…" replaces clean truncated content instead
  of polluting it with U+FFFD.

CR M7 Unbounded timestamp on hook payload
  daemon.go emitHookEvent — clamps p.TsMs to within ±1 hour of the
  daemon's clock; events with skewed timestamps fall back to time.Now.

Rules #2 + #3 + #5 Observability + error wrapping
  spool.go — parsePayloads warns are sampled (1 in N per pane) to
  prevent log flooding from a misbehaving producer. Unmarshal errors
  log only the byte size, never err.Error() which could fragment user
  prompt content into quild.log. Spool.Init returns wrapped errors
  with package context.

Rules #4 docs/features.md updates
  features.md Notification center section now describes hook-driven
  events, the tier knob (default / verbose / off), the flow diagram,
  and the Alt+M mute keybinding.

## Low + polish

L1 forwardedHookEvents duplicate detection
  claudehook_test.go — TestForwardedHookEvents_NoDuplicates catches a
  silently-deduped slice entry at build time.

L2 FlushAll doc comment
  ingest.go — doc comment now correctly notes it's called from
  hookEventsWatcher, not Daemon.Stop.

L3 emitHookEvent Data map double-allocation
  daemon.go — single allocation with len(p.Data)+2 capacity.

L5 Dead awk truncate() function
  quil-session-hook.sh removed; replaced inline truncation guidance.
  json_escape simplified to a portable tr+sed pipeline (no GNU sed
  `:a;N;$!ba` slurp pattern; works on busybox / macOS sed).

Sec L2 PowerShell C0 control bytes
  Same fix as CR M4 above.

EventStorm constant
  ingest.go exports the literal "internal.event_storm" string so both
  producer (stormPayload) and consumer (emitHookEvent) reference the
  same constant.

## Test additions (7 new test functions)

- TestSafePaneID (table-driven, 9 cases)
- TestSpool_Cleanup_RejectsTraversalPaneID
- TestIngester_Cancel_DropsPendingForPane
- TestIngester_Cancel_DoesNotAffectOtherPanes
- TestIngester_Cancel_Idempotent
- TestIngester_Submit_AfterFlushAll_IsNoOp
- TestIngester_FlushAll_StopsPendingTimers
- TestForwardedHookEvents_NoDuplicates

All green under go test ./... and go test -race ./...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant