
Ship: BB SDK (core + realtime), custom models, app scaffold, CC workflows + adversarial-review fixes #64

Merged
SawyerHood merged 24 commits into main from sawyer/ship-2026-06-04
Jun 4, 2026

Conversation

@SawyerHood
Collaborator

Summary

Daily ship of sawyer-next → main: the 15-commit delta since sawyer/ship-2026-06-03 (#59), rebased onto current main, plus a full adversarial multi-agent review of the entire diff with all confirmed findings fixed.

Features

  • BB SDK Phase 1 — new @bb/sdk core package, window.bb with full CLI parity, CLI rewired as an SDK consumer
  • BB SDK Phase 2 — realtime events via bb.on (server hub → SDK subscriptions) + hardening
  • Custom provider models via managed config.json (per-provider reasoning ladders)
  • New-app scaffold — Vite/React/TS todo template + SDK-type codegen; template .d.ts ships in the packaged app
  • Claude Code workflows + ultracode reasoning level
  • Apps live-reload on public/ changes; global-app skills catalog hardening
  • Manager-templates feature removed end to end
  • Watchdog: thread-scoped background-task activity keeps provider turns alive
  • Desktop: webview jitter elimination, actions-menu clickability, keychain signing for local packaged builds

Adversarial review → fixes

A 238-agent review (11 units × 3 lens bundles, refuter panels per finding, completeness critic) confirmed 90 findings (10 high / 33 medium / 47 low; 20 refuted). All were fixed in the seven fix(review): commits, including:

  • Lifecycle: lost background tasks now settle on daemon restart / lease expiry / disconnect grace; settling no longer flips completed workflows; late task events no longer break turn-summary expansion (HTTP 400)
  • Contracts: decorative capability/bootstrap fields deleted end to end; workflowsEnabled required downstream of the boundary; dead taskType carry removed; IPC layout change tolerates legacy shape under version skew
  • Perf: window.bb runtime served from a content-hashed immutable endpoint instead of ~865KB inlined per app HTML load; backgroundTask rows fetched by targeted query
  • Codegen: generated app bb-sdk.d.ts is valid (and the drift test now typechecks output); committed runtime bundle has a staleness guard
  • Dedup/dead code: old CLI HTTP client deleted (SDK transport is now the tested path, with typed errors); realtime dispatch machinery collapsed; three host-daemon message buffers unified
  • Tests that couldn't fail now fail on revert (BrowserTabDeck rAF, ThreadActionsMenu, skill-catalog plumbing, runtime-manager hash masking)

Deliberate deferrals / follow-ups

  • partysocket adoption for the realtime client: confirmed real but low severity. The hardening commit already fixed and pinned the lifecycle bugs, so swapping libraries on ship day would be net-negative on risk
  • Two timeline edge cases observed while fixing (not review findings): buildTimelineTurnSummaryDetails doesn't backfill latest backgroundTask state for its range; selectAcceptedClientRequestContextRows computes afterSequence from post-injection rows
  • Move scaffold module + template to services/apps/ atomically (currently co-located in services/threads/ beside the template)
  • Long-term: daemon-reported supported reasoning efforts via provider.list

Validation

  • turbo run build / typecheck — green across all 31 tasks
  • turbo run test --force — 30/30 packages green (includes new regression tests for every behavioral fix)

🤖 Generated with Claude Code

SawyerHood and others added 24 commits June 4, 2026 12:36
…+ CLI consumer

New @bb/sdk package: typed per-area SDK (threads/apps/hosts/environments/
projects/providers/managers/status/guide/replay) with a transport abstraction
that runs in node (CLI), browser, and injected app runtimes. Types derive from
@bb/server-contract / @bb/domain. CLI refactored to a thin consumer.

window.bb is regenerated from the real createInjectedBbSdk (single source of
truth, no hand-rolled string), exposing the full CLI-power surface with object-
arg data/message APIs and websocket-backed data.onChange realtime. No back-compat
shims (prototype). Phase 2 (bb.on realtime event map) deferred with seams left.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The thread/manager header sits in an Electron title-bar drag region, so the
"..." ThreadActionsMenu trigger was swallowed as a window drag. Apply the
existing MACOS_WINDOW_NO_DRAG_CLASS to the header trigger only (sidebar usage
unchanged).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The native WebContentsView bounds were driven only by an async renderer
pipeline (ResizeObserver/window resize -> rAF -> IPC -> setBounds), so during
a synchronous OS window resize the view trailed the window edge by >=1 frame +
an IPC hop. Electron 41's WebContentsView has no setAutoResize.

Introduce a resize-invariant layout descriptor ({left, top, rightInset,
bottomInset}) the renderer emits only on layout-shape changes. The main process
caches it per view and, on the host BrowserWindow will-resize/resize events,
re-projects bounds from getContentBounds() and calls setBounds() synchronously
in lockstep with the OS resize. Preserves the prior keep-alive (#94),
panel-resize (#96), and containment (#128) behavior.
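
A minimal sketch of the re-projection step described above, assuming the descriptor is measured in the same coordinate space as the window's content size. Only the four descriptor field names come from this commit message; `Rect`, `projectBounds`, and the clamping to zero are illustrative assumptions.

```typescript
// Resize-invariant layout: distances from the content edges, not absolute bounds.
interface LayoutDescriptor {
  left: number;
  top: number;
  rightInset: number;
  bottomInset: number;
}

// Hypothetical rectangle shape, similar to what a setBounds() call would take.
interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Recompute absolute bounds from the cached descriptor and the current content size.
// Clamping to zero (an assumption) avoids negative sizes when the window is tiny.
function projectBounds(
  layout: LayoutDescriptor,
  content: { width: number; height: number },
): Rect {
  return {
    x: layout.left,
    y: layout.top,
    width: Math.max(0, content.width - layout.left - layout.rightInset),
    height: Math.max(0, content.height - layout.top - layout.bottomInset),
  };
}

const exampleLayout: LayoutDescriptor = { left: 240, top: 48, rightInset: 0, bottomInset: 24 };
console.log(projectBounds(exampleLayout, { width: 1200, height: 800 }));
// { x: 240, y: 48, width: 960, height: 728 }
```

Because the descriptor does not change while the window is being dragged, the main process can run this projection on every resize event without waiting for the renderer.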

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Delete manager templates end to end — server route + storage-template
service, host-daemon command handler, the domain/server/host-daemon contract
types and fields (templateName / managerTemplateName / ManagerTemplateName /
GET /manager-templates / host.list_manager_templates), the bb manager CLI
options, the frontend ManagerTemplatePicker + compose-view usage, the
bb-guide-manager-templates chapter (regenerated), and all related tests.

Managers are still created normally; they now start from default/empty thread
storage with no template selection or seeding. Existing on-disk
~/.bb/manager-templates/ user data is left intact and is now unused.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…degen

Replace the static single-file app scaffold with a full Vite + React +
TypeScript Todo app that showcases the window.bb SDK and live data-binding:
- editable source/ + pre-built public/ (served web root, flat refs), copied
  into each new app by the server (excludes dev-only screenshots/report dirs);
- an add-todos skill + README;
- SDK types are GENERATED from @bb/sdk (self-contained ambient window.bb d.ts,
  no imports) via packages/sdk/scripts/generate-app-globals-dts.mjs, with a
  drift-guard test so the vendored template types can't silently diverge.

Polished UI pass (lucide, bb oklch tokens, responsive); Live-pill + bound-data
showcase removed; error notices retained.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
electron-builder's default node_modules .d.ts pruning stripped the
app-scaffold-template's generated source/src/bb-sdk.d.ts from app.asar, so apps
created in the packaged desktop app scaffolded without their window.bb types.
Add a dedicated files FileSet for the template tree so it ships verbatim,
without relaxing .d.ts pruning elsewhere. Regression test added.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add a typed realtime subscription API to @bb/sdk: bb.on({event, ...scope,
callback}) returning an idempotent unsubscribe, across thread/project/
environment/host/system (config/apps) / app / app-data (changed+resync) plus a
realtime:connection lifecycle event. One websocket per SDK instance with
ref-counted subscriptions, reconnect + resubscribe, and contract-derived
payloads (no SDK-only mirrors). window.bb.data.onChange now rides the shared
socket with full Phase 1 parity (subscribe-before-replay, buffer, version
dedupe, resync + reconnect replay). Server broadcasts app:changed alongside
system:apps-changed; app-data stays per-application.

Hardened after dual review: reject orphaned socket-ready promise on
close-before-open; connection listeners are observers (no socket ownership);
reset reconnect intent on idle close (no double-replay); reconnect replays
before emitting connected; fail-safe outgoing broadcast validation.
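
The ref-counted subscription and idempotent unsubscribe behavior can be sketched as follows. This is an illustration under stated assumptions, not the SDK's implementation: `RefCountedChannel` and its `opens`/`closes` counters are hypothetical stand-ins for opening and closing the shared websocket.

```typescript
type Unsubscribe = () => void;

// Hypothetical stand-in for one shared socket with ref-counted listeners.
class RefCountedChannel {
  private listeners: Array<(payload: unknown) => void> = [];
  opens = 0; // how many times the "socket" was opened
  closes = 0; // how many times it was closed

  subscribe(callback: (payload: unknown) => void): Unsubscribe {
    if (this.listeners.length === 0) this.opens += 1; // first subscriber opens
    // Wrap the callback so subscribing the same function twice yields two entries.
    const entry = (payload: unknown): void => callback(payload);
    this.listeners.push(entry);

    let active = true;
    return () => {
      if (!active) return; // idempotent: repeated calls are no-ops
      active = false;
      this.listeners = this.listeners.filter((l) => l !== entry);
      if (this.listeners.length === 0) this.closes += 1; // last one closes
    };
  }

  emit(payload: unknown): void {
    // Iterate a snapshot so listeners added during dispatch miss this event.
    for (const listener of this.listeners.slice()) listener(payload);
  }
}

const channel = new RefCountedChannel();
const offA = channel.subscribe(() => {});
const offB = channel.subscribe(() => {});
offA();
offA(); // second call does nothing
offB(); // now the channel closes exactly once
```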

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Local electron-builder runs previously forced identity=null and
CSC_IDENTITY_AUTO_DISCOVERY=false unless the full CI secret set was
present, so every locally packaged bb.app shipped the prebuilt
Electron's invalidated adhoc linker signature (spctl: "code has no
resources but signature indicates they must be present"). macOS
provenance-tracks such bundles, which forces syspolicyd to evaluate and
journal every exec in the app's process tree — observed pegging
syspolicyd at ~380% CPU and stalling process launches system-wide.

Signing now resolves to one of three modes:
- environment: full CI secrets — sign with the provided cert + notarize
  (published-release path, unchanged)
- keychain: no secrets — sign via keychain auto-discovery, skip
  notarization (locally built apps never carry the quarantine xattr, so
  notarization is unnecessary; a valid Developer ID seal is what lets
  Gatekeeper cache assessments and keeps fresh launches out of the
  provenance sandbox)
- disabled: no secrets and CSC_IDENTITY_AUTO_DISCOVERY=false — explicit
  unsigned build (CI workflow-artifact-only path, unchanged)
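
The three-way resolution above might look roughly like this. The commit message does not list which variables make up the "full CI secret set", so `ASSUMED_CI_SECRETS` is a hypothetical list and `resolveSigningMode` is an illustrative name.

```typescript
type SigningMode = "environment" | "keychain" | "disabled";

// Assumption: the real build checks its own set of secrets; these names are placeholders.
const ASSUMED_CI_SECRETS = [
  "CSC_LINK",
  "CSC_KEY_PASSWORD",
  "APPLE_ID",
  "APPLE_APP_SPECIFIC_PASSWORD",
  "APPLE_TEAM_ID",
];

function resolveSigningMode(env: Record<string, string | undefined>): SigningMode {
  // A partial secret set is treated the same as no secrets.
  const hasAllSecrets = ASSUMED_CI_SECRETS.every((name) => (env[name] ?? "") !== "");
  if (hasAllSecrets) return "environment"; // sign with provided cert + notarize
  if (env.CSC_IDENTITY_AUTO_DISCOVERY === "false") return "disabled"; // explicit unsigned build
  return "keychain"; // sign via keychain auto-discovery, skip notarization
}

console.log(resolveSigningMode({})); // "keychain"
console.log(resolveSigningMode({ CSC_IDENTITY_AUTO_DISCOVERY: "false" })); // "disabled"
```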

Validated: packaged app deep-verifies strict with sealed resources
(6105 files), spawn-helper and better_sqlite3.node carry the Team ID,
and the hardened-runtime app fully boots (bridge/server/daemon all
re-exec the signed binary via ELECTRON_RUN_AS_NODE).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add a customModels key to the bb-app managed config so users can register
model ids the provider catalog doesn't offer (e.g. non-public preview
models):

  { "customModels": [{ "providerId": "claude-code", "model": "<id>",
    "displayName": "<label>" }] }

The server appends them to the daemon-reported model list in
/system/execution-options — including when the provider catalog fails to
load — with catalog metadata winning on model-id collision (selected-only
entries are promoted rather than shadowed). Reasoning ladders are
per-provider: claude-code gets the full low-max ladder, while codex and pi
cap at xhigh since both reject "max" provider-wide. providerId is
validated by agentProviderIdSchema at the schema boundary so the launcher
and server agree on validity and reload 422s name the offending field.
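
A rough sketch of the merge and ladder rules described above, under stated assumptions: the `AvailableModel` shape, the `source` field, and the intermediate rungs "medium" and "high" are illustrative (the commit message only names "low", "xhigh", and "max"), and promotion of selected-only entries is not modeled.

```typescript
type ProviderId = "claude-code" | "codex" | "pi";

interface CustomModel {
  providerId: ProviderId;
  model: string;
  displayName: string;
}

// Hypothetical shape; the real AvailableModel carries more metadata.
interface AvailableModel extends CustomModel {
  reasoningEfforts: readonly string[];
  source: "catalog" | "custom";
}

// Assumed rungs between "low" and "max".
const FULL_LADDER = ["low", "medium", "high", "xhigh", "max"] as const;

function ladderFor(providerId: ProviderId): readonly string[] {
  // codex and pi reject "max", so their ladder stops at "xhigh".
  return providerId === "claude-code"
    ? FULL_LADDER
    : FULL_LADDER.filter((level) => level !== "max");
}

function mergeModels(catalog: AvailableModel[], custom: CustomModel[]): AvailableModel[] {
  const keyOf = (m: { providerId: string; model: string }): string => `${m.providerId}:${m.model}`;
  const seen = new Set(catalog.map(keyOf));
  const merged = catalog.slice();
  for (const entry of custom) {
    if (seen.has(keyOf(entry))) continue; // catalog metadata wins on collision
    seen.add(keyOf(entry));
    merged.push({ ...entry, reasoningEfforts: ladderFor(entry.providerId), source: "custom" });
  }
  return merged;
}
```

Passing an empty `catalog` covers the case where the provider catalog fails to load: the custom entries are still returned.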

The launcher preserves customModels across managed-config writes
(pruneManagedConfig previously dropped unknown keys) and surfaces entries
in `bb-app config list`. Reasoning-effort constants move from
@bb/agent-runtime to @bb/domain so the server can build AvailableModel
entries without depending on the runtime package.

No frontend changes: config-changed notifications already invalidate the
execution-options queries, so open pickers refresh on reload.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Translate the agent SDK's task event family (task_started/task_updated/
task_progress/task_notification) into a new backgroundTask thread item:
dynamic workflows render as a live timeline row with phase groups and
per-agent progress, folded from workflow_progress delta batches and
throttled adapter-side. Progress/completion events are thread-scoped so
tasks that outlive their spawning turn stay pagination-safe; timeline
windows backfill the latest task state for in-window items.

Lifecycle ownership: the adapter settles open tasks on thread/resume and
provider process exit; the server settles dangling items when a daemon
session re-registers with a new instance id; superseded progress rows are
pruned keep-latest-while-pending.

Adds the "ultracode" reasoning level (ranked between xhigh and max,
reconciling down to xhigh on model switch) for xhigh-capable claude-code
models. The server-owned workflowsEnabled policy flows explicitly through
the daemon contract (protocol v31); the adapter decomposes ultracode into
effort "xhigh" plus flag-tier settings {enableWorkflows, ultracode}.
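
To illustrate the ranking and decomposition described above, here is a hedged sketch. The rungs "medium" and "high", the function names, and the generic "walk down to the nearest supported level" rule are assumptions; the commit message only states that ultracode sits between xhigh and max, reconciles down to xhigh, and decomposes into effort "xhigh" plus the two flags.

```typescript
// Assumed full ordering; only the relative position of ultracode is from the commit message.
const REASONING_LEVELS = ["low", "medium", "high", "xhigh", "ultracode", "max"] as const;
type ReasoningLevel = (typeof REASONING_LEVELS)[number];

interface AdapterSettings {
  effort: Exclude<ReasoningLevel, "ultracode">;
  enableWorkflows: boolean;
  ultracode: boolean;
}

// ultracode is not a provider effort: it maps to "xhigh" plus flag-tier settings.
// Passing workflowsEnabled straight through is an assumption about the policy.
function decompose(level: ReasoningLevel, workflowsEnabled: boolean): AdapterSettings {
  if (level === "ultracode") {
    return { effort: "xhigh", enableWorkflows: workflowsEnabled, ultracode: true };
  }
  return { effort: level, enableWorkflows: workflowsEnabled, ultracode: false };
}

// On model switch, keep the level if supported, otherwise walk down the ladder.
function reconcileOnModelSwitch(
  level: ReasoningLevel,
  supported: readonly ReasoningLevel[],
): ReasoningLevel {
  for (let i = REASONING_LEVELS.indexOf(level); i >= 0; i -= 1) {
    if (supported.indexOf(REASONING_LEVELS[i]) !== -1) return REASONING_LEVELS[i];
  }
  return REASONING_LEVELS[0]; // fallback when nothing at or below the level is supported
}

console.log(reconcileOnModelSwitch("ultracode", ["low", "medium", "high", "xhigh"])); // "xhigh"
```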

Also: status:null now completes dangling contextCompaction items, and
session_state_changed/tool_progress/tool_use_summary are classified so
they no longer surface as provider/unhandled debug rows. Fixtures are
captured from a real workflow run driven through the agent SDK.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Fixes from a 99-agent adversarial review of e0b03c6d7 (41 raw findings →
28 confirmed after 3-lens verification):

Client lifecycle (packages/sdk/src/realtime-client.ts):
- connectSocket constructs the socket before creating the socket-ready
  promise, so a sync throw (bad URL/factory) can no longer orphan a
  pending promise into an unhandled rejection; the reconnect timer body
  also contains sync throws instead of crashing the process
- onclose always records reconnect intent and emits 'disconnected' even
  while a stale backoff timer is pending, and a listener-driven connect
  cancels the pending timer — previously a close during backoff silently
  skipped the reconnect replay (lost app-data events, Phase 1 parity
  regression) and the orphaned timer escalated the delay while connected
- idle close that cancels a pending reconnect now emits a terminal
  disconnected event so observers don't wait forever on a promised retry
- dispatch iterates a listener snapshot: listeners added inside a
  callback no longer receive the in-flight event (preserves
  subscribe-then-replay ordering); replay delivery loops re-check
  listener.active so unsubscribe stops deliveries immediately
- buffered-event flush uses a per-path last-version rule: when the
  replay snapshot already delivered a path's final buffered state, all
  buffered events for that path are skipped — the old exact-match dedupe
  could re-deliver an older value last, leaving consumers permanently
  stale (versions are content hashes, so ordering cannot disambiguate)
- reconnect now notifies app-data:resync subscribers (broadcasts may
  have been missed) before replaying and emitting reconnected
- late realtime:connection observers receive the current state as a
  microtask snapshot instead of observing nothing until the next
  transition
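
The per-path last-version rule for the buffered-event flush can be sketched like this. `DataEvent` and `flushBuffered` are hypothetical names, and delivering all buffered events for a path that is not skipped is an assumption about the unskipped case.

```typescript
interface DataEvent {
  path: string;
  version: string; // content hash, so versions cannot be ordered
  value: unknown;
}

// snapshotVersions: version per path that the replay snapshot already delivered.
// buffered: events that arrived (in order) while the replay was in flight.
function flushBuffered(
  snapshotVersions: Record<string, string | undefined>,
  buffered: readonly DataEvent[],
): DataEvent[] {
  // Find the final buffered version for each path.
  const lastVersionByPath: Record<string, string> = {};
  for (const event of buffered) lastVersionByPath[event.path] = event.version;

  // If the snapshot already matches a path's final buffered state, skip every
  // buffered event for that path; otherwise deliver them in arrival order.
  return buffered.filter(
    (event) => snapshotVersions[event.path] !== lastVersionByPath[event.path],
  );
}

const pending: DataEvent[] = [
  { path: "todos", version: "h2", value: ["a"] },
  { path: "todos", version: "h3", value: ["a", "b"] },
  { path: "settings", version: "s1", value: {} },
];
console.log(flushBuffered({ todos: "h3" }, pending).map((e) => e.path)); // [ 'settings' ]
```

With exact-match dedupe, only the `h3` event would be dropped and the older `h2` value would be delivered last, which is the stale-consumer case the commit describes.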

Version-skew tolerance (domain/server-contract):
- changed-message and ThreadChangeMetadata types are now z.infer-derived
  from their schemas — schema/type drift is a compile error instead of a
  silent fail-closed realtime blackout at the hub's outgoing gate
- new lenient inbound schemas (changedMessageLenientSchema,
  serverMessageLenientSchema): clients strip unknown fields and filter
  unknown change kinds instead of dropping whole messages when talking
  to a newer server; SDK and web app now parse inbound traffic with
  them (the strict schemas keep guarding the server's outgoing boundary)
- changes arrays are readonly in the message types; entity events are
  delivered as shared objects with mutation blocked at compile time
- dropped the dead optional id from AppChangedMessage (never emitted,
  never consumed; scaffolded apps could only write filters that never
  match) and documented app:changed as a global app-list signal
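
A dependency-free sketch of what lenient inbound parsing means here: unknown fields are stripped and unknown change kinds are filtered, rather than the whole message being dropped. The PR uses zod schemas for this; the hand-rolled parser, the `type: "changed"` discriminator, and the change kinds listed in `KNOWN_CHANGE_KINDS` are assumptions for illustration.

```typescript
// Hypothetical set of change kinds this client version understands.
const KNOWN_CHANGE_KINDS = ["created", "updated", "deleted"];

interface ChangedMessage {
  type: "changed";
  entity: string;
  changes: readonly string[];
}

function parseChangedLenient(raw: unknown): ChangedMessage | null {
  if (typeof raw !== "object" || raw === null) return null;
  const record = raw as Record<string, unknown>;
  if (record.type !== "changed") return null;
  if (typeof record.entity !== "string") return null;
  if (!Array.isArray(record.changes)) return null;

  // Keep only change kinds this client knows; a newer server may send more.
  const changes = record.changes.filter(
    (kind): kind is string => typeof kind === "string" && KNOWN_CHANGE_KINDS.indexOf(kind) !== -1,
  );

  // Rebuild the object so unknown fields are stripped rather than rejected.
  return { type: "changed", entity: record.entity, changes };
}

console.log(
  parseChangedLenient({
    type: "changed",
    entity: "thread",
    changes: ["updated", "future-kind"],
    addedInNewerServer: true,
  }),
);
// { type: 'changed', entity: 'thread', changes: [ 'updated' ] }
```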

Node support (packages/sdk/src/node.ts):
- the node transport ships a default websocket factory: global
  WebSocket on Node 22+, the ws package on supported Node 20 — bb.on
  and bb.data.onChange no longer throw out of the box on Node 20
- BbRealtimeSocket is now a minimal runtime-agnostic shape; default
  factories adapt the environment socket (browser global, node global,
  ws) instead of requiring DOM event types

API surface cleanup:
- *RealtimeOnInput types renamed to *RealtimeOnArgs and on(input) to
  on(args), matching the SDK-wide Args convention (pre-release rename)
- BbSdk and InjectedAppWindowBb derive on() from the shared BbRealtime
  interface instead of triplicating the signature
- CreateCurrentAppDataAreaArgs requires apps/realtime (the only caller
  always passed both); deleted the unreachable onChange runtime throw
- shared cloneJsonValue helper replaces two diverged copies; app-data
  events are cloned once per delivery instead of twice
- realtime URL derivation preserves a path-prefixed baseUrl (mirrors
  the HTTP transport); same-origin browser derivation still uses /ws
- resolveApplicationId reuses requireCurrentApplicationId

Frontend:
- removed the unreachable app:changed cache registry (the SPA never
  subscribes to the app entity; system:apps-changed remains the
  canonical app-list invalidation path) and replaced the bypass test
  with one pinning the no-op
- ws.ts logs dropped inbound messages instead of swallowing them

Tests (+54): buffered-flush deliver/stale-path cases, negative id and
prefix scope filtering, all previously unexercised dispatch paths,
listener exception isolation, unsubscribe-during-dispatch/replay,
backoff growth/cap/reset, close-during-pending-backoff, resync-on-
reconnect ordering, late connection-observer snapshot, lenient parsing,
hub fail-safe validation drop, notifySystem/notifyHost delivery, full
change-kind schema-gate sweep per entity, and notifyGlobalAppsChanged
end-to-end through real websockets.

Known debt (deliberately not addressed here): BbRealtimeClient and the
SPA's WebSocketManager remain two parallel realtime implementations;
consolidating them is a standalone refactor.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ider turn watchdog

The idle watchdog killed healthy workflow turns ("No provider activity
for 906s after item/completed"): item/backgroundTask/progress was added
to the activity list, but those events are thread-scoped (turn_id NULL)
and the candidate query correlated activity strictly on the anchor
row's turn_id, so a streaming workflow was invisible to it.

The anchor is now the newest event that is either scoped to the active
turn (the latest turn/started) or a thread-scoped background task
event. The NULL-turn arm is deliberately restricted to the
backgroundTask family so thread-scoped provider/error noise cannot
defer reaping a wedged turn. item/backgroundTask/completed joins the
activity list — load-bearing, because progress rows are pruned the
moment the completed row lands, which would otherwise false-fire the
watchdog right after a successful workflow.

Guard hardening from adversarial review: active threads with no
turn/started yet are excluded in SQL (a NULL activeTurnId would throw
in row parsing and abort the whole sweep tick); the turn/completed,
pending-interaction, and started-at correlations all key off the
active-turn subquery instead of the anchor's turn_id; OR fragments are
self-parenthesized (drizzle's and() does not wrap raw fragments);
empty-string providerThreadId anchors fall back to the latest real id;
and the persisted lastActivityEventType is a plain string so
activity-list edits never make stored watchdog events unparseable.
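
The anchor rule is implemented in SQL in the actual change; the in-memory sketch below only illustrates the selection logic. `ThreadEvent`, `pickAnchor`, and the prefix check are hypothetical, and the separate activity-list membership test is left out.

```typescript
interface ThreadEvent {
  sequence: number;
  type: string;
  turnId: string | null; // null means thread-scoped
}

// The NULL-turn arm is restricted to the backgroundTask family on purpose,
// so other thread-scoped noise cannot keep a wedged turn alive.
const isBackgroundTaskEvent = (type: string): boolean =>
  type.indexOf("item/backgroundTask/") === 0;

function pickAnchor(
  events: readonly ThreadEvent[],
  activeTurnId: string,
): ThreadEvent | undefined {
  let anchor: ThreadEvent | undefined;
  for (const event of events) {
    const inActiveTurn = event.turnId === activeTurnId;
    const threadScopedTask = event.turnId === null && isBackgroundTaskEvent(event.type);
    if (!inActiveTurn && !threadScopedTask) continue;
    if (anchor === undefined || event.sequence > anchor.sequence) anchor = event;
  }
  return anchor;
}

const history: ThreadEvent[] = [
  { sequence: 1, type: "turn/started", turnId: "t1" },
  { sequence: 2, type: "item/completed", turnId: "t1" },
  { sequence: 3, type: "item/backgroundTask/progress", turnId: null },
  { sequence: 4, type: "provider/error", turnId: null },
];
console.log(pickAnchor(history, "t1")?.sequence); // 3
```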

The regenerated SDK browser bundle also catches up on the workflow and
reasoning-level domain schemas from a0bd295a7, which shipped without a
bundle regen.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…s on disk

Rebuilding an app (or editing public/ files directly) never reloaded the
open app surface. Two breaks upstream of the SPA's existing reloadToken
mechanism: the daemon's apps-root watcher silently dropped every path
under <app>/public/ (no classification branch), and even reported storage
hints were gated behind an app-list signature diff that content edits
never alter.

Add a per-app content-changed signal end to end:

- host-watcher: classify <app>/public/** as application-content-changed,
  deduped to one event per app per flush batch; source/ stays
  unclassified (nothing served changes until a build writes public/).
- host-daemon: dispatch the new observed kind and send
  {type: "application-content-changed", applicationId} over the daemon
  WS, with per-app offline buffering flushed after reconnect.
- domain/server: APP_CHANGE_KINDS gains "content-changed"; app changed
  messages carry an optional id (absent = list-level, present =
  app-scoped); hub.notifyAppContentChanged broadcasts it, and notifyApp
  is narrowed to the new AppListChangeKind subset so an id-less
  content-changed is unrepresentable at the producer.
- SPA: subscribe to the "app" entity and invalidate only the changed
  app's detail + markdown-preview queries via REALTIME_APP_CHANGE_REGISTRY,
  so the detail refetch bumps dataUpdatedAt and busts the iframe
  reloadToken without reloading other open apps.
- sdk: document the new app:changed semantics; regenerate the scaffold
  bb-sdk.d.ts and browser runtime bundle.
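
The host-watcher classification and per-batch dedupe from the list above can be sketched as a pure function. It assumes paths are relative to the apps root with `/` separators; `classifyBatch` and `ObservedChange` are hypothetical names, while the `application-content-changed` kind and `applicationId` field come from the commit message.

```typescript
interface ObservedChange {
  kind: "application-content-changed";
  applicationId: string;
}

// relativePaths: changed paths in one flush batch, relative to the apps root,
// for example "todo/public/index.html".
function classifyBatch(relativePaths: readonly string[]): ObservedChange[] {
  const seen: Record<string, true> = {};
  const changes: ObservedChange[] = [];
  for (const path of relativePaths) {
    const segments = path.split("/");
    const applicationId = segments[0];
    // Only <app>/public/** is served; source/ stays unclassified.
    const isServedContent = segments[1] === "public" && segments.length > 2;
    if (!applicationId || !isServedContent || seen[applicationId]) continue;
    seen[applicationId] = true; // one event per app per batch
    changes.push({ kind: "application-content-changed", applicationId });
  }
  return changes;
}

console.log(
  classifyBatch([
    "todo/public/index.html",
    "todo/public/assets/app.js",
    "todo/source/src/App.tsx",
    "notes/public/index.html",
  ]).map((change) => change.applicationId),
);
// [ 'todo', 'notes' ]
```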

Also extract the shared mock hub socket test helper that was previously
copy-pasted across four server test files.

Validated end to end against a live dev instance: touching
public/index.html broadcasts {entity:"app",id,changes:["content-changed"]}
within the watcher debounce; touching source/ broadcasts nothing.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ntime already hosts

A thread.start/turn.submit whose freshly scanned injected-skill catalog
hash differed from the loaded runtime's hash always forced a runtime
replacement, which threw whenever the environment had active threads or
open terminals. An agent that installed a skill mid-turn (e.g. building
an app with a skills/ dir) therefore bricked its own thread: every
subsequent message failed with "already has an active runtime with
injected skill catalog X; requested Y" and was dropped, and the thread
flipped to error status while the daemon-side turn kept running.

Thread commands now pass their targetThreadId down to
ensureCompatibleEntry; when the entry already hosts that thread and has
active runtime work, the stale catalog is reused (with a warn log) and
the refresh is deferred to the next launch on an idle environment. Idle
entries are still replaced as before.

The remaining conflict case (environment busy with other threads or
terminals) now throws a typed SkillCatalogConflictError mapped to the
new workspace resolution failure code "skill_catalog_conflict" instead
of surfacing as "unknown".
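
The decision this commit describes can be summarized in a small sketch. The `Entry` shape, `decide`, and the decision labels are hypothetical. The follow-up commit in this PR widens the deferral to any busy runtime a thread command targets, which this sketch does not model.

```typescript
type Decision = "reuse-loaded" | "reuse-stale-deferred" | "replace" | "conflict";

// Hypothetical view of a loaded runtime entry.
interface Entry {
  catalogHash: string;
  hostedThreadIds: readonly string[];
  hasActiveWork: boolean; // active threads or open terminals
}

function decide(entry: Entry, requestedHash: string, targetThreadId?: string): Decision {
  if (entry.catalogHash === requestedHash) return "reuse-loaded"; // nothing changed
  if (!entry.hasActiveWork) return "replace"; // idle entries are replaced as before
  const hostsTarget =
    targetThreadId !== undefined && entry.hostedThreadIds.indexOf(targetThreadId) !== -1;
  // Busy and already hosting the target thread: keep the stale catalog for now
  // and defer the refresh to the next launch on an idle environment.
  if (hostsTarget) return "reuse-stale-deferred";
  return "conflict"; // surfaces as a typed SkillCatalogConflictError
}

const busy: Entry = { catalogHash: "X", hostedThreadIds: ["thread-1"], hasActiveWork: true };
console.log(decide(busy, "Y", "thread-1")); // "reuse-stale-deferred"
console.log(decide(busy, "Y", "thread-2")); // "conflict"
```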

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ersarial review

Follow-up to the stale-catalog deferral fix, addressing review findings:

- Keep the about-to-be-active catalog when pruning staging dirs during a
  runtime replacement. The cleanup keep-list was built only from loaded
  entries after the replaced entry was deleted, so the freshly staged new
  catalog was removed before the replacement runtime bound it — the
  refreshed runtime launched with skill roots pointing at a deleted
  directory (pre-existing, but the deferral fix made idle replacement the
  designed refresh path).

- Defer the catalog swap for any busy runtime a thread command targets,
  not just runtimes already hosting the thread. Keying the guard on
  hosting left the original brick reachable: a terminal-first entry
  (no catalog, open terminal) or a sibling thread resuming after daemon
  restart still failed and dropped the message. SkillCatalogConflictError
  now only guards the no-target invariant.

- Warn once per requested catalog per entry instead of on every command
  while the environment stays busy.

- Stop CreateEntryArgs advertising ignored resolution-time fields
  (injectedSkillSources, targetThreadId).

- Pin the full defer-then-refresh-once-idle sequence and staged-catalog
  survival in tests; the prior reuse test compared an entry's hash with
  itself.

- Regenerate the SDK browser bundle for the skill_catalog_conflict
  contract code added in the previous commit.

Known follow-ups (not in this change): the deferral is invisible to the
server (no catalog hash in command results, no idle-transition
reconciliation owner), and thread commands remain unlaned so a
catalog-bearing command racing an in-flight resume can still replace the
runtime mid-resume.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ture

The unarchive-resume fixture landed on main without the new required
turn.submit option introduced by the workflows commit; align it with the
sibling fixtures.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
plans/claude-code-workflows.md was swept into the manager-templates
removal commit by an over-broad add during conflict resolution. Plans
live in thread storage, not the repo.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…t, honest contracts, d.ts codegen

Adversarial-review fixes for the apps platform surface:
- extract scaffold machinery out of routes/apps.ts into services modules;
  fs.cp filtered copy applies exclusions at every depth; build-time template
  copy reuses it so dist ships no dev artifacts; README $-pattern safe
- serve the window.bb runtime from a content-hashed immutable endpoint
  instead of inlining ~865KB into every app HTML response; bundle is
  regenerated from the real module graph (no regex+Function eval) with a
  drift-guard test against the committed artifact
- delete decorative AppRuntimeBootstrap capabilities/dataUrl/messageUrl;
  appId/applicationId required on the injected window.bb contract
- app d.ts codegen emits BbRealtime (window.bb.on typechecks in new apps),
  self-containment guards actually fire, drift test typechecks generated
  output; template source typechecked, source<->public drift guard, useTodos
  stale-snapshot race fixed, todo helper renamed for accuracy
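
To illustrate the content-hashed endpoint idea, here is a small sketch. It uses an FNV-1a-style hash over UTF-16 code units purely to stay dependency-free; a real server would more likely use a cryptographic digest such as SHA-256. The URL layout and helper names are hypothetical.

```typescript
// Stand-in content hash (FNV-1a style, 32-bit). Not the PR's hashing scheme.
function contentHash(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i += 1) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return ("0000000" + hash.toString(16)).slice(-8);
}

// Hypothetical URL layout: the hash in the path changes whenever the bundle changes.
function runtimePath(bundle: string): string {
  return `/runtime/bb-${contentHash(bundle)}.js`;
}

// Because the URL is content-addressed, the response can be cached indefinitely.
const IMMUTABLE_HEADERS = {
  "Cache-Control": "public, max-age=31536000, immutable",
  "Content-Type": "text/javascript; charset=utf-8",
};

// App HTML carries only a short script reference instead of the inlined bundle.
function runtimeScriptTag(bundle: string): string {
  return `<script src="${runtimePath(bundle)}"></script>`;
}

console.log(runtimeScriptTag("window.bb = {};"), IMMUTABLE_HEADERS["Cache-Control"]);
```

The design trade-off is one extra request on the first load in exchange for each app HTML response no longer carrying the runtime inline.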

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…e entry

- delete the ~240-line dead CLI HTTP client; port its tests to the SDK
  transport that production actually runs
- typed BbHttpError (status + server code) replaces 'HTTP 404:' message
  sniffing; missing app vs missing data path report correctly again
- importing @bb/sdk/node no longer eagerly loads CLI config — bb --help/
  --version/guide work without BB_SERVER_URL; one canonical package entry
  (package.json '.' and vitest alias agree)
- status area uses domain unions instead of plain strings; dead area
  exports removed; CLI consumers (guide, pending-todos, context-env) stop
  shadowing SDK types and building network clients for local work
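
A hedged sketch of why a typed error is easier to branch on than message sniffing. The class name `BbHttpError` and its status and server-code fields come from the commit message; the constructor signature and the `app_not_found` code are assumptions for illustration and may differ from the SDK.

```typescript
class BbHttpError extends Error {
  readonly status: number;
  readonly code: string | undefined; // server-provided error code, if any

  constructor(status: number, code: string | undefined, message: string) {
    super(message);
    this.name = "BbHttpError";
    this.status = status;
    this.code = code;
    // Keeps instanceof working when compiled to older targets.
    Object.setPrototypeOf(this, BbHttpError.prototype);
  }
}

// Callers branch on structured fields instead of matching "HTTP 404:" prefixes.
function describeFailure(error: unknown): string {
  if (error instanceof BbHttpError && error.status === 404) {
    // "app_not_found" is a hypothetical server code.
    return error.code === "app_not_found" ? "app does not exist" : "data path does not exist";
  }
  return "unexpected failure";
}

console.log(describeFailure(new BbHttpError(404, "app_not_found", "Not Found")));
// "app does not exist"
```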

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- fix stale 'connected' observer state when the last targeted listener
  unsubscribes while the socket is CLOSING; fix reentrant connect during
  the 'disconnected' emit corrupting reconnectDelayMs; regression tests
  for both plus the synchronous-throw path through on()/activateListener
- collapse four copy-pasted dispatch methods and eight listener-record
  interfaces into one generic keyed implementation; named aliases replace
  inline Extract<> signatures
- hub.notifyApp signature matches reality (single change kind); hub tests
  assert distinct outcomes per kind; integration test reuses the SDK ws
  adapter; lenient/strict server-message schema drift guard; stale
  daemon-protocol comment removed

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…watchdog + reasoning-ladder single source

- lost background tasks now settle on every realistic path: daemon restart
  (regardless of previous-session status), lease expiry, and disconnect
  grace — mirroring pending-interaction reconciliation; settling no longer
  flips already-completed workflows to interrupted
- late thread-scoped backgroundTask events no longer stretch the spawning
  turn's source range (turn-summary expansion 400 fix) with projection
  regression tests; error-only workflow rows are expandable; shared
  settled-state predicate and timeline row types deduped
- backgroundTask progress rows fetched by targeted query instead of
  loading all rows and filtering in JS; timeline window backfill covered
  by tests and reuses mergeStoredEventRowsById
- workflowsEnabled stays required downstream of the server boundary and
  the default policy is tested with true; watchdog thread-scope list
  derived from threadOnlyThreadEventTypes; SQL fragments deduped;
  custom-model reasoning ladders reuse the server policy table; dead
  taskType contract carry removed

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- delete the dead first-boot PREFERENCES.md host read (and its mode
  plumbing through thread-send/queued-messages/nudge sweep) that could
  never find content after seeding removal; drop the dead request field
- historical host.list_manager_templates command rows stay in the
  read-only prune cohort (test on in-memory sqlite)
- inline the pass-through ManagerSlot wrapper; drop the stale
  'may already exist from user templates' welcome-template sentence and
  the test pinning it

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…at can fail

- browser-view layout projected in a consistent coordinate space so
  docked DevTools no longer misprojects the native view; will-resize uses
  the in-flight bounds instead of stale getContentBounds()
- attach/setBounds accept the legacy bounds shape under shell/server
  version skew; one shared collapsed-layout sentinel in the contract;
  open-tab payload uses the contract type; dead clamp export removed
- BrowserTabDeck resize test flushes rAF and fails on revert;
  ThreadActionsMenu test exercises the actual no-drag fix; no-drag class
  gated on usesDesktopChrome

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ering

- remove unreachable skill_catalog_conflict contract code; thread-brick
  regression test drives the real command path (targetThreadId from a
  thread command); staging-survival assertions check the real catalog
  hash instead of masking null with ''; log-throttle state moved off the
  exported RuntimeEntry
- ServerConnection's three parallel recoverable-message buffers unified
  behind one keyed pending buffer (existing tests pin behavior)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@SawyerHood SawyerHood merged commit 604840b into main Jun 4, 2026
6 checks passed
@SawyerHood SawyerHood deleted the sawyer/ship-2026-06-04 branch June 4, 2026 23:17
