Qurator: wire in Platform MCP Server tools#4840
Conversation
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #4840 +/- ##
==========================================
+ Coverage 45.69% 46.37% +0.68%
==========================================
Files 829 832 +3
Lines 33532 34097 +565
Branches 5698 5824 +126
==========================================
+ Hits 15323 15814 +491
- Misses 16212 16277 +65
- Partials 1997 2006 +9
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Adds a `PlatformMcpContext` provider that connects to the Quilt Platform
MCP Server (PMS), discovers its tools via `tools/list`, and exposes them
to Qurator's existing tool system. Each MCP tool is namespaced
`mcp__platform__<name>` to avoid collisions with catalog-native tools.
This makes PMS's 25 tools (packages, search, S3, Athena, tabulator)
available inside the catalog chat experience, matching what external MCP
clients like Claude Web get via the Connect Server.
Implementation notes:
- Thin JSON-RPC client over Effect. MCP's official JS SDK doesn't
officially support browsers (README lists Node/Bun/Deno) and the
protocol over stateless HTTP is trivial enough to implement directly.
client.ts + bridge.ts collapsed into a single Mcp.ts; tagged errors
(McpAuthError, McpTransportError, McpProtocolError, McpRpcError)
drive per-kind handling via Effect.catchTags. index.ts uses
runtime.runFork to drive bootstrap and Fiber.interrupt for cleanup —
in-flight fetches are actually aborted on unmount.
- MCP server runs stateless HTTP, so there's no persistent connection —
each tool call is an independent POST carrying the current session
token.
- Auth: catalog session token (HS256 JWT) is forwarded as
`Authorization: Bearer <token>`. PMS doesn't validate — it passes
through to Registry, which authenticates on every call, same as
`/auth/get_credentials` for Bedrock creds.
- Bridge bypasses `Tool.make()`'s Effect-Schema decoding and passes
MCP's JSON Schema straight through to Bedrock; server-side FastMCP +
Pydantic is authoritative on input validation.
- MCP URL: `${cfg.registryUrl}/mcp/platform/mcp`, overridable via
`localStorage.QUILT_MCP_URL` (same pattern as `QUILT_BEDROCK_MODEL_ID`).
- Reads `quilt-platform://search_syntax` as a context message to help
the LLM construct better search queries.
- SSE parser normalizes CRLF → LF before splitting on blank lines
(FastMCP emits `\r\n\r\n` as event separator).
- tsconfig.target=ES5 downlevel emit breaks
`class X extends Eff.Data.TaggedError(...)(...)` — plain classes with
`_tag` field are the Effect-compatible workaround (catchTags only
inspects `_tag`).
- Drops `catalog_global_getObject` (superseded by MCP's `object_read`)
and deletes the now-unused `preview.ts`.
Tests: 14 cases in Mcp.spec.ts against stub McpClient records — covers
mapContent, buildTool, loadContext success/error paths.
Part of qhq-5d0 (Qurator ↔ Platform MCP Server integration).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The module's semantic is "platform tools and context for the assistant", mirroring `GlobalContext`. MCP is the wire transport — its prefix on the folder name leaks an implementation detail. Wire-level types and files in `Mcp.ts` keep the `Mcp*` prefix because they really do describe the MCP transport (`McpClient`, `McpToolDescriptor`, `McpContent`, `Mcp*Error`). - Folder rename via git mv. - Public hook: usePlatformMcpContext → usePlatformContext (re-exported as `use`). - Module logger tag updated to 'PlatformContext'. - Header comment trimmed. Tracked in qhq-382.1. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three coordinated refinements that share the same files; ship together. 1. `getToken` is now `() => Effect<string, McpAuthError>`. The hook reads the live token from `redux.useStore()` per call via a memoized `selectToken = createSelector(authSelectors.tokens, t => t?.token)`, following the pattern in `utils/PFSCookieManager.tsx`. The `tokenRef` indirection and stale-closure-by-effect-deps risk are gone; `Mcp.ts` uses `yield* options.getToken()` directly. 2. JSON-RPC IDs come from `uuid.v4` (already a catalog dep, used in `Conversation.ts:16`). Drops the in-closure `let nextId = 0`. IDs are strings — JSON-RPC permits this. 3. `McpContextState` → `PlatformContextState` as a `_tag`-discriminated union (Loading / Ready / Error). Eliminates the optional-fields shape. Constructors via plain object factories rather than `Eff.Data.taggedEnum` because the babel ES5 transform that broke `Eff.Data.TaggedError` for our error classes has the same problem here. `Match.tag` and conditional narrowing both work. `Context.usePushContext` only fires non-empty payloads on `Ready`. Tests: Mcp.spec.ts grows from 14 to 21 cases. New coverage: - Wire-level `make()` behavior via stubbed global fetch: - Two consecutive RPCs produce different IDs (uniqueness, not format). - Bearer token sourced from the supplied `getToken` effect. - `McpAuthError` short-circuits without firing fetch. - `PlatformContextState` constructor shape per tag. - Existing `loadContext` cases updated to assert `_tag` instead of optional fields, with type narrowing via `if (state._tag !== 'Ready')`. Tracked in qhq-382.2. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces raw `fetch` with `@effect/platform`'s HttpClient and validates
every wire response through `Eff.Schema` decoders.
Wire transport
- `HttpClientRequest.post` + `bearerToken` + `setHeaders` + `bodyText`
builds the request; `httpClient.execute` sends it. Cancellation and
AbortSignal still work — they're integrated by the layer.
- `Eff.Effect.provide(FetchHttpClient.layer)` is scoped to `post()`
inside `make()` rather than baked into the runtime; PlatformContext
is the only consumer of HttpClient in catalog so far. Internal-only;
the public `McpClient` interface stays `R = never`.
- `HttpClientError`s map to `McpTransportError` with the underlying
cause merged into `detail` for triage.
Wire validation
- Schemas for the JSON-RPC envelope, `tools/list` result,
`tools/call` result, `resources/read` result, and the discriminated
`McpContent` union (Text / Image / Resource + permissive `{ type }`
fallback for forward compat with future content types).
- `inputSchema` on `McpToolDescriptor` stays `Schema.Unknown` —
passthrough to Bedrock as designed.
- `decodeWith(schema, label)` helper maps any `ParseError` to
`McpProtocolError` carrying the label so failures are
source-identified ("envelope: …", "tools/list: …", etc.).
- `rpc()` no longer takes a generic `<T>` and casts; each operation
decodes its own result envelope.
SSE
- Pulled into a pure `parseSseToJson(text)` function. Whether the
response comes back as JSON or `text/event-stream`, we collect the
full body via `resp.text` then route through the same Schema decode.
- Removed the streaming reader that owned the `Response` directly —
HttpClient gives us `resp.text` and that's enough for the one-shot
SSE responses FastMCP emits under `stateless_http=True`.
Tests: Mcp.spec.ts grows from 21 to 28. New coverage:
- Malformed envelope → `McpProtocolError` with `envelope` label.
- `tools/list` shape mismatch (tools: string instead of array) →
`McpProtocolError` with `tools/list` label.
- SSE happy path: `Content-Type: text/event-stream`, CRLF-separated
`data:` line, decodes through the schema.
- `parseSseToJson` standalone: CRLF parsing, multi-line data, empty
stream returns null, malformed JSON throws.
- Stub fetch updated to decode `Uint8Array` request bodies (the form
`bodyText` produces) before JSON.parse — this caught a real-world
papercut for anyone wiring fetch stubs against this layer.
Tracked in qhq-382.3.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Tool collection keys go from `mcp__platform__<name>` to `platform__<name>`. MCP is the wire protocol, not a name the LLM (or the chat UI showing tool calls) needs to see. Future MCP servers would use their own service prefix, e.g. `catalog__<name>`. Single-line catalog change: extract the prefix as `TOOL_PREFIX` constant exported from `Mcp.ts` so tests reference the same source of truth. Tracked in qhq-9sh. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a status banner that renders inline at the top of the conversation,
alongside the welcome message:
- Loading: "Loading platform tools…"
- Ready: "Platform tools ready · N tools available"
- Error: "Couldn’t reach platform tools" + retry action + collapsible
Details (the tagged error's `.message`).
Wrapped in `aria-live="polite"` so screen readers announce state
transitions without interrupting the user.
Hook & retry plumbing
- `usePlatformContext` now returns a `PlatformContextHandle` =
`{ state, retry }` rather than the bare state. Manual retry bumps
an internal token; the existing `useEffect` keys on it, killing
the in-flight fiber and re-running `loadContext` from scratch.
Setting state back to `Loading` at the start of each effect run
gives the user a consistent loading affordance after retry.
- Assistant API exposes the handle as `platform`. Threaded through
`UI.tsx` → `Chat.tsx` → `<PlatformStatus>`.
Auto-retry inside `loadContext`
- Transport-class failures (`McpTransportError`) auto-retry up to
twice with `Schedule.exponential('500 millis')` — total ~1.5s
before surfacing as Error so the UI lands within a reasonable
window for a chat session.
- Auth / Protocol / RPC errors do NOT auto-retry. Auth needs user
action (re-login); the others are code/server mismatches that
won't change on a re-attempt.
Helper
- `matchState(state, { Loading, Ready, Error })` exported from
`Mcp.ts` (and re-exported via `PlatformContext`) for ergonomic
pattern-match in render paths. Pure data dispatch — no Effect
import overhead at the call site.
Tests: Mcp.spec.ts grows from 28 to 34. New coverage:
- Retry-recovery: stub initialize fails twice with `McpTransportError`
then succeeds → `loadContext` returns `Ready`, observed 3 attempts.
- No retry on `McpAuthError` — exactly 1 attempt.
- No retry on `McpProtocolError` — exactly 1 attempt.
- `matchState` dispatch for each branch with payload assertions.
Manual verification still applies for the chat-UI rendering itself:
- Force the three states on the qur dev stack (normal load, stop
local PMS for transport-error, log out for auth-error) and
confirm the inline banner renders + retry button refires bootstrap.
Tracked in qhq-382.4.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…esource
Three things that were wrong or partial in the previous pass:
1. **`PlatformContextState` now uses `Eff.Data.taggedEnum`.**
The earlier pass sidestepped `taggedEnum` on the (incorrect) belief that
the catalog's ES5 target broke the same way it does for
`Data.TaggedError`. Wrong: catalog uses `ts-loader` (no babel), and
`taggedEnum`'s runtime is a `Proxy` over plain functions — no class
extension. `Data.taggedEnum` is used extensively elsewhere in the
Assistant module (Content.ts, Conversation.ts, LLM.ts).
Constructors and `$match` come for free; the manual factories +
`matchState` helper are gone. UI consumers use
`PlatformContextState.$match(state, { Loading, Ready, Error })`.
2. **Tests inject fetch only via `FetchHttpClient.Fetch` service tag.**
The previous "belt + suspenders" (service tag AND `vi.stubGlobal`) was
a misdiagnosis — the actual bug was the test stub not decoding
Uint8Array request bodies before `JSON.parse`. Once that was fixed,
the canonical Effect injection works on its own. No globalThis
pollution, no `afterEach` cleanup needed.
3. **`quilt-platform://athena` surfaced as Qurator context.**
Bootstrap now reads both `search_syntax` and `athena` resources
(concurrent, best-effort — failures fall through to no message).
Audit decisions per the qhq-5d0 design: `buckets` is dropped
(already in `GlobalContext/stack.ts`), `me` is dropped (already in
Qurator core context). XML tag dropped the redundant `mcp-` token:
`platform-search-syntax` / `platform-athena`.
Tests: 35 pass (was 34). New case for the athena resource. Renamed the
existing `matchState` describe to `PlatformContextState.$match`.
Tracked in qhq-382.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mechanical first step of the Connectors refactor (qhq-5d0.2 step 1): the wire client moves out of the soon-to-be-deleted PlatformContext folder into its new home. PlatformContext/index.ts re-points its imports at ../Connectors/Mcp; tests still green. Subsequent steps build the Connectors service around the wire client and delete PlatformContext entirely. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Step 2 of qhq-5d0.2. Defines the agent-level abstraction surface: ConnectorId, TransportConfig, ConnectorConfig, ConnectorState, the ConnectorRuntime + ConnectorsService interfaces, and a Connectors Tag. ConnectorState is a tagged enum (Connecting / Ready / Disconnected / Failed) covering both transient (auto-retry) and ack-needed transitions; helper predicates (stateIsTransient, stateRequiresAck, stateIsBlocked) give callers the same view aggregate methods will use. Layer factory is a no-op placeholder. Step 3 fills in the per-connector lifecycle fiber; step 4 wires service primitives. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…Runtime)
Step 3 of qhq-5d0.2. Adds the long-running per-connector state machine
the layer needs to flip Connecting → Ready → Disconnected → Failed.
Wire client (Mcp.ts) gains a ping() method (MCP base protocol). The
Connectors module then owns:
- bootstrap (initialize + tools/list, no resource autoload — D21).
Tools are namespaced `<id>__<name>` and built with executors that
route through the connector's gated callTool (D27).
- manageConnector — the lifecycle effect. Sole writer to the
per-connector state ref (D23). Loops Connecting → Ready (heartbeat
loop with reconnect budget) → Failed → wait-for-retry → repeat.
- runHeartbeat — pings every 30s with 5s timeout, updates a shared
Health ref. Threshold of 2 consecutive failures (from ping or tool
calls) flips state to Disconnected via a SubscriptionRef.changes
watcher (D24).
- runReconnectBudget — 3 attempts at 1s/3s/9s; success returns to
Ready, exhaustion → Failed{acked:false} (no silent degradation,
D22).
- makeConnectorCallTool — gates on Ready, fast-fails otherwise. On
transport error bumps the same Health counter heartbeat watches; on
success resets it. Server-application errors don't disturb health.
- buildConnectorRuntime — factory that allocates state/health/controls
refs, wires callTool, forks the lifecycle fiber under the surrounding
Scope. Optional client param for tests.
Connectors.spec.ts covers state predicates, bootstrap success/failure,
retry from Failed, acknowledge → Failed{acked:true}, callTool gating
in non-Ready states, mapping of tool-side errors, and TestClock-driven
heartbeat / reconnect / threshold-crossing scenarios.
The Layer factory is still the no-op placeholder; step 4 wires aggregate
service primitives across all per-connector runtimes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Step 4 of qhq-5d0.2. The empty Layer placeholder is replaced with the
real composition:
- Allocate one ConnectorRuntime per ConnectorConfig (fork lifecycle
fibers under the layer scope; D32).
- isTransient / requiresAck / isBlocked: read all per-connector states
via SubscriptionRef.get and aggregate with the matching predicates.
- awaitUnblocked: merge state.changes streams across all connectors;
re-evaluate isBlocked on each emission; complete on first false.
Zero-config layer short-circuits to Effect.void.
- contextContribution (D28): walk byId, contribute Ready connectors'
tools, render an XML <connectors> overview message with
<connector state="ready|unavailable"> per stable connector
(D29). Transient / acked-false states are skipped — the
Conversation actor blocks on AwaitingConnector before reaching this.
Layer factory takes an optional clientById map (test seam) so the
spec can drive the real layer end-to-end with stub wire clients.
buildConnectorRuntime grew the same seam.
Connectors.spec.ts gains 6 layer-level tests: initial blocked →
unblocked transition, awaitUnblocked completion, contextContribution
shape for Ready and Failed{acked:true}, multi-connector OR
aggregation, zero-config short-circuit.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Step 5 of qhq-5d0.2. The Connectors module replaces PlatformContext at every catalog wiring point. Connectors (index.ts): - buildService extracted from layer so React can build the service via runtime.runSync into a manually-managed Scope (the Layer.scoped wrapper is kept for tests / pure-Effect callers). - Re-exports McpAuthError, McpError + family so the catalog wires auth without importing the wire module directly. Mcp.ts: drops PlatformContextState, loadContext, RESOURCES, TOOL_PREFIX, buildTool, makeExecutor, AnyRecord, the bootstrap retry policy, and the XML/Tool imports they pulled in. The Connectors lifecycle owns bootstrap and the Connectors module owns tool-descriptor construction. mapContent stays — Connectors uses it. Tests for the removed code go with it; the wire-level make() integration / parseSseToJson / mapContent suites remain. Assistant.tsx: - usePlatformConnectorConfig: the platform connector config built at React level (URL from cfg.registryUrl with localStorage override; auth Effect that re-reads the redux session token on every invocation per D33). - useConnectors: allocates a manual Scope, runSync's buildService into it, returns the ConnectorsService. Scope.close on unmount interrupts every per-connector lifecycle fiber (D32). - Connectors.layer is composed alongside Bedrock + Context layers via Layer.succeed(Connectors, service) so the actor sees the service. The same service object is exposed on AssistantAPI as `connectors` (replacing the previous `platform` PlatformContextHandle). - The PlatformContext folder is deleted; index.ts re-export drops it and adds Connectors. UI: Chat.tsx no longer renders the inline PlatformStatus banner — the persistent fixed status region (D30) lands in step 7. UI.tsx stops threading platform through. Aside from the missing status banner, chat behavior is unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Step 6 of qhq-5d0.2.
State + Action additions:
- State.AwaitingConnector { events, timestamp, waiter } — D25 + D26.
Carries no per-connector snapshot; UI reads connector states live
from the SubscriptionRefs in connectors.byId.
- Action.ConnectorReady — fired by the waiter fiber on awaitUnblocked.
advanceFromEvents helper centralizes the gating decision used at the
two decision points (D25):
1. Reads connectors.isBlocked synchronously (handler discipline — no
async observation in handlers).
2. If blocked, forks a waiter that yields connectors.awaitUnblocked
then dispatches ConnectorReady. Stores the fiber on AwaitingConnector
so Abort can interrupt it.
3. Otherwise forks the LLM request and transitions to
WaitingForAssistant.
Wired into:
- Idle.Ask (existing append-event-then-forkRequest flow goes through
advanceFromEvents instead).
- ToolUse.ToolResult when the last call resolves (replaces the inline
forkRequest branch).
- AwaitingConnector handlers: ConnectorReady fires advanceFromEvents
again (state has typically transitioned to unblocked by the time
this lands); Abort interrupts the waiter and returns to Idle;
Discard mirrors Idle's behavior.
llmRequest folds in connectors.contextContribution at request time
(D28): no React-side bridge, no Context.usePushContext from
Connectors.
Conversation.spec.ts covers all five paths under a stub connectors
service whose isBlocked is a SubscriptionRef tests can flip mid-run.
Confirms LLM is NOT invoked after Abort even if the connectors
unblock.
Actor R type extended to include Connectors.Connectors.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Step 7 of qhq-5d0.2. The previous in-chat `<PlatformStatus>` banner
(removed in step 5) is replaced by a persistent fixed status region in
the chat layout (D30).
Connectors module gains two React bridges over its SubscriptionRefs:
- useConnectorState(connector): subscribes to one connector's state
ref via Stream.runForEach, mirrors changes into React state.
- useIsBlocked(service): re-evaluates the aggregate isBlocked
predicate whenever any connector's state changes.
Chat.tsx gains:
- ConnectorStatusRow: per-connector row that pattern-matches state and
renders the appropriate affordance — spinner for transient, ready /
unavailable badges, Retry / Continue-without buttons for failed-not-
acked, Retry-only for failed-acked. Buttons fork connector.retry /
connector.acknowledge.
- ConnectorStatusRegion: stacks rows above the message history, always
visible regardless of scroll. Renders nothing when no connectors are
configured.
- Input gating: inputDisabled = state !== 'Idle' || connectors.isBlocked.
Failed{acked:true} unblocks the input so the user can ask questions
even though tools from that connector won't run (D29 'unavailable'
state — they reach the prompt and the LLM is told they're out).
UI.tsx threads connectors through to Chat.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Step 8 of qhq-5d0.2. The Assistant DevTools section grows a Connectors panel between Recording controls and the Context/State/Prompt JSON displays (D31). Per-connector row: id + a one-line state summary (Failed shows acked + tagged-error detail; Disconnected shows retrying flag + error; Ready shows tool count) + manual Retry / Ack buttons that fork the corresponding effects on the runtime. Live preview of `connectors.contextContribution` rendered as a JSON tree. Re-evaluates whenever any connector's state changes via the same mergeAll pattern useIsBlocked uses, so the panel matches what the LLM would see if a turn fired right now. DevToolsProps gains `connectors`; Chat threads it through. Closes the UI work scope of qhq-5d0.3 (the bead is subsumed by this panel; the dedicated 'platform-tools state in DevTools' work no longer needs its own tracking). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Final type-safety pass — discovered via tsc audit beyond what the LSP incremental cache surfaced. Chat.tsx: add an AwaitingConnector branch to Conversation.State.$match in the message history. Renders a 'Waiting for connectors…' MessageContainer with an abort action, mirroring the WaitingState shape the user sees while the LLM is mid-call. The persistent connector status region (D30) handles the actual connector status; this entry is just the in-history placeholder so the abort action stays where the user expects it. utils/Actor.ts: export the Actor interface so Conversation.spec.ts can type the actor handle returned by Actor.start. The interface was already part of the module's public shape via start's return type; exporting it makes that explicit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Addresses BLOCKERs surfaced by the fresh-eyes review of qhq-5d0.2.
1. Stale-control drain (D22 user-action gate). Retry / acknowledge
buttons in DevTools and elsewhere can be clicked while the lifecycle
is in Connecting / Ready / Disconnected. The unbounded controls
queue used to buffer those tokens and consume them on the NEXT
Failed entry — meaning a stray click during Ready could short-circuit
the next user-action gate (e.g. exiting Failed{acked:false} without
the user actually clicking Retry; or skipping the gate entirely with
a buffered acknowledge).
Fix: takeAll-drain the queue at entry to awaitRetryControl. Tokens
offered while parked at Failed are processed normally; tokens from
prior cycles get cleared.
2. Read-only-tool auto-retry (D24). Read-only tools auto-retry once on
transport error before the failure counts toward the health
threshold; destructive (or unannotated) tools fail immediately.
- Mcp.McpToolDescriptor + schema gain optional `annotations`
(readOnlyHint, destructiveHint).
- ConnectorRuntime.callTool grows an optional CallToolOptions arg
with `retryOnTransport`.
- makeConnectorCallTool re-attempts once on McpTransportError when
the option is set; only the second attempt's outcome touches
health.
- buildConnectorTool reads mcp.annotations.readOnlyHint and passes
retryOnTransport accordingly.
Tests added to Connectors.spec.ts (27 total, +5 new):
- retryOnTransport on a single transport error → second attempt wins,
status=success, attempts=2.
- retryOnTransport off (default) → single transport error fails,
attempts=1.
- readOnlyHint annotation routes through buildConnectorTool to set
retryOnTransport — verifies the wiring end-to-end through a tool's
Tool.Descriptor executor.
- stale retry/acknowledge offered while Ready do NOT prematurely apply
on transition (state stays Ready, no extra bootstrap).
- stale retry/acknowledge before first Failed entry are drained on
Failed entry — connector parks at Failed{acked:false} and waits
rather than auto-exiting.
Conversation.spec.ts (6 total, +1):
- AwaitingConnector re-enters fresh on each Ask cycle: aborting and
re-asking yields a different waiter handle (no stale-fiber reuse).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Promotes the implicit McpClient interface to a named Backend contract
in Connectors/index.ts. Lifecycle becomes generic in Backend; concrete
backends adapt their wire protocol into it.
The TransportConfig single-variant taggedEnum was theatre — the real
abstraction was always McpClient (production via Mcp.make, tests via
stubClient). Names it correctly and drops the indirection.
Backend (Connectors/index.ts):
- BackendError: { _tag, message, transient? } — plain interface, no
classes. transient drives the read-only-tool retry-once branch and
the health-threshold bump (D24).
- BackendToolDescriptor: name, description, inputSchema, readOnly?
(generic version of MCP's readOnlyHint).
- Backend: { initialize, listTools, callTool, ping } — callTool returns
Tool.Result directly (mapContent moves to backend impl).
Mcp.bearerPassthru (Connectors/Mcp.ts):
- 1st-party MCP backend with catalog auth passthrough. Caller resolves
the bearer token (typically a catalog session JWT) on every call;
backend forwards as Authorization: Bearer, no token management.
- Distinct from a hypothetical Mcp.oauth(...) (would manage the OAuth
flow) along the auth-scheme variation axis.
- adaptDescriptor lifts McpToolDescriptor → BackendToolDescriptor.
- adaptResult maps wire content via mapContent and folds isError into
Tool.fail / Tool.succeed.
- adaptError preserves the wire error _tag and marks
McpTransportError as transient (only).
- getToken: () => Effect<string|null> — null short-circuits to
internal McpAuthError; cleaner than nested suspend-with-fail-class
at the catalog wiring site.
Drops:
- TransportConfig taggedEnum + makeTransportClient + $match.
- clientById parameter on buildConnectorRuntime / buildService / layer.
- All Mcp.* runtime references in lifecycle code.
- Connectors.McpAuthError re-export and the Assistant.tsx
construction site.
- McpError / McpTransportError / McpProtocolError / McpRpcError type
re-exports from Connectors (no consumers).
Touches:
- Assistant.tsx: TransportConfig.Mcp({ url, auth }) → Mcp.bearerPassthru
({ url, getToken }). Adds a direct Mcp import — backend factories
are concrete by definition; that's the right abstraction line.
- Connectors.spec.ts: stubClient → stubBackend (Backend shape, drops
readResource); transportError / authError become plain BackendError
literals; tests construct config-with-backend per-test (no clientById
parameter); replaces "delegates to client.callTool / maps content"
with "delegates to backend.callTool / forwards content" (mapContent
moved out).
- New test: "non-transient backend errors do NOT bump health" — covers
the auth-vs-transport split now that BackendError.transient is
explicit instead of the McpTransportError _tag check.
- Mcp.spec.ts: unchanged (still tests the wire client via Mcp.make,
Mcp.mapContent, error classes).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fresh-eyes review of qhq-5d0.4 (5decfa3) flagged three concrete issues plus two known-limitation items. Addresses what's clearly in-scope; the rest filed as qhq-5d0.5 and qhq-5d0.6. BackendError: abstract _tag, wire detail in `cause`. - BackendError._tag is now BackendErrorTag = 'Transport' | 'Auth' | 'Protocol' | 'Application' — connector-layer vocabulary, not wire. - `cause?: string` carries the wire-level tag (McpTransportError, etc.) for telemetry / DevTools detail. - Mcp.bearerPassthru's adaptError remaps via ERROR_TAG_MAP and sets cause = e._tag. - Lifecycle's transientError factory uses _tag: 'Transport' with cause = the lifecycle-internal kind (PingTimeout, HealthDegraded, ReconnectExhausted). - Tests updated to assert abstract tags ('Transport', 'Auth'). Conversation.spec.ts: ToolUse → blocked → AwaitingConnector. - Covers the second `advanceFromEvents` gate point (ToolResult last call), which was previously only exercised at the Idle.Ask gate. - Gates the LLM stub on a deferred so test sequencing is deterministic: Ask → WaitingForAssistant → block → release LLM → ToolUse → Tool.execute (unknown tool → fail) → ToolResult → AwaitingConnector. manageConnector: reset health BEFORE flipping to Ready. - Removes a microtask hazard where a tool call observing the Ready transition could see a stale `consecutiveFailures` from a prior Disconnected cycle. Free fix; preserves same observable behavior. Documented limitations (filed as follow-ups): - qhq-5d0.5 — useConnectors allocates lifecycle fibers during render. React abort-before-commit (Suspense unwind / ErrorBoundary catch / concurrent-mode discard) skips the cleanup useEffect and leaks fibers. Mitigated today by AssistantProvider mounting at app root above Suspense boundaries; proper fix defers allocation into useEffect at the cost of a Loading state on AssistantAPI. - qhq-5d0.6 — D24 escalating heartbeat (5s → 30s during Disconnected) not implemented. Only the 3-attempt bootstrap budget runs; no transport-only probe between attempts. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three parallel reviewers (reuse / quality / efficiency) of the qhq-5d0
diff converged on a small set of clear wins. Applied:
- Drop `useConnectorState` — `Actor.useState(connector.state)` already
does the SubscriptionRef-into-React-state subscription. Two call
sites (Chat.tsx ConnectorStatusRow, DevTools.tsx ConnectorPanelRow)
switch over.
- Drop dead `effectJsonSchema` build in `buildConnectorTool`. Effect's
permissive AnyRecord schema was generated on every descriptor build,
then immediately overwritten by `descriptor.inputSchema` spread; only
`description` survived. Inline directly. Drops the AnyRecord constant
too (unused).
- `resetHealth` helper: skip SubscriptionRef emission when health is
already INITIAL_HEALTH. Used in the hot paths (per-call success in
`makeConnectorCallTool`, per-tick success in `runHeartbeat`); the
state-transition sites in `manageConnector` keep `set` since they're
at-most-once-per-cycle.
- Slim `manageConnector(config, backend, …)` and `bootstrap(config,
backend, …)` and `runReconnectBudget(config, backend, …)` — `backend`
was always `config.backend`, since the qhq-5d0.4 Backend extraction
collapsed transport into the config field.
- Flatten `ConnectorStatusRow`'s 5-branch JSX (Connecting / Ready /
Disconnected / Failed-acked / Failed-unacked) into data-driven
variant + single render. Five copies of `<div className={classes.row}>
<icon><title><text></div>` → one. Saves ~20 LOC; same a11y semantics.
- Drop orphan jsdoc + stale `// Layer (skeleton)` divider in index.ts.
Tests: 57/57 green under Node 20.
Skipped from review:
- `selectToken` promotion to `containers/Auth/selectors` — touches an
unrelated module; out of PR scope.
- `BackendErrorTag` collapse — was just added to fix the Mcp-tag leak
flagged in qhq-5d0.4; collapsing would re-introduce it.
- `Stream.map(stateIsBlocked)` for aggregate predicates — incorrect:
blocked aggregates across all connectors, not the emitting one.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the fixed-schedule reconnect budget (1s/3s/9s, no probing
between attempts) with a probe-driven loop. During Disconnected, ping
at escalating cadence (5s start, ×2 backoff, 30s cap); each successful
ping resets cadence and triggers a bootstrap attempt. Cap at 3
bootstraps. Outer 60s timeout when ping never succeeds → Failed with
synthetic `Transport` error, cause: 'DisconnectedTimeout'.
Why this changes things:
- Before: 13s wall-clock window, no probing, full bootstrap on 1s/3s/9s
schedule regardless of network state.
- After: probe-driven, so a network blip recovering at e.g. 4s triggers
a bootstrap immediately rather than waiting for the next sleep tick.
Up to 60s window if ping stays down (vs. 13s before).
Cadence resets to 5s after each bootstrap attempt (success OR failure)
since the immediately-preceding ping confirmed transport was up;
escalating doesn't reflect reality once that signal exists.
Constants:
- PROBE_CADENCE_INITIAL = 5s
- PROBE_CADENCE_MAX = 30s
- RECONNECT_MAX_ATTEMPTS = 3
- DISCONNECTED_MAX_DURATION = 60s
- HEARTBEAT_TIMEOUT (5s) reused for probe pings
Tests:
- Replaced "reconnect succeeds" → "reconnect succeeds via probe":
asserts probe-driven path (60s heartbeat → 5s probe → success).
- Replaced "reconnect exhaustion" → split into two tests:
- "ping never succeeds → DisconnectedTimeout → Failed{!acked}":
verifies outer 60s timeout fires with the synthetic cause and
correct ping count under the escalating schedule (2 heartbeat
+ 3 probes at clock 65/75/95).
- "3 bootstrap failures after successful pings → Failed{!acked}":
verifies budget exhaustion with the LAST bootstrap error
preserved (not the synthetic timeout).
- New: "probe cadence escalates 5s → 10s → 20s → 30s" — direct
schedule verification via interleaved TestClock.adjust + ping count
assertions.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ith `as const` Webpack/ts-loader inferred each $match callback's `role` literal as `string` rather than the `'status' | 'alert'` union RowVariant declares, so the union of branch returns failed to assign to the declared variable type. Vitest's esbuild path doesn't type-check, so the regression slipped past local tests. `as const` on each branch's return narrows role to its literal and makes the union assign cleanly. Comment notes the divergence so future edits know to keep the annotation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
@greptileai have a look pls |
…calls Effect's HttpClient auto-injects W3C `traceparent` + Zipkin `b3` headers from the active span (httpClient.js:154; gated on the `currentTracerPropagation` FiberRef, default true). The platform MCP server's CORS allow-list doesn't include these, so the browser preflight returns 400 "Disallowed CORS headers" and the actual POST never fires. `HttpClient.withTracerPropagation(false)` on the `execute` Effect locally sets the FiberRef to false for the request, dropping the headers. We don't have a tracing collector on the receiving end either, so this is strictly subtractive.
… from healthy states
Two small model-layer fixes surfaced during DevTools polish:
* `buildConnectorTool` was setting `description` on the Tool.Descriptor
AND splatting it into the inputSchema. Bedrock then sent the same
string twice — once as `toolSpec.description`, once inside
`inputSchema.json` — pure context pollution. The schema is now passed
through raw.
* `connector.retry` only offered to the controls queue, which is only
drained from `Failed{!acked}` via `awaitRetryControl`. Clicking
reconnect from Ready or Disconnected went silently nowhere. Now,
Ready / Disconnected force a reconnect by degrading health to the
threshold; the Ready loop's heartbeat-vs-threshold race takes the
threshold branch and `runReconnectWithProbe` runs. Connecting is
still a no-op (already in flight); Failed stays on the queue path.
…ne link actions
Replaces the single-line `<id> | state | [Retry][Ack]` row with a flat,
header-led layout that scales to multiple connectors and reads cleanly
when the connector is healthy:
Connector: <title>
<hint in muted caption>
<state> · reconnect · acknowledge
<error detail, if any>
Specifics:
- Drop the "Connectors" group heading and the bordered card around each
connector — each connector is now a top-level section in the panel.
- Drop the `(<id>)` suffix on the heading; `<title>` carries the
user-facing label, `<id>` is dev-prefix detail (still visible via the
`contextContribution` block).
- No "Hint:" / "State:" labels — content is self-explanatory.
- Inline action affordances styled like the existing `MessageAction`
pattern (cursor:pointer span, opacity 0.7→1 on hover) — grey + 500
weight, separated from state with `·`. Avoids `M.Link`'s blue
default and the baseline misalignment of `component="button"`.
- "Retry" → "reconnect" (matches the lifecycle's actual semantics now
that retry forces a reconnect from healthy states).
- `reconnect` hidden during transient states (Connecting,
Disconnected{retrying}); `acknowledge` hidden unless `Failed{!acked}`.
- `Disconnected` rendered with `warning.dark` instead of `warning.main`
for legibility on the panel's paper background.
- `contextContribution` block hidden when nothing to contribute (no
Ready or Failed{acked} connectors), and no longer auto-expanded.
- Bottom padding belongs at the end of the section (under the
contextContribution box), not between state and contribution.
|
Thanks @fiskus, @drernie, @sir-sigurd for the reviews. Here's the disposition. Landing in this PR
Explicitly skipping
Deferred — tracked in qhq-5d0.16Everything else from the three reviews is captured in the post-merge follow-up epic, with reviewer attribution and pointers to existing siblings ( High-level groups deferred:
Greptile threads (the 04-21 P2 inline pair) are obsolete — both targeted |
Pulls in @drernie's child PR addressing two concerns from the multi-agent review: - Split BackendError.transient (health signal) from .retryable (retry-once eligibility). HTTP non-2xx: transient:true, retryable:false. Network failures: both true. read-only-tool retry now checks .retryable. - Bound initial Connecting bootstrap with a 60s timeoutFail; on hang, transition to Failed{acked:false} with cause ConnectingTimeout. - retry() during Connecting is a true no-op (no stale control token). Co-Authored-By: drernie <ernest.prabhakar@gmail.com> * origin/codex/qurator-mcp-risk-fixes: Document retryable default for connector errors Fix Qurator MCP retry and bootstrap timeout risks
Pulls in @drernie's docs PR with the user-visible documentation that was missing from #4840: catalog/CHANGELOG.md entries (Added/Changed) and a new 'Platform Tools via MCP' section + 'Connector Status' subsection in docs/Catalog/Qurator.md. Co-Authored-By: drernie <ernest.prabhakar@gmail.com> * origin/codex/qurator-mcp-changelog: Docs: drop nonexistent "acknowledge" control; clarify continue-without flow Docs: CHANGELOG + Qurator.md entries for #4840
The listener fiber is forkDaemon'd inside `start` (decoupled from the calling fiber's scope), so its lifetime has to be owned explicitly by the React hook. Without a cleanup, every Assistant remount accumulates a daemon. Mirrors the existing cleanup pattern in `useState`. Resolves the long-standing TODO at the top of useActorLayer. Surfaced by sir-sigurd's multi-agent review of #4840 (Tier-1 #2). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Defensive lowercase on the response content-type lookup. @effect/platform lowercases header keys, but the value can still arrive in mixed case. - Validate the JSON-RPC response id against the request id (spec §5). Stateless HTTP makes this unlikely to trip in practice, but a proxy or buggy server emitting a stale id silently returned the wrong result before — now fails as McpProtocolError. - Map HTTP 401 / 403 to the abstract `Auth` tag (transient: false) instead of `Transport`. Token rotation is handled at the redux/auth layer; we don't want auth failures to bump the connector health counter and push us into a reconnect loop that won't help. Test fixtures updated to echo the request id rather than hardcoding 'echo' or 'x', plus two new wire-level tests (id mismatch, 401→Auth). From the multi-agent review of #4840 (sir-sigurd Tier-1 #5, fiskus minor, sir-sigurd Tier-2 #16). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When a tab is hidden or the laptop is closed, browser timers freeze. The 30s heartbeat sleep that was scheduled before suspend can fire much later than its nominal duration on resume — so a connector that lost the server during suspend stays Ready (and gates blocked tool calls) for up to 30s + 5s timeout × 2 thresholds + 60s probe before the user sees it as Disconnected. Add a wake stream fed by `document.visibilitychange → visible` and `window.online`. Heartbeat and reconnect-probe sleeps are race-cancelled on wake so connectivity is re-checked immediately on resume. Single PubSub per Connectors layer fans out via `Stream.fromPubSub` to per-connector consumers; DOM listeners attached via `acquireRelease` and detached on layer release (Assistant unmount). `wake` is threaded through manageConnector / runHeartbeat / runReconnectWithProbe; defaults to `Stream.never` so existing tests (and any direct callers) keep their behavior. `buildService` allocates the wake once and passes it to each `buildConnectorRuntime`. SSR-safe: falls back to `Stream.never` if `document` / `window` aren't in scope. Test added: a Queue-backed wake stream with stub backend; firing the queue causes the heartbeat to ping immediately, well before the 30s cadence would elapse. Surfaced by sir-sigurd's review (Tier-2 #11 — was deferred to qhq-ekc; brought forward as part of this connectivity-on-resume work). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace the nested ternary on `helperSeverity` with `cx(classes.hint, classes[severity])`. The severity styles use `&$hint` parent selectors so the override compiles to `.warning.hint` (specificity 0,2,0) and wins over the base `.hint` (0,1,0) without resorting to `extend`. Adding a third severity later is one rule, no new branch. From fiskus's review of #4840 (also flagged as a nit in sir-sigurd's multi-agent pass). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Status update — pushed Landed
The wake-stream change is broader than originally planned for this session — it addresses the laptop-closed / tab-hidden case where browser timers freeze and the heartbeat sleep can fire arbitrarily late on resume, leaving Qurator gated for up to ~95s before reflecting reality. Now wakes immediately on Tests
Still deferred → tracked in
|
…ents Strip references to internal design-document anchors (D1..D33) and bead IDs (qhq-*) from code comments and docstrings. These were author navigation hints into a private design doc; they don't help external contributors reading the catalog source. The remaining prose stands on its own — the WHY is preserved, the link to the design just isn't. Also trims a few comments that just restate the code (e.g. "loop back to Ready") and consolidates over-long doc blocks. No behavior change. 74/74 Assistant tests green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… comments Multi-agent simplify-pass over the PR. Scoped, low-risk fixes: - Mcp.ts adaptError: drop the special case for HTTP 401/403 → Auth. The wire layer now raises McpAuthError directly when post() sees 401/403, and adaptError stays a clean tag-table mapping. McpAuthError gains an optional `detail` so the wire-side instance carries the HTTP status; the existing "no session token" path keeps its default message. - Connectors.ts: extract pingOrTimeout helper (heartbeat + reconnect probe shared the same ping → timeoutFail → either block). - Connectors.ts: bumpHealth helper caps consecutiveFailures at HEALTH_THRESHOLD + 1 and skips emission when nothing changed — prevents a sustained outage from spamming subscribers tick after tick. Replaces the inline SubscriptionRef.update at heartbeat and callTool sites. - Connectors.ts: expose stateIsUnready predicate (s._tag !== 'Ready') alongside stateIsBlocked. Chat.tsx now uses it instead of the raw _tag literal at two call sites (helperSeverityFor + helperLines filter). - Comment trims across Connectors.ts / Mcp.ts / Tool.ts / Input.tsx / DevTools.tsx / Actor.ts: shorter docstrings on retry, buildConnectorTool, rpc-id check, withTracerPropagation, severity classes, and the useActorLayer cleanup. Drops what restated the code; keeps the WHY. 74/74 Assistant tests still green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Cleanup pass — pushed
Tests: 74/74 Assistant green throughout. Skipped (per agent triage): wake-stream → Context.Tag (over-engineering for one ambient stream), |
|
@greptileai one more time pls |
Greptile P1 on commit 5a97edc flagged a "race in awaitUnblocked": the conversation actor reads `isBlocked` synchronously and forks a waiter; if connectors flip to a non-blocking state between the read and the waiter subscribing to `state.changes`, the suggestion was that the waiter could miss the transition. Sir-sigurd's earlier multi-agent review already triaged this as "did not survive verification" — `SubscriptionRef.changes` replays the current value on subscribe, so the merged stream's first emission re-evaluates `isBlocked` against the post-flip state and take(1) exits immediately. Adding: - An inline comment on the `awaitUnblocked` definition documenting the no-race rationale. - A spec that pins the scenario explicitly: bring a connector to Ready first, then call `awaitUnblocked` and verify it returns. 40/40 Connectors tests green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cut the multi-line rationale to one line and dropped the verbose test scaffolding — the test name + the inline replay-semantics line are enough. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Without `expect(isBlocked).toBe(false)` before the awaitUnblocked call, the test only proves "doesn't throw" — making the precondition explicit guards against an awaitState mismatch silently passing the test. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
@greptileai reassess |
Summary
Wires the Quilt Platform MCP Server into Qurator. Catalog users get the same ~25 platform tools (packages, search, S3, Athena, tabulator,
get_resource) that external MCP clients see, with full lifecycle handling and a persistent status surface tied to the chat input.What changes for users
The catalog's prior in-app
catalog_global_getObjectis reshaped intocatalog_preview(singles3_uriinput), and the platform server's read surface (platform__object_read,platform__s3_object_info,platform__package_browse, etc.) is added alongside it. Two read paths, two purposes:catalog_preview— for content the model needs to interpret. Returns thumbnail-resized images (via the/thumbnailAPI gateway, ≤1024×768), native BedrockDocumentblocks for ≤500 KiB PDFs / Office docs, and language-tagged text. Use when the goal is summarize / describe / extract.platform__object_read— for raw bytes, scripting, or any case where the model doesn't need to interpret content. Returns text decoded UTF-8 (capped at 16 KiB) or base64-encoded binary.A
<reading-objects>guidance message is added to the prompt context: callplatform__s3_object_infofirst to inspectContentLength/ContentType, then pick the right tool from the metadata.Trade-offs vs. the prior in-app tool
catalog_previewis catalog-only. External MCP clients (non-catalog) only see the platform server, so they getplatform__object_readbut not the rich preview path. Native-document support on the platform server is a separate follow-up.jpg|jpeg|png|gif|webp|bmp|tiff|tif|czi) won't preview-resize and will fall through to the platform path. Same as before this PR.Other UX changes
destructiveHint/readOnlyHintannotations is a separate follow-up.Architecture
A connector pairs a configuration (id, title, hint) with a
Backend— the protocol contract (initialize,listTools,callTool,ping). The lifecycle (Connecting → Ready → Disconnected → Failed, heartbeat, reconnect budget) is generic inBackend; the platform-specific bits live in one factory call:Mcp.bearerPassthru({ url, getToken }). Adding a different backend (an OAuth-flavored MCP, an in-process tool source) is a new factory; the lifecycle doesn't move.Per-connector lifecycle
pingevery 30 s with a 5 s timeout. Two consecutive transport failures (from heartbeat or tool calls — they share aHealthref) flip the state toDisconnected. The threshold-crossing watcher subscribes to the SubscriptionRef stream so cross-source crossings fire immediately.readOnlyHint: trueretry once on transport error before counting toward the threshold. Destructive / unannotated tools fail immediately.Failed{!acked}until the user clicks reconnect or "continue without".Failed{acked:true}is sticky — the connector reads asunavailablein the prompt and the LLM stops trying its tools.connector.retryfrom Ready or Disconnected degrades health to threshold so the lifecycle naturally falls into the reconnect probe loop. From Failed it drains the controls queue. From Connecting it's a no-op (bootstrap is already in flight).SubscriptionRef<ConnectorState>. Aggregate predicates (isBlocked,awaitUnblocked,contextContribution) merge across all per-connector refs. Lifecycle fibers are scope-bound and interrupted on Assistant unmount.Conversation integration
Two new state-machine elements:
State.AwaitingConnector { events, timestamp, waiter }— only the waiter fiber handle, no per-connector snapshot.Action.ConnectorReady.A small
advanceFromEventshelper centralizes the gating decision at both entry points (Idle.AskandToolUse.ToolResultlast-call):connectors.isBlockedsynchronously (handlers don't observe asynchronously).connectors.awaitUnblocked; that firesConnectorReadyand the handler re-runs.llmRequestfoldsconnectors.contextContributioninto the prompt context at request time. No React-side bridge.Tool dispatch
Each connector tool's executor goes through
connectors.byId[id].callTool(name, input):Tool.failtext block if not Ready.readOnlyHinttool, the call is retried once.Healthref on transport-level outcome (success → reset, failure → bump). Non-transport errors (server-side application errors, schema decode failures) don't disturb health.Wire client (
Mcp.ts)@effect/platformHttpClientoverFetchHttpClient.layer. Cancellation via fiber interrupt aborts the underlying fetch.Eff.Schemadecode at every response boundary; failures map toMcpProtocolError._tagdiscriminator:McpAuthError,McpTransportError,McpProtocolError,McpRpcError. The connector layer maps these onto an abstractBackendError(Transport/Auth/Protocol/Application); the wire-level tag is preserved oncausefor telemetry but the lifecycle and UI never speak MCP vocabulary.initialize+notifications/initialized,tools/list,tools/call,resources/read,ping.tools/list.inputSchemais forwarded to Bedrock as-is. Server is authoritative on its own schema's draft and structure; we don't reshape.stateless_http=Truemode (text/event-streamresponses with a singlemessageevent).HttpClient.withTracerPropagation(false)). Effect's HttpClient auto-injects W3Ctraceparent+ Zipkinb3headers from the active span; the platform MCP server's CORS allow-list rejects unknown headers (preflight returns 400 "Disallowed CORS headers"), and we don't have a tracing collector on the receiving end either.Tool.makeliftsdescriptionout of the JSON schema body so Bedrock doesn't ship it twice (once ontoolSpec.description, once ininputSchema.json.description). The MCP path already keeps these separate.UI
The user-facing surface is the chat input's helper-text region. When all connectors are
Ready, the helper falls back to the default disclaimer. When any connector is non-Ready, the helper-text region carries:<title>: connecting…/reconnecting…(warning, no actions — already auto-progressing)<title>: couldn't connect • reconnect • continue without(error)<title>: unavailable • reconnect(warning, input re-enabled — acked is sticky)Aggregate severity picks the worst category across connectors. Inline actions use the existing
MessageActionstyle (clickable span, opacity transition on hover).The chat panel is otherwise empty at the top — there's no persistent banner. The prior banner pushed history down and competed with the absolute-positioned hamburger menu; consolidating into the helper-text region makes the "input is disabled because…" relationship spatial.
The DevTools panel exposes a per-connector subsection: heading (
Connector: <title>), hint (muted), state line + inline reconnect/acknowledge actions, error detail row, and a livecontextContributionpreview that hides when nothing is contributable.Auth
Mcp.bearerPassthru({ getToken })is the 1st-party adapter — caller resolves the bearer on every call (the adapter never manages refresh, expiry, or session state). Catalog wiresgetTokento re-read the redux session token via a memoized selector, so token rotation is handled without explicit plumbing. A futureMcp.oauth(...)factory (or any other backend) would be a sibling, not a fork of the lifecycle.Test plan
Mcp.spec.ts(17) — wire-levelmake()integration, content mapping, parseSseToJson.Connectors.spec.ts(36) — state predicates, lifecycle transitions, retry/ack from Failed, callTool gating, transport-error → threshold-crossing, TestClock-driven heartbeat / probe-driven reconnect, read-only retry-once, layer-level service primitives + multi-connector aggregation, stale-control drain on Failed entry, autoload of reference-grade resources.Conversation.spec.ts(8) — Idle / AwaitingConnector / Abort / Discard transitions, ToolUse → blocked → AwaitingConnector path, AwaitingConnector waiter freshness across cycles.navigation.spec.ts(7) — unchanged; snapshot still valid.catalog_previewend-to-end (local catalog → dev stack PMS) — model follows the metadata-first guidance:platform__search_objects→platform__s3_object_info→catalog_previewfor an image;catalog_previewreturns a metadata text block + Image content block; the model renders a description of the rendered image. Error fallback path also exercised (graceful text-block error when the thumbnail fetch fails).catalog_previewon text + image + small PDF.Depends on
registry-<stack>.domain.com/mcp/platform/*and setsCORS_ORIGINSon the MCP task — open, dev-stack-deployed; auto-deploys to nightly on merge.Greptile Summary
This PR wires the Quilt Platform MCP Server into Qurator, adding ~25 platform tools (packages, search, S3, Athena, tabulator) to the catalog's chat UI. It introduces a complete connector abstraction layer (
Connectors.ts,Mcp.ts) with lifecycle state machine, heartbeat, probe-driven reconnect, and anAwaitingConnectorconversation state that gates LLM turns when connectors are unresolved. Thecatalog_global_getObjecttool is reshaped intocatalog_previewwith a two-path read strategy guided by a new system message.Confidence Score: 4/5
PR is safe to merge; no P0/P1 findings. All P2 issues are accounting subtleties and documentation gaps that don't break any current code path.
No critical or blocking bugs found. The four P2 comments cover: a nuanced health-accounting difference for read-only retries, a missing WEBP case in the thumbnail format switch that causes a graceful text fallback rather than a crash, a documentation gap on why HTTP 5xx errors are not retry-eligible, and a frozen-URL note for the dev override. All lifecycle, state-machine, auth, and wire-protocol paths look correct.
catalog/app/components/Assistant/Model/GlobalContext/preview.ts (WEBP format switch), catalog/app/components/Assistant/Model/Connectors/Connectors.ts (retry/health accounting), catalog/app/components/Assistant/Model/Assistant.tsx (fiber-leak on aborted render)
Important Files Changed
Sequence Diagram
sequenceDiagram participant UI as Chat UI participant Conv as ConversationActor participant CS as ConnectorsService participant LC as Lifecycle Fiber participant MCP as Platform MCP Server UI->>Conv: Action.Ask Conv->>CS: isBlocked alt blocked CS-->>Conv: true Conv->>CS: fork awaitUnblocked Conv-->>UI: State.AwaitingConnector LC-->>CS: connector flips Ready CS->>Conv: Action.ConnectorReady end Conv->>CS: contextContribution CS-->>Conv: tools + connector overview messages Conv->>MCP: LLM request via Bedrock MCP-->>Conv: LLMResponse with toolUses loop tool calls Conv->>CS: byId[connId].callTool CS->>LC: gate on Ready state LC->>MCP: POST bearer token MCP-->>LC: McpToolResult LC-->>Conv: Tool.Result end Conv->>MCP: LLM request with tool results MCP-->>Conv: final text response Conv-->>UI: State.IdleComments Outside Diff (1)
catalog/app/components/Assistant/UI/Chat/Chat.tsx, line 519-533 (link)Actor.useStatecalls violate the Rules of HooksThe comment acknowledges this but marks it safe because "
connectors.byIdis built once … and never re-keyed." That guarantee holds today, but the comment is the only guard. The ESLint rule is suppressed rather than satisfied. Ifconnectors(theConnectorsService) ever comes from a re-run ofuseConnectors(e.g., after a future hot-reload, StrictMode double-invoke, or a different connector list), the hook call-count will differ between renders and React will throw.Object.values(connectors.byId).map(c => Actor.useState(c.state))is the concrete violation line.A safe alternative is to lift the per-connector subscription into its own small component (
<ConnectorRow connector={c} />) so each hook call is unconditionally inside its own render, or to aggregate state via the existinguseIsBlocked+ a single merged stream hook, avoiding the per-connectoruseStatecalls at theChatlevel entirely.Reviews (6): Last reviewed commit: "Connectors.spec: restore precondition as..." | Re-trigger Greptile