cloud: share executor stack between HTTP and MCP DO; wire DO telemetry#281

Merged: RhysSullivan merged 3 commits into main from rs/mcp-do-shared-layer on Apr 17, 2026

@RhysSullivan
Owner

Summary

Core changes expand on #280 so that the MCP session DO gets the same telemetry + shared code paths as the stateless HTTP API — without introducing drift between them.

  • Shared makeExecutionStack (new services/execution-stack.ts) — extracts the duplicated executor + engine + autumn wiring that lived in both api/protected.ts:41-48 and mcp-session.ts:121-134. Both callers now go through one function, so changes to engine construction flow to both automatically.
  • Split shared services into three layers
    • CoreSharedServices = WorkOS + Autumn (DB- and tracer-agnostic)
    • HTTP SharedServices = CoreSharedServices + per-request DbLive + fetch TelemetryLive
    • MCP DO merges CoreSharedServices with its long-lived DbLive + DoTelemetryLive
  • DO telemetry via a self-contained OTEL SDK. The DO runs in a separate isolate from the main fetch handler, so otel-cf-workers' global provider isn't available there. DoTelemetryLive provisions its own WebSdk layer with a SimpleSpanProcessor (no ctx.waitUntil to rely on for batching in a DO), so Effect.withSpan inside the DO reaches Axiom.
  • Harden DO init() error handling. A partial init now calls cleanup() before rethrowing, so a failed init can't leave a dangling DB socket + half-built MCP server.
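
The choice of a SimpleSpanProcessor in the telemetry bullet above comes down to when spans leave the process. The following is a minimal sketch of that trade-off in plain TypeScript. It does not use the OpenTelemetry SDK; `Span`, `Exporter`, `SpanProcessor`, `simpleProcessor` and `batchProcessor` are hypothetical stand-ins invented for illustration.

```typescript
// Hypothetical stand-ins, not the OpenTelemetry API.
interface Span { label: string }
type Exporter = (spans: Span[]) => void;

interface SpanProcessor {
  onEnd(span: Span): void;
  flush(): void;
}

// Exports every span as soon as it ends, so nothing stays buffered.
function simpleProcessor(exporter: Exporter): SpanProcessor {
  return {
    onEnd: (span) => exporter([span]),
    flush: () => {},
  };
}

// Buffers spans until flush() is called. If nothing ever triggers the
// flush, the buffered spans are never exported.
function batchProcessor(exporter: Exporter): SpanProcessor {
  let buffer: Span[] = [];
  return {
    onEnd: (span) => {
      buffer.push(span);
    },
    flush: () => {
      if (buffer.length > 0) exporter(buffer);
      buffer = [];
    },
  };
}

const exportedSimple: string[] = [];
const exportedBatch: string[] = [];
const simple = simpleProcessor((spans) => {
  exportedSimple.push(...spans.map((s) => s.label));
});
const batch = batchProcessor((spans) => {
  exportedBatch.push(...spans.map((s) => s.label));
});

simple.onEnd({ label: "McpSessionDO.init" });
batch.onEnd({ label: "McpSessionDO.init" });
```

In this sketch the simple processor has already exported its span, while the batch processor still holds it until something calls `flush()`. That is the situation the PR avoids by not relying on a deferred flush inside the DO.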

What got tried and reverted

Tried wrapping McpSessionDO with otel-cf-workers' instrumentDO and @sentry/cloudflare's instrumentDurableObjectWithSentry. Both break `this` binding on `WorkerTransport`'s stream primitives, so every MCP call returns a 500 with DOMException "Illegal invocation". Rolled both back in-flight and switched to the self-contained `DoTelemetryLive` approach above. DO errors surface via `console.error` (wrangler tail / CF Workers Observability) and as `exception` span events on the `McpSessionDO.init` Effect span in Axiom.

Sentry capture for DO errors is still a gap — worth a follow-up to manually `Sentry.init()` inside the DO isolate so `Sentry.captureException` lands properly there. Didn't want to tangle that with this refactor.

Test plan

  • `bun run typecheck` clean
  • `bun run test` — all 50 cloud tests pass including `mcp-session.e2e.node.test.ts` (which exercises the DO end-to-end through real postgres)
  • Deployed to production; `/`, `/.well-known/oauth-protected-resource`, and `/mcp` all respond correctly
  • Reconnect your local Claude Code MCP client and verify MCP calls still work — the existing session pointed at a pre-refactor DO version, so a reconnection is needed

Pulls the duplicated executor + engine + autumn-tracking wiring out of
`protected.ts` and `mcp-session.ts` into a single `makeExecutionStack`
so changes to engine construction flow to both entry points automatically.
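
As a rough illustration of the extraction described above, here is a sketch of a single factory shared by two entry points. Only the name `makeExecutionStack` comes from this PR. The `Db`, `Tracer` and `ExecutionStack` shapes and the string-based "wiring" are placeholders, and the real function presumably builds Effect services rather than strings.

```typescript
// All types below are illustrative placeholders, not the real signatures.
interface Db { label: string }
interface Tracer { label: string }
interface ExecutionStack {
  executor: string;
  engine: string;
  usageTracking: string;
}

// One factory owns the wiring order; callers supply only what differs.
function makeExecutionStack(deps: { db: Db; tracer: Tracer }): ExecutionStack {
  const executor = `executor(${deps.db.label})`;
  const engine = `engine(${executor}, ${deps.tracer.label})`;
  const usageTracking = `autumn(${engine})`;
  return { executor, engine, usageTracking };
}

// HTTP entry point: per-request dependencies.
const httpStack = makeExecutionStack({
  db: { label: "per-request-db" },
  tracer: { label: "fetch-tracer" },
});

// MCP session DO entry point: long-lived dependencies.
const doStack = makeExecutionStack({
  db: { label: "long-lived-db" },
  tracer: { label: "do-tracer" },
});
```

Because both call sites go through the same function, a change to how the engine is constructed only has to be made once.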

Also splits shared services:
- CoreSharedServices = WorkOS + Autumn (DB- and tracer-agnostic)
- SharedServices = CoreSharedServices + per-request DbLive + fetch TelemetryLive
- MCP DO merges CoreSharedServices with its long-lived DbLive and a
  self-contained `DoTelemetryLive` (WebSdk + SimpleSpanProcessor)
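
The three-way split can be pictured as one shared core plus two different completions. The sketch below uses plain objects instead of Effect layers, so it only shows the shape of the composition. The interfaces and the `httpSharedServices` / `doServices` helpers are assumptions made for illustration.

```typescript
// Placeholder service shapes; in the PR these are Effect layers.
interface CoreServices { workos: string; autumn: string }
interface DbService { db: string }
interface TelemetryService { telemetry: string }

type FullServices = CoreServices & DbService & TelemetryService;

// DB- and tracer-agnostic core, shared by both entry points.
const coreSharedServices: CoreServices = { workos: "WorkOS", autumn: "Autumn" };

// HTTP: core + per-request DB + fetch-handler telemetry.
function httpSharedServices(requestId: string): FullServices {
  return {
    ...coreSharedServices,
    db: `DbLive(request ${requestId})`,
    telemetry: "TelemetryLive(fetch)",
  };
}

// MCP DO: core + one long-lived DB + self-contained telemetry.
function doServices(): FullServices {
  return {
    ...coreSharedServices,
    db: "DbLive(long-lived)",
    telemetry: "DoTelemetryLive",
  };
}

const httpServices = httpSharedServices("r1");
const durableServices = doServices();
```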

Adds `DoTelemetryLive` for the DO isolate: since the DO has no global
TracerProvider (otel-cf-workers only installs one on the main fetch
handler), the DO provisions its own SDK so Effect.withSpan inside the
DO's Effect programs reaches Axiom.

DO class is deliberately NOT wrapped with otel-cf-workers' instrumentDO
or Sentry's instrumentDurableObjectWithSentry — both break `this`
binding on WorkerTransport's stream primitives and crash every MCP
request with DOMException 'Illegal invocation'. DO errors surface via
console logs (wrangler tail / CF Workers Observability) and Effect span
exception events in Axiom.
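
For readers unfamiliar with this failure class, the sketch below reproduces an "Illegal invocation" style error with a generic proxy wrapper. It is not a reconstruction of what either instrumentation library does internally. `BrandChecked`, `wrapNaive` and `wrapSafe` are hypothetical names, and the `WeakSet` check stands in for the receiver check that platform stream objects perform.

```typescript
// Stand-in for a native object whose methods verify their receiver,
// the way platform stream primitives do.
const realInstances = new WeakSet<object>();

class BrandChecked {
  constructor() {
    realInstances.add(this);
  }
  write(chunk: string): string {
    if (!realInstances.has(this)) {
      throw new TypeError("Illegal invocation");
    }
    return `wrote ${chunk}`;
  }
}

// Naive wrapper: methods end up being called with the proxy as `this`.
function wrapNaive<T extends object>(target: T): T {
  return new Proxy(target, {});
}

// Safer wrapper: every method is re-applied against the original object.
function wrapSafe<T extends object>(target: T): T {
  return new Proxy(target, {
    get(obj, prop) {
      const value = Reflect.get(obj, prop, obj);
      if (typeof value !== "function") return value;
      return (...args: unknown[]) => Reflect.apply(value, obj, args);
    },
  });
}

let naiveError: unknown = undefined;
try {
  wrapNaive(new BrandChecked()).write("a");
} catch (error) {
  naiveError = error;
}

const safeResult = wrapSafe(new BrandChecked()).write("a");
```

Here the naive wrapper fails because the proxy is not the object the method expects as its receiver, while the second wrapper preserves the original `this`.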

Also hardens DO init error handling: partial init now calls cleanup()
before rethrowing so a failed init can't leave a dangling DB socket.
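
The init hardening follows a common pattern: release whatever a partial init acquired, then rethrow. A minimal sketch, assuming a hypothetical `McpSessionSketch` class with two boolean flags in place of the real DB socket and MCP server:

```typescript
// Hypothetical session object; the flags stand in for real resources.
class McpSessionSketch {
  dbOpen = false;
  serverBuilt = false;
  failServerBuild: boolean;

  constructor(failServerBuild: boolean) {
    this.failServerBuild = failServerBuild;
  }

  init(): void {
    try {
      this.dbOpen = true; // step 1: open the DB connection
      if (this.failServerBuild) {
        throw new Error("MCP server build failed"); // step 2 fails
      }
      this.serverBuilt = true;
    } catch (error) {
      // Release whatever the partial init acquired, then rethrow so the
      // caller still sees the original failure.
      this.cleanup();
      throw error;
    }
  }

  cleanup(): void {
    this.dbOpen = false;
    this.serverBuilt = false;
  }
}

const failing = new McpSessionSketch(true);
let initError: unknown = undefined;
try {
  failing.init();
} catch (error) {
  initError = error;
}

const healthy = new McpSessionSketch(false);
healthy.init();
```
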
cloudflare-workers-and-pages Bot commented Apr 17, 2026

Deploying with Cloudflare Workers

Status: ✅ Deployment successful!
Name: executor-cloud
Latest Commit: 8177e39
Updated (UTC): Apr 17 2026, 09:58 PM

pkg-pr-new Bot commented Apr 17, 2026

@executor/sdk

npm i https://pkg.pr.new/RhysSullivan/executor/@executor/sdk@281

@executor/plugin-file-secrets

npm i https://pkg.pr.new/RhysSullivan/executor/@executor/plugin-file-secrets@281

@executor/plugin-google-discovery

npm i https://pkg.pr.new/RhysSullivan/executor/@executor/plugin-google-discovery@281

@executor/plugin-graphql

npm i https://pkg.pr.new/RhysSullivan/executor/@executor/plugin-graphql@281

@executor/plugin-keychain

npm i https://pkg.pr.new/RhysSullivan/executor/@executor/plugin-keychain@281

@executor/plugin-mcp

npm i https://pkg.pr.new/RhysSullivan/executor/@executor/plugin-mcp@281

@executor/plugin-oauth2

npm i https://pkg.pr.new/RhysSullivan/executor/@executor/plugin-oauth2@281

@executor/plugin-onepassword

npm i https://pkg.pr.new/RhysSullivan/executor/@executor/plugin-onepassword@281

@executor/plugin-openapi

npm i https://pkg.pr.new/RhysSullivan/executor/@executor/plugin-openapi@281

@executor/plugin-workos-vault

npm i https://pkg.pr.new/RhysSullivan/executor/@executor/plugin-workos-vault@281

executor

npm i https://pkg.pr.new/RhysSullivan/executor@281

commit: ef5eb1b

HTTP path: TanStack Start + our Effect error handlers swallow throws
and return 500 responses, so `Sentry.withSentry` (which only sees
uncaught exceptions) never captures them. Adds explicit
`Sentry.captureException` in the places that turn errors into 500s:
- `toErrorServerResponse` / `toErrorResponse` for the Effect API paths
- `handleMcpRequest_POST`'s catch for the MCP proxy path
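
The reasoning above is that an error converted into a normal 500 response never becomes an uncaught exception, so it has to be reported at the conversion point. A minimal sketch of that idea follows. `capture` stands in for a reporter such as `Sentry.captureException`; `handle`, the response shape and this simplified `toErrorResponse` are assumptions, not the real helpers.

```typescript
// `capture` stands in for an error reporter; the response shape is a placeholder.
type Capture = (error: unknown) => void;
interface ErrorResponse { status: number; body: string }

function toErrorResponse(error: unknown, capture: Capture): ErrorResponse {
  // The error is about to become an ordinary response, so a hook that only
  // observes uncaught exceptions would never see it. Report it here.
  capture(error);
  return { status: 500, body: "Internal Server Error" };
}

function handle(work: () => string, capture: Capture): ErrorResponse {
  try {
    return { status: 200, body: work() };
  } catch (error) {
    return toErrorResponse(error, capture);
  }
}

const captured: unknown[] = [];
const failedResponse = handle(
  () => {
    throw new Error("boom");
  },
  (error) => {
    captured.push(error);
  },
);
const okResponse = handle(() => "ok", (error) => {
  captured.push(error);
});
```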

DO path: wraps McpSessionDO with
`Sentry.instrumentDurableObjectWithSentry`. Unlike otel-cf-workers'
`instrumentDO` (which broke `this` binding on WorkerTransport's
stream primitives), Sentry's wrapper uses `Reflect.apply` and is
safe. Sets `instrumentPrototypeMethods: true` because the DO's
`init` / `handleRequest` live on the prototype — the default
auto-wrap only visits own properties, which misses them, and errors
thrown inside `init()` never reach Sentry without this flag.
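
The own-property versus prototype distinction is plain JavaScript behavior and can be checked directly. In the sketch below, `ExampleDO` is a hypothetical class. It only shows what an own-properties walk would and would not find, and makes no claim about how the Sentry wrapper is implemented.

```typescript
class ExampleDO {
  // Arrow-function field: an own property of each instance.
  fetchHandler = (): string => "fetch";

  // Regular methods: these live on ExampleDO.prototype.
  init(): string {
    return "init";
  }
  handleRequest(): string {
    return "handleRequest";
  }
}

const instance = new ExampleDO();

// What a wrapper finds if it only walks the instance's own properties.
const ownFunctions = Object.keys(instance).filter(
  (key) => typeof (instance as unknown as Record<string, unknown>)[key] === "function",
);

// What it finds if it also walks the prototype.
const prototypeFunctions = Object.getOwnPropertyNames(ExampleDO.prototype).filter(
  (key) => key !== "constructor",
);
```

`init` and `handleRequest` appear only in the second list, which matches the explanation for why a prototype-aware option is needed to cover them.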

Verified end-to-end: debug smoke tests captured both an explicit
`captureException` call (HTTP path) and a DO init() throw, both
landing in Sentry issues on the node-cloudflare-workers project.
Debug endpoints removed before merge.
…casts

- otelConfig: read AXIOM_TOKEN / AXIOM_DATASET from the typed `server`
  proxy, drop the inline OtelEnv interface and nullish fallbacks
- sentryOptions: read SENTRY_DSN from `server`, drop the
  `as unknown as { SENTRY_DSN?: string }` cast

`_env: Env` is kept only so Sentry's generics infer the DO's Env type
correctly — the value is unused.
RhysSullivan merged commit b574122 into main on Apr 17, 2026
8 checks passed
RhysSullivan added a commit that referenced this pull request Apr 18, 2026
Client fingerprint capture originally landed on rs/mcp-do-shared-layer
as commit d5818e3 but wasn't included in the merge of PR #281. Ports
the logic into this PR's Effect-native annotateMcpRequest helper,
replacing the pre-port trace.getActiveSpan().setAttributes call with
Effect.annotateCurrentSpan.

Captures on every /mcp hit (including unauthorized ones):
- CF request metadata (country/city/region/timezone, ASN/AS-org, TLS,
  HTTP protocol, colo)
- Whitelisted request headers under mcp.http.header.* plus the full
  sorted header-name list to surface anything unexpected; Authorization
  recorded only as scheme + length
- Verified JWT claims (org id + account id) when present
- For POST requests, the parsed JSON-RPC body — mcp.rpc.method and
  method-specific attrs (initialize → mcp.client.{name,version,title,
  protocol_version}, capability keys, bounded capabilities/info JSON;
  tools/call → mcp.tool.name; resources/read → mcp.resource.uri;
  prompts/get → mcp.prompt.name)
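
Two of the capture rules above are easy to sketch: recording Authorization as scheme plus length only, and mapping a parsed JSON-RPC body to method-specific attributes. This is not the `annotateMcpRequest` helper itself. `authorizationAttrs` and `rpcAttrs` are hypothetical, and the two `mcp.http.header.authorization.*` keys are assumed names; the `mcp.rpc.method`, `mcp.tool.name`, `mcp.resource.uri` and `mcp.prompt.name` keys come from the commit message.

```typescript
type Attrs = Record<string, string | number>;

// Record only the scheme and total length, never the credential itself.
// The attribute key names here are assumptions.
function authorizationAttrs(header: string | undefined): Attrs {
  if (header === undefined) return {};
  const spaceIndex = header.indexOf(" ");
  const scheme = spaceIndex === -1 ? header : header.slice(0, spaceIndex);
  return {
    "mcp.http.header.authorization.scheme": scheme,
    "mcp.http.header.authorization.length": header.length,
  };
}

// Map a parsed JSON-RPC body to span attributes for a few known methods.
function rpcAttrs(body: unknown): Attrs {
  if (typeof body !== "object" || body === null) return {};
  const { method, params } = body as { method?: unknown; params?: unknown };
  if (typeof method !== "string") return {};
  const attrs: Attrs = { "mcp.rpc.method": method };
  const p = (typeof params === "object" && params !== null ? params : {}) as Record<string, unknown>;
  const toolOrPromptName = p["name"];
  const resourceUri = p["uri"];
  if (method === "tools/call" && typeof toolOrPromptName === "string") {
    attrs["mcp.tool.name"] = toolOrPromptName;
  }
  if (method === "resources/read" && typeof resourceUri === "string") {
    attrs["mcp.resource.uri"] = resourceUri;
  }
  if (method === "prompts/get" && typeof toolOrPromptName === "string") {
    attrs["mcp.prompt.name"] = toolOrPromptName;
  }
  return attrs;
}

const authAttrs = authorizationAttrs("Bearer abc123");
const toolAttrs = rpcAttrs({
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: { name: "search" },
});
```

The resulting attribute maps could then be passed to a span annotation call such as `Effect.annotateCurrentSpan`, which the commit message names as the mechanism used.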