feat(remote-mcp): Step 1 — interactive OAuth + Connection abstraction + security baseline#175
Merged
Merged
Conversation
Lays the groundwork for REMOTE_MCP_CONNECTIONS.md Step 1 without yet touching the OAuth state machine itself. - Redacted<T> wrapper for sensitive values: overrides toString, toJSON, and util.inspect so secrets render as [redacted] in any logger path (S5 in the spec). reveal() is the only sanctioned exit. - CredentialStore interface + FileCredentialStore (plaintext-disk) impl for opaque secrets like OAuth client_secret. Files at workspaces/<wsId>/credentials/secrets/<key>, mode 0o600, atomic rename, key/wsId path-traversal validation. Boundary that admits a SaaS encrypted impl later. - BundleState gains pending_auth. EngineEventType gains bundle.pending_auth. Both unwired for now; the provider/lifecycle hookup is the next chunk.
Per first-principles review: bundle-level state breaks under member-scoped
OAuth (Step 3) where each member has independent auth state. A's RT can
expire while B's stays valid — no single 'running' / 'pending_auth' for
the bundle. Lay down the Connection abstraction now so Step 3 is additive.
- Connection: { principalId, state, source, authorizationUrl?, lastError? }
state machine matches BundleState. WORKSPACE_PRINCIPAL_ID = '_workspace'
for the workspace-shared identity.
- BundleInstance.connections: Map<string, Connection>. Step 1 always uses
one entry keyed '_workspace'; Step 3 lights up multi-entry maps for
member-scope. BundleInstance.state becomes a derived summary —
summarizeConnectionState applies 'any running wins' rules.
- BundleInstance.oauthScope plumbed through types. Step 1 always treats
declared value as 'workspace' regardless.
- EngineEventType: connection.state_changed (replaces a planned
bundle.pending_auth). Forward-compatible for Step 3 — same event shape
per principal whether workspace or member scope.
Wires the interactive OAuth flow end-to-end on the server side:
- WorkspaceOAuthProvider: onInteractiveAuthRequired callback. When the
headless redirect probe doesn't land on our callback, the provider
registers (state, wsId, serverName) with oauth-flow-registry, swaps
pendingFlow.promise to the registry promise (so awaitPendingFlow
resolves when the HTTP callback lands), fires the callback so
lifecycle can transition the Connection to pending_auth, then throws
UnauthorizedError to escape the SDK auth loop. McpSource.start()'s
existing UnauthorizedError catch path takes over from there — awaits
the registry, calls finishAuth, retries connect.
- BundleLifecycleManager: recordConnectionStateChange owns Connection
state transitions, BundleInstance.state summary recompute, and
connection.state_changed SSE emission. getPendingAuthUrl is the
/initiate route's lookup path; only returns a URL when the named
Connection is in pending_auth (no leaks on stale state).
- installRemote: pre-registers the BundleInstance + 'starting' Connection
BEFORE startBundleSource so the interactive callback (fired during
source.start()) can find the instance to transition. On success ->
connection running. On failure -> connection dead + instance removed.
- Routes:
- POST /v1/mcp-auth/initiate (workspace-authed): looks up the captured
auth URL via lifecycle.getPendingAuthUrl, sets nb_oauth_state cookie
bound to /v1/mcp-auth/callback path (sha256 of state, HttpOnly,
SameSite=Lax, Secure off-localhost, Max-Age=900), returns the URL.
POST-only (no GET) so an <img> or prefetch can't trigger a flow.
CSRF protection via the existing X-Workspace-Id preflight (custom
header forces CORS preflight, blocking simple-form CSRF).
- GET /v1/mcp-auth/callback: timing-safe verifies nb_oauth_state cookie
sha256 matches URL state before resolving the flow registry. Without
the cookie, returns 400 — closes the gap where a leaked state value
alone (referrer, history, network log) could let an attacker drop
tokens into someone else's account. Clears the cookie on success.
Step 1 limitation: install API still blocks until interactive auth
completes (provider's awaitPendingFlow blocks inside startBundleSource).
Acceptable for self-host / dev where the user is the same person; the
banner appears the moment we hit pending_auth so the user knows what to
do. Non-blocking install is a follow-up to make this production-ready
for SaaS.
…avior Unit tests (test/unit/lifecycle-connection.test.ts): - recordConnectionStateChange creates the connection on first call - Emits connection.state_changed with the expected payload - BundleInstance.state mirrors the single connection's state for workspace-scope (Step 1) - authorizationUrl is cleared on transition out of pending_auth - getPendingAuthUrl returns the URL only while pending_auth, null otherwise (closes leak vector if state stale) - Returns null for unknown server / wsId / principal (no enumeration leak) - No-op + no emit when the underlying instance is missing - lastError populates on dead, clears on running Integration tests (test/integration/workspace-oauth-provider.test.ts): - Updated the two interactive-branch cases (302 redirect + 200 login page) to match the NEW behavior: provider throws UnauthorizedError (the SDK's own class), fires onInteractiveAuthRequired with the authorization URL, and registers the flow with oauth-flow-registry such that resolveWithCode resolves awaitPendingFlow's promise. - Smoke test that InteractiveOAuthNotSupportedError is still exported for any consumer importing the deprecated symbol. Total: 2209 unit tests pass, 22+5 skip integration tests pass.
Server (src/bundles/startup.ts):
- URL-bundle path now races source.start() against an early-return
signal triggered by onInteractiveAuthRequired. The moment the provider
determines interactive auth is needed (lifecycle records pending_auth,
SSE event fires, banner appears), startBundleSource returns with a
placeholder meta ('remote (pending auth)'). source.start() continues
in the background and resolves when the user completes auth via
/v1/mcp-auth/initiate -> AS -> /v1/mcp-auth/callback. This unblocks
both the install API path and the workspace-runtime boot loop —
neither hangs waiting for user interaction.
- Source registration is gated on success or pending_auth (not failure):
remote-lifecycle test still asserts hasSource() === false after a
failed connect.
Web (web/src/):
- types.ts: BundleState gains 'pending_auth'. New
ConnectionStateChangedEvent in SseEventMap.
- hooks/useEvents.ts: optional onConnectionStateChanged callback on the
workspace SSE subscription.
- api/client.ts: initiateMcpOAuth(serverName, principalId?) POSTs to
/v1/mcp-auth/initiate, returns { authorizationUrl }. Browser stores
the nb_oauth_state cookie set by the server.
- components/PendingAuthBanner.tsx: amber banner row per pending
Connection. Click Connect -> initiateMcpOAuth -> window.location.assign.
Banner self-clears when the connection's state_changed event arrives
with a non-pending_auth state.
- App.tsx: workspace shell holds the pendingAuth Map<key,event> state,
wires onConnectionStateChanged to add/remove rows, drops the map on
workspace switch, mounts <PendingAuthBanner /> inside <ShellLayout>.
…tion wiring
Closes the full pending_auth → user authorization → running cycle for
URL bundles loaded at platform boot (workspace.json), not just install-
API-driven adds. Verified end-to-end against a mock OAuth+MCP server.
- pending-auth-buffer.ts: process-local buffer for interactive-OAuth
notifications captured BEFORE BundleLifecycleManager exists. boot
ordering is workspace-runtime -> startBundleSource (provider may fire
onInteractiveAuthRequired) -> Runtime constructs lifecycle ->
seedInstance. Without the buffer the pending_auth signal is dropped.
Also adds a connection-running notification path used by background
start() completions to drive lifecycle transitions out of pending_auth.
- workspace-runtime.ts: wires onInteractiveAuthRequired callback that
calls setPendingAuth(buffer) keyed by (wsId, serverName).
- lifecycle.seedInstance: consumes the pending-auth buffer for URL
bundles and either records pending_auth (with captured authUrl) or
running. Either way emits the connection.state_changed SSE event.
- runtime.ts: registers the connection-running handler on
pending-auth-buffer post-construction so subsequent background start()
successes route through lifecycle.recordConnectionStateChange.
- startup.ts: calls notifyConnectionRunning(wsId, serverName) inside
start().then so URL bundles whose interactive OAuth completes after
boot transition the BundleInstance.connection out of pending_auth and
fire the SSE clear-banner event.
- mcp-source.ts: transport.onclose handler is wired AFTER successful
start() rather than at construction. A close during the start handshake
(SDK 401 -> auth throws -> SDK closes transport) is part of the start
flow, not a mid-session crash. Without this, the OAuth retry path
fights HealthMonitor's own restart attempts triggered by the
transient close.
- api/routes/mcp-auth.ts: middleware applied per-handler instead of
via .use(*) on a sub-app. Hono's sub-app wildcard middleware
applies to ALL routes mounted under the same prefix, which would
otherwise gate the unauthenticated /callback on workspace headers
the user's browser cannot set on a return-from-AS navigation.
Adds GET /v1/connections/pending — workspace-authed snapshot the
web client uses to populate the banner on first render
(connection.state_changed SSE only delivers from connect time
forward, so a client connecting after boot would otherwise miss the
signal until reload).
- web/api/client.ts: listPendingConnections() helper.
- web/App.tsx: on workspace activation, fetches the pending snapshot
and seeds the banner state. Subsequent transitions arrive via SSE.
E2E test scaffolding:
- test/scripts/mock-oauth-mcp-server.ts: single-process Bun.serve
implementing OAuth 2.1 metadata, DCR (RFC 7591), authorize
(200 HTML form forcing the interactive branch — same shape as
Granola/Notion/HubSpot), token endpoint, and a Streamable HTTP MCP
endpoint with one mock tool. Runs on :19999 by default.
Verified manually:
1. Add { url: 'http://localhost:19999/mcp', serverName: 'mock-granola' }
to ws_default workspace.json + allowInsecureRemotes=true in
nimblebrain.json.
2. Boot dev server. Bundle hits interactive OAuth, returns early,
workspace-runtime moves on. Connection seeded as pending_auth.
3. Browser loads workspace -> snapshot fetch -> banner appears.
4. Click Connect -> POST /v1/mcp-auth/initiate (sets nb_oauth_state
cookie) -> redirect to mock AS -> Continue button -> /callback ->
cookie verified -> flow registry resolves -> McpSource finishAuth +
retry -> bundle running -> notifyConnectionRunning ->
recordConnectionStateChange(running) -> SSE -> banner clears.
The chat panel is fixed top-0 right-0 z-10. The banner was rendered inside ShellLayout's main column, which the chat overlays — when the chat opens, the right portion of the banner including the Connect button gets hidden under it. Lift the banner to fixed full-viewport-width with z-50 so it sits above both sidebar and chat. ShellLayout gets a sibling spacer (PendingAuthBannerSpacer) that pushes its main content down by the banner's height when the pending Map is non-empty.
…dle events - Replace the ad-hoc allowlist in SseEventManager.emit() with a declarative SSE_ROUTES table keyed on EngineEventType. New event types are one-line additions; typos fail at compile time. - Without a route entry, connection.state_changed was dropped at the SSE boundary — the pending-auth banner never auto-cleared after the user completed interactive OAuth. - Add wsId to bundle.installed / crashed / recovered / dead payloads so those events can be filtered by workspace at the table boundary, closing a pre-existing cross-workspace leak (bundle events were previously fanned out to every workspace's clients). - Add unit coverage for the routing table and the mcp-auth route cookie binding (both uncovered before). - Correct misleading "verifier.json deleted after finishAuth" docstring.
…alId in Step 1 QA review follow-ups: - Replace handwritten constant-time hex compare with crypto.timingSafeEqual. Add a SHA256-hex format guard so malformed cookies are rejected up-front instead of throwing on length-mismatched buffers. - Stop reading `principalId` from /v1/mcp-auth/initiate request body. Step 1 is workspace-scope only — the principal is always WORKSPACE_PRINCIPAL_ID. Step 3 will reintroduce the field with member-auth checks and reject user-supplied "_"-prefixed values.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Step 1 of the remote MCP connections work. Lands the application-layer machinery for DCR-based OAuth-protected remote MCP servers, the per-principal Connection abstraction that Steps 2 + 3 build on, and the security plumbing every later step depends on.
Unlocks Granola and Notion (workspace-scope only — per-member scoping is Step 3).
What's in this branch
starting → pending_auth → running → crashed → dead | stopped),BundleInstance.connections: Map<string, Connection>, summary state derivation. Step 1 always populates with a single_workspaceentry; the multi-entry shape is forward-compatible with member-scope (Step 3).WorkspaceOAuthProvider.onInteractiveAuthRequiredcallback + flow registry +mcp-sourceUnauthorizedErrorretry.installRemoteand the workspace-startup loop kick offstart()non-blocking so the install API doesn't hang on user-interactive auth.POST /v1/mcp-auth/initiate(workspace-authed; sets session-bound HttpOnly state cookie; returns{authorizationUrl}).GET /v1/mcp-auth/callback(verifies cookie matchessha256(state)before consulting the flow registry).connection.state_changedper principal — workspace-scope shows one event stream per bundle (principal_workspace); member-scope (Step 3) will show one per active member. Routed through a declarativeSSE_ROUTEStable inevents.tsso workspace-scoped events filter bywsIdand unrouted types stay internal./initiate→window.location = authorizationUrl. Banner clears when the connection transitions torunning.CredentialStoreinterface +FileCredentialStore(plaintext-disk, mode 0o600 / dir 0o700, atomic writes) — introduced as the boundary that admits a SaaS-encrypted impl in a future swap.Redacted<T>wrapper consumed at theCredentialStoreboundary for logging hygiene; threading it through the OAuth provider's token / DCRclient_secret/ PKCE verifier caches is a follow-up.Verification
bun run verify:staticclean (1 pre-existing warning inurl-validator.ts)bun run test:unit— 2390 pass, 0 failbun run test:integration— 460 pass, 6 skip, 0 failbun run smoke— 17 pass, 0 failbun run dev:emptywith workspace.json containing the Granola URL +NB_API_URL=http://localhost:27249→ bundle starts → pending_auth banner appears → Connect → OAuth flow → tokens persist tomcp-oauth/granola/{client,tokens,verifier}.json→ bundle transitions torunning→ 6 tools registered (granola__list_meetings,granola__get_meetings,granola__get_meeting_transcript,granola__list_meeting_folders,granola__query_granola_meetings,granola__get_account_info) →nb__searchfinds them. Restart preserves auth (starting → runningdirectly).What's NOT in this PR
PR #166 (
feat/connections-tracks) is stacked on this and will follow.Migration / breaking changes
None. New routes, new types, new files. Existing workspaces with no remote URL bundles see zero behavior change.