Skip to content

fix: preserve OAuth popup session state#1235

Merged
RhysSullivan merged 1 commit into
RhysSullivan:mainfrom
sethcarlton:fix-selfhost-oauth-callback-state
Jul 1, 2026
Merged

fix: preserve OAuth popup session state#1235
RhysSullivan merged 1 commit into
RhysSullivan:mainfrom
sethcarlton:fix-selfhost-oauth-callback-state

Conversation

@sethcarlton

Copy link
Copy Markdown
Contributor

Summary

OAuth sessions persist a raw random state token. When a self-hosted flow starts from the organization context, the authorization request wraps that token with the organization slug before sending it to the provider. The provider echoes the wrapped value back on callback, but the shared callback handler passed it directly to oauth.complete, which looks up sessions by the raw token.

Cloud already unwraps this state at its request boundary because it must use orgSlug to select the right tenant before constructing the executor. Self-host is single-tenant, so it does not need org routing on callback, but it still needs the raw token for session lookup.

This decodes wrapped callback state in the shared popup callback path and uses the raw token for both OAuth completion and popup result correlation. Raw callback state still passes through unchanged.

Testing

  • bun run --cwd packages/core/api test -- oauth-popup.test.ts
  • bun run --cwd e2e test:selfhost -- selfhost/oauth-callback-unauthenticated.test.ts

The self-host e2e test was failing before this fix with OAuth session expired or not found on the callback page, and now passes.

@greptile-apps

greptile-apps Bot commented Jul 1, 2026

Copy link
Copy Markdown

Greptile Summary

Fixes a self-host OAuth callback bug where the popup handler was forwarding the provider-echoed org-slug-wrapped state token (base64url({"state":"…","orgSlug":"…"})) directly to oauth.complete instead of the raw session token. Cloud unwraps this at its own routing boundary; self-host was missing the unwrap entirely, causing every self-hosted org-context OAuth flow to fail with "OAuth session expired or not found".

  • runOAuthCallback now calls decodeOAuthCallbackState once at the top of the function and falls back to the raw urlParams.state when no wrapping is present; the resolved sessionId is used consistently in the complete() call and all three result branches (success, provider error, defect error).
  • A unit test is added verifying that a wrapped-state callback passes the raw token to complete() and embeds it in the popup HTML sessionId; the success path is now fully exercised, while the provider-error and defect-error paths with a wrapped state remain untested.

Confidence Score: 4/5

Safe to merge — the fix is minimal and targeted, touching only the state-unwrapping step in runOAuthCallback, and the fallback preserves existing behavior for raw (non-wrapped) state values.

The core logic is correct and the implementation is clean: sessionId is computed once and used consistently across the success, provider-error, and defect-error branches. The new test locks in the success path. The only gap is that the provider-error and complete()-failure branches with a wrapped state are not explicitly tested — a future regression there would be silent.

oauth-popup.test.ts — provider-error + wrapped-state scenario is not covered by the new test.

Important Files Changed

Filename Overview
packages/core/api/src/oauth-popup.ts Adds decodeOAuthCallbackState to unwrap org-slug-prefixed state tokens before passing state to complete() and embedding sessionId in all three result branches (success, provider error, defect). Logic is correct and consistent across all paths.
packages/core/api/src/oauth-popup.test.ts Adds a test for wrapped-state success path, verifying both complete() receives the raw token and the popup HTML uses it as sessionId. Provider-error and complete()-failure paths with a wrapped state are not explicitly exercised.

Sequence Diagram

%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
    participant UI as Opener UI
    participant SH as Self-Host /callback
    participant SDK as decodeOAuthCallbackState
    participant OA as oauth.complete()

    UI->>SH: "GET /callback?state=base64({state:"raw123",orgSlug:"acme"})&code=…"

    Note over SH: Before fix: passes wrapped state to complete()
    SH--xOA: "complete({ state: "base64(…)" }) ❌ session not found"

    Note over SH: After fix
    SH->>SDK: decodeOAuthCallbackState("base64(…)")
    SDK-->>SH: "{ state: "raw123", orgSlug: "acme" }"
    SH->>OA: "complete({ state: "raw123", code: "…" }) ✅"
    OA-->>SH: Connection

    SH-->>UI: "popup HTML { sessionId: "raw123", ok: true }"
    Note over UI: Opener correlates on "raw123" ✅
Loading
%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
sequenceDiagram
    participant UI as Opener UI
    participant SH as Self-Host /callback
    participant SDK as decodeOAuthCallbackState
    participant OA as oauth.complete()

    UI->>SH: "GET /callback?state=base64({state:"raw123",orgSlug:"acme"})&code=…"

    Note over SH: Before fix: passes wrapped state to complete()
    SH--xOA: "complete({ state: "base64(…)" }) ❌ session not found"

    Note over SH: After fix
    SH->>SDK: decodeOAuthCallbackState("base64(…)")
    SDK-->>SH: "{ state: "raw123", orgSlug: "acme" }"
    SH->>OA: "complete({ state: "raw123", code: "…" }) ✅"
    OA-->>SH: Connection

    SH-->>UI: "popup HTML { sessionId: "raw123", ok: true }"
    Note over UI: Opener correlates on "raw123" ✅
Loading

Reviews (1): Last reviewed commit: "fix: preserve OAuth popup session state" | Re-trigger Greptile

Comment on lines +207 to +230
it("completes wrapped callback state with the raw OAuth session state", async () => {
const providerState = encodeOAuthCallbackState({ state: "session-raw", orgSlug: "default" });
const received: Array<{ state: string }> = [];

const html = await Effect.runPromise(
runOAuthCallback<GoogleAuth, never, never>({
complete: (params) => {
received.push({ state: params.state });
return Effect.succeed({
kind: "oauth2",
accessTokenSecretId: "s",
refreshTokenSecretId: null,
});
},
urlParams: { state: providerState, code: "code1" },
toErrorMessage: () => ({ short: "" }),
channelName: "c",
}),
);

expect(received).toEqual([{ state: "session-raw" }]);
expect(html).toContain('"sessionId":"session-raw"');
expect(html).not.toContain(`"sessionId":"${providerState}"`);
});

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Missing test coverage for error paths with wrapped state

The new test validates success + wrapped state, but two other branches go untested: (1) provider error (urlParams.error present) with a wrapped state, and (2) complete() failing with a wrapped state. In both cases sessionId should still be the unwrapped raw token so the opener can correlate the failure popup. The code is correct because sessionId is computed once at the top of runOAuthCallback and shared across all three result paths, but a test for the provider-error + wrapped-state scenario would lock that invariant in explicitly — e.g. urlParams: { state: providerState, error: "access_denied" } with an assertion that html contains "sessionId":"session-raw" and not the wrapped value.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

@RhysSullivan

Copy link
Copy Markdown
Owner

thanks, hooking up the e2e tests to catch this moving forward as well

@RhysSullivan RhysSullivan merged commit 1d6363f into RhysSullivan:main Jul 1, 2026
1 check passed
@sethcarlton sethcarlton deleted the fix-selfhost-oauth-callback-state branch July 2, 2026 01:23
RhysSullivan added a commit that referenced this pull request Jul 2, 2026
The cloud e2e project never gated CI either, so ten scenarios rotted.
Refresh the four whose product behavior moved intentionally:
- connect-card-ssr-origin: install URLs are org-slug-scoped since the
  org-slug console URLs change (#974); accept the slug form.
- connection-owner-isolation: /api/auth/switch-organization was deleted
  with cookie-based org switching (#1000); switch orgs the way the web
  client does, via the x-executor-organization selector header.
- oauth-connections: the popup-state fix (#1235) envelopes the callback
  state as base64url JSON; decode it and assert the inner state + orgSlug.
- unauthenticated-skeleton: the 404 page shipped as a standalone page in
  the same commit as the shell-framed assertion (#986); assert the page
  it actually renders.

Quarantine the six that need product/harness work, each with a reason:
mcp-browser-approval-org-scope + the two browser-approval scenarios
(cloud-only: the mcporter browser-approval completion never lands),
cli-device-login (device-flow terminal never reaches the emulator), and
run-panel-auto-approve (autoApprove leaves the run paused; never green
since the feature landed in #1183).
RhysSullivan added a commit that referenced this pull request Jul 2, 2026
* e2e: fix stale docs, harden dev-CLI status, add cloud+selfhost CI jobs

- e2e/AGENTS.md: the anatomy example predated the service-yielding scenario()
  signature (no more needs/ctx); capability notes said browser was cloud-only
  and mcp-oauth selfhost-only, both wrong per targets/*.ts; file placement now
  lists cloudflare/, local/, cli/; document summary, motel, test:* scripts,
  the viewer/ SPA, pr-media, and the Windows desktop/cli VM targets.
- e2e dev CLI status: probe the app URL before reporting ready (a zombie
  runner with a dead server used to read as healthy), and only parse real
  state files in .dev/ (cloud.journey.json rendered as a garbage DEAD line).
- CI: run the cloud and selfhost e2e projects on every PR/push with failure
  artifacts (trace.zip, session.mp4, step screenshots) uploaded per target.

* Fix the MCP regressions and policy gaps the e2e suite caught

Cloud (hibernatable MCP DO rework fallout):
- server.ts no longer gates MCP dispatch behind the Axiom tracer install: with
  AXIOM_TOKEN unset (any dev boot without motel) every /mcp request fell
  through to the SPA router and 404ed.
- agent-handler mounts a second serve() on /mcp/toolkits/:slug — the agents
  SDK builds an exact-match URLPattern, so the single /mcp handler never saw
  toolkit paths.
- Restore the old envelope's transport contract: JSON-RPC 405 for verbs
  outside GET/POST/DELETE/OPTIONS (was a bare 404), 200 for session DELETE
  (agents SDK answers 204), and a reconnect-worded 404 for requests that
  race a condemned DO's abort.

Selfhost (org-scoped MCP OAuth discovery):
- The org-segment strip middleware now carries the original pathname in an
  internal header, and the protected-resource metadata echoes it, so a client
  that dialed /<org>/mcp/... passes the MCP SDK's RFC 9728 resource check.
  Bare paths are untouched; the header is stripped from unrewritten requests.

Microsoft Graph URL policy:
- microsoftHttpPlugin gains the hosts' local-network dev posture: selfhost,
  cloud, and the cloudflare host thread allowLocalNetwork into
  allowUnsafeUrlOverrides, and the override now also admits plain-http
  loopback URLs (local emulators). Production behavior is unchanged: the
  flag is unset there, and non-loopback http stays rejected even with it.

Stale e2e assertion refreshed for an intentional product change:
- tool-descriptions: the execute inventory is names-only since the skills
  tool slimming; drop the per-connection description assertions.

* test(e2e): repair self-host scenarios and gate the suite in CI

The self-host e2e project never ran in CI, so it drifted red while the app
moved on. Repair the failing scenarios (stale connect-modal selectors, a racy
action-bar position read, a shared-admin connection-count assertion, a
multi-tenant-only org-slug 404 step, and a cloud-shaped toolkit MCP URL), add a
documented skip affordance to the scenario helper, and quarantine the two
Microsoft emulator scenarios that need a canonical block-YAML Graph spec
(tracked separately).

Cherry-picked from origin/fix-selfhost-e2e-and-ci (PR #1239); its CI job is
superseded by the cloud+selfhost matrix job already on this branch.

* test(e2e): quarantine the two agents-SDK transport gaps

Both are real gaps in the hibernatable Agent bridge (standalone SSE
supersede never resolves; response routing scopes JSON-RPC ids per
session instead of per stream), not regressions on this branch. Skip
with reasons so the suite gates CI while the gaps stay visible;
fixing the bridge is tracked separately.

* test(e2e): repair or quarantine the cloud scenarios that drifted on main

The cloud e2e project never gated CI either, so ten scenarios rotted.
Refresh the four whose product behavior moved intentionally:
- connect-card-ssr-origin: install URLs are org-slug-scoped since the
  org-slug console URLs change (#974); accept the slug form.
- connection-owner-isolation: /api/auth/switch-organization was deleted
  with cookie-based org switching (#1000); switch orgs the way the web
  client does, via the x-executor-organization selector header.
- oauth-connections: the popup-state fix (#1235) envelopes the callback
  state as base64url JSON; decode it and assert the inner state + orgSlug.
- unauthenticated-skeleton: the 404 page shipped as a standalone page in
  the same commit as the shell-framed assertion (#986); assert the page
  it actually renders.

Quarantine the six that need product/harness work, each with a reason:
mcp-browser-approval-org-scope + the two browser-approval scenarios
(cloud-only: the mcporter browser-approval completion never lands),
cli-device-login (device-flow terminal never reaches the emulator), and
run-panel-auto-approve (autoApprove leaves the run paused; never green
since the feature landed in #1183).

* lint: suppress the adapter-boundary error checks in the MCP agent handler

The condemned-DO abort surfaces as a plain runtime Error thrown out of the
agents SDK's serve.fetch; its message string is the only signal. Narrow
suppressions with boundary reasons, per the typed-errors skill.

* test(e2e): quarantine the seat-limit scenario on the emulate 0.9.0 Autumn gap

emulate 0.9.0's Autumn customer balances omit the expanded feature object
autumn-js asserts, so useCustomer crashes the org page into the error
boundary. Fixed upstream in UsefulSoftwareCo/emulate#8 (0.9.1); unskip
once the publish lands and the e2e dependency is bumped.

* ci: retrigger

* ci: shard the cloud e2e job so each shard gets a fresh dev stack

A full-suite run against one long-lived cloud dev server degrades partway
through: sign-in starts refusing connections and everything after fails
with fetch errors (the same SSE/OTel memory growth being instrumented on
main). Four shards, each booting its own stack, stay under the threshold.
Re-merge into one job once the leak is fixed.

* ci: split the cloud e2e job into eight shards

Four shards still hit the dev-server degradation a few minutes in on
2-core runners; eight keeps each stack's lifetime under the threshold.

* ci: retry flaky browser scenarios twice on the same stack

The remaining shard failures are scattered single-test Playwright
waitFor timeouts on 2-core runners, not systemic stack death; vitest
--retry clears them without hiding real regressions (a consistent
failure still fails after 3 attempts).

* test(e2e): quarantine the Graph default-add scenario on CI runners

Compiling the Graph spec inside dev workerd 500s on 2-core GitHub
runners and takes the dev stack down for every scenario after it in the
shard (the auth-hint/org-slug/docs-link failures in the same shard were
all downstream of this). Local runs are unaffected; skip only under CI.

* selfhost: read the local-network posture from env in the plugins seam

plugins() runs per request; loadConfig() does filesystem work (data
dir, secret key resolution) that should not ride the request path. The
env read is the same computation loadConfig makes for the flag.

* e2e: bump @executor-js/emulate to 0.10.0, unskip the seat-limit scenario

0.10.0 ships the Autumn balances.feature expansion autumn-js asserts
(UsefulSoftwareCo/emulate#8), so the org page renders again and the
scenario passes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants