
feat(auth): support Claude Code subscription tokens via secret.value prefix detection #2

Merged
nickwinder merged 6 commits into main from nick/agentic-usability/oauth-subscription-auth
May 15, 2026

Conversation

@nickwinder
Contributor

@nickwinder nickwinder commented May 14, 2026

Why

A full A/B sweep against a real SDK costs roughly $135–$270 in per-token API billing (Opus-class models). Claude Code on a Pro / Max / Team / Enterprise plan can authenticate with a long-lived subscription token instead, giving flat-rate billing tied to the plan.

Summary

SandboxAgentConfig stays a single shape — both auth modes flow through the same microsandbox Secret.env() TLS-substitution code path. The runtime sniffs the resolved secret.value's prefix at sandbox-create time and picks which env var name the placeholder lands under:

| Resolved value prefix | Sandbox env var |
| --- | --- |
| sk-ant-api… | ANTHROPIC_API_KEY (= secret.envVar) |
| sk-ant-oat… | CLAUDE_CODE_OAUTH_TOKEN |
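
The mapping in the table can be sketched as a single conditional. This is a minimal illustration, not the code in src/sandbox/microsandbox.ts: `OAUTH_TOKEN_PREFIX` and `OAUTH_TOKEN_ENV_VAR` are names mentioned in this PR, while `pickAuthEnvVar` and its parameters are hypothetical.

```typescript
// Constant names come from this PR; the function itself is a hypothetical sketch.
const OAUTH_TOKEN_PREFIX = "sk-ant-oat";
const OAUTH_TOKEN_ENV_VAR = "CLAUDE_CODE_OAUTH_TOKEN";

/**
 * Pick the sandbox env var name that will carry the placeholder.
 * `adapterEnvVar` stands in for secret.envVar (ANTHROPIC_API_KEY for claude).
 */
function pickAuthEnvVar(resolvedValue: string, adapterEnvVar: string): string {
  return resolvedValue.startsWith(OAUTH_TOKEN_PREFIX)
    ? OAUTH_TOKEN_ENV_VAR
    : adapterEnvVar;
}
```

Only the env var name depends on the prefix; the value itself takes the same Secret.env() route in both cases.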

In both cases the cleartext never enters the VM — microsandbox swaps the $MSB_<env-var-name> placeholder for the real value only on outbound TLS to the allowed host (api.anthropic.com for claude).
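
As a rough mental model of that swap, the toy function below rewrites the placeholder only when the request targets the allowed host. It is an assumption-laden sketch: the real substitution happens inside microsandbox's TLS layer, and `OutboundRequest` and `substituteOnEgress` are hypothetical names that exist only for this illustration.

```typescript
// Toy model only; microsandbox performs the real substitution at TLS time.
interface OutboundRequest {
  host: string;
  headers: Record<string, string>;
}

function substituteOnEgress(
  req: OutboundRequest,
  envVarName: string,
  realValue: string,
  allowedHost: string,
): OutboundRequest {
  // Requests to any other host keep the inert placeholder.
  if (req.host !== allowedHost) return req;

  const placeholder = `$MSB_${envVarName}`;
  const headers: Record<string, string> = {};
  for (const name of Object.keys(req.headers)) {
    // Replace every occurrence of the placeholder in this header value.
    headers[name] = req.headers[name].split(placeholder).join(realValue);
  }
  return { host: req.host, headers };
}
```

Inside the VM, the process only ever sees the `$MSB_…` string, which is why a leaked environment dump would not expose the credential.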

Point secret.value at the host env var that holds your credential:

"agents": {
  "executor": { "command": "claude", "secret": { "value": "$ANTHROPIC_API_KEY"       } },
  "judge":    { "command": "claude", "secret": { "value": "$CLAUDE_CODE_OAUTH_TOKEN" } }
}

You can mix and match auth modes per role.
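
For readers unfamiliar with the `$NAME` convention in secret.value, here is a hedged sketch of how such a reference might be resolved against the host environment. The repository's own `resolveValue` has a different signature and may behave differently; `resolveSecretValue`, `HostEnv`, and the treatment of values without a leading `$` as literals are assumptions made for this example.

```typescript
// Hypothetical resolver; not the repository's resolveValue().
type HostEnv = Record<string, string | undefined>;

function resolveSecretValue(value: string, hostEnv: HostEnv): string {
  // Assumption: a value without a leading "$" is taken literally.
  if (!value.startsWith("$")) return value;

  const name = value.slice(1);
  const resolved = hostEnv[name];
  if (resolved === undefined || resolved === "") {
    throw new Error(`Host env var ${name} is not set`);
  }
  return resolved;
}
```

In real use the second argument would be the host process environment; passing it explicitly keeps the sketch easy to test.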

User flow for OAuth

claude setup-token                         # one-time, interactive, ~1yr token
export CLAUDE_CODE_OAUTH_TOKEN='<token>'   # before running the eval

Then set secret: { value: "$CLAUDE_CODE_OAUTH_TOKEN" } on the sandbox roles you want to bill against the subscription.

End-to-end verification

Smoke-tested against TC-001 with a real Claude Code subscription token + microsandbox VM:

  • Egress shows Authorization: Bearer $MSB_CLAUDE_CODE_OAUTH_TOKEN (the placeholder, pre-substitution) leaving claude
  • /api/eval/sdk-… returned 200 and /api/claude_code/settings returned 404 (auth accepted; would be 401 if the placeholder had leaked to the wire) — confirming microsandbox swapped the placeholder for the real OAuth token at TLS time
  • Exit 0, real Web SDK solution produced in 27s

Implementation

  • applyAgentAuth(secret, adapter, secrets, env) in src/sandbox/microsandbox.ts is now a single path that always pushes a Secret.env(...) entry; the prefix only picks the env var name.
  • buildJudgeAllowlist simplified — both modes derive the agent-host allowlist entry from secret.baseUrl (always populated by validation from adapter defaults for known agents).
  • Removed the standalone isOAuthSecret helper from the public module surface (inlined as a local conditional inside applyAgentAuth).
  • Removed the earlier resolveOAuthToken() and buildAgentSecret() helpers — both subsumed by applyAgentAuth.
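
Deriving the allowlist entry from secret.baseUrl amounts to taking the URL's hostname. The sketch below is an illustration under assumptions, not the body of buildJudgeAllowlist: `agentAllowlist` is a hypothetical name, and merging in `additionalAllowHosts` this way is a guess at how the adapter field is used.

```typescript
// Hypothetical helper; shows hostname derivation only.
function agentAllowlist(baseUrl: string, additionalAllowHosts: string[] = []): string[] {
  const hosts = [new URL(baseUrl).hostname, ...additionalAllowHosts];
  // Drop duplicates while keeping first-seen order.
  return hosts.filter((host, index) => hosts.indexOf(host) === index);
}
```

Because validation always populates secret.baseUrl from the adapter default for known agents, the same derivation serves both auth modes.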

Trade-off

Depends on Anthropic's documented OAuth-vs-API-key prefix scheme (sk-ant-oat… vs sk-ant-api…). If that scheme ever changes, the eval would route the credential to the wrong env var slot, and Claude would respond with an auth error that is visible in run logs.

Verification

  • 328 unit tests pass; type-check + lint clean
  • End-to-end TC-001 smoke verified above with real OAuth token + microsandbox + Anthropic API

…iption token

Adds a first-class subscription-auth option for executor/judge sandbox roles:

    "agents": {
      "executor": { "command": "claude", "useOAuth": true },
      "judge":    { "command": "claude", "useOAuth": true }
    }

Why this exists: per-token API billing for a full A/B sweep against a
real SDK costs ~$135-$270 (Opus 4.7); we just stopped a run partway in
at ~$30 sunk. Claude Code on a Pro/Max/Team/Enterprise plan can
authenticate via a long-lived subscription token instead — flat-rate
billing tied to the plan.

How it works: the framework's existing `secret` path uses microsandbox
TLS-injection — the cleartext value never enters the VM, only a
placeholder substituted on the wire for the allowed host. That model is
fundamentally incompatible with `CLAUDE_CODE_OAUTH_TOKEN` because Claude
reads the token directly from `process.env`. So `useOAuth: true`:

  - Resolves CLAUDE_CODE_OAUTH_TOKEN from the host environment
    (fail-fast with a setup-token hint if unset)
  - Injects it into the sandbox as a plain env var via `sandbox.env`
  - Skips `buildAgentSecret` for that role (no API key in env at all,
    so Claude's auth precedence falls through cleanly to OAuth)
  - Sets ANTHROPIC_BASE_URL from the adapter default
  - For judge: contributes the adapter default hostname to the network
    lockdown allowlist

Validation: exactly one of `secret` or `useOAuth: true` must be set per
sandbox role. `useOAuth: true` requires `command: "claude"`. Setting
both is rejected. Adapter-side enforcement keeps the auth surface
intentionally narrow.
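
The validation rule in this commit (later superseded by prefix detection) can be sketched as follows. `RoleConfig` and `validateRoleAuth` are hypothetical names, and the error strings are invented for illustration; only the three rules themselves come from the commit message.

```typescript
// Hypothetical shape; mirrors only the fields this commit message mentions.
interface RoleConfig {
  command: string;
  secret?: { value: string };
  useOAuth?: boolean;
}

function validateRoleAuth(role: RoleConfig): string[] {
  const errors: string[] = [];
  const hasSecret = role.secret !== undefined;
  const wantsOAuth = role.useOAuth === true;

  if (hasSecret && wantsOAuth) {
    errors.push("set either secret or useOAuth, not both");
  }
  if (!hasSecret && !wantsOAuth) {
    errors.push("one of secret or useOAuth: true is required");
  }
  if (wantsOAuth && role.command !== "claude") {
    errors.push('useOAuth: true requires command: "claude"');
  }
  return errors;
}
```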

User flow (one-time host setup):
  claude setup-token                          # interactive, ~1 yr token
  export CLAUDE_CODE_OAUTH_TOKEN='<token>'    # then run the eval

Tests: 354 pass (added 6 — 4 config validation + 2 resolveOAuthToken).
README + config-schema reference document the new path. Type-check clean.
@nickwinder nickwinder self-assigned this May 14, 2026
…uth flag

Replaces the dedicated `useOAuth: true` field on `SandboxAgentConfig` with
runtime prefix detection on the resolved `secret.value`:

  - `sk-ant-api-…` → API-key path (microsandbox TLS injection, unchanged)
  - `sk-ant-oat-…` → OAuth path (plain `CLAUDE_CODE_OAUTH_TOKEN` env var)

User experience is now one consistent shape — always set `secret.value` to a
host env var reference; the runtime picks the auth mode from the resolved
value at sandbox-create time:

    "executor": { "command": "claude", "secret": { "value": "$ANTHROPIC_API_KEY"      } }
    "executor": { "command": "claude", "secret": { "value": "$CLAUDE_CODE_OAUTH_TOKEN" } }

Mechanics:
  - New `applyAgentAuth(secret, adapter, secrets, env)` helper in
    `microsandbox.ts` consolidates the 3 sandbox-creation call sites
    (`execute.ts`, `judge.ts`, `sandbox.ts`) into a single call. The helper
    handles both auth modes internally.
  - New `isOAuthSecret(secret)` helper exposes the same prefix check for the
    judge's `buildJudgeAllowlist` (which adds the adapter's default hostname
    to the lockdown allowlist when in OAuth mode, since the OAuth path has no
    `secret.baseUrl` to derive from).
  - `SandboxAgentConfig.secret` becomes required again (no parallel field).
  - Removed `resolveOAuthToken()` and `buildAgentSecret()` — both subsumed by
    `applyAgentAuth`.

Trade-off: depends on Anthropic's documented OAuth-vs-API-key prefix scheme
(`sk-ant-oat-` vs `sk-ant-api-`). If that scheme changes, the eval silently
misclassifies. Caller failure mode is an auth error from the Claude API at
request time, which is observable in run logs.

Tests: drop 4 useOAuth config tests + 2 resolveOAuthToken tests; add 3 tests
for `isOAuthSecret` (OAuth value, API-key value, unset env var) and 3 tests
for `applyAgentAuth` (OAuth path, API-key path, missing required fields).
331 tests pass; type-check + lint clean.
@nickwinder nickwinder changed the title from "feat(auth): add useOAuth path so sandboxed Claude agents use a subscription token" to "feat(auth): support Claude Code subscription tokens via secret.value prefix detection" May 14, 2026
…t sk-ant-oat-

Smoke-tested against a real `claude setup-token` token and discovered the
prefix is `sk-ant-oat01-…`, not `sk-ant-oat-…`. The trailing dash in
OAUTH_TOKEN_PREFIX caused all real OAuth tokens to misclassify as API keys
and route through the TLS-injection path.

Dropping the trailing dash:
  - matches every documented variant: `sk-ant-oat01-…`, `sk-ant-oat02-…`, etc.
  - still cleanly distinguishes from API keys (`sk-ant-api…`) since `oat` ≠ `api`.
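
The misclassification is easy to reproduce with plain `startsWith` checks. The token and key bodies below are made up; only the prefixes are taken from this commit message.

```typescript
// Made-up credential bodies; only the prefixes matter here.
const realTokenShape = "sk-ant-oat01-EXAMPLE";
const apiKeyShape = "sk-ant-api03-EXAMPLE";

const prefixWithDash = "sk-ant-oat-"; // the original, buggy constant
const prefixWithoutDash = "sk-ant-oat"; // the fixed constant

const prefixChecks = {
  // "sk-ant-oat0…" differs from "sk-ant-oat-…" at the 11th character.
  tokenMatchesWithDash: realTokenShape.startsWith(prefixWithDash), // false
  tokenMatchesWithoutDash: realTokenShape.startsWith(prefixWithoutDash), // true
  apiKeyMatchesWithoutDash: apiKeyShape.startsWith(prefixWithoutDash), // false
};
```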

Test fixtures and docs updated to the real `sk-ant-oat01-` form. Verified
end-to-end with the smoke script: isOAuthSecret returns true, applyAgentAuth
populates env.CLAUDE_CODE_OAUTH_TOKEN as plain env var, no TLS secrets added.

331 tests pass; type-check + lint clean.
The original PR routed OAuth through plain env-var injection on the
assumption that Claude Code's CLAUDE_CODE_OAUTH_TOKEN reader was
incompatible with microsandbox's wire-time placeholder substitution.
Smoke-testing the placeholder path against a real Claude Code session
proved that wrong: Claude tolerates a `$MSB_CLAUDE_CODE_OAUTH_TOKEN`
placeholder as the env var value, constructs `Authorization: Bearer
$MSB_…` as the outbound header, and microsandbox substitutes the
placeholder for the real OAuth token at TLS interception time —
Anthropic returned 200 on `/api/eval/sdk-…` and the eval completed
end-to-end.

Collapse the two-mode dispatch in `applyAgentAuth` into one TLS-substituted
path. The resolved value's prefix only picks the env var name that carries
the placeholder:
- `sk-ant-oat…` → `CLAUDE_CODE_OAUTH_TOKEN`
- anything else → `secret.envVar` (= `ANTHROPIC_API_KEY` for claude, etc.)

Benefits:
- OAuth recovers the same "cleartext never enters the VM" security property
  API keys already had — the real subscription token only ever touches the
  outbound TLS layer to api.anthropic.com.
- One code path, fewer test modes. `isOAuthSecret` was only used to pick
  the env var name (now inlined as a local conditional) and to choose the
  allowlist hostname in `buildJudgeAllowlist`. Both auth paths now
  derive that hostname from `secret.baseUrl` (validation already fills it
  from the adapter default for known agents), so the OAuth branch in
  `buildJudgeAllowlist` is gone too.
- Less surface area in the public module API (`isOAuthSecret` removed
  from exports).

Tests collapse from a four-test isOAuthSecret + applyAgentAuth suite to
three unified `applyAgentAuth` cases (OAuth value → CLAUDE_CODE_OAUTH_TOKEN
slot; API-key value → adapter slot; precondition error). 328 unit tests
pass; type-check + lint clean.

End-to-end verified against TC-001 with a real CLAUDE_CODE_OAUTH_TOKEN:
exit 0, 27s, real solution produced; egress log shows
`Authorization: Bearer $MSB_CLAUDE_CODE_OAUTH_TOKEN` (pre-substitution),
`/api/claude_code/settings` returns 404 (auth accepted; would be 401 if
the placeholder leaked to the wire).
The schema reference previously read like the `sk-ant-oat…` /
`sk-ant-api…` prefix dispatch was a generic feature across all
adapters. In reality it's a claude-only fork — codex, gemini, and
custom agents only have the API-key path today. Reframe the
SandboxAgentConfig description so the default behavior leads (TLS
substitution into the adapter-default env var) and the OAuth slot is
clearly tagged as the claude-specific opt-in.
…flection in tests

Code review on the PR surfaced five cleanups, all in the diff and none
changing runtime behaviour:

- `applyAgentAuth` accepted a local `AgentAuthAdapter` interface that
  duplicated three fields of the exported `AgentAdapter`. Switched to
  `Pick<AgentAdapter, 'baseUrlEnvVar' | 'additionalAllowHosts'>` so the
  shape stays tied to the real adapter type and unused fields
  (`defaultBaseUrl`) drop out.
- Added a paired `OAUTH_TOKEN_ENV_VAR = 'CLAUDE_CODE_OAUTH_TOKEN'`
  constant next to `OAUTH_TOKEN_PREFIX` so the two Anthropic-specific
  strings live together; replaced the inline literal.
- The `applyAgentAuth` tests previously inspected the opaque
  `SecretEntry` shape through a three-field reflection helper — fragile
  if the microsandbox SDK ever renames an internal field. Tests now
  assert on `Secret.env`'s call args (already mocked in this file), so
  we verify the wire contract rather than the library's internal
  representation.
- Dropped dead `isOAuthSecret: vi.fn()` mock entries in
  `execute.test.ts` and `judge.test.ts` (the function was unexported
  in an earlier refactor commit; the mocks were leftovers).
- Trimmed a verbose comment in `buildJudgeAllowlist` that restated what
  the surrounding `if` already expressed.

328 tests pass; type-check + lint clean. No runtime behaviour change,
so no re-smoke needed.
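
The `Pick<…>` change in the first cleanup can be illustrated in isolation. The field types on `AgentAdapter` below are assumptions (the real interface likely has more fields and different optionality), and `describeAuthAdapter` is a hypothetical consumer added only to show assignability.

```typescript
// Assumed shape; only the three field names come from this commit message.
interface AgentAdapter {
  baseUrlEnvVar: string;
  additionalAllowHosts: string[];
  defaultBaseUrl: string;
}

// Narrowed view tied to the real adapter type instead of a duplicated interface.
type AuthAdapterView = Pick<AgentAdapter, "baseUrlEnvVar" | "additionalAllowHosts">;

function describeAuthAdapter(adapter: AuthAdapterView): string {
  return `${adapter.baseUrlEnvVar} + ${adapter.additionalAllowHosts.length} extra host(s)`;
}

// A full adapter is assignable to the narrowed view, so callers need no changes.
const claudeLikeAdapter: AgentAdapter = {
  baseUrlEnvVar: "ANTHROPIC_BASE_URL",
  additionalAllowHosts: [],
  defaultBaseUrl: "https://api.anthropic.com",
};
const adapterSummary = describeAuthAdapter(claudeLikeAdapter);
```

If a field is later renamed on `AgentAdapter`, the `Pick` alias fails to compile at the point of use, which a hand-copied interface would not do.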
@nickwinder nickwinder requested a review from HungKNguyen May 15, 2026 00:16
@nickwinder nickwinder marked this pull request as ready for review May 15, 2026 00:16
}
const value = resolveValue(secret.value, secret.envVar);

const envVar = value.startsWith(OAUTH_TOKEN_PREFIX)
Collaborator

My wishlist here is to better compartmentalize where each part of the PR lives. The current architecture is:

  • /agents/ contains information about each agent's config, hardcoded secrets, and desired env variable
  • /core/config.ts handles parsing config.json and deciding the correct secret.envVar (based on the agent's wishes)
  • applyAgentAuth should just follow config.ts's instructions and inject the secret with its actual value

This PR is mixing these areas together, but I get why: the current flow is agent -> config -> secret, and now you need the secret to decide the config. Let me see if I can create a follow-up PR for this.

Collaborator

@HungKNguyen HungKNguyen left a comment

Can approve; the concern above can be handled in a follow-up.

@nickwinder nickwinder merged commit b3f7adc into main May 15, 2026
1 check passed

2 participants