Provider-managed OAuth credential refresh #1066

@zredlined

Problem Statement

OpenShell providers protect credentials at the network boundary today: a sandbox sees only a placeholder (e.g. openshell:resolve:env:MS_GRAPH_ACCESS_TOKEN), and the proxy substitutes the real value when the request leaves the sandbox. This works well for static credentials with long lifetimes, including personal access tokens (PATs) for services like GitHub or GitLab — those flow through the existing provider model unchanged, with no refresh needed.
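To make the boundary substitution concrete, here is a minimal sketch of what the proxy does to an outbound request. The placeholder pattern is inferred from the single example above; `substitute_placeholders` and the `secrets` mapping are hypothetical names, not OpenShell's actual implementation.

```python
from __future__ import annotations
import re

# Assumed placeholder syntax, inferred from the example
# "openshell:resolve:env:MS_GRAPH_ACCESS_TOKEN" in the text above.
PLACEHOLDER = re.compile(r"openshell:resolve:env:([A-Za-z0-9_]+)")


def substitute_placeholders(headers: dict, secrets: dict) -> dict:
    """Return a copy of `headers` with every placeholder replaced by its real value.

    `secrets` is a hypothetical proxy-side mapping of credential name to value;
    it lives outside the sandbox.
    """
    def resolve(match):
        name = match.group(1)
        if name not in secrets:
            raise KeyError(f"no credential registered for {name}")
        return secrets[name]

    return {key: PLACEHOLDER.sub(resolve, value) for key, value in headers.items()}


# What the sandbox sends versus what leaves the proxy.
sandbox_headers = {"Authorization": "Bearer openshell:resolve:env:MS_GRAPH_ACCESS_TOKEN"}
outbound = substitute_placeholders(sandbox_headers, {"MS_GRAPH_ACCESS_TOKEN": "real-token-value"})
```

Because the sandbox only ever holds the placeholder string, swapping the value behind it (as a refresh would) requires no change inside the sandbox.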

This issue is specifically about OAuth flows, which behave differently. Many enterprise environments do not permit long-lived PATs for SaaS APIs and require OAuth (often device-code or authorization-code flows) backed by short-lived access tokens and refresh tokens. Microsoft Graph (Outlook, Teams, SharePoint, OneDrive), ChatGPT/Codex, Google Workspace, Slack, and an increasing number of enterprise SSO-bound APIs fall into this category. Refresh is mandatory there: access tokens commonly expire in 15-60 minutes, so any agent that runs longer than the access-token TTL will start failing API calls mid-run.

There is no way today for a sandboxed agent to extend its OAuth session without either:

  • exposing the refresh token inside the sandbox (defeats the credential-protection property providers exist to enforce), or
  • restarting the sandbox with a host-refreshed token (loses session state, doesn't work for autonomous long-running agents).

The public demo at https://github.com/zredlined/openshell-oauth-providers-demo illustrates the current limit: host-auth.mjs performs host-only device-code login and refresh, sandbox creation passes only the current access token through a provider, and the sandbox sees a placeholder. It works for short interactive sessions; it does not handle in-session expiry for longer runs.

Proposed Design

Extend the upcoming Provider Type Profiles (#865) with refresh metadata. The profile already describes credential injection (header / bearer / query / URL path) — extend it to describe credential renewal:

  • refresh.endpoint — token endpoint URL (e.g. https://login.microsoftonline.com/<tenant>/oauth2/v2.0/token)
  • refresh.request — shape of the refresh request (form/json body, parameter names: grant_type=refresh_token, refresh_token, client_id, scope)
  • refresh.response — JSONPath to the new access token, expiry, and (if rotated) refresh token
  • refresh.expiry_hint — how to determine when refresh is needed (response field, JWT exp claim, or fixed TTL)
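As an illustration of how the four fields above might fit together, the following sketch models a profile fragment as a Python dict and derives the refresh request body and the expiry time from it. Everything here is an assumption for discussion: the key names under each field, the `encoding` and `fixed_ttl_seconds` keys, the hint values, and the helper functions are hypothetical, and `lookup` handles only a dotted subset of JSONPath.

```python
from __future__ import annotations
import base64
import json
from urllib.parse import urlencode

# Hypothetical profile fragment; not the schema from #865.
PROFILE = {
    "refresh": {
        "endpoint": "https://login.microsoftonline.com/<tenant>/oauth2/v2.0/token",
        "request": {
            "encoding": "form",  # or "json"
            "params": {
                "grant_type": "refresh_token",
                "refresh_token": "{refresh_token}",
                "client_id": "{client_id}",
                "scope": "{scope}",
            },
        },
        "response": {
            "access_token": "$.access_token",
            "expires_in": "$.expires_in",
            "refresh_token": "$.refresh_token",
        },
        "expiry_hint": "response_field",  # or "jwt_exp" or "fixed_ttl"
        "fixed_ttl_seconds": 3600,
    }
}


def lookup(doc: dict, path: str):
    """Resolve a dotted JSONPath subset such as '$.a.b' (no filters, no arrays)."""
    if path.startswith("$."):
        path = path[2:]
    node = doc
    for key in path.split("."):
        node = node[key]
    return node


def build_refresh_body(refresh: dict, values: dict) -> str:
    """Fill the request template with gateway-held values and encode it."""
    params = {k: v.format(**values) for k, v in refresh["request"]["params"].items()}
    if refresh["request"]["encoding"] == "json":
        return json.dumps(params)
    return urlencode(params)


def jwt_exp(token: str) -> float:
    """Read the `exp` claim from a JWT payload without verifying the signature.

    Used only as a scheduling hint; opaque (non-JWT) tokens need another hint.
    """
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return float(json.loads(base64.urlsafe_b64decode(payload))["exp"])


def expires_at(refresh: dict, response: dict, now: float) -> float:
    """Apply the configured expiry hint to a token-endpoint response."""
    hint = refresh["expiry_hint"]
    if hint == "jwt_exp":
        return jwt_exp(lookup(response, refresh["response"]["access_token"]))
    if hint == "fixed_ttl":
        return now + refresh["fixed_ttl_seconds"]
    return now + float(lookup(response, refresh["response"]["expires_in"]))
```

Keeping the request and response shapes as data means a new provider would need only a new profile, not new refresh code.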

Refresh execution lives outside the sandbox, in the gateway/supervisor process that already holds the real credentials:

  1. The gateway tracks token expiry per provider attachment.
  2. When the access token behind a sandboxed request is near expiry (or the upstream returns 401 with a known signal), the gateway transparently refreshes using the stored refresh token, updates the placeholder mapping, and forwards or retries the request.
  3. The sandbox keeps using its existing placeholder, which now resolves to the new access token. The refresh token never leaves the gateway.
  4. Each refresh emits an OCSF ConfigStateChange event scoped to the sandbox + provider.

This preserves the security property the providers system was built for — the agent execution boundary never sees raw refresh material — and keeps configuration declarative: users describe what the provider is, not how to refresh it.
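The four gateway steps above can be sketched roughly as follows. This is a simplified model under stated assumptions, not the proposed implementation: `Attachment`, `Gateway`, the `skew` margin, and the injected `refresh_fn` (a stand-in for the HTTP call to the token endpoint) are hypothetical, and the recorded event is a plain dict rather than a real OCSF payload.

```python
from __future__ import annotations
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class Attachment:
    """Gateway-side state for one provider attached to one sandbox (hypothetical)."""
    access_token: str
    refresh_token: str
    expires_at: float  # absolute time in seconds


class Gateway:
    def __init__(self, attachment: Attachment, refresh_fn: Callable[[str], dict],
                 skew: float = 60.0, clock: Callable[[], float] = time.time):
        self.att = attachment
        self.refresh_fn = refresh_fn  # takes the refresh token, returns the token response
        self.skew = skew              # refresh this many seconds before expiry
        self.clock = clock
        self.events: list = []        # stand-in for OCSF event emission

    def _refresh(self) -> None:
        result = self.refresh_fn(self.att.refresh_token)
        self.att.access_token = result["access_token"]
        self.att.expires_at = self.clock() + result["expires_in"]
        # Keep the old refresh token unless the server rotated it.
        self.att.refresh_token = result.get("refresh_token", self.att.refresh_token)
        self.events.append({"event": "ConfigStateChange", "reason": "oauth_refresh"})

    def forward(self, send: Callable[[str], int]) -> int:
        """Send a request with the current token and return the HTTP status.

        Refreshes proactively when near expiry, and retries once on a 401.
        """
        if self.clock() >= self.att.expires_at - self.skew:
            self._refresh()
        status = send(self.att.access_token)
        if status == 401:
            self._refresh()
            status = send(self.att.access_token)
        return status
```

The retry is deliberately limited to one attempt so that a revoked refresh token surfaces as a 401 to the agent instead of looping. A real gateway would also need to serialize concurrent refreshes per attachment, which this sketch omits.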

Alternatives Considered

  1. Host-side refresh before each sandbox launch (status quo). This is what the OSS demo does. It works for short or stateless runs but breaks for long-running autonomous agents whose access tokens expire mid-session.
  2. Per-provider helper plugins (one script per OAuth flow). Doesn't scale across many SaaS APIs; duplicates well-known refresh shapes; fragments where OAuth logic lives.
  3. Pass the refresh token into the sandbox. Defeats the security property providers exist to enforce — refresh material should never cross the agent execution boundary.
  4. Rely on upstream identity providers issuing longer-lived tokens. Out of OpenShell's control; runs counter to the industry trend toward shorter-lived tokens.

The proposed design is preferred because it (a) keeps credential protection intact, (b) reuses the declarative profile schema introduced in #865 instead of adding a parallel mechanism, and (c) absorbs the refresh logic into the supervisor where it can be uniformly observable (OCSF), rate-limited, and audited.

Agent Investigation

  • RFC #865 (Provider Enhancements -- Declarative Profiles, Auto-Injected Policy, Multi-Provider Inference) covers credential injection metadata (header / bearer / query / URL path, declarative profiles) but does not cover token lifecycle or refresh.
  • Architecture docs only mention "refresh" in the inference-bundle path (externally rotated keys picked up on next bundle fetch) — not protocol-level OAuth refresh.
  • No existing tracking issue found for credential refresh / OAuth refresh / token lifecycle.
  • The public OSS demo shows the gap concretely: host-side host-auth.mjs is what stands between expired access tokens and a working sandboxed agent today.

Checklist

  • I've reviewed existing issues and the architecture docs
  • This is a design proposal, not a "please build this" request
