feat(sandbox): supervisor should periodically refresh provider credentials for running sandboxes

## Problem Statement

When a provider's credentials are updated (e.g., a rotated API key), running sandboxes never observe the change. The supervisor calls `GetSandboxProviderEnvironment` exactly once at startup (`crates/openshell-sandbox/src/lib.rs:272-301`) and stores the result in an immutable `Arc<SecretResolver>`. The `run_policy_poll_loop` refreshes policy and settings every 10 seconds but never re-fetches provider credentials.

This means:
- Rotating a provider's API key requires restarting every sandbox using that provider.
- Short-lived tokens (OAuth2, SSO) cannot be consumed by long-running sandboxes without external workarounds.
- The operator/gateway correctly propagates Secret changes to the gateway DB, but the last mile — gateway DB to running supervisor — is broken.

Issue #896 proposes a comprehensive credential refresh lifecycle as part of Provider Profiles (including `oauth2` and `external` strategies), and #1171 addresses attach/detach semantics for new process launches. However, neither addresses the simpler immediate gap: the supervisor's existing `SecretResolver` should be refreshable at runtime so that credential rotation works for the current provider model without waiting for the full Provider v2 redesign.

## Proposed Design

Add periodic credential refresh to the supervisor's existing polling loop. The change is minimal and incremental:

1. **Change `SecretResolver` storage from `Arc<SecretResolver>` to `Arc<RwLock<SecretResolver>>`** (or `arc_swap::ArcSwap<SecretResolver>` for lock-free reads). The proxy hot-path reads credentials on every request; a `RwLock` with rare writes is acceptable, but `ArcSwap` would be zero-cost for the read path.

2. **Add a `refresh_provider_environment` call to `run_policy_poll_loop`** (or a parallel loop). Every N seconds (e.g., 60s, configurable), call `GetSandboxProviderEnvironment`, diff the result against the current `SecretResolver`, and swap if changed.

3. **The proxy reads the current `SecretResolver` snapshot per-request.** With `ArcSwap`, this is a single atomic load — no contention with the refresh writer.

4. **No changes to child process environment variables.** Already-running processes keep their placeholder env vars (`openshell:resolve:env:*`). The placeholders don't change — only the real values in the `SecretResolver` mapping change. Since resolution happens at the proxy layer at request time, the agent seamlessly uses the new credentials on the next API call without restart.

5. **No new RPCs needed.** The existing `GetSandboxProviderEnvironment` unary RPC is sufficient. The supervisor already calls it; it just needs to call it more than once.

```
Startup (existing):
  fetch_provider_environment() → SecretResolver (immutable)

Proposed:
  fetch_provider_environment() → ArcSwap<SecretResolver>
  poll loop (every 60s):
    new_env = fetch_provider_environment()
    if new_env != current_env:
      secret_resolver.swap(SecretResolver::from_provider_env(new_env))
```

This is fully backward-compatible:
- The `static` refresh strategy from #896 describes exactly this behavior ("takes effect on the next proxied request or next supervisor provider-cache refresh") — this PR would implement that cache refresh.
- No proto changes. No gateway changes. Supervisor-only.
- Child processes are unaffected (env vars contain placeholders, not secrets).

## Alternatives Considered

- **Push-based streaming RPC (e.g., `WatchProviderEnvironment`):** More complex, requires gateway-side subscription management, and doesn't align with the existing pull-based polling model used for policy. The polling approach is simpler and consistent with existing architecture.
- **Restart sandbox on credential change:** Current workaround, but breaks long-running agent sessions and is unacceptable for production use with token rotation.
- **Wait for full Provider v2 (#896):** The roadmap is comprehensive but large. The credential refresh gap exists today with the v1 provider model and can be fixed incrementally without conflicting with v2 work.
- **External sidecar/daemon that restarts sandboxes:** Adds operational complexity and still disrupts running sessions unnecessarily.

## Agent Investigation

Codebase analysis of `crates/openshell-sandbox/`:

- `src/lib.rs:272-301`: `fetch_provider_environment` called once at startup, result stored in `HashMap<String, String>`.
- `src/lib.rs:303-304`: `SecretResolver::from_provider_env()` creates an immutable `SecretResolver`, wrapped in `Arc<SecretResolver>`.
- `src/secrets.rs:67-90`: `SecretResolver` is `#[derive(Clone)]` with a plain `HashMap<String, String>` field. No `RwLock`, no mutation methods.
- `src/lib.rs:2148-2307`: `run_policy_poll_loop` polls `GetSandboxConfig` for policy/settings only — never credentials.
- `proto/openshell.proto:84`: Comment explicitly says "called by sandbox supervisor at startup" — this is a documentation issue, not a protocol limitation.
- The proxy reads `SecretResolver` on every HTTP request via `resolve()` method — this is the natural injection point for refreshed credentials.
- `ArcSwap` is not currently in `Cargo.toml` but is a single dependency add. Alternatively, `tokio::sync::watch` could broadcast new resolvers to proxy tasks.

Related: #896 (Provider v2 roadmap, credential refresh lifecycle), #1171 (attach/detach for running sandboxes)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(sandbox): supervisor should periodically refresh provider credentials for running sandboxes #1264

Problem Statement

Proposed Design

Alternatives Considered

Agent Investigation

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

feat(sandbox): supervisor should periodically refresh provider credentials for running sandboxes #1264

Description

Problem Statement

Proposed Design

Alternatives Considered

Agent Investigation

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions