Skip to content

feat(services): generic OAuth 2.0 + PKCE service#277

Merged
emal-avala merged 2 commits into
mainfrom
feat/oauth-service
May 5, 2026
Merged

feat(services): generic OAuth 2.0 + PKCE service#277
emal-avala merged 2 commits into
mainfrom
feat/oauth-service

Conversation

@emal-avala
Copy link
Copy Markdown
Member

Summary

  • Adds crates/lib/src/services/oauth.rs — a provider-agnostic OAuth 2.0 Authorization-Code-with-PKCE flow that powers Phase 8.12 of the service-layer fill-in. OAuthProviderConfig carries per-provider URLs / client id / scopes / redirect uri; nothing is hardcoded.
  • Public surface: OAuthService::new, login, current_token, logout, plus a pluggable CredentialStore trait with a default 0o600 JSON-file backend under the agent config directory. The trait shape lets a follow-up plug an OS keychain without touching the flow.
  • Security posture: token strings never appear in any OAuthError variant or in TokenSet's Debug output (custom impl prints ***); loopback server hard-times-out at 5 min; refresh that returns 4xx clears the cached creds and surfaces ReauthRequired; no panics on a missing config dir.
  • No new workspace dependencies.

Test Plan

  • cargo check --all-targets
  • cargo test --all-targets (25 new OAuth tests pass; pre-existing bwrap_* sandbox failures on this machine are environmental and reproduce on main)
  • cargo clippy --all-targets -- -D warnings
  • cargo fmt --all -- --check
  • New tests cover:
    • PKCE verifier alphabet + length (RFC 7636 §4.1) and challenge = base64url(SHA256(verifier)) (RFC 7636 §4.2 + Appendix B test vector)
    • base64url no-padding, url-safe alphabet
    • Loopback server: captures code, rejects state mismatch, accepts correct state, returns the close-this-tab HTML
    • Authorization URL builder: PKCE params, scope encoding, query merge into a URL that already has ?
    • Token refresh against a hand-rolled tokio::net::TcpListener fake (200 happy path + 4xx terminal path)
    • current_token: refresh-and-persist when stale, 4xx clears creds and demands reauth, no-creds returns ReauthRequired
    • logout idempotency
    • Full login flow with stub authz callback + stub token server, asserting credential write and request body shape
    • File credential store round-trip, 0o600 perms on unix, key sanitization rejecting ..//
    • Token-leak regression: every OAuthError variant's Display+Debug, and TokenSet's Debug, never print injected token strings

Phase 8.12 service-layer fill-in. Adds a provider-agnostic
Authorization-Code-with-PKCE flow under crates/lib/src/services/oauth.rs.

- OAuthProviderConfig holds per-provider URLs, client id, scopes,
  redirect uri; nothing is hardcoded.
- OAuthService::login spawns a loopback HTTP server on 127.0.0.1:0,
  opens the auth URL via the platform default opener, captures the
  redirect, exchanges the code (PKCE S256) at the token endpoint,
  and persists the token set.
- current_token transparently refreshes when within 60s of expiry;
  4xx refresh failures clear the cached creds and surface
  ReauthRequired so callers can drive a fresh browser flow.
- logout is idempotent.
- Loopback hard-times-out at 5 min so an abandoned tab cannot hang.
- TokenSet has a custom Debug that prints *** for token fields;
  no OAuthError variant carries a token. Regression test asserts
  Display+Debug do not leak.
- Pluggable CredentialStore trait. Default backend is a 0o600 JSON
  file under the agent config directory; misses (no HOME, locked
  store) return KeychainError instead of panicking. Trait lets a
  follow-up plug an OS keychain backend without touching the flow.

Tests (25 new, all passing):
- PKCE verifier alphabet + length conformance (RFC 7636 §4.1)
- PKCE challenge = base64url(SHA256(verifier)) (RFC 7636 §4.2 +
  Appendix B test vector)
- base64url no-padding url-safe alphabet
- Loopback captures code, rejects state mismatch, accepts correct state
- Authorization URL building (PKCE params, scope encoding, query merge)
- Refresh against a hand-rolled fake token server (200 + 4xx paths)
- current_token refresh-and-persist, 4xx-clears-creds-and-reauth,
  no-creds-stored, logout idempotency
- Full login flow with stub authz callback + stub token server
- File credential store round-trip, 0o600 perms (unix), key sanitization
- Token-leak regression: errors and TokenSet Debug never print tokens

No new workspace dependencies.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b7d06d9ce7

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread crates/lib/src/services/oauth.rs Outdated
Comment on lines +571 to +573
if let Some(state) = params.get("state")
&& state != expected_state
{
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Reject callbacks that omit OAuth state

The state validation in capture_redirect only rejects mismatches when a state param is present, but it still accepts any request containing code when state is missing. That means a local process can inject a callback without the CSRF binding and have its authorization code exchanged and persisted. The callback should require state to exist and exactly match before accepting code.

Useful? React with 👍 / 👎.

Comment thread crates/lib/src/services/oauth.rs Outdated
Comment on lines +442 to +443
fn is_terminal_refresh(status: u16) -> bool {
(400..500).contains(&status)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Do not treat every refresh 4xx as terminal

is_terminal_refresh classifies all 4xx responses as terminal, and current_token deletes stored credentials on that path. This will force re-authentication and discard cached credentials for transient or policy-driven 4xx responses (for example throttling-style 429), even when retrying later would succeed. Only truly invalid/expired refresh-token conditions should trigger credential deletion.

Useful? React with 👍 / 👎.

Comment thread crates/lib/src/services/oauth.rs Outdated
Comment on lines +275 to +282
let res = std::process::Command::new(bin)
.args(args_for(bin))
.stdout(std::process::Stdio::null())
.stderr(std::process::Stdio::null())
.spawn();
match res {
Ok(_) => return Ok(()),
Err(e) => last_err = e.to_string(),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Surface browser launch failure instead of timing out

The browser launcher reports success as soon as spawn() succeeds, without checking the child exit status. If the opener binary exists but immediately fails (e.g., no handler/display), login still waits on loopback until timeout and returns a misleading loopback error. Waiting for process completion and treating non-zero exit as BrowserLaunchFailed avoids this false-success path.

Useful? React with 👍 / 👎.

Comment thread crates/lib/src/services/oauth.rs Fixed
…nt/terminal, await opener exit, validate https endpoints

Addresses four PR review findings on the generic OAuth service.

1. P1 CSRF: capture_redirect now requires the `state` query
   param to be both present and exactly equal to the per-login
   expected value. A callback that delivers a `code` without
   `state` is rejected with HTTP 400 and never reaches the
   token endpoint — closes a path where a local process could
   smuggle a stolen authorization code into the loopback.
   Mismatched state is also a hard error rather than a silent
   "keep waiting".

2. P2 refresh classification: 4xx is no longer a blanket
   "delete the cached creds" signal. RefreshFailed gains a
   `transient: bool` field. The new is_terminal_refresh parses
   the RFC 6749 §5.2 error code from the body and returns
   terminal only for `invalid_grant` / `invalid_client`, plus
   HTTP 401 with no parsable body. 429 / 503 / 408 / network
   blips / unparseable non-401 bodies are transient — creds
   are preserved and the caller can retry. current_token only
   deletes credentials when transient is false.

3. P2 browser launcher: spawn() no longer counts as success.
   The launcher now waits for the opener and checks its exit
   status. Non-zero (or signal) returns BrowserLaunchFailed
   with the actual exit code in the message instead of letting
   the user sit through a 5-minute LoopbackTimeout. macOS,
   Linux, and Windows openers all exit promptly after handing
   off, so the wait is short.

4. CodeQL #59 (HTTPS): OAuthProviderConfig::validate rejects
   any authorization_url / token_url that is not https://.
   `redirect_uri` is allowed http://127.0.0.1 / localhost
   (RFC 8252 §7.3 loopback). A new `allow_insecure_local`
   flag opens a narrow door for loopback-only test fixtures;
   off by default. OAuthService::new and with_store now return
   Result and propagate InvalidConfig.

Tests cover every documented case: no-state and mismatched-
state callbacks, 429/503 transient + invalid_grant terminal
refresh, opener-exits-non-zero surfacing as BrowserLaunchFailed
with the exit code, http authorization_url rejected, https +
loopback redirect accepted.
@emal-avala
Copy link
Copy Markdown
Member Author

Addressed review findings: required state on every loopback callback (P1 CSRF), classified refresh failures as transient vs terminal so 429/503/network blips no longer drop creds (P2), made the browser launcher wait on the opener and surface non-zero exits as BrowserLaunchFailed (P2), and added HTTPS validation on OAuthProviderConfig with a narrow allow_insecure_local door for loopback test fixtures (CodeQL #59).

@emal-avala emal-avala merged commit ed47b01 into main May 5, 2026
14 checks passed
@emal-avala emal-avala deleted the feat/oauth-service branch May 5, 2026 03:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants