fix(oidc-provider): surface federation failures as actionable errors, not opaque 500s #66

Merged: alukach merged 3 commits into main from fix/clearer-backend-auth-errors on Jun 4, 2026

alukach commented Jun 4, 2026

Problem

Every backend-federation failure collapsed into ProxyError::Internal, so the proxy returned a generic 500 InternalError / "Internal server error" regardless of cause, with the real reason discarded — both from the response and from logs. We hit this repeatedly while debugging the federated-test smoke test: a key/issuer mismatch (AWS STS InvalidIdentityToken) was indistinguishable from an actual proxy crash.

Root cause (crates/oidc-provider/src/lib.rs):

impl From<OidcProviderError> for ProxyError {
    fn from(e: OidcProviderError) -> Self {
        ProxyError::Internal(e.to_string())   // every variant -> 500, detail dropped
    }
}

Change

Log the full (possibly ARN-bearing) provider detail at the conversion site, the last place it is still available, since it must not reach the caller. Map each federation failure to a status that reflects whose problem it is:

| Cause | Before | After |
| --- | --- | --- |
| STS rejection (InvalidIdentityToken, …) | 500 InternalError | 502 BackendAuthenticationFailed |
| STS AccessDenied (trust policy / perms) | 500 InternalError | 403 AccessDenied |
| Bad OIDC_PROVIDER_KEY / signing failure | 500 InternalError | 502 (cause logged) |
| Broker unreachable / unparseable reply | 500 InternalError | 503 ServiceUnavailable |
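A minimal sketch of what the per-cause mapping could look like, not the merged code. The `OidcProviderError` variant names, the `status()` helper, the `AccessDenied` and `ServiceUnavailable` variants, and the `"SigningFailure"` code are all assumptions made for illustration; only `ProxyError::Internal`, `ProxyError::BackendAuthError`, and the status numbers come from this PR. `eprintln!` stands in for `tracing::error!` to keep the sketch dependency-free.

```rust
/// Hypothetical provider-side error; the variant names are assumptions.
#[derive(Debug)]
pub enum OidcProviderError {
    /// STS rejected the exchange; `code` is e.g. "InvalidIdentityToken".
    StsRejected { code: String, message: String },
    /// The signing key failed to parse, or signing itself failed.
    Signing(String),
    /// The broker was unreachable or its reply could not be parsed.
    Broker(String),
}

#[derive(Debug, PartialEq)]
pub enum ProxyError {
    /// Carries the provider error code only, never the raw message.
    BackendAuthError { code: String },
    AccessDenied,
    ServiceUnavailable,
    Internal(String),
}

impl ProxyError {
    /// HTTP status for each variant (hypothetical helper).
    pub fn status(&self) -> u16 {
        match self {
            ProxyError::BackendAuthError { .. } => 502,
            ProxyError::AccessDenied => 403,
            ProxyError::ServiceUnavailable => 503,
            ProxyError::Internal(_) => 500,
        }
    }
}

impl From<OidcProviderError> for ProxyError {
    fn from(e: OidcProviderError) -> Self {
        // Log the full detail here: after this point only the code survives.
        eprintln!("backend federation failed: {e:?}");
        match e {
            OidcProviderError::StsRejected { code, .. } if code == "AccessDenied" => {
                ProxyError::AccessDenied
            }
            OidcProviderError::StsRejected { code, .. } => ProxyError::BackendAuthError { code },
            // "SigningFailure" is a placeholder code invented for this sketch.
            OidcProviderError::Signing(_) => ProxyError::BackendAuthError {
                code: "SigningFailure".to_string(),
            },
            OidcProviderError::Broker(_) => ProxyError::ServiceUnavailable,
        }
    }
}
```

Matching on the variant rather than calling `e.to_string()` is what keeps the cause distinguishable: the caller sees a status and a code, while the message stays in the log.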

Adds ProxyError::BackendAuthError, which carries the provider error code only (e.g. InvalidIdentityToken) — safe to surface; the raw message is logged via tracing::error!, never returned. With the worker's console subscriber, wrangler tail now shows the actual STS code + message instead of nothing.
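The code-only design can be sketched as follows. This is an illustration under assumptions: the `backend_auth` constructor and the exact `safe_message()` signature and wording are hypothetical, and `eprintln!` again stands in for `tracing::error!`.

```rust
#[derive(Debug)]
pub enum ProxyError {
    /// Holds the provider error code only (e.g. "InvalidIdentityToken").
    BackendAuthError { code: String },
    Internal(String),
}

impl ProxyError {
    /// Hypothetical constructor: logs the raw provider message, keeps the code.
    pub fn backend_auth(code: &str, raw_message: &str) -> Self {
        // The raw message may contain ARNs, so it goes to the log only.
        eprintln!("backend auth failed: code={code} message={raw_message}");
        ProxyError::BackendAuthError { code: code.to_owned() }
    }

    /// Text that is safe to return to the caller (wording is an assumption).
    pub fn safe_message(&self) -> String {
        match self {
            ProxyError::BackendAuthError { code } => {
                format!("Backend authentication failed ({code})")
            }
            // Internal detail is never echoed back.
            ProxyError::Internal(_) => "Internal server error".to_owned(),
        }
    }
}
```

Because the variant has no field for the raw message, a later change to `safe_message()` cannot leak it by accident.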

Tests

Unit tests for every mapping (crates/oidc-provider/src/lib.rs, crates/core/src/error.rs): each asserts the expected status, that it is not an opaque 500, and that the raw provider message does not leak into safe_message().
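The three properties each test is described as asserting could be factored into a helper along these lines. `check_mapping` and its parameters are hypothetical names for illustration, not code from this PR.

```rust
/// Checks the three properties described above for one mapping:
/// the expected status, not an opaque 500, and no raw-message leak.
pub fn check_mapping(
    status: u16,
    expected_status: u16,
    safe_message: &str,
    raw_detail: &str,
) -> Result<(), String> {
    if status != expected_status {
        return Err(format!("expected status {expected_status}, got {status}"));
    }
    if status == 500 {
        return Err("mapping collapsed to an opaque 500".to_string());
    }
    // An empty raw detail cannot leak, and `contains("")` is always true.
    if !raw_detail.is_empty() && safe_message.contains(raw_detail) {
        return Err("raw provider message leaked into the safe message".to_string());
    }
    Ok(())
}
```

A test would call it once per row of the mapping and `unwrap()` the result, so a failure names the violated property.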

Verified locally with the CI command set: cargo check, cargo clippy -- -D warnings, cargo test, and cargo check -p multistore-cf-workers --target wasm32-unknown-unknown — all green.

🤖 Generated with Claude Code

alukach and others added 2 commits June 3, 2026 21:51
Backend federation failures all collapsed into ProxyError::Internal, so a
rejected STS exchange, a malformed signing key, or an unreachable broker
each returned an opaque 500 InternalError / "Internal server error" with
the real cause discarded — undiagnosable from the response and unlogged.

Log the full (possibly ARN-bearing) provider detail at the conversion site,
the last place it is still available, since it must not reach the caller.
Map each cause to a status reflecting whose problem it is:

- STS rejection (InvalidIdentityToken, …) -> 502 BackendAuthenticationFailed
- STS AccessDenied (trust policy/perms)    -> 403 AccessDenied
- bad OIDC_PROVIDER_KEY / signing failure   -> 502 (logged)
- broker unreachable / unparseable reply    -> 503 ServiceUnavailable

Adds ProxyError::BackendAuthError, which carries the provider error *code*
only (safe to surface); the raw message is logged, never returned. Unit
tests cover each mapping and assert no opaque 500 / no message leak.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add the openssl commands for SESSION_TOKEN_KEY and OIDC_PROVIDER_KEY, and
call out that OIDC_PROVIDER_KEY must be a PKCS#8 PEM RSA key (genpkey, not
genrsa) — a random/symmetric value fails to parse and the worker then 500s
on every request. Note preview and staging must share the same key.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
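
The PKCS#8 requirement can be sanity-checked from the PEM label alone. An unencrypted PKCS#8 key is labeled `PRIVATE KEY`, while a traditional PKCS#1 RSA key is labeled `RSA PRIVATE KEY`. The sketch below inspects only that first line and does not parse or validate the key; `PemKind` and `classify_pem` are hypothetical names, not part of this repository.

```rust
#[derive(Debug, PartialEq)]
pub enum PemKind {
    /// "-----BEGIN PRIVATE KEY-----": unencrypted PKCS#8.
    Pkcs8,
    /// "-----BEGIN RSA PRIVATE KEY-----": traditional PKCS#1 RSA.
    Pkcs1Rsa,
    /// Anything else, including random or symmetric values.
    Other,
}

/// Classifies a key by the first non-empty line of its PEM text.
pub fn classify_pem(pem: &str) -> PemKind {
    let first_line = pem
        .lines()
        .map(str::trim)
        .find(|line| !line.is_empty())
        .unwrap_or("");
    match first_line {
        "-----BEGIN PRIVATE KEY-----" => PemKind::Pkcs8,
        "-----BEGIN RSA PRIVATE KEY-----" => PemKind::Pkcs1Rsa,
        _ => PemKind::Other,
    }
}
```

A check like this at startup would let a misconfigured `OIDC_PROVIDER_KEY` be reported once with a clear cause instead of failing on every request.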

github-actions Bot commented Jun 4, 2026

📖 Docs preview deployed to https://multistore-docs-pr-66.development-seed.workers.dev

  • Date: 2026-06-04T04:54:16Z
  • Commit: 3fb1b5a

github-actions Bot commented Jun 4, 2026

🚀 Latest commit deployed to https://multistore-proxy-pr-66.development-seed.workers.dev

  • Date: 2026-06-04T04:54:16Z
  • Commit: 3fb1b5a

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@alukach alukach merged commit 0422288 into main Jun 4, 2026
15 checks passed
@alukach alukach deleted the fix/clearer-backend-auth-errors branch June 4, 2026 05:06