Skip to content

feat: OIDC authentication (Bearer JWT + Basic→JWT exchange)#45

Open
knutties wants to merge 37 commits into
mainfrom
oidc-auth
Open

feat: OIDC authentication (Bearer JWT + Basic→JWT exchange)#45
knutties wants to merge 37 commits into
mainfrom
oidc-auth

Conversation

@knutties

@knutties knutties commented Jun 17, 2026

Copy link
Copy Markdown
Collaborator

Summary

Replaces Kronos's single shared TE_API_KEY bearer token with a real OIDC auth model.

Two new framework-agnostic workspace crates (extractable, reusable in other Rust services):

  • oidc-rs — JWT validator (JWKS-backed), Basic→JWT exchanger (cached, single-flighted, negative-cached), AuthConfig builder, Identity/Claims types.
  • oidc-rs-actix — Actix-Web adapter: AuthMiddleware Transform, Authenticated extractor, AuthError → HttpResponse mapping.

Kronos integration:

  • kronos-common: thin env shim translating TE_* envs into oidc_rs::AuthConfig.
  • kronos-api: AuthMiddleware wraps /v1/*; new endpoints GET /v1/auth/whoami and POST /v1/auth/cache/flush.
  • kronos-dashboard (Leptos/WASM): public OIDC client running PKCE in WASM; tokens in memory only, no refresh tokens, redirect-on-expiry.
  • kronos-worker: small fix so its synthetic AppConfig literal still builds.

Auth modes:

  • TE_AUTH_MODE=disabled (default for just dev) → bypass.
  • TE_AUTH_MODE=enabled (prod) → require OIDC envs (fail-fast at startup).
  • Authorization: Bearer <jwt> → validate against issuer JWKS (60 s clock-skew tolerance, background refresh).
  • Authorization: Basic <client_id:client_secret> → exchange via IdP client_credentials grant; cache JWT until min(expires_in, hard_ttl) − 60s, negative-cache rejections for 30 s and transient failures for 5 s, single-flight per client_id with race-free inflight cleanup.

SDK / CLI updates:

  • Rust client (crates/client): Config::with_basic / with_bearer / deprecated with_token via a hand-authored auth_ext.rs side file preserved across Smithy regens. Wire-level smoke tests confirm the actual Authorization header on the wire.
  • Smithy: @httpBasicAuth added to the service model.
  • TypeScript CLI: env-driven credential selection (KRONOS_CLIENT_ID/SECRET or KRONOS_BEARER_TOKEN); no-op Bearer scheme when neither is set so the CLI works against TE_AUTH_MODE=disabled.

Operator-facing:

  • docker-compose.prod.yml: required OIDC envs use ${VAR:?required} for fail-loud behaviour.
  • justfile: defaults TE_AUTH_MODE=disabled for just dev / just api.
  • README: new "Setting up authentication" section with Keycloak/Auth0/Okta walkthroughs, cache-flush snippet, CLI env-var docs.

Design + plan documents (committed for review):

  • docs/superpowers/specs/2026-06-14-oidc-auth-design.md
  • docs/superpowers/plans/2026-06-14-oidc-auth.md

Test plan

  • cargo check --workspace is clean.
  • cargo test -p oidc-rs --lib -- --test-threads=1 shows 15 tests passing (Identity, AuthConfig, Validator, BasicExchanger, inflight-leak regression).
  • cargo test -p oidc-rs-actix shows the smoke test passing.
  • cargo test -p kronos-common --lib -- --test-threads=1 shows 23 tests passing.
  • cargo test -p kronos-api --test auth_integration -- --test-threads=1 shows 6 tests passing (disabled / Bearer-valid / Basic-valid / missing-header / expired-bearer / flush-noop).
  • cd crates/client && cargo test shows 10 lib + 2 wire-level smoke + 25 doc tests passing.
  • cargo check -p kronos-dashboard --features hydrate --target wasm32-unknown-unknown is clean (zero warnings).
  • docker compose -f docker-compose.prod.yml config fails without OIDC envs set; succeeds when TE_OIDC_ISSUER, TE_OIDC_AUDIENCES, TE_OIDC_DASHBOARD_CLIENT_ID, TE_OIDC_DASHBOARD_REDIRECT_URL are all set.
  • just smithy-build round-trips: crates/client/src/auth_ext.rs, pub mod auth_ext; in lib.rs, and the base64 + tokio deps survive a regen.
  • Manual: dashboard against a Keycloak realm (per the README walkthrough) — unauthenticated load bounces to /authorize, callback exchanges the code, jobs page renders with the OIDC sub visible via /v1/auth/whoami.
  • Manual: curl -s :8080/v1/auth/whoami -u <client_id>:<secret> returns {"type":"basic", ...} after a Keycloak kronos-api client setup; cache-flush invalidates the next request.

Known follow-ups (deferred, not blocking this PR)

  • /metrics remains unauthenticated (pre-existing; consider a separate listener or network policy).
  • flush_cache endpoint authenticates but does not authorise — any valid identity can flush; worth a scope-claim check (e.g. kronos.admin) for multi-tenant deployments.
  • Audit logging of identity (sub, iss, variant) into the tracing/access-log line.
  • Per-IP rate limit on the Basic-exchange path (currently relies on the IdP's own rate limiting).
  • SSR window.__KRONOS_CONFIG__ script-literal injection: operator-controlled values are interpolated raw; switch to serde_json::to_string for defence-in-depth.

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

  • New Features

    • Implemented OIDC-based authentication supporting Bearer tokens and HTTP Basic credentials.
    • Added /v1/auth/whoami endpoint to retrieve current authentication identity.
    • Added /v1/auth/cache/flush endpoint to clear cached authentication credentials.
    • Dashboard now uses OIDC with authorization code flow instead of static API keys.
  • Configuration Changes

    • Replaced TE_API_KEY with TE_AUTH_MODE and OIDC-related environment variables for production deployments.
  • Documentation

    • Updated configuration guide with new authentication setup and OIDC provider instructions.

knutties and others added 30 commits June 14, 2026 04:51
Replace the single shared TE_API_KEY with an identity-only OIDC model that
accepts both `Authorization: Bearer <jwt>` (validated against the issuer's
JWKS) and `Authorization: Basic <client_id:client_secret>` (exchanged against
the IdP's token endpoint and cached). All credentials live in the IdP; no
new tables or management endpoints in Kronos. Dashboard becomes a public
OIDC client (PKCE in WASM) with in-memory tokens and no refresh tokens.
TE_AUTH_MODE=disabled bypasses auth for local dev. Tenant authorization
(X-Org-Id / X-Workspace-Id mapping) is deferred to a follow-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace Identity::Human / Identity::M2M with Identity::Bearer(Claims) /
Identity::Basic(Claims), plus a shared Claims struct. The variant now reflects
what Kronos actually verifies (the wire-level credential format), not a guess
at who the caller is — a service that does its own client_credentials grant
and presents the JWT directly arrives on the Bearer path, and we no longer
relabel it. Renames TE_OIDC_M2M_AUDIENCE/SCOPE to TE_OIDC_BASIC_AUDIENCE/SCOPE
and the matching AuthConfig fields for consistency with the new naming theme.
Section headings and the whoami JSON shape updated to match.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Step-by-step plan derived from the OIDC auth design spec. Decomposes the
work into 20 TDD-style tasks across four phases: common auth library, API
middleware/endpoints integration, dashboard PKCE flow, and migration aids
(Rust client, Smithy/CLI, docker-compose, README). Each task has explicit
file paths, copy-pasteable code, exact verification commands, and a commit
step.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ates

Restructure the spec's Code structure section and the implementation plan
around two new workspace crates instead of keeping the auth code in
kronos-common::auth. oidc-rs is the framework-agnostic core (Identity,
Claims, AuthError, AuthConfig builder, Validator, BasicExchanger);
oidc-rs-actix is the Actix-Web adapter (AuthMiddleware, Authenticated
extractor, error mapping). kronos-common retains a thin shim that reads
TE_* envs and constructs an oidc_rs::AuthConfig via the builder. The
AuthConfig API becomes env-agnostic so other Rust services (Axum, raw
Tower, etc.) can consume the same machinery later.

Plan grows from 20 to 19 tasks across five phases: oidc-rs core,
oidc-rs-actix adapter, Kronos integration, dashboard, migration aids
and docs. Phases 3-5 carry over from the prior plan with import path
adjustments; Phases 1-2 are new.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Both crates had been pinned a major version behind current (openidconnect
3 → 4.0.1; jsonwebtoken 9 → 10.4.0). Catching this at scaffold time avoids
a forced migration later. Also drops the placeholder `repository` field
from the oidc-rs manifest until a real upstream URL exists.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Address code-review minors from Task 2:
- Remove AuthError::InvalidCredentials — no code path produces it; the
  middleware mapping aliased it to IdpRejected anyway.
- Document why Identity doesn't derive Deserialize (preventing
  trust-by-deserialization of identity blobs).
- Clarify Claims::scopes empty-vec semantics.
- Add a Display smoke test covering interpolated AuthError variants.

Plan updated to keep middleware code_for() in sync.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add AuthConfig enum (Disabled / Enabled), EnabledConfig struct,
AuthConfigBuilder with required issuer/audiences validation, and
BuildError. Three new tests join the four existing identity tests
(7 total); cargo check --workspace passes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds Validator: discovers the OIDC issuer, caches the JWKS, refreshes it
on a background tokio task, and verifies inbound JWTs by signature, iss,
aud, exp, and nbf with 60s leeway.

openidconnect v4 adaptations versus the v3-shaped plan:
- Builds a redirect-disabled reqwest::Client once and passes &http to
  CoreProviderMetadata::discover_async / CoreJsonWebKeySet::fetch_async
  (the v3 async_http_client free function was removed in v4).
- v4's CoreJsonWebKey has no public n_b64/e_b64/x_b64/y_b64 accessors,
  so the matching JWK is round-tripped through serde_json into
  jsonwebtoken::jwk::Jwk and consumed via DecodingKey::from_jwk -- the
  RFC 7517 JWK wire format makes this safe.
- Imports the JsonWebKey trait to bring key_id / key_use into scope, and
  uses CoreJsonWebKeyUse (the concrete enum) instead of the JsonWebKeyUse
  trait, which is the v4 split.
- Algorithm is taken from the JWK's `alg` field via KeyAlgorithm::Display
  -> Algorithm::FromStr, falling back to RS256/ES256 per key type.

Enables jsonwebtoken's `rust_crypto` feature workspace-wide so that v10
has a crypto provider compiled in (v10 panics at first verify otherwise).
Address two production-stability items from code review of Task 4:

1. reqwest::Client lacked a request timeout. A stalled IdP could now hang
   Validator::new or a background refresh tick indefinitely; now capped
   at 10s.
2. The initial JWKS fetch silently swallowed errors via .await.ok(). When
   keys failed to load on startup, the operator saw only the downstream
   'jwks not loaded yet' error on the first /validate call. Now we log
   the underlying failure reason at WARN so the root cause is visible.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Translates `Authorization: Basic <client_id:client_secret>` into a JWT by
calling the IdP's `client_credentials` grant. Caches positive results
until the JWT's expiry (capped by `hard_ttl - 60s`), single-flights
concurrent exchanges per `client_id` via a per-`client_id` Mutex to
prevent IdP-thrash, and negative-caches IdP 4xx rejections for 30s and
transient failures (network / 5xx) for 5s.

Adds a small `build_http_client()` helper that returns a `reqwest::Client`
with a 10s timeout. Called from both `BasicExchanger::new` (discovery)
and the `ExchangerInner.http` construction, so a stalled IdP cannot hang
either path. Matches the precedent set by `validator.rs`.

Re-exports `BasicExchanger` from `lib.rs` (alphabetical within its
module group).

Tests (wiremock, 4 new, all passing):
- caches_successful_exchange
- single_flights_concurrent_exchanges
- negative_caches_idp_rejection
- flush_evicts_entries

Full `oidc-rs` suite: 14 / 14 pass.
Mirrors the SSRF guard already on Validator. Discovery URLs come from
operator-provided TE_OIDC_ISSUER and aren't user-controlled, but matching
the validator's convention keeps the two HTTP clients aligned and closes
the redirect-to-arbitrary-host path during /.well-known/openid-configuration
fetch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…apping

Adds the Actix-Web glue layer: AuthMiddleware (Transform + Service) that
resolves Bearer/Basic credentials into oidc_rs::Identity and injects it into
request extensions, the Authenticated FromRequest extractor, AuthError→HTTP
mapping (401/503), and a smoke test confirming disabled mode works end-to-end.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The held auth state was Rc, which forced Kronos integration code to
reconstruct AuthState inside each per-worker HttpServer::new closure
(via Rc::new((*auth_state).clone())) — readable but surprising. Arc lets
callers build it once at startup and just .clone() into the closure.
Internal sharing inside the Validator/BasicExchanger is already Arc-based,
so there's no extra background-task or HTTP-discovery cost from
per-worker clones.

Plan's Task 9 and Task 12 reference code updated to use the simpler
auth_state.clone() pattern; the smoke test in oidc-rs-actix updated to
use Arc::new directly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
….api_key

Adds crates/common/src/auth.rs with read_auth_config_from_env() bridging
TE_AUTH_MODE / TE_OIDC_* env vars to oidc_rs::AuthConfig. Wires auth field
into AppConfig, removes legacy api_key from ServerEnv (warns if TE_API_KEY
still set), and fixes the worker library client's manual AppConfig construction
to match the new shape.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… Identity

- Promote `kronos-api` to lib+bin so integration tests can build the app.
- Build `oidc_rs_actix::AuthState` at startup from `AppConfig.auth` and
  wrap the `/v1` scope with `AuthMiddleware`; `/health` and `/metrics`
  remain unauthenticated.
- Refactor `AuthenticatedRequest` into a thin newtype over
  `oidc_rs_actix::Authenticated` that yields `oidc_rs::Identity`.
- Consolidate all `/v1/*` routes into a single `v1_routes(ServiceConfig)`
  fn so the auth middleware wraps them once.
- Re-export `AuthMiddleware`/`AuthMode`/`AuthState` from
  `crate::middleware` for ergonomic call sites.
- Drop the last `config.server.api_key` reference; the dashboard's
  per-request credential is replaced in T13-T16 (LoginState).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Follow-up from Task 9 review: explain why build_auth_state intentionally
fails on initial IdP unreachability — a crash-loop on misconfigured issuer
is preferable to silently serving unauthenticated traffic. Operators running
rolling deploys keep the old pods serving during a transient IdP outage.

(Originally also narrowed pub mod dashboard to pub(crate), but main.rs
needs it through the library form — reverted that piece.)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The Task 9 implementation registers AuthState as web::Data::from(arc) so
handlers receive web::Data<AuthState> directly (not web::Data<Arc<AuthState>>).
The Task 11 flush_cache handler depends on that shape. Updating the plan's
Task 9 reference code and Task 12 test helpers to match.
Adds integration tests that spin up an in-process Actix app and exercise
the auth middleware end-to-end:

- whoami in disabled mode returns Identity::Disabled
- flush_cache in disabled mode is a no-op
- Bearer with a valid JWT minted by the mock IdP returns Identity::Bearer
- Basic credentials get exchanged at the mock /token endpoint and the
  resulting JWT is validated, returning Identity::Basic
- Missing Authorization returns 401
- Expired Bearer returns 401

The MockIdp helper reuses oidc-rs's RSA test fixtures (priv key / JWKS
modulus + exponent) via include_bytes/include_str from the api crate's
tests/common module. Tests are forced single-threaded to keep wiremock
from exhausting ephemeral ports.

The helper functions return impl Service<actix_http::Request, ...,
ServiceResponse<BoxBody>, ...> — actix-web's init_service returns a
service keyed on actix_http::Request (not ServiceRequest), so actix-http
is pulled in as a dev-dep purely to spell out that bound.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… signal

Adds the auth module (LoginState Leptos signal), swaps the dashboard's
api_key config field for oidc_issuer / oidc_client_id / oidc_redirect_url
/ oidc_audience / auth_disabled, and wires provide_login_state() at the
App root.

The ~20 sites in dashboard/src/api/client.rs that previously read
config.api_key are bridged with String::new() + TODO(T15) comments so the
workspace keeps building; Task 15 replaces them with LoginState reads.
The SSR-side DashboardConfig constructor in kronos-api's main.rs is also
updated to populate the new fields with safe defaults (auth_disabled =
true, all OIDC fields None) until Task 16 plumbs AppConfig::auth through.

Adds workspace dashboard deps for upcoming PKCE/JWT/storage work
(jsonwebtoken, getrandom[js], gloo-storage, gloo-utils, base64, sha2,
form_urlencoded); the gloo crates are pulled in only by the hydrate
feature.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the 34 String::new() placeholders bridged in by Task 13 with
auth_header(&config) + with_auth(...) helpers that read the access token
from the LoginState Leptos signal. When auth_disabled is true OR no access
token is present, the Authorization header is omitted entirely.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…l + dev compose

Add hydrate-only `redirect_to_idp` and `logout` helpers to the dashboard's
`auth` module — `redirect_to_idp` builds the IdP `/authorize` URL with PKCE
artifacts persisted to sessionStorage; `logout` clears `LoginState` and
best-effort posts to the IdP's `end_session_endpoint` if discovery
advertises one.

Wire an `Effect` in `App` that bounces unauthenticated users to the IdP
unless we're already on `/auth/callback` or `auth_disabled` is true.
Register the `/auth/callback` route via an `AuthCallbackRoute` wrapper
that renders the real `CallbackPage` under hydrate and a placeholder
under SSR.

Populate `DashboardConfig` OIDC fields in `kronos-api/main.rs` from
`AppConfig::auth`, with two new envs — `TE_OIDC_DASHBOARD_CLIENT_ID` and
`TE_OIDC_DASHBOARD_REDIRECT_URL` — letting the operator configure the
dashboard's public OIDC client without baking it into `AuthConfig`.

The redirect helpers consume `pkce::code_challenge` and `pkce::persist`,
silencing the two outstanding `dead_code` warnings from Task 14.

The HTML shell injection of `window.__KRONOS_CONFIG__` (including all
five OIDC globals) was already wired in Task 13's `app::shell`, so no
additional shell changes were needed. The dev `docker-compose.yml` has
no API service block (devs run `cargo run -p kronos-api` directly), so
there was no `TE_AUTH_MODE`/`TE_API_KEY` env entry to flip.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Removes the obsolete TE_API_KEY export (Task 8 dropped the field; the
server WARNs if it's still set) and adds TE_AUTH_MODE=disabled so
`just dev` / `just api` boot without an OIDC provider configured.

Closes the Task 16 carve-out: docker-compose.yml has no API service to
configure, but the justfile is where Kronos's dev workflow actually
sets local envs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds Config::with_basic(id, secret) — sends Authorization: Basic <b64>;
Kronos API exchanges the credentials with the IdP server-side. Adds
Config::with_bearer(token) for externally-obtained JWTs. with_token is
retained as a #[deprecated] alias of with_bearer.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
knutties and others added 5 commits June 14, 2026 22:28
…before_signing

The original `with_basic` shipped in a33a9a1 fails at runtime with
`NoMatchingAuthSchemeError`. The smithy orchestrator resolves identity
BEFORE `modify_before_signing` runs. Every operation in this crate
advertises only `HTTP_BEARER_AUTH_SCHEME_ID` in its auth options, so
when no bearer token is configured `BearerAuthScheme::identity_resolver`
returns `None`, the orchestrator finds no matching auth scheme, and the
request fails before the interceptor ever fires.

Additionally, when both `with_basic` and `with_bearer` were configured,
`BearerAuthSigner::sign_http_request` ran AFTER `modify_before_signing`
and overwrote the Basic header — so Bearer always won regardless of
call order.

Fix:

1. Move the `StaticAuthHeaderInterceptor` from `modify_before_signing`
   to `modify_before_transmit`. The latter is the last interceptor hook
   before the request leaves the client and runs AFTER signing, so our
   Basic header is the one that hits the wire.

2. In `with_basic`, install a placeholder bearer token (`"x"`) so the
   orchestrator's identity resolution succeeds. The signer writes
   `Authorization: Bearer x` during signing; our interceptor then
   overwrites it with `Authorization: Basic <b64(id:secret)>` before
   transmit.

Add `tests/basic_auth_smoke.rs` (with `tokio` as a new dev-dep) that
runs each builder against a hand-rolled `TcpListener` and asserts on
the actual `Authorization` header byte string the SDK puts on the
wire — not just on config-building. Two cases:

  * `with_basic("svc-1", "s3cret")` -> `Authorization: Basic c3ZjLTE6czNjcmV0`
  * `with_bearer("abc")`            -> `Authorization: Bearer abc`

Both pass. Full `cargo test` from `crates/client/` is green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…tial selection in CLI

- smithy/model/main.smithy: add @httpBasicAuth + @auth([httpBearerAuth, httpBasicAuth])
  to KronosService so both schemes are declared; SDK regenerated from updated model.
- crates/client/: regenerated Rust SDK reflecting the updated auth traits.
- cli/src/helpers.ts: replace hardcoded dev-api-key with authFromEnv() that reads
  KRONOS_CLIENT_ID/KRONOS_CLIENT_SECRET (Basic, mutually exclusive with Bearer) or
  KRONOS_BEARER_TOKEN (Bearer); neither set -> no Authorization header (disabled-mode).
  smithy-typescript 0.26.0 does not emit a first-class Basic config field; worked around
  via custom httpAuthSchemes entry with an inline HttpSigner that writes
  Authorization: Basic <base64(id:secret)>.
- justfile: preserve crates/client/tests/basic_auth_smoke.rs across future smithy-build runs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…abled

T18 review caught that the "neither set → no Authorization header" path
actually threw NoMatchingAuthSchemeError before sending — smithy demands
an auth scheme resolve at every operation. Register a no-op Bearer signer
that satisfies the orchestrator without writing a header, so CLI commands
work against a disabled-mode server with no env vars set.

Also drops the unused 'ipc' param from bearerScheme(), silencing a tsc
unused-var warning surfaced by the same review.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes Task 19 / final task of the OIDC auth plan. docker-compose.prod.yml
gains required TE_OIDC_* env vars with :?required syntax for fail-loud
behaviour. README adds a 'Setting up authentication' section with a
Keycloak walkthrough as the primary example, plus Auth0/Okta notes, CLI
env-var docs, and a cache-flush snippet for IdP secret rotations.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…, inflight leak)

- SDK convenience methods (`Config::with_basic` / `with_bearer` / deprecated
  `with_token`) and `StaticAuthHeaderInterceptor` move to a new hand-authored
  side file `crates/client/src/auth_ext.rs` that codegen does not produce, so
  the smithy regen wipe leaves them on disk. The justfile's `smithy-build`
  recipe now `git checkout`s the side file (and README + smoke test) after
  the codegen wipe, and idempotently APPENDS the `pub mod auth_ext;` line in
  `lib.rs` and the `base64` / `tokio` deps in `Cargo.toml` (both files are
  regenerated, so we can't blanket-restore them — `grep -q ... || cat <<EOF`
  inserts only when absent so future regens don't double-insert and don't
  clobber legitimate codegen changes). Recipe converted to a `#!/usr/bin/env
  bash` shebang body so heredocs can span lines. `crates/client/Cargo.lock`
  gitignored — it's a stray from running `cargo` inside the SDK crate.

- Dashboard `/authorize` URL gets `&audience=<TE_OIDC_DASHBOARD_AUDIENCE>`
  appended when set. Without this Auth0 issues opaque `/userinfo`-scoped
  access tokens whose `aud` claim doesn't match TE_OIDC_AUDIENCES and the
  API rejects every dashboard request. `api/main.rs` no longer aliases the
  audience to `client_id` (incorrect for every IdP we support); a dedicated
  `TE_OIDC_DASHBOARD_AUDIENCE` env now feeds it. README + compose updated.

- `BasicExchanger.inflight` no longer leaks. `exchange()` drops its local
  `Arc<Mutex<()>>` clone then calls `DashMap::remove_if` with a
  `Arc::strong_count <= 1` predicate so concurrent in-flight callers keep
  the entry alive while idle entries are evicted (race-free: remove_if
  holds the bucket lock). `flush()` also clears matching inflight entries
  for operator-driven cleanup. New regression test asserts inflight drains
  to 0 after success AND after IdP rejection.

Verified: `cargo check --workspace` passes, `crates/client` builds with
10 lib + 2 smoke + 25 doc tests green, `oidc-rs --lib` has 15 tests
passing (4 existing exchanger + 1 new + 10 others), dashboard WASM build
for `--features hydrate --target wasm32-unknown-unknown` succeeds.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Jun 17, 2026

Copy link
Copy Markdown

Review Change Stack

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 77971132-0384-4062-8243-2cfb92615e0b

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review

Walkthrough

This PR replaces API-key-based auth assumptions with OIDC-based auth across shared Rust crates, the API server, dashboard login flow, CLI and Rust client auth wiring, deployment config, tests, and documentation, while keeping a disabled-auth mode for local development.

Changes

OIDC authentication rollout

Layer / File(s) Summary
Shared OIDC crates and contracts
Cargo.toml, smithy/model/main.smithy, crates/oidc-rs/*, crates/oidc-rs-actix/*
Adds new shared auth crates, auth config and identity types, JWT validation, Basic→token exchange with caching, Actix middleware and extractor support, and supporting tests and fixtures.
Server auth config and protected endpoints
crates/common/*, crates/api/*, crates/worker/*
Reads auth settings from TE_* env vars, builds shared auth state, wraps /v1 routes with middleware, replaces the request extractor, adds /v1/auth/whoami and /v1/auth/cache/flush, and adds integration tests.
Dashboard OIDC login and API headers
crates/dashboard/*
Adds dashboard auth state, PKCE helpers, OIDC callback handling, logout and redirect helpers, callback routing, hydrated auth config, and bearer-token injection for browser API requests.
CLI and generated client auth support
cli/src/helpers.ts, crates/client/*, .gitignore, justfile
Switches CLI auth to env-based selection, adds Rust client with_basic and with_bearer helpers, registers basic auth in generated client config and operations, adds wire-level smoke tests, and preserves hand-authored auth files during regeneration.
Deployment and documentation updates
README.md, docker-compose.prod.yml, docs/superpowers/specs/2026-06-14-oidc-auth-design.md
Updates runtime configuration, request examples, auth setup guidance, dashboard settings, cache flush documentation, production compose variables, and the detailed auth design spec.

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant API
  participant IdP
  participant JWKS
  Client->>API: Request with Bearer or Basic Authorization
  alt Basic credentials
    API->>IdP: client_credentials token exchange
    IdP-->>API: access token
  end
  API->>JWKS: fetch or refresh signing keys
  JWKS-->>API: JWKS
  API-->>Client: authenticated JSON response
Loading
sequenceDiagram
  participant Browser
  participant Dashboard
  participant Issuer
  participant KronosAPI
  Browser->>Dashboard: open protected page
  Dashboard->>Issuer: authorize redirect with PKCE
  Issuer-->>Browser: redirect to /auth/callback with code
  Browser->>Dashboard: callback request
  Dashboard->>Issuer: exchange code for token
  Issuer-->>Dashboard: access token and id_token
  Dashboard->>KronosAPI: API request with Bearer token
  KronosAPI-->>Dashboard: JSON data
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Poem

🐇 I traded old keys for moonlit claims,
Through JWKS gates and PKCE lanes.
A bearer hopped, a basic flew,
The dashboard learned new paths to view.
Now carrots cache in auth burrows bright.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch oidc-auth

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 16

🧹 Nitpick comments (4)
crates/api/tests/auth_integration.rs (1)

54-71: ⚡ Quick win

Add cache-hit and enabled flush-eviction assertions to the Basic-flow test.

Current test proves first exchange works, but not that cached requests avoid extra /token calls or that /auth/cache/flush invalidates cache in enabled mode. Adding that sequence will protect the core cache contract from regressions.

Suggested test extension
 #[actix_web::test]
 async fn basic_with_valid_creds_exchanges_and_returns_identity_basic() {
@@
     let resp: serde_json::Value = test::call_and_read_body_json(&app, req).await;
     assert_eq!(resp["type"], "basic");
     assert_eq!(resp["sub"], "test-m2m-client");
     assert_eq!(idp.token_calls.load(Ordering::SeqCst), 1);
+
+    // cache hit should not call /token again
+    let req2 = test::TestRequest::get()
+        .uri("/v1/auth/whoami")
+        .insert_header(("Authorization", header.clone()))
+        .to_request();
+    let _resp2: serde_json::Value = test::call_and_read_body_json(&app, req2).await;
+    assert_eq!(idp.token_calls.load(Ordering::SeqCst), 1);
+
+    // flush then request again should force a new exchange
+    let flush_req = test::TestRequest::post()
+        .uri("/v1/auth/cache/flush")
+        .set_json(&serde_json::json!({ "client_id": "svc-1" }))
+        .insert_header(("Authorization", header.clone()))
+        .to_request();
+    let _flush_resp: serde_json::Value = test::call_and_read_body_json(&app, flush_req).await;
+
+    let req3 = test::TestRequest::get()
+        .uri("/v1/auth/whoami")
+        .insert_header(("Authorization", header))
+        .to_request();
+    let _resp3: serde_json::Value = test::call_and_read_body_json(&app, req3).await;
+    assert_eq!(idp.token_calls.load(Ordering::SeqCst), 2);
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/api/tests/auth_integration.rs` around lines 54 - 71, The test function
basic_with_valid_creds_exchanges_and_returns_identity_basic currently only
verifies the first token exchange succeeds. Extend the test by making a second
identical request to /v1/auth/whoami with the same Basic authentication header
and verify that idp.token_calls remains at 1 (proving the response was cached).
Then call the /auth/cache/flush endpoint via the app, make a third request to
/v1/auth/whoami with the same credentials, and assert that idp.token_calls has
incremented to 2 (proving the cache flush invalidated the cached response and
forced a new token exchange).
crates/common/src/auth.rs (1)

72-90: ⚡ Quick win

Test helper is not panic-safe and may cause test pollution.

The with_env helper modifies process-wide environment variables but doesn't restore them if f() panics. Additionally, tests modifying shared env vars can race when run in parallel (cargo test uses threads by default).

Consider using std::panic::catch_unwind or the temp_env crate for safer env manipulation:

♻️ Proposed panic-safe wrapper
 fn with_env<F: FnOnce()>(vars: &[(&str, Option<&str>)], f: F) {
     let saved: Vec<(String, Option<String>)> = vars
         .iter()
         .map(|(k, _)| (k.to_string(), std::env::var(k).ok()))
         .collect();
     for (k, v) in vars {
         match v {
             Some(val) => std::env::set_var(k, val),
             None => std::env::remove_var(k),
         }
     }
-    f();
-    for (k, v) in saved {
+    let result = std::panic::catch_unwind(std::panic::AssertUnwindSafe(f));
+    for (k, v) in &saved {
         match v {
             Some(val) => std::env::set_var(&k, val),
             None => std::env::remove_var(&k),
         }
     }
+    if let Err(e) = result {
+        std::panic::resume_unwind(e);
+    }
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/common/src/auth.rs` around lines 72 - 90, The with_env function does
not ensure environment variables are restored if the closure f panics, which can
cause test pollution. Wrap the call to f() using std::panic::catch_unwind or a
similar panic-safe mechanism so that the restoration loop (where saved variables
are restored) always executes regardless of whether f panics or completes
normally. This guarantees that all environment variables are properly restored
even in the presence of panics within the test closure.
crates/dashboard/src/auth.rs (1)

118-126: ⚡ Quick win

Consider adding post_logout_redirect_uri to the logout redirect.

When redirecting to end_session_endpoint, many IdPs (Keycloak, Auth0, Okta) require or recommend a post_logout_redirect_uri parameter to redirect the user back to the application after logout. Without it, users may land on the IdP's generic logout confirmation page.

♻️ Suggested enhancement
                     if let Some(end_session) =
                         body.get("end_session_endpoint").and_then(|v| v.as_str())
                     {
                         let mut url = end_session.to_string();
+                        let mut has_param = false;
                         if let Some(t) = &id_token {
-                            url.push_str(&format!("?id_token_hint={}", url_encode(t)));
+                            url.push_str(&format!("?id_token_hint={}", url_encode(t)));
+                            has_param = true;
+                        }
+                        // Redirect back to app root after logout
+                        if let Some(w) = web_sys::window() {
+                            if let Ok(origin) = w.location().origin() {
+                                let sep = if has_param { "&" } else { "?" };
+                                url.push_str(&format!("{}post_logout_redirect_uri={}", sep, url_encode(&origin)));
+                            }
                         }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/dashboard/src/auth.rs` around lines 118 - 126, The logout redirect in
the end_session_endpoint URL construction is missing the
post_logout_redirect_uri parameter, which many IdPs like Keycloak, Auth0, and
Okta require to properly redirect users back to the application after logout.
Modify the URL building logic in the section where id_token_hint is added to
also append a post_logout_redirect_uri query parameter that points back to your
application's origin or configured logout callback URL, ensuring users are
redirected correctly after logout instead of landing on the IdP's generic logout
page.
crates/dashboard/src/auth/callback.rs (1)

100-107: ⚡ Quick win

Consider handling non-2xx token response status before parsing JSON.

The token endpoint may return error responses (4xx/5xx) with JSON error bodies. Currently, resp.json() is called unconditionally and may fail with a confusing decode error rather than surfacing the actual IdP error message.

♻️ Suggested improvement
             let resp = match resp.send().await {
                 Ok(r) => r,
                 Err(e) => {
                     set_error.set(Some(format!("Token request failed: {e}")));
                     return;
                 }
             };
+            if !resp.ok() {
+                let status = resp.status();
+                let body = resp.text().await.unwrap_or_default();
+                set_error.set(Some(format!("Token endpoint returned {status}: {body}")));
+                return;
+            }
             let tokens: TokenResponse = match resp.json().await {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/dashboard/src/auth/callback.rs` around lines 100 - 107, The code
unconditionally calls resp.json() without first checking the HTTP status code of
the token endpoint response. If the IdP returns a non-2xx status (4xx/5xx), the
JSON parsing may fail with a generic decode error, hiding the actual error
details in the response body. Before attempting to parse resp.json() into
TokenResponse, first check the HTTP status code using resp.status(). If the
status is not successful, extract and log the error information from the
response body before returning the error, otherwise proceed with parsing the
TokenResponse.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cli/src/helpers.ts`:
- Around line 91-101: The authFromEnv() function currently only validates the
case where both KRONOS_CLIENT_ID and KRONOS_CLIENT_SECRET are set, but it
silently falls back to no authentication if only one of these credentials is
provided, which masks configuration errors. Add validation logic to check if
exactly one of KRONOS_CLIENT_ID or KRONOS_CLIENT_SECRET is set (but not both),
and throw an appropriate error message in that case to fail fast and alert the
user to the incomplete configuration, preventing the silent fallback to
unauthenticated requests.

In `@crates/api/src/handlers/auth.rs`:
- Around line 35-49: The flush_cache function currently only checks that the
request is authenticated via AuthenticatedRequest, but does not verify that the
caller has the necessary privileges to perform this sensitive operation. Add
authorization logic within the function to extract privilege or role information
from the AuthenticatedRequest and validate that the caller has the required
permissions before allowing the cache flush. If the caller lacks the necessary
privileges, return an appropriate error response (such as Forbidden or
Unauthorized) instead of proceeding with the flush operation.
- Around line 37-41: The current code uses `unwrap_or_default()` on the `body`
parameter which treats both malformed JSON (resulting in actix-web returning
None) and empty bodies the same way, creating a default FlushRequest with
client_id: None and flushing the entire cache. Instead of the direct
`body.map(|b| b.into_inner()).unwrap_or_default()` approach, explicitly handle
the result from actix-web's JSON parsing: return a 400 Bad Request response when
the body is present but contains invalid JSON (when Option is None due to parse
error), and only apply the default FlushRequest when the body is genuinely
absent or empty. This requires checking whether the failure came from malformed
content versus missing content, and responding accordingly rather than silently
defaulting.

In `@crates/api/src/main.rs`:
- Around line 112-130: In the `oidc_rs::AuthConfig::Enabled(c)` branch where
`client_id` and `redirect_url` are read using `std::env::var(...).ok()`, add
validation to detect when these required dashboard OIDC configuration variables
are missing. After reading both variables, check if they are None and either log
a warning or fail fast with a clear error message indicating that
`TE_OIDC_DASHBOARD_CLIENT_ID` and `TE_OIDC_DASHBOARD_REDIRECT_URL` must be set
when auth is enabled and dashboard mode is active, rather than silently allowing
`None` values that will cause runtime failures.

In `@crates/common/src/auth.rs`:
- Around line 11-16: Update the `read_auth_config_from_env()` function to align
the default auth mode with PR objectives. Change the default value passed to
`get_from_env_or_default()` for the `TE_AUTH_MODE` environment variable from
`"enabled"` to `"disabled"` to make disabled the default for local development.
Also update the docstring at the beginning of the function to reflect that
`TE_AUTH_MODE=disabled` is now the default, with `TE_AUTH_MODE=enabled` as the
alternative that requires OIDC configuration.

In `@crates/dashboard/Cargo.toml`:
- Line 32: The jsonwebtoken dependency declared in crates/dashboard/Cargo.toml
with the use_pem feature is not used anywhere in the dashboard crate, including
in callback.rs which manually handles JWT claim decoding using base64 and
serde_json instead. Remove the entire jsonwebtoken dependency line from the
dependencies section in crates/dashboard/Cargo.toml to eliminate this unused
dependency.

In `@crates/dashboard/src/app.rs`:
- Around line 44-54: The current implementation in the format string injects
config values directly into JavaScript without escaping special characters,
creating an XSS vulnerability if any config value contains quotes, backslashes,
or script tags. Replace the manual string formatting that concatenates
c.api_base_url, c.api_prefix, c.dashboard_prefix, c.auth_disabled,
c.oidc_issuer, c.oidc_client_id, c.oidc_redirect_url, and c.oidc_audience with
JSON serialization of the entire DashboardConfig struct using serde_json, then
assign it to window.__KRONOS_CONFIG__ in JavaScript. This ensures all values are
properly escaped and prevents injection attacks while maintaining the same
configuration object structure.

In `@crates/dashboard/src/auth/callback.rs`:
- Around line 121-123: The `artifacts.return_to` value used in the `set_href()`
call is derived from user-controlled input stored in sessionStorage and lacks
validation for relative paths, creating an open redirect vulnerability. Before
calling `set_href(&artifacts.return_to)`, add a validation check to ensure the
`return_to` value is a relative path by verifying it starts with a forward slash
and does not contain a URL protocol (http://, https://, etc.). If validation
fails, either skip the redirect or use a safe default path instead of proceeding
with the unvalidated user input.

In `@crates/oidc-rs-actix/src/error.rs`:
- Around line 13-17: The AuthError::IdpMalformedResponse variant is currently
matching the default catch-all pattern and returning a 401 Unauthorized
response, but it should be treated as a server-side failure like IdpUnreachable.
Add an explicit match arm for AuthError::IdpMalformedResponse before the default
case that returns HttpResponse::ServiceUnavailable() with the "Retry-After"
header set to "5", matching the same structure as the existing IdpUnreachable
case, then keep the default case for all other error variants.
- Around line 9-11: The error response in the JSON body construction is exposing
internal error details by directly using err.to_string() in the message field,
which can leak sensitive information to clients. Replace the err.to_string()
call with a stable, public-facing error message that corresponds to the error
code (obtained from the code(err) function), and move the full error details
(err.to_string()) to a server-side log statement using an appropriate logging
mechanism to aid internal debugging while keeping the client response generic
and secure.

In `@crates/oidc-rs-actix/src/middleware.rs`:
- Around line 134-148: The header parsing logic uses case-sensitive strip_prefix
checks for "Bearer " and "Basic " schemes, which rejects valid headers with
different casing like "bearer" or "basic". Convert the header string to
lowercase before performing the strip_prefix checks on both the Bearer and Basic
authentication scheme branches to ensure case-insensitive matching while
preserving the original token/credentials data for validation and exchange
operations.

In `@crates/oidc-rs/README.md`:
- Line 3: In the README.md file, clarify the Basic authentication header format
description to explicitly state that the `client_id:client_secret` credentials
must be base64 encoded before being included in the Authorization header.
Instead of the current `Authorization: Basic <client_id:client_secret>`
notation, update the documentation to clearly indicate that the wire format
contains base64-encoded credentials, such as `Authorization: Basic
<base64(client_id:client_secret)>` or by adding text explaining that the
colon-separated credentials must be base64 encoded to prevent developers from
sending unencoded credentials.

In `@crates/oidc-rs/src/config.rs`:
- Around line 93-105: In the build method of the AuthConfigBuilder, add
validation to reject zero or non-positive Duration values for both jwks_refresh
and basic_cache_ttl fields. Currently these fields use unwrap_or with default
Duration values but do not check if explicitly set values are zero or negative.
After unwrapping the optional values (or before creating the EnabledConfig
struct), validate that jwks_refresh and basic_cache_ttl are both greater than
Duration::ZERO, returning an appropriate BuildError variant if either validation
fails. This prevents the panic that occurs when jwks_refresh reaches
tokio::time::interval() with a zero duration and ensures basic_cache_ttl is
actually effective.

In `@crates/oidc-rs/src/exchanger.rs`:
- Around line 137-151: When checking if cached entries have expired in the
positive and negative cache retrieval logic, actively remove the expired entries
from the maps instead of just returning early. In the positive cache check,
after determining that cached.expires_at is not greater than Instant::now(),
delete the entry from self.inner.positive using the key before continuing.
Similarly, in the negative cache check, after determining that neg.until is not
greater than Instant::now(), delete the entry from self.inner.negative using the
key before continuing. Apply this same eviction pattern to all cache retrieval
locations referenced at lines 174-206.

In `@crates/oidc-rs/tests/fixtures/extract_jwk.sh`:
- Line 12: The extract_jwk.sh script hardcodes the RSA exponent value as "AQAB"
when writing to the test_rsa_e.txt file. Instead of hardcoding this value,
extract the actual RSA exponent from the certificate or key being processed
(available in the HERE directory) and encode it properly in base64url format
before writing it to the file. This ensures the fixture remains consistent with
the actual key's exponent regardless of whether it uses the standard 65537
exponent or a different value.

In `@justfile`:
- Around line 115-130: The current implementation uses a single grep check for
[dependencies.base64] to guard the appending of both the base64 and tokio
dependency sections. This causes the dev-dependencies.tokio section to be
skipped whenever [dependencies.base64] already exists, leaving the smoke test
dependency missing after regeneration. Split the idempotency guards into two
separate grep checks and cat commands - one for [dependencies.base64] and one
for [dev-dependencies.tokio] - so each section can be independently re-appended
after regeneration without being blocked by the other's existence.

---

Nitpick comments:
In `@crates/api/tests/auth_integration.rs`:
- Around line 54-71: The test function
basic_with_valid_creds_exchanges_and_returns_identity_basic currently only
verifies the first token exchange succeeds. Extend the test by making a second
identical request to /v1/auth/whoami with the same Basic authentication header
and verify that idp.token_calls remains at 1 (proving the response was cached).
Then call the /auth/cache/flush endpoint via the app, make a third request to
/v1/auth/whoami with the same credentials, and assert that idp.token_calls has
incremented to 2 (proving the cache flush invalidated the cached response and
forced a new token exchange).

In `@crates/common/src/auth.rs`:
- Around line 72-90: The with_env function does not ensure environment variables
are restored if the closure f panics, which can cause test pollution. Wrap the
call to f() using std::panic::catch_unwind or a similar panic-safe mechanism so
that the restoration loop (where saved variables are restored) always executes
regardless of whether f panics or completes normally. This guarantees that all
environment variables are properly restored even in the presence of panics
within the test closure.

In `@crates/dashboard/src/auth.rs`:
- Around line 118-126: The logout redirect in the end_session_endpoint URL
construction is missing the post_logout_redirect_uri parameter, which many IdPs
like Keycloak, Auth0, and Okta require to properly redirect users back to the
application after logout. Modify the URL building logic in the section where
id_token_hint is added to also append a post_logout_redirect_uri query parameter
that points back to your application's origin or configured logout callback URL,
ensuring users are redirected correctly after logout instead of landing on the
IdP's generic logout page.

In `@crates/dashboard/src/auth/callback.rs`:
- Around line 100-107: The code unconditionally calls resp.json() without first
checking the HTTP status code of the token endpoint response. If the IdP returns
a non-2xx status (4xx/5xx), the JSON parsing may fail with a generic decode
error, hiding the actual error details in the response body. Before attempting
to parse resp.json() into TokenResponse, first check the HTTP status code using
resp.status(). If the status is not successful, extract and log the error
information from the response body before returning the error, otherwise proceed
with parsing the TokenResponse.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 6dd365db-190f-4d23-b72d-77ccba04b261

📥 Commits

Reviewing files that changed from the base of the PR and between d2fd0dc and 6fbdbad.

⛔ Files ignored due to path filters (2)
  • Cargo.lock is excluded by !**/*.lock
  • crates/oidc-rs/tests/fixtures/test_rsa_priv.pem is excluded by !**/*.pem
📒 Files selected for processing (103)
  • .gitignore
  • Cargo.toml
  • README.md
  • cli/src/helpers.ts
  • crates/api/Cargo.toml
  • crates/api/src/extractors.rs
  • crates/api/src/handlers.rs
  • crates/api/src/handlers/auth.rs
  • crates/api/src/lib.rs
  • crates/api/src/main.rs
  • crates/api/src/middleware.rs
  • crates/api/src/router.rs
  • crates/api/tests/auth_integration.rs
  • crates/api/tests/common/mod.rs
  • crates/client/Cargo.toml
  • crates/client/src/auth_ext.rs
  • crates/client/src/config.rs
  • crates/client/src/lib.rs
  • crates/client/src/operation/cancel_execution.rs
  • crates/client/src/operation/cancel_execution/builders.rs
  • crates/client/src/operation/cancel_job.rs
  • crates/client/src/operation/cancel_job/builders.rs
  • crates/client/src/operation/create_endpoint.rs
  • crates/client/src/operation/create_endpoint/builders.rs
  • crates/client/src/operation/create_job.rs
  • crates/client/src/operation/create_job/builders.rs
  • crates/client/src/operation/create_payload_spec.rs
  • crates/client/src/operation/create_payload_spec/builders.rs
  • crates/client/src/operation/delete_endpoint.rs
  • crates/client/src/operation/delete_endpoint/builders.rs
  • crates/client/src/operation/delete_payload_spec.rs
  • crates/client/src/operation/delete_payload_spec/builders.rs
  • crates/client/src/operation/get_endpoint.rs
  • crates/client/src/operation/get_endpoint/builders.rs
  • crates/client/src/operation/get_execution.rs
  • crates/client/src/operation/get_execution/builders.rs
  • crates/client/src/operation/get_job.rs
  • crates/client/src/operation/get_job/builders.rs
  • crates/client/src/operation/get_job_status.rs
  • crates/client/src/operation/get_job_status/builders.rs
  • crates/client/src/operation/get_job_versions.rs
  • crates/client/src/operation/get_job_versions/builders.rs
  • crates/client/src/operation/get_payload_spec.rs
  • crates/client/src/operation/get_payload_spec/builders.rs
  • crates/client/src/operation/list_endpoints.rs
  • crates/client/src/operation/list_endpoints/builders.rs
  • crates/client/src/operation/list_execution_attempts.rs
  • crates/client/src/operation/list_execution_attempts/builders.rs
  • crates/client/src/operation/list_execution_logs.rs
  • crates/client/src/operation/list_execution_logs/builders.rs
  • crates/client/src/operation/list_job_executions.rs
  • crates/client/src/operation/list_job_executions/builders.rs
  • crates/client/src/operation/list_jobs.rs
  • crates/client/src/operation/list_jobs/builders.rs
  • crates/client/src/operation/list_payload_specs.rs
  • crates/client/src/operation/list_payload_specs/builders.rs
  • crates/client/src/operation/update_endpoint.rs
  • crates/client/src/operation/update_endpoint/builders.rs
  • crates/client/src/operation/update_job.rs
  • crates/client/src/operation/update_job/builders.rs
  • crates/client/src/operation/update_payload_spec.rs
  • crates/client/src/operation/update_payload_spec/builders.rs
  • crates/client/src/types.rs
  • crates/client/src/types/builders.rs
  • crates/client/src/types/error.rs
  • crates/client/src/types/error/builders.rs
  • crates/client/tests/basic_auth_smoke.rs
  • crates/common/Cargo.toml
  • crates/common/src/auth.rs
  • crates/common/src/config.rs
  • crates/common/src/lib.rs
  • crates/dashboard/Cargo.toml
  • crates/dashboard/src/api/client.rs
  • crates/dashboard/src/app.rs
  • crates/dashboard/src/auth.rs
  • crates/dashboard/src/auth/callback.rs
  • crates/dashboard/src/auth/pkce.rs
  • crates/dashboard/src/config.rs
  • crates/dashboard/src/lib.rs
  • crates/oidc-rs-actix/Cargo.toml
  • crates/oidc-rs-actix/README.md
  • crates/oidc-rs-actix/src/error.rs
  • crates/oidc-rs-actix/src/extractor.rs
  • crates/oidc-rs-actix/src/lib.rs
  • crates/oidc-rs-actix/src/middleware.rs
  • crates/oidc-rs-actix/tests/smoke.rs
  • crates/oidc-rs/Cargo.toml
  • crates/oidc-rs/README.md
  • crates/oidc-rs/src/config.rs
  • crates/oidc-rs/src/exchanger.rs
  • crates/oidc-rs/src/identity.rs
  • crates/oidc-rs/src/lib.rs
  • crates/oidc-rs/src/validator.rs
  • crates/oidc-rs/tests/fixtures/extract_jwk.sh
  • crates/oidc-rs/tests/fixtures/test_rsa_e.txt
  • crates/oidc-rs/tests/fixtures/test_rsa_n.txt
  • crates/worker/Cargo.toml
  • crates/worker/src/client.rs
  • docker-compose.prod.yml
  • docs/superpowers/plans/2026-06-14-oidc-auth.md
  • docs/superpowers/specs/2026-06-14-oidc-auth-design.md
  • justfile
  • smithy/model/main.smithy

Comment thread cli/src/helpers.ts
Comment on lines +91 to +101
const clientId = process.env.KRONOS_CLIENT_ID;
const clientSecret = process.env.KRONOS_CLIENT_SECRET;
const bearer = process.env.KRONOS_BEARER_TOKEN;

if (clientId && clientSecret) {
if (bearer) {
throw new Error(
"KRONOS_CLIENT_ID/KRONOS_CLIENT_SECRET and KRONOS_BEARER_TOKEN are mutually exclusive — " +
"unset one auth method before continuing"
);
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fail fast when only one Basic credential env var is set.

At Line 95, authFromEnv() only handles the “both set” case. If just one of KRONOS_CLIENT_ID / KRONOS_CLIENT_SECRET is set, it silently falls back to the no-auth path (Line 121), which sends unauthenticated requests and hides config errors.

💡 Proposed fix
 function authFromEnv(): Record<string, unknown> {
   const clientId = process.env.KRONOS_CLIENT_ID;
   const clientSecret = process.env.KRONOS_CLIENT_SECRET;
   const bearer = process.env.KRONOS_BEARER_TOKEN;
 
+  if ((clientId && !clientSecret) || (!clientId && clientSecret)) {
+    throw new Error(
+      "KRONOS_CLIENT_ID and KRONOS_CLIENT_SECRET must be set together"
+    );
+  }
+
   if (clientId && clientSecret) {
     if (bearer) {
       throw new Error(
         "KRONOS_CLIENT_ID/KRONOS_CLIENT_SECRET and KRONOS_BEARER_TOKEN are mutually exclusive — " +
         "unset one auth method before continuing"
       );

Also applies to: 117-122

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cli/src/helpers.ts` around lines 91 - 101, The authFromEnv() function
currently only validates the case where both KRONOS_CLIENT_ID and
KRONOS_CLIENT_SECRET are set, but it silently falls back to no authentication if
only one of these credentials is provided, which masks configuration errors. Add
validation logic to check if exactly one of KRONOS_CLIENT_ID or
KRONOS_CLIENT_SECRET is set (but not both), and throw an appropriate error
message in that case to fail fast and alert the user to the incomplete
configuration, preventing the silent fallback to unauthenticated requests.

Comment on lines +35 to +49
pub async fn flush_cache(
_req: AuthenticatedRequest,
body: Option<web::Json<FlushRequest>>,
state: web::Data<AuthState>,
) -> impl Responder {
let req = body.map(|b| b.into_inner()).unwrap_or_default();
let Some(exchanger) = state.exchanger() else {
return HttpResponse::Ok().json(serde_json::json!({
"positive_evicted": 0,
"negative_evicted": 0,
"note": "auth mode is disabled; nothing to flush"
}));
};
let (pos, neg) = exchanger.flush(req.client_id.as_deref());
HttpResponse::Ok().json(serde_json::json!({

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Restrict cache flush to a privileged identity/scope.

flush_cache performs a global/shared auth-cache eviction but does not authorize the caller beyond “authenticated.” That lets any valid principal force cache churn and token-endpoint traffic spikes.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/api/src/handlers/auth.rs` around lines 35 - 49, The flush_cache
function currently only checks that the request is authenticated via
AuthenticatedRequest, but does not verify that the caller has the necessary
privileges to perform this sensitive operation. Add authorization logic within
the function to extract privilege or role information from the
AuthenticatedRequest and validate that the caller has the required permissions
before allowing the cache flush. If the caller lacks the necessary privileges,
return an appropriate error response (such as Forbidden or Unauthorized) instead
of proceeding with the flush operation.

Comment thread crates/api/src/handlers/auth.rs Outdated
Comment on lines +37 to +41
body: Option<web::Json<FlushRequest>>,
state: web::Data<AuthState>,
) -> impl Responder {
let req = body.map(|b| b.into_inner()).unwrap_or_default();
let Some(exchanger) = state.exchanger() else {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

In actix-web, what is the exact behavior of Option<web::Json>when the request body contains malformed JSON—does it returnNone or propagate a 400 error?

💡 Result:

In actix-web, using Option<web::Json> as an extractor does not result in a 400 Bad Request error when the JSON is malformed [1]. Instead, the extractor catches the deserialization error internally and returns None to the handler [1]. This behavior is designed to allow for optional request bodies, but it effectively swallows any deserialization errors, which are typically only logged as DEBUG messages by the framework [1]. If you require the request to fail with a 400 error when the JSON is invalid, you should use web::Json or actix_web::Result<web::Json> as your extractor instead [1]. Using these types forces the framework to propagate deserialization errors, resulting in an automatic 400 Bad Request response [2][3]. If you need to customize the error message or format when using the standard web::Json extractor, you can configure a custom error handler using the.error_handler method on web::JsonConfig [4].

Citations:


🏁 Script executed:

# First, let's check if the file exists and read the relevant section
cd crates/api/src/handlers || exit 1
wc -l auth.rs

Repository: juspay/kronos

Length of output: 68


🏁 Script executed:

# Read the relevant lines around 37-41
sed -n '30,50p' crates/api/src/handlers/auth.rs

Repository: juspay/kronos

Length of output: 894


🏁 Script executed:

# Find FlushRequest definition to understand what default() does
rg "struct FlushRequest" -A 10

Repository: juspay/kronos

Length of output: 1616


🏁 Script executed:

# Find the flush method definition
rg "fn flush" -A 5 --context 2

Repository: juspay/kronos

Length of output: 3299


Malformed JSON silently triggers full-cache flush.

When Option<web::Json<FlushRequest>> receives malformed JSON, actix-web returns None rather than a 400 error. The subsequent unwrap_or_default() creates a FlushRequest with client_id: None, which calls exchanger.flush(None) — flushing the entire cache. This is a significant availability risk. Parse invalid JSON as a 400 Bad Request and reserve defaulting for empty bodies only.

Safer parsing shape
-pub async fn flush_cache(
-    _req: AuthenticatedRequest,
-    body: Option<web::Json<FlushRequest>>,
-    state: web::Data<AuthState>,
-) -> impl Responder {
-    let req = body.map(|b| b.into_inner()).unwrap_or_default();
+pub async fn flush_cache(
+    _req: AuthenticatedRequest,
+    body: web::Bytes,
+    state: web::Data<AuthState>,
+) -> impl Responder {
+    let req = if body.is_empty() {
+        FlushRequest::default()
+    } else {
+        match serde_json::from_slice::<FlushRequest>(&body) {
+            Ok(v) => v,
+            Err(_) => {
+                return HttpResponse::BadRequest().json(serde_json::json!({
+                    "error": "invalid JSON body"
+                }));
+            }
+        }
+    };
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
body: Option<web::Json<FlushRequest>>,
state: web::Data<AuthState>,
) -> impl Responder {
let req = body.map(|b| b.into_inner()).unwrap_or_default();
let Some(exchanger) = state.exchanger() else {
pub async fn flush_cache(
_req: AuthenticatedRequest,
body: web::Bytes,
state: web::Data<AuthState>,
) -> impl Responder {
let req = if body.is_empty() {
FlushRequest::default()
} else {
match serde_json::from_slice::<FlushRequest>(&body) {
Ok(v) => v,
Err(_) => {
return HttpResponse::BadRequest().json(serde_json::json!({
"error": "invalid JSON body"
}));
}
}
};
let Some(exchanger) = state.exchanger() else {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/api/src/handlers/auth.rs` around lines 37 - 41, The current code uses
`unwrap_or_default()` on the `body` parameter which treats both malformed JSON
(resulting in actix-web returning None) and empty bodies the same way, creating
a default FlushRequest with client_id: None and flushing the entire cache.
Instead of the direct `body.map(|b| b.into_inner()).unwrap_or_default()`
approach, explicitly handle the result from actix-web's JSON parsing: return a
400 Bad Request response when the body is present but contains invalid JSON
(when Option is None due to parse error), and only apply the default
FlushRequest when the body is genuinely absent or empty. This requires checking
whether the failure came from malformed content versus missing content, and
responding accordingly rather than silently defaulting.

Comment thread crates/api/src/main.rs
Comment on lines +112 to +130
let (auth_disabled, oidc_issuer, oidc_client_id, oidc_redirect_url, oidc_audience) =
match &config.auth {
oidc_rs::AuthConfig::Disabled => (true, None, None, None, None),
oidc_rs::AuthConfig::Enabled(c) => {
let issuer = Some(c.issuer.clone());
// Three extra envs let the operator configure the dashboard's
// public OIDC client without baking it into AuthConfig.
let client_id = std::env::var("TE_OIDC_DASHBOARD_CLIENT_ID").ok();
let redirect_url = std::env::var("TE_OIDC_DASHBOARD_REDIRECT_URL").ok();
// Audience for the `/authorize` request. For Auth0 this is
// REQUIRED to get an API-scoped access_token (otherwise the
// token's `aud` is `https://<tenant>.auth0.com/userinfo` and
// the API rejects it). For Keycloak / Okta it's typically
// unused / harmless. NEVER fall back to `client_id` here —
// they are independent concepts in every IdP we support.
let audience = std::env::var("TE_OIDC_DASHBOARD_AUDIENCE").ok();
(false, issuer, client_id, redirect_url, audience)
}
};

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Missing dashboard OIDC configuration produces silent failure.

When TE_AUTH_MODE=enabled, the dashboard requires TE_OIDC_DASHBOARD_CLIENT_ID and TE_OIDC_DASHBOARD_REDIRECT_URL for the PKCE flow to work. Currently these are read with .ok(), silently yielding None. Users will only discover the misconfiguration at runtime when the login button fails.

Consider logging a warning or failing fast when auth is enabled but dashboard mode is active and these vars are missing:

🛡️ Proposed validation when building dashboard config
                 oidc_rs::AuthConfig::Enabled(c) => {
                     let issuer = Some(c.issuer.clone());
-                    // Three extra envs let the operator configure the dashboard's
-                    // public OIDC client without baking it into AuthConfig.
                     let client_id = std::env::var("TE_OIDC_DASHBOARD_CLIENT_ID").ok();
                     let redirect_url = std::env::var("TE_OIDC_DASHBOARD_REDIRECT_URL").ok();
-                    // Audience for the `/authorize` request. ...
                     let audience = std::env::var("TE_OIDC_DASHBOARD_AUDIENCE").ok();
+                    if client_id.is_none() || redirect_url.is_none() {
+                        tracing::warn!(
+                            "Dashboard auth enabled but TE_OIDC_DASHBOARD_CLIENT_ID and/or \
+                             TE_OIDC_DASHBOARD_REDIRECT_URL not set. Login will fail."
+                        );
+                    }
                     (false, issuer, client_id, redirect_url, audience)
                 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
let (auth_disabled, oidc_issuer, oidc_client_id, oidc_redirect_url, oidc_audience) =
match &config.auth {
oidc_rs::AuthConfig::Disabled => (true, None, None, None, None),
oidc_rs::AuthConfig::Enabled(c) => {
let issuer = Some(c.issuer.clone());
// Three extra envs let the operator configure the dashboard's
// public OIDC client without baking it into AuthConfig.
let client_id = std::env::var("TE_OIDC_DASHBOARD_CLIENT_ID").ok();
let redirect_url = std::env::var("TE_OIDC_DASHBOARD_REDIRECT_URL").ok();
// Audience for the `/authorize` request. For Auth0 this is
// REQUIRED to get an API-scoped access_token (otherwise the
// token's `aud` is `https://<tenant>.auth0.com/userinfo` and
// the API rejects it). For Keycloak / Okta it's typically
// unused / harmless. NEVER fall back to `client_id` here —
// they are independent concepts in every IdP we support.
let audience = std::env::var("TE_OIDC_DASHBOARD_AUDIENCE").ok();
(false, issuer, client_id, redirect_url, audience)
}
};
let (auth_disabled, oidc_issuer, oidc_client_id, oidc_redirect_url, oidc_audience) =
match &config.auth {
oidc_rs::AuthConfig::Disabled => (true, None, None, None, None),
oidc_rs::AuthConfig::Enabled(c) => {
let issuer = Some(c.issuer.clone());
let client_id = std::env::var("TE_OIDC_DASHBOARD_CLIENT_ID").ok();
let redirect_url = std::env::var("TE_OIDC_DASHBOARD_REDIRECT_URL").ok();
let audience = std::env::var("TE_OIDC_DASHBOARD_AUDIENCE").ok();
if client_id.is_none() || redirect_url.is_none() {
tracing::warn!(
"Dashboard auth enabled but TE_OIDC_DASHBOARD_CLIENT_ID and/or \
TE_OIDC_DASHBOARD_REDIRECT_URL not set. Login will fail."
);
}
(false, issuer, client_id, redirect_url, audience)
}
};
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/api/src/main.rs` around lines 112 - 130, In the
`oidc_rs::AuthConfig::Enabled(c)` branch where `client_id` and `redirect_url`
are read using `std::env::var(...).ok()`, add validation to detect when these
required dashboard OIDC configuration variables are missing. After reading both
variables, check if they are None and either log a warning or fail fast with a
clear error message indicating that `TE_OIDC_DASHBOARD_CLIENT_ID` and
`TE_OIDC_DASHBOARD_REDIRECT_URL` must be set when auth is enabled and dashboard
mode is active, rather than silently allowing `None` values that will cause
runtime failures.

Comment thread crates/common/src/auth.rs
Comment on lines +11 to +16
/// `TE_AUTH_MODE=enabled` (default) → requires `TE_OIDC_ISSUER` and
/// `TE_OIDC_AUDIENCES` and uses the optional `TE_OIDC_BASIC_*` and
/// `TE_OIDC_JWKS_REFRESH_SEC` vars where set.
pub fn read_auth_config_from_env() -> Result<AuthConfig, String> {
let mode = get_from_env_or_default("TE_AUTH_MODE", "enabled".to_string())
.to_lowercase();

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Default auth mode contradicts PR objectives.

The docstring at line 11 and code at line 15 default TE_AUTH_MODE to "enabled", but the PR objectives state "TE_AUTH_MODE=disabled (default for local dev)". This mismatch could cause developers to unexpectedly need OIDC configuration in local environments.

If "disabled" is intended as the local dev default, update the code:

🐛 Proposed fix to align default with PR objectives
 /// `TE_AUTH_MODE=disabled` → [`AuthConfig::Disabled`].
-/// `TE_AUTH_MODE=enabled` (default) → requires `TE_OIDC_ISSUER` and
+/// `TE_AUTH_MODE=disabled` (default) → authentication bypassed.
+/// `TE_AUTH_MODE=enabled` → requires `TE_OIDC_ISSUER` and
 /// `TE_OIDC_AUDIENCES` and uses the optional `TE_OIDC_BASIC_*` and
 /// `TE_OIDC_JWKS_REFRESH_SEC` vars where set.
 pub fn read_auth_config_from_env() -> Result<AuthConfig, String> {
-    let mode = get_from_env_or_default("TE_AUTH_MODE", "enabled".to_string())
+    let mode = get_from_env_or_default("TE_AUTH_MODE", "disabled".to_string())
         .to_lowercase();
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/common/src/auth.rs` around lines 11 - 16, Update the
`read_auth_config_from_env()` function to align the default auth mode with PR
objectives. Change the default value passed to `get_from_env_or_default()` for
the `TE_AUTH_MODE` environment variable from `"enabled"` to `"disabled"` to make
disabled the default for local development. Also update the docstring at the
beginning of the function to reflect that `TE_AUTH_MODE=disabled` is now the
default, with `TE_AUTH_MODE=enabled` as the alternative that requires OIDC
configuration.

Comment thread crates/oidc-rs/README.md Outdated
@@ -0,0 +1,9 @@
# oidc-rs

Lightweight OIDC Resource Server primitives for Rust services. Validates inbound JWTs against an OIDC issuer's JWKS, and exchanges `Authorization: Basic <client_id:client_secret>` credentials against the issuer's `client_credentials` grant (cached, single-flighted) so that machine clients don't pay a token-endpoint roundtrip per request.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Clarify Basic header wire format in the README.

Authorization: Basic <client_id:client_secret> is ambiguous; on-wire Basic auth must be base64 of client_id:client_secret. This wording can cause broken client examples.

Suggested doc tweak
-... exchanges `Authorization: Basic <client_id:client_secret>` credentials ...
+... exchanges `Authorization: Basic <base64(client_id:client_secret)>` credentials ...
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
Lightweight OIDC Resource Server primitives for Rust services. Validates inbound JWTs against an OIDC issuer's JWKS, and exchanges `Authorization: Basic <client_id:client_secret>` credentials against the issuer's `client_credentials` grant (cached, single-flighted) so that machine clients don't pay a token-endpoint roundtrip per request.
Lightweight OIDC Resource Server primitives for Rust services. Validates inbound JWTs against an OIDC issuer's JWKS, and exchanges `Authorization: Basic <base64(client_id:client_secret)>` credentials against the issuer's `client_credentials` grant (cached, single-flighted) so that machine clients don't pay a token-endpoint roundtrip per request.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/oidc-rs/README.md` at line 3, In the README.md file, clarify the Basic
authentication header format description to explicitly state that the
`client_id:client_secret` credentials must be base64 encoded before being
included in the Authorization header. Instead of the current `Authorization:
Basic <client_id:client_secret>` notation, update the documentation to clearly
indicate that the wire format contains base64-encoded credentials, such as
`Authorization: Basic <base64(client_id:client_secret)>` or by adding text
explaining that the colon-separated credentials must be base64 encoded to
prevent developers from sending unencoded credentials.

Comment thread crates/oidc-rs/src/config.rs Outdated
Comment on lines +93 to +105
pub fn build(self) -> Result<AuthConfig, BuildError> {
let issuer = self.issuer.ok_or(BuildError::MissingIssuer)?;
if self.audiences.is_empty() {
return Err(BuildError::EmptyAudiences);
}
Ok(AuthConfig::Enabled(EnabledConfig {
issuer,
audiences: self.audiences,
basic_audience: self.basic_audience,
basic_scope: self.basic_scope,
basic_cache_ttl: self.basic_cache_ttl.unwrap_or(Duration::from_secs(3600)),
jwks_refresh: self.jwks_refresh.unwrap_or(Duration::from_secs(300)),
}))

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Verify zero can flow from env -> builder -> validator refresh loop.
rg -n "TE_OIDC_JWKS_REFRESH_SEC|get_from_env_or_default|jwks_refresh|interval\\(" \
  crates/common/src/auth.rs \
  crates/oidc-rs/src/config.rs \
  crates/oidc-rs/src/validator.rs

Repository: juspay/kronos

Length of output: 1839


Reject zero-duration timing values during config build.

jwks_refresh and basic_cache_ttl currently accept Duration::ZERO. The jwks_refresh value flows into tokio::time::interval() (line 174 of validator.rs), which panics at runtime with zero-duration intervals. The basic_cache_ttl similarly becomes ineffective. Both should be validated to reject zero/non-positive values during config build, catching configuration errors at startup rather than runtime.

Proposed fix
 pub fn build(self) -> Result<AuthConfig, BuildError> {
+    if matches!(self.basic_cache_ttl, Some(ttl) if ttl.is_zero()) {
+        return Err(BuildError::NonPositiveBasicCacheTtl);
+    }
+    if matches!(self.jwks_refresh, Some(interval) if interval.is_zero()) {
+        return Err(BuildError::NonPositiveJwksRefresh);
+    }
+
     let issuer = self.issuer.ok_or(BuildError::MissingIssuer)?;
     if self.audiences.is_empty() {
         return Err(BuildError::EmptyAudiences);
@@
 pub enum BuildError {
@@
     #[error("at least one audience is required")]
     EmptyAudiences,
+    #[error("basic cache TTL must be greater than zero")]
+    NonPositiveBasicCacheTtl,
+    #[error("JWKS refresh interval must be greater than zero")]
+    NonPositiveJwksRefresh,
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
pub fn build(self) -> Result<AuthConfig, BuildError> {
let issuer = self.issuer.ok_or(BuildError::MissingIssuer)?;
if self.audiences.is_empty() {
return Err(BuildError::EmptyAudiences);
}
Ok(AuthConfig::Enabled(EnabledConfig {
issuer,
audiences: self.audiences,
basic_audience: self.basic_audience,
basic_scope: self.basic_scope,
basic_cache_ttl: self.basic_cache_ttl.unwrap_or(Duration::from_secs(3600)),
jwks_refresh: self.jwks_refresh.unwrap_or(Duration::from_secs(300)),
}))
pub fn build(self) -> Result<AuthConfig, BuildError> {
if matches!(self.basic_cache_ttl, Some(ttl) if ttl.is_zero()) {
return Err(BuildError::NonPositiveBasicCacheTtl);
}
if matches!(self.jwks_refresh, Some(interval) if interval.is_zero()) {
return Err(BuildError::NonPositiveJwksRefresh);
}
let issuer = self.issuer.ok_or(BuildError::MissingIssuer)?;
if self.audiences.is_empty() {
return Err(BuildError::EmptyAudiences);
}
Ok(AuthConfig::Enabled(EnabledConfig {
issuer,
audiences: self.audiences,
basic_audience: self.basic_audience,
basic_scope: self.basic_scope,
basic_cache_ttl: self.basic_cache_ttl.unwrap_or(Duration::from_secs(3600)),
jwks_refresh: self.jwks_refresh.unwrap_or(Duration::from_secs(300)),
}))
}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/oidc-rs/src/config.rs` around lines 93 - 105, In the build method of
the AuthConfigBuilder, add validation to reject zero or non-positive Duration
values for both jwks_refresh and basic_cache_ttl fields. Currently these fields
use unwrap_or with default Duration values but do not check if explicitly set
values are zero or negative. After unwrapping the optional values (or before
creating the EnabledConfig struct), validate that jwks_refresh and
basic_cache_ttl are both greater than Duration::ZERO, returning an appropriate
BuildError variant if either validation fails. This prevents the panic that
occurs when jwks_refresh reaches tokio::time::interval() with a zero duration
and ensures basic_cache_ttl is actually effective.

Comment thread crates/oidc-rs/src/exchanger.rs Outdated
Comment on lines +137 to +151
if let Some(cached) = self.inner.positive.get(&key) {
if cached.expires_at > Instant::now() {
return Ok(cached.jwt.clone());
}
}
if let Some(neg) = self.inner.negative.get(&key) {
if neg.until > Instant::now() {
return Err(match neg.error {
NegativeReason::Rejected => AuthError::IdpRejected,
NegativeReason::Transient => {
AuthError::IdpUnreachable("recent transient failure".into())
}
});
}
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy lift

Expired cache entries are never evicted, allowing unbounded memory growth.

Line 137 and Line 142 only bypass expired entries; they do not remove them. With repeated unique Basic credentials, both positive and especially negative maps can grow without bound, creating a remote memory-DoS path.

Immediate mitigation (partial) + direction
 pub async fn exchange(&self, client_id: &str, secret: &str) -> Result<String, AuthError> {
+    let now = Instant::now();
     let key = CacheKey {
         client_id: client_id.to_string(),
         secret_hash: hash_secret(secret),
     };

     if let Some(cached) = self.inner.positive.get(&key) {
-        if cached.expires_at > Instant::now() {
+        if cached.expires_at > now {
             return Ok(cached.jwt.clone());
         }
+        drop(cached);
+        self.inner.positive.remove(&key);
     }
     if let Some(neg) = self.inner.negative.get(&key) {
-        if neg.until > Instant::now() {
+        if neg.until > now {
             return Err(match neg.error {
                 NegativeReason::Rejected => AuthError::IdpRejected,
                 NegativeReason::Transient => {
                     AuthError::IdpUnreachable("recent transient failure".into())
                 }
             });
         }
+        drop(neg);
+        self.inner.negative.remove(&key);
     }

This should be paired with a bounded-size eviction policy (for cold keys that are never read again), otherwise one-shot credential spray can still grow maps indefinitely.

Also applies to: 174-206

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/oidc-rs/src/exchanger.rs` around lines 137 - 151, When checking if
cached entries have expired in the positive and negative cache retrieval logic,
actively remove the expired entries from the maps instead of just returning
early. In the positive cache check, after determining that cached.expires_at is
not greater than Instant::now(), delete the entry from self.inner.positive using
the key before continuing. Similarly, in the negative cache check, after
determining that neg.until is not greater than Instant::now(), delete the entry
from self.inner.negative using the key before continuing. Apply this same
eviction pattern to all cache retrieval locations referenced at lines 174-206.

n_bytes = bytes.fromhex(n_hex)
def b64u(b): return base64.urlsafe_b64encode(b).rstrip(b"=").decode()
open("$HERE/test_rsa_n.txt","w").write(b64u(n_bytes))
open("$HERE/test_rsa_e.txt","w").write("AQAB")

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Do not hardcode RSA exponent in fixture extraction.

Line 12 always writes AQAB; if the fixture key changes to a non-65537 exponent, generated JWK parts become inconsistent.

Proposed fix
 python3 - <<PY
 import base64, subprocess
 out = subprocess.check_output(["openssl","rsa","-in","$PRIV","-noout","-modulus"]).decode()
 n_hex = out.strip().split("=")[1]
 n_bytes = bytes.fromhex(n_hex)
+text = subprocess.check_output(["openssl","rsa","-in","$PRIV","-noout","-text"]).decode()
+e_int = None
+for line in text.splitlines():
+    line = line.strip()
+    if line.startswith("publicExponent:"):
+        e_int = int(line.split(":")[1].split("(")[0].strip())
+        break
+if e_int is None:
+    raise RuntimeError("publicExponent not found")
+e_bytes = e_int.to_bytes((e_int.bit_length() + 7) // 8, "big")
 def b64u(b): return base64.urlsafe_b64encode(b).rstrip(b"=").decode()
 open("$HERE/test_rsa_n.txt","w").write(b64u(n_bytes))
-open("$HERE/test_rsa_e.txt","w").write("AQAB")
+open("$HERE/test_rsa_e.txt","w").write(b64u(e_bytes))
 PY
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
open("$HERE/test_rsa_e.txt","w").write("AQAB")
python3 - <<PY
import base64, subprocess
out = subprocess.check_output(["openssl","rsa","-in","$PRIV","-noout","-modulus"]).decode()
n_hex = out.strip().split("=")[1]
n_bytes = bytes.fromhex(n_hex)
text = subprocess.check_output(["openssl","rsa","-in","$PRIV","-noout","-text"]).decode()
e_int = None
for line in text.splitlines():
line = line.strip()
if line.startswith("publicExponent:"):
e_int = int(line.split(":")[1].split("(")[0].strip())
break
if e_int is None:
raise RuntimeError("publicExponent not found")
e_bytes = e_int.to_bytes((e_int.bit_length() + 7) // 8, "big")
def b64u(b): return base64.urlsafe_b64encode(b).rstrip(b"=").decode()
open("$HERE/test_rsa_n.txt","w").write(b64u(n_bytes))
open("$HERE/test_rsa_e.txt","w").write(b64u(e_bytes))
PY
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/oidc-rs/tests/fixtures/extract_jwk.sh` at line 12, The extract_jwk.sh
script hardcodes the RSA exponent value as "AQAB" when writing to the
test_rsa_e.txt file. Instead of hardcoding this value, extract the actual RSA
exponent from the certificate or key being processed (available in the HERE
directory) and encode it properly in base64url format before writing it to the
file. This ensures the fixture remains consistent with the actual key's exponent
regardless of whether it uses the standard 65537 exponent or a different value.

Comment thread justfile
Comment on lines +115 to +130
grep -q '^\[dependencies.base64\]' crates/client/Cargo.toml || cat >> crates/client/Cargo.toml <<'EOF'

# ── Hand-authored deps (NOT produced by codegen) ─────────────────────────
# Used by `src/auth_ext.rs::Config::with_basic` to base64-encode the
# `Authorization: Basic <b64(id:secret)>` value. The justfile's `smithy-build`
# recipe re-appends this entry after each regeneration.
[dependencies.base64]
version = "0.22"

# ── Hand-authored dev-deps (NOT produced by codegen) ─────────────────────
# Used by `tests/basic_auth_smoke.rs` (`#[tokio::test]`). The justfile's
# `smithy-build` recipe re-appends this entry after each regeneration.
[dev-dependencies.tokio]
version = "1"
features = ["macros", "rt-multi-thread", "net", "io-util", "time"]
EOF

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Split idempotency guards for base64 and tokio sections.

Using one grep condition for both inserts can skip restoring [dev-dependencies.tokio] whenever [dependencies.base64] already exists, leaving the smoke test dependency missing after regeneration.

Suggested fix
-    grep -q '^\[dependencies.base64\]' crates/client/Cargo.toml || cat >> crates/client/Cargo.toml <<'EOF'
+    grep -q '^\[dependencies.base64\]' crates/client/Cargo.toml || cat >> crates/client/Cargo.toml <<'EOF'
 
     # ── Hand-authored deps (NOT produced by codegen) ─────────────────────────
     # Used by `src/auth_ext.rs::Config::with_basic` to base64-encode the
     # `Authorization: Basic <b64(id:secret)>` value. The justfile's `smithy-build`
     # recipe re-appends this entry after each regeneration.
     [dependencies.base64]
     version = "0.22"
+    EOF
 
+    grep -q '^\[dev-dependencies.tokio\]' crates/client/Cargo.toml || cat >> crates/client/Cargo.toml <<'EOF'
     # ── Hand-authored dev-deps (NOT produced by codegen) ─────────────────────
     # Used by `tests/basic_auth_smoke.rs` (`#[tokio::test]`). The justfile's
     # `smithy-build` recipe re-appends this entry after each regeneration.
     [dev-dependencies.tokio]
     version = "1"
     features = ["macros", "rt-multi-thread", "net", "io-util", "time"]
     EOF
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@justfile` around lines 115 - 130, The current implementation uses a single
grep check for [dependencies.base64] to guard the appending of both the base64
and tokio dependency sections. This causes the dev-dependencies.tokio section to
be skipped whenever [dependencies.base64] already exists, leaving the smoke test
dependency missing after regeneration. Split the idempotency guards into two
separate grep checks and cat commands - one for [dependencies.base64] and one
for [dev-dependencies.tokio] - so each section can be independently re-appended
after regeneration without being blocked by the other's existence.

knutties and others added 2 commits June 18, 2026 07:43
13 issues from the post-push review:
- Reject zero-duration timing values in AuthConfigBuilder (avoids runtime
  panic in tokio::time::interval).
- BasicExchanger evicts expired positive/negative entries on read.
- flush_cache returns 400 on malformed JSON instead of silently flushing
  everything.
- IdpMalformedResponse now maps to 503 (not 401).
- Auth error responses use stable public messages; full err detail logged.
- Dashboard HTML shell uses serde_json::to_string for safe JS injection.
- CLI throws when only one of KRONOS_CLIENT_ID/SECRET is set.
- API warns when dashboard is served with auth enabled but no OIDC
  dashboard envs.
- Auth-scheme prefix parsing is now case-insensitive (RFC 7235).
- Dashboard callback rejects non-relative return_to (open-redirect guard).
- Removed unused jsonwebtoken dep from dashboard.
- justfile splits base64 / tokio dep guards so each restores independently.
- oidc-rs README clarifies Basic header is base64-encoded.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…tree

The OIDC machinery has been extracted to a standalone repo at
https://github.com/juspay/oidc-rs. Kronos now depends on it via a git
dep pinned to the `main` branch (currently 2315ddb8); promote to a tag
or rev once upstream cuts a release.

Changes:
- Workspace Cargo.toml: drop crates/oidc-rs{,-actix} from members; add
  the two git deps to [workspace.dependencies]; drop now-unused
  openidconnect entry (kronos itself never consumed it directly).
- crates/{common,api,worker}/Cargo.toml: switch path deps to
  workspace = true.
- crates/api/tests/fixtures/: relocate the RSA test fixtures previously
  under crates/oidc-rs/tests/fixtures/ — kronos-api's auth integration
  tests are the only consumer.
- crates/api/tests/common/mod.rs: update 4 include_bytes!/include_str!
  paths to point at the relocated fixtures.
- crates/oidc-rs/, crates/oidc-rs-actix/: removed.

Verified:
- cargo check --workspace clean.
- cargo test -p kronos-api --test auth_integration: 8/8 pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant