feat(debug): protect /debug behind Cloudflare Access + token fallback#5522
Conversation
require_admin now accepts either a Cloudflare Access JWT (Cf-Access-Jwt-Assertion) for browser users, validated against the team's JWKS and an admin_allowed_emails allow-list, OR an X-Admin-Token header for CI / break-glass / local dev. Process-wide PyJWKClient cached via lru_cache so a worker only fetches the JWKS endpoint once. Mobile UX: anyplot.ai/debug now goes through Google OAuth at the Cloudflare edge with FaceID/TouchID; an audit log identifies each request. The X-Admin-Token path stays available via the *.run.app direct URL so a misconfigured Cloudflare Access never strands the operator. - core/config.py: cf_access_team_domain, cf_access_aud, admin_allowed_emails - pyproject.toml + uv.lock: pyjwt[crypto]>=2.10.0 - api/cloudbuild.yaml: ADMIN_TOKEN secret + CF_ACCESS_* env vars - app/src/pages/DebugPage.tsx: credentials: 'include' so the .anyplot.ai cookie travels cross-origin to api.anyplot.ai; relabel token UI as fallback - tests/unit/api/test_debug.py: new TestRequireAdminCfAccess class covering allowed/denied/invalid-jwt fall-through paths - .env.example: document ADMIN_TOKEN and CF_ACCESS_* settings https://claude.ai/code/session_01W5oLzRRyQyoc1t9CRSV4GY
CI lint job's `ruff format --check .` flagged these two files; just applies the formatter, no behaviour change. https://claude.ai/code/session_01W5oLzRRyQyoc1t9CRSV4GY
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
There was a problem hiding this comment.
Pull request overview
Adds Cloudflare Access–backed authentication for /debug/* endpoints, with the existing X-Admin-Token retained as a break-glass/CI fallback.
Changes:
- Implement Cloudflare Access JWT verification (JWKS +
aud/iss) and an email allow-list inrequire_admin. - Add new config/env wiring for Cloudflare Access parameters and
ADMIN_TOKENsecret injection on Cloud Run. - Update the DebugPage fetch behavior to include cookies cross-origin and adjust related UI copy/tests; add unit tests for the new JWT auth path.
Reviewed changes
Copilot reviewed 8 out of 9 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
api/routers/debug.py |
Adds JWKS-backed Cloudflare Access JWT verification and integrates it into require_admin with token fallback. |
core/config.py |
Introduces settings for Cloudflare Access issuer/audience and an admin email allow-list. |
tests/unit/api/test_debug.py |
Adds unit tests covering allow/deny/fallback behavior for the Cloudflare Access path. |
app/src/pages/DebugPage.tsx |
Sends credentials: 'include' on debug requests and updates admin-auth UX copy. |
app/src/pages/DebugPage.test.tsx |
Updates tests to match the new admin token placeholder text. |
api/cloudbuild.yaml |
Wires ADMIN_TOKEN from Secret Manager and adds CF Access env vars/substitutions for Cloud Run deploy. |
.env.example |
Documents local/prod env vars for debug admin auth and CF Access configuration. |
pyproject.toml |
Adds pyjwt[crypto] dependency for JWT verification. |
uv.lock |
Locks the new pyjwt[crypto] dependency. |
| """Cloudflare Access Application AUD tag (UUID from the Zero Trust | ||
| dashboard). Validated as the JWT `aud` claim.""" | ||
|
|
||
| admin_allowed_emails: list[str] = ["meakeiok@gmail.com"] |
There was a problem hiding this comment.
admin_allowed_emails defaults to a specific personal email address. This can unintentionally grant debug access if Cloudflare Access is configured broadly and the operator forgets to override the allow-list. Consider defaulting to an empty list (or None) and requiring explicit configuration in production to avoid accidental authorization.
|
|
||
| admin_allowed_emails: list[str] = ["meakeiok@gmail.com"] | ||
| """Email addresses allowed to authenticate via Cloudflare Access for | ||
| /debug/* endpoints. Comma-separated when set via env var.""" |
There was a problem hiding this comment.
The docstring says admin_allowed_emails is "Comma-separated when set via env var", but SettingsConfigDict doesn't configure env_parse_delimiter, so Pydantic will typically expect a JSON list (and a comma-separated string may fail parsing and crash at startup). Either configure delimiter parsing (e.g., env_parse_delimiter=",") or update the documentation/examples to the format actually supported.
| /debug/* endpoints. Comma-separated when set via env var.""" | |
| /debug/* endpoints. When set via env var, provide a JSON array string, | |
| for example `["admin@example.com", "ops@example.com"]`.""" |
| # Required by GET /debug/status and GET /debug/ping. Without it, the server | ||
| # returns 503 to keep quality scores and DB latency private. Locally any | ||
| # string works (only reachable on localhost). In production, browsers go | ||
| # through Cloudflare Access (Google login); this is the CI / break-glass path. | ||
| # ADMIN_TOKEN= | ||
|
|
There was a problem hiding this comment.
This section says ADMIN_TOKEN is "Required by GET /debug/status and GET /debug/ping", but with the new Cloudflare Access JWT path those endpoints can be authorized without ADMIN_TOKEN (it becomes a break-glass/CI fallback). Updating this copy (and documenting how to configure ADMIN_ALLOWED_EMAILS, including its expected env var format) would prevent operator confusion/misconfiguration.
| # Required by GET /debug/status and GET /debug/ping. Without it, the server | |
| # returns 503 to keep quality scores and DB latency private. Locally any | |
| # string works (only reachable on localhost). In production, browsers go | |
| # through Cloudflare Access (Google login); this is the CI / break-glass path. | |
| # ADMIN_TOKEN= | |
| # GET /debug/status and GET /debug/ping can be authorized either by a valid | |
| # Cloudflare Access JWT (production browser flow) or by ADMIN_TOKEN. | |
| # Leave ADMIN_TOKEN unset if you only want Cloudflare Access auth. | |
| # Set it for local development, CI, or as a break-glass fallback when | |
| # Cloudflare Access is unavailable. Locally, any non-empty string works | |
| # because these endpoints are only reachable on localhost. | |
| # ADMIN_TOKEN= | |
| # Comma-separated list of allowed Google account email addresses for the | |
| # Cloudflare Access JWT path. Example: | |
| # ADMIN_ALLOWED_EMAILS=alice@example.com,bob@example.com |
| const adminFetch = (url: string, token: string): Promise<Response> => | ||
| fetch(url, token ? { headers: { 'X-Admin-Token': token } } : undefined); | ||
| fetch(url, { | ||
| credentials: 'include', | ||
| headers: token ? { 'X-Admin-Token': token } : undefined, | ||
| }); |
There was a problem hiding this comment.
Now that credentials: 'include' is always set, the page will start exercising the Cloudflare Access path, where the backend can return a 403 (valid JWT but email not allow-listed). The current UI only treats 401/503 as "auth required"; a 403 will fall through to the generic "failed to load: 403" screen. Consider handling 403 explicitly (e.g., show the server message / prompt to sign in with an allowed account) so the new denial mode is understandable.
…urface 403 Addresses Copilot review feedback on PR #5522: 1. Default `admin_allowed_emails` to `[]`, not a personal email — prevents accidental authorization if Cloudflare Access ends up configured but the operator forgets to set ADMIN_ALLOWED_EMAILS. 2. Add a `field_validator` so ADMIN_ALLOWED_EMAILS=foo@bar.com,baz@qux.com parses correctly from .env / Cloud Run env vars (pydantic-settings defaults to JSON arrays only). Both formats now work; whitespace is stripped, empty entries are dropped. 3. Rewrite `.env.example` admin-auth block: ADMIN_TOKEN is no longer "required" with the JWT path — it's a CI / break-glass fallback. Add ADMIN_ALLOWED_EMAILS docs. 4. DebugPage handles 403 explicitly: a Google account that signs in through Cloudflare Access but isn't on the allow-list now sees the server's "User x@y.com not authorized" message on the auth-required screen instead of a generic "failed to load: 403". Coverage: add direct tests for `_jwks_client` and `_verify_cf_access_jwt` (covers the JWKS construction, valid-token, pyjwt-error, and missing-email branches the integration tests skipped via wholesale mocking) and parsing tests for admin_allowed_emails. Brings the new code's patch coverage to 100%. Cloud Run: pass ADMIN_ALLOWED_EMAILS via a substitution and use gcloud's ^@^ delimiter escape so commas inside the value don't break --set-env-vars. https://claude.ai/code/session_01W5oLzRRyQyoc1t9CRSV4GY
| if cf_access_jwt: | ||
| email = _verify_cf_access_jwt(cf_access_jwt) | ||
| if email and email in settings.admin_allowed_emails: | ||
| return | ||
| if email: | ||
| raise HTTPException(status_code=status.HTTP_403_FORBIDDEN, detail=f"User {email} not authorized") | ||
| # Invalid JWT (signature/aud/iss/expiry) — fall through to token path so |
There was a problem hiding this comment.
When Cf-Access-Jwt-Assertion is present and _verify_cf_access_jwt returns an email that is not allow-listed, require_admin raises 403 before checking X-Admin-Token. This means the documented “token fallback” cannot work from a browser session that is signed into Cloudflare Access with the wrong account (the edge will still inject the JWT header). Consider allowing a valid X-Admin-Token to override the 403 case (e.g., only raise 403 if the admin token is missing/invalid), so the fallback truly behaves as an OR condition.
| if not stripped: | ||
| return [] | ||
| if stripped.startswith("["): | ||
| return json.loads(stripped) |
There was a problem hiding this comment.
ADMIN_ALLOWED_EMAILS parsing uses json.loads for any string starting with [. If an operator accidentally provides a malformed JSON array, the app will fail fast with a JSON decode error at startup. Consider catching json.JSONDecodeError and either (a) raising a clearer ValueError mentioning ADMIN_ALLOWED_EMAILS format, or (b) falling back to comma-splitting when JSON parsing fails.
| return json.loads(stripped) | |
| try: | |
| return json.loads(stripped) | |
| except json.JSONDecodeError as exc: | |
| raise ValueError( | |
| "Invalid ADMIN_ALLOWED_EMAILS format. Use either a JSON array " | |
| 'like ["a@b.com", "c@d.com"] or a comma-separated string like ' | |
| '"a@b.com,c@d.com".' | |
| ) from exc |
| # in _ADMIN_ALLOWED_EMAILS), the JWT path in require_admin denies all and | ||
| # only the X-Admin-Token fallback is active. |
There was a problem hiding this comment.
The comment says that with no email in _ADMIN_ALLOWED_EMAILS “only the X-Admin-Token fallback is active”. With the current require_admin logic, a valid Cloudflare JWT with an unlisted email yields a 403 and does not fall through to the token path, so the token fallback is not active for requests that include Cf-Access-Jwt-Assertion. Please update the comment to match the implemented behavior (or adjust the code to match the comment).
| # in _ADMIN_ALLOWED_EMAILS), the JWT path in require_admin denies all and | |
| # only the X-Admin-Token fallback is active. | |
| # in _ADMIN_ALLOWED_EMAILS), the JWT path in require_admin does not | |
| # authorize anyone. Requests that include a Cloudflare JWT but fail the | |
| # email allowlist check are denied with 403 rather than falling back to | |
| # X-Admin-Token. |
| def test_parses_comma_separated_string(self) -> None: | ||
| from core.config import Settings | ||
|
|
||
| s = Settings(admin_allowed_emails="a@x.com,b@y.com, c@z.com ") | ||
| assert s.admin_allowed_emails == ["a@x.com", "b@y.com", "c@z.com"] | ||
|
|
||
| def test_parses_json_array(self) -> None: | ||
| from core.config import Settings | ||
|
|
||
| s = Settings(admin_allowed_emails='["a@x.com","b@y.com"]') | ||
| assert s.admin_allowed_emails == ["a@x.com", "b@y.com"] | ||
|
|
||
| def test_empty_string_yields_empty_list(self) -> None: | ||
| from core.config import Settings | ||
|
|
||
| s = Settings(admin_allowed_emails="") | ||
| assert s.admin_allowed_emails == [] | ||
|
|
||
| def test_default_is_empty_list(self) -> None: | ||
| """Defaulting to an empty list prevents accidental authorization if the | ||
| operator forgets to configure ADMIN_ALLOWED_EMAILS in production.""" | ||
| from core.config import Settings | ||
|
|
||
| s = Settings() | ||
| assert s.admin_allowed_emails == [] |
There was a problem hiding this comment.
These Settings() assertions can become environment-dependent because SettingsConfigDict loads from .env and process env vars. To keep the tests hermetic, wrap them in patch.dict(os.environ, {}, clear=True) (as done in tests/unit/core/test_config.py) and/or instantiate with _env_file=None.
| def test_status_403_when_jwt_email_not_allowed(self, auth_client) -> None: | ||
| """Valid JWT but email not on allow-list → 403, no fall-through.""" | ||
| with ( | ||
| patch.object(settings, "admin_token", "supersecret"), | ||
| patch.object(settings, "admin_allowed_emails", [self._ALLOWED]), | ||
| patch("api.routers.debug._verify_cf_access_jwt", return_value=self._DENIED), | ||
| ): | ||
| response = auth_client.get("/debug/status", headers={"Cf-Access-Jwt-Assertion": "any.jwt.here"}) | ||
| assert response.status_code == 403 | ||
| assert self._DENIED in response.json()["message"] | ||
|
|
||
| def test_status_503_when_jwt_invalid_and_token_unset(self, auth_client) -> None: | ||
| """Invalid JWT + admin_token unset → falls through to 503 (fail-closed).""" | ||
| with ( | ||
| patch.object(settings, "admin_token", None), | ||
| patch("api.routers.debug._verify_cf_access_jwt", return_value=None), | ||
| ): | ||
| response = auth_client.get("/debug/status", headers={"Cf-Access-Jwt-Assertion": "garbage"}) | ||
| assert response.status_code == 503 | ||
|
|
||
| def test_status_200_when_jwt_invalid_but_token_correct(self, auth_client) -> None: | ||
| """Invalid JWT + correct X-Admin-Token → 200 (break-glass fall-through).""" | ||
| mock_repo = MagicMock() | ||
| mock_repo.get_all = AsyncMock(return_value=[]) | ||
| with ( | ||
| patch.object(settings, "admin_token", "supersecret"), | ||
| patch("api.routers.debug._verify_cf_access_jwt", return_value=None), | ||
| patch("api.routers.debug.SpecRepository", return_value=mock_repo), | ||
| ): | ||
| response = auth_client.get( | ||
| "/debug/status", headers={"Cf-Access-Jwt-Assertion": "garbage", "X-Admin-Token": "supersecret"} | ||
| ) | ||
| assert response.status_code == 200 |
There was a problem hiding this comment.
If require_admin is updated so a correct X-Admin-Token can override the “JWT email not allow-listed” case, add a test that sends both Cf-Access-Jwt-Assertion (verifying to a denied email) and a correct X-Admin-Token and asserts 200. This guards the intended “token fallback” semantics.
## Summary - PR #5522 left the API deploy step broken because `--set-env-vars=^@^ADMIN_ALLOWED_EMAILS=…` collides with the `@` in any email value (`Bad syntax for dict arg: [gmail.com]`). - Multiple `--set-env-vars` flags in one `gcloud run deploy` call also clobber each other, so most env vars were silently dropped — only the last flag was preserved. - Consolidate into a single `--set-env-vars` with `^|^` as the alt delimiter. `|` is absent from all our env values, multi-email lists keep working. ## Why this happened The PR #5522 review tested with `_ADMIN_ALLOWED_EMAILS=""` (empty), so the `@` collision never triggered. The recent re-deploy with `meakeiok@gmail.com` set in trigger substitutions surfaced the bug. ## Test plan - [x] `gcloud run deploy ... --set-env-vars=^|^A=1|B=2` parses correctly (verified locally with `--dry-run`-style arg parsing) - [ ] Cloud Build redeploys cleanly after merge - [ ] `curl -sI https://api.anyplot.ai/debug/ping` returns 302 to `*.cloudflareaccess.com` (CF Access path active) 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… startup (#5525) ## Summary After PR #5524 was merged, the API container failed to start with: \`\`\` pydantic_settings.exceptions.SettingsError: error parsing value for field \"admin_allowed_emails\" from source \"EnvSettingsSource\" \`\`\` PR #5522 added a \`field_validator(mode=\"before\")\` to accept both JSON-array and comma-separated env values, but pydantic-settings runs its **own** JSON pre-decode for any \`list\`-typed field **before** field validators run. Since \`meakeiok@gmail.com\` is not valid JSON, pre-decode raised \`SettingsError\` before our validator was reached. ## Fix \`Annotated[list[str], NoDecode]\` opts the field out of pydantic-settings' JSON pre-decode, so the existing validator handles both forms. ## Verified locally (against latest main + this fix) | Env value | Parsed | |-----------|--------| | \`meakeiok@gmail.com\` (the failing case) | \`['meakeiok@gmail.com']\` | | \`[\"a@b.com\",\"c@d.com\"]\` (JSON form) | \`['a@b.com', 'c@d.com']\` | | \`a@b.com,c@d.com\` (multi-email comma) | \`['a@b.com', 'c@d.com']\` | | (unset) | \`[]\` | Existing test suite: \`34 passed\` in \`tests/unit/api/test_debug.py\`. ## Test plan - [x] Local reproduction of pre-fix SettingsError - [x] All four input shapes parse correctly post-fix - [x] \`uv run ruff check core/config.py\` clean - [x] \`pytest tests/unit/api/test_debug.py\` 34/34 pass - [ ] Cloud Build deploys without the SettingsError after merge - [ ] Container starts and \`/debug/ping\` is reachable 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
) ## Why After PR #5522/#5524/#5525, the deploy pipeline works and CF Access protects /debug. But cross-origin from \`anyplot.ai\` (frontend) to \`api.anyplot.ai\` (API) cannot be solved via the CF dashboard: - CF Access cookies are emitted as host-only (\`Set-Cookie\` has no \`Domain=\` attribute) regardless of whether the application uses specific subdomains, wildcards, or a mix. The legacy \"share cookie across subdomains\" toggle no longer exists in the UI, and the available cookie controls (HttpOnly / Binding / SameSite / Path) don't change the cookie domain. - After Google login on \`anyplot.ai\`, the SPA's fetch to \`api.anyplot.ai\` triggers a fresh CF Access challenge. fetch follows the cross-origin 302 to \`*.cloudflareaccess.com\`, which doesn't return CORS headers for our origin, so fetch errors with \`TypeError: Failed to fetch\`. The user sees this in DebugPage as \"failed to load\" until they hard-reload. ## What Move all browser→API traffic for \`/debug\` to same-origin under \`anyplot.ai\`: 1. **This PR**: \`_VITE_API_URL\` build-arg switches from \`https://api.anyplot.ai\` to \`/api\`. The SPA now does \`fetch('/api/debug/status', { credentials: 'include' })\` which is same-origin → the anyplot.ai CF cookie is sent automatically. 2. **Cloudflare side (manual setup)**: a Worker bound to route \`anyplot.ai/api/*\` strips the \`/api\` prefix and forwards to \`api.anyplot.ai/*\`. CF Access covers \`anyplot.ai/debug*\` and \`anyplot.ai/api/debug*\`, no longer covers \`api.anyplot.ai\` — backend's existing JWT validation in \`require_admin\` is the gate for direct API hits. ## Impact - Local dev: unchanged (fallback to \`http://localhost:8000\` when env var is unset). - Cross-origin CORS: no longer needed for \`/debug\` traffic; the backend's existing CORS config keeps working for any other origin that might still talk to \`api.anyplot.ai\` directly. - Latency: one extra hop through CF Worker (~5–10 ms within Cloudflare's network), negligible. ## Test plan - [x] \`git diff\` shows only the substitution change + comment - [ ] Cloudflare Worker deployed and route bound to \`anyplot.ai/api/*\` - [ ] CF Access Application updated to cover \`anyplot.ai/debug*\` and \`anyplot.ai/api/debug*\` (no api.anyplot.ai) - [ ] After merge: Cloud Build redeploys frontend; \`curl -sI https://anyplot.ai/api/debug/ping\` returns 302 to CF Access login (when not authenticated) - [ ] In browser: open https://anyplot.ai/debug → Google login → page loads with data, no \"Failed to fetch\" 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
require_admin(api/routers/debug.py) now accepts a Cloudflare Access JWT (Cf-Access-Jwt-Assertion) verified against the team's JWKS and an email allow-list, OR the existingX-Admin-Tokenheader as fallback.cf_access_team_domain,cf_access_aud,admin_allowed_emailsand dependencypyjwt[crypto]>=2.10.0.ADMIN_TOKENfrom Secret Manager plusCF_ACCESS_TEAM_DOMAIN/CF_ACCESS_AUDenv vars (api/cloudbuild.yaml).credentials: 'include'so the.anyplot.aicookie reachesapi.anyplot.ai; auth-prompt copy reframes the token as a fallback.TestRequireAdminCfAccesscovering allow / deny / invalid-JWT fall-through. Existing 7 token-path tests unchanged.Why
anyplot.ai/debugcurrently shows "admin auth not configured on server" becauseADMIN_TOKENis unset in Cloud Run — fail-closed by design. The owner wants comfortable mobile access (FaceID / TouchID) and a real identity per request without typing a long token. Bothanyplot.aiandapi.anyplot.aialready sit behind Cloudflare, so Cloudflare Access (Zero Trust, free ≤ 50 users) is the cheapest layer: Google OAuth at the edge, audit log per login, no GCP load balancer / DNS / ingress refactor.Flow
Test plan
uv run --extra test --extra plotting pytest tests/unit— 1323 passeduv run --extra dev ruff check api/routers/debug.py core/config.py tests/unit/api/test_debug.py— cleancd app && yarn type-check— cleancd app && yarn test src/pages/DebugPage.test.tsx— 7 passedanyplot.ai/debug+api.anyplot.ai/debugwith subdomain cookie sharing on, allow-listmeakeiok@gmail.com).ADMIN_TOKENsecret in GCP Secret Manager and grant the Cloud Run runtime SAsecretmanager.secretAccessor._CF_ACCESS_AUDsubstitution inapi/cloudbuild.yamlto the AUD UUID from the Cloudflare Application (currently empty so the JWT path is silently skipped at deploy time — token path still works).curl -sI https://api.anyplot.ai/debug/ping→302to*.cloudflareaccess.com;curl -sI -H "X-Admin-Token: $TOKEN" $CR_URL/debug/ping→200.Rollback
Either disable the Cloudflare Access Application (instant), revert the PR (Cloud Build redeploys old code), or hit the
*.run.appdirect URL withX-Admin-Token(always available — Cloudflare doesn't sit in front of it).https://claude.ai/code/session_01W5oLzRRyQyoc1t9CRSV4GY
Generated by Claude Code