Problem
Right now everything we develop and test gets pushed to main, which is the production branch. Production points at the only Supabase project we have and the only Google OAuth client we have, so any QA we do — uploading a document, signing in with Google, exercising the chat tutor — runs against real prod data and the same ENCRYPTION_KEY. Once we have real users this becomes risky: a bad migration, a half-shipped feature, or a buggy ingestion path could corrupt user data or leak it across rows.
We need a staging tier that mirrors production closely enough to catch issues, but is fully isolated so we can sign into Google with a real account, hit a real Postgres, and watch the full deployed behavior without touching user data.
Goal
Stand up a dev → staging → production promotion flow with environment-separated config, so:
- Local dev and PRs can be QA'd against staging credentials, never prod.
- Schema changes are version-controlled and applied to staging first.
- Prod-encrypted rows can't be decrypted by staging (separate ENCRYPTION_KEY) and vice versa.
- Staging is reachable only by the dev team, not the public.
Proposed plan
1. Capture the production schema as migrations
- Install Supabase CLI.
- Run supabase db pull against the prod project to dump the current schema.
- Commit the output under supabase/migrations/ so the schema is finally in version control.
- Worth doing even before the rest of this issue lands — it's the prerequisite for everything else (Branching needs migrations to apply) and removes a single point of failure (the schema only existing in the prod dashboard).
2. Use Supabase Branching for the staging database (primary approach)
- Enable Branching on the existing Supabase project.
- Create a long-lived staging branch. Migrations from supabase/migrations/ get auto-applied on branch creation.
- Optionally enable the GitHub integration so each PR spawns its own ephemeral preview branch.
- Staging gets its own connection string + service key — same project, different DB.
- Add a scripts/seed_staging.py (uses db/connection.py::table()) to insert fake users / documents / sessions. Idempotent.
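One way to keep the seed script idempotent is to derive each row's ID deterministically and upsert rather than insert. The sketch below is only an illustration: FakeTable is a hypothetical in-memory stand-in for whatever db/connection.py::table() actually returns, and the namespace UUID, column names, and example emails are all assumptions.

```python
import uuid

# Arbitrary fixed namespace (an assumption) so the same logical record
# always maps to the same ID across runs.
SEED_NAMESPACE = uuid.UUID("00000000-0000-0000-0000-000000000017")


class FakeTable:
    """Hypothetical in-memory stand-in for db/connection.py::table()."""

    def __init__(self):
        self.rows = {}

    def upsert(self, row):
        # Insert-or-replace keyed on the primary key, like an SQL upsert.
        self.rows[row["id"]] = row


def seed_id(kind, natural_key):
    """Derive a deterministic UUID from a record's kind and natural key."""
    return str(uuid.uuid5(SEED_NAMESPACE, f"{kind}:{natural_key}"))


def seed_users(users_table, emails):
    """Upsert one fake user per email; safe to call repeatedly."""
    for email in emails:
        users_table.upsert(
            {"id": seed_id("user", email), "email": email, "is_seed": True}
        )


if __name__ == "__main__":
    users = FakeTable()
    emails = ["alice@example.test", "bob@example.test"]
    seed_users(users, emails)
    seed_users(users, emails)  # second run must not create duplicates
    print(len(users.rows))
```

Running the seed twice leaves the same two rows, which is the property the real script would need against the staging branch.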
Caveats to verify before committing:
- Branching is a paid-plan feature (Pro+) with limited free preview-branch quota — confirm this fits the current plan.
- Auth provider config, storage buckets, and edge function secrets don't always copy cleanly across branches; may need to be re-set per-branch.
Fallback: if Branching turns out not to fit (plan cost, missing parity), use a fully separate Supabase project (sapling-staging) instead. Same migrations, same seed script, just a separate project in the dashboard.
3. Duplicate (or extend) the Google OAuth client
Two options:
- Cheap: add staging redirect URIs (https://staging.<domain>/api/auth/google/callback, …/api/calendar/callback, plus localhost) to the existing client. Mixes scopes but is fastest.
- Clean: create a second OAuth 2.0 client for staging and put its client_id/client_secret in the staging env file.
4. Split env files
Currently there is one backend/.env (see backend/.env.example) and one frontend/.env.example. Add:
- backend/.env.staging: staging Supabase URL + service key (from the staging branch), staging Google client, different ENCRYPTION_KEY, different SESSION_SECRET.
- frontend/.env.staging: staging NEXT_PUBLIC_SUPABASE_URL, NEXT_PUBLIC_SUPABASE_ANON_KEY, BACKEND_URL.
- Either parameterize docker-compose.yml with an ENV_FILE arg, or add a docker-compose.staging.yml override that swaps env_file.
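Once there are two env files per service, it is easy for them to drift (a variable added to one but not the other). A small check like the following could run in CI to compare the variable names. It is a sketch that assumes simple KEY=value dotenv syntax; the function names and the SUPABASE_URL variable name are hypothetical, not taken from the repo.

```python
def parse_env_keys(text):
    """Return the set of variable names defined in dotenv-style text."""
    keys = set()
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name = line.split("=", 1)[0].strip()
        if name.startswith("export "):
            name = name[len("export "):].strip()
        keys.add(name)
    return keys


def env_key_drift(base_text, staging_text):
    """Return (missing_from_staging, only_in_staging) as sorted lists."""
    base = parse_env_keys(base_text)
    staging = parse_env_keys(staging_text)
    return sorted(base - staging), sorted(staging - base)


if __name__ == "__main__":
    # Inline stand-ins for backend/.env.example and
    # backend/.env.staging.example; real use would read the files.
    example = "SUPABASE_URL=\nENCRYPTION_KEY=\nSESSION_SECRET=\n"
    staging_example = "SUPABASE_URL=\nENCRYPTION_KEY=\n"
    missing, extra = env_key_drift(example, staging_example)
    print("missing from staging:", missing)
    print("only in staging:", extra)
```

The check compares names only, never values, so it can run against the committed .example files without touching secrets.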
5. Staging needs its own origin
Staging must be served from a different origin than production. Three reasons:
- Google OAuth redirect URIs are exact-match: separate URIs prevent a misconfigured prod build from completing a staging OAuth flow (and vice versa).
- Cookies are scoped by domain: on a shared domain the browser would send the staging SESSION_SECRET-signed cookie to prod as well, where it could be read or collide with the prod session cookie.
- The frontend bakes BACKEND_URL at build time: staging needs its own frontend deploy pointing at the staging backend.
Use a subdomain (staging.<domain>) once DNS is set up; until then, the Cloudflare Pages preview URL (staging.<project>.pages.dev) works fine. Path-prefix staging (/staging/...) is not a substitute because of cookie scoping.
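The cookie point is easier to see with a toy signed-cookie scheme. The sketch below is not the app's actual session format (that is not specified here); it assumes a plain value.signature cookie signed with HMAC-SHA256, and the secrets are made-up placeholders. It shows that a cookie signed with the staging SESSION_SECRET fails verification under a different prod secret, which is the isolation the separate secret provides on top of the separate origin.

```python
import hashlib
import hmac


def sign(value: str, secret: bytes) -> str:
    """Return 'value.signature' using HMAC-SHA256 over the value."""
    mac = hmac.new(secret, value.encode(), hashlib.sha256).hexdigest()
    return f"{value}.{mac}"


def verify(cookie: str, secret: bytes):
    """Return the value if the signature matches the secret, else None."""
    value, _, mac = cookie.rpartition(".")
    expected = hmac.new(secret, value.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison to avoid leaking how many characters match.
    if hmac.compare_digest(mac.encode(), expected.encode()):
        return value
    return None


if __name__ == "__main__":
    staging_secret = b"staging-secret-placeholder"
    prod_secret = b"prod-secret-placeholder"
    cookie = sign("user-42", staging_secret)
    print(verify(cookie, staging_secret))  # accepted by staging
    print(verify(cookie, prod_secret))     # rejected by prod
```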
6. Lock staging behind Cloudflare Access (Zero Trust)
The staging hostname must not be publicly reachable. Approach:
- Add the staging hostname (frontend and backend, both) as an Access application in Cloudflare Zero Trust.
- Policy: require Google SSO + an allowlist of dev emails (or a Google Workspace group).
- Free tier covers up to 50 users — sufficient for the dev team.
- The check happens at Cloudflare's edge, so unauthorized requests never reach the app.
Gotcha: Cloudflare Access intercepts every request to the protected hostname, including the Google OAuth callback (/api/auth/google/callback). Either configure Access to bypass that path, or use Access's Google identity provider so the user is already signed in to Google by the time the OAuth callback fires (usually no added friction).
What we are explicitly not doing for access control: relying on an unguessable URL, robots.txt, an app-level email allowlist as the only gate, or "we just don't share the link." Those are obscurity, not access control.
7. Branch + deploy flow
- Create a long-lived staging git branch.
- Wire Cloudflare Pages (frontend) and the backend host to also deploy staging to the staging origin using the staging env files.
- New flow: feature branch → PR into staging → manual QA on staging (signed in via Cloudflare Access) with a real Google login against the staging Supabase branch → PR staging into main.
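The promotion rule in this flow (only staging may be merged into main) could be enforced by a CI check rather than by convention. The following is a minimal sketch that assumes the check runs in GitHub Actions on pull_request events, where GITHUB_BASE_REF and GITHUB_HEAD_REF hold the target and source branch names; the merge_allowed helper is hypothetical, and hotfix exceptions are deliberately left out.

```python
import os
import sys


def merge_allowed(base: str, head: str) -> bool:
    """Allow PRs into main only from staging; other targets are unrestricted."""
    if base == "main":
        return head == "staging"
    return True


if __name__ == "__main__":
    # Both variables are empty outside pull_request events, which
    # this sketch treats as "nothing to check".
    base = os.environ.get("GITHUB_BASE_REF", "")
    head = os.environ.get("GITHUB_HEAD_REF", "")
    if not merge_allowed(base, head):
        sys.exit(f"PRs into main must come from staging, not {head!r}")
```

Branch protection on main would still be needed so the check is required; the script only supplies the pass/fail signal.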
8. Local "real" mode
For local dev that hits real services with zero blast radius on prod, point backend/.env at the staging Supabase branch + staging Google client. (NEXT_PUBLIC_LOCAL_MODE=true in the frontend is mock-only and not a substitute for this.)
Suggested first PR
A small, mergeable starting point:
- supabase/migrations/0001_initial.sql from the prod pull.
- backend/.env.staging.example and frontend/.env.staging.example (committed, no secrets).
- docker-compose.staging.yml override.
- A short docs/decisions/0017-staging-environment.md ADR recording the split (Supabase Branching as primary, Cloudflare Access as the gate).
Subsequent PRs handle: enabling Branching + creating the staging branch, OAuth wiring, Cloudflare Access policy, and deploy config.
Out of scope
- Full IaC (Terraform/Pulumi) for the Supabase project — useful eventually, overkill for now.
- Secrets manager (Doppler / 1Password / sops) — nice to have, not blocking.
- Automated end-to-end tests against staging — separate issue.
Acceptance criteria
- Prod schema is captured in supabase/migrations/ and applied to a Supabase staging branch (or fallback staging project).
- Staging and prod use separate values for ENCRYPTION_KEY, SESSION_SECRET, Supabase connection, and Google OAuth redirect URIs.
- main no longer receives un-QA'd changes; merges go via staging.