Set up staging environment to stop testing on production

## Problem

Right now everything we develop and test gets pushed to `main`, which is the production branch. Production points at the only Supabase project we have and the only Google OAuth client we have, so any QA we do — uploading a document, signing in with Google, exercising the chat tutor — runs against real prod data and the same `ENCRYPTION_KEY`. Once we have real users this becomes risky: a bad migration, a half-shipped feature, or a buggy ingestion path could corrupt user data or leak it across rows.

We need a staging tier that mirrors production closely enough to catch issues, but is fully isolated so we can sign into Google with a real account, hit a real Postgres, and watch the full deployed behavior without touching user data.

## Goal

Stand up a `dev` → `staging` → `production` promotion flow with environment-separated config, so:
- Local dev and PRs can be QA'd against staging credentials, never prod.
- Schema changes are version-controlled and applied to staging first.
- Prod-encrypted rows can't be decrypted by staging (separate `ENCRYPTION_KEY`) and vice versa.
- Staging is reachable only by the dev team, not the public.

## Proposed plan

### 1. Capture the production schema as migrations
- Install Supabase CLI.
- Run `supabase link` against the prod project, then `supabase db pull` to write the current schema out as a migration file.
- Commit the output under `supabase/migrations/` so the schema is finally in version control.
- Worth doing even before the rest of this issue lands — it's the prerequisite for everything else (Branching needs migrations to apply) and removes a single point of failure (the schema only existing in the prod dashboard).

### 2. Use Supabase Branching for the staging database (primary approach)
- Enable Branching on the existing Supabase project.
- Create a long-lived `staging` branch. Migrations from `supabase/migrations/` get auto-applied on branch creation.
- Optionally enable the GitHub integration so PRs spawn ephemeral preview branches per PR.
- Staging gets its own connection string + service key — same project, different DB.
- Add a `scripts/seed_staging.py` (uses `db/connection.py::table()`) to insert fake users / documents / sessions. Idempotent.

**Caveats to verify before committing:**
- Branching is a paid-plan feature (Pro+) with limited free preview-branch quota — confirm this fits the current plan.
- Auth provider config, storage buckets, and edge function secrets don't always copy cleanly across branches; may need to be re-set per-branch.

**Fallback:** if Branching turns out not to fit (plan cost, missing parity), fall back to a fully separate Supabase project (`sapling-staging`). Same migrations, same seed script, just a separate project in the dashboard.
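
One way to make the seed script idempotent is to derive every row ID deterministically, so re-running the script rewrites the same rows instead of adding new ones. The sketch below is only an illustration under assumptions: the table names, column names, and the in-memory `upsert` are hypothetical stand-ins, and the real script would route writes through `db/connection.py::table()`, whose interface is not shown here.

```python
from __future__ import annotations

import uuid

# Fixed namespace so every run derives the same IDs (the seed string is arbitrary).
SEED_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "staging-seed.invalid")


def seed_id(kind: str, key: str) -> str:
    """Derive a stable UUID from a row's kind and its natural key."""
    return str(uuid.uuid5(SEED_NAMESPACE, f"{kind}:{key}"))


def build_seed_rows() -> dict[str, list[dict]]:
    """Return fake rows grouped by (assumed) table name."""
    users = [
        {"id": seed_id("user", email), "email": email, "name": name}
        for email, name in [
            ("alice@example.test", "Alice Staging"),
            ("bob@example.test", "Bob Staging"),
        ]
    ]
    documents = [
        {
            "id": seed_id("document", f"{user['email']}:notes.pdf"),
            "user_id": user["id"],
            "filename": "notes.pdf",
        }
        for user in users
    ]
    return {"users": users, "documents": documents}


def upsert(store: dict[str, dict[str, dict]], table_name: str, rows: list[dict]) -> None:
    """In-memory stand-in for a database upsert keyed on `id`."""
    table = store.setdefault(table_name, {})
    for row in rows:
        table[row["id"]] = row


def seed(store: dict[str, dict[str, dict]]) -> None:
    """Write all seed rows; running this twice leaves the store unchanged."""
    for table_name, rows in build_seed_rows().items():
        upsert(store, table_name, rows)
```

Because the IDs come from `uuid5` over a natural key rather than from `uuid4`, the second run overwrites the first run's rows, which is what keeps the script safe to re-run after a branch reset.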

### 3. Duplicate (or extend) the Google OAuth client
Two options:
- **Cheap**: add staging redirect URIs (`https://staging.<domain>/api/auth/google/callback`, `…/api/calendar/callback`, plus localhost) to the existing client. Fastest, but prod and staging then share one client ID, client secret, and consent screen.
- **Clean**: create a second OAuth 2.0 client for staging and put its `client_id`/`client_secret` in the staging env file.

### 4. Split env files
Currently there is one `backend/.env` (see `backend/.env.example`) and one `frontend/.env.example`. Add:
- `backend/.env.staging` — staging Supabase URL + service key (from the staging branch), staging Google client, **different** `ENCRYPTION_KEY`, **different** `SESSION_SECRET`.
- `frontend/.env.staging` — staging `NEXT_PUBLIC_SUPABASE_URL`, `NEXT_PUBLIC_SUPABASE_ANON_KEY`, `BACKEND_URL`.
- Either parameterize `docker-compose.yml` with an `ENV_FILE` variable that `env_file` reads, or add a `docker-compose.staging.yml` override that swaps `env_file`.
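
Since the whole point of the split is that certain values differ between environments, a small check can catch a copy-pasted secret before it ships. This is a rough sketch, not existing project code: `SUPABASE_URL` is an assumed variable name, and the parser handles only plain `KEY=value` lines.

```python
from __future__ import annotations

# Keys that must exist in staging and must not match prod.
# SUPABASE_URL is an assumed name; adjust to whatever backend/.env.example uses.
MUST_DIFFER = ("ENCRYPTION_KEY", "SESSION_SECRET", "SUPABASE_URL")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def find_problems(
    prod: dict[str, str],
    staging: dict[str, str],
    keys: tuple[str, ...] = MUST_DIFFER,
) -> list[str]:
    """List keys that are missing from staging or shared with prod."""
    problems = []
    for key in keys:
        if not staging.get(key):
            problems.append(f"{key} is missing from staging")
        elif staging[key] == prod.get(key):
            problems.append(f"{key} is identical in both files")
    return problems
```

A developer could run this locally against `backend/.env` and `backend/.env.staging` by reading each file and passing the text to `parse_env`; an empty list means the two files are separated on the keys that matter.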

### 5. Staging needs its own origin
Staging must be served from a different origin than production. Three reasons:
- Google OAuth redirect URIs are exact-match: separate URIs prevent a misconfigured prod build from completing a staging OAuth flow (and vice versa).
- Cookies are scoped by host: on a shared host, prod and staging session cookies land in the same cookie jar, so one environment can read or overwrite the other's. The cookie `Path` attribute is not a security boundary.
- The frontend bakes `BACKEND_URL` at build time: staging needs its own frontend deploy pointing at the staging backend.

Use a subdomain (`staging.<domain>`) once DNS is set up; until then, the Cloudflare Pages preview URL (`staging.<project>.pages.dev`) works fine. **Path-prefix staging (`/staging/...`) is not a substitute** because of cookie scoping.
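
To make the origin argument concrete: an origin is the (scheme, host, port) triple, and the path plays no part in it. The sketch below compares origins with the standard library; the `example.com` hostnames are placeholders, not the project's real domains.

```python
from __future__ import annotations

from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> tuple[str, str, int]:
    """Return the (scheme, host, port) triple that defines a URL's origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    # Fall back to the scheme's default port when none is given explicitly.
    port = parts.port or DEFAULT_PORTS.get(scheme, 0)
    return (scheme, host, port)


def same_origin(a: str, b: str) -> bool:
    """True when both URLs share scheme, host, and port; paths are ignored."""
    return origin(a) == origin(b)
```

Under this definition `https://app.example.com/staging/login` and `https://app.example.com/login` are the same origin, while `https://staging.example.com/login` is a different one, which is why a path prefix gives no isolation and a subdomain does.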

### 6. Lock staging behind Cloudflare Access (Zero Trust)
The staging hostname must not be publicly reachable. Approach:
- Add the staging hostname (frontend and backend, both) as an Access application in Cloudflare Zero Trust.
- Policy: require Google SSO + an allowlist of dev emails (or a Google Workspace group).
- Free tier covers up to 50 users — sufficient for the dev team.
- The check happens at Cloudflare's edge, so unauthorized requests never reach the app.

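**Gotcha:** Cloudflare Access intercepts every request to the protected hostname, including the Google OAuth callback (`/api/auth/google/callback`). The callback is a browser redirect, so a developer who has already passed Access should arrive with a valid Access session and go straight through; using Google as the Access identity provider keeps this to a single Google sign-in (usually no added friction). Verify this on the first staging deploy, and if the callback is blocked, add a bypass policy for that path.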

What we are explicitly *not* doing for access control: relying on an unguessable URL, `robots.txt`, an app-level email allowlist as the only gate, or "we just don't share the link." Those are obscurity, not access control.

### 7. Branch + deploy flow
- Create a long-lived `staging` git branch.
- Wire Cloudflare Pages (frontend) and the backend host to also deploy `staging` to the staging origin using the staging env files.
- New flow: feature branch → PR into `staging` → manual QA on staging (signed in via Cloudflare Access) with a real Google login against the staging Supabase branch → PR `staging` into `main`.
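
The "only `staging` merges into `main`" rule can be enforced by a CI check rather than by convention. The following is a minimal sketch assuming GitHub Actions, which sets `GITHUB_BASE_REF` and `GITHUB_HEAD_REF` on pull request events; the function names and the idea of running it as a required check are assumptions, not existing project setup.

```python
from __future__ import annotations

import os
import sys


def merge_allowed(base_ref: str, head_ref: str) -> bool:
    """Allow PRs into main only when they come from the staging branch."""
    if base_ref == "main":
        return head_ref == "staging"
    # PRs into staging or any other branch are unrestricted.
    return True


def main() -> int:
    """Return a process exit code: 0 when the PR is allowed, 1 otherwise."""
    base = os.environ.get("GITHUB_BASE_REF", "")
    head = os.environ.get("GITHUB_HEAD_REF", "")
    if merge_allowed(base, head):
        return 0
    print(f"PRs into main must come from staging, not {head!r}", file=sys.stderr)
    return 1
```

A workflow step would call `sys.exit(main())` and the check would be marked as required on `main`, so a feature branch opened directly against `main` fails before review.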

### 8. Local "real" mode
For local dev that hits real services with zero blast radius on prod, point `backend/.env` at the staging Supabase branch + staging Google client. (`NEXT_PUBLIC_LOCAL_MODE=true` in the frontend is mock-only and not a substitute for this.)

## Suggested first PR

A small, mergeable starting point:
- `supabase/migrations/0001_initial.sql` from the prod pull.
- `backend/.env.staging.example` and `frontend/.env.staging.example` (committed, no secrets).
- `docker-compose.staging.yml` override.
- A short `docs/decisions/0017-staging-environment.md` ADR recording the split (Supabase Branching as primary, Cloudflare Access as the gate).

Subsequent PRs handle: enabling Branching + creating the `staging` branch, OAuth wiring, Cloudflare Access policy, and deploy config.

## Out of scope

- Full IaC (Terraform/Pulumi) for the Supabase project — useful eventually, overkill for now.
- Secrets manager (Doppler / 1Password / sops) — nice to have, not blocking.
- Automated end-to-end tests against staging — separate issue.

## Acceptance criteria

- [ ] Production schema is captured under `supabase/migrations/` and applied to a Supabase `staging` branch (or fallback staging project).
- [ ] Staging is reachable at its own origin (subdomain or Pages preview URL), separate from production.
- [ ] Staging origin is gated by Cloudflare Access — only allowlisted dev emails / Workspace group can reach it.
- [ ] Staging deploys use a separate `ENCRYPTION_KEY`, `SESSION_SECRET`, Supabase connection, and Google OAuth redirect URIs.
- [ ] A developer can sign in with a real Google account against staging and exercise upload → chat without touching prod data.
- [ ] `main` no longer receives un-QA'd changes; merges go via `staging`.
