Skip to content

ci(migrations): bump CLI pin to 2.95.4 + drop atomic-parser workaround#43

Merged
fyoosh merged 1 commit intomainfrom
ci/bump-cli-2.95-and-drop-parser-workaround
Apr 30, 2026
Merged

ci(migrations): bump CLI pin to 2.95.4 + drop atomic-parser workaround#43
fyoosh merged 1 commit intomainfrom
ci/bump-cli-2.95-and-drop-parser-workaround

Conversation

@fyoosh
Copy link
Copy Markdown
Contributor

@fyoosh fyoosh commented Apr 30, 2026

Summary

Reconciles the CI pin with local + prod-push CLI (all now 2.95.4) and removes the DO-block REVOKE+GRANT wrappers from migrations 20260430000002 and 20260430000004. Wrappers were added in PR #41 to dodge a CLI v2.90 parser bug (supabase/cli#5064 — splitter mishandled the substring "atomic" in claim_pairing_atomic); upstream fix shipped in v2.91.1, so the workaround is now dead weight at the pinned version.

The pin bump and the unwrap ship together because the unwrap is only safe at CLI ≥ 2.91.1. Downgrading the pin below that re-opens the SQLSTATE 42601 path.

Test plan

  • supabase db reset --local at CLI 2.95.4 — 5/5 clean runs after unwrap, no 42601
  • YAML lint of workflow file
  • This PR's own CI run exercises the bumped pin against the unwrapped migrations end-to-end

Notes for reviewer

  • Squash will carry both the pin bump and the unwrap as one historical event — see commit message for full archeology (deterministic trigger, fix reference, downgrade hazard).
  • Audit issue P7 originally hypothesised an "inconsistent" parser trigger; investigation shows it was deterministic and keyed on the substring "atomic". The CLAUDE.md note now reflects that.
  • Background: the previous PR (ci(migrations): gate PRs on supabase db reset at pinned CLI 2.90 #42) was supposed to ship at 2.95.4 but the squash carried only the first commit on the branch. This PR closes that gap.

…karound

Two changes that belong together because the second is only safe given the first:

1. **Bump `.github/workflows/migration-smoke.yml` pin from `2.90.0` to `2.95.4`.** PR #42's squash-merge carried only the first commit on its branch (the original 2.90.0 ship) and dropped the in-branch bump to 2.95.4, leaving the CI pin at 2.90.0 while the local Homebrew install and the laptop running `supabase db push` had already moved to 2.95.4. This commit reconciles the three contexts; CLI parser behaviour is now uniform across local / CI / prod.

2. **Drop the DO-block REVOKE+GRANT wrappers from migrations 20260430000002 and 20260430000004**, restoring the bare-statement form. Header comments rewritten as a historical note (the CLI bug, the substring "atomic" trigger, the fix in v2.91.1 `supabase/cli#5064`, why the wrapper is no longer needed, why the >= 2.91.1 pin is now load-bearing).

## Why the workaround is dead weight

The workaround was added in PR #41 against a CLI v2.90 parser bug that hit production via PR #40. The audit (`docs/audits/2026-04-29-server-helpers.md` issue P7) hypothesised an "inconsistent" parser trigger and treated the DO-block as defence-in-depth.

Investigation against `supabase/cli#5064` shows the trigger was actually **deterministic**: the SQL splitter treated any occurrence of the substring `atomic` inside an identifier as the start of a `BEGIN ATOMIC ... END;` block, then bundled subsequent statements into a single prepared-statement Parse message and tripped Postgres SQLSTATE 42601. Our `claim_pairing_atomic` function name is the trigger; the "inconsistency" was actually whether a multi-statement file was present, which only became true after branch-review added REVOKE statements to the originally single-GRANT file.

Verification at CLI 2.95.4 against current main migrations: `supabase db reset --local` run 5x consecutively after unwrapping both DO blocks. All 5 runs applied 38/38 migrations cleanly with no 42601. Combined with the explicit upstream fix in v2.91.1, this is conclusive — the workaround has zero remaining value at the pinned CLI version.

## Why combine with the pin bump

The DO-block drop is only safe at CLI >= 2.91.1. Shipping the unwrap without also pinning >= 2.91.1 would re-open the production-failure path the moment a contributor on a stale CLI runs the migrations. Pinning the bumped version in the same commit makes the dependency unambiguous and keeps `git log` clean: one diff, one rationale.

## Test plan

- [x] `supabase db reset --local` at CLI 2.95.4 — 5/5 clean runs after unwrap, no 42601.
- [x] YAML lint of workflow file (`python3 -c "import yaml; yaml.safe_load(...)"`).
- [ ] CI run on this PR will exercise the bumped pin against the unwrapped migrations against the current full migration history.

## CLI version downgrade hazard

If a contributor downgrades local CLI below 2.91.1 (or a future workflow refactor accidentally bumps the pin DOWN), migrations 02 and 04 will start failing again with 42601. The header comments call this out explicitly, and CLAUDE.md "Build Commands" now says "pin must stay >= v2.91.1". Worth a future hook that asserts CLI >= 2.91.1 at `supabase` invocation time, but out of scope here.

Refs: supabase/cli#5064, PR #41 (workaround), PR #42 (initial CI gate)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@vercel
Copy link
Copy Markdown

vercel Bot commented Apr 30, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
web Ready Ready Preview, Comment Apr 30, 2026 3:39pm

@fyoosh fyoosh merged commit b00b946 into main Apr 30, 2026
3 checks passed
@fyoosh fyoosh deleted the ci/bump-cli-2.95-and-drop-parser-workaround branch April 30, 2026 15:44
fyoosh added a commit that referenced this pull request Apr 30, 2026
Replaces the per-row `Promise.all(UPDATE)` loop in `processSync()` with
a single Postgres function call. Same-batch deletes now collapse to one
round-trip regardless of count.

## The problem

The previous shape fired N round-trip UPDATEs per sync, where N = count
of highlights the device flagged as deleted. The validator caps N at
500 deletes per book × 50 books = 25,000 statements per request worst
case. Realistic per-sync N is much smaller (tens), but at the project's
1k concurrent user scaling target (CLAUDE.md "Scaling Target") even a
modest N saturates the connection pool: 10 deletes × 1k concurrent
users = 10,000 concurrent UPDATEs in flight. Free-tier Supabase pool
exhausts; paid-tier degrades. This is the architectural class of
trade-off the scaling target is designed to reject — the design must
survive 2-10x scale without rewrite.

## The fix

Single function `public.soft_delete_highlights(p_user_id, p_now,
p_rows jsonb)` that takes the delete set as JSONB and runs ONE UPDATE
joined against `jsonb_array_elements(p_rows)`. Mirrors the existing
RPC patterns: `claim_pairing_atomic` (PR #40) and
`increment_transfer_attempt` (transfer.ts).

Function attributes:
  - SECURITY INVOKER — caller (service_role) carries the privilege,
    not the function definer. Matches repo convention for RPCs that
    don't need to escalate.
  - SET search_path = public — hardens against search_path injection
    (same pattern as the audit's harden_handle_new_user_search_path
    migration).
  - REVOKE-from-PUBLIC/anon/authenticated, GRANT-to-service_role —
    repo standard for RPCs only the device API path should reach.
  - Returns ROW_COUNT — currently unused, but cheap to expose for
    future observability and lets the test suite assert behaviour.

The pre-RPC semantics are preserved: rows already in the soft-deleted
state (deleted_at IS NOT NULL) are skipped via the same predicate the
old loop used. No resurrection of server-side deletes.

## Why no DO-block wrapper on the GRANT

CLI is pinned >= 2.95.4 (PR #43) and the supabase/cli#5064 "atomic
substring" parser bug was fixed in v2.91.1. The migration's REVOKE+
GRANT block is bare statements; CI gate proves it parses. Function
name `soft_delete_highlights` doesn't even contain the "atomic"
substring, so it would have parsed cleanly at v2.90 too — but the
pin floor is the load-bearing guarantee, not the function name.

## Test plan

- [x] `npx vitest run` — 267/267 pass. New test group
  `processSync soft-delete RPC batching` adds 4 tests:
    - exactly one RPC call per sync regardless of book/delete count,
      AND zero `.from("highlights").update(...)` calls (proves the
      loop is gone, not just shadowed)
    - RPC args match `{ p_user_id, p_now, p_rows: [...] }` shape
    - no RPC fires when payload has no deletes
    - error response from RPC propagates as `Failed to soft-delete
      highlights: …`
- [x] `npm run check` — 0 errors (14 pre-existing rune warnings,
  unrelated to this change)
- [x] `supabase db reset --local` at CLI 2.95.4 — 39 migrations apply
  clean including 20260430000005
- [ ] EXPLAIN ANALYZE against a seeded user with 100 deletes — left
  for ops verification on a stage env; behaviour is asserted by
  test #1 above

## Mock instrumentation

`tests/helpers.ts` gains `_rpcCalls` and `_updateCalls` arrays so
tests can assert call count + payload shape. Mirrors the existing
`_upsertCalls` instrumentation. No behavioural change to existing
tests; the new arrays are additive.

Refs: docs/audits/2026-04-29-server-helpers.md issue P1, audit PR 3

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
fyoosh added a commit that referenced this pull request Apr 30, 2026
* feat(pairing): denormalize user_email onto pairing_codes

Removes a gotrue round-trip from the device-status poll path. Devices
poll /api/pair/status/* every 3 s; on the success path the prior shape
did:

  1. SELECT claimed/expires_at/user_id FROM pairing_codes
  2. supabase.auth.admin.getUserById(user_id)  -- gotrue round-trip

Step (2) existed solely to fetch users.email so the device could show
the claimer's address in the Cloud submenu for confirmation. Stamping
user_email onto pairing_codes at claim time collapses (2) into (1).

## Why the denorm is safe

Stale only if the user mutates their auth email AFTER claim but BEFORE
the device finishes polling — bounded window (max 5 min, the pairing
TTL) and harmless for the one-time confirmation UX (the user sees the
email they had at claim time, which is the email they expect).

## Why the email comes from the session, not auth.admin

Audit P4 originally suggested an `auth.admin.getUserById` call inside
claim_pairing_atomic (claim-time, not poll-time). The route handler
already has `user.email` from `safeGetSession()`, so we feed it through
directly — saves the gotrue round-trip entirely instead of just
shifting it. `user.email` can be null/undefined for some auth flows;
we coerce to "" at the route boundary so the column gets a stable
string (NOT NULL violation can't fire because the column is nullable
anyway, but the contract is cleaner this way).

## RPC signature change

`claim_pairing_atomic(p_user_id, p_pairing_id, p_token_hash)` →
`claim_pairing_atomic(p_user_id, p_pairing_id, p_token_hash, p_user_email)`.
Postgres treats overloaded signatures as distinct objects, so the
migration DROPs the 3-arg version explicitly before CREATE-ing the
4-arg version. Existing GRANT in 20260430000002 was tied to the 3-arg
signature and dies with the DROP; new GRANT for 4-arg lives in this
migration. CLI is pinned >= 2.95.4 (PR #43) so the function name's
"atomic" substring no longer trips the v2.90 parser bug — bare
REVOKE+GRANT statements are safe.

## Rollback symmetry

`rollback_claim_pairing` updated to clear `user_email` alongside
`claimed` and `user_id` so the row's denorm fields stay consistent with
the claim flag. Signature unchanged → existing GRANT in
20260430000004 survives.

## Test plan

- [x] `npx vitest run` — 268/268 pass (claimPairingCode signature
  update propagated through 12 tests; new defensive test for the
  `user_email IS NULL` case in checkPairingStatus)
- [x] `npm run check` — 0 errors
- [x] `supabase db reset --local` at CLI 2.95.4 — 40 migrations apply
  clean including 20260430000006

Refs: docs/audits/2026-04-29-server-helpers.md issue P4

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(pairing): options-bag claimPairingCode + nullable userEmail

Address review feedback on the email denorm:

- Switch claimPairingCode to an options-bag signature
  ({ userId, userEmail, code }) so adjacent same-type string args
  cannot be silently swapped at the call site. WS-B will add more
  denorm fields here; the bag scales without further re-ordering.
- Type userEmail as `string | null` and pass `user.email ?? null`
  from the route. Empty string is a bad sentinel for "unknown
  email"; NULL expresses it honestly through the stack. Operators
  wiring phone-only or OAuth-without-email-scope auth keep working.
- Collapse the dual `?? ""` fallback to a single conversion site
  in checkPairingStatus; pairing_codes.user_email being NULL is
  now a documented, valid state rather than a defensive defence.
- Note in claim/+server.ts that `user.email` must come from the
  server-validated session, never from the request body — locks
  the security-adjacent wiring against future refactors.
- Trim two over-long comments to align with the project rule of
  comments only when WHY is non-obvious.
- Test polish: introduce a callClaim helper to defang signature
  drift across 12 repeated invocations (the wire-shape test keeps
  its explicit literals so a future arg-order change is loud); set
  user_email: null on the unclaimed mock to mirror real DB shape.

* feat(supabase): backfill + harden pairing user_email migration

- Backfill user_email for any pairing_codes rows already claimed
  before the migration applies. Self-heals within 5 minutes via
  TTL regardless, but the UPDATE closes the seam for a deploy
  that lands during an active pairing window.
- Replace `CREATE OR REPLACE FUNCTION claim_pairing_atomic` with
  plain `CREATE FUNCTION` after the `DROP IF EXISTS` of the
  3-arg signature. The 4-arg version is brand new — `OR REPLACE`
  obscured intent and would silently swallow a re-run that found
  the function still present.
- Update the user_email COMMENT to acknowledge that NULL post-claim
  is a valid steady state (operator may wire phone or OAuth-no-email
  auth flows). The read side continues to surface NULL as empty
  string at the API boundary.
- Point the rollback_claim_pairing redeclaration at the canonical
  asymmetry rationale in 20260430000004_create_rollback_claim_pairing_fn.sql
  so a contributor reading 000006 first via grep does not reinvent
  the "do NOT delete the device row" reasoning.

* test(pairing): route-level wiring test for /api/pair/claim

Cover the two wiring guarantees that the unit-tested business
logic in src/lib/server/pairing.ts cannot enforce alone:

- The route handler MUST forward `user.email` from the
  server-validated session, NEVER from the request body. A future
  refactor that read body.email would let a malicious browser
  display any email on the device's pairing-confirmation screen
  (security-adjacent UX). The test passes a body with a different
  email and asserts the session email wins.
- A session user without an email (phone signup, OAuth without
  email scope) flows through as `null`, not as `""`. This is the
  hand-off point for the read-side NULL → "" conversion in
  checkPairingStatus.
- Plus a 401 baseline so the auth gate is regression-proof.

Adopts the existing tests/routes/transfer-confirm.test.ts mocking
pattern (vi.mock the route's deps before importing it).

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant