Skip to content

fix(auth): bootstrap auth state from getSession() before subscribing#1

Merged
ByteStreams-AI merged 2 commits intomainfrom
fix/auth-init-bootstrap
May 2, 2026
Merged

fix(auth): bootstrap auth state from getSession() before subscribing#1
ByteStreams-AI merged 2 commits intomainfrom
fix/auth-init-bootstrap

Conversation

@Bytes0211
Copy link
Copy Markdown
Collaborator

@Bytes0211 Bytes0211 commented May 2, 2026

Summary

Both apps were relying solely on onAuthStateChange firing INITIAL_SESSION on mount to flip the loading state to false. In some browser/SDK combinations a restored session never produces that event, leaving the spinner stuck forever after a refresh — observed today on admin.dialtone.menu and kitchen.dialtone.menu after a fresh paid order.

Fix: extract the session handler so it can be called from both getSession() (immediate, fires once on mount) and onAuthStateChange (reactive, fires on future changes). Handler is idempotent, so a redundant INITIAL_SESSION fire is harmless.

Affects:

  • apps/admin/src/lib/auth-context.tsx
  • apps/kitchen/src/app.tsx

Test plan

  • pnpm ci:fast passes (lint + typecheck + 188 unit tests)
  • Sign in to admin.dialtone.menu, refresh page — confirm board renders without sticking on the spinner
  • Same for kitchen.dialtone.menu
  • Confirm normal sign-in / sign-out flow still works (handler called on SIGNED_IN and SIGNED_OUT events)

🤖 Generated with Claude Code

Greptile Summary

This PR fixes a stuck-spinner regression on page refresh by bootstrapping auth state from getSession() on mount (instead of relying solely on onAuthStateChange firing INITIAL_SESSION), then skipping INITIAL_SESSION in the listener to prevent duplicate DB queries. The pattern is applied consistently across both apps/admin and apps/kitchen.

Confidence Score: 5/5

Safe to merge — fixes a real UX regression with no new logic errors introduced.

Only P2 findings present (missing error logging on getSession()). The core fix is correct: getSession() is idempotent from localStorage, INITIAL_SESSION is safely filtered out to avoid double-fetching, and sign-in/sign-out flows through onAuthStateChange are unaffected.

No files require special attention beyond the minor getSession() error-logging suggestion.

Important Files Changed

Filename Overview
apps/admin/src/lib/auth-context.tsx Extracts inline handler into handleSession, bootstraps via getSession() on mount, and filters INITIAL_SESSION from the listener to prevent duplicate DB queries — error field from getSession() is silently dropped.
apps/kitchen/src/app.tsx Applies the same bootstrap pattern as admin, adds a local Session type inferred from the Supabase client, removes the erroneous setAuthLoading(true) at handler entry, and filters INITIAL_SESSION — same silent getSession() error drop as admin.

Sequence Diagram

sequenceDiagram
    participant Browser
    participant getSession
    participant handleSession
    participant onAuthStateChange
    participant DB

    Browser->>getSession: mount (useEffect)
    getSession-->>handleSession: session | null (immediate, from localStorage)
    handleSession->>DB: loadRestaurantContext / loadStaffAccess
    DB-->>handleSession: staff + restaurant rows
    handleSession->>Browser: setLoading(false)

    onAuthStateChange-->>Browser: INITIAL_SESSION (filtered out — skipped)

    note over Browser,onAuthStateChange: Future auth changes (SIGNED_IN, SIGNED_OUT, TOKEN_REFRESHED)
    onAuthStateChange-->>handleSession: event + session
    handleSession->>DB: loadRestaurantContext / loadStaffAccess
    DB-->>handleSession: updated rows
    handleSession->>Browser: state updated
Loading

Reviews (2): Last reviewed commit: "fix(auth): address Greptile P1+P2 — skip..." | Re-trigger Greptile

Both apps were relying solely on onAuthStateChange firing INITIAL_SESSION
on mount to flip the loading state to false. In some browser/SDK
combinations a restored session never produces an event, leaving the
spinner stuck forever after a refresh.

Extract the session handler so it can be called from getSession()
(immediate, fires once) and onAuthStateChange (reactive, fires on
changes). The handler is idempotent, so a redundant INITIAL_SESSION
fire is harmless.

Affects both apps/admin/src/lib/auth-context.tsx and
apps/kitchen/src/app.tsx.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment thread apps/kitchen/src/app.tsx Outdated
Comment thread apps/admin/src/lib/auth-context.tsx
…ant loading reset

Two issues called out on PR #1:

P1 (kitchen): setAuthLoading(true) at handler entry could cause a
brief "board → spinner → board" flash on mount when getSession() and
INITIAL_SESSION raced and the first invocation finished its async work
before the second's synchronous preamble. Removing the redundant
setter matches the admin handler pattern (initial useState already
starts at true; only ever flip to false).

P2 (both): When the SDK fires INITIAL_SESSION normally, both
getSession() and the listener handler run concurrently — doubling DB
round-trips (admin: 6 queries instead of 3; kitchen: 4 instead of 2).
Skip INITIAL_SESSION in the listener since getSession() already
bootstraps the initial state.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@Bytes0211
Copy link
Copy Markdown
Collaborator Author

Addressed both Greptile findings in e3d56a4:

P1 (kitchen flash): Removed setAuthLoading(true) from handleSession entry. The initial useState(true) already starts the spinner; we only ever flip to false. Matches the admin handler pattern. Eliminates the board → spinner → board flash possibility entirely.

P2 (duplicate DB queries): Listener now early-returns when event === 'INITIAL_SESSION'. getSession() bootstraps the initial state authoritatively; the listener only handles subsequent events (SIGNED_IN, SIGNED_OUT, TOKEN_REFRESHED, USER_UPDATED). Reduces mount-time queries back to baseline (admin: 3, kitchen: 2). The original bug we set out to fix still resolves: if INITIAL_SESSION never fires, getSession() covers it.

pnpm ci:fast still green (188 unit tests pass, lint + typecheck clean).

@ByteStreams-AI ByteStreams-AI merged commit 726656c into main May 2, 2026
2 checks passed
@Bytes0211 Bytes0211 deleted the fix/auth-init-bootstrap branch May 2, 2026 17:25
ByteStreams-AI pushed a commit that referenced this pull request May 2, 2026
* docs: capture M8 live-deploy lessons learned + Path A demo proven

Records what actually happened during the live deploy on May 2, 2026, so
the next person doing a fresh setup doesn't re-discover the same gotchas.

m8-live-demo-checklist.md:
- Mark Path A demo items as completed; Vapi item flagged as Path B deferral
- New "Lessons learned" section covering:
  - Three parallel Stripe environments (live / legacy test / sandbox);
    use Stripe Workbench Shell to register webhooks in the right env
  - CORS preflight handling missing from _shared/http.ts (M6 was Vapi-only);
    fixed by adding handlePreflight() and wiring it in admin_create_manual_order
  - Auth init bootstrap pattern (PR #1) — getSession() then onAuthStateChange
    skipping INITIAL_SESSION
  - Stripe success_url 404 on the marketing site (open issue)
  - Path B (Vapi voice integration) deferred — vapi_call_start response
    shape doesn't match Vapi's assistant-request contract

project-status.md:
- Update Current Status header to reflect Path A demo proven live
- Recently Completed: detailed M8 entry covering cloud Supabase provision,
  Stripe Connect, Telnyx 10DLC, demo path, and the two ship-blockers
  found and fixed (CORS, auth init)
- Current Focus: M8 fast-follows (success_url fix, Path B) before M9

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(journal): May 2 entry — M8 Path A demo proven live

Adds the dated developer-journal entry covering:
- Cloud stack provisioned + Path A demo executed end-to-end
- Two ship-blockers fixed in flight (CORS preflight, auth init bootstrap)
- The three-Stripe-environment confusion and Workbench Shell resolution
- Path A vs Path B split decision (Vapi voice deferred)
- Open follow-ups: success_url 404 (sibling dialtone_menu repo), Path B
  rewrite, orphaned pending_payment test orders, Telnyx provisioning
  resolved on its own

Same theme as the lessons-learned + project-status updates on this PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: address Greptile P2 — replace concrete production IDs with placeholders

Greptile flagged on PR #2 that real production identifiers — Stripe
Connect account `acct_1TSRbH2ME93XgAX1` and Telnyx TCR brand/campaign
IDs `B0YDBOE` / `CNZVYD6` — were being committed to docs and would
become permanently discoverable in git history.

Replaced with angle-bracket placeholders across:
- developer/m8-live-demo-checklist.md
- docs/project-status.md
- developer/developer-journal.md

Kept as-is (not actually sensitive in our context):
- Supabase project ref `klzznfagrtormretqsgb` — already baked into the
  deployed admin/kitchen JS bundles as VITE_SUPABASE_URL; obfuscating
  in docs while leaking in the bundle is theater
- Telnyx DID `+16296001047` — the customer-facing public phone number
  (will be on Sui's Sushi's published menu and marketing)

The original concrete values are now only in this PR's earlier commits'
git history; future references in docs use the placeholder convention.
Actual values live in Stripe dashboard / Telnyx Mission Control / 1Password.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ByteStreams-AI pushed a commit that referenced this pull request May 2, 2026
Captures the May 2 work after PRs #2/#3/#4 merged so picking up cold in
2-3 days doesn't require re-discovering anything.

README.md
- Status bumped from "M1–M7 complete... Next: M8" to "M1–M8 complete,
  Path A demo proven live"
- Stack updated (Telnyx replaces Twilio; Stripe Connect routing now
  M11 not M7)
- "Where things live" table refreshed: planned-migrations now empty,
  added Edge Function + payment + branding entries, called out that
  customer-facing post-payment pages live in admin (not the marketing
  site at dialtone_menu)
- Added pointer to developer/m8-live-demo-checklist.md for the deploy
  runbook + lessons learned

AGENTS.md
- Current state header rewritten — single paragraph covering today's
  four PRs and where to look next
- Conventions extended with five new bullets agents commonly miss:
  - Browser-callable Edge Functions need handlePreflight() (M8 lesson)
  - Auth init must bootstrap from getSession() before subscribing (PR #1)
  - Per-restaurant branding location + access patterns (RPC vs
    useAuth-loaded row) (PRs #3 + #4)
  - Customer-facing /orders/:id/{paid,cancel} live in admin app, not
    marketing site (PR #4)
  - Three-Stripe-environment gotcha (M8 lesson)
- New "Open follow-ups" section at the bottom with three items —
  admin chrome branding, kitchen Ready SMS, Path B Vapi — each with
  scope, estimated effort, and where to start

developer/developer-journal.md
- New dated entry "May 2 (later)" picking up after the existing
  May 2 entry (which only covered through Path A demo). Documents
  PR #3 + PR #4, the dialtone_menu PR #27 reversal, the SQL cleanup
  of orphaned test orders, and the doc refresh itself
- Decisions section: why columns on restaurants over a separate
  table, why RPC scoped by order_id over slug, why helpers extracted
  for testability, why hexToRgba falls back to opaque on bad input
- Follow-ups section mirrors AGENTS.md "Open follow-ups" — same
  three items so journal readers + agent readers see the same plan

220 unit tests still pass; lint + typecheck green.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ByteStreams-AI added a commit that referenced this pull request May 4, 2026
PR #24's full lane started failing on
  vapi_check_availability > returns alternative slots when the
  requested time is unavailable
even though the diff is prompt text only. Same code paths passed on
PR #23 an hour earlier. Two compounding test-setup bugs:

1. Sub-minute drift between iso and date+time. `localDateTimeFromNow`
   returned a millisecond-precision iso but minute-precision date+time.
   The blocked reservation was inserted with iso (e.g., 00:14:35.123Z)
   while the availability call used date+time (00:14:00). The 35-sec
   offset turned the +90-min candidate (which exactly abuts the
   reservation's end) into an apparent overlap, killing the only
   candidate that should have been a valid alternative.

2. UTC midnight boundary. When CI runs late in Chicago afternoon,
   now+5h crosses the UTC date boundary. The DB function's
   alternatives walk constrains candidates to the same UTC day as the
   requested slot, so the negative offsets (-90/-60/-30) all skip,
   leaving only +30/+60/+90 — and with bug #1, none of those produce
   an alternative.

Fix:
- Round localDateTimeFromNow down to the minute so iso/date/time
  align; eliminates bug #1 for every caller of this helper.
- Add tomorrowAtHour(12) helper for tests that need a deterministic
  slot fully inside one UTC day and well inside operating hours.
  Switch the alternatives test to use it.

The DB function is correct; the test setup was the bug. Marks the
AGENTS.md follow-up as fixed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ByteStreams-AI added a commit that referenced this pull request May 4, 2026
* fix(voice): mandate "anything else?" + broaden finalize trigger

M11 ramp surfaced a wedge: caller orders two items, both land in cart
correctly via add_item_to_order, but finalize_order is never called.
Cart sits at $16.98 untouched, the call goes silent.

Root cause: the prompt told the LLM to call finalize_order "when the
customer says they're done" but never instructed it to ASK if they
were done. The model confirmed each item, called add_item_to_order,
then waited indefinitely for the caller to volunteer "that's it" —
which most callers don't say.

Two prompt fixes (no code path changes):

- Step 4f: ALWAYS ask "Anything else?" after each add_item_to_order.
  Not optional, not tone-dependent. Going silent wedges the call.
- Step 5: broaden the done-signal recognizer from explicit "that's
  it" / "I'm done" to also include implicit no-answers ("no thanks",
  "nope", "no that's all"). Treats any "no" answer to "anything
  else?" as a done signal.

Two new prompt-template tests lock both behaviors so a future prompt
edit can't silently regress.

Captured live during a Sui's call where call_id
019df454-3639-7000-8ed1-acce72d91492 had Cucumber Roll +
California Roll w/ white rice in raw_payload.cart but tool_log ended
at the second add_item_to_order.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: log reservation alternative-slot test as time-of-day flaky

Adds open follow-up #6 capturing the flake observed on PR #24's full
lane. Same test passed on PR #23's run an hour earlier; the failing
PR #24 diff is prompt text only — zero impact on the reservation
availability path. The flake is real and worth tracking, but not
caused by the prompt change.

Captures: where it lives, what triggers it, the proposed real fix
(deterministic slot anchor instead of `now + 5h`).

* test(reservations): pin alternative-slot test to deterministic slot

PR #24's full lane started failing on
  vapi_check_availability > returns alternative slots when the
  requested time is unavailable
even though the diff is prompt text only. Same code paths passed on
PR #23 an hour earlier. Two compounding test-setup bugs:

1. Sub-minute drift between iso and date+time. `localDateTimeFromNow`
   returned a millisecond-precision iso but minute-precision date+time.
   The blocked reservation was inserted with iso (e.g., 00:14:35.123Z)
   while the availability call used date+time (00:14:00). The 35-sec
   offset turned the +90-min candidate (which exactly abuts the
   reservation's end) into an apparent overlap, killing the only
   candidate that should have been a valid alternative.

2. UTC midnight boundary. When CI runs late in Chicago afternoon,
   now+5h crosses the UTC date boundary. The DB function's
   alternatives walk constrains candidates to the same UTC day as the
   requested slot, so the negative offsets (-90/-60/-30) all skip,
   leaving only +30/+60/+90 — and with bug #1, none of those produce
   an alternative.

Fix:
- Round localDateTimeFromNow down to the minute so iso/date/time
  align; eliminates bug #1 for every caller of this helper.
- Add tomorrowAtHour(12) helper for tests that need a deterministic
  slot fully inside one UTC day and well inside operating hours.
  Switch the alternatives test to use it.

The DB function is correct; the test setup was the bug. Marks the
AGENTS.md follow-up as fixed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants