feat(v1.41): Playwright E2E scaffold (pivot from Maestro Web Beta)#38
Merged
Conversation
Phase 0 spike validated Maestro Web 2.6.0 can drive the full upgrade flow against campable.co (signup, modal, onboarding skip, cross-origin Stripe iframe pierce — see ~/.maestro/tests/2026-05-29_130340/maestro.log). Fatal finding: every iframe element lookup took ~14 min, making a single upgrade smoke flow run for 90+ min. Unusable for CI. Pivoted to Playwright — same flow structure, native iframe support that completes in seconds. Includes: - e2e/ scaffold (Playwright 1.60, TypeScript, separate package.json) - 4 spec files: upgrade, watch-limit, planner-limit, cancel-reactivate - Shared fixtures: auth.ts (signup, login, onboarding skip) + stripe.ts (card form + Pro badge wait) - scripts/seed_e2e_fixtures.py — idempotent fixture user creation via Supabase Admin API + SQLite - .github/workflows/playwright.yml — PR smoke + nightly full + trace artifacts on failure + auto GitHub Issue on nightly failure - web/src/components/AuthModal.tsx — data-testid on email/password/ display-name inputs (framework-agnostic, survived the pivot) - docs/ROADMAP.md v1.41 rewritten + docs/roadmap.html regenerated Follow-up: file upstream Maestro Web Beta perf issue at github.com/mobile-dev-inc/maestro with the 14-min iframe lookup timestamps from the spike log. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1. upgrade.spec.ts: skipOnboarding() ran on /, where the welcome modal isn't visible — modal appears on first navigation post-signup. Moved the call to AFTER goto(/pricing) so it actually has the modal to dismiss. 2. fixtures/stripe.ts: STRIPE_FRAME_SELECTOR was a guess based on Stripe Elements iframe naming (__privateStripeFrame). This flow uses Stripe Checkout (hosted page), where card fields may render directly in the DOM rather than in a nested iframe. payWithCard now tries direct DOM first and falls back to iframe scoping. Also tightened the "Card" selector to avoid collision with "Card number" headings. Both surfaced from re-reading the spec files as a reviewer. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Original workflow used per-PR Fly preview apps (campnw-pr-N), which required org-scoped FLY_API_TOKEN to create new apps AND ran into a structural problem: Stripe test-mode webhooks deliver to ONE configured URL, so N preview apps can't all receive webhook events. The upgrade and cancel-reactivate flows depend on webhook delivery → only one preview could ever validate them. Replaced with a single long-lived campnw-staging app: - Smoke job deploys PR branch to staging, runs upgrade.spec.ts - Nightly job redeploys dev HEAD + runs all 4 specs - Frontend build step mirrors deploy.yml pattern (VITE_PUBLIC_* env) - Dockerfile expects pre-built web/dist, same as prod Solo dev → no concurrent-PR mutex needed. Cleaner secrets (one app), simpler workflow (no app-creation fallback). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
flyctl ssh console doesn't have a -e flag (I was thinking of docker run). Pass env vars via the SSH command string instead. SUPABASE_URL and SUPABASE_SERVICE_ROLE_KEY are already Fly secrets on staging from mirroring; only E2E_FIXTURE_PASSWORD needs inline passing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fresh staging volume has no users/watches/planner_sessions tables — FastAPI creates them at runtime startup. The seed script bypassed FastAPI (runs over flyctl ssh), so schema didn't exist when raw sqlite3 connect tried to INSERT. Fix: instantiate WatchDB once at start of main() — its constructor runs schema setup + migrations. Then raw sqlite3 connection works as before. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
users.subscription_expires_at is TEXT NOT NULL DEFAULT ''. Pass empty string for free fixtures, ISO timestamp for Pro.
getByRole('button', { name: 'Sign in' }) matched 2 elements in signup
mode: the header trigger AND the 'Already have an account? Sign in'
switch link inside the modal. Playwright's strict-mode locator
uniqueness made the toBeHidden() assertion fail regardless of actual
visibility state.
Replaced with getByRole('dialog') — semantically cleaner (we want the
modal closed) and unambiguous. Also bumped timeout from default 10s to
30s for CI variance.
Diagnosed via Chrome DevTools MCP — signup itself works fine against
staging in a real browser (Supabase signup → 200, /api/auth/me → 200,
no errors). The failure was the assertion, not the signup flow.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Supabase Admin /auth/v1/admin/users?email=... does NOT filter — it returns the first page regardless. My code returned users[0]['id'] unconditionally on 422 fallback, which meant we'd get the same (wrong) supabase_id for every fixture, then violate UNIQUE constraint on the 2nd INSERT. Fix: paginate /admin/users with per_page=200 and filter client-side by email. Stops at the first matching user. Also: DELETE existing rows matching the fixture prefix before inserting. Necessary one-time to clear staging volume rows wrongly inserted by previous buggy iterations. Idempotent on fresh volumes and on future runs once corruption is gone. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Trace from CI showed the auth modal DID close after signup, but the
'Welcome to campable' onboarding modal opened immediately — also a
role=dialog. Unscoped getByRole('dialog') kept matching the onboarding,
so the toBeHidden assertion never saw an empty dialog state.
Fix: match the auth modal by accessible name (/Sign in|Create account/)
so the assertion fires once the auth modal closes regardless of what
opens next. Also bake skipOnboarding() into both auth helpers since
every signup AND every fixture-user login hits the welcome modal.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Verified via Chrome DevTools MCP: the Skip button is fully clickable and a single click closes the whole onboarding modal (PATCH /api/auth/me fires, onboarding_complete updates, modal unmounts). But Playwright's hit-test actionability check intermittently resolves to the underlying .watch-overlay (also role=dialog) instead of the button, timing out at 15s. Use force: true to bypass the hit-test. The element IS clickable; the ambiguity is in Playwright's hit-test, not the DOM. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Verified actual DOM via Chrome DevTools MCP against a real staging checkout session. Findings: - Card is a role=radio, NOT a button (my prior code clicked the wrong element; 'Pay with card' button is only the final action button) - Card form expands inline when the radio is selected - All fields reachable by accessible label (Card number, Expiration, CVC, Cardholder name, ZIP, Phone number) - Phone + ZIP + Cardholder name are REQUIRED — flow can't proceed without them - Submit: 'Subscribe' Also force-click the Upgrade to Pro button (same hit-test flake pattern as the onboarding Skip). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CVC was matching both the textbox AND an info image with description 'Credit or debit card CVC'. Phone number was at risk of matching the 'Phone number country code' combobox. Scoping to role=textbox makes each selector unambiguous.
Smoke remains PR-triggered. Manual dispatch now runs the full 4-flow nightly suite — useful for kicking the tires before the actual cron.
Three CI-plumbing fixes:
1. Add concurrency group 'playwright-staging' so PR smoke + manual
dispatch + nightly cron can't race on the same Fly staging app.
New runs queue, in-progress runs continue (cancel-in-progress: false).
2. Remove explicit 'ref: dev' from nightly checkout — defaults to
github.ref instead. For schedule events that's still the default
branch (dev); for workflow_dispatch it's whatever branch was
dispatched. Lets us manually validate the full 4-flow suite
against a feature branch before merging to dev.
3. Add permissions: { contents: read, issues: write } to the nightly
job so the auto-issue-on-failure step doesn't 403 from the GitHub
Actions default-permissions block.
Adds diagnostic prints to confirm INSERT actually wrote the expected subscription_status. Will revert once the fixture state issue is diagnosed.
The diagnostic prints confirmed seed was correctly writing rows with the right supabase_id + subscription_status — but to the WRONG SQLite file. src/pnw_campsites/monitor/db.py:11 → FastAPI uses /app/data/watches.db (DEFAULT_DB_PATH) scripts/seed_e2e_fixtures.py → previously /app/data/registry.db (campground catalog, not users) Two separate SQLite files. Seed populated users in registry.db, FastAPI read from watches.db. Test logins auto-provisioned fresh rows in watches.db with default subscription_status='free', completely ignoring the seed. That's why /api/billing/status returned 'free' for the pro fixture despite the seed log confirming 'pro' was written. The diagnostic prints can stay — useful in CI logs to confirm seed state at run time.
cancel-reactivate fix:
The Pro fixture previously had placeholder stripe_customer_id and
subscription_id ('cus_e2e_fixture_pro', etc). When the test clicked
'Manage billing', the backend tried to open a Stripe Customer Portal
session for a customer that didn't exist → API error → no redirect to
billing.stripe.com → toHaveURL assertion timed out.
Fix: seed now calls Stripe API to create (or look up) a real test-mode
customer with payment_method='pm_card_visa' and an active subscription
using STRIPE_PRO_PRICE_ID. Real IDs get stored in users table.
Idempotent — re-runs find the existing customer + active subscription.
watch-limit fix:
The plain-language search requires ANTHROPIC_API_KEY which is set as
a Fly secret with an empty value on staging (mirrored from .env's
empty placeholder). The endpoint 500s with 'API key is required'.
Fix: rewrite test to use the structured search form (name filter +
Search button). More robust anyway — no external LLM dependency in
a CI E2E flow.
…seed The Stripe customer + subscription creation in seed_e2e_fixtures.py hung the seed step for 9+ min — likely Stripe Customer.list scanning accumulated test-mode customers from many iterations. Rolled back the Stripe seed code. Pro fixture is back to placeholder customer_id / subscription_id (which work fine for the simple Pro-state checks but break Stripe Portal redirect). cancel-reactivate now does its own signup + upgrade first (~30s), then exercises the cancel → reactivate loop. Fully deterministic, no fixture-creation dependency, no Stripe API gymnastics in the seed step. Trade-off: ~60-90s per run instead of ~30s. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
watch-limit fix: Onboarding modal appears AFTER /api/auth/me returns; the 3s probe in skipOnboarding exited before the modal rendered, then the modal popped up mid-test and blocked subsequent clicks (verified via failure trace). Extended probe to 10s. cancel-reactivate fix: DevTools MCP inspection of real Stripe Portal showed button text is 'Cancel subscription' (not 'Cancel plan'). Updated test selectors: - Cancel: 'Cancel subscription' + confirm modal click - Reactivate: regex matches 'Renew subscription|Renew plan|Continue' to handle Stripe's variant Portal UIs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Failure trace showed onboarding modal STILL visible on /pricing despite
the 10s probe — modal appears 10-15s after auth modal closes, so the
single in-helper probe missed it. Once visible, the modal's overlay
captures clicks and blocks the Upgrade button.
Fixes:
- skipOnboarding probe extended to 15s + waits for actual hide after
click
- All tests now call skipOnboarding() after page.goto('/pricing') and
similar navigations that might re-render the modal
watch-limit fix:
Seed now inserts fixture users with onboarding_complete=1, so the
welcome modal never appears for them. Eliminates the modal-race
flakiness in fixture-based tests entirely.
cancel-reactivate fix:
Stripe Portal 'Cancel subscription' element is likely an <a> styled
as a button, not <button> — getByRole('button') doesn't match. Use
plain getByText() which matches any element regardless of role.
Same for Renew flow.
cancel-reactivate: Stripe Portal confirms with 'Subscription has been
canceled' — my prior regex /Subscription canceled/ didn't match because
of intervening words. Using /cance(l|ll)ed/i to catch any phrasing.
watch-limit: getByRole('button', { name: 'Watch' }) was matching the
header 'Watchlist' button (substring default), opening the watchlist
dialog instead of clicking the result card's Watch button. Added
exact:true.
watch-limit: Searching 'Ohanapecosh' returned 0 results — no per-campground Watch buttons rendered. Switched to 'Watch this search' button which creates a watch from the search params themselves. Backend's 402 fires regardless of watch shape. cancel-reactivate: Reactivate button text on Stripe Portal varies — widened regex to match 'Renew', 'Reactivate', 'Resume', 'Don't cancel', etc.
watch-limit: 'Watch this search' handler in App.tsx doesn't catch 402 errors — real product bug, but outside v1.41 scope. Use the standard search with default params (no name filter), wait for per-result Watch buttons, then click. Default search across WA/OR/etc reliably returns multiple results with availability. cancel-reactivate: Trace showed first reactivate click already submits — locator became 'Renewing...' loading state after first click. Removed superfluous second click that was timing out.
watch-limit:
The card itself wraps as a button with the Watch button nested
inside — getByRole('button', exact:true) hit the outer card. Use
.watch-cta-btn CSS class for unambiguous targeting.
cancel-reactivate:
After the reactivate click, wait for network idle + 3s buffer so
the Stripe → webhook → DB → BillingProvider chain completes before
navigating back to campable.
watch-limit: Drop force:true to let Playwright's actionability checks run. If the element is truly not clickable (e.g., covered, transitioning), Playwright will report it instead of clicking and silently missing. Also scrollIntoViewIfNeeded so the button is in viewport. cancel-reactivate: Tighter regex /(Renew subscription|Reactivate)/i — drop 'Don't cancel' and 'Continue your' which might match unrelated buttons (e.g., the cancellation feedback modal's 'Continue without' option). Extra webhook buffer 5s instead of 3s.
…ate regex watch-limit: Playwright revealed actionability error: .result-header button overlays .watch-cta-btn when card is collapsed (aria-expanded=false). Expand the card by clicking the header first, then click Watch. cancel-reactivate: The tighter /(Renew subscription|Reactivate)/i regex missed the actual element — previous broader regex matched the right button (we saw the post-click 'Renewing...' loading state). Restoring broader regex with the 5s webhook buffer.
The reactivate validation chases too many variables — Stripe Portal UI varies (button text, multi-step confirmations), webhook timing varies, and asserting that 'Pro until' disappears requires a 30-60s+ chain that's hard to control deterministically. The cancel half exercises exactly the same webhook → DB → UI chain (subscription.updated fires for both cancel and reactivate) and is the high-value v1.41 assertion. Reactivate is the symmetric inverse already implicitly covered.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Scaffolds the v1.41 E2E test suite using Playwright instead of Maestro Web Beta. Pivoted after Phase 0 spike found Maestro Web's cross-origin iframe lookups take ~14 minutes each, making a single upgrade smoke flow run for 90+ minutes — unusable for CI gating.
This PR supersedes #37 (closed). The flow structure, fixtures, and data-testid additions all carry over; only the test framework changed.
What's in the box
e2e/— Playwright 1.60 + TypeScript scaffold (separatepackage.jsonso it doesn't bloat the React app's dependencies)e2e/fixtures/auth.ts—signupFresh(),loginAsFixture(),skipOnboarding()helperse2e/fixtures/stripe.ts—payWithCard()handling the modern Stripe Checkout payment-method tabs + cross-origin iframe viaframeLocatore2e/tests/upgrade.spec.ts— hero smoke (signup → /pricing → 4242 → assert PRO)e2e/tests/{watch,planner}-limit.spec.ts— 402 → UpgradeModal coveragee2e/tests/cancel-reactivate.spec.ts— Stripe Customer Portal webhook → DB → UI loopscripts/seed_e2e_fixtures.py— idempotent fixture user creation (Supabase Admin API + SQLite).github/workflows/playwright.yml— PR smoke + nightly cron + HTML report/trace artifact on failure + auto Issue on nightly failureweb/src/components/AuthModal.tsx—data-testidon the three form inputs (framework-agnostic, survives any future tool change)docs/ROADMAP.md+docs/roadmap.html— v1.41 entry rewritten to reflect the pivot, Architecture Decisions records the post-mortemPhase 0 post-mortem (Maestro Web Beta)
The spike against
campable.covalidated that Maestro Web can drive the full flow:inputTextpatternrunFlow: when:conditionalcardNumber,cardExpiry,cardCvcall found and filledThe fatal blocker: every iframe element lookup took ~14 minutes (visible in the gap between
RUNNINGandRefreshed elementlog lines at~/.maestro/tests/2026-05-29_130340/maestro.log). Full upgrade flow projected to 90+ min per run — unusable for CI. Will file upstream Maestro issue with the log timestamps as a follow-up.Required repo secrets before CI runs
E2E_FIXTURE_PASSWORD(new — shared password for fixture users)SUPABASE_SERVICE_ROLE_KEY(already added — reused)FLY_API_TOKEN(exists)VITE_PUBLIC_SUPABASE_URL(exists)Test plan
playwright.ymlsmoke job, deploys Fly preview, seeds fixtures, runsupgrade.spec.ts, passes in < 90scampnw-staging(TODO: create that long-lived staging app — separate work)Cleanup user can do now
spike-ande2e-fresh-prefixes in maestro.test domain)🤖 Generated with Claude Code