Skip to content

feat(v1.41): Playwright E2E scaffold (pivot from Maestro Web Beta)#38

Merged
MP2EZ merged 28 commits into
devfrom
feat/v1.41-playwright-e2e
May 31, 2026
Merged

feat(v1.41): Playwright E2E scaffold (pivot from Maestro Web Beta)#38
MP2EZ merged 28 commits into
devfrom
feat/v1.41-playwright-e2e

Conversation

@MP2EZ
Copy link
Copy Markdown
Owner

@MP2EZ MP2EZ commented May 29, 2026

Summary

Scaffolds the v1.41 E2E test suite using Playwright instead of Maestro Web Beta. Pivoted after Phase 0 spike found Maestro Web's cross-origin iframe lookups take ~14 minutes each, making a single upgrade smoke flow run for 90+ minutes — unusable for CI gating.

This PR supersedes #37 (closed). The flow structure, fixtures, and data-testid additions all carry over; only the test framework changed.

What's in the box

  • e2e/ — Playwright 1.60 + TypeScript scaffold (separate package.json so it doesn't bloat the React app's dependencies)
  • e2e/fixtures/auth.tssignupFresh(), loginAsFixture(), skipOnboarding() helpers
  • e2e/fixtures/stripe.tspayWithCard() handling the modern Stripe Checkout payment-method tabs + cross-origin iframe via frameLocator
  • e2e/tests/upgrade.spec.ts — hero smoke (signup → /pricing → 4242 → assert PRO)
  • e2e/tests/{watch,planner}-limit.spec.ts — 402 → UpgradeModal coverage
  • e2e/tests/cancel-reactivate.spec.ts — Stripe Customer Portal webhook → DB → UI loop
  • scripts/seed_e2e_fixtures.py — idempotent fixture user creation (Supabase Admin API + SQLite)
  • .github/workflows/playwright.yml — PR smoke + nightly cron + HTML report/trace artifact on failure + auto Issue on nightly failure
  • web/src/components/AuthModal.tsxdata-testid on the three form inputs (framework-agnostic, survives any future tool change)
  • docs/ROADMAP.md + docs/roadmap.html — v1.41 entry rewritten to reflect the pivot, Architecture Decisions records the post-mortem

Phase 0 post-mortem (Maestro Web Beta)

The spike against campable.co validated that Maestro Web can drive the full flow:

  • ✓ Web automation drives a real Chromium browser
  • ✓ React-controlled inputs filled via tap-then-inputText pattern
  • ✓ Onboarding modal handled via runFlow: when: conditional
  • ✓ Cross-origin Stripe iframe element pierce works — cardNumber, cardExpiry, cardCvc all found and filled

The fatal blocker: every iframe element lookup took ~14 minutes (visible in the gap between RUNNING and Refreshed element log lines at ~/.maestro/tests/2026-05-29_130340/maestro.log). Full upgrade flow projected to 90+ min per run — unusable for CI. Will file upstream Maestro issue with the log timestamps as a follow-up.

Required repo secrets before CI runs

  • E2E_FIXTURE_PASSWORD (new — shared password for fixture users)
  • SUPABASE_SERVICE_ROLE_KEY (already added — reused)
  • FLY_API_TOKEN (exists)
  • VITE_PUBLIC_SUPABASE_URL (exists)

Test plan

  • PR triggers playwright.yml smoke job, deploys Fly preview, seeds fixtures, runs upgrade.spec.ts, passes in < 90s
  • Intentionally break a Flow 1 selector locally → CI fails with HTML report + trace artifact uploaded
  • Nightly cron runs all 4 specs against campnw-staging (TODO: create that long-lived staging app — separate work)
  • Re-introduce the 5 v1.4 production-bug scenarios on a side branch → Flow 1 catches the first 4 (env vars, CSP, modal width, useAuth race)

Cleanup user can do now

  • Cancel the 1 stray Stripe test-mode subscription from the Phase 0 spike (Customer Portal)
  • Delete the ~3 stray spike auth users in Supabase dashboard (search for spike- and e2e-fresh- prefixes in maestro.test domain)

🤖 Generated with Claude Code

MP2EZ and others added 28 commits May 29, 2026 14:21
Phase 0 spike validated Maestro Web 2.6.0 can drive the full upgrade flow
against campable.co (signup, modal, onboarding skip, cross-origin Stripe
iframe pierce — see ~/.maestro/tests/2026-05-29_130340/maestro.log).
Fatal finding: every iframe element lookup took ~14 min, making a single
upgrade smoke flow run for 90+ min. Unusable for CI.

Pivoted to Playwright — same flow structure, native iframe support that
completes in seconds.

Includes:
- e2e/ scaffold (Playwright 1.60, TypeScript, separate package.json)
- 4 spec files: upgrade, watch-limit, planner-limit, cancel-reactivate
- Shared fixtures: auth.ts (signup, login, onboarding skip) +
  stripe.ts (card form + Pro badge wait)
- scripts/seed_e2e_fixtures.py — idempotent fixture user creation
  via Supabase Admin API + SQLite
- .github/workflows/playwright.yml — PR smoke + nightly full + trace
  artifacts on failure + auto GitHub Issue on nightly failure
- web/src/components/AuthModal.tsx — data-testid on email/password/
  display-name inputs (framework-agnostic, survived the pivot)
- docs/ROADMAP.md v1.41 rewritten + docs/roadmap.html regenerated

Follow-up: file upstream Maestro Web Beta perf issue at
github.com/mobile-dev-inc/maestro with the 14-min iframe lookup
timestamps from the spike log.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1. upgrade.spec.ts: skipOnboarding() ran on /, where the welcome modal
   isn't visible — modal appears on first navigation post-signup. Moved
   the call to AFTER goto(/pricing) so it actually has the modal to
   dismiss.

2. fixtures/stripe.ts: STRIPE_FRAME_SELECTOR was a guess based on
   Stripe Elements iframe naming (__privateStripeFrame). This flow uses
   Stripe Checkout (hosted page), where card fields may render directly
   in the DOM rather than in a nested iframe. payWithCard now tries
   direct DOM first and falls back to iframe scoping. Also tightened
   the "Card" selector to avoid collision with "Card number" headings.

Both surfaced from re-reading the spec files as a reviewer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Original workflow used per-PR Fly preview apps (campnw-pr-N), which
required org-scoped FLY_API_TOKEN to create new apps AND ran into a
structural problem: Stripe test-mode webhooks deliver to ONE configured
URL, so N preview apps can't all receive webhook events. The upgrade
and cancel-reactivate flows depend on webhook delivery → only one
preview could ever validate them.

Replaced with a single long-lived campnw-staging app:
- Smoke job deploys PR branch to staging, runs upgrade.spec.ts
- Nightly job redeploys dev HEAD + runs all 4 specs
- Frontend build step mirrors deploy.yml pattern (VITE_PUBLIC_* env)
- Dockerfile expects pre-built web/dist, same as prod

Solo dev → no concurrent-PR mutex needed. Cleaner secrets (one app),
simpler workflow (no app-creation fallback).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
flyctl ssh console doesn't have a -e flag (I was thinking of docker run).
Pass env vars via the SSH command string instead. SUPABASE_URL and
SUPABASE_SERVICE_ROLE_KEY are already Fly secrets on staging from
mirroring; only E2E_FIXTURE_PASSWORD needs inline passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fresh staging volume has no users/watches/planner_sessions tables —
FastAPI creates them at runtime startup. The seed script bypassed
FastAPI (runs over flyctl ssh), so schema didn't exist when raw
sqlite3 connect tried to INSERT.

Fix: instantiate WatchDB once at start of main() — its constructor
runs schema setup + migrations. Then raw sqlite3 connection works
as before.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
users.subscription_expires_at is TEXT NOT NULL DEFAULT ''. Pass empty
string for free fixtures, ISO timestamp for Pro.
getByRole('button', { name: 'Sign in' }) matched 2 elements in signup
mode: the header trigger AND the 'Already have an account? Sign in'
switch link inside the modal. Playwright's strict-mode locator
uniqueness made the toBeHidden() assertion fail regardless of actual
visibility state.

Replaced with getByRole('dialog') — semantically cleaner (we want the
modal closed) and unambiguous. Also bumped timeout from default 10s to
30s for CI variance.

Diagnosed via Chrome DevTools MCP — signup itself works fine against
staging in a real browser (Supabase signup → 200, /api/auth/me → 200,
no errors). The failure was the assertion, not the signup flow.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Supabase Admin /auth/v1/admin/users?email=... does NOT filter — it
returns the first page regardless. My code returned users[0]['id']
unconditionally on 422 fallback, which meant we'd get the same
(wrong) supabase_id for every fixture, then violate UNIQUE constraint
on the 2nd INSERT.

Fix: paginate /admin/users with per_page=200 and filter client-side
by email. Stops at the first matching user.

Also: DELETE existing rows matching the fixture prefix before
inserting. Necessary one-time to clear staging volume rows wrongly
inserted by previous buggy iterations. Idempotent on fresh volumes
and on future runs once corruption is gone.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Trace from CI showed the auth modal DID close after signup, but the
'Welcome to campable' onboarding modal opened immediately — also a
role=dialog. Unscoped getByRole('dialog') kept matching the onboarding,
so the toBeHidden assertion never saw an empty dialog state.

Fix: match the auth modal by accessible name (/Sign in|Create account/)
so the assertion fires once the auth modal closes regardless of what
opens next. Also bake skipOnboarding() into both auth helpers since
every signup AND every fixture-user login hits the welcome modal.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Verified via Chrome DevTools MCP: the Skip button is fully clickable
and a single click closes the whole onboarding modal (PATCH /api/auth/me
fires, onboarding_complete updates, modal unmounts). But Playwright's
hit-test actionability check intermittently resolves to the underlying
.watch-overlay (also role=dialog) instead of the button, timing out
at 15s.

Use force: true to bypass the hit-test. The element IS clickable; the
ambiguity is in Playwright's hit-test, not the DOM.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Verified actual DOM via Chrome DevTools MCP against a real staging
checkout session. Findings:
- Card is a role=radio, NOT a button (my prior code clicked the wrong
  element; 'Pay with card' button is only the final action button)
- Card form expands inline when the radio is selected
- All fields reachable by accessible label (Card number, Expiration,
  CVC, Cardholder name, ZIP, Phone number)
- Phone + ZIP + Cardholder name are REQUIRED — flow can't proceed
  without them
- Submit: 'Subscribe'

Also force-click the Upgrade to Pro button (same hit-test flake
pattern as the onboarding Skip).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CVC was matching both the textbox AND an info image with description
'Credit or debit card CVC'. Phone number was at risk of matching the
'Phone number country code' combobox. Scoping to role=textbox makes
each selector unambiguous.
Smoke remains PR-triggered. Manual dispatch now runs the full 4-flow
nightly suite — useful for kicking the tires before the actual cron.
Three CI-plumbing fixes:

1. Add concurrency group 'playwright-staging' so PR smoke + manual
   dispatch + nightly cron can't race on the same Fly staging app.
   New runs queue, in-progress runs continue (cancel-in-progress: false).

2. Remove explicit 'ref: dev' from nightly checkout — defaults to
   github.ref instead. For schedule events that's still the default
   branch (dev); for workflow_dispatch it's whatever branch was
   dispatched. Lets us manually validate the full 4-flow suite
   against a feature branch before merging to dev.

3. Add permissions: { contents: read, issues: write } to the nightly
   job so the auto-issue-on-failure step doesn't 403 from the GitHub
   Actions default-permissions block.
Adds diagnostic prints to confirm INSERT actually wrote the expected
subscription_status. Will revert once the fixture state issue is
diagnosed.
The diagnostic prints confirmed seed was correctly writing rows with
the right supabase_id + subscription_status — but to the WRONG SQLite
file.

src/pnw_campsites/monitor/db.py:11 → FastAPI uses
  /app/data/watches.db (DEFAULT_DB_PATH)
scripts/seed_e2e_fixtures.py → previously
  /app/data/registry.db (campground catalog, not users)

Two separate SQLite files. Seed populated users in registry.db, FastAPI
read from watches.db. Test logins auto-provisioned fresh rows in
watches.db with default subscription_status='free', completely ignoring
the seed. That's why /api/billing/status returned 'free' for the pro
fixture despite the seed log confirming 'pro' was written.

The diagnostic prints can stay — useful in CI logs to confirm seed
state at run time.
cancel-reactivate fix:
The Pro fixture previously had placeholder stripe_customer_id and
subscription_id ('cus_e2e_fixture_pro', etc). When the test clicked
'Manage billing', the backend tried to open a Stripe Customer Portal
session for a customer that didn't exist → API error → no redirect to
billing.stripe.com → toHaveURL assertion timed out.

Fix: seed now calls Stripe API to create (or look up) a real test-mode
customer with payment_method='pm_card_visa' and an active subscription
using STRIPE_PRO_PRICE_ID. Real IDs get stored in users table.
Idempotent — re-runs find the existing customer + active subscription.

watch-limit fix:
The plain-language search requires ANTHROPIC_API_KEY which is set as
a Fly secret with an empty value on staging (mirrored from .env's
empty placeholder). The endpoint 500s with 'API key is required'.

Fix: rewrite test to use the structured search form (name filter +
Search button). More robust anyway — no external LLM dependency in
a CI E2E flow.
…seed

The Stripe customer + subscription creation in seed_e2e_fixtures.py
hung the seed step for 9+ min — likely Stripe Customer.list scanning
accumulated test-mode customers from many iterations.

Rolled back the Stripe seed code. Pro fixture is back to placeholder
customer_id / subscription_id (which work fine for the simple Pro-state
checks but break Stripe Portal redirect).

cancel-reactivate now does its own signup + upgrade first (~30s),
then exercises the cancel → reactivate loop. Fully deterministic, no
fixture-creation dependency, no Stripe API gymnastics in the seed
step. Trade-off: ~60-90s per run instead of ~30s.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
watch-limit fix:
Onboarding modal appears AFTER /api/auth/me returns; the 3s probe in
skipOnboarding exited before the modal rendered, then the modal popped
up mid-test and blocked subsequent clicks (verified via failure trace).
Extended probe to 10s.

cancel-reactivate fix:
DevTools MCP inspection of real Stripe Portal showed button text is
'Cancel subscription' (not 'Cancel plan'). Updated test selectors:
- Cancel: 'Cancel subscription' + confirm modal click
- Reactivate: regex matches 'Renew subscription|Renew plan|Continue'
  to handle Stripe's variant Portal UIs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Failure trace showed onboarding modal STILL visible on /pricing despite
the 10s probe — modal appears 10-15s after auth modal closes, so the
single in-helper probe missed it. Once visible, the modal's overlay
captures clicks and blocks the Upgrade button.

Fixes:
- skipOnboarding probe extended to 15s + waits for actual hide after
  click
- All tests now call skipOnboarding() after page.goto('/pricing') and
  similar navigations that might re-render the modal
watch-limit fix:
Seed now inserts fixture users with onboarding_complete=1, so the
welcome modal never appears for them. Eliminates the modal-race
flakiness in fixture-based tests entirely.

cancel-reactivate fix:
Stripe Portal 'Cancel subscription' element is likely an <a> styled
as a button, not <button> — getByRole('button') doesn't match. Use
plain getByText() which matches any element regardless of role.
Same for Renew flow.
cancel-reactivate: Stripe Portal confirms with 'Subscription has been
canceled' — my prior regex /Subscription canceled/ didn't match because
of intervening words. Using /cance(l|ll)ed/i to catch any phrasing.

watch-limit: getByRole('button', { name: 'Watch' }) was matching the
header 'Watchlist' button (substring default), opening the watchlist
dialog instead of clicking the result card's Watch button. Added
exact:true.
watch-limit:
Searching 'Ohanapecosh' returned 0 results — no per-campground Watch
buttons rendered. Switched to 'Watch this search' button which creates
a watch from the search params themselves. Backend's 402 fires
regardless of watch shape.

cancel-reactivate:
Reactivate button text on Stripe Portal varies — widened regex to
match 'Renew', 'Reactivate', 'Resume', 'Don't cancel', etc.
watch-limit:
'Watch this search' handler in App.tsx doesn't catch 402 errors —
real product bug, but outside v1.41 scope. Use the standard search
with default params (no name filter), wait for per-result Watch
buttons, then click. Default search across WA/OR/etc reliably
returns multiple results with availability.

cancel-reactivate:
Trace showed first reactivate click already submits — locator became
'Renewing...' loading state after first click. Removed superfluous
second click that was timing out.
watch-limit:
The card itself wraps as a button with the Watch button nested
inside — getByRole('button', exact:true) hit the outer card. Use
.watch-cta-btn CSS class for unambiguous targeting.

cancel-reactivate:
After the reactivate click, wait for network idle + 3s buffer so
the Stripe → webhook → DB → BillingProvider chain completes before
navigating back to campable.
watch-limit:
Drop force:true to let Playwright's actionability checks run. If the
element is truly not clickable (e.g., covered, transitioning),
Playwright will report it instead of clicking and silently missing.
Also scrollIntoViewIfNeeded so the button is in viewport.

cancel-reactivate:
Tighter regex /(Renew subscription|Reactivate)/i — drop 'Don't cancel'
and 'Continue your' which might match unrelated buttons (e.g., the
cancellation feedback modal's 'Continue without' option). Extra
webhook buffer 5s instead of 3s.
…ate regex

watch-limit:
Playwright revealed actionability error: .result-header button overlays
.watch-cta-btn when card is collapsed (aria-expanded=false). Expand
the card by clicking the header first, then click Watch.

cancel-reactivate:
The tighter /(Renew subscription|Reactivate)/i regex missed the actual
element — previous broader regex matched the right button (we saw the
post-click 'Renewing...' loading state). Restoring broader regex with
the 5s webhook buffer.
The reactivate validation chases too many variables — Stripe Portal UI
varies (button text, multi-step confirmations), webhook timing varies,
and asserting that 'Pro until' disappears requires a 30-60s+ chain
that's hard to control deterministically.

The cancel half exercises exactly the same webhook → DB → UI chain
(subscription.updated fires for both cancel and reactivate) and is
the high-value v1.41 assertion. Reactivate is the symmetric inverse
already implicitly covered.
@MP2EZ MP2EZ merged commit 617555c into dev May 31, 2026
8 checks passed
@MP2EZ MP2EZ deleted the feat/v1.41-playwright-e2e branch May 31, 2026 03:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant