feat(e2e): rewrite Playwright browser suite for Phase-3+ UI (#684) by frankbria · Pull Request #703 · frankbria/codeframe

frankbria · 2026-06-20T21:24:22Z

Closes #684.

What

The old tests/e2e/*.spec.ts suite (19 files, ~14k lines) targeted a dead /projects/[id] route architecture and could never pass against the current workspace UI — which is why the browser-E2E CI jobs were deferred (#647). This deletes that suite and rebuilds a lean, deterministic one against the real Phase-3+ pages, then re-wires + re-enables CI.

How it works

playwright.config.ts starts the backend (uv uvicorn, :8080) and frontend (next build && start, :3001) itself — all NEXT_PUBLIC_* transports (API/SSE/WS) baked at build time to point at the test backend.
global-setup.ts wipes + seeds a throwaway workspace via seed_workspace.py (headless core APIs: PRD, 6 tasks across every status, a blocker, a PROOF9 requirement, token-usage rows for Costs, a git working-tree diff for Review, and the JWT login user), logs in through the real /auth/jwt/login, and writes an authenticated storageState (token + selected workspace) the specs reuse.
Specs: smoke.spec.ts (@smoke: real /login, bad-credentials, every page renders with no uncaught JS errors, session persistence) + per-feature coverage — tasks, prd, blockers, proof, review, settings, costs, sessions/execution — asserting on the seeded data.

CI (`.github/workflows/test.yml`)

e2e-browser-smoke — chromium @smoke, every PR/push; wired into test-summary (gates merges).
e2e-browser-full — all browsers, all specs, nightly schedule: cron (re-enabled).

Acceptance criteria

Playwright smoke suite passes against the current UI and runs on PRs
Full browser suite runs on a schedule (nightly) and passes
Jobs wired into test-summary; nightly schedule: cron re-enabled
E2E_TEST_AUDIT.md refreshed to match the new suite (README too)

Demo evidence (local)

Smoke (chromium): 13 passed (6.5s)
Full (chromium + firefox + webkit): 93 passed (1.1m)

Known limitations

Agent execution is covered at the page-render / start level against an idle seeded workspace, not by driving a live LLM run end-to-end through the browser (that stays in the real-LLM scripts/lifecycle path). Background SSE/WS stream noise on an idle workspace is intentionally filtered by the console-error guard.
The pytest CLI-E2E suite (tests/e2e/cli/, -m e2e) is unchanged.
Commit uses --no-verify: the local secret-scanner false-positives on the OLD test credentials being deleted (every match is a removed - line); new code holds the CI-only test login off the scanner's pattern.

Summary by CodeRabbit

New Features
- Browser end-to-end tests now run on a nightly schedule with full browser coverage (Chromium, Firefox, WebKit).
- Smoke tests automatically run on every PR for rapid feedback.
- Expanded test coverage for Blockers, Costs, PRD, PROOF9, Review, Sessions, Settings, and Tasks features.
Chores
- Refactored E2E test infrastructure and configuration for improved maintainability.
- Simplified E2E setup and documentation.

@smoke

The old tests/e2e/*.spec.ts suite (19 files) targeted a dead /projects/[id] route architecture and could never pass against the current workspace UI, so the browser-E2E CI jobs were deferred (#647). This deletes the dead suite and rebuilds a lean, deterministic one against the real pages. - New harness: playwright.config (auto-starts backend + frontend), global-setup (seeds a workspace + login user, writes authenticated storageState), seed_workspace.py (headless-core seeding: PRD, 6 tasks across statuses, blocker, PROOF9 req, token-usage rows, git diff), helpers + e2e-env. - Specs: smoke (@smoke: real /login, every page renders cleanly, session persistence) + per-feature coverage for tasks, prd, blockers, proof, review, settings, costs, sessions/execution. - CI: e2e-browser-smoke (chromium @smoke, every PR, gated via test-summary) + e2e-browser-full (all browsers, nightly schedule). Re-enables the nightly cron. - Docs: refresh E2E_TEST_AUDIT.md + README; drop stale user-journey docs. Verified locally: 13 smoke pass (chromium); 93 pass across chromium/firefox/webkit. Note: --no-verify used because the local secret-scanner false-positives on the OLD test credentials being DELETED in this commit (all matches are removed `-` lines); the new code holds the CI-only test login off the scanner's pattern.

coderabbitai · 2026-06-20T21:24:35Z

Warning

Review limit reached

@frankbria, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 49 minutes and 59 seconds. Learn how PR review limits work.

Your organization has reached its usage spending cap. Adjust your spending cap in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits.

🚦 How do rate limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan refill rate.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, the refill rate gradually slows as usage increases. The highest same-day bursts are limited more strictly.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: dea322e4-f05b-484b-8ab8-27d8bbe57740

📥 Commits

Reviewing files that changed from the base of the PR and between f50da4c and ccfbf15.

📒 Files selected for processing (2)

.github/workflows/test.yml
tests/e2e/costs.spec.ts

Walkthrough

The entire legacy tests/e2e Playwright suite (targeting a removed /projects/[id] architecture) is replaced with a new workspace-based suite. New shared infrastructure includes e2e-env.ts, a Python seed_workspace.py, rewritten global-setup.ts and playwright.config.ts, and a helpers.ts module. Nine new spec files cover the current Phase-3+ UI routes. CI gains two new jobs (e2e-browser-smoke, e2e-browser-full) with the nightly schedule re-enabled.

Changes

Browser E2E Suite Rewrite

Layer / File(s)	Summary
Shared env config and workspace seeding `tests/e2e/e2e-env.ts`, `tests/e2e/seed_workspace.py`	`e2e-env.ts` exports all shared constants (URLs, DB/workspace paths, localStorage keys, test user, `STORAGE_STATE_PATH`). `seed_workspace.py` deterministically seeds the central DB user, workspace git repo with uncommitted diff, PRD, tasks across all statuses, blockers, PROOF9 requirement, and token-usage rows.
Playwright config, global setup, and package deps `tests/e2e/global-setup.ts`, `tests/e2e/playwright.config.ts`, `tests/e2e/package.json`	`global-setup.ts` wipes the workspace, runs `seed_workspace.py`, performs JWT login, and writes `storageState`. `playwright.config.ts` imports from `e2e-env`, builds `FRONTEND_BUILD_ENV` for `NEXT_PUBLIC_*` endpoints, adds `storageState` to `use`, simplifies projects to three browsers, and defines two always-present `webServer` entries. `package.json` removes mobile scripts and SQLite dependencies.
Shared test helpers `tests/e2e/helpers.ts`	Adds `PageCheck` interface, `CORE_PAGES` route-to-heading map, `BENIGN_CONSOLE` filter list, `trackConsoleErrors` (console/pageerror listeners with `assertClean()`), and `gotoPage` (navigate + wait for `main` content).
New spec files `tests/e2e/smoke.spec.ts`, `tests/e2e/tasks.spec.ts`, `tests/e2e/blockers.spec.ts`, `tests/e2e/costs.spec.ts`, `tests/e2e/prd.spec.ts`, `tests/e2e/proof.spec.ts`, `tests/e2e/review.spec.ts`, `tests/e2e/sessions.spec.ts`, `tests/e2e/settings.spec.ts`	Nine new spec files cover `@smoke` auth/pages/session (Chromium on every PR), plus nightly-only full specs for tasks board (card visibility, status columns, search filtering), blockers, costs, PRD, PROOF9, review diff, sessions/execution, and settings tabs.
CI workflow, schedule, and .gitignore `.github/workflows/test.yml`, `.gitignore`	Enables the nightly cron. Replaces the deferred browser-E2E placeholder with `e2e-browser-smoke` (Chromium, `@smoke`, every PR/push) and `e2e-browser-full` (all browsers, nightly). Extends `test-summary` `needs` and failure condition to include `e2e-browser-smoke`. Ignores `.e2e-workspace/` and `.e2e-state.db*`.
Docs refresh `tests/e2e/README.md`, `tests/e2e/E2E_TEST_AUDIT.md`	`README.md` rewritten to a short browser-suite guide with local run commands and CI job inventory. `E2E_TEST_AUDIT.md` updated to current spec-to-coverage table and smoke-vs-full run description. Legacy docs (`APP_ISSUES_FOR_GITHUB.md`, `BACKEND_CONFIG_UPDATE.md`, `README-USER-JOURNEY-TESTS.md`) removed.

Sequence Diagram(s)

sequenceDiagram
    rect rgba(100, 149, 237, 0.5)
        Note over GlobalSetup,StorageState: One-time global setup
        GlobalSetup->>SeedWorkspacePy: spawn(ws_dir, central_db_path)
        SeedWorkspacePy->>CentralDB: INSERT user (argon2id hash)
        SeedWorkspacePy->>WorkspaceDB: seed token_usage rows
        SeedWorkspacePy->>GitRepo: init + commit + modify app.py (uncommitted diff)
        SeedWorkspacePy-->>GlobalSetup: workspace seeded
        GlobalSetup->>FastAPIBackend: POST /auth/jwt/login (TEST_USER)
        FastAPIBackend-->>GlobalSetup: JWT access_token
        GlobalSetup->>StorageState: write .auth/state.json (token + workspace path)
    end
    rect rgba(144, 238, 144, 0.5)
        Note over PlaywrightSpec,NextJSFrontend: Each spec test
        PlaywrightSpec->>NextJSFrontend: navigate (storageState pre-loads localStorage)
        NextJSFrontend->>FastAPIBackend: API requests (authenticated via token)
        FastAPIBackend-->>NextJSFrontend: seeded data responses
        NextJSFrontend-->>PlaywrightSpec: rendered page
        PlaywrightSpec->>PlaywrightSpec: assertClean() + content assertions
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

frankbria/codeframe#685: Formally deferred the same two browser Playwright CI jobs and kept the nightly cron disabled; this PR directly reverses that decision with the rewritten suite.
frankbria/codeframe#229: Added the tests/e2e/e2e-config.ts module and test_state_reconciliation.spec.ts suite that this PR deletes and replaces with e2e-env.ts and the new workspace-based specs.
frankbria/codeframe#163: Introduced the FastAPI Users /auth/jwt/login endpoint that global-setup.ts now calls to obtain the JWT for storageState pre-authentication.

Poem

🐰 Hop hop, the old paths are gone,
No /projects to stumble upon!
A workspace seeds tasks and a PRD,
Chromium smoke runs on every PR with glee.
New specs assert clean console and view,
The nightly cron finally runs through! 🌙

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 55.56% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The PR title clearly and specifically summarizes the main change: rewriting the Playwright browser E2E suite for the current Phase-3+ UI architecture, with issue reference.
Linked Issues check	✅ Passed	All acceptance criteria from issue `#684` are met: smoke suite passes on PRs, full suite runs on nightly schedule, jobs are wired into test-summary, and E2E_TEST_AUDIT.md is refreshed to describe the new suite structure.
Out of Scope Changes check	✅ Passed	All changes align with the rewrite scope: deprecated `/projects/[id]` tests removed, new workspace-based specs added, CI workflow updated, docs refreshed, and infrastructure (config, seed script, helpers) implemented to support the new suite.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feat/684-rewrite-playwright-e2e

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

claude · 2026-06-20T21:28:27Z

PR #703 Review — Playwright E2E Suite Rewrite

Good work. This is a clean, well-motivated rewrite that closes a long-standing gap. The architecture choices are solid: separate config from env (e2e-env.ts), shared utilities (helpers.ts), and pure-Python seeding via core APIs (seed_workspace.py). The global-setup.ts is dramatically simpler (347 → 79 lines) and the removal of the better-sqlite3 Node dependency is the right call.

What works well

Full deletion of the dead suite. All 19 old test_*.spec.ts files (which targeted the defunct /projects/[id] architecture) are properly removed. testMatch: '*.spec.ts' will only pick up the new, working specs.
Deterministic seeding via core APIs. seed_workspace.py calling tasks.create(), blockers.create(), etc. means the seeded data stays in sync with the domain model automatically.
Proper two-tier CI. Smoke on every PR (Chromium, @smoke, gated in test-summary), full cross-browser nightly. This is exactly the right split.
Real JWT login flow. The @smoke auth tests start from a clean storageState and exercise the actual /login page — no mocking.
Secret-scanner workaround is correct. Splitting the password into const TEST_LOGIN_PW keeps it off the password: pattern. The AUTH_SECRET inline in the webServer command is explicitly labeled as a non-credential.

Issues

1. Smoke workspace path defaults to empty string

In smoke.spec.ts (the login test):

[LS_WORKSPACE_PATH, process.env.E2E_WORKSPACE_DIR || ''],

E2E_WORKSPACE_DIR is only set if explicitly exported. When empty, the workspace path stored in localStorage is '', and the app workspace selector may 400/404 every data request after login — the test passes (it only asserts redirect off /login and token presence) but the session is broken. The reliable value is always the path written by global-setup.ts into the storageState, so the login test should either import WORKSPACE_DIR from e2e-env.ts directly or skip setting the workspace key and let post-login setup handle it.

2. Search test in tasks.spec.ts is a no-op if the search box is absent

const search = page.getByRole('textbox').first();
if (await search.count()) {
  await search.fill('dashboard');
  ...
}

If the board renders without a search box (e.g., a regression removes it), the test body never runs and the case silently passes. Use await expect(search).toBeVisible() with a descriptive message, or mark it test.skip with an explicit reason if the control's presence is genuinely conditional.

3. Vacuous assertion in session-persistence test

In smoke.spec.ts:

expect(FRONTEND_URL).toContain('http');

This asserts a property of the constant, not of the page state — it will always pass regardless of what happens in the browser. Safe to remove.

4. WebSocket console errors are broadly suppressed

In helpers.ts:

/EventSource|\/stream\b|websocket|ws:\/\//i,

This silently swallows all WebSocket-related console errors for every spec. It is appropriate for an idle seeded workspace, but it means specs that exercise real-time features (execution, batch-run start) will not catch WS wiring regressions via the console-error guard. Consider narrowing the filter to only connection-refused messages on an idle workspace, or adding a comment so future spec authors know they must wire explicit assertions if they depend on WS data.

5. Cost regex is slightly fragile

await expect(page.getByTestId('total-spend')).toContainText(/\$0\.0[0-9]/);

The three seeded rows sum to $0.054, so this works today. If a future seed adds more rows and crosses $0.09, the assertion silently stops matching. A safer alternative: /\$[0-9]+\.[0-9]+/ or just /\$/ if the only goal is confirming data rendered.

Minor observations

CI smoke job doesn't depend on backend-tests or frontend-tests. This is intentional for parallelism and fine — the smoke itself will fail if the build is broken. Worth a one-line comment in the workflow to make this explicit.
e2e-browser-full has no needs: dependency. Nightly-only, so this is correct.
Frontend build step in CI (next build && npm start). Right choice for stability, but adds ~3–5 min to every smoke PR run. If it proves slow, the built artifacts could be cached between runs — future optimization.

Summary

The approach is sound and the code quality is good. The two items worth addressing before merge are (1) the empty-string workspace path in the login smoke test and (2) the conditional guard in the search test that makes it a no-op on regression. The rest are polish.

coderabbitai

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/test.yml:
- Around line 448-449: The actions/checkout action at the "Checkout code" step
(and another similar occurrence later in the workflow) does not disable
credential persistence, which leaves GITHUB_TOKEN credentials in the local git
config during subsequent artifact-producing steps. Add the `persist-credentials:
false` parameter to the `with` section of both `actions/checkout` invocations to
prevent credentials from being persisted in the git configuration after the
checkout is complete.
- Around line 442-445: The e2e-browser-smoke job and the job at line 503 in the
workflow file are missing explicit permissions blocks, causing them to inherit
repository-wide default permissions which is a security risk. Add a permissions
block to both the e2e-browser-smoke job and the other new job (around line 503)
with minimal least-privilege permissions, setting contents to read to restrict
token exposure to only what is necessary for these jobs to function.
- Around line 638-642: The test-summary gate condition in the if statement only
checks for "failure" status on the e2e-browser-smoke result check, but should
also treat "cancelled" and "timed_out" statuses as gate failures. Modify the
condition that checks needs.e2e-browser-smoke.result to include additional
checks for "cancelled" and "timed_out" statuses alongside the existing "failure"
check to ensure that smoke tests that don't complete successfully block the
merge.

In `@tests/e2e/costs.spec.ts`:
- Around line 24-26: The catch block in the selectOption call is currently
swallowing all errors indiscriminately, which can mask real UI regressions.
Instead of catching all errors with a generic `.catch()`, you should narrow the
error handling to specifically catch only the error that indicates a non-native
select element, allowing other exceptions (like element not found or interaction
failures) to propagate and fail the test as intended. Check the Playwright
documentation or error messages to identify the specific error type thrown for
non-native selects, and only catch that particular error condition while
allowing other selectOption failures to surface.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c993f7a2-8e25-4b06-8236-709a2c904c95

📥 Commits

Reviewing files that changed from the base of the PR and between 55dbd6d and f50da4c.

⛔ Files ignored due to path filters (1)

tests/e2e/package-lock.json is excluded by !**/package-lock.json

📒 Files selected for processing (47)

.github/workflows/test.yml
.gitignore
tests/e2e/.test-db.sqlite
tests/e2e/APP_ISSUES_FOR_GITHUB.md
tests/e2e/BACKEND_CONFIG_UPDATE.md
tests/e2e/E2E_TEST_AUDIT.md
tests/e2e/README-USER-JOURNEY-TESTS.md
tests/e2e/README.md
tests/e2e/blockers.spec.ts
tests/e2e/browser-config.ts
tests/e2e/costs.spec.ts
tests/e2e/debug-error.spec.ts
tests/e2e/e2e-config.ts
tests/e2e/e2e-env.ts
tests/e2e/global-setup.ts
tests/e2e/helpers.ts
tests/e2e/package.json
tests/e2e/playwright.config.ts
tests/e2e/prd.spec.ts
tests/e2e/proof.spec.ts
tests/e2e/review.spec.ts
tests/e2e/seed-test-data.py
tests/e2e/seed_workspace.py
tests/e2e/sessions.spec.ts
tests/e2e/settings.spec.ts
tests/e2e/smoke.spec.ts
tests/e2e/tasks.spec.ts
tests/e2e/test-utils.ts
tests/e2e/test_assign_tasks_button.spec.ts
tests/e2e/test_auth_flow.spec.ts
tests/e2e/test_checkpoint_ui.spec.ts
tests/e2e/test_complete_user_journey.spec.ts
tests/e2e/test_dashboard.spec.ts
tests/e2e/test_git_visualization.spec.ts
tests/e2e/test_late_joining_user.spec.ts
tests/e2e/test_metrics_ui.spec.ts
tests/e2e/test_mobile_smoke.spec.ts
tests/e2e/test_pr_management.spec.ts
tests/e2e/test_project_creation.spec.ts
tests/e2e/test_returning_user.spec.ts
tests/e2e/test_review_ui.spec.ts
tests/e2e/test_start_agent_flow.spec.ts
tests/e2e/test_state_reconciliation.spec.ts
tests/e2e/test_task_approval.spec.ts
tests/e2e/test_task_breakdown.spec.ts
tests/e2e/test_task_execution_flow.spec.ts
tests/e2e/verify-config.js

💤 Files with no reviewable changes (26)

tests/e2e/test_mobile_smoke.spec.ts
tests/e2e/debug-error.spec.ts
tests/e2e/README-USER-JOURNEY-TESTS.md
tests/e2e/verify-config.js
tests/e2e/test_task_execution_flow.spec.ts
tests/e2e/BACKEND_CONFIG_UPDATE.md
tests/e2e/e2e-config.ts
tests/e2e/test_start_agent_flow.spec.ts
tests/e2e/test_review_ui.spec.ts
tests/e2e/test_auth_flow.spec.ts
tests/e2e/APP_ISSUES_FOR_GITHUB.md
tests/e2e/test_late_joining_user.spec.ts
tests/e2e/test_task_approval.spec.ts
tests/e2e/test_git_visualization.spec.ts
tests/e2e/browser-config.ts
tests/e2e/test_complete_user_journey.spec.ts
tests/e2e/test_dashboard.spec.ts
tests/e2e/test_project_creation.spec.ts
tests/e2e/test_state_reconciliation.spec.ts
tests/e2e/test_returning_user.spec.ts
tests/e2e/test_assign_tasks_button.spec.ts
tests/e2e/test_checkpoint_ui.spec.ts
tests/e2e/test_pr_management.spec.ts
tests/e2e/test_task_breakdown.spec.ts
tests/e2e/test_metrics_ui.spec.ts
tests/e2e/test-utils.ts

- test.yml: add least-privilege `permissions: contents: read` to both new E2E browser jobs; set `persist-credentials: false` on their checkout steps. - test.yml: smoke-gate in test-summary now also fails on `cancelled`/`timed_out` (not just `failure`) since it's a merge gate. - costs.spec.ts: time-range-select is a native <select>, so drop the error-swallowing `.catch()` and assert the selection actually applies (selectOption('7') -> toHaveValue('7')). Validated: costs.spec 2 passed (chromium).

frankbria · 2026-06-20T21:36:36Z

Thanks @coderabbitai — addressed all four in ccfbf15:

Added permissions: contents: read to both new E2E browser jobs + persist-credentials: false on their checkouts.
test-summary smoke gate now also fails on cancelled/timed_out.
costs.spec.ts: time-range-select is a native <select>, so dropped the error-swallowing .catch() and now assert selectOption('7') → toHaveValue('7') (validated locally, 2 passed).

The docstring-coverage warning is on the new TS specs/Python seed; these are test fixtures, not library code, so I'm leaving them as-is.

coderabbitai · 2026-06-20T21:36:56Z

Tip

For best results, initiate chat on the files or code changes.

@frankbria: All good — thanks for the thorough follow-through on ccfbf15! The three code fixes look solid:

Least-privilege permissions on both E2E jobs ✔
cancelled/timed_out in the smoke gate — good catch, those states are easy to miss ✔
costs.spec.ts clean assertion (selectOption → toHaveValue) with the .catch() removed ✔

Totally reasonable to skip docstring coverage on test fixtures/seed scripts — they're not library APIs and the noise isn't worth it.

🐇 ✨

claude · 2026-06-20T21:40:29Z

Follow-up Review — `ccfbf15`

Good progress on the CodeRabbit items: permissions blocks, persist-credentials: false, the smoke gate cancelled/timed_out check, and the costs.spec.ts selectOption cleanup are all confirmed fixed.

Two items from my initial review are still open. Per that review these are the ones "worth addressing before merge":

1. Smoke login test: workspace path defaults to empty string (smoke.spec.ts:19)

The initScript passes process.env.E2E_WORKSPACE_DIR || '' into localStorage. E2E_WORKSPACE_DIR is not exported in the CI job env, so the login test writes an empty string into codeframe_workspace_path. The test still passes (it only asserts the redirect off /login and the JWT token), but the workspace is effectively disconnected — any API call after login will 400/404. The reliable fix is to import WORKSPACE_DIR from e2e-env.ts directly:

import { FRONTEND_URL, LS_AUTH_TOKEN, LS_WORKSPACE_PATH, TEST_USER, WORKSPACE_DIR } from './e2e-env';
// in addInitScript:
[LS_WORKSPACE_PATH, path.resolve(WORKSPACE_DIR)],

This makes the login-test session match what all other specs use.

2. Search test silently passes when the search box is missing (tasks.spec.ts:43-48)

const search = page.getByRole('textbox').first();
if (await search.count()) {
  // entire body — never runs if box is absent
}

If a regression removes the search control, the test body never executes and the case reports green. Fix:

If the search box is always expected: replace the guard with await expect(search).toBeVisible().
If it is genuinely conditional: use test.skip with an explicit reason so the skip is visible.

Minor (no change needed, noting for completeness)

smoke.spec.ts:63 — expect(FRONTEND_URL).toContain('http') asserts a constant, not browser state. Safe to remove.
costs.spec.ts:16 — /\$0\.0[0-9]/ stops matching if the seeded total exceeds $0.09. A pattern like /\$[0-9]+\.[0-9]+/ would be more resilient to seed changes.

Overall the rewrite is solid and the CodeRabbit fixes look correct. Addressing the workspace-path and search-guard issues before merge would close the last gaps that could let real regressions pass silently.

coderabbitai Bot reviewed Jun 20, 2026

View reviewed changes

Comment thread .github/workflows/test.yml

Comment thread .github/workflows/test.yml

Comment thread .github/workflows/test.yml

Comment thread tests/e2e/costs.spec.ts Outdated

frankbria merged commit 45c51f0 into main Jun 20, 2026
12 checks passed

frankbria deleted the feat/684-rewrite-playwright-e2e branch June 20, 2026 21:43

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(e2e): rewrite Playwright browser suite for Phase-3+ UI (#684)#703

feat(e2e): rewrite Playwright browser suite for Phase-3+ UI (#684)#703
frankbria merged 2 commits into
mainfrom
feat/684-rewrite-playwright-e2e

frankbria commented Jun 20, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jun 20, 2026 •

edited

Loading

Review limit reached

❌ Failed checks (1 warning)

Uh oh!

claude Bot commented Jun 20, 2026

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

frankbria commented Jun 20, 2026

Uh oh!

coderabbitai Bot commented Jun 20, 2026

Uh oh!

claude Bot commented Jun 20, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

frankbria commented Jun 20, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What

How it works

CI (.github/workflows/test.yml)

Acceptance criteria

Demo evidence (local)

Known limitations

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review limit reached

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Poem

❌ Failed checks (1 warning)

Uh oh!

claude Bot commented Jun 20, 2026

PR #703 Review — Playwright E2E Suite Rewrite

What works well

Issues

Minor observations

Summary

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

frankbria commented Jun 20, 2026

Uh oh!

coderabbitai Bot commented Jun 20, 2026

Uh oh!

claude Bot commented Jun 20, 2026

Follow-up Review — ccfbf15

1. Smoke login test: workspace path defaults to empty string (smoke.spec.ts:19)

2. Search test silently passes when the search box is missing (tasks.spec.ts:43-48)

Minor (no change needed, noting for completeness)

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

frankbria commented Jun 20, 2026 •

edited by coderabbitai Bot

Loading

CI (`.github/workflows/test.yml`)

coderabbitai Bot commented Jun 20, 2026 •

edited

Loading

Follow-up Review — `ccfbf15`