Problem statement
There is no end-to-end test coverage for the question delivery and response flow. Layers 1–4 test structure, components, mock Claude, and the real Claude API, but none of them spins up a real server with a real storage backend and verifies that the web UI renders correctly. Broken rendering, broken magic links, or broken question type layouts would only be caught in production.
Proposed outcome
New test script tests/Test-E2E-Playwright.ps1 (Layer 4):
- Starts the dotbot server (core/hooks/dev/Start-Dev.ps1 or equivalent)
- Starts Azurite in a Docker container as the storage backend
- Uses dotbot to publish a question instance for each question type (singleChoice, approval, documentReview)
- Captures the magic link from the delivery output
- Playwright navigates to the magic link and asserts:
  - Question title is visible
  - Correct UI elements render per question type (options, approve/reject buttons, file review links)
  - Submit action completes without errors
  - Response payload in storage contains correct SelectedOptionId/ApprovalDecision/FreeText after submit
- Tears down server and Docker container after run
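Two of the steps above (capturing the magic link and checking the stored response payload) can be sketched as plain TypeScript helpers that a Playwright spec could call. This is only an illustration under stated assumptions: the delivery output is assumed to contain the magic link as a plain http(s) URL, the stored payload is assumed to be a flat object, and the names extractMagicLink and findPayloadMismatches are hypothetical. Only the field names SelectedOptionId, ApprovalDecision, and FreeText come from this issue.

```typescript
// Sketch only: the delivery-output format and payload shape are assumptions.

// Field names come from the issue text; a flat object shape is assumed.
type ResponsePayload = {
  SelectedOptionId?: string;
  ApprovalDecision?: string;
  FreeText?: string;
};

// Return the first http(s) URL found in the delivery output, or null if none.
function extractMagicLink(deliveryOutput: string): string | null {
  const match = deliveryOutput.match(/https?:\/\/[^\s"'<>]+/);
  return match ? match[0] : null;
}

// Compare the stored payload against the fields a test case expects.
// Returns human-readable mismatches; an empty list means the check passed.
function findPayloadMismatches(
  actual: ResponsePayload,
  expected: ResponsePayload,
): string[] {
  const mismatches: string[] = [];
  for (const key of Object.keys(expected) as (keyof ResponsePayload)[]) {
    if (actual[key] !== expected[key]) {
      mismatches.push(`${key}: expected ${expected[key]}, got ${actual[key]}`);
    }
  }
  return mismatches;
}
```

In a real spec, the extracted URL would presumably be passed to Playwright's page.goto, and the payload would be read back from Azurite after the submit action. Which field each question type populates is not spelled out here, so each test case would supply its own expected object.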
Affected users / use case
Developers and CI pipelines catch rendering and delivery regressions that unit tests miss.
Rough size
M
Additional context, links, mockups
- This is Layer 4 testing the Mothership web surface specifically, distinct from the existing Layer 4 which tests the Claude API workflow
- Requires Docker (Azurite) and Node.js (Playwright) as additional CI dependencies
- Should run on manual trigger or schedule, same as existing Layer 4
- Related: CLAUDE.md test pyramid documentation will need updating