PRD: Add feature QA state
Executive Summary
Shipcode needs a first-class testing state for each feature so frontend QA can be automated without forcing full-suite local Playwright runs. The system should let a feature declare its routes, critical flows, required UI states, selectors, seeded data expectations, and latest QA results in a machine-readable way. This gives the pipeline and external QA agents a bounded contract for focused verification on PRs, nightly runs, and pre-ship checks.
Implementation Checklist
Problem Statement
Frontend-heavy features currently create a manual QA bottleneck: a human has to exercise the same issue flows repeatedly as the implementation changes. Running all browser tests locally is too CPU-heavy, and generic AI QA lacks the product context to know which feature, states, data, and flows matter. Shipcode needs to encode the QA target per feature so agents can run narrow, useful checks continuously.
Goals
- Let each feature expose a machine-readable QA state that names the feature, routes, critical flows, expected UI states, and test-data assumptions.
- Allow QA runs to execute only the affected feature instead of the entire frontend test surface.
- Record the latest QA run result, failure evidence, and human-readable summary on the feature/issue surface.
- Give the planner, executor, reviewer, and verifier enough QA context to generate and run focused tests for the changed feature.
- Reduce manual frontend QA for issue-oriented workflows by making automated checks reproducible and bounded.
Non-Goals
- This does not replace deterministic unit, integration, or component tests.
- This does not require fully autonomous release approval.
- This does not require every existing feature to be backfilled in the first release.
- This does not require a paid hosted QA provider.
- This does not make local full-suite Playwright runs mandatory.
User Stories
- As a founder shipping frontend features, I want Shipcode to know the QA contract for the current feature so that agents can test the relevant flows without making me manually click through the app.
  Acceptance:
  - A feature can declare its route scope, critical flows, expected states, and test data assumptions.
  - The declared QA state is visible from the issue or feature context.
  - The QA state can be consumed by verification without re-asking the user what to test.
- As a pipeline verifier, I want a scoped QA contract so that verification can run only the changed feature and produce useful pass/fail evidence.
  Acceptance:
  - Verification can select a feature-scoped browser test target from the QA state.
  - Verification reports screenshots, traces, logs, and failure summaries when a scoped QA run fails.
  - The QA result is associated with the issue/feature that triggered it.
- As an AI QA agent, I want explicit scenarios and success criteria so that exploratory browser testing stays bounded and reproducible.
  Acceptance:
  - The agent receives a list of critical flows and states to inspect.
  - The agent is constrained to the feature's declared routes and data assumptions.
  - The agent reports failures against named flows instead of a generic app crawl.
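The last story's constraints (stay on declared routes, report against named flows) could be enforced with a small guard like the one below. This is a minimal sketch, not a settled design: `matchesRoute`, `isInScope`, `reportFailure`, and the `:param` route pattern syntax are all hypothetical, and query strings are not handled.

```typescript
// Hypothetical guard for AI QA agents; names and route syntax are assumptions.

// Splits a path into non-empty segments, e.g. "/issues/42" -> ["issues", "42"].
function segments(path: string): string[] {
  return path.split("/").filter((part) => part.length > 0);
}

// A pattern segment starting with ":" matches any single path segment.
function matchesRoute(path: string, pattern: string): boolean {
  const actual = segments(path);
  const expected = segments(pattern);
  if (actual.length !== expected.length) return false;
  return expected.every((part, i) => part.charAt(0) === ":" || part === actual[i]);
}

// True when the path matches at least one route declared in the QA state.
function isInScope(path: string, declaredRoutes: string[]): boolean {
  return declaredRoutes.some((route) => matchesRoute(path, route));
}

interface FlowFailure {
  flow: string;    // must be one of the declared flow names
  path: string;    // where the failure was observed
  summary: string; // concise, human-readable description
}

// Rejects reports that name an undeclared flow or an out-of-scope path,
// so failures cannot come from a generic app crawl.
function reportFailure(
  failure: FlowFailure,
  declaredFlows: string[],
  declaredRoutes: string[],
): FlowFailure {
  if (declaredFlows.indexOf(failure.flow) === -1) {
    throw new Error(`Unknown flow "${failure.flow}": not declared in QA state`);
  }
  if (!isInScope(failure.path, declaredRoutes)) {
    throw new Error(`Path "${failure.path}" is outside the declared routes`);
  }
  return failure;
}
```

Throwing on out-of-scope reports is one option; a real implementation might instead log and discard them.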
System Specification
The system must model QA state as a first-class feature-level contract. A QA state includes a stable feature identifier, route or surface scope, critical user flows, expected UI states, seeded data requirements, selector readiness, run modes, and the latest QA result. QA state may be attached to GitHub issue PRDs, generated from plans, displayed in the desktop app, and consumed by verification runs.
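One possible shape for this contract is sketched below in TypeScript. Every type name, field name, and example value is an illustrative assumption; the PRD does not fix a schema, and where the contract lives is still an open question.

```typescript
// Hypothetical contract shape; every field name here is an illustrative assumption.
type QaStatus = "passed" | "failed" | "not_run";
type RunMode = "local-focused" | "pr" | "scheduled";

interface QaFlow {
  name: string;             // stable flow name used in reports
  expectedStates: string[]; // UI states the flow is expected to reach
}

interface QaResult {
  status: QaStatus;
  failedFlows: string[];
  summary: string;
  artifacts: string[]; // links or paths to screenshots, traces, logs
}

interface FeatureQaState {
  featureId: string;
  routes: string[];
  flows: QaFlow[];
  seededData: string[];
  selectorsReady: boolean;
  runModes: RunMode[];
  latestResult: QaResult | null; // null until a QA run has been recorded
}

// Returns human-readable gaps so an incomplete contract is reported, not silently accepted.
function findContractGaps(state: FeatureQaState): string[] {
  const gaps: string[] = [];
  if (state.featureId.trim() === "") gaps.push("featureId is empty");
  if (state.routes.length === 0) gaps.push("no routes declared");
  if (state.flows.length === 0) gaps.push("no critical flows declared");
  if (!state.selectorsReady) gaps.push("selectors are not marked ready");
  return gaps;
}

// Made-up example values, for illustration only.
const exampleQaState: FeatureQaState = {
  featureId: "issue-detail",
  routes: ["/issues/:id"],
  flows: [{ name: "open issue and view PRD", expectedStates: ["loading", "loaded"] }],
  seededData: ["one open issue with a PRD body"],
  selectorsReady: true,
  runModes: ["local-focused", "pr", "scheduled"],
  latestResult: null,
};

console.log(findContractGaps(exampleQaState)); // no gaps for this example
```

Keeping selectors as a single readiness flag rather than a list of concrete selectors is one way to avoid storing brittle implementation details in the contract.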
The system must support at least three run modes: local focused runs for the active feature, PR runs for changed or affected features, and scheduled broad runs for regression coverage. QA results must preserve enough evidence for debugging, including status, failed flow names, concise failure summaries, and links or paths to artifacts when available. If no QA state exists for a feature, the system must fall back gracefully to the existing verification behavior and clearly report that feature-scoped QA is unavailable.
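The fallback rule above can be sketched as a small planning function. This is a sketch under stated assumptions: `planVerification`, the `VerificationPlan` union, and the in-memory `Map` lookup are hypothetical stand-ins for however Shipcode actually stores and resolves QA state.

```typescript
// Hypothetical planner; names and storage are assumptions, not Shipcode APIs.
type RunMode = "local-focused" | "pr" | "scheduled";

interface QaStateRef {
  featureId: string;
  routes: string[];
}

type VerificationPlan =
  | { kind: "feature-scoped"; mode: RunMode; featureId: string; routes: string[] }
  | { kind: "existing-verification"; notice: string };

function planVerification(
  featureId: string,
  mode: RunMode,
  states: Map<string, QaStateRef>,
): VerificationPlan {
  const state = states.get(featureId);
  if (state === undefined) {
    // No QA state: keep the existing behavior and say so explicitly.
    return {
      kind: "existing-verification",
      notice: `Feature-scoped QA is unavailable for "${featureId}": no QA state declared.`,
    };
  }
  return { kind: "feature-scoped", mode, featureId: state.featureId, routes: state.routes };
}
```

Returning the notice as data, rather than only logging it, lets the issue surface show the gap instead of implying coverage that did not run.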
The QA contract must be explicit enough for deterministic test generation and AI exploratory testing. It must not require the user to manually maintain brittle implementation details. Stable selectors and seeded data expectations must be encouraged where needed so agents can interact with the UI reliably.
Functional Requirements
- The system must let a feature declare a QA state with a feature identifier, route or surface scope, critical flows, expected states, test-data assumptions, and selector readiness.
- The system must expose QA state from the issue or feature context used by Shipcode's planning and verification phases.
- The system must allow verification to run a focused QA target for a single feature.
- The system must support recording the latest QA result with status, flow-level failures, evidence references, and a concise summary.
- The system must distinguish deterministic browser checks from AI exploratory QA missions while letting both consume the same feature contract.
- The system must let QA runs be triggered for local focused verification, PR verification, and scheduled regression verification.
- The system must surface missing QA state as an actionable gap instead of silently running unrelated tests.
- The system must encourage stable UI selectors for feature-critical controls and states.
- The system must support incremental adoption so newly created features can opt in before legacy features are backfilled.
- The system must preserve existing unit/component test workflows.
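To illustrate how deterministic checks and AI exploratory missions might stay distinct while consuming the same contract, here is a minimal sketch. The `QaContract` fields, the `@qa-<featureId>` tag convention, and the prompt wording are all assumptions made for this example.

```typescript
// Hypothetical derivation of two job kinds from one contract; all names are assumptions.
interface QaContract {
  featureId: string;
  routes: string[];
  flows: string[];      // named critical flows
  seededData: string[]; // test-data assumptions
}

interface DeterministicTarget {
  kind: "deterministic";
  featureId: string;
  testTag: string; // assumed tagging convention for selecting feature tests
}

interface AiMission {
  kind: "ai-mission";
  featureId: string;
  prompt: string;
}

function toDeterministicTarget(contract: QaContract): DeterministicTarget {
  return {
    kind: "deterministic",
    featureId: contract.featureId,
    testTag: `@qa-${contract.featureId}`,
  };
}

function toAiMission(contract: QaContract): AiMission {
  const header = [
    `Feature: ${contract.featureId}`,
    `Stay within these routes: ${contract.routes.join(", ")}`,
    `Assume this seeded data: ${contract.seededData.join("; ")}`,
    "Exercise each flow and report failures by flow name:",
  ];
  const flowLines = contract.flows.map((flow) => `- ${flow}`);
  return {
    kind: "ai-mission",
    featureId: contract.featureId,
    prompt: header.concat(flowLines).join("\n"),
  };
}
```

The `kind` field keeps the two job types distinguishable in recorded results even though both originate from the same feature contract.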
Non-Functional Requirements
- Local focused QA runs should minimize CPU usage by defaulting to a single browser project and constrained worker count outside CI.
- QA evidence must be concise enough to review quickly from the issue surface.
- The contract format must be deterministic and easy for agents to parse.
- Missing or stale QA state must not block unrelated pipeline work unless the PRD or plan explicitly requires it.
- The design must avoid tying Shipcode to one paid AI QA vendor.
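The CPU-conscious local default in the first requirement could look something like the sketch below. `qaRunnerDefaults` and `RunnerDefaults` are hypothetical names; the `workers` and `projects` fields are modeled loosely on the options of the same name in a Playwright config, but this function does not call Playwright and the runner choice is left to the implementation plan.

```typescript
// Hypothetical defaults helper; not tied to any specific runner API.
interface RunnerDefaults {
  workers: number | undefined; // undefined means "let the runner decide"
  projects: string[];          // browser projects to run
}

function qaRunnerDefaults(isCi: boolean, allProjects: string[]): RunnerDefaults {
  if (isCi) {
    // CI can afford the full project matrix and runner-chosen parallelism.
    return { workers: undefined, projects: allProjects };
  }
  // Outside CI: one worker and only the first browser project to limit CPU usage.
  return { workers: 1, projects: allProjects.slice(0, 1) };
}
```

Passing `isCi` as a parameter instead of reading the environment inside the function keeps the default easy to test.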
Feature Phase Breakdown
- Foundation/spec plumbing — define the feature QA state contract, where it lives, how it is associated with an issue or feature, and how missing QA state is represented. In scope: schema, persistence/display contract, and planner/verifier context availability. Out of scope: full browser automation execution. Completion signal: a feature can expose QA state and the verifier can read it.
- Primary feature behavior — use QA state to drive focused frontend verification. In scope: feature-scoped run selection, deterministic browser target support, AI mission prompt/context generation, and QA result recording. Out of scope: replacing all existing tests or requiring a hosted QA vendor. Completion signal: a QA run for one feature records pass/fail evidence against named flows.
- Hardening/verification/shipping polish — make the workflow reliable enough for real issue implementation. In scope: CPU-safe local defaults, missing-state messages, artifact summaries, docs or PRD guidance, and tests around the QA contract. Out of scope: mandatory legacy backfill. Completion signal: a new frontend feature can declare QA state, run focused QA, and show results from the issue surface.
Success Criteria
- A feature can define a QA state with route scope, critical flows, expected states, test-data assumptions, and selector readiness.
- The verifier can consume QA state and run or request verification for only that feature.
- QA results are attached to the relevant issue/feature with pass/fail status, failed flow names, summaries, and evidence references.
- Local focused QA defaults avoid CPU-heavy full-suite runs.
- The system supports both deterministic browser checks and AI exploratory QA from the same contract.
- Missing QA state produces a clear actionable message and does not masquerade as successful coverage.
- Existing unit/component test behavior remains unchanged.
Out of Scope
- Backfilling QA state for every existing feature in the first implementation.
- Building a complete hosted QA SaaS inside Shipcode.
- Requiring Playwright to run all browser projects locally by default.
- Replacing human release judgment for high-risk product changes.
- Hard-coding a single external AI provider as the only QA execution path.
Dependencies
- Existing GitHub issue PRD workflow and issue detail surfaces.
- Existing planning, execution, review, and verification pipeline phases.
- Existing test infrastructure and frontend component test conventions.
- Any browser automation runner selected by the implementation plan.
- Stable selector conventions for critical frontend interactions.
Verification Plan
- tests: schema/contract tests for QA state, planner/verifier context tests, QA result persistence tests, and renderer tests for displaying missing/passing/failing QA state.
- manual: create or update a frontend issue PRD with QA state, run a focused QA pass for that feature, confirm only the scoped feature is targeted, confirm failure evidence appears on the issue surface, and confirm existing non-QA verification still works when QA state is absent.
Risks & Open Questions
- The contract may become too implementation-specific if it stores brittle selectors instead of stable user-facing flows plus explicit selector readiness.
- Browser automation can still be expensive if CI/local defaults are not conservative.
- AI exploratory QA quality depends on the model and browser agent used, so deterministic checks must remain the baseline for critical flows.
- Need to decide whether QA state is authored directly in PRDs, generated from plans, stored as structured metadata, or a combination of those surfaces.
- Need to decide how scheduled/nightly QA runs map back to issues when no active issue triggered the run.