FlowScout is an autonomous browser-testing application that generates user workflows from a short app description, executes them in a real browser, detects issues, and improves future runs using historical memory.
FlowScout runs a sequential agent pipeline for each test run:
Recon → Plan → Orchestrate → Observe → Report
- Reconnaissance — opens the target URL, extracts real DOM structure (headings, buttons, inputs, forms) so agents work with actual selectors, not guesses. If the app requires login, it authenticates first.
- Planning (Planner Agent) — sends the DOM snapshot, project description, priority flows, and historical memory context to Claude to generate 3-8 test workflows with concrete steps and selectors.
- Orchestration (Orchestrator Agent) — Playwright executes each workflow step-by-step, capturing a screenshot, console errors, and network failures after every action.
- Observation (Observer Agent) — analyzes all captured artifacts, generates findings with severity/confidence, fingerprints each issue for deduplication against past runs, and records route risk + flaky step data to memory.
- Reporting (Reporter Agent) — synthesizes a structured report (severity counts, top findings, suggestions) and updates route risk scores in the database.
Agents communicate via PostgreSQL — each reads state from the DB, does its work, and persists results. No direct inter-agent messaging.
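The hand-off pattern can be sketched as follows. This is an illustrative sketch only: the table layout, column names, and `run_stage` helper are hypothetical, and SQLite stands in for PostgreSQL so the snippet is self-contained.

```python
import json
import sqlite3

# Illustrative only: FlowScout uses PostgreSQL, and the schema here is
# hypothetical. The point is the pattern: each agent loads the run's
# current state from the DB, does its work, and persists the result.
# There is no direct inter-agent messaging.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE runs (id INTEGER PRIMARY KEY, stage TEXT, state TEXT)")
db.execute("INSERT INTO runs VALUES (1, 'recon', '{}')")

def run_stage(run_id: int, stage: str, work) -> None:
    """Load state, apply one agent's work, persist the result."""
    (state_json,) = db.execute(
        "SELECT state FROM runs WHERE id = ?", (run_id,)
    ).fetchone()
    state = work(json.loads(state_json))
    db.execute(
        "UPDATE runs SET stage = ?, state = ? WHERE id = ?",
        (stage, json.dumps(state), run_id),
    )

# Sequential pipeline: each stage sees only what the previous one persisted.
run_stage(1, "plan", lambda s: {**s, "workflows": ["login", "checkout"]})
run_stage(1, "orchestrate", lambda s: {**s, "results": {"login": "pass"}})

stage, state = db.execute("SELECT stage, state FROM runs WHERE id = 1").fetchone()
print(stage, json.loads(state)["workflows"])  # -> orchestrate ['login', 'checkout']
```

Because all coordination happens through persisted rows, a crashed stage can be retried from the last committed state.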
Aerospike provides low-latency memory that makes runs smarter over time:
- Issue fingerprints — character n-gram vectors (128-dim) with cosine similarity. New findings are compared against stored fingerprints to mark them as "new" or "known" and link to similar past issues.
- Route risk scores — per-project route pass/fail history. High-risk routes get prioritized in future planning.
- Flaky step detection — step-level pass/fail history. The planner avoids or flags steps that are historically unreliable.
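The fingerprint-matching idea can be sketched in a few lines. Only the 128-dim character n-gram vectors and cosine comparison come from the design above; the hashing-trick vectorizer below is an illustrative assumption, not FlowScout's actual implementation.

```python
import math

DIM = 128  # fingerprint dimensionality, per the memory design above

def fingerprint(text: str, n: int = 3) -> list[float]:
    """Hash character n-grams into a fixed 128-dim count vector.
    The hashing trick is an assumption; FlowScout's exact vectorizer
    may differ."""
    vec = [0.0] * DIM
    text = text.lower()
    for i in range(max(len(text) - n + 1, 1)):
        vec[hash(text[i : i + n]) % DIM] += 1.0
    return vec

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# A reworded description of the same issue stays close in cosine space,
# so it can be marked "known" and linked to the similar past finding.
known = fingerprint("Checkout button unresponsive on /cart after coupon applied")
new = fingerprint("Checkout button not responding on /cart when coupon is applied")
other = fingerprint("404 on /profile avatar image")

sim_same = cosine(known, new)
sim_diff = cosine(known, other)
print(f"same issue: {sim_same:.2f}  unrelated: {sim_diff:.2f}")
```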
The planner queries get_workflow_priorities() before generating workflows, so each run focuses on what matters most.
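A minimal sketch of the kind of prioritization `get_workflow_priorities()` could derive from route pass/fail history. The recency-weighted scoring formula and the data shapes are assumptions for illustration, not FlowScout's actual code.

```python
def route_risk(history: list[bool]) -> float:
    """Recency-weighted fail rate over a route's pass/fail history
    (True = pass, oldest first). Recent failures count more than old
    ones. This formula is an illustrative assumption."""
    if not history:
        return 0.0
    weights = [0.5 ** (len(history) - 1 - i) for i in range(len(history))]
    fails = sum(w for w, ok in zip(weights, history) if not ok)
    return fails / sum(weights)

def workflow_priorities(route_history: dict[str, list[bool]]) -> list[str]:
    """Routes sorted highest-risk first, so the planner covers them first."""
    return sorted(route_history, key=lambda r: route_risk(route_history[r]), reverse=True)

history = {
    "/checkout": [True, False, False],  # failing recently -> highest risk
    "/login": [True, True, True],       # consistently passing
    "/search": [False, True, True],     # old failure, since recovered
}
print(workflow_priorities(history))  # -> ['/checkout', '/search', '/login']
```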
FlowScout uses an Airbyte-based connector framework to query external systems:
- GitHub — search repos, list issues, query pull requests. Used to auto-detect and link repos to projects, and to enrich findings with related GitHub issues.
- Jira — search tickets, list issues. Used to cross-reference findings with existing bug reports.
Connectors are registered at startup based on available credentials. The Observer agent calls connectors during _enrich_findings() to add external context to each finding.
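Credential-gated registration can be sketched like this. The environment variable names come from the configuration table in this README; the registry shape and the connector classes are hypothetical stand-ins for the Airbyte-based framework.

```python
import os

# Hypothetical connector stand-ins; the real ones wrap the Airbyte
# connector framework described above.
class GitHubConnector:
    def __init__(self, token: str) -> None:
        self.token = token

class JiraConnector:
    def __init__(self, token: str, email: str, domain: str) -> None:
        self.token, self.email, self.domain = token, email, domain

def register_connectors(env=os.environ) -> dict[str, object]:
    """Register only the connectors whose credentials are fully present."""
    registry: dict[str, object] = {}
    if env.get("GITHUB_TOKEN"):
        registry["github"] = GitHubConnector(env["GITHUB_TOKEN"])
    if all(env.get(k) for k in ("JIRA_API_TOKEN", "JIRA_EMAIL", "JIRA_DOMAIN")):
        registry["jira"] = JiraConnector(
            env["JIRA_API_TOKEN"], env["JIRA_EMAIL"], env["JIRA_DOMAIN"]
        )
    return registry

# With only a GitHub token set, only the GitHub connector registers.
registry = register_connectors({"GITHUB_TOKEN": "ghp_example"})
print(sorted(registry))  # -> ['github']
```

Incomplete Jira credentials (e.g. a token without a domain) register nothing, so a misconfigured connector fails fast at startup rather than mid-run.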
apps/web → Next.js 15 frontend dashboard
apps/api → FastAPI backend + REST API
services/agent_runner → Planner, Orchestrator, Observer, Reporter agents
services/browser_worker → Playwright-based browser automation
services/connectors → Airbyte agent connectors (GitHub, Jira)
services/memory → Aerospike-based agent memory layer
services/auth → Auth providers (email/password, extensible to OAuth)
packages/shared → Shared schemas, types, constants
| Layer | Technology |
|---|---|
| Frontend | Next.js 15, React 19, TypeScript, Tailwind 4, shadcn/ui |
| Backend | FastAPI, Pydantic, SQLAlchemy (async) |
| Browser | Playwright |
| Database | PostgreSQL 16 |
| Memory | Aerospike |
| LLM | Claude Sonnet 4 (Anthropic API) |
| Integrations | Airbyte Agent Connectors |
| Auth | Email/password provider (extensible to OAuth/Auth0) |
- Python 3.11+
- Node.js 20+
- Docker (for PostgreSQL and Aerospike)
```bash
cp .env.example .env
```

Edit `.env` with your values:
Required:
| Variable | Description | Example |
|---|---|---|
| `DATABASE_URL` | PostgreSQL connection string | `postgresql+asyncpg://flowscout:flowscout@localhost:5432/flowscout` |
| `AEROSPIKE_HOST` | Aerospike server host | `localhost` |
| `AEROSPIKE_PORT` | Aerospike server port | `3000` |
| `AEROSPIKE_NAMESPACE` | Aerospike namespace | `flowscout` |
| `ANTHROPIC_API_KEY` | Anthropic API key for Claude | `sk-ant-...` |
Optional — Connectors:
| Variable | Description |
|---|---|
| `GITHUB_TOKEN` | GitHub personal access token (enables GitHub connector) |
| `JIRA_API_TOKEN` | Jira API token |
| `JIRA_EMAIL` | Jira account email |
| `JIRA_DOMAIN` | Jira domain (e.g. `mycompany.atlassian.net`) |
Optional — Auth / App:
| Variable | Default | Description |
|---|---|---|
| `API_HOST` | `0.0.0.0` | API bind address |
| `API_PORT` | `8000` | API port |
| `FRONTEND_URL` | `http://localhost:3000` | Frontend URL (for CORS) |
| `ARTIFACTS_DIR` | `./artifacts` | Where screenshots are saved |
| `AUTH0_DOMAIN` | — | Auth0 domain (for OAuth flows) |
| `AUTH0_CLIENT_ID` | — | Auth0 client ID |
| `AUTH0_CLIENT_SECRET` | — | Auth0 client secret |
```bash
docker compose up -d
```

This starts:

- PostgreSQL on `localhost:5432`
- Aerospike on `localhost:3100` (mapped from container port 3000)
```bash
cd apps/api
python -m venv .venv && source .venv/bin/activate
pip install -e .
alembic upgrade head  # apply database migrations
uvicorn apps.api.main:app --reload
```

The API runs at http://localhost:8000. Tables are auto-created on startup; Alembic handles schema migrations.
```bash
cd apps/web
npm install
npm run dev
```

The dashboard runs at http://localhost:3000.
| Method | Path | Description |
|---|---|---|
| POST | `/projects` | Create a project |
| GET | `/projects` | List projects |
| GET | `/projects/{id}` | Get project details |
| POST | `/projects/{id}/detect-repo` | Search GitHub for matching repos |
| POST | `/projects/{id}/link-repo` | Link a GitHub repo to the project |
| POST | `/projects/{id}/unlink-repo` | Remove linked repo |
| GET | `/projects/{id}/risk-routes` | Get route risk scores |
| Method | Path | Description |
|---|---|---|
| POST | `/projects/{id}/runs` | Start a test run (`auto_approve: bool`) |
| GET | `/runs` | List runs (filter by `status`, `project_id`) |
| GET | `/runs/{id}` | Get run status and summary |
| POST | `/runs/{id}/approve` | Approve an awaiting run |
| POST | `/runs/{id}/reject` | Cancel an awaiting run |
| GET | `/runs/{id}/workflows` | List workflows in a run |
| GET | `/runs/{id}/findings` | List findings from a run |
| Method | Path | Description |
|---|---|---|
| GET | `/findings/{id}` | Get finding details |
| POST | `/integrations/query` | Query external system via connector |
| GET | `/health` | Health check |