A local web app for generating VPAT 2.5 Accessibility Conformance Reports (ACRs). It combines automated accessibility scanning with AI-assisted drafting to evaluate WCAG / Section 508 criteria and produce a .docx report ready for submission.
Everything runs locally — no database, no login, no data leaves your machine (except Anthropic API calls for AI drafting). Projects are saved to ~/.a11ybot/projects/ so you can pick up where you left off across sessions.
- Project Hub — saved projects are listed on launch so you can resume any previous VPAT or start a new one; one active project at a time with a confirmation prompt before switching
- File-backed persistence — projects auto-save to ~/.a11ybot/projects/ on every change so you never lose work between sessions
- Scope pre-filtering — select which component types your product includes (SaaS/Web, Desktop/Mobile, Hardware, Documentation, Support Services); criteria that don't apply are immediately marked N/A so you only review what's relevant
- Full VPAT 2.5 criteria coverage — all criteria for both editions: Section 508 (~124 criteria) and International/EN 301 549 (~160 criteria)
- Source scan — runs ESLint + jsx-a11y rules against a local React/JSX/TSX codebase and maps violations to VPAT criteria
- Runtime scan — runs Lighthouse against a live URL and maps audit failures to VPAT criteria
- Interview mode — guided Q&A so a PM can answer plain-language questions for each criterion without needing to read the scanner output
- AI drafting — sends criterion definition + evidence to Claude, which produces a conformance level (Supports, Partially Supports, Does Not Support, Not Applicable) and the formal vendor remarks paragraph
- Dynamic model list — connects to OpenRouter to show all currently available models that meet the minimum context length; falls back to a curated static list when offline or no key is configured
- Export — produces a .docx VPAT 2.5 file including a Compliance Standards table
- Editions — Section 508 (WCAG 2.0 + 36 CFR Part 1194) and International (EN 301 549 V3.2.1 + WCAG 2.0/2.1/2.2)
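The source-scan step above maps lint violations to VPAT criteria. A minimal sketch of that idea follows. The `Finding` shape, the `RULE_TO_CRITERIA` table, and `groupByCriterion` are hypothetical names chosen for illustration, and the three rule entries are only an illustrative subset, not the mapping A11yBot actually ships.

```typescript
// Hypothetical finding shape; the real scanner output may carry more fields.
interface Finding {
  ruleId: string; // e.g. "jsx-a11y/alt-text"
  file: string;
  line: number;
}

// Illustrative subset: each rule points at the WCAG criteria it most directly relates to.
const RULE_TO_CRITERIA = new Map<string, string[]>([
  ["jsx-a11y/alt-text", ["1.1.1"]],
  ["jsx-a11y/html-has-lang", ["3.1.1"]],
  ["jsx-a11y/anchor-has-content", ["2.4.4", "4.1.2"]],
]);

// Group findings under every criterion their rule maps to; unmapped rules are skipped.
function groupByCriterion(findings: Finding[]): Map<string, Finding[]> {
  const byCriterion = new Map<string, Finding[]>();
  for (const finding of findings) {
    const criteria = RULE_TO_CRITERIA.get(finding.ruleId) ?? [];
    for (const id of criteria) {
      const bucket = byCriterion.get(id) ?? [];
      bucket.push(finding);
      byCriterion.set(id, bucket);
    }
  }
  return byCriterion;
}
```

A rule can map to more than one criterion, so a single violation may surface as evidence in several places in the review.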
| Requirement | Version |
|---|---|
| Node.js | 20 LTS or later |
| npm | 9+ |
| Anthropic API key | Required for AI drafting (get one) |
No clone, no install — run the published package directly:
npx @richsharples/a11ybot@beta
This starts the app locally and opens it in your browser. Pick a port with --port 6000, or skip the auto-open with --no-browser.
git clone https://github.com/richsharples/A11yBot.git
cd A11yBot
npm install
npm run dev
Open http://localhost:5173 in your browser.
If you have existing projects, the Project Hub opens on launch — select one to resume or click + New project. On first launch (no saved projects), the setup wizard opens directly.
The three-step setup wizard asks for:
Step 1 — Product & Contact
- Product name, version, description — appear in the VPAT header
- Contact name and email — appear in the VPAT header
- Anthropic API key — used for AI drafting; leave blank for interview-only mode
Step 2 — Product Scope
- Select every component type your product includes; out-of-scope criteria are pre-marked N/A
- Options: SaaS / Web · Desktop / Mobile App · Hardware · Documentation · Support Services
Step 3 — Edition & Input Mode
- Section 508 — US federal procurement (WCAG 2.0 + 36 CFR Part 1194)
- International (INT) — Combined 508 + EN 301 549 V3.2.1 + WCAG 2.1/2.2
- Input mode: Interview only · Source scan · Runtime scan · Hybrid (recommended)
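Step 2 of the wizard pre-marks out-of-scope criteria as N/A. The sketch below shows one way such a filter could work, assuming each criterion is tagged with the component types it applies to. The `ComponentType` values, the `appliesTo` field, and `preFilter` are hypothetical and not taken from the A11yBot source.

```typescript
// Hypothetical identifiers for the five scope options offered in Step 2.
type ComponentType = "web" | "software" | "hardware" | "documentation" | "support";
type Level = "Not Evaluated" | "Not Applicable";

interface ScopedCriterion {
  id: string;
  appliesTo: ComponentType[]; // assumed tagging; the real criteria JSON may differ
  level: Level;
}

// A criterion stays open for review only if at least one of its component
// types was selected; everything else is pre-marked Not Applicable.
function preFilter(criteria: ScopedCriterion[], selected: ComponentType[]): ScopedCriterion[] {
  return criteria.map((c): ScopedCriterion => {
    const inScope = c.appliesTo.some((t) => selected.includes(t));
    const level: Level = inScope ? "Not Evaluated" : "Not Applicable";
    return { ...c, level };
  });
}
```

For a web-only product, `preFilter(criteria, ["web"])` would leave web criteria as Not Evaluated and mark hardware-only criteria as Not Applicable.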
Use the ← Projects button in the top-left of the review page to return to the Project Hub at any time. The active project is saved automatically before you leave.
To delete a project, hover its card in the Project Hub and click the trash icon.
Once a project is created, use the action bar at the top of the review page:
- Run source scan — scans the local repo path you provided; maps ESLint/jsx-a11y violations to criteria
- Run runtime scan — runs Lighthouse against the URL you provided
- AI draft all — sends all unevaluated criteria to Claude for drafting in parallel (max 5 concurrent)
Scans clear previous results before adding new ones, so re-running after fixing issues shows clean results. Re-scanning also resets any criteria that were at the ai-inferred confidence level (scanner-only assessments with no PM review) so stale inferences don't persist after the underlying code changes.
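The re-scan behaviour described above can be sketched as a pure function. This is an illustration under assumptions: the `CriterionState` shape, the `"none"` confidence value, and `prepareForRescan` are hypothetical, and only the `ai-inferred`, `ai-drafted`, and `pm-confirmed` labels come from this README.

```typescript
// "none" is a hypothetical placeholder for a criterion with no assessment yet.
type Confidence = "none" | "ai-inferred" | "ai-drafted" | "pm-confirmed";

interface CriterionState {
  id: string;
  level: string;
  confidence: Confidence;
  remarks: string;
  findings: string[]; // scanner findings attached as evidence
}

// Before a new scan: drop old findings everywhere, and reset only those
// criteria whose assessment came from the scanner alone (ai-inferred).
function prepareForRescan(criteria: CriterionState[]): CriterionState[] {
  return criteria.map((c): CriterionState => {
    const cleared: CriterionState = { ...c, findings: [] };
    if (c.confidence !== "ai-inferred") return cleared;
    return { ...cleared, level: "Not Evaluated", confidence: "none", remarks: "" };
  });
}
```

Criteria that a person or an AI draft has already touched keep their level and remarks; they only lose the stale findings.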
- Select a chapter in the left sidebar (chapters with scanner findings show an orange badge)
- Select a criterion to open the detail panel
- Review the evidence (scanner findings with file locations, interview responses)
- Use AI Draft in the evidence banner to generate a conformance assessment, or answer the interview question and click Answer + AI Draft
- Adjust the level and remarks if needed, then Confirm & save
Each scanner finding has two action buttons:
- Copy — copies the finding as plain text for pasting into an email
- GitHub issue ↗ — copies a pre-formatted GitHub issue body and opens the new-issue page in your repo
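As a rough illustration of the GitHub issue action, the sketch below formats a finding as an issue body and builds a new-issue URL. The `ScannerFinding` shape and both function names are hypothetical. Pre-filling the title through the `title` query parameter is an assumption about how the link could be built, not a description of what A11yBot does.

```typescript
// Hypothetical finding shape used only for this example.
interface ScannerFinding {
  ruleId: string;
  criterionId: string;
  message: string;
  file: string;
  line: number;
}

// Plain markdown body that can be copied to the clipboard and pasted into the issue.
function formatIssueBody(f: ScannerFinding): string {
  return [
    `**Rule:** ${f.ruleId}`,
    `**Criterion:** ${f.criterionId}`,
    `**Location:** ${f.file}:${f.line}`,
    "",
    f.message,
  ].join("\n");
}

// repo is "owner/name"; the title is URL-encoded so spaces and symbols survive.
function newIssueUrl(repo: string, title: string): string {
  return `https://github.com/${repo}/issues/new?title=${encodeURIComponent(title)}`;
}
```

Keeping the body on the clipboard rather than in the URL avoids very long links when a finding includes a large code excerpt.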
Click Export .docx in the header to download the completed VPAT. The document includes a Compliance Standards table listing the applicable standards and their versions. Criteria still marked Not Evaluated are written as such — acceptable in a draft VPAT.
npm run dev # Start dev server on http://localhost:5173
npm run build # Production build
npm run start # Start production server
npm run lint # Run ESLint
npm run typecheck # TypeScript type check (tsc --noEmit)
The regression suite detects scanner and export regressions before each release. It runs the tool against three targets and compares results to committed JSON baselines.
| Target | Match mode | Purpose |
|---|---|---|
| Synthetic fixture | Exact | Committed JSX with 8 deliberate a11y violations — any scanner drift fails immediately |
| A11yBot (self) | ±20% tolerance | Scans the production codebase — catches if the scanner stops finding things |
| cmdk (OSS) | ±20% tolerance | External React library — baseline against code we don't control |
Clone the OSS target (only needed once per machine):
npm run regression:clone
The dev server must be running (npm run dev in a separate terminal).
npm run regression # compare against baselines — exits 1 on regression
Run this before every release. If it passes, tag. If it regresses, fix first.
After intentional changes to the scanner or criteria (expected findings change), update the baselines:
npm run regression:update # update all targets
npm run regression -- --target fixture # update a single target
Then commit the updated files in tests/regression/baselines/.
- Regression (❌) — export failed, criteria count changed, or scanner found significantly fewer issues than baseline
- Changed (⚠️) — counts drifted outside tolerance but no hard failure (investigate before releasing)
- Improved (📈) — scanner found more issues than baseline; run --update-baseline to accept
- Pass (✅) — all checks within tolerance
Run reports are saved to .regression-reports/ (gitignored) for post-mortem inspection.
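To make the two match modes concrete, here is a simplified sketch of how a finding count could be compared to its baseline. It is an assumption about the comparison, not the runner's actual code: `classify` is a hypothetical name, it looks at a single count only, and it does not model the Changed status or export failures.

```typescript
type MatchMode = "exact" | "tolerance";
type Outcome = "pass" | "improved" | "regression";

// Compare one finding count against its committed baseline.
function classify(baseline: number, current: number, mode: MatchMode, tolerance = 0.2): Outcome {
  if (mode === "exact") {
    // Synthetic fixture: any drift at all counts as a failure.
    return current === baseline ? "pass" : "regression";
  }
  const low = baseline * (1 - tolerance);
  const high = baseline * (1 + tolerance);
  if (current < low) return "regression"; // scanner stopped finding things
  if (current > high) return "improved"; // scanner finds more than before
  return "pass";
}
```

With a baseline of 100 findings and the default ±20% tolerance, 85 findings would pass, 79 would count as a regression, and 130 would count as improved.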
app/
page.tsx # SPA shell (hub / setup / review routing)
api/
projects/route.ts # GET: list projects; POST: create project
projects/[id]/route.ts # GET/PATCH/DELETE a project by ID (or "active")
criteria/route.ts # GET: criteria structure for edition
criterion/route.ts # PATCH: update criterion state
scan/source/route.ts # POST: source scan (ESLint + jsx-a11y)
scan/runtime/route.ts # POST: Lighthouse scan
ai/draft/route.ts # POST: AI-draft one or all criteria
ai/models/route.ts # GET: live OpenRouter model list (with fallback)
export/route.ts # POST: build .docx
components/
hub/
ProjectHub.tsx # Project list with resume/delete/new actions
review/
CriteriaReview.tsx # Main review shell + orchestrator
CriterionDetail.tsx # Per-criterion editing panel
StatusBar.tsx # Status log at bottom of screen
FindingActions.tsx # Copy/GitHub-issue actions per finding
ProgressBar.tsx # Evaluated/confirmed progress bar
Tooltip.tsx # Lightweight tooltip wrapper
types.ts # Shared types, constants, helpers
setup/
SetupWizard.tsx # Three-step project creation wizard
src/
criteria/
vpat-2.5-508.json # Section 508 criteria (~124 criteria)
vpat-2.5-int.json # International criteria (~160 criteria)
scanners/
source-jsx.ts # ESLint/jsx-a11y scanner
rule-mapping.ts # Builds ESLint rule → criterion ID mapping
lighthouse.ts # Lighthouse runner
ai/
client.ts # Anthropic client
draft.ts # Per-criterion AI orchestration + deriveConfidence()
state/
project.ts # In-memory active project + mutation helpers
project-store.ts # File-backed persistence (~/.a11ybot/projects/)
types.ts # Zod schemas + TypeScript types
| Concern | Choice |
|---|---|
| Framework | Next.js 14 App Router |
| Language | TypeScript |
| Styling | Tailwind CSS |
| LLM | Anthropic Claude (claude-sonnet-4-6) via @anthropic-ai/sdk |
| Source a11y | ESLint 8 + eslint-plugin-jsx-a11y |
| Runtime scan | Lighthouse 12 |
| Export | docx npm package |
| Validation | Zod |
| Logging | Pino → console + vpat-run.log.json |
A11yBot supports local models via Ollama. AI drafting performance varies significantly by model size. Each criterion requires one LLM call; a full 508 run (59 applicable criteria for a web product) can take anywhere from 2 minutes to over an hour depending on the model and hardware.
Tested on Apple Silicon (M-series) and NVIDIA consumer GPUs. Cloud models via OpenRouter are always faster for bulk drafting.
| Model | Size | Speed (tok/s) | Full run estimate | Quality | Notes |
|---|---|---|---|---|---|
| qwen2.5:3b | 2 GB | ~35–50 | ~5 min | Good | Best starting point; fast, reliable JSON |
| phi4-mini:3.8b | 2.5 GB | ~30–40 | ~7 min | Very good | Excellent instruction following; recommended |
| gemma3:4b | 3 GB | ~25–35 | ~8 min | Good | Strong structured output |
| qwen2.5:7b | 4.7 GB | ~15–25 | ~15 min | Very good | Better remarks quality than 3b |
| mistral:7b | 4.1 GB | ~15–25 | ~15 min | Good | Solid JSON compliance |
| qwen2.5-coder:14b | 9 GB | ~8–12 | ~30 min | Excellent | Overkill for VPAT; use 7b instead |
Tips for faster local drafting:
- A11yBot automatically uses serial requests (batch size 1) for Ollama, since parallel requests to a single GPU cause VRAM thrashing and are slower
- If inference is slow, run ollama ps to confirm the model is running on GPU (100% GPU). CPU inference is 10–50× slower
- The -coder model variants are optimised for code generation; the base variants (qwen2.5:7b, not qwen2.5-coder:7b) produce better natural-language VPAT remarks
- Pull a model with ollama pull <model>, then select it in Settings
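The difference between cloud drafting (up to 5 requests at once) and Ollama drafting (batch size 1) can be sketched with a small batching helper. All names here (`chunk`, `batchSizeFor`, `draftInBatches`, and the `draftOne` callback) are hypothetical, and fixed-size batches are only one way to cap concurrency; the app may use a different scheme.

```typescript
// Split a list into consecutive batches of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (size < 1) throw new RangeError("size must be at least 1");
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Serial for a local GPU, up to five at once for cloud providers.
function batchSizeFor(provider: "ollama" | "cloud"): number {
  return provider === "ollama" ? 1 : 5;
}

// Run one batch at a time; requests inside a batch run in parallel.
async function draftInBatches<T, R>(
  items: T[],
  batchSize: number,
  draftOne: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = [];
  for (const batch of chunk(items, batchSize)) {
    const drafted = await Promise.all(batch.map((item) => draftOne(item)));
    results.push(...drafted);
  }
  return results;
}
```

With `batchSizeFor("ollama")` the loop degenerates to one request at a time, which matches the serial behaviour described in the first tip.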
A11yBot is open source software licensed under the Apache License 2.0.
Note on the Anthropic API: This tool integrates with the Anthropic API for AI drafting. The code license grants rights to the software only — each user must obtain their own Anthropic API key and is individually bound by Anthropic's Terms of Use.
- Auto-save — projects save automatically on every change (1.5 s debounce) to ~/.a11ybot/projects/. Restarting the server or refreshing the page does not lose work; just return to the Project Hub and resume.
- Config file — ~/.a11ybot/config.json persists contact info, AI defaults, and saved products across sessions. Contact info is automatically saved on first project creation.
- Scope pre-filtering — criteria marked N/A at creation can be overridden manually in the review. Creating a new project is the only way to change the component scope selection.
- One active project at a time — loading a different project while one is active prompts for confirmation. The previous project is already saved so nothing is lost.
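The 1.5 s debounce mentioned in the auto-save note can be sketched as follows. `createAutoSaver` and its `save` callback are hypothetical names; the real persistence code may be structured differently.

```typescript
// Coalesce rapid changes into one save that runs after a quiet period.
function createAutoSaver<T>(save: (state: T) => void, delayMs = 1500) {
  let timer: ReturnType<typeof setTimeout> | undefined;
  let pending: { state: T } | undefined;

  // Write the latest pending state immediately and cancel the timer.
  function flush(): void {
    if (timer !== undefined) {
      clearTimeout(timer);
      timer = undefined;
    }
    if (pending !== undefined) {
      const { state } = pending;
      pending = undefined;
      save(state);
    }
  }

  // Remember the newest state and restart the countdown.
  function schedule(state: T): void {
    pending = { state };
    if (timer !== undefined) clearTimeout(timer);
    timer = setTimeout(flush, delayMs);
  }

  return { schedule, flush };
}
```

Calling `flush()` before switching projects or returning to the hub would make sure the latest edit is written even if the debounce window has not elapsed.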
- Project Hub — first screen when saved projects exist; shows all projects with name, version, last-modified time, and a progress bar (% criteria confirmed); one active project at a time with a confirmation prompt before switching
- File-backed persistence — projects auto-save to ~/.a11ybot/projects/{id}.json on every change (1.5 s debounce); ~/.a11ybot/index.json tracks the project list; resuming after a restart requires no action
- Delete project — hover a project card in the hub and click the trash icon; confirmation dialog prevents accidental deletion
- ← Projects button — back button in the review header returns to the hub without losing project state
- No auto-scan on load — opening a saved project no longer triggers an automatic source+runtime scan; scans are user-initiated
- Rescan resets AI-inferred criteria — re-running a scan clears any criteria that were only at ai-inferred confidence (scanner-only, no PM review); ai-drafted and pm-confirmed criteria are never touched
- API routes migrated — POST /api/projects, GET /api/projects/active, DELETE /api/projects/active replace the old /api/project routes
- Dynamic OpenRouter model list — Settings panel fetches live models from OpenRouter on open; filters to models with context_length ≥ 32768 that produce text output; falls back to a curated static list when no key is configured or the API is unreachable
- OpenRouter model grouping — models shown grouped by provider in the selector (anthropic, google, openai, etc.)
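The filtering, fallback, and grouping described in these two entries could look roughly like the sketch below. The `ModelInfo` shape is a simplified assumption (only `context_length` is named in this README; `output_modalities` is a stand-in for however text output is detected), and `selectModels` is a hypothetical helper that takes `null` to mean the live list could not be fetched.

```typescript
// Simplified, assumed model shape for illustration.
interface ModelInfo {
  id: string; // e.g. "provider/model-name"
  context_length: number;
  output_modalities: string[];
}

const MIN_CONTEXT_LENGTH = 32768;

function isUsable(m: ModelInfo): boolean {
  return m.context_length >= MIN_CONTEXT_LENGTH && m.output_modalities.includes("text");
}

// The provider is taken to be the part of the ID before the first slash.
function providerOf(id: string): string {
  const slash = id.indexOf("/");
  return slash > 0 ? id.slice(0, slash) : "other";
}

// live === null means no key is configured or the API was unreachable.
function selectModels(live: ModelInfo[] | null, fallback: ModelInfo[]): Map<string, ModelInfo[]> {
  const source = live !== null ? live.filter(isUsable) : fallback;
  const grouped = new Map<string, ModelInfo[]>();
  for (const model of source) {
    const provider = providerOf(model.id);
    const group = grouped.get(provider) ?? [];
    group.push(model);
    grouped.set(provider, group);
  }
  return grouped;
}
```

The curated fallback list is passed through unfiltered here on the assumption that it was vetted by hand.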
- Review mode — "Confirm & Next →" primary button with optimistic save (instant, no round-trip delay); Space shortcut advances through AI-drafted criteria; "Confirm all N/A" bulk button per chapter
- axe-core runtime scanner — runs alongside Lighthouse in parallel with zero new dependencies (both already transitive deps); ~3× more accessibility rule coverage
- AI draft improvements — ai-attempted confidence prevents re-drafting criteria the AI already reviewed; prompt relaxed to make best-effort assessments; Ollama keep_alive=10m fixes 10-criteria cutoff; silent API failures now shown as errors
- OpenRouter — model list updated to current IDs (claude-sonnet-4.6, gemini-2.5-flash, etc.); switching provider resets model to a valid default
- User config — ~/.a11ybot/config.json persists contact info, AI defaults, and saved products; contact name/email saved on first project creation; product quick-select in wizard
- Performance — pino-pretty worker thread removed (10–50ms saved per save); optimistic confirm removes API wait from review hot path; axe + Lighthouse run in parallel per URL
- UX polish — dark header with second action row; auto-scan on project creation; Re-scan button; scanner evidence grouped by rule; N/A chapters dimmed; AI draft progress counter; AppScan elapsed timer; .docx table column widths fixed for Apple Pages/LibreOffice
- Ollama JSON reliability fix and logo panel theme update
- Codebase reorganised for second-developer onboarding: CriteriaReview.tsx (1106 lines) split into components/review/ sub-components; SetupWizard.tsx moved to components/setup/
- buildRuleMapping extracted from API route into src/scanners/rule-mapping.ts; confidence logic de-duplicated via exported deriveConfidence() in src/ai/draft.ts
- Dead auto-update code removed from criteria-store.ts; tsconfig.json excludes cloned OSS test targets
- Added CONTRIBUTING.md and .env.example for new contributors
- Regression test suite: three-target runner (fixture, vpat-tool-self, cmdk) with committed JSON baselines; exact-match for synthetic fixture, ±20% tolerance for live codebases; npm run regression required before every release
- Scope pre-filtering: new Step 2 in setup wizard lets you select which component types are present; unselected types are pre-marked N/A (e.g. a web-only product eliminates all 48 hardware criteria)
- Full VPAT 2.5 criteria coverage: Section 508 expanded to ~124 criteria (complete ch4 hardware, ch5 software); International expanded to ~160 criteria (WCAG 2.1 Level A, WCAG 2.2, EN 301 549 Chapters 4–6/11–12)
- .docx export now includes a Compliance Standards table with standard names, reference URLs, and versions
- Initial release: guided setup wizard, interview mode, source scan (ESLint/jsx-a11y), runtime scan (Lighthouse), AI drafting via Claude, .docx export