Inspect

AI-Powered Browser Testing Platform

Inspect uses AI agents to test websites in real browsers. Give it a natural-language instruction and it launches Playwright, navigates pages, finds bugs, and reports results. It ships as a monorepo of 34 packages covering browser automation, 5 LLM providers, visual regression, accessibility auditing, performance scoring, security scanning, web crawling, stealth browsing, network fault injection, agent benchmarking, agent governance, enterprise RBAC/SSO, self-healing, and more.

Inspired by: Playwright (browser API), Vitest (runner/reporter), Lighthouse (auditing), GitHub CLI (command structure), Vercel (developer experience), Expect (assertions)

Features

AI-Powered Testing -- Describe what to test in plain English; an AI agent drives a real browser
76 CLI Commands -- Organized into 8 groups: Testing, Browser, Quality, Infrastructure, Governance, Enterprise, Data & Workflow, Setup & Info
Interactive TUI -- Form-based interface with instruction history, auto-URL detection, agent/device/mode selectors
REPL Mode -- 14 slash commands (/help, /model, /heal, /generate, /cost, /audit, etc.)
7 Reporter Formats -- list, dot, json, junit, html, markdown, github
5 LLM Providers -- Claude, GPT, Gemini, DeepSeek, Ollama (local)
Visual Regression -- Pixel diff with slider, side-by-side, and overlay comparison modes
Accessibility Auditing -- WCAG 2.2 compliance with iframe support (axe-core)
Performance Scoring -- Lighthouse integration with Core Web Vitals
Security Scanning -- OWASP Top 10, CVE scanning, Nuclei multi-protocol (DNS/TCP/SSL)
Web Crawler -- Sitemap parsing, robots.txt, link discovery, batch scraping, media extraction
Change Tracking -- Scheduled monitoring with text/JSON diffing and webhook notifications
Stealth Browsing -- Fingerprint rotation, anti-detection headers, CAPTCHA detection
Network Fault Injection -- TCP proxy with toxicity presets (slow-3g, flaky-wifi, offline)
WebSocket Mocking -- Fluent API for WS handler mocking, message matching, recording
Browser Profiles -- Encrypted session persistence with cookie management
iFrame/Shadow DOM -- Traversal and element discovery across frames and shadow roots
Agent Benchmarking -- MiniWoB, WebArena, WorkArena suites with reward shaping
CAPTCHA Solving -- Multi-agent swarm architecture with vision detection
Story Testing -- Storybook, Ladle, Histoire story-level visual regression
YAML Workflows -- 14 block types (crawl, track, proxy, benchmark, task, loop, code, etc.)
Microservice Architecture -- Service registry, API gateway, message bus
Credential Vault -- AES-256-GCM encrypted storage with Bitwarden, 1Password, and Azure integrations
CI/CD Ready -- JUnit/GitHub reporters, sharding, presets, template generation
Agent Governance -- Audit trail, autonomy levels, permission management, compliance reports
Enterprise -- RBAC (5 roles), SSO (SAML/OIDC/Azure/Okta), multi-tenancy, hybrid LLM routing
Self-Healing -- Smart selector healing with similarity matching and recovery
Session Recording -- rrweb-based session recording with privacy controls
Human-in-the-Loop -- Human approval checkpoints for autonomous tests
Workflow Recording -- Record workflows and export to test scripts
Visual Test Builder -- Drag-and-drop style test step creation
Multi-Agent Scenarios -- Multi-agent orchestration for complex tests
Sandboxed Execution -- Isolated test execution with resource limits
Plugin Marketplace -- Extensible plugin system with hooks
Test Generation -- Page analysis, sitemap-based generation, YAML/instruction export
MCP Server -- Standalone Model Context Protocol server (14 browser tools)
SDK -- 9 methods: act(), extract(), observe(), agent(), navigate(), screenshot(), crawl(), track(), createProxy()
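
To make the Self-Healing entry above more concrete, here is a minimal sketch of similarity-based selector recovery. It is an illustration under stated assumptions, not Inspect's implementation: the `ElementSnapshot` shape, the `heal` function, the Jaccard scoring, and the 0.5 cutoff are all hypothetical.

```typescript
// Hypothetical sketch of similarity-based selector healing; not Inspect's code.
interface ElementSnapshot {
  selector: string;
  tag: string;
  text: string;
  attributes: Record<string, string>;
}

// Flatten a snapshot into a set of comparable tokens.
function tokens(el: ElementSnapshot): Set<string> {
  const out = new Set<string>([`tag:${el.tag}`]);
  for (const word of el.text.toLowerCase().split(/\s+/)) {
    if (word) out.add(`text:${word}`);
  }
  for (const key of Object.keys(el.attributes)) {
    out.add(`${key}=${el.attributes[key]}`);
  }
  return out;
}

// Jaccard similarity: size of the intersection divided by size of the union.
function similarity(a: ElementSnapshot, b: ElementSnapshot): number {
  const ta = tokens(a);
  const tb = tokens(b);
  let shared = 0;
  ta.forEach((t) => {
    if (tb.has(t)) shared++;
  });
  const union = ta.size + tb.size - shared;
  return union === 0 ? 0 : shared / union;
}

// Pick the most similar candidate, or null if none reaches the cutoff.
function heal(
  lost: ElementSnapshot,
  candidates: ElementSnapshot[],
  minScore = 0.5,
): ElementSnapshot | null {
  let best: ElementSnapshot | null = null;
  let bestScore = minScore;
  for (const candidate of candidates) {
    const score = similarity(lost, candidate);
    if (score >= bestScore) {
      best = candidate;
      bestScore = score;
    }
  }
  return best;
}

// The recorded element used "#login-btn", which no longer exists on the page.
const lostElement: ElementSnapshot = {
  selector: "#login-btn",
  tag: "button",
  text: "Log in",
  attributes: { type: "submit", class: "btn primary" },
};
const pageCandidates: ElementSnapshot[] = [
  { selector: "a.help", tag: "a", text: "Help", attributes: { href: "/help" } },
  {
    selector: "#signin-button",
    tag: "button",
    text: "Log in",
    attributes: { type: "submit", class: "btn primary large" },
  },
];
const healed = heal(lostElement, pageCandidates);
```

In this example `healed` is the `#signin-button` snapshot: it shares four of six distinct tokens with the lost element, while the help link shares none.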

Quick Start

# 1. Install and build
pnpm install && pnpm build

# 2. Check your environment
node apps/cli/dist/index.js doctor

# 3. Run a test
ANTHROPIC_API_KEY=sk-ant-... node apps/cli/dist/index.js test \
  -m "test the login flow" \
  --url https://your-app.com \
  -y

CLI Reference

Inspect ships 76 commands organized into 8 groups.

Testing

Command	Description
`inspect test`	AI-powered browser test with natural-language instructions
`inspect run`	Run a saved test suite or YAML test file
`inspect pr`	Test a GitHub pull request with full git context
`inspect replay`	Replay a previous test run from its trace
`inspect compare`	Compare two test runs side by side
`inspect watch`	Watch for file changes and re-run tests automatically

inspect test -m "test checkout flow" --url https://shop.example.com
inspect test -m "test forms" --headed --agent gpt --mode cua
inspect test -m "test search" --workers 4 --shard 1/3 --grep "login"
inspect pr https://github.com/user/repo/pull/123
inspect run tests/checkout.yaml --retries 2
inspect watch --grep "login" --reporter dot

Browser

Command	Description
`inspect open`	Open a URL in a managed browser session
`inspect screenshot`	Capture a screenshot of a page
`inspect pdf`	Export a page to PDF
`inspect codegen`	Generate test code from browser interactions

inspect open https://example.com --device "iPhone 15"
inspect screenshot https://example.com -o screenshot.png --full-page
inspect pdf https://example.com -o page.pdf
inspect codegen https://example.com

Quality

Command	Description
`inspect a11y`	Run accessibility audit (axe-core, WCAG 2.2)
`inspect lighthouse`	Run Lighthouse performance audit
`inspect security`	Run security scan (OWASP Top 10)
`inspect chaos`	Run chaos/monkey testing (Gremlins.js)
`inspect visual`	Run visual regression comparison

inspect a11y https://example.com --standard wcag22aa
inspect lighthouse https://example.com --budget perf:90,a11y:95
inspect security https://example.com --level full
inspect chaos https://example.com --duration 30s
inspect visual --baseline main --branch feature/ui

Infrastructure

Command	Description
`inspect serve`	Start the REST API server
`inspect tunnel`	Create a Cloudflare tunnel to the API server
`inspect sessions`	Manage browser sessions
`inspect mcp`	Start the MCP (Model Context Protocol) tool server

inspect serve --port 3000 --auth jwt
inspect tunnel --subdomain my-inspect
inspect sessions list
inspect mcp

Governance

Command	Description
`inspect trail`	Show agent audit trail
`inspect autonomy`	Manage agent autonomy level
`inspect permissions`	Manage agent permissions (domains, actions)
`inspect cost`	Show session cost breakdown

inspect trail --limit 50
inspect trail --compliance eu-ai-act
inspect autonomy --level supervision
inspect permissions --allow-domain example.com
inspect permissions --block-action navigate
inspect cost --json
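
The permission flags above suggest a simple policy check before each agent action. The sketch below shows one way that check could work. It is not Inspect's implementation: the `Permissions` shape, the `isAllowed` helper, and the rule that an allowed domain also covers its subdomains are all assumptions.

```typescript
// Hypothetical sketch; field names and matching rules are assumptions.
interface Permissions {
  allowedDomains: string[]; // an empty list means "no domain restriction"
  blockedActions: string[];
}

// Decide whether an agent may perform `action` on the page at `url`.
function isAllowed(perms: Permissions, action: string, url: string): boolean {
  if (perms.blockedActions.includes(action)) return false;
  if (perms.allowedDomains.length === 0) return true;
  const host = new URL(url).hostname;
  // Accept the domain itself or any subdomain of it.
  return perms.allowedDomains.some(
    (domain) => host === domain || host.endsWith(`.${domain}`),
  );
}

const perms: Permissions = {
  allowedDomains: ["example.com"],
  blockedActions: ["navigate"],
};
const clickOnSite = isAllowed(perms, "click", "https://shop.example.com/cart"); // true
const clickOffSite = isAllowed(perms, "click", "https://evil-example.com/"); // false
const navigateOnSite = isAllowed(perms, "navigate", "https://example.com/"); // false
```

Matching on a leading dot (`.example.com`) rather than a plain suffix keeps look-alike hosts such as `evil-example.com` outside the allow-list.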

Enterprise

Command	Description
`inspect rbac`	Manage role-based access control
`inspect tenant`	Manage tenant plans and quotas
`inspect sso`	Configure Single Sign-On providers

inspect rbac
inspect rbac --role admin
inspect tenant --plan enterprise
inspect tenant --name "Acme Corp" --plan team
inspect sso --provider saml --sso-url https://idp.example.com/sso

Data & Workflow

Command	Description
`inspect extract`	Extract structured data from a page
`inspect crawl`	Crawl a website and extract content
`inspect track`	Monitor pages for content changes
`inspect proxy`	Network fault injection proxy server
`inspect benchmark`	Run agent benchmarks (miniwob, webarena, workarena)
`inspect workflow`	Run or create YAML workflows
`inspect credentials`	Manage the encrypted credential vault

inspect extract https://example.com -s '{"title": "string", "price": "number"}'
inspect crawl https://example.com --depth 3 --max-pages 100 --format json
inspect track https://example.com/pricing --interval 3600
inspect proxy start --preset slow-3g --upstream localhost:3000
inspect proxy presets
inspect benchmark run --suite miniwob --concurrency 2
inspect workflow run tests/e2e.yaml
inspect workflow create
inspect credentials set STAGING_PASSWORD
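
The `-s` argument to `inspect extract` above is a flat JSON map from field name to type name. As a rough illustration of how extracted records could be checked against such a schema, here is a small validator. The `FlatSchema` type and `validateRecord` function are hypothetical, and Inspect's actual schema handling may be richer.

```typescript
// Hypothetical sketch; Inspect's real schema handling may differ.
type FieldType = "string" | "number" | "boolean";
type FlatSchema = Record<string, FieldType>;

// Return a list of problems; an empty list means the record matches the schema.
function validateRecord(
  schema: FlatSchema,
  record: Record<string, unknown>,
): string[] {
  const problems: string[] = [];
  for (const field of Object.keys(schema)) {
    const expected = schema[field];
    if (!(field in record)) {
      problems.push(`missing field "${field}"`);
      continue;
    }
    const actual = typeof record[field];
    if (actual !== expected) {
      problems.push(`field "${field}" should be ${expected}, got ${actual}`);
    }
  }
  return problems;
}

// Same schema string as in the CLI example.
const schema = JSON.parse('{"title": "string", "price": "number"}') as FlatSchema;
const okProblems = validateRecord(schema, { title: "Widget", price: 19.99 });
const badProblems = validateRecord(schema, { title: "Widget", price: "19.99" });
```

Here `okProblems` is empty, while `badProblems` contains one entry because the price arrived as a string rather than a number.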

Setup & Info

Command	Description
`inspect init`	Initialize project config and CI templates
`inspect doctor`	Check environment, dependencies, and API keys
`inspect generate`	Generate test files from descriptions
`inspect audit`	Audit project dependencies and config
`inspect install`	Install browser binaries (Chromium, Firefox, WebKit)
`inspect show-report`	Open a generated report in the browser
`inspect show-trace`	Open a trace file in the viewer
`inspect devices`	List available device presets (25 devices)
`inspect agents`	List available AI agents and their capabilities
`inspect models`	List available LLM models across all providers
`inspect completions`	Generate shell completions (bash/zsh/fish)
`inspect alias`	Manage command aliases
`inspect engine`	Manage browser engine settings

inspect init
inspect init --ci github-actions
inspect doctor --json
inspect devices --format json
inspect models --provider anthropic
inspect completions --shell zsh >> ~/.zshrc

SDK Usage

import { Inspect } from "@inspect/sdk";

const inspect = new Inspect({
  apiKey: process.env.ANTHROPIC_API_KEY,
  headless: true,
});

await inspect.init();

// Navigate
await inspect.navigate("https://example.com");

// Execute a single action
await inspect.act("Click the login button");

// Extract structured data
const data = await inspect.extract("Get all product prices");

// Get suggested actions
const actions = await inspect.observe("What can I do on this page?");

// Run a multi-step autonomous agent
const result = await inspect.agent("Complete the checkout flow", {
  maxSteps: 20,
});

// Crawl a website
const crawled = await inspect.crawl("https://example.com", {
  depth: 3,
  maxPages: 100,
});

// Track changes on pages
const changes = await inspect.track(["https://example.com/pricing"], {
  interval: 3600,
});

// Start a fault injection proxy
const proxy = await inspect.createProxy({ preset: "slow-3g" });
// ... run tests with degraded network ...
await proxy.stop();

await inspect.close();

Configuration

Config File

Create inspect.config.ts (or .js, .json) in your project root:

import { defineConfig } from "@inspect/sdk";

export default defineConfig({
  provider: "anthropic",
  headless: true,
  device: "Desktop Chrome",
  timeout: 30_000,
  retries: 2,
  reporter: ["list", "html"],
  outputDir: "./inspect-results",
});

Presets

inspect test -m "test login" --preset ci        # CI-optimized (headless, retries, junit)
inspect test -m "test login" --preset fast       # Fast mode (reduced timeouts)
inspect test -m "test login" --preset thorough   # Thorough (more steps, screenshots)

Performance Budgets

inspect lighthouse https://example.com \
  --budget perf:90,a11y:100,bp:90,seo:90,pwa:50
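
The `--budget` value is a comma-separated list of `category:minimum` pairs. The sketch below shows how such a string could be parsed and compared against a set of scores. `parseBudget` and `checkBudget` are hypothetical names rather than Inspect APIs, and two rules are assumptions: a score below its minimum fails, and so does a missing score.

```typescript
// Hypothetical sketch: parseBudget and checkBudget are not Inspect APIs.
type Budget = Record<string, number>;

// Turn "perf:90,a11y:100" into { perf: 90, a11y: 100 }.
function parseBudget(spec: string): Budget {
  const budget: Budget = {};
  for (const part of spec.split(",")) {
    const [key, raw] = part.split(":");
    const value = Number(raw);
    if (!key || !raw || Number.isNaN(value) || value < 0 || value > 100) {
      throw new Error(`Invalid budget entry: ${part}`);
    }
    budget[key.trim()] = value;
  }
  return budget;
}

// Return the categories whose score falls below the budgeted minimum.
// A category with no score counts as a failure (assumption).
function checkBudget(budget: Budget, scores: Record<string, number>): string[] {
  const failing: string[] = [];
  for (const key of Object.keys(budget)) {
    const minimum = budget[key] ?? 0;
    const score = scores[key];
    if (score === undefined || score < minimum) failing.push(key);
  }
  return failing;
}

const budget = parseBudget("perf:90,a11y:100,bp:90,seo:90,pwa:50");
const failures = checkBudget(budget, { perf: 93, a11y: 97, bp: 90, seo: 91, pwa: 40 });
// failures is ["a11y", "pwa"]
```

A CI wrapper could exit non-zero whenever `failures` is non-empty, which is how a budget becomes a build gate.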

CI/CD Integration

GitHub Actions

# .github/workflows/inspect.yml
name: Inspect Tests
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: pnpm/action-setup@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: pnpm install && pnpm build
      - run: npx inspect test -m "test critical flows" --preset ci --reporter github
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}

GitLab CI

# .gitlab-ci.yml
inspect:
  image: node:20
  before_script:
    - npm i -g pnpm && pnpm install && pnpm build
  script:
    - npx inspect test -m "test critical flows" --preset ci --reporter junit
  artifacts:
    reports:
      junit: inspect-results/junit.xml

CircleCI

# .circleci/config.yml
version: 2.1
jobs:
  inspect:
    docker:
      - image: cimg/node:20.0-browsers
    steps:
      - checkout
      - run: npm i -g pnpm && pnpm install && pnpm build
      - run: npx inspect test -m "test critical flows" --preset ci --reporter junit
      - store_test_results:
          path: inspect-results

Sharding

Split tests across CI workers for parallel execution:

# Worker 1 of 3
inspect run tests/ --shard 1/3 --reporter junit

# Worker 2 of 3
inspect run tests/ --shard 2/3 --reporter junit

# Worker 3 of 3
inspect run tests/ --shard 3/3 --reporter junit
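
For intuition about what a `--shard i/n` value means, the sketch below deals a sorted list of test files round-robin across shards. This is one common strategy, shown as an assumption. Inspect's scheduler may split work differently, and `parseShard` and `selectShard` are hypothetical names.

```typescript
// Hypothetical helpers: illustrative only, not part of Inspect.
interface Shard {
  index: number; // 1-based shard number, as in "2/3"
  total: number; // total number of shards
}

// Parse a spec such as "2/3" into a validated Shard.
function parseShard(spec: string): Shard {
  const match = /^(\d+)\/(\d+)$/.exec(spec.trim());
  if (!match) throw new Error(`Invalid shard spec: ${spec}`);
  const index = Number(match[1]);
  const total = Number(match[2]);
  if (total < 1 || index < 1 || index > total) {
    throw new Error(`Shard index out of range: ${spec}`);
  }
  return { index, total };
}

// Sort first so every worker sees the same order, then deal files round-robin.
function selectShard(files: string[], shard: Shard): string[] {
  return [...files]
    .sort()
    .filter((_, position) => position % shard.total === shard.index - 1);
}

const files = [
  "tests/cart.yaml",
  "tests/login.yaml",
  "tests/search.yaml",
  "tests/checkout.yaml",
];
const shardTwo = selectShard(files, parseShard("2/3"));
// shardTwo is ["tests/checkout.yaml"]
```

Sorting before splitting matters: each CI worker computes its slice independently, so all workers must agree on the order or files could be run twice or skipped.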

Template Generation

inspect init --ci github-actions   # Generate .github/workflows/inspect.yml
inspect init --ci gitlab-ci        # Generate .gitlab-ci.yml
inspect init --ci circleci         # Generate .circleci/config.yml

Architecture

inspect/
├── apps/
│   └── cli/                  CLI (Commander + Ink TUI, 76 commands)
├── packages/
│   ├── shared/               Types (160+), utils (24), constants, device presets
│   ├── observability/        Analytics, tracing, metrics, logging, cost intelligence
│   ├── browser/              Playwright, ARIA, DOM, vision, profiles, backends (Lightpanda)
│   ├── devices/              Device presets and device pool
│   ├── llm/                  5 LLM providers (Claude, GPT, Gemini, DeepSeek, Ollama)
│   ├── agent/                Backward-compat facade (re-exports llm, agent-*)
│   ├── agent-memory/         Action cache, short/long-term memory, compaction
│   ├── agent-tools/          Tool registry, decorators, judge, validator
│   ├── agent-watchdogs/      Captcha, crash, DOM, download, popup watchdogs
│   ├── agent-governance/     Audit trail, autonomy levels, permissions
│   ├── orchestrator/         Test execution, scheduling, recovery, caching
│   ├── core/                 Backward-compat facade (re-exports orchestrator, git, devices)
│   ├── git/                  Git integration, GitHub PR management
│   ├── workflow/             YAML engine, 14 block types (crawl, track, proxy, benchmark)
│   ├── credentials/          AES-256-GCM vault (Bitwarden, 1Password, Azure)
│   ├── data/                 Crawler, change tracking, extractors, parsers, cloud storage
│   ├── api/                  REST server, webhooks, SSE, WebSocket
│   ├── network/              Stealth browsing, proxy, domain security, data masking
│   ├── quality/              Backward-compat facade (re-exports a11y, chaos, etc.)
│   ├── a11y/                 Accessibility auditing (axe-core, WCAG 2.2)
│   ├── lighthouse-quality/   Core Web Vitals, budgets, history
│   ├── chaos/                Gremlins.js chaos testing
│   ├── security-scanner/     Nuclei, ZAP security scanning
│   ├── mocking/              REST/GraphQL/WebSocket mocking
│   ├── resilience/           Network fault injection (TCP proxy, toxics)
│   ├── visual/               Pixel diff, masking, storybook, slider reports
│   ├── reporter/             7 formats: list, dot, json, junit, html, markdown, github
│   ├── sdk/                  Public SDK (9 methods)
│   ├── mcp/                  Standalone MCP server (Model Context Protocol)
│   ├── enterprise/           RBAC, SSO, multi-tenancy, hybrid LLM routing
│   └── services/             Microservice architecture (9 services + 3 infra)
├── evals/                    Benchmarks (MiniWoB, WebArena, WorkArena, reward shaping)
└── docker/                   Dockerfile + Dockerfile.fast

Environment Variables

Variable	Description	Required
`ANTHROPIC_API_KEY`	Claude API key (Sonnet, Opus, Haiku)	For Anthropic provider
`OPENAI_API_KEY`	OpenAI API key (GPT-4o, GPT-4.1, o3)	For OpenAI provider
`GOOGLE_AI_KEY`	Google Gemini API key (2.5 Pro/Flash)	For Gemini provider
`DEEPSEEK_API_KEY`	DeepSeek API key (R1, V3)	For DeepSeek provider
`INSPECT_LOG_LEVEL`	Logging level: `debug`, `info`, `warn`, `error`	No (default: `info`)
`INSPECT_TELEMETRY`	Set to `false` to disable telemetry	No (default: `true`)
`INSPECT_CONFIG`	Path to config file	No
`INSPECT_OUTPUT_DIR`	Output directory for results	No
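
The table above implies a per-provider key requirement. A preflight check along the lines of `inspect doctor` could be sketched as follows. The variable names come from the table, but the helper is hypothetical, and only `anthropic` appears as a provider identifier elsewhere in this README; the other identifiers are assumed.

```typescript
// Variable names come from the table above; the helper itself is hypothetical.
type Provider = "anthropic" | "openai" | "gemini" | "deepseek" | "ollama";

const REQUIRED_KEY: Record<Provider, string | null> = {
  anthropic: "ANTHROPIC_API_KEY",
  openai: "OPENAI_API_KEY",
  gemini: "GOOGLE_AI_KEY",
  deepseek: "DEEPSEEK_API_KEY",
  ollama: null, // local models need no API key
};

type Env = Record<string, string | undefined>;

// Return the name of the missing variable, or null when the provider is usable.
function missingKey(provider: Provider, env: Env): string | null {
  const name = REQUIRED_KEY[provider];
  if (name === null) return null;
  return env[name] ? null : name;
}

// In a real script you would pass process.env instead of this literal.
const sampleEnv: Env = { ANTHROPIC_API_KEY: "sk-ant-..." };
const anthropicMissing = missingKey("anthropic", sampleEnv); // null
const openaiMissing = missingKey("openai", sampleEnv); // "OPENAI_API_KEY"
const ollamaMissing = missingKey("ollama", sampleEnv); // null
```

Taking the environment as a parameter instead of reading `process.env` directly keeps the check easy to unit-test.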

Supported AI Providers

Provider	Models	Features
Anthropic	Claude 4 Sonnet/Opus, Haiku 3.5	Vision, extended thinking, tool use
OpenAI	GPT-4o, GPT-4.1, o3	Vision, function calling
Google	Gemini 2.5 Pro/Flash	Vision, thinking budget
DeepSeek	DeepSeek-R1, V3	Reasoning, cost-efficient
Ollama	Any local model	Privacy, offline use, no API key

Testing Types

Type	Tool	What It Does
Functional	AI Agent	Tests user flows with natural-language instructions
Accessibility	axe-core	WCAG 2.2 compliance with iframe support
Performance	Lighthouse	Core Web Vitals, SEO, PWA scoring
Security	Nuclei + ZAP	OWASP Top 10, CVE scanning, DNS/TCP/SSL
Visual	Pixel diff	Screenshot comparison with configurable thresholds
Chaos	Gremlins.js	Random monkey testing (5 species)
Resilience	Toxiproxy	Network fault injection (TCP proxy, toxicity presets)
Web Crawling	Custom	Sitemap/robots.txt, link discovery, batch scraping
Change Tracking	Custom	Scheduled monitoring with text/JSON diffing
Stealth	Custom	Fingerprint rotation, anti-detection, CAPTCHA detection
Mocking	MSW-inspired	REST + GraphQL + WebSocket handler mocking
Benchmarking	BrowserGym	MiniWoB, WebArena, WorkArena agent evaluation
Story Testing	Lost Pixel	Storybook, Ladle, Histoire visual regression
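
The Visual row mentions screenshot comparison with configurable thresholds. As a rough sketch of the idea, the function below compares two RGBA buffers and reports the fraction of pixels that changed. It is not the algorithm used by Inspect's visual package: the `diffPixels` name, the per-channel tolerance, and the 30% pass threshold are all assumptions.

```typescript
// Hypothetical sketch: not the algorithm used by Inspect's visual package.
interface DiffResult {
  changedPixels: number;
  ratio: number; // changed pixels divided by total pixels
}

// Compare two same-sized RGBA buffers (4 bytes per pixel).
function diffPixels(a: Uint8Array, b: Uint8Array, channelTolerance = 0): DiffResult {
  if (a.length !== b.length || a.length % 4 !== 0) {
    throw new Error("Buffers must be RGBA and the same size");
  }
  let changedPixels = 0;
  for (let i = 0; i < a.length; i += 4) {
    let changed = false;
    for (let c = 0; c < 4; c++) {
      if (Math.abs(a[i + c] - b[i + c]) > channelTolerance) changed = true;
    }
    if (changed) changedPixels++;
  }
  const total = a.length / 4;
  return { changedPixels, ratio: total === 0 ? 0 : changedPixels / total };
}

// 2x2 image: the second pixel differs by 1 (within tolerance), the last differs a lot.
const baseline = new Uint8Array([
  0, 0, 0, 255,  0, 0, 0, 255,  0, 0, 0, 255,  255, 255, 255, 255,
]);
const current = new Uint8Array([
  0, 0, 0, 255,  1, 0, 0, 255,  0, 0, 0, 255,  255, 0, 0, 255,
]);
const diff = diffPixels(baseline, current, 2);
const passes = diff.ratio <= 0.3; // threshold: at most 30% of pixels may change
```

With a channel tolerance of 2, only the last pixel counts as changed, so `diff.ratio` is 0.25 and the comparison passes the 30% threshold.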

Development

# Install dependencies
pnpm install

# Build all packages (Turborepo)
pnpm build

# Run all 1642 tests
npx vitest run

# Run a specific test file
npx vitest run packages/shared/src/utils/index.test.ts

# Watch mode
npx vitest

# Typecheck
pnpm typecheck

# Run the CLI
node apps/cli/dist/index.js --help

Contributing

See CONTRIBUTING.md for development setup, coding conventions, and how to submit changes.
