AI Web Debugger

AI Web Debugger is an Electron desktop application that embeds a sandboxed browser view, instruments it through the Chrome DevTools Protocol (CDP), and exposes a structured tool layer so a large language model can observe and diagnose live web pages. The LLM never touches the DOM, network, or storage directly: everything flows through a ToolRegistry with Zod schemas, risk labels, ActionPolicy gating, ContentBoundaryWrapper nonce isolation, and a Redactor that strips secrets before any data reaches the model. It is designed for developers and QA engineers who want an AI co-pilot for debugging unfamiliar or flaky pages.

Quickstart

This project uses bun as the package manager and script runner.

# 1 — Install dependencies
bun install

# 2 — Start the app in development mode
bun run dev

# 3 — In a separate terminal, start the fixture server (optional but useful for testing)
bun run fixture

# 4 — In the app address bar navigate to:
#     http://127.0.0.1:4321/

The fixture server at http://127.0.0.1:4321/ serves a pre-wired debug site with intentional errors, slow endpoints, and secret-leaking routes so you can exercise every tool without a real target.

Install With Homebrew

brew tap dickwu/tap
brew install --cask ai-web-debugger

The macOS build is unsigned and not notarized. If macOS shows a damaged app or quarantine warning, run:

sudo xattr -d com.apple.quarantine /Applications/AI\ Web\ Debugger.app/

Architecture

The app is split into three Electron processes. The main process owns all privileged work: CDP attachment, network/console recording, the ToolRegistry, AgentRunner, and all file I/O. It communicates with the renderer through a narrow, typed IPC whitelist exposed by contextBridge in the preload script — raw ipcRenderer is never exposed. The target web page runs in a fully sandboxed WebContentsView with no preload and no Node access; it is treated as untrusted at all times.
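
As a rough illustration of the "narrow, typed IPC whitelist" idea, the sketch below builds an API object from a fixed list of channels. It is dependency-free so it can run outside Electron: `invoke` stands in for `ipcRenderer.invoke`, and the channel names and the `buildDebuggerApi` helper are hypothetical, not taken from the project's preload script.

```typescript
// Hypothetical channel names; the real whitelist lives in the project's preload script.
const ALLOWED_CHANNELS = ["agent:run", "browser:navigate", "capture:list"] as const;
type Channel = (typeof ALLOWED_CHANNELS)[number];

// Stand-in for ipcRenderer.invoke so the sketch runs without Electron.
type Invoke = (channel: string, ...args: unknown[]) => Promise<unknown>;

type DebuggerApi = Record<Channel, (...args: unknown[]) => Promise<unknown>>;

// Build one wrapper function per whitelisted channel. The renderer only ever
// receives these wrappers, never the raw invoke function itself.
function buildDebuggerApi(invoke: Invoke): DebuggerApi {
  const api = {} as DebuggerApi;
  for (const channel of ALLOWED_CHANNELS) {
    api[channel] = (...args: unknown[]) => invoke(channel, ...args);
  }
  return Object.freeze(api);
}
```

In a real preload, an object like this would be the value passed to `contextBridge.exposeInMainWorld("debuggerApp", ...)`, so a channel that is not in the list simply has no function the renderer could call.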

CDP events flow from the target page's WebContents into NetworkRecorder and ConsoleRecorder, which redact secrets and write records to CaptureStore. The ToolRegistry handlers read from CaptureStore and pass their results through ContentBoundaryWrapper before they reach the LLM. The AgentRunner drives the tool-use loop: it calls the LLM provider, dispatches tool calls through ToolRegistry, and accumulates evidence references from every assistant message.
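
The tool-use loop described above can be sketched roughly as follows. All types here (`AssistantMessage`, `ToolRegistryLike`, the `maxTurns` limit, and so on) are simplified assumptions made for illustration and are not the project's real interfaces.

```typescript
// All types below are hypothetical simplifications, not the project's real interfaces.
type ToolCall = { name: string; args: Record<string, unknown> };
type AssistantMessage = { text: string; toolCalls: ToolCall[]; evidenceRefs: string[] };
type ToolResult = { name: string; output: string };

interface LLMProvider {
  complete(history: ToolResult[]): Promise<AssistantMessage>;
}
interface ToolRegistryLike {
  run(name: string, args: Record<string, unknown>): Promise<string>;
}

// Drive the tool-use loop: ask the model, run any requested tools through the
// registry, feed the results back, and stop when no more tools are requested.
async function runAgent(
  llm: LLMProvider,
  tools: ToolRegistryLike,
  maxTurns = 5,
): Promise<{ answer: string; evidence: string[] }> {
  const history: ToolResult[] = [];
  const evidence: string[] = [];
  let answer = "";
  for (let turn = 0; turn < maxTurns; turn++) {
    const message = await llm.complete(history);
    evidence.push(...message.evidenceRefs); // collected from every assistant message
    answer = message.text;
    if (message.toolCalls.length === 0) break;
    for (const call of message.toolCalls) {
      history.push({ name: call.name, output: await tools.run(call.name, call.args) });
    }
  }
  return { answer, evidence };
}
```

The only way this loop reaches the page is through `tools.run`, which mirrors the rule that the agent has no direct CDP, file system, or network access.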

┌─────────────────────────────────────────────────────────────────────┐
│  Main Process                                                        │
│                                                                      │
│  AppController ──► BrowserManager ──► TargetPage (WebContentsView)  │
│       │                  │                  │                        │
│       │            CaptureStore         CDP attach                   │
│       │          NetworkRecorder   ◄────────┘                        │
│       │          ConsoleRecorder                                     │
│       │                  │                                           │
│       │            ToolRegistry (Zod + ActionPolicy + Redactor)      │
│       │                  │                                           │
│       └──────────► AgentRunner ──► LLMProvider (mock / anthropic)   │
│                          │                                           │
│                    ContentBoundaryWrapper (nonce per run)            │
└────────────────────────────┬────────────────────────────────────────┘
                             │ typed IPC (contextBridge only)
┌────────────────────────────▼────────────────────────────────────────┐
│  Preload  uiPreload.ts — exposes window.debuggerApp                  │
└────────────────────────────┬────────────────────────────────────────┘
                             │
┌────────────────────────────▼────────────────────────────────────────┐
│  Renderer (React)  panels: Network / Console / Snapshot / Agent     │
└─────────────────────────────────────────────────────────────────────┘
                                         (separate, sandboxed view)
┌─────────────────────────────────────────────────────────────────────┐
│  Target WebContentsView  — nodeIntegration:false, sandbox:true      │
│  No preload. Treated as untrusted. CDP attached from main process.  │
└─────────────────────────────────────────────────────────────────────┘

Security Model

Five invariants are enforced unconditionally and tested by tests/integration/security-invariants.test.ts:

1. Sandboxed target page: nodeIntegration: false, contextIsolation: true, sandbox: true, no preload. Inside the target page, typeof require always evaluates to "undefined".
2. All CDP in main process: no renderer or target-page code ever calls CDP directly. CDPClient lives entirely in src/main/cdp/.
3. ToolRegistry-only LLM access: AgentRunner calls toolRegistry.run() exclusively. No direct CDP, fs, or network calls are made from within the agent loop.
4. ContentBoundary on all page text: every tool whose output is 'page-data' is wrapped by ContentBoundaryWrapper.wrapJson() with a per-run nonce before the data reaches the LLM, preventing prompt injection from page content.
5. ActionPolicy default deny for risky categories: eval, network, state, download, and upload are denied; click, fill, interact, and eval-readonly require user confirmation; navigate, snapshot, get, wait, and dialog are auto-allowed.
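
To make the nonce-boundary idea concrete, here is a minimal sketch of wrapping page data between markers that carry a per-run nonce. The marker format, the `BoundaryWrapper` class, and the rejection rule are assumptions made for illustration; the project's ContentBoundaryWrapper may work differently. The nonce is passed in so the example stays deterministic; in practice it should come from a cryptographically secure random source and be regenerated for every run.

```typescript
// Hypothetical sketch; the project's ContentBoundaryWrapper may differ in format and API.
class BoundaryWrapper {
  private readonly nonce: string;

  constructor(nonce: string) {
    this.nonce = nonce;
  }

  // Serialize untrusted page data and fence it between nonce-tagged markers.
  wrapJson(data: unknown): string {
    const body = JSON.stringify(data ?? null);
    // Page content that contains the nonce could forge a closing marker, so refuse it.
    if (body.includes(this.nonce)) {
      throw new Error("page data contains the boundary nonce");
    }
    return `<<page-data ${this.nonce}>>\n${body}\n<<end-page-data ${this.nonce}>>`;
  }
}
```

Because the page cannot predict the nonce, text such as "ignore previous instructions" inside the markers can be treated by the model as data rather than as instructions.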

Configured Tools

Tool	Category	Risk
`browser.open_url`	navigate	page_action
`browser.open_blank`	navigate	page_action
`browser.reload`	navigate	page_action
`browser.back`	navigate	page_action
`browser.forward`	navigate	page_action
`browser.stop`	navigate	page_action
`browser.wait_for_load_state`	wait	read
`browser.batch`	navigate	page_action
`page.snapshot`	snapshot	read
`page.screenshot`	snapshot	read
`page.get_interactive_elements`	snapshot	read
`page.get_by_ref`	snapshot	read
`page.find`	snapshot	read
`page.query_selector`	snapshot	read
`page.wait_for`	wait	read
`page.scroll`	interact	page_action
`page.click`	click	page_action
`page.type`	fill	page_action
`page.press`	interact	page_action
`page.evaluate_readonly`	eval-readonly	read
`network.list_requests`	get	read
`network.get_request`	get	read
`network.get_response_body`	get	read
`console.list_messages`	get	read
`storage.get_cookies`	get	read
`storage.get_local_storage`	get	read
`storage.get_session_storage`	get	read
`dialog.status`	dialog	read
`dialog.accept`	dialog	page_action
`dialog.dismiss`	dialog	page_action
`diagnostics.summarize_current_page`	get	read
`diagnostics.doctor`	get	read
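
The Category column is what the ActionPolicy acts on. The sketch below shows one way such a category-to-decision mapping could look, using the category lists from the Security Model section. The `decide` function and the `Decision` type are hypothetical, and treating unknown categories as denied is an assumption based on the "default deny" wording.

```typescript
type Decision = "allow" | "confirm" | "deny";

// Category lists copied from the Security Model section.
const AUTO_ALLOWED = new Set(["navigate", "snapshot", "get", "wait", "dialog"]);
const NEEDS_CONFIRMATION = new Set(["click", "fill", "interact", "eval-readonly"]);

// Anything not explicitly listed falls through to "deny", which covers
// eval, network, state, download, upload, and any unknown category.
function decide(category: string): Decision {
  if (AUTO_ALLOWED.has(category)) return "allow";
  if (NEEDS_CONFIRMATION.has(category)) return "confirm";
  return "deny";
}
```

Under this sketch, `network.list_requests` (category `get`) would run without a prompt, while `page.click` (category `click`) would wait for user confirmation.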

Configuration

Environment variables

Variable	Description
`LLM_PROVIDER`	Provider to use: `mock` (default) or `anthropic`
`LLM_API_KEY`	API key for the selected provider
`LLM_MODEL`	Model name (e.g. `claude-3-5-sonnet-20241022`)
`AI_WEB_DEBUGGER_LLM_PROVIDER`	Override — takes precedence over `LLM_PROVIDER`
`AI_WEB_DEBUGGER_LLM_API_KEY`	Override — takes precedence over `LLM_API_KEY`
`AI_WEB_DEBUGGER_LLM_MODEL`	Override — takes precedence over `LLM_MODEL`

Config files

Settings are loaded in this order (later entries win):

1. Built-in defaults
2. <userData>/config.json (persisted user settings)
3. ./ai-web-debugger.json (project-level override, checked into your repo)
4. Environment variables with AI_WEB_DEBUGGER_* prefix
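
The "later entries win" rule can be sketched as a simple layered merge. The setting keys (`provider`, `apiKey`, `model`) and both helper functions are hypothetical. This README does not say where the unprefixed `LLM_*` variables rank relative to the config files, so the sketch assumes they belong to the same environment layer and only lose to their prefixed counterparts.

```typescript
type Settings = { provider: string; apiKey?: string; model?: string };
type Env = Record<string, string | undefined>;

// Read one LLM setting from the environment; the prefixed name wins over the plain one.
function envValue(env: Env, name: "PROVIDER" | "API_KEY" | "MODEL"): string | undefined {
  return env[`AI_WEB_DEBUGGER_LLM_${name}`] ?? env[`LLM_${name}`];
}

// Merge layers in load order so that later layers override earlier ones.
function resolveSettings(
  userConfig: Partial<Settings>,
  projectConfig: Partial<Settings>,
  env: Env,
): Settings {
  const defaults: Settings = { provider: "mock" };
  const fromEnv: Partial<Settings> = {};
  const provider = envValue(env, "PROVIDER");
  const apiKey = envValue(env, "API_KEY");
  const model = envValue(env, "MODEL");
  // Only copy values that are actually set, so unset variables never erase a lower layer.
  if (provider !== undefined) fromEnv.provider = provider;
  if (apiKey !== undefined) fromEnv.apiKey = apiKey;
  if (model !== undefined) fromEnv.model = model;
  return { ...defaults, ...userConfig, ...projectConfig, ...fromEnv };
}
```

With no environment variables set, a `model` in the project file overrides one in the user config, and the provider stays at the `mock` default.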

<userData> on macOS is ~/Library/Application Support/ai-web-debugger.

Artifact locations

Artifact	Path
Session screenshots	`<userData>/sessions/<sessionId>/screenshots/*.png`
JSONL capture events	`<userData>/sessions/<sessionId>/events.jsonl`
Session metadata	`<userData>/sessions/<sessionId>/session.json`
Main process log	`<userData>/logs/main.jsonl`

Each app launch creates a new <sessionId> (UUID v4).
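
For scripts that post-process a session, the layout in the table can be expressed as a small helper. The function name and the returned field names are hypothetical; only the path layout comes from the table. Forward slashes are used for brevity, and real code would more likely build these with a path-joining utility.

```typescript
// Layout taken from the artifact table; the helper and its field names are hypothetical.
function sessionArtifactPaths(userData: string, sessionId: string) {
  const sessionDir = `${userData}/sessions/${sessionId}`;
  return {
    screenshotsDir: `${sessionDir}/screenshots`,
    events: `${sessionDir}/events.jsonl`,
    metadata: `${sessionDir}/session.json`,
    mainLog: `${userData}/logs/main.jsonl`, // shared across sessions
  };
}
```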

Scripts

Script	Description
`bun run dev`	Start app in development mode with hot reload
`bun run build`	Typecheck + compile with electron-vite
`bun run preview`	Preview the production build locally
`bun run start`	Run the compiled app (`electron .`)
`bun run test`	Run unit and integration tests with vitest (plain `bun test` invokes Bun's built-in test runner rather than this script)
`bun run test:watch`	Watch mode for tests
`bun run lint`	Run ESLint (flat config, TypeScript-aware)
`bun run typecheck`	Run `tsc --noEmit` only
`bun run fixture`	Start the local debug-site fixture server on port 4321
`bun run dist`	Package the app with electron-builder (do not run in CI without signing)

Testing

# Run the full test suite (unit + integration)
bun run test

# For manual end-to-end testing, start the fixture server first:
bun run fixture
# Then launch the app and navigate to http://127.0.0.1:4321/
bun run dev

The integration tests under tests/integration/ are mock-driven — they do not launch a real Electron process. agent-mock-flow.test.ts exercises the full AgentRunner → ToolRegistry → MockLLMProvider pipeline with an in-memory CaptureStore. security-invariants.test.ts performs static source-file checks and runtime Redactor assertions.
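
A runtime Redactor assertion of the kind mentioned above might look like the following. The patterns and the `redact` function are illustrative assumptions only; the project's actual Redactor rules are not described in this README.

```typescript
// Hypothetical patterns; the project's Redactor rules are not shown in this README.
const SECRET_PATTERNS: RegExp[] = [
  /(authorization:\s*bearer\s+)[\w.~+\/-]+=*/gi,
  /((?:api[_-]?key|token|password)=)[^&\s]+/gi,
];

// Replace the secret part of each match while keeping the label that precedes it.
function redact(text: string): string {
  return SECRET_PATTERNS.reduce(
    (acc, pattern) => acc.replace(pattern, "$1[REDACTED]"),
    text,
  );
}

// Simple runtime checks in the spirit of a security-invariants test.
const header = redact("Authorization: Bearer abc.def-123");
if (header.includes("abc.def-123")) throw new Error("bearer token leaked");

const url = redact("GET /login?password=hunter2&next=/");
if (url.includes("hunter2")) throw new Error("password leaked");
```

Keeping the label (for example `password=`) in the output preserves enough context for debugging without exposing the value.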

Known limitations

The following items are out of scope for MVP v0.1 and are tracked in the implementation plan (§7 Out of scope for MVP):

Multi-tab support (only one target WebContentsView per session)
Full iframe / OOPIF multi-frame tracking
SQLite persistence (currently in-memory CaptureStore only)
Real code-signing and notarization for distribution builds
Network request interception / mutation tools
Playwright-style assertions in page.wait_for (subset implemented)
OpenAI provider adapter (only Anthropic and mock are wired)

License

MIT — see LICENSE.
