Browser Sense

AI-native browser sensing layer
API sniffing · Page detection · Recording/Replay · Visual-semantic screenshots

Browser Sense is an AI-native browser control layer that treats the browser as a sensor, not a remote. Instead of blind snapshot-and-click loops, it provides:

API Sniffing — Capture JSON API responses directly, skipping DOM parsing entirely
Page Type Detection — Auto-classify pages (search/product/article/form/video) and extract structured schemas
Recording/Replay — Record action sequences and replay them later
Visual-Semantic Screenshots — Screenshots with @e ref annotations overlaid on interactive elements
MCP Server — Expose all capabilities as MCP tools for any AI agent (Claude, Cursor, etc.)

Why Browser Sense?

Problem with existing tools	Browser Sense solution
AI can't see what changed on the page	Incremental DOM diff pushed via SSE
Parsing DOM to extract data is fragile	API sniffing — get JSON directly from network
`@e` refs break when page updates	Cached ref map stays stable across operations
Multi-line JS is a JSON escaping nightmare	Execute from file: `{"file": "/tmp/code.js"}`
Each tool only works with one AI	MCP Server works with Claude, Cursor, and any MCP client

Quick Start

# 1. Clone and install
git clone https://github.com/azhu9701/browser-sense.git
cd browser-sense
bun install

# 2. Start the daemon
bun daemon/server.js

# 3. Load the Chrome extension
# Open chrome://extensions → Enable Developer mode → Load unpacked → Select ./extension/

# 4. Test
curl -s http://127.0.0.1:19000/status

Or use the one-liner installer:

bash install.sh

Features

API Sniffing

The killer feature. Intercept JSON API responses and skip DOM parsing.

# One-call: sniff + navigate + return structured data
curl -s -X POST http://127.0.0.1:19000/sniff/auto \
  -d '{"url": "https://example.com", "wait": 3000}'
# → {"apiCount": 2, "apis": [{"url": "/api/products", "body": [...]}]}

Domain-filtered — only captures requests from the same domain as the active tab.

Page Type Detection

curl -s http://127.0.0.1:19000/detect
# → {"type": "search_results", "schema": {"results": [...]}}
# → {"type": "product", "schema": {"name": "...", "price": "..."}}
# → {"type": "article", "schema": {"title": "...", "author": "..."}}

Supported types: search_results, product, article, form, video, table, feed, homepage, generic.

Recording / Replay

# Record a sequence
curl -s -X POST http://127.0.0.1:19000/record/start -d '{"name": "login-flow"}'
# ... perform actions ...
curl -s http://127.0.0.1:19000/record/stop

# Replay
curl -s -X POST http://127.0.0.1:19000/record/play -d '{"name": "login-flow"}'

Visual-Semantic Screenshots

# Screenshot with @e ref labels on every interactive element
curl -s "http://127.0.0.1:19000/screenshot?annotate=true"

API Reference

Endpoint	Method	Description
`/status`	GET	Daemon + extension health
`/navigate`	POST	Open URL
`/snapshot`	GET	Accessibility tree with `@e` refs
`/read`	GET	Plain text extraction
`/detect`	GET	Page type + structured schema
`/screenshot`	GET	Screenshot (use `?annotate=true`)
`/execute`	POST	Run JS (`{"code": "..."}` or `{"file": "..."}`)
`/sniff/auto`	POST	One-call API sniffing
`/sniff/start`	POST	Manual sniffing
`/sniff/stop`	GET	Stop sniffing
`/record/start`	POST	Start recording
`/record/stop`	GET	Stop and save
`/record/play`	POST	Replay recording
`/record/list`	GET	List recordings
`/events`	GET	SSE stream for real-time push

MCP Server

Expose Browser Sense to any MCP-compatible AI agent:

# Start MCP server
bun daemon/mcp-server.js

# Register with Claude Code
claude mcp add browser-sense -- "bun /path/to/browser-sense/daemon/mcp-server.js"

Available tools: browser_status, browser_navigate, browser_snapshot, browser_read, browser_detect, browser_click, browser_fill, browser_screenshot, browser_execute, browser_sniff_auto, browser_record_start, browser_record_stop, browser_record_play, browser_record_list, browser_list_tabs, browser_find_tab, browser_close_tab.

Architecture

Chrome Extension (Manifest V3)
├── content.js      — DOM observer, a11y tree, text extraction, annotation overlay, page classifier
├── background.js   — WebSocket ↔ daemon, command router, network interception (API sniffing)
└── popup.html      — Connection status, quick controls

Local Daemon (Bun/Node)
├── server.js       — HTTP API, WebSocket server, recording engine, event broadcasting
└── mcp-server.js   — MCP protocol wrapper exposing all tools

Requirements

Bun ≥ 1.0.0 (or Node.js ≥ 18)
Chrome or Edge (Chromium-based)
macOS, Linux, or Windows

License

MIT — see LICENSE

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
daemon		daemon
extension		extension
recordings		recordings
skill		skill
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
README.zh.md		README.zh.md
install.sh		install.sh
package.json		package.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Browser Sense

Why Browser Sense?

Quick Start

Features

API Sniffing

Page Type Detection

Recording / Replay

Visual-Semantic Screenshots

API Reference

MCP Server

Architecture

Requirements

License

About

Uh oh!

Releases 1

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Browser Sense

Why Browser Sense?

Quick Start

Features

API Sniffing

Page Type Detection

Recording / Replay

Visual-Semantic Screenshots

API Reference

MCP Server

Architecture

Requirements

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages