unbrowser

Web access for LLM agents. One static binary. No Chrome.

Install

Python (recommended) — wheel ships the native binary. Requires Python 3.10+:

pipx install pyunbrowser   # cleanest on macOS Homebrew / modern Linux (handles PEP 668)
pip  install pyunbrowser   # in a venv on python3.10+

macOS gotcha: the system /usr/bin/python3 is 3.9 and the wheel will reject it with "requires Python >=3.10". Use Homebrew's python3.13 or pipx (which manages its own Python). If pip install fails with PEP 668 ("externally-managed-environment"), that's the same issue — pipx install pyunbrowser is the right call.

from unbrowser import Client       # note: pip name is pyunbrowser, import is unbrowser
with Client() as ub:                # (PyPI's name moderation blocks 'unbrowser';
    r = ub.navigate("https://news.ycombinator.com")   # py- prefix is the standard workaround)

Cargo — binary only, no Python wrapper:

cargo install unbrowser
unbrowser --mcp

Pre-built tarball — for systems without Python or Rust:

# macOS Apple Silicon
curl -L https://github.com/protostatis/unbrowser/releases/latest/download/unbrowser-aarch64-apple-darwin.tar.gz | tar xz
# macOS Intel
curl -L https://github.com/protostatis/unbrowser/releases/latest/download/unbrowser-x86_64-apple-darwin.tar.gz | tar xz
# Linux x86_64 (glibc 2.35+)
curl -L https://github.com/protostatis/unbrowser/releases/latest/download/unbrowser-x86_64-unknown-linux-gnu.tar.gz | tar xz

From source:

cargo build --release   # binary at ./target/release/unbrowser

Bare RPC

echo '{"id":1,"method":"navigate","params":{"url":"https://news.ycombinator.com"}}' | unbrowser

That's the install. Runs anywhere a static binary runs — laptop, Lambda, Cloudflare Workers, edge, embedded.

Open source under Apache 2.0. When the cheap path can't handle a page (heavy SPAs, behavioral bot challenges), escalate to a real browser via unchainedsky-cli (drives your local Chrome via CDP) or the Unchained desktop app.

By the numbers

	This binary	Headless Chrome (Playwright/Puppeteer)
Binary size	~10MB	250MB+ Chrome download
RAM / session	~50MB	200–500MB
Cold start	~100ms	~1s
Tokens / page (LLM)	~500 (BlockMap inline)	tens of thousands of HTML, parsed by you
Install steps	`cargo build`	install Chrome + Node + Playwright + system deps
Lambda / Workers / edge	✅	❌ Chrome too big
100K pages/day cost	$0 (your infra)	$$$ Chrome fleet or hosted API

5–10× lower memory, 25× smaller binary, 10× faster cold start, 70× lower per-page token cost. That's the tradeoff this product makes — defer JS-rendering (Phase 4/5) and pixel rendering (out of scope) in exchange for a footprint that fits in places Chrome doesn't.

Agent-friendly by design

This isn't a Chrome wrapper that an agent uses through a Puppeteer-shaped abstraction. It's a browser whose every output is shaped for LLM consumption:

navigate returns a BlockMap — ~500 tokens of structured page summary (landmarks, headings, interactives, density signals) right in the response. No follow-up call needed to know what's on the page.
Stable element refs (e:142) — query, click, type, submit using opaque handles. The LLM never has to scrape the DOM itself.
challenge field on every blocked navigate — provider, confidence, and the exact clearance cookie name. The agent reacts intelligently instead of guessing.
density.likely_js_filled heuristic — distinguishes "real SSR page" from "SSR shell with JS-filled cells" (the CNBC trap). The agent bails before burning round-trips on a page it can't read.
MCP-native — unbrowser --mcp exposes 12 tools to any MCP host (Claude Code, Claude Desktop, Cursor, Cline). 4 lines of config, zero glue code.
Real Chrome fingerprint (Chrome 131 JA4 + Akamai H2 hash) so sites don't block you for being a script.

For pages that do need real Chrome (heavy SPAs, JS-challenge bot walls), the binary detects them and accepts cookies via cookies_set — so you solve once in Chrome and replay forever here.

Quick demo — Hacker News top 3

from unbrowser import Client

with Client() as ub:
    ub.navigate("https://news.ycombinator.com")
    for s in ub.query(".titleline > a")[:3]:
        print(s["text"], s["attrs"]["href"])

5 lines, no headless browser install. Output is structured JSON, not 35KB of HTML. The Client wrapper handles subprocess lifecycle (atexit reaper so orphans are impossible), JSON-RPC framing, and surfaces real exceptions instead of silent result lookups.

Bare-RPC version (if you can't use Python)

The same demo without the wrapper — useful for languages other than Python or one-shot bash calls. The protocol is JSON-RPC over stdin/stdout, one JSON object per line:

import subprocess, json
p = subprocess.Popen(["./target/release/unbrowser"],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True, bufsize=1)
i = 0
def call(method, **params):
    global i; i += 1
    p.stdin.write(json.dumps({"id": i, "method": method, "params": params}) + "\n")
    p.stdin.flush()
    return json.loads(p.stdout.readline())["result"]

call("navigate", url="https://news.ycombinator.com")
for s in call("query", selector=".titleline > a")[:3]:
    print(s["text"], s["attrs"]["href"])

That's the entire protocol surface. Same shape from any language with subprocess + JSON.

SPA tier — what works, what doesn't

Empirical, not aspirational. Latest matrix: 28/30 on tested categories.

Page tier	Coverage	What to expect
Static + SSR (Wikipedia, MDN, news, docs, GitHub repo browsing, search engines, archive.org)	✅ excellent	sub-second navigate; full BlockMap; all selectors work; ~hundreds of tokens vs ~tens of KB raw
SSR + light hydration (Next.js docs, marketing pages, react.dev's static content)	✅ usable	reads SSR'd content fine; hydration adds nothing but doesn't break either
Bot-walled with cookie handoff (Zillow, Cloudflare-protected sites)	✅ via `cookies_set`	solve once in Chrome, replay forever; `challenge.provider` field tells the agent which vendor
Module-loader SPAs (Ember, AMD apps like crates.io)	⚠️ partial with `exec_scripts: true`	bundles fetch + execute, modules register, but framework auto-mount needs case-by-case shimming
Heavy React/Vue bundles (react.dev runtime, large dashboard apps)	⚠️ bounded — won't hang, won't render	with `exec_scripts: true` the navigate completes inside the 30s wall-clock budget (5s for the script-eval phase, the rest for settle); rendered DOM may not materialize. Tune via `UNBROWSER_TIMEOUT_MS`
Apps requiring Workers / Canvas / IndexedDB / WebGL	❌ out of scope by design	use the cookie-handoff path with real Chrome via `unchainedsky-cli` (CDP) or the Unchained desktop app
Hardest-tier anti-bot (PerimeterX with behavioral, Kasada, Akamai BMP advanced)	❌ even cookie handoff is fragile	real Chrome via CDP is the right tier

Vs the alternatives:

	This	curl	Playwright / headless Chrome
Static / SSR pages	✅	✅ but token-heavy	overkill
SPA-shell sites	⚠️ partial via `exec_scripts`	❌	✅
Bot-walled (with cookie handoff)	✅	❌	✅
Run in Lambda / Workers / edge	✅	✅	❌ Chrome too big
Per-page cost at 100K/day	~free	~free	$$$
LLM-shaped output	✅ BlockMap inline	DIY parse	DIY parse

Verified against (working)

Concrete sites tested with measured times. Cold-start to extracted-result.

Category	Sites	Time
Reference / docs	Wikipedia, MDN, docs.rs, PyPI, react.dev (SSR portion)	0.9 – 5.8s
News	Hacker News, BBC, TechCrunch, ArXiv listings	1 – 1.6s
Search	Google `/search`, Bing, Brave, DuckDuckGo (html)	0.2 – 1.8s
Dev	GitHub repo pages, npm, StackOverflow, HuggingFace model cards	0.7 – 2.4s
Crypto / finance	CoinGecko, Yahoo Finance (post-redirect-fix)	3.5 – 6.9s
Social	Lobsters, old.reddit.com	0.9 – 1.4s
Govt / institutional	arXiv, archive.org, gov.uk	0.6 – 1.0s
Interaction primitives	type, click + auto-follow, cookies_set/get/replay, eval, query_text	0.3 – 1.3s

Surprises: all four major search engines work cleanly. CoinGecko's heavy dashboard SSRs enough that quotes come through. HuggingFace model cards expose model name in <h1>.

Bot-detection diagnostics

Every blocked navigate returns a challenge field naming the vendor (perimeterx_block, cloudflare_turnstile, aws_waf, datadome, akamai_bmp, imperva, arkose_labs, recaptcha, press_hold, yahoo_sad_panda, interstitial, generic_human_verification, unknown_block) plus the expected clearance cookie name. Agents react with cookie handoff via cookies_set instead of guessing.

SPA-detection diagnostics

Every navigate's blockmap.density field signals SPA-ness so agents bail before wasting round-trips:

thin_shell: true — page is < 4KB body text with no headings or interactives (typical React/Ember root)
likely_js_filled: true — <table> shells exist but are empty (CNBC-class trap)
json_scripts: N — count of <script type="application/json"> (often holds the data the JS would render — try eval() on those before escalating)

Three ways agents talk to it

MCP (no glue)

{"mcpServers":{"unchained":{"command":"unbrowser","args":["--mcp"]}}}

12 tools auto-discovered by Claude Code, Claude Desktop, Cursor, Cline.

Subprocess (custom runtimes)

13 lines of Python (above). Or any language with subprocess + JSON.

Auto-escalation router (`scripts/router.py`)

from scripts.router import Router, RouterConfig, cached_cookies_solver

with Router(RouterConfig(
    binary="./target/release/unbrowser",
    chrome_solver=cached_cookies_solver("cookies.json"),
)) as r:
    r.navigate("https://www.zillow.com/homes/for_rent/")  # auto-handles 403 + cookie replay

Live event watcher (`scripts/watch.py`)

The binary emits NDJSON events (ready, navigate, challenge) on stderr. Pipe them through watch.py for color-coded one-liners:

unbrowser 2> >(python3 scripts/watch.py)

RPC methods


`navigate {url}`	fetch + parse + return `{status, url, bytes, headers, blockmap, challenge}`
`query {selector}`	CSS query → `[{ref, tag, attrs, text}]`
`text {selector?}`	textContent of FIRST match (default `body`). On Wikipedia/MDN/news sites the first `<p>` is often a hatnote — prefer `text_main` for article body.
`text_main`	textContent of `<main>` / `[role=main]` / single `<article>` / longest non-chrome subtree. Use this for reading article/docs/blog content.
`click {ref}`	dispatch click; auto-follows `<a href>` (returns `{status, url, bytes, headers, blockmap, challenge}` — same shape as `navigate`)
`type {ref, text}`	set value + dispatch input/change events
`submit {ref}`	gather GET-form fields + navigate
`eval {code}`	run JS in embedded QuickJS
`cookies_set / cookies_get / cookies_clear`	session jar
`blockmap`	recompute the page summary
`body`	raw HTML of last navigation

CSS selector engine: tag, id, class, [attr=val] (also ^=, $=, *=, ~=), all four combinators ( , >, +, ~), :first/last/nth-child/of-type, :only-child/of-type. Use eval for :not(), :has(), formulas.

When to escalate to real Chrome

This binary is the cheap path. For the cases it can't handle (heavy framework hydration, behavioral bot challenges, Workers/Canvas/IndexedDB), the next tier is a real Chrome instance driven via CDP. Two ways to get there:

	This binary	`unchainedsky-cli`	Unchained desktop app
Runs JS	QuickJS (no V8 JIT)	real Chrome via CDP	real Chrome (the user's, with their logins)
SPA hydration	partial	✅	✅
Bot challenges	cookie handoff only	active solving via real browser	manual / interactive
Setup	`pip install pyunbrowser`	`pip install unchainedsky-cli`	desktop install
Audience	agent / pipeline	agent / pipeline	end user
Per-page footprint	~50MB	full Chrome	full Chrome

The escalation path is a deliberate choice, not an automatic fallback — you ship pyunbrowser for the 80% of pages that work cheap, then route the 20% to unchainedsky-cli (or to a human via the desktop app). The vocabulary (navigate, query, click, cookies_set, BlockMap) is shared so code transfers cleanly.

Honest limits

Script execution is opt-in via exec_scripts: true. Default navigate skips it (the SSR/static path is what most agents want). With it on, inline + external <script> tags run in QuickJS — works for many SPAs, but heavy framework bootstraps (Ember, big React) often don't auto-mount because shims can't fake every browser-specific signal. The blockmap's density.likely_js_filled flag tells agents in one call when to escalate instead of burning round-trips.
All eval is wall-clock bounded. A 30s watchdog (configurable via UNBROWSER_TIMEOUT_MS, clamped to 1s..10min) covers script execution AND every subsequent settle/microtask/timer callback, so a hostile site can never wedge the binary or strand a CPU-pegged orphan process.
GET-only form submit. POST/multipart errors out — construct the request manually via eval or escalate.
Hardest-tier bot detection (PerimeterX with behavioral telemetry, advanced Akamai BMP, Kasada) needs the cookie-handoff path. The binary detects and labels the challenge for you, but solving it requires real Chrome (or a token vendor).
No screenshots. Out of scope by design.

Build

Rust 1.95+ via rustup. On macOS, also brew install cmake ninja (BoringSSL dependency).

cargo build --release

~2 min first build (BoringSSL compiles), instant after.

Architecture in one diagram

JSON-RPC stdin ─┐    ┌─ stdout
                ▼    ▲
         ┌────────────────────┐
         │  rquest (Chrome131 │   ┌──────────┐    ┌──────────────────┐
         │  TLS+H2 fingerprint)├──▶ html5ever ├───▶ rquickjs +       │
         │                    │   │  parser  │    │  dom.js +        │
         │  cookie_store      │   └──────────┘    │  blockmap.js +   │
         │  (jar)             │                   │  interact.js     │
         └────────────────────┘                   └──────────────────┘

License

Apache 2.0 — see LICENSE.

For the cases this binary can't handle (heavy framework hydration, behavioral bot challenges, anything needing real Chrome), the next tier is unchainedsky-cli — drives a real Chrome via CDP, same vocabulary. End-users who want a point-and-click agent can skip the CLI entirely and use the Unchained desktop app.

Name		Name	Last commit message	Last commit date
Latest commit History 50 Commits
.github/workflows		.github/workflows
profiles		profiles
python		python
scripts		scripts
skills/unbrowser		skills/unbrowser
src		src
.gitignore		.gitignore
CLAUDE.md		CLAUDE.md
Cargo.lock		Cargo.lock
Cargo.toml		Cargo.toml
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

unbrowser

Install

Bare RPC

By the numbers

Agent-friendly by design

Quick demo — Hacker News top 3

SPA tier — what works, what doesn't

Verified against (working)

Bot-detection diagnostics

SPA-detection diagnostics

Three ways agents talk to it

MCP (no glue)

Subprocess (custom runtimes)

Auto-escalation router (`scripts/router.py`)

Live event watcher (`scripts/watch.py`)

RPC methods

When to escalate to real Chrome

Honest limits

Build

Architecture in one diagram

License

About

Uh oh!

Releases 2

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

unbrowser

Install

Bare RPC

By the numbers

Agent-friendly by design

Quick demo — Hacker News top 3

SPA tier — what works, what doesn't

Verified against (working)

Bot-detection diagnostics

SPA-detection diagnostics

Three ways agents talk to it

MCP (no glue)

Subprocess (custom runtimes)

Auto-escalation router (scripts/router.py)

Live event watcher (scripts/watch.py)

RPC methods

When to escalate to real Chrome

Honest limits

Build

Architecture in one diagram

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 2

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Auto-escalation router (`scripts/router.py`)

Live event watcher (`scripts/watch.py`)

Packages