Web access for LLM agents. One static binary. No Chrome.
Python (recommended) — wheel ships the native binary. Requires Python 3.10+:
pipx install pyunbrowser # cleanest on macOS Homebrew / modern Linux (handles PEP 668)
pip install pyunbrowser # in a venv on python3.10+macOS gotcha: the system
/usr/bin/python3is 3.9 and the wheel will reject it with "requires Python >=3.10". Use Homebrew'spython3.13orpipx(which manages its own Python). Ifpip installfails with PEP 668 ("externally-managed-environment"), that's the same issue —pipx install pyunbrowseris the right call.
from unbrowser import Client # note: pip name is pyunbrowser, import is unbrowser
with Client() as ub: # (PyPI's name moderation blocks 'unbrowser';
r = ub.navigate("https://news.ycombinator.com") # py- prefix is the standard workaround)Cargo — binary only, no Python wrapper:
cargo install unbrowser
unbrowser --mcpPre-built tarball — for systems without Python or Rust:
# macOS Apple Silicon
curl -L https://github.com/protostatis/unbrowser/releases/latest/download/unbrowser-aarch64-apple-darwin.tar.gz | tar xz
# macOS Intel
curl -L https://github.com/protostatis/unbrowser/releases/latest/download/unbrowser-x86_64-apple-darwin.tar.gz | tar xz
# Linux x86_64 (glibc 2.35+)
curl -L https://github.com/protostatis/unbrowser/releases/latest/download/unbrowser-x86_64-unknown-linux-gnu.tar.gz | tar xzFrom source:
cargo build --release # binary at ./target/release/unbrowserecho '{"id":1,"method":"navigate","params":{"url":"https://news.ycombinator.com"}}' | unbrowserThat's the install. Runs anywhere a static binary runs — laptop, Lambda, Cloudflare Workers, edge, embedded.
Open source under Apache 2.0. When the cheap path can't handle a page (heavy SPAs, behavioral bot challenges), escalate to a real browser via unchainedsky-cli (drives your local Chrome via CDP) or the Unchained desktop app.
| This binary | Headless Chrome (Playwright/Puppeteer) | |
|---|---|---|
| Binary size | ~10MB | 250MB+ Chrome download |
| RAM / session | ~50MB | 200–500MB |
| Cold start | ~100ms | ~1s |
| Tokens / page (LLM) | ~500 (BlockMap inline) | tens of thousands of HTML, parsed by you |
| Install steps | cargo build |
install Chrome + Node + Playwright + system deps |
| Lambda / Workers / edge | ✅ | ❌ Chrome too big |
| 100K pages/day cost | $0 (your infra) | $$$ Chrome fleet or hosted API |
5–10× lower memory, 25× smaller binary, 10× faster cold start, 70× lower per-page token cost. That's the tradeoff this product makes — defer JS-rendering (Phase 4/5) and pixel rendering (out of scope) in exchange for a footprint that fits in places Chrome doesn't.
This isn't a Chrome wrapper that an agent uses through a Puppeteer-shaped abstraction. It's a browser whose every output is shaped for LLM consumption:
navigatereturns a BlockMap — ~500 tokens of structured page summary (landmarks, headings, interactives, density signals) right in the response. No follow-up call needed to know what's on the page.- Stable element refs (
e:142) — query, click, type, submit using opaque handles. The LLM never has to scrape the DOM itself. challengefield on every blocked navigate — provider, confidence, and the exact clearance cookie name. The agent reacts intelligently instead of guessing.density.likely_js_filledheuristic — distinguishes "real SSR page" from "SSR shell with JS-filled cells" (the CNBC trap). The agent bails before burning round-trips on a page it can't read.- MCP-native —
unbrowser --mcpexposes 12 tools to any MCP host (Claude Code, Claude Desktop, Cursor, Cline). 4 lines of config, zero glue code. - Real Chrome fingerprint (Chrome 131 JA4 + Akamai H2 hash) so sites don't block you for being a script.
For pages that do need real Chrome (heavy SPAs, JS-challenge bot walls), the binary detects them and accepts cookies via cookies_set — so you solve once in Chrome and replay forever here.
from unbrowser import Client
with Client() as ub:
ub.navigate("https://news.ycombinator.com")
for s in ub.query(".titleline > a")[:3]:
print(s["text"], s["attrs"]["href"])5 lines, no headless browser install. Output is structured JSON, not 35KB of HTML. The Client wrapper handles subprocess lifecycle (atexit reaper so orphans are impossible), JSON-RPC framing, and surfaces real exceptions instead of silent result lookups.
Bare-RPC version (if you can't use Python)
The same demo without the wrapper — useful for languages other than Python or one-shot bash calls. The protocol is JSON-RPC over stdin/stdout, one JSON object per line:
import subprocess, json
p = subprocess.Popen(["./target/release/unbrowser"],
stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True, bufsize=1)
i = 0
def call(method, **params):
global i; i += 1
p.stdin.write(json.dumps({"id": i, "method": method, "params": params}) + "\n")
p.stdin.flush()
return json.loads(p.stdout.readline())["result"]
call("navigate", url="https://news.ycombinator.com")
for s in call("query", selector=".titleline > a")[:3]:
print(s["text"], s["attrs"]["href"])That's the entire protocol surface. Same shape from any language with subprocess + JSON.
Empirical, not aspirational. Latest matrix: 28/30 on tested categories.
| Page tier | Coverage | What to expect |
|---|---|---|
| Static + SSR (Wikipedia, MDN, news, docs, GitHub repo browsing, search engines, archive.org) | ✅ excellent | sub-second navigate; full BlockMap; all selectors work; ~hundreds of tokens vs ~tens of KB raw |
| SSR + light hydration (Next.js docs, marketing pages, react.dev's static content) | ✅ usable | reads SSR'd content fine; hydration adds nothing but doesn't break either |
| Bot-walled with cookie handoff (Zillow, Cloudflare-protected sites) | ✅ via cookies_set |
solve once in Chrome, replay forever; challenge.provider field tells the agent which vendor |
| Module-loader SPAs (Ember, AMD apps like crates.io) | exec_scripts: true |
bundles fetch + execute, modules register, but framework auto-mount needs case-by-case shimming |
| Heavy React/Vue bundles (react.dev runtime, large dashboard apps) | with exec_scripts: true the navigate completes inside the 30s wall-clock budget (5s for the script-eval phase, the rest for settle); rendered DOM may not materialize. Tune via UNBROWSER_TIMEOUT_MS |
|
| Apps requiring Workers / Canvas / IndexedDB / WebGL | ❌ out of scope by design | use the cookie-handoff path with real Chrome via unchainedsky-cli (CDP) or the Unchained desktop app |
| Hardest-tier anti-bot (PerimeterX with behavioral, Kasada, Akamai BMP advanced) | ❌ even cookie handoff is fragile | real Chrome via CDP is the right tier |
Vs the alternatives:
| This | curl | Playwright / headless Chrome | |
|---|---|---|---|
| Static / SSR pages | ✅ | ✅ but token-heavy | overkill |
| SPA-shell sites | exec_scripts |
❌ | ✅ |
| Bot-walled (with cookie handoff) | ✅ | ❌ | ✅ |
| Run in Lambda / Workers / edge | ✅ | ✅ | ❌ Chrome too big |
| Per-page cost at 100K/day | ~free | ~free | $$$ |
| LLM-shaped output | ✅ BlockMap inline | DIY parse | DIY parse |
Concrete sites tested with measured times. Cold-start to extracted-result.
| Category | Sites | Time |
|---|---|---|
| Reference / docs | Wikipedia, MDN, docs.rs, PyPI, react.dev (SSR portion) | 0.9 – 5.8s |
| News | Hacker News, BBC, TechCrunch, ArXiv listings | 1 – 1.6s |
| Search | Google /search, Bing, Brave, DuckDuckGo (html) |
0.2 – 1.8s |
| Dev | GitHub repo pages, npm, StackOverflow, HuggingFace model cards | 0.7 – 2.4s |
| Crypto / finance | CoinGecko, Yahoo Finance (post-redirect-fix) | 3.5 – 6.9s |
| Social | Lobsters, old.reddit.com | 0.9 – 1.4s |
| Govt / institutional | arXiv, archive.org, gov.uk | 0.6 – 1.0s |
| Interaction primitives | type, click + auto-follow, cookies_set/get/replay, eval, query_text | 0.3 – 1.3s |
Surprises: all four major search engines work cleanly. CoinGecko's heavy dashboard SSRs enough that quotes come through. HuggingFace model cards expose model name in <h1>.
Every blocked navigate returns a challenge field naming the vendor (perimeterx_block, cloudflare_turnstile, aws_waf, datadome, akamai_bmp, imperva, arkose_labs, recaptcha, press_hold, yahoo_sad_panda, interstitial, generic_human_verification, unknown_block) plus the expected clearance cookie name. Agents react with cookie handoff via cookies_set instead of guessing.
Every navigate's blockmap.density field signals SPA-ness so agents bail before wasting round-trips:
thin_shell: true— page is < 4KB body text with no headings or interactives (typical React/Ember root)likely_js_filled: true—<table>shells exist but are empty (CNBC-class trap)json_scripts: N— count of<script type="application/json">(often holds the data the JS would render — tryeval()on those before escalating)
{"mcpServers":{"unchained":{"command":"unbrowser","args":["--mcp"]}}}12 tools auto-discovered by Claude Code, Claude Desktop, Cursor, Cline.
13 lines of Python (above). Or any language with subprocess + JSON.
from scripts.router import Router, RouterConfig, cached_cookies_solver
with Router(RouterConfig(
binary="./target/release/unbrowser",
chrome_solver=cached_cookies_solver("cookies.json"),
)) as r:
r.navigate("https://www.zillow.com/homes/for_rent/") # auto-handles 403 + cookie replayThe binary emits NDJSON events (ready, navigate, challenge) on stderr. Pipe them through watch.py for color-coded one-liners:
unbrowser 2> >(python3 scripts/watch.py)navigate {url} |
fetch + parse + return {status, url, bytes, headers, blockmap, challenge} |
query {selector} |
CSS query → [{ref, tag, attrs, text}] |
text {selector?} |
textContent of FIRST match (default body). On Wikipedia/MDN/news sites the first <p> is often a hatnote — prefer text_main for article body. |
text_main |
textContent of <main> / [role=main] / single <article> / longest non-chrome subtree. Use this for reading article/docs/blog content. |
click {ref} |
dispatch click; auto-follows <a href> (returns {status, url, bytes, headers, blockmap, challenge} — same shape as navigate) |
type {ref, text} |
set value + dispatch input/change events |
submit {ref} |
gather GET-form fields + navigate |
eval {code} |
run JS in embedded QuickJS |
cookies_set / cookies_get / cookies_clear |
session jar |
blockmap |
recompute the page summary |
body |
raw HTML of last navigation |
CSS selector engine: tag, id, class, [attr=val] (also ^=, $=, *=, ~=), all four combinators ( , >, +, ~), :first/last/nth-child/of-type, :only-child/of-type. Use eval for :not(), :has(), formulas.
This binary is the cheap path. For the cases it can't handle (heavy framework hydration, behavioral bot challenges, Workers/Canvas/IndexedDB), the next tier is a real Chrome instance driven via CDP. Two ways to get there:
| This binary | unchainedsky-cli |
Unchained desktop app | |
|---|---|---|---|
| Runs JS | QuickJS (no V8 JIT) | real Chrome via CDP | real Chrome (the user's, with their logins) |
| SPA hydration | partial | ✅ | ✅ |
| Bot challenges | cookie handoff only | active solving via real browser | manual / interactive |
| Setup | pip install pyunbrowser |
pip install unchainedsky-cli |
desktop install |
| Audience | agent / pipeline | agent / pipeline | end user |
| Per-page footprint | ~50MB | full Chrome | full Chrome |
The escalation path is a deliberate choice, not an automatic fallback — you ship pyunbrowser for the 80% of pages that work cheap, then route the 20% to unchainedsky-cli (or to a human via the desktop app). The vocabulary (navigate, query, click, cookies_set, BlockMap) is shared so code transfers cleanly.
- Script execution is opt-in via
exec_scripts: true. Default navigate skips it (the SSR/static path is what most agents want). With it on, inline + external<script>tags run in QuickJS — works for many SPAs, but heavy framework bootstraps (Ember, big React) often don't auto-mount because shims can't fake every browser-specific signal. The blockmap'sdensity.likely_js_filledflag tells agents in one call when to escalate instead of burning round-trips. - All eval is wall-clock bounded. A 30s watchdog (configurable via
UNBROWSER_TIMEOUT_MS, clamped to 1s..10min) covers script execution AND every subsequent settle/microtask/timer callback, so a hostile site can never wedge the binary or strand a CPU-pegged orphan process. - GET-only form submit. POST/multipart errors out — construct the request manually via
evalor escalate. - Hardest-tier bot detection (PerimeterX with behavioral telemetry, advanced Akamai BMP, Kasada) needs the cookie-handoff path. The binary detects and labels the challenge for you, but solving it requires real Chrome (or a token vendor).
- No screenshots. Out of scope by design.
Rust 1.95+ via rustup. On macOS, also brew install cmake ninja (BoringSSL dependency).
cargo build --release~2 min first build (BoringSSL compiles), instant after.
JSON-RPC stdin ─┐ ┌─ stdout
▼ ▲
┌────────────────────┐
│ rquest (Chrome131 │ ┌──────────┐ ┌──────────────────┐
│ TLS+H2 fingerprint)├──▶ html5ever ├───▶ rquickjs + │
│ │ │ parser │ │ dom.js + │
│ cookie_store │ └──────────┘ │ blockmap.js + │
│ (jar) │ │ interact.js │
└────────────────────┘ └──────────────────┘
Apache 2.0 — see LICENSE.
For the cases this binary can't handle (heavy framework hydration, behavioral bot challenges, anything needing real Chrome), the next tier is unchainedsky-cli — drives a real Chrome via CDP, same vocabulary. End-users who want a point-and-click agent can skip the CLI entirely and use the Unchained desktop app.