A small, opinionated browser agent that lives in your Chrome side panel.
Think of it as a free, hackable cousin of Perplexity Comet: a chat that can actually see the page you're on, answer questions about it, and — when you let it — drive the browser for you. Click things. Fill forms. Read across tabs. Pull structured data out of messy HTML.
It's also stubbornly cheap to run. Crawler doesn't pin you to one provider. You drop in a handful of free API keys (NVIDIA, OpenRouter, Groq, Gemini, Cerebras, Mistral, or your own Ollama box) and it juggles them for you. When one rate-limits, it slides to the next. You don't notice. You don't pay.
No build step, no npm install, no bundler. Vanilla JS + ES modules. The whole thing is ~60 KB of code you can read in an afternoon.
- Open
chrome://extensions. - Flip Developer mode on (top-right corner).
- Click Load unpacked → pick this
crawler/folder. - Pin the puzzle-piece icon so it's one click away.
- Hit Ctrl+Shift+U (or
Cmd+Shift+Uon Mac) to pop the side panel. - Click the gear, paste an API key, you're done.
The toolbar icon is Chrome's default puzzle piece — we don't ship custom art. If you want branding, drop 16/48/128 px PNGs into
icons/and add aniconsblock tomanifest.json.
Most "use any AI" tools make you pick a provider in a dropdown. Crawler
doesn't. It builds one giant rotation list out of every
(provider × API key × model) you've configured, and walks it on every
request.
Here's the life of one message:
- Planner asks the pool: "give me a candidate."
- We grab the next one that isn't sitting in cooldown.
- We hit its endpoint, with its key, asking for its model.
- Worked? Great — return the answer, remember the winner.
- HTTP 429? Park that candidate for the duration of
Retry-After(defaults to 60s) and move on. - HTTP 401 / 402? That key is dead or out of credits — bench it for 24 hours so we stop pestering it.
- HTTP 5xx? 15–30s timeout. Try the next one.
Because every provider is in the same list, failover is cross-provider, not just cross-key. NVIDIA hits its quota → next call lands on OpenRouter free → that 429s → over to Groq → and you're none the wiser. The side panel quietly logs each rotation so you can see what's happening:
⟳ openrouter/deepseek-v3:free key #2 → HTTP 429 (cooldown 30s). Rotating…
Crank the gear icon open. There's one card per provider — flip on the ones you have keys for, paste the keys (one per line), tick the models you want in rotation. Use free-tier models only is on by default, so you have to opt in to spend money.
| Provider | Free? | Where to get a key |
|---|---|---|
| NVIDIA NIM | Yes — 1000 credits/key/month | build.nvidia.com |
| OpenRouter | Yes — :free-tagged models |
openrouter.ai/keys |
| Groq | Yes — generous, ridiculously fast | console.groq.com/keys |
| Cerebras | Yes — fastest tokens/sec on the planet | cloud.cerebras.ai |
| Google Gemini | Yes — Flash models are free | aistudio.google.com/apikey |
| Mistral | Yes — the small models | console.mistral.ai/api-keys |
| Ollama | Yes — local, no key needed | localhost only |
| OpenAI | No | platform.openai.com |
| Anthropic | No | console.anthropic.com |
| xAI / Custom | Varies | per provider |
If you just want this to work, set up:
- NVIDIA — 2 or 3 keys
- OpenRouter — 2 or 3 keys (only
:freemodels ticked) - Groq — 1 or 2 keys
That's roughly 30+ candidates in rotation. Hit Test Pool in Settings —
it'll send a pong ping and tell you which candidate answered.
⚠️ What we deliberately won't ship: GitHub Copilot login and ChatGPT Plus / Pro login. Both depend on private internal APIs that violate their providers' terms — using them tends to get accounts banned. If you have ChatGPT Plus and want OpenAI models here, use aplatform.openai.comAPI key (it's billed separately).
Assistant mode — you ask, it answers. The current page's text gets pulled in as context (sanitised first), and the model replies. Good for "summarise this," "what's this article actually saying," "extract the prices into JSON."
Agent mode — you set a goal, it acts. The model has to call
present_plan first (a one-line summary plus numbered steps) so you can
see what it intends to do before it touches anything. Approve the plan,
it proceeds. Reject it, it stops and says "ok."
Under the hood it's the same ReAct loop. The only difference is which tools are exposed.
A small, deliberately lean set. Snapshots are how the model "sees" the page — it gets back a numbered list of every interactive element, then operates them by index, not by selector.
present_plan— show the plan, wait for user approvalsnapshot— numbered list of links / buttons / inputs on the active tabact({ index, action })— click / type / select / hover that elementnavigate,open_tab,list_tabs— tab controlread_page,extract,click,type_text,scroll,wait_for— the classic selector-based toolset (used in non-lean mode)
Adding a new tool is one append in lib/tools.js: a JSON schema and a handler. The model picks it up on the next run.
Drop these into the side panel to get a feel for it:
- "Summarise this page in 5 bullets."
- "Extract every product name and price into a JSON array."
- "Click the 'Sign in' link and tell me what fields are on the next page."
- "Search Hacker News for 'agentic browsers' and list the top 5 results with links."
- "Open the first three results in new tabs and tell me which one mentions pricing."
manifest.json MV3 manifest — sidePanel + content scripts + Ctrl+Shift+U
background.js service worker; port routing + agent loop driver
content.js isolated-world DOM driver (click / type / scroll / extract)
sidepanel.{html,css,js} chat UI; long-lived port to the worker
options.{html,js} provider + key + safety settings
lib/llm.js OpenAI + Anthropic streaming clients (SSE parsed by hand)
lib/pool.js unified candidate rotation + cooldown bookkeeping
lib/presets.js provider definitions (endpoint, models, free vs paid)
lib/planner.js ReAct loop: plan → call tool → observe → repeat
lib/tools.js tool registry exposed to the LLM
lib/injectionFilter.js prompt-injection heuristics + page sanitiser
lib/storage.js chrome.storage wrappers (Settings, History)
The side panel opens a long-lived chrome.runtime.connect port to the
service worker. Each user message becomes a run event; the planner
streams delta, action, observation, confirm, final, done, and
error events back over the same port.
The content script is auto-injected by the manifest and re-injected
on demand from lib/tools.js — that second path covers reload races and
activeTab permission edge cases.
This thing can click buttons on your behalf. A few guardrails:
- Confirm before clicks is on by default. The agent has to ask before every click / type / navigation. Toggle off in Settings if you trust it for a particular run.
- Plan approval — agent mode forces a
present_plancall up front. No plan = no browser control. - Prompt-injection scanner — heuristic regex scan in
lib/injectionFilter.js flags page text that's
trying to override the system prompt or pull out secrets, strips
<script>blocks, and wraps page content in anUNTRUSTED_PAGE_CONTENTblock before forwarding it to the model. - No password / 2FA autofill — ever. The system prompt explicitly tells the agent to ask the user for those instead of typing them.
- Local-first storage. API keys live in
chrome.storage.local, sent to whichever provider you configured and nowhere else. Clear the lot from the side panel any time.
It's small enough that "extending" mostly means "edit the file."
- New tool → append to
TOOLSin lib/tools.js. JSON schema + a handler function. The model gets it on the next run. - New connector (Gmail, Calendar, Notion…) → add
identitytomanifest.json, drop alib/connectors/<name>.jsthat useschrome.identity.launchWebAuthFlow, expose its functions as tools. - Local model → drop a WASM runtime (e.g. MLC Web-LLM) into
lib/and branch onsettings.strictLocalinsidelib/llm.js. - New provider → one entry in lib/presets.js is usually enough, since most providers speak OpenAI-compatible JSON.
- No icons shipped — Chrome falls back to the default puzzle piece.
- The injection filter is heuristic, not a trained classifier. For production, swap in a small ONNX/WASM model.
extractandclick-by-text don't traverse iframes or Shadow DOM yet (all_frames: falsein the manifest, deliberately, to keep behaviour predictable).- OAuth connectors are scaffolding only — no Gmail / Calendar / Notion out of the box. See Extending if you want to wire one up.
0.4.0 — pre-1.0, expect breaking changes between minor versions while
the tool surface settles.