Press ⌘K on any webpage, describe what you want in plain English, and an AI agent generates, previews, and installs a content-script feature that changes the page to your liking — in under 10 seconds.
BOB is provider-agnostic (Anthropic, OpenAI, Google), bring-your-own-key, and runs entirely in your browser: the extension calls the provider you choose directly. No backend, no proxy, no telemetry. Built at LA Hacks 2026.
A Chrome MV3 extension. The service worker hosts a tool-using LLM agent (Reflexion-style retry loop) behind a unified provider contract; content scripts handle the ⌘K overlay, DOM pruning, behavior tracking, and feature execution in MAIN world.
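A unified provider contract of this kind might look like the sketch below. Every name in it (`LlmProvider`, `GenerateRequest`, `ProviderRegistry`) is hypothetical and chosen for illustration; the real interface lives in src/background/providers/types.ts and may differ.

```typescript
// Hypothetical sketch of a provider-agnostic contract; not BOB's actual types.
type ProviderName = 'anthropic' | 'openai' | 'google';

interface GenerateRequest {
  system: string;        // shared system prompt
  user: string;          // user request plus pruned DOM snapshot
  effortMode?: 'high';   // each client would translate this to its own reasoning setting
}

interface LlmProvider {
  readonly name: ProviderName;
  generate(req: GenerateRequest, apiKey: string): Promise<string>;
}

// The agent talks to the registry only, never to a concrete client.
class ProviderRegistry {
  private readonly providers: Partial<Record<ProviderName, LlmProvider>> = {};

  register(provider: LlmProvider): void {
    this.providers[provider.name] = provider;
  }

  pick(name: ProviderName): LlmProvider {
    const provider = this.providers[name];
    if (!provider) throw new Error(`No provider registered for "${name}"`);
    return provider;
  }
}
```

With a shape like this, switching providers in the options page only changes which entry `pick` returns; the agent loop itself stays the same.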
npm install
npm run build
Then load dist/ as an unpacked extension at chrome://extensions.
Open the BOB options page (right-click the extension icon → Options) and paste an API key for one of: Anthropic, OpenAI, or Google. Pick your provider on the same page. The extension uses that provider for all feature generation until you change it.
API key sources:
- Anthropic — https://console.anthropic.com/settings/keys
- OpenAI — https://platform.openai.com/api-keys
- Google — https://aistudio.google.com/apikey
npm run dev # Vite watch mode
npm run typecheck # tsc --noEmit
- User presses ⌘K on any page → shadow-DOM overlay opens.
- User types a request ("hide YouTube Shorts", "dim the sidebar").
- Content script captures a pruned ≤4 KB DOM snapshot of the current page (stable identifiers prioritized, autogenerated class noise stripped).
- Background service worker sends prompt + snapshot to the configured LLM, with a tool registry the model can call:
  - query_dom: run targeted CSS-selector probes against the live page when the snapshot isn't enough.
  - test_code: execute candidate JS in the user's tab and read back the DOM delta plus any thrown error before committing.
- LLM returns JSON: { code, name, description, urlPattern }.
- Overlay shows a syntax-highlighted preview with editable name and URL pattern. The user reviews and clicks Install (nothing is auto-applied).
- Feature is saved to chrome.storage.local and executed immediately via chrome.scripting.executeScript with world: 'MAIN', using a per-injection Trusted Types policy so it survives strict-CSP sites.
- On future visits to matching URLs, the content script asks the background worker to re-inject the saved code, including across SPA navigations (history-API patch) and DOM mutations (window.__bobObserve, a debounced slug-keyed MutationObserver helper installed in MAIN world).
- If the injected code throws, BOB feeds the runtime error and a fresh DOM snapshot back to the agent and asks it to fix the root cause, up to two retries (Reflexion-style).
- Popup lists installed features with run/error counts, expandable stack traces, toggles, and JSON import/export.
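The error-recovery step in the flow above (feed the runtime error and a fresh snapshot back to the agent, at most two retries) can be sketched as a small loop. This is a minimal illustration, not BOB's agent code: the `Generate`, `Run`, and `Snapshot` callback types and the `generateWithRetries` name are assumptions, injected so the loop can run without a browser or an LLM.

```typescript
// Hypothetical types: the real agent and injector signatures may differ.
interface Failure {
  code: string;      // the code that threw
  error: string;     // runtime error message or stack trace
  snapshot: string;  // fresh DOM snapshot taken after the failure
}

type Generate = (prompt: string, previous?: Failure) => Promise<{ code: string }>;
type Run = (code: string) => Promise<void>; // rejects if the injected code throws
type Snapshot = () => Promise<string>;

const MAX_RETRIES = 2; // "up to two retries"

async function generateWithRetries(
  prompt: string,
  generate: Generate,
  run: Run,
  snapshot: Snapshot,
): Promise<{ code: string; attempts: number }> {
  let failure: Failure | undefined;
  for (let attempt = 1; attempt <= 1 + MAX_RETRIES; attempt++) {
    // On a retry, the model sees the failing code, the error, and the new snapshot.
    const { code } = await generate(prompt, failure);
    try {
      await run(code);
      return { code, attempts: attempt };
    } catch (err) {
      failure = {
        code,
        error: err instanceof Error ? (err.stack ?? err.message) : String(err),
        snapshot: await snapshot(),
      };
    }
  }
  throw new Error(`Feature still failing after ${MAX_RETRIES} retries: ${failure?.error}`);
}
```

Passing the previous failure into `generate` is what makes the retry a fix attempt rather than a blind re-roll: the model gets both what broke and what the page looks like now.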
After install, the overlay flips into refine mode ("smaller", "apply more broadly", "undo this") with the prior code and a capped history of past turns in context, so refinement is a real conversation rather than a re-prompt.
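A capped turn history for refine mode could be kept with something like the following. The `Turn` shape, the cap of 5, and both function names are assumptions made for the example; the source only says that the history is capped and that the prior code is in context.

```typescript
// Hypothetical shapes and names; only the idea of a capped history comes from the README.
interface Turn {
  prompt: string;  // what the user asked for in this turn
  code: string;    // the code the agent produced for it
}

const MAX_TURNS = 5; // assumed cap

// Returns a new history with the turn appended, keeping only the newest `cap` turns.
function appendTurn(history: readonly Turn[], turn: Turn, cap: number = MAX_TURNS): Turn[] {
  return [...history, turn].slice(-Math.max(1, cap));
}

// Builds the refine-mode context: prior code first, then past requests oldest to newest.
function buildRefineContext(history: readonly Turn[], request: string): string {
  const latest = history[history.length - 1];
  const past = history.map((t, i) => `${i + 1}. ${t.prompt}`).join('\n');
  return [
    `Current code:\n${latest ? latest.code : '(none)'}`,
    `Previous requests:\n${past || '(none)'}`,
    `New request: ${request}`,
  ].join('\n\n');
}
```

Capping the history bounds prompt size while still letting a request such as "undo this" refer to the turn that came right before it.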
- ⌘K overlay: prompt → preview → install on any page.
- Three providers, one interface: Anthropic Claude, OpenAI GPT, Google Gemini / Gemma. Switching providers does not regress reasoning capability: a single effortMode: 'high' flag maps to Anthropic extended thinking, OpenAI reasoning_effort, and Gemini thinkingBudget / Gemma thinkingLevel.
- Tool-using agent: query_dom and test_code let the model verify against the real page instead of hallucinating selectors.
- Reflexion retry loop: runtime errors trigger an automatic fix-the-root-cause pass with the live DOM and stack trace.
- Idempotent generated code: every mutation is tagged data-bob='<slug>' and every observer is keyed by slug, so re-running on SPA navigation or toggle cycles doesn't compound side effects.
- Conversational refinement: refine installed features in the same overlay, with iteration history (parentFeatureId, iterationNumber).
- Behavior-driven suggestions: BOB watches click patterns (privacy-bounded: no input/contenteditable/password reads, sensitive hostnames skipped) and proactively proposes automations like "auto-dismiss this banner" after three manual dismissals. Three-state dismissal: Try / Later (3-day cooldown) / Never.
- Voice input: Web Speech API on the prompt textarea.
- Import / export: features serialize to portable JSON, so curated feature packs can be shared with teammates.
- Bring-your-own-key, on-device: keys and features live in chrome.storage.local. No server.
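The idempotency rule in the list above (tag every mutation with data-bob and skip what is already tagged) can be illustrated with a small helper. This is a sketch under assumptions: `applyOnce` and `TaggableElement` are hypothetical names, and the structural element type exists only so the example runs outside a browser. A real DOM `Element` satisfies it.

```typescript
// Minimal structural type; any DOM Element has these two methods.
interface TaggableElement {
  getAttribute(name: string): string | null;
  setAttribute(name: string, value: string): void;
}

const BOB_ATTR = 'data-bob';

// Applies `mutate` to each element at most once per slug.
// Returns how many elements were newly changed on this run.
function applyOnce<E extends TaggableElement>(
  elements: ArrayLike<E>,
  slug: string,
  mutate: (el: E) => void,
): number {
  let changed = 0;
  for (let i = 0; i < elements.length; i++) {
    const el = elements[i];
    if (el.getAttribute(BOB_ATTR) === slug) continue; // already handled by this feature
    mutate(el);
    el.setAttribute(BOB_ATTR, slug);
    changed++;
  }
  return changed;
}
```

In a page this might be called with the result of `document.querySelectorAll` and a slug such as 'dim-sidebar'; running it again after an SPA navigation only touches elements that appeared since the last run. A single attribute holds one slug, so two features editing the same element would need a richer scheme than this sketch.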
Screenshots: Refinement & Sandbox, Generated Chess Game, Stack Overflow Rewrite, Features & Suggestions, Settings & API Keys.
src/
background/
index.ts Service worker, message router,
MAIN-world injector, __bobObserve helper
llm.ts Provider-agnostic feature generation entry
agent.ts Tool-loop controller + output contract parsing
tools.ts query_dom, test_code tool dispatcher
suggestions-engine.ts Behavior → suggestion synthesis, dismissal
settings.ts API keys, provider, effort mode persistence
error-recorder.ts Per-feature run/error tracking
providers/
types.ts Provider interface
prompt.ts Shared system prompt + user prompt builder
anthropic.ts Anthropic Messages API client
openai.ts OpenAI Chat Completions client
google.ts Gemini / Gemma generateContent client
content/
index.ts Content script entry, runs on every page
injector.ts Dispatches feature execution to background
dom-prune.ts ≤4 KB DOM snapshot for LLM context
behavior-tracker.ts Privacy-bounded click pattern observer
spa.ts History-API patch for SPA navigation
lifecycle.ts Per-feature install/teardown plumbing
observer-helper.ts Shared mutation-observer utilities
page-badge.ts On-page status badge
quick-toggle.ts Per-feature on-page toggle UI
voice-input.ts Web Speech API for prompt entry
overlay/
overlay.ts ⌘K UI (shadow DOM, state machine)
overlay.css Overlay styles
preview.ts Diff/preview view rendering
popup/
popup.{html,ts,css} Installed-features panel
suggestions-section.* Proactive suggestion UI
iteration.ts Refinement-tree rendering
import-export.ts JSON feature pack import/export
options/
options.{html,ts,css} Provider + API key configuration
shared/
types.ts Locked type contracts
storage.ts chrome.storage CRUD with per-key locks
messages.ts Message-passing helper
hotkey.ts ⌘K binding
keybind-conflicts.ts Detect collisions with site shortcuts
ui-scale.ts Overlay scaling
~6,000 lines of TypeScript. Zero runtime dependencies.
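The tree describes shared/storage.ts as chrome.storage CRUD with per-key locks. One common way to get that behavior is to chain promises per key, as in the sketch below. `withKeyLock` is a hypothetical name and this is not the project's actual implementation.

```typescript
// Hypothetical helper: serializes async work per key by chaining promises.
// Entries are never evicted here; a long-lived version might clean up idle keys.
const lockTails = new Map<string, Promise<unknown>>();

function withKeyLock<T>(key: string, task: () => Promise<T>): Promise<T> {
  const previous: Promise<unknown> = lockTails.get(key) ?? Promise.resolve();
  // Run the task once the previous one settles, whether it succeeded or failed.
  const next = previous.then(task, task);
  // Store a non-rejecting tail so one failed task does not poison the queue.
  lockTails.set(key, next.catch(() => undefined));
  return next;
}
```

The lock matters for read-modify-write updates such as run counters: two overlapping "read, add one, write" sequences on the same key would otherwise both read the old value and one increment would be lost.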
Why bring-your-own-key?
We don't run a backend, and we're not paying your inference bills. Your API
key never leaves your machine — it sits in chrome.storage.local.
Why not just ask ChatGPT to write a userscript?
Because the agent verifies against your live DOM, retries on errors, and persists the result as an installable, idempotent feature. A chat tab can't do any of that.
Why a Chrome extension instead of a Chromium fork?
Scope. MV3 + content scripts + MAIN-world injection is enough surface area to do most of what a "real" browser modification needs.
- Sites with strict Trusted Types policies (some Google properties) may still reject injected scripts despite the per-injection policy fallback. Demo on sites that work; don't fight individual sites' CSP.
- Generated code runs in the page's MAIN world and is subject to the page's CSP. Most sites work; a few do not.
- No cloud sync. Features and settings live in chrome.storage.local on the device they were created on. Use Import / Export in the popup to move feature packs between machines.
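Moving features between machines comes down to serializing them to JSON and validating what comes back in. The sketch below assumes a pack format: the four feature fields match the agent's output contract ({ code, name, description, urlPattern }), while the `FeaturePack` wrapper, its `version` field, and the function names are hypothetical.

```typescript
// Feature fields follow the agent's output contract; the wrapper is an assumption.
interface PortableFeature {
  name: string;
  description: string;
  urlPattern: string;
  code: string;
}

interface FeaturePack {
  version: 1; // hypothetical format version
  features: PortableFeature[];
}

function exportPack(features: PortableFeature[]): string {
  const pack: FeaturePack = { version: 1, features };
  return JSON.stringify(pack, null, 2);
}

function isPortableFeature(value: unknown): value is PortableFeature {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  return ['name', 'description', 'urlPattern', 'code'].every((k) => typeof v[k] === 'string');
}

// Parses untrusted JSON and rejects anything that is not a well-formed pack.
function importPack(json: string): PortableFeature[] {
  const parsed: unknown = JSON.parse(json);
  if (typeof parsed !== 'object' || parsed === null) throw new Error('Pack must be a JSON object');
  const pack = parsed as Record<string, unknown>;
  if (pack.version !== 1) throw new Error('Unsupported pack version');
  if (!Array.isArray(pack.features) || !pack.features.every(isPortableFeature)) {
    throw new Error('Pack contains malformed features');
  }
  return pack.features as PortableFeature[];
}
```

Validating on import is worth the few extra lines because an imported pack contains code that will later run in MAIN world; a shape check does not make that code safe, but it keeps obviously broken or truncated files out of storage.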
TypeScript · Vite · @crxjs/vite-plugin · Chrome MV3 service worker ·
shadow-DOM overlay · Anthropic / OpenAI / Google APIs · Web Speech API ·
chrome.scripting.executeScript (MAIN world) · Trusted Types ·
chrome.storage.local. Zero runtime dependencies.
MIT — see LICENSE.





