An ambient, generative action layer that sits on top of every website you visit.
Hover. Pause. Act. The interface assembles itself around your attention.
Demo · Why Now · Architecture · Quickstart · Roadmap
The modern web is a collection of beautiful, complicated apps — LinkedIn, Gmail, Figma, Yahoo Finance, Capital IQ, Notion, GitHub, Amazon. Each one buries the action you actually want behind three clicks, two menus, and a sidebar you have to scroll. Every product team is locally optimizing their own UI, and the user pays the tax of learning all of them.
Beckon is a browser layer that watches where your attention goes. When you pause on something — a name, a number, a component, a row in a table — Beckon surfaces the most useful actions for that thing on that site directly under your cursor.
No tab-switching. No menu hunting. No prompt to write. The interface assembles itself around your attention, in the context of whatever app you're already in.
It's not a chatbot. It's not a sidebar. It's a thin generative UI layer that turns every existing web app into something closer to what it should have been.
The same pattern — hover, pause, act — plays out completely differently depending on which site you're on. Beckon understands the context.
| Site | What a hover surfaces |
|---|---|
| LinkedIn | Hover on a person → mutual connections, hiring status, last 3 posts in one line, "draft an intro" grounded in their activity. Hover on a company → headcount trend, recent funding, who in your network just joined or left. |
| Gmail | Hover on a sender → last 3 threads summarized, whether you owe them a reply, and "draft a response in my voice" grounded in your past correspondence with that exact person. |
| Yahoo Finance | Hover on a ticker anywhere → a live mini-chart fans out around your cursor with 1Y price, P/E, next earnings, and a "compare to sector" button. Hover on a CEO → tenure, recent insider trades, last earnings call sentiment. |
| Capital IQ | Hover on a company → TTM revenue, EBITDA, comps set, latest filings, and a one-click "export to model". The five tabs of analyst grunt work, collapsed into one hover. |
| Figma | Hover on any frame → design tokens, "copy as Tailwind", contrast checker, and "generate three variants of this". The inspector you wish Figma had — without leaving the canvas. |
| GitHub | Hover on a repo → star velocity, last commit recency, who in your network contributes. Hover on a function name → the actual definition, recent changes, the most relevant issue — without leaving the file view. |
| Amazon | Hover on a product → real price history (was it actually on sale?), one-line summary of the 1-star reviews, and a "find this cheaper elsewhere" action. Five tabs of shopping research, one hover. |
| Notion | Hover on any block → summarize the children, find related pages, rewrite in a different tone, extract action items. Ambient Cmd+K, but it finds you. |
In every one of these cases, Beckon is reading two things at once: what site you're on (and therefore what kind of object you're hovering on) and what that specific object means in that specific app. A "name" on LinkedIn is a person; a "name" in a Figma layer panel is a component; a "name" in a GitHub diff is a function. The action set is completely different, and Beckon generates the right one for each.
Two things had to be true for Beckon to be possible.
1. LLMs got fast and cheap enough. You can now ask a model to look at a chunk of DOM and tell you what every piece means — not in milliseconds, but inside the budget of a page load. Three years ago this cost a dollar per page. Today it costs a fraction of a cent with Claude Haiku 4.5.
2. The web stopped being documents and became apps — but the interaction model never caught up. We still navigate apps the way we navigated documents in 1995: by scanning a page for the link that matches our intent. Beckon is the interaction model that finally fits what the web actually is now.
The result is a single browser extension that makes every web app you already use feel 10× more direct, without any of those apps having to ship a single update.
The hard problem is latency. Hover-then-wait-1-second is a broken experience — the user has already moved on. Our trick: all LLM calls happen at page-load time, none at hover time.
```
┌───────────────────────── BOOT PHASE (once per page, ~1s) ──────────────────────────┐
│                                                                                    │
│  Page load ──▶ Extract DOM ────▶ Claude Haiku ──▶ Cache                            │
│  DOM ready     Find clickables   Tag by intent    In-memory map<element, actions>  │
│                                                                                    │
└────────────────────────────────────────────────────────────────────────────────────┘
                                          │
                                          ▼
┌──────────────────────── RUNTIME PHASE (every hover, <16ms) ────────────────────────┐
│                                                                                    │
│  Mouse dwell ──▶ Lookup ──────▶ Radial menu ────▶ Execute                          │
│  600ms hover     Match cache    Animate buttons   Preview or embed                 │
│                                                                                    │
│  ⚡ Zero LLM calls. Pure local. 60fps guaranteed.                                   │
│                                                                                    │
└────────────────────────────────────────────────────────────────────────────────────┘
                                          │
                                          ▼
                            User clicks a *dynamic* action
                                          │
                                          ▼
                   ┌────── Claude Sonnet streams the answer ──────┐
                   │  "Summarize this post" / "Draft an intro" /  │
                   │  "Pull last 4 quarters of revenue" / etc.    │
                   └──────────────────────────────────────────────┘
```
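The runtime side of the diagram hinges on one primitive: a 600ms dwell timer that treats cursor rest as intent. A minimal sketch of that check, reduced to pure timestamp logic so it can run anywhere (function and type names here are ours, not Beckon's):

```typescript
// Dwell detection as pure logic over timestamps (illustrative sketch).
const DWELL_MS = 600;

type MouseState = { x: number; y: number; lastMoveAt: number };

// Any movement resets the dwell clock.
function onMove(state: MouseState, x: number, y: number, now: number): MouseState {
  return { x, y, lastMoveAt: now };
}

// The radial menu fires only once the cursor has rested for DWELL_MS.
function shouldFire(state: MouseState, now: number): boolean {
  return now - state.lastMoveAt >= DWELL_MS;
}
```

Wiring this up in the content script just means calling `onMove` from a `mousemove` listener and scheduling a `shouldFire` check 600ms out.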
Two-tier action model. Static actions (open link, copy, translate) run purely locally — zero network. Dynamic actions (summaries, drafts, charts, comps) hit the LLM only after the user has clicked — by which point intent is unambiguous and a 1-second wait is acceptable.
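As a sketch, the static/dynamic split can be modeled as a discriminated union; the type and field names below are illustrative assumptions, not Beckon's actual API:

```typescript
// Two-tier action model (illustrative types; names are assumptions).
type StaticAction = {
  kind: "static";
  label: string;
  run: () => void; // pure local: open link, copy, translate
};

type DynamicAction = {
  kind: "dynamic";
  label: string;
  prompt: string; // sent to the LLM only after the user clicks
};

type Action = StaticAction | DynamicAction;

// Only dynamic actions ever touch the network, and only on click.
function needsNetwork(action: Action): boolean {
  return action.kind === "dynamic";
}
```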
Site-aware classification. A single shared classify endpoint takes the harvested DOM elements + the URL and routes them through site-specific prompts. LinkedIn knows "person/company/post"; Yahoo knows "ticker/CEO/sector"; GitHub knows "repo/file/function". Adding a new site is one prompt template + one selector list.
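A hedged sketch of that routing table (the regexes and prompt strings are invented placeholders; only the three ontologies come from the text above):

```typescript
// URL → site-specific classification prompt (illustrative entries only).
const SITE_PROMPTS: Array<{ match: RegExp; prompt: string }> = [
  { match: /linkedin\.com/, prompt: "Tag each element as person, company, post, or job." },
  { match: /finance\.yahoo\.com/, prompt: "Tag each element as ticker, CEO, or sector." },
  { match: /github\.com/, prompt: "Tag each element as repo, file, or function." },
];

// Pick the prompt template for the current page, if the site is supported.
function promptFor(url: string): string | undefined {
  return SITE_PROMPTS.find((site) => site.match.test(url))?.prompt;
}
```

Adding a new site is then literally one more entry in the list, plus its selector list.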
Generative UI as first-class output. The dynamic action layer doesn't return text — it returns rendered components (mini-charts, comp tables, draft cards). Claude Sonnet decides not just what to say but what shape the answer should take for that object on that site.
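One way to picture "components, not text" is a tagged result type that the click handler renders; the shapes below are our illustration, not the real schema:

```typescript
// Dynamic actions return a typed component, not a paragraph (shapes invented).
type GenUI =
  | { type: "miniChart"; points: number[] }
  | { type: "compTable"; rows: string[][] }
  | { type: "draftCard"; text: string };

// A renderer can pick a distinct visual per shape instead of dumping prose.
function describe(ui: GenUI): string {
  switch (ui.type) {
    case "miniChart": return `chart with ${ui.points.length} points`;
    case "compTable": return `table with ${ui.rows.length} rows`;
    case "draftCard": return `draft: ${ui.text}`;
  }
}
```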
| Layer | Choice | Why |
|---|---|---|
| Extension framework | WXT + Vite | Fastest MV3 dev loop on the planet. HMR for content scripts. |
| UI rendering | Shadow DOM + vanilla TS | Total style isolation from the host page. Zero React overhead in content script. |
| Classification model | Claude Haiku 4.5 | Cheap enough to run on every page load. ~200ms p50. |
| Generation model | Claude Sonnet 4.6 | Streamed into preview cards on click. |
| Backend proxy | Hono on Node | Tiny, fast, edge-ready. Hides API keys, batches requests. |
| Cache | In-memory `Map<stableKey, ActionSet>` | Zero-latency runtime lookups. LRU evicted at 500 entries. |
| Dwell detection | Custom mousemove + 600ms timer | Treated as an intent proxy — not eye tracking, but close enough. |
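The cache row above fits in a few lines: a plain `Map` doubles as an LRU because it iterates in insertion order. Beyond the 500-entry cap stated in the table, the names and exact eviction policy here are our assumptions:

```typescript
// Boot-phase cache: Map-as-LRU capped at 500 entries (sketch).
type ActionSet = { actions: string[] };

const MAX_ENTRIES = 500;
const cache = new Map<string, ActionSet>();

function put(key: string, value: ActionSet): void {
  if (cache.has(key)) cache.delete(key); // re-insert to refresh recency
  cache.set(key, value);
  if (cache.size > MAX_ENTRIES) {
    // Map iterates in insertion order, so the first key is the least recent.
    const oldest = cache.keys().next().value;
    if (oldest !== undefined) cache.delete(oldest);
  }
}

function lookup(key: string): ActionSet | undefined {
  const hit = cache.get(key);
  if (hit !== undefined) {
    cache.delete(key); // bump to most-recent position
    cache.set(key, hit);
  }
  return hit;
}
```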
- Node.js 18+
- An Anthropic API key
- Chrome / Brave / Arc (any Chromium browser)
```bash
git clone https://github.com/your-org/beckon.git
cd beckon
cd server
cp .env.example .env
# Add your ANTHROPIC_API_KEY to .env
npm install
npm run dev
```

The server boots on `http://localhost:3456` with two endpoints:
| Endpoint | Model | Purpose |
|---|---|---|
| `POST /api/classify` | Haiku 4.5 | Boot-phase batch tagging of DOM elements |
| `POST /api/generate` | Sonnet 4.6 | Runtime generation for clicked dynamic actions |
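For reference, a boot-phase request to `/api/classify` might be assembled like this; the field names and truncation limit are assumptions for illustration, not the actual wire format:

```typescript
// Build a classify request body from harvested DOM elements (field names assumed).
type HarvestedElement = { id: string; tag: string; text: string };

function buildClassifyPayload(url: string, elements: HarvestedElement[]) {
  return {
    url, // lets the server route to the site-specific prompt
    elements: elements.map((el) => ({
      id: el.id,
      tag: el.tag,
      text: el.text.slice(0, 200), // keep the batch small for the boot-time budget
    })),
  };
}
```

POSTing that object as JSON to `http://localhost:3456/api/classify` is the shape of call the extension makes once per page load.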
```bash
cd ../extension
npm install
npm run build
```

- Open `chrome://extensions`
- Toggle Developer mode (top-right)
- Click Load unpacked
- Select `extension/.output/chrome-mv3` (can't see `.output` in Finder? press `Cmd + Shift + .`)

Open LinkedIn, Gmail, or Yahoo Finance. Hover on a name, a ticker, a sender. Pause for 0.6 seconds. Watch the UI come to you.

Press `Esc` to dismiss. Hover somewhere else to summon a new menu.
| Site | Hover targets | Status |
|---|---|---|
| LinkedIn | Person · Company · Post · Job | ✅ Live |
| Gmail | Sender · Thread · Attachment | ✅ Live |
| Yahoo Finance | Ticker · CEO · Sector | ✅ Live |
| Capital IQ | Company · Filing · Comp set | ✅ Live |
| GitHub | Repo · File · Function name | ✅ Live |
| Amazon | Product · Seller · Review block | ✅ Live |
| Notion / Docs | Block · Heading · Database row | ✅ Live |
Adding a new site = one selector list + one prompt template. The runtime is fully site-agnostic.
- Boot/runtime split architecture
- Shadow-DOM radial menu with auto-flip & spread
- Two-tier static/dynamic action model
- Hono proxy with Haiku + Sonnet routing
- LinkedIn end-to-end (person · company · post · job)
- Generative UI components — mini-charts, comp tables, draft cards as first-class outputs
- Gmail, Yahoo Finance, Capital IQ end-to-end
- Local-first fallback (Gemini Nano / WebLLM) for sensitive pages
- Per-site privacy allowlist + sensitive-field detection
- User memory: actions learn from what you actually click
- Focus mode v2: radial dim that follows your cursor
- The cursor is the UI. Anything that takes the user's eye away from where they're already looking has lost.
- Latency is the product. A perfect feature behind a 2-second wait is worse than a mediocre feature with zero wait.
- Generative ≠ chat. The output of an AI action should be a component, not a paragraph.
- Every site is its own ontology. Don't pretend the web is uniform. Embrace site-specific intelligence.
- The web app stays untouched. Beckon is a layer, not a fork. Your LinkedIn is still LinkedIn.