yanzzzk/Beckon


BECKON

Don't go find the UI. Let it find you.

An ambient, generative action layer that sits on top of every website you visit.
Hover. Pause. Act. The interface assembles itself around your attention.


🎯 Supported Platforms

LinkedIn · Gmail · Capital IQ · 📊 Yahoo Finance · 🐙 GitHub · 🛒 Amazon · 📝 Notion / Docs

Built with WXT · Powered by Claude · Chrome MV3 · Hackathon


Demo · Why Now · Architecture · Quickstart · Roadmap



💡 The Idea

The modern web is a collection of beautiful, complicated apps — LinkedIn, Gmail, Figma, Yahoo Finance, Capital IQ, Notion, GitHub, Amazon. Each one buries the action you actually want behind three clicks, two menus, and a sidebar you have to scroll. Every product team is locally optimizing their own UI, and the user pays the tax of learning all of them.

Beckon is a browser layer that watches where your attention goes. When you pause on something — a name, a number, a component, a row in a table — Beckon surfaces the most useful actions for that thing on that site directly under your cursor.

No tab-switching. No menu hunting. No prompt to write. The interface assembles itself around your attention, in the context of whatever app you're already in.

It's not a chatbot. It's not a sidebar. It's a thin generative UI layer that turns every existing web app into something closer to what it should have been.


✨ What It Looks Like

The same pattern — hover, pause, act — plays out completely differently depending on which site you're on. Beckon understands the context.

💼 LinkedIn

Hover on a person → mutual connections, hiring status, last 3 posts in one line, "draft an intro" grounded in their activity. Hover on a company → headcount trend, recent funding, who in your network just joined or left.

📧 Gmail

Hover on a sender → last 3 threads summarized, whether you owe them a reply, and "draft a response in my voice" grounded in your past correspondence with that exact person.

📈 Yahoo Finance

Hover on a ticker anywhere → a live mini-chart fans out around your cursor with 1Y price, P/E, next earnings, and a "compare to sector" button. Hover on a CEO → tenure, recent insider trades, last earnings call sentiment.

🏛️ Capital IQ

Hover on a company → TTM revenue, EBITDA, comps set, latest filings, and a one-click "export to model". The five tabs of analyst grunt work, collapsed into one hover.

🎨 Figma

Hover on any frame → design tokens, "copy as Tailwind", contrast checker, and "generate three variants of this". The inspector you wish Figma had — without leaving the canvas.

🐙 GitHub

Hover on a repo → star velocity, last commit recency, who in your network contributes. Hover on a function name → the actual definition, recent changes, the most relevant issue — without leaving the file view.

🛒 Amazon

Hover on a product → real price history (was it actually on sale?), one-line summary of the 1-star reviews, and a "find this cheaper elsewhere" action. Five tabs of shopping research, one hover.

📝 Notion / Google Docs

Hover on any block → summarize the children, find related pages, rewrite in a different tone, extract action items. Ambient Cmd+K, but it finds you.

In every one of these cases, Beckon is reading two things at once: what site you're on (and therefore what kind of object you're hovering on) and what that specific object means in that specific app. A "name" on LinkedIn is a person; a "name" in a Figma layer panel is a component; a "name" in a GitHub diff is a function. The action set is completely different, and Beckon generates the right one for each.


🔥 Why This Is Hard, and Why Now

Two things had to be true for Beckon to be possible.

1. LLMs got fast and cheap enough. You can now ask a model to look at a chunk of DOM and tell you what every piece means — not in milliseconds, but inside the budget of a page load. Three years ago this cost a dollar per page. Today it costs a fraction of a cent with Claude Haiku 4.5.

2. The web stopped being documents and became apps — but the interaction model never caught up. We still navigate apps the way we navigated documents in 1995: by scanning a page for the link that matches our intent. Beckon is the interaction model that finally fits what the web actually is now.

The result is a single browser extension that makes every web app you already use feel 10× more direct, without any of those apps having to ship a single update.


🏗️ Architecture

The hard problem is latency. Hover-then-wait-1-second is a broken experience — the user has already moved on. Our trick: all LLM calls happen at page-load time, none at hover time.

┌─────────────────────────  BOOT PHASE  (once per page, ~1s)  ─────────────────────────┐
│                                                                                       │
│   Page load ──▶ Extract DOM ──▶ Claude Haiku ──▶ Cache                               │
│    DOM ready    Find clickables   Tag by intent    In-memory map<element, actions>   │
│                                                                                       │
└───────────────────────────────────────────────────────────────────────────────────────┘
                                            │
                                            ▼
┌─────────────────────────  RUNTIME PHASE  (every hover, <16ms)  ──────────────────────┐
│                                                                                       │
│   Mouse dwell ──▶ Lookup ──▶ Radial menu ──▶ Execute                                 │
│    600ms hover    Match cache  Animate buttons   Preview or embed                    │
│                                                                                       │
│   ⚡ Zero LLM calls. Pure local. 60fps guaranteed.                                    │
│                                                                                       │
└───────────────────────────────────────────────────────────────────────────────────────┘
                                            │
                                            ▼
                          User clicks a *dynamic* action
                                            │
                                            ▼
                ┌──────  Claude Sonnet streams the answer  ──────┐
                │   "Summarize this post" / "Draft an intro" /    │
                │   "Pull last 4 quarters of revenue" / etc.      │
                └─────────────────────────────────────────────────┘

Three design decisions that make this work

Two-tier action model. Static actions (open link, copy, translate) run purely locally — zero network. Dynamic actions (summaries, drafts, charts, comps) hit the LLM only after the user has clicked — by which point intent is unambiguous and a 1-second wait is acceptable.

Site-aware classification. A single shared classify endpoint takes the harvested DOM elements + the URL and routes them through site-specific prompts. LinkedIn knows "person/company/post"; Yahoo knows "ticker/CEO/sector"; GitHub knows "repo/file/function". Adding a new site is one prompt template + one selector list.

Generative UI as first-class output. The dynamic action layer doesn't return text — it returns rendered components (mini-charts, comp tables, draft cards). Claude Sonnet decides not just what to say but what shape the answer should take for that object on that site.
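One plausible way to make components first-class is a small render-spec union the content script knows how to draw. This sketch (shapes and field names assumed, not taken from the repo) parses a structured model response and falls back to a plain draft card when the output isn't valid JSON:

```typescript
// Hypothetical render spec: the model's structured output names a
// component shape, and the extension renders it into the shadow DOM.
type RenderSpec =
  | { component: "mini-chart"; points: number[]; label: string }
  | { component: "comp-table"; headers: string[]; rows: string[][] }
  | { component: "draft-card"; title: string; body: string };

// Parse a model response, degrading gracefully to a text card.
function parseSpec(raw: string): RenderSpec {
  try {
    const parsed = JSON.parse(raw);
    if (
      parsed &&
      ["mini-chart", "comp-table", "draft-card"].includes(parsed.component)
    ) {
      return parsed as RenderSpec;
    }
  } catch {
    // not JSON: fall through to the text fallback below
  }
  return { component: "draft-card", title: "Answer", body: raw };
}
```

The fallback matters for streamed output: until the stream completes and parses cleanly, the UI can show the partial text in a draft card.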


⚙️ Tech Stack

| Layer | Choice | Why |
| --- | --- | --- |
| Extension framework | WXT + Vite | Fastest MV3 dev loop on the planet. HMR for content scripts. |
| UI rendering | Shadow DOM + vanilla TS | Total style isolation from the host page. Zero React overhead in the content script. |
| Classification model | Claude Haiku 4.5 | Cheap enough to run on every page load. ~200ms p50. |
| Generation model | Claude Sonnet 4.6 | Streamed into preview cards on click. |
| Backend proxy | Hono on Node | Tiny, fast, edge-ready. Hides API keys, batches requests. |
| Cache | In-memory `Map<stableKey, ActionSet>` | Zero-latency runtime lookups. LRU-evicted at 500 entries. |
| Dwell detection | Custom mousemove + 600ms timer | Treated as an intent proxy: not eye tracking, but close enough. |
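The in-memory cache with LRU eviction at 500 entries can be sketched in a few lines — a `Map` preserves insertion order, so the first key in iteration order is the least recently used (class name is illustrative):

```typescript
// Minimal LRU cache for boot-phase classification results.
// Map preserves insertion order, so the first key is the least recent.
class LruCache<K, V> {
  private map = new Map<K, V>();
  constructor(private capacity = 500) {}

  get(key: K): V | undefined {
    const value = this.map.get(key);
    if (value !== undefined) {
      // Re-insert to mark this key as most recently used.
      this.map.delete(key);
      this.map.set(key, value);
    }
    return value;
  }

  set(key: K, value: V): void {
    this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.capacity) {
      // Evict the least recently used entry (first in iteration order).
      this.map.delete(this.map.keys().next().value as K);
    }
  }
}
```

Every `get` is O(1), which is what keeps the hover path inside the 16ms frame budget.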

🚀 Quickstart

Prerequisites

  • Node.js and npm
  • Chrome (the extension targets MV3)
  • An Anthropic API key

1. Clone & install

```sh
git clone https://github.com/yanzzzk/Beckon.git
cd Beckon
```

2. Start the backend

```sh
cd server
cp .env.example .env
# Add your ANTHROPIC_API_KEY to .env
npm install
npm run dev
```

Server boots on http://localhost:3456 with two endpoints:

| Endpoint | Model | Purpose |
| --- | --- | --- |
| `POST /api/classify` | Haiku 4.5 | Boot-phase batch tagging of DOM elements |
| `POST /api/generate` | Sonnet 4.6 | Runtime generation for clicked dynamic actions |
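For orientation, here are hypothetical request/response shapes for the classify endpoint, plus a runtime guard the content script could use to reject malformed responses. Field names are assumptions; the actual contract lives in the server code:

```typescript
// Hypothetical payload shapes for POST /api/classify (illustrative only).
type ClassifyRequest = {
  url: string;
  elements: { id: string; text: string; tag: string }[];
};

type ClassifyResponse = {
  tags: { id: string; intent: string; actions: string[] }[];
};

// Runtime guard: never trust the shape of a network response.
function isClassifyResponse(x: unknown): x is ClassifyResponse {
  return (
    typeof x === "object" &&
    x !== null &&
    Array.isArray((x as ClassifyResponse).tags) &&
    (x as ClassifyResponse).tags.every(
      (t) => typeof t.id === "string" && Array.isArray(t.actions),
    )
  );
}
```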

3. Build the extension

```sh
cd ../extension
npm install
npm run build
```

4. Load it into Chrome

  1. Open chrome://extensions
  2. Toggle Developer mode (top-right)
  3. Click Load unpacked
  4. Select extension/.output/chrome-mv3 (can't see .output in Finder? press Cmd + Shift + .)

5. Try it

Open LinkedIn, Gmail, or Yahoo Finance. Hover on a name, a ticker, a sender. Pause for 0.6 seconds. Watch the UI come to you.

Press Esc to dismiss. Hover somewhere else to summon a new menu.


🎯 Demo Sites (working today)

| Site | Hover targets | Status |
| --- | --- | --- |
| LinkedIn | Person · Company · Post · Job | ✅ Live |
| Gmail | Sender · Thread · Attachment | ✅ Live |
| Yahoo Finance | Ticker · CEO · Sector | ✅ Live |
| Capital IQ | Company · Filing · Comp set | ✅ Live |
| GitHub | Repo · File · Function name | ✅ Live |
| Amazon | Product · Seller · Review block | ✅ Live |
| Notion / Docs | Block · Heading · Database row | ✅ Live |

Adding a new site = one selector list + one prompt template. The runtime is fully site-agnostic.
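What "one selector list + one prompt template" might look like in practice: a hypothetical profile for a site Beckon doesn't support yet. The selectors, prompt text, and type names are all illustrative, not the repo's actual config format:

```typescript
// Hypothetical site profile: a selector list plus a prompt template.
type SiteProfile = {
  hostPattern: RegExp;
  selectors: string[];    // elements worth classifying at boot
  promptTemplate: string; // site-specific ontology for Haiku
};

const profiles: SiteProfile[] = [
  {
    hostPattern: /(^|\.)news\.ycombinator\.com$/,
    selectors: [".titleline > a", ".hnuser"],
    promptTemplate:
      "Tag each element as one of: story, user, domain. Return JSON.",
  },
];

// Route a page URL to its profile; null means Beckon stays dormant.
function profileFor(url: string): SiteProfile | null {
  const host = new URL(url).hostname;
  return profiles.find((p) => p.hostPattern.test(host)) ?? null;
}
```

Because the runtime never branches on the site, a new profile is purely additive: the boot phase harvests whatever the selectors match and the prompt template supplies the ontology.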


🗺️ Roadmap

  • Boot/runtime split architecture
  • Shadow-DOM radial menu with auto-flip & spread
  • Two-tier static/dynamic action model
  • Hono proxy with Haiku + Sonnet routing
  • LinkedIn end-to-end (person · company · post · job)
  • Generative UI components — mini-charts, comp tables, draft cards as first-class outputs
  • Gmail, Yahoo Finance, Capital IQ end-to-end
  • Local-first fallback (Gemini Nano / WebLLM) for sensitive pages
  • Per-site privacy allowlist + sensitive-field detection
  • User memory: actions learn from what you actually click
  • Focus mode v2: radial dim that follows your cursor

🧠 Design Principles

  1. The cursor is the UI. Anything that takes the user's eye away from where they're already looking has lost.
  2. Latency is the product. A perfect feature behind a 2-second wait is worse than a mediocre feature with zero wait.
  3. Generative ≠ chat. The output of an AI action should be a component, not a paragraph.
  4. Every site is its own ontology. Don't pretend the web is uniform. Embrace site-specific intelligence.
  5. The web app stays untouched. Beckon is a layer, not a fork. Your LinkedIn is still LinkedIn.

Beckon

the interaction model that finally fits what the web actually is.

Don't go find the UI. Let it find you.
