Agentic Browser Chat

A privacy-respecting, local-first AI assistant that lives in your browser. Chat with any page, run agent tools, manage notes, tasks, and quizzes, attach files, and keep persistent memory — all from an injected overlay panel that ships with zero servers.

What it is

Agentic Browser Chat is a Chrome extension that puts an AI assistant into an overlay panel injected into every page (a shadow-DOM UI on top of the page, not Chrome's built-in side panel API). It can:

Read whatever page you are on and answer questions about it.
Hold long-running conversations with searchable history, favorites, and per-chat model overrides.
Take actions on your behalf through a built-in tool registry: page queries, form filling, code execution in a sandboxed worker, web search, web fetch, document and image generation, and more.
Remember facts about you across chats, save reusable procedures as skills, and apply your own custom instructions to every chat.
Manage your own notes (with version history and pop-out editor), tasks (with reminders), and quizzes — all alongside the chat.
Attach files (PDF, DOCX, PPTX, XLSX, CSV, images), browser tabs, other chats, other notes, screenshots, and pasted spreadsheets.

It is designed to be local-first: all your data lives in your browser. There is no backend, no analytics, no developer-controlled server. Network requests go only to OpenRouter, the single LLM provider this extension uses.

Screenshots

_{Chat — start a conversation about the current page}	_{Page context — ask anything about what's on screen}	_{Attachments — PDF, DOCX, PPTX, XLSX, CSV, images}
_{Notes — build a personal knowledge base}	_{Tasks — with due dates and reminders}	_{Quizzes — practice from your notes}
_{Image generation — inline via your selected model}	_{Setup — paste your OpenRouter key once}	_{Settings — model, theme, data management}

Highlights

A single browser-side agent that can read the current page, search the web and fetch URLs (under guardrails), run JavaScript inside a sandboxed Web Worker for precise computation, fill forms with a confirmation step, and generate Office documents (DOCX, PPTX, XLSX, PDF, CSV) and images. It carries a persistent memory the model can write to, a library of reusable skills you (or the model) can save, and your own custom instructions that ride on every chat. Notes, tasks, and quizzes live in the same panel so you can capture, schedule, and review knowledge without leaving the page. Everything is stored locally in IndexedDB; the only external network destination is OpenRouter, with your own API key.

Features

Chat with any page

Per-page chat with the LLM, with full markdown, code, diagram, and math rendering.
Persistent chat history: searchable, sortable, pinnable. Old chats can be auto-pruned on a threshold you set.
Suggested prompts on every new chat (Summarise this for me, Explain this to me, Help me with this task/question).
Per-chat model override: pick any OpenRouter model for an individual conversation without changing your default.
Favorites: pin the chats you keep coming back to and switch to the Favs view to filter the list.
Quick Question: a lightweight modal triggered from right-clicking a selection. Each Quick Question becomes its own short chat in a separate Quick Q log so it doesn't clutter your main history.
Copy raw chat for export or sharing.
Reduce-to-float / Expand panel modes: shrink the panel to a floating bubble, or expand it to take over the viewport.

Right-click menu and content selector

Right-click menu entries:
- On a text selection: Explain selection, Summarize selection, Proofread selection, Quick Question about selection, Add selection to chat.
- On an image: Add image to chat.
Content selector: a hover-highlight mode you can toggle on. Click any page element to add its content to the chat. Right-clicking the highlight opens a small menu that lets you choose Add simple HTML to chat (cleaned, flattened representation) or Add raw HTML to chat (the literal markup).
Leave-warning: if you try to navigate away while the agent is mid-task, the extension warns you before the page unloads.

Attachments

Attach context to a chat via the + button in the input area:

Source	Notes
Image upload	`.png`, `.jpg`, `.webp`, `.gif`
File upload	`.txt`, `.md`, `.json`, `.csv`, `.pdf`, `.docx`, `.xlsx`, `.xls`, `.ods`, `.pptx` (and other `text/*`)
Take screenshot	Snapshots the current tab and attaches the image inline
Browser tab content	Pick another open tab; its content gets flattened and attached
Note	Attach an existing note as context
Chat summary	Attach a summary of another chat
Spreadsheet from clipboard	Paste tabular data from your clipboard and treat it as a spreadsheet attachment

Inside the Notes editor, you can also attach files to the note itself (the same file types as chat input).

Notes

A full note editor inside the panel with Edit mode and a separate read view.
Pop-out: open any note in its own floating window so you can keep editing while you browse and chat in the main panel.
Version history: changes are versioned so you can recover earlier drafts.
Favorites and search.
File attachments on each note.
Skill notes: a note can be marked as a skill with a slug like calculate-worksheet-discrepancy. The agent can then load it on demand via the skill tool. See Memory, Skills, and Custom Instructions.

Tasks

Title, optional description, dueAt, reminderAt.
Filter by All / Pending / Completed.
Chrome notifications when reminders fire.
Optional alert sound played alongside the notification.
Configurable reminder lead time (default minutes before due) in Settings.
Reminder alarms are checked every minute by the service worker.

Quiz

Generate questions from any source material — pasted text, an attached note, or a chat — via the generate_questions tool.
Supports MCQ (multiple choice), FITB (fill-in-the-blank), and a mix of the two.
Optional focus parameter scopes generation to a specific topic.
Spaced practice: each question has a pausedUntil field so you can mark it answered and have it disappear from the active pool for a chosen interval.

Agent tools

The agent calls these tools mid-conversation. Grouped by purpose, with safeguards highlighted:

Filesystem-style operations on the notes corpus

read, write, edit — read, create, and modify note content.
grep, ls — discover notes by regex (content or title scope) or list them.

Page interaction

page_query — query elements on the current page using selectors and return their content / state.
page_fill_form — fill form fields and (optionally) submit. Safeguard: comma-separated selector lists are rejected; one confirmed selector per field. Form submission is gated by an explicit user confirmation step.
eval — run JavaScript in a sandboxed Web Worker. Safeguards: no DOM, no chrome APIs, no network (fetch / XHR / WebSocket / importScripts / caches / IndexedDB / BroadcastChannel all blocked). Hard timeout of 5–30 seconds. Output capped at 200 KB; input vars capped at 1 MB.

Web

web_search — search the web. Required gateway: the agent must search before it can fetch any URL it didn't already see.
web_fetch — fetch a URL and either summarize it or answer a specific prompt against it. Handles HTML, plain text, JSON, images (via a vision model), and documents (PDF, DOCX, XLSX, PPTX). Safeguard: the runtime rejects any URL that did not appear in the conversation context (user message or prior tool result). The model cannot fabricate a URL and fetch it. 15-second timeout per request.

Generation

create_document — generate a downloadable DOCX, XLSX, PDF, PPTX, or CSV file with structured content (headings, bullets, tables, sheets, slides). Files are stored as blobs and displayed inline.
generate_image — image generation through the configured image model.
generate_questions — produces quiz items and saves them directly to the Quiz tab.

Context awareness

get_environment — current date, time, timezone, locale, and OS/platform.
memory — read/write the persistent memory note (see below).
skill — create, read, update, or delete skill notes (see below).

Memory, Skills, and Custom Instructions

The extension supports three distinct mechanisms for injecting context into the agent. Knowing the difference matters.

Mechanism	Who writes it	How it's referenced	When the model sees it
Custom instructions	You, in Settings → Agent Rules	Static text typed once	Injected automatically into the system prompt of every chat
Memory	The agent (via the `memory` tool), on your request	A single persistent memory note managed by the model	Injected automatically into every chat
Skills	You or the agent (via the `skill` tool)	Each skill has a unique slug like `calculate-worksheet-discrepancy`. The model lists available skills but loads a skill's body only when needed	Listed in every chat; full body loaded on demand

In practice:

Custom instructions are the place for global preferences: "Always respond in British English", "Keep answers concise", "When showing code, prefer TypeScript".
Memory is for facts the agent should always know about you: "User's name is Tayo", "User uses VS Code", "User's pets are named Bo and Lyra". You ask the agent to remember; it calls the memory tool and the entry is appended (phrased in third person).
Skills are for procedures you don't want to retype: a step-by-step process the agent can re-apply on request. You say "remember how to do X"; the agent saves a skill. Later you can say "do X for this data" and the agent loads /x and follows it.

API call logs

A built-in API Logs view captures every request to OpenRouter (prompt, tool calls, raw response, token counts, cost estimate). It's intended as a power-user and debugging feature — useful when an answer is wrong and you want to inspect exactly what the model received, or when you're tuning custom instructions and want to see the assembled system prompt.

Open it from Settings → View API logs. Logs can be viewed as rendered text or raw JSON. They can be cleared from the same view.

Cross-tab sync

The panel state stays consistent across tabs:

Open the panel on tab A, switch to tab B, and the panel appears with the same conversation, selected note, and tab focus.
A Sync all button in the panel toolbar forces a manual re-sync of chats, notes, tasks, and quiz questions across tabs when needed (e.g. after fixing a stale draft).
The service worker tracks the active tab and pushes state updates to the relevant content scripts.

Rendering

The chat surface renders model output with:

Markdown via marked, sanitized through DOMPurify.
Syntax highlighting for code via highlight.js.
Mermaid diagrams inline.
LaTeX / math via MathJax with the TeX-SVG output.
A multi-slide onboarding carousel introduces these features on first run (the screenshots above are taken from it).

Install

From the Chrome Web Store

Coming soon. A link will be added here once the listing is live.

From source (developer mode)

This is the supported path while the Chrome Web Store listing is in review, and the recommended path if you want to hack on the code.

Clone the repo:

git clone https://github.com/t33w411/agentic-browser-chat.git
cd agentic-browser-chat

Open chrome://extensions in Chrome (or any Chromium-based browser: Edge, Brave, Arc, Vivaldi).
Toggle Developer mode on (top-right).
Click Load unpacked.
Select the cloned agentic-browser-chat directory.
Pin the extension to the toolbar for easy access.

There is no build step. The extension loads source files directly. After pulling new changes, click the extension's reload icon in chrome://extensions and reload any open tabs you want the changes to apply to.

Quick start

Click the extension's toolbar icon to open the side panel on any page.
Walk through the onboarding carousel, or skip to Settings (gear icon).
Paste your OpenRouter API key and pick a default model.
Optional: in Settings → Agent Rules, add a line or two of custom instructions (e.g. "Be concise. Answer in British English.").
Close settings and try:
- "Summarize this page."
- Select a paragraph, right-click → Quick Question about selection.
- Drag a PDF into the input box and ask about it.
- Ask: "Remember that my name is Tayo and I prefer concise responses." The agent will write that to memory so it persists across chats.
Open the Notes tab; the agent can also write to and read from your notes via its filesystem-style tools.
Open the Tasks tab and create a task with a due time — you'll get a Chrome notification when it fires.

Configuration

All configuration lives in the panel's Settings tab. There is no separate options page.

Setting	What it does
Theme	Light, dark, or follow system.
OpenRouter API Key	Your OpenRouter API key. Stored locally in `chrome.storage`. Never transmitted anywhere except in the `Authorization` header of OpenRouter API calls.
Default chat model	Used for new conversations. Per-chat overrides are available from inside any chat.
Image generation model	Used by the `generate_image` tool.
Custom instructions (Agent Rules)	Free-text instructions injected into the system prompt of every new chat.
Alert sound	Toggles a sound effect for task reminders and agent confirmation prompts.
Reminder lead time	Default minutes before `dueAt` to fire a reminder (0–1440).
Storage used	Live estimate of how much browser storage the extension is consuming.
Delete chats older than	Auto-prune chats above a threshold (30 / 60 / 90 / 180 / 365 days, or Never). Pinned chats are excluded.
Prune orphaned blobs	Manually reclaim space used by attachment blobs no longer referenced by any chat or note.
View API logs	Open the API call inspector (see API call logs).

LLM provider

The extension uses OpenRouter as its single provider. OpenRouter aggregates models from OpenAI, Anthropic, Google, Meta, Mistral, DeepSeek, and many others behind one API and one key, so you can switch models freely without juggling separate accounts.

Direct integrations with OpenAI, Anthropic, or other provider APIs are not supported.

No API key ships with the extension. Bring your own OpenRouter key.

Permissions: what each one is for

Manifest V3 requires upfront declaration of every permission. Here is the why for each:

Permission	Why
`<all_urls>` host access	The extension needs to read page content (when you ask it to) and inject the panel + content selector on any site. Activated only when you invoke a feature.
`activeTab`, `tabs`	Identify the current tab and synchronize the panel across tabs.
`scripting`	Inject the panel and tool scripts dynamically (and re-inject after extension reload).
`contextMenus`	Right-click selection actions (Explain, Summarize, etc.).
`clipboardRead`, `clipboardWrite`	Read clipboard when you paste into the panel; write formatted output when you copy from it.
`storage`	Persist settings, API key, and preferences.
`alarms`	Schedule task reminders.
`notifications`	Show reminder notifications when a task is due.
`offscreen`	Run DOM-dependent utilities under MV3, which strips the service worker of a DOM.

If a future feature would require a new permission, it will be called out in the changelog and discussed in an issue first.

Privacy

Short version: all your data stays on your device. The only network destination is OpenRouter, which you authenticate with your own key.

The extension does not:

Operate any developer-controlled server.
Send analytics, telemetry, or crash reports anywhere.
Sync data across devices.
Phone home for license checks, updates beyond the standard Chrome Web Store mechanism, or anything else.

The extension does:

Store chats, notes, tasks, quizzes, settings, memory, skills, and attachment blobs in your browser's IndexedDB and chrome.storage.
Send your conversation, attachments, and API key to OpenRouter over HTTPS on each turn you initiate.

Full details: PRIVACY.md.

Architecture

The extension is a single-process Manifest V3 setup:

A service worker (background/service-worker.js) coordinates everything: storage, tab messaging, context menus, alarms, the IndexedDB handler, and the agent's API logger.
Content scripts are injected into every tab. They render the side panel inside a shadow DOM, drive the content selector and selection actions, and bridge messages to the service worker.
The panel is built in vanilla JS with no framework and no bundler. UI lives inside a shadow root with mode: 'open'; queries go through the shadow root, never document.
The agent lives client-side. It builds the system prompt + tool list (agent/contextBuilder.js), calls the LLM (agent/client.js), executes tool calls (agent/toolExec.js), and compacts history when the context window fills (agent/compactor.js).
All persistent state lives in IndexedDB via Dexie (shared/db.js), proxied through a service-worker handler so content scripts have a single, race-free path to the DB.

Why no framework?

Manifest V3 forbids remotely-loaded code; everything must be bundled.
A framework would add complexity (bundler, build, source maps) for marginal benefit at this scale.
Vanilla JS keeps the code reviewable and the extension small.

Project layout

agentic-browser-chat/
├── manifest.json              # MV3 manifest; declares scripts, permissions, web-accessible resources
├── background/                # Service worker and its modules
│   ├── service-worker.js      # Entry point: wires up everything below
│   ├── dbHandler.js           # IndexedDB proxy for content scripts
│   ├── tabMessaging.js        # Cross-tab/content messaging + script re-injection
│   ├── contextMenus.js        # Right-click menu setup
│   ├── commands.js            # Keyboard command handlers
│   ├── panelDataRepoImpl.js   # Repository functions backing the panel (notes, chats, tasks, blobs)
│   └── apiLoggerImpl.js       # Persistent log of LLM API calls
├── content/                   # Content-script bootstrap
│   ├── preInit.js             # First in the load order; bumps the listener-generation counter
│   ├── keyboardShield.js      # Captures keys early to protect panel inputs
│   ├── agentLeaveWarning.js   # Warns on navigate-away while agent is working
│   └── main.js                # Last in the load order; runs re-init recovery
├── panel/                     # The injected overlay panel (shadow-DOM UI)
│   ├── panel.js               # Boot controller; attaches the shadow root
│   ├── panelTemplate.js       # HTML template
│   ├── panelData.js           # Static reference data (templates, defaults)
│   ├── panelRuntime.js        # All UI logic, tab routing, data-action delegation
│   ├── panelStateSync.js      # Cross-tab state sync
│   ├── panelIcons.js          # Icon SVGs
│   ├── images/                # Static images used inside the panel
│   └── panel.css
├── agent/                     # The LLM agent loop
│   ├── client.js              # Provider-agnostic HTTP client
│   ├── contextBuilder.js      # Builds the system prompt + message list per turn
│   ├── tools.js               # Tool definitions (schemas)
│   ├── toolExec.js            # Executes tool calls in the page/extension
│   ├── documentGeneration.js  # Generated-doc support
│   ├── fileParsing.js         # PDF/DOCX/XLSX/etc. parsing pipeline
│   ├── compactor.js           # History compaction when context fills
│   └── apiLogger.js           # Client side of the API log
├── shared/                    # Code used by both content scripts and the service worker
│   ├── db.js                  # Dexie schema
│   ├── messages.js            # Message-type constants
│   ├── storage.js             # chrome.storage wrapper
│   ├── search.js              # FlexSearch index helpers
│   ├── runtimeRequest.js      # Promise-based chrome.runtime.sendMessage wrapper
│   ├── panelDataRepo.js       # Proxy that forwards repo calls to the service worker
│   └── toolRegistry.js        # Lookup for available tools
├── tools/                     # Page-interaction tools
│   ├── contentSelector.js     # Hover-highlight + click-to-attach
│   ├── selectionContextActions.js  # Right-click actions on selected text
│   └── flattenedContent.js    # Clean HTML extraction
├── ui/                        # Shared UI primitives
│   ├── floatingPanel.js       # Panel host element + show/hide
│   └── toast.js               # Toast notifications
├── utils/                     # Pure utilities
│   ├── dom.js
│   └── clipboard.js
├── offscreen/                 # MV3 offscreen document (needed for DOM-only APIs)
├── lib/                       # Vendored third-party libraries (see THIRD_PARTY_NOTICES.md)
├── sounds/                    # Notification sounds
├── styles.css                 # Page-side CSS (toast host etc.)
├── CONTRIBUTING.md
├── CODE_OF_CONDUCT.md
├── SECURITY.md
├── PRIVACY.md
├── THIRD_PARTY_NOTICES.md
└── LICENSE

Bundled libraries

For MV3 compliance (no remote code), all third-party JS is vendored under lib/. Each retains its own license. Full table with versions and upstream URLs: THIRD_PARTY_NOTICES.md.

Highlights:

Library	What it does here
Dexie.js	Wrapper around IndexedDB
FlexSearch	Full-text search index over chats/notes
marked + highlight.js + DOMPurify	Rendering of assistant Markdown safely
Mermaid + MathJax	Diagrams and math rendering inline
PDF.js, mammoth.js, SheetJS, PapaParse, JSZip	Parsing attached files

Development

Prerequisites

Chrome or another Chromium-based browser (Manifest V3 required).
Node.js 18+ for the node --check syntax verification step. No npm install needed; the extension itself has no runtime npm dependencies.

Loop

Edit a file.
node --check path/to/file.js (mandatory after every JS edit).
Click the reload icon next to the extension in chrome://extensions.
Reload the tab you are testing against.
Open DevTools on the page and the service worker (link in chrome://extensions) to watch both consoles.

Architectural rules

A few patterns that will save you time:

Every new content script must be registered in both manifest.json and background/tabMessaging.js, in matching order.
The panel lives in a shadow DOM; all panel queries go through ABChatContent.ui.panelShadowRoot, never document.
No inline event handlers anywhere. Use data-action delegation on the panel's mount node.
Persistent DOM listeners must check the listener-generation counter on every fire so they no-op after the extension reloads.
Adding a new chip field requires syncing four call sites (DOM write, DOM read, DB write, DB read).
Adding a new repo function requires registering it in background/panelDataRepoImpl.js and the proxy table in shared/panelDataRepo.js.

Debugging tips

Service worker logs: chrome://extensions → click Service worker under the extension.
Content-script logs: regular page DevTools.
IndexedDB inspection: DevTools → Application → IndexedDB → agentic-browser-chat.
API call log: visible inside the panel under Settings → View API logs.

Troubleshooting

Symptom	Likely cause / fix
Panel doesn't open	Reload the extension at `chrome://extensions`, then reload the page. After extension code changes, both reloads are required.
"Extension context invalidated" errors after reload	Expected once after extension reload; reload the page to re-inject content scripts.
LLM call fails with 401	OpenRouter API key missing or invalid. Re-enter it in Settings.
Agent says "URL not allowed" when asked to fetch a page	Expected behavior. `web_fetch` only accepts URLs that already appeared in the conversation or in prior tool results — ask the agent to `web_search` first, or paste the URL into the chat.
Task reminder didn't fire	Chrome may have suspended the service worker. The next browser activity will restore alarms; OS-level notification permissions also matter.

Roadmap & status

This is a personal project released as open source. Maintenance is best-effort and breaking changes can happen between minor versions until v2.0. The current focus:

Stability and bug fixes
More agent tools
Better cross-tab sync

Out of scope (for now):

Cloud sync
Mobile
Replacing OpenRouter with a heavier multi-provider abstraction

Contributing

PRs are welcome. Before opening one, please read:

CONTRIBUTING.md — setup, conventions, PR checklist.
CODE_OF_CONDUCT.md.

Good first contributions: bug fixes with clear reproductions, documentation improvements, new agent tools that follow the existing patterns, accessibility, performance.

Security

Please report security issues privately, not as public issues. See SECURITY.md.

License

Bundled third-party libraries are subject to their own licenses; see THIRD_PARTY_NOTICES.md.

Acknowledgements

Built on the shoulders of the open-source projects listed in THIRD_PARTY_NOTICES.md. Special thanks to Dexie, marked, highlight.js, Mermaid, MathJax, PDF.js, mammoth.js, SheetJS, PapaParse, DOMPurify, FlexSearch, and JSZip — without them this would not exist.

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
.github		.github
agent		agent
background		background
content		content
lib		lib
offscreen		offscreen
panel		panel
shared		shared
sounds		sounds
tools		tools
ui		ui
utils		utils
.gitignore		.gitignore
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
PRIVACY.md		PRIVACY.md
README.md		README.md
SECURITY.md		SECURITY.md
THIRD_PARTY_NOTICES.md		THIRD_PARTY_NOTICES.md
icon.png		icon.png
manifest.json		manifest.json
package.json		package.json
styles.css		styles.css

Folders and files

Latest commit

History

Repository files navigation

Agentic Browser Chat

Table of contents

What it is

Screenshots

Highlights

Features

Chat with any page

Right-click menu and content selector

Attachments

Notes

Tasks

Quiz

Agent tools

Memory, Skills, and Custom Instructions

API call logs

Cross-tab sync

Rendering

Install

From the Chrome Web Store

From source (developer mode)

Quick start

Configuration

LLM provider

Permissions: what each one is for

Privacy

Architecture

Why no framework?

Project layout

Bundled libraries

Development

Prerequisites

Loop

Architectural rules

Debugging tips

Troubleshooting

Roadmap & status

Contributing

Security

License

Acknowledgements

About

Resources

License

Code of conduct

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages