This repository is a Firefox-first build of Azzta Agent. Recent changes include:
- Firefox WebExtensions manifest (`manifest.firefox.json`) and sidebar UI wiring.
- Cross-browser runtime shim (`browser` vs `chrome`) in background, UI, tools, and content scripts.
- Safe fallbacks when tab grouping APIs are unavailable in Firefox.
- `onMessage` handler updated to avoid the Firefox “promised response went out of scope” error.
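The shim and the `onMessage` change listed above can be sketched roughly as follows. This is a minimal illustration, not the code in this repository: `pickRuntime` and `makeListener` are hypothetical names, and the sketch assumes the Firefox error arises when a listener keeps the channel open (returns `true`) but never calls `sendResponse`, which is one common cause.

```javascript
// Pick the extension API namespace. Firefox exposes a promise-based `browser`
// object, Chromium exposes `chrome`. The `scope` argument defaults to
// globalThis so the helper can also be exercised outside an extension.
function pickRuntime(scope = globalThis) {
  if (typeof scope.browser !== "undefined" && scope.browser.runtime) {
    return scope.browser;
  }
  if (typeof scope.chrome !== "undefined" && scope.chrome.runtime) {
    return scope.chrome;
  }
  return null; // not running inside an extension
}

// Wrap an async handler so that sendResponse is always called, even when the
// handler throws. Returning true keeps the message channel open until the
// asynchronous reply has been sent.
function makeListener(handleMessage) {
  return (message, sender, sendResponse) => {
    Promise.resolve()
      .then(() => handleMessage(message, sender))
      .then(
        (result) => sendResponse({ ok: true, result }),
        (error) =>
          sendResponse({ ok: false, error: String((error && error.message) || error) })
      );
    return true;
  };
}

const api = pickRuntime();
if (api) {
  // The echo handler is a placeholder for the real message dispatcher.
  api.runtime.onMessage.addListener(makeListener(async (message) => ({ echo: message })));
}
```

Because every code path ends in a `sendResponse` call, the sender always receives either a result or a structured error instead of a dropped channel.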
Azzta Agent is a premium warm-paper inspired Chrome (Chromium) extension built for professionals and teams who want brand-safe browser automation. Every detail – from the tactile UI to the safety prompts – is tuned for production distribution, paid plans, and tight visual identity. The Live/History panes, profile manager (including vision routes and orchestrator tools), and compacted context allow you to confidently surface Azzta as a monetizable feature inside your workflow toolkit.
- Safe automation: Unsafe actions (installing extensions, deleting history, visiting unknown schemes) are blocked unless explicitly requested, and tab groups created by the user are preserved.
- Vision & screenshot controls: Screenshots stay off by default; enable them only when pairing the main agent with a vision-capable profile. Vision bridges automatically describe captures so non-vision models can continue reasoning.
- Orchestrator workflows: Toggle orchestrator mode to expose `spawn_subagent`/`subagent_complete`, letting the orchestrator spin up focused helpers for subtasks and gather sanitized summaries.
- Warm Paper UI: Geist Sans typography, rounded cards, solid warm hues, card glow, and tab/history panes deliver a clean, tactile experience with subtle pulses and scrollbars.
- History & context management: Sessions persist locally (if enabled), `<think>` blocks render cleanly, context is compacted when the limit nears, and the model keeps working even after individual errors.
- Azzta identity: Header copy, pill tabs, pill buttons, and status rings carry the brand narrative – the product never looks like a rough utility overlay but a curated experience you can safely charge for.
- Monetization-friendly controls: Strict safety guardrails, orchestrator tooling, and history segmentation let you offer Azzta as a paid tier where customers rely on reliable outcomes.
- Design system ready: Warm Paper palette (solid creams, soft tans, charcoal text) plus Geist Sans/Mono fonts, subtle rings, and card shadows form a reusable system for future marketing or docs.
- Configure multiple profiles (system prompt, model, provider, temperature, tokens, timeout, screenshot preferences) and switch mid-session.
- Vision profiles describe screenshots for non-vision agents, while screenshots stay disabled unless explicitly toggled.
- Profiles are reused by orchestrators and optional sub-agents for flexible workflows.
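As a rough illustration of how a saved profile might be resolved at run time, the sketch below merges a profile with defaults. The field names and default values are assumptions made for this example; the actual storage schema may differ.

```javascript
// Hypothetical defaults; the numeric values are placeholders, not the
// extension's real settings.
const PROFILE_DEFAULTS = {
  provider: "openai",
  model: "",
  systemPrompt: "",
  temperature: 0.7,
  maxTokens: 4096,
  timeoutMs: 60000,
  enableScreenshots: false, // screenshots stay off unless explicitly toggled
};

// Look up a saved profile by name and fill in any missing fields.
function resolveProfile(profiles, name) {
  const saved = profiles[name];
  if (!saved) {
    throw new Error(`Unknown profile: ${name}`);
  }
  return { ...PROFILE_DEFAULTS, ...saved, name };
}

// Example: a vision profile that opts in to screenshots.
const profiles = {
  main: { provider: "anthropic", model: "example-model" },
  vision: { model: "example-vision-model", enableScreenshots: true },
};
const active = resolveProfile(profiles, "main");
```

Switching mid-session then amounts to calling `resolveProfile` again with a different name.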
- Navigation & visibility: `navigate`, `getContent`, `screenshot` (vision aware), `getTabs`, `describeSessionTabs`; glow annotations keep track of operated pages.
- Interaction: `click`, `type`, `pressKey`, `scroll`, `focusTab`, `switchTab`, `openTab`, `closeTab`.
- Tab orchestration: `groupTabs`, `describeSessionTabs`, `spawn_subagent`, `subagent_complete`.
- History & safety: History APIs remain accessible but require explicit user consent; orchestrator prompts instruct the model to avoid destructive actions.
- When enabled, exposes `spawn_subagent` and `subagent_complete` tools.
- Orchestrator builds sub-agent histories that run tools independently, report progress, and return structured summaries.
- Supports up to ten simultaneous helpers, each respecting the same navigation/safety guardrails.
- Sub-agent navigation bar: Visual navigation between main agent and sub-agents with status indicators.
- Live status indicators: Green pulsing indicator shows active agent, completed sub-agents marked with checkmarks.
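The ten-helper ceiling described above could be enforced with a small concurrency pool. This is a sketch under assumptions: `runHelpers` and the result shape are hypothetical, and each task stands in for a sub-agent run that eventually resolves with its summary.

```javascript
const MAX_HELPERS = 10; // ceiling stated in the list above

// Run tasks (functions returning promises) with at most `limit` in flight.
// Results keep input order; a failed helper is recorded instead of aborting
// the remaining ones.
async function runHelpers(tasks, limit = MAX_HELPERS) {
  const results = new Array(tasks.length);
  let next = 0;

  async function worker() {
    while (next < tasks.length) {
      const index = next++;
      try {
        results[index] = { status: "complete", summary: await tasks[index]() };
      } catch (error) {
        results[index] = {
          status: "failed",
          error: String((error && error.message) || error),
        };
      }
    }
  }

  // Start up to `limit` workers; each pulls the next task when it is free.
  const workers = Array.from({ length: Math.min(limit, tasks.length) }, () => worker());
  await Promise.all(workers);
  return results;
}
```

A worker-pull design keeps the number of active helpers bounded without needing a separate queue or semaphore.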
- The Agent Teams panel lists every saved configuration; a single tap on a pill assigns the Main, Vision, Orchestrator, or Team role.
- Toggle Vision or Orchestrator pills to route screenshots and coordination duties while Team pills let multiple allied agents run in parallel.
- A built-in profile editor inside the Agent Library tab lets you click any card, update its provider/API/model/system prompt settings, and save immediately—no more bouncing back to general settings for simple edits.
- Live/History tabs allow quick switching between the current conversation and saved sessions.
- Tab selection preserves user choices, and the panel displays a glowing status ring plus tool timeline entries.
- Thinking blocks render `<think>`/`<analysis>` snippets, and streaming updates show incremental thought progress.
- Dark mode optimized: Teal/cyan gradient message bubbles and clean, shadow-free styling for a modern aesthetic.
- Context tracking: Real-time token usage display with cumulative session tracking.
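Token tracking and compaction can be approximated as below. This is a simplified sketch: the four-characters-per-token estimate is a rough heuristic rather than a real tokenizer, the function names are hypothetical, and a production version would likely summarize older turns instead of replacing them with a placeholder note.

```javascript
// Rough estimate: about four characters per token.
function estimateTokens(message) {
  return Math.ceil(String(message.content ?? "").length / 4);
}

// Once usage crosses `threshold` of `limit`, keep the system messages and the
// most recent turns, and replace everything older with a single note.
function compactContext(messages, limit, threshold = 0.8, keepRecent = 6) {
  const total = messages.reduce((sum, m) => sum + estimateTokens(m), 0);
  if (total < limit * threshold) {
    return messages; // still comfortably under the limit
  }

  const system = messages.filter((m) => m.role === "system");
  const rest = messages.filter((m) => m.role !== "system");
  if (rest.length <= keepRecent) {
    return messages; // nothing old enough to drop
  }

  const dropped = rest.length - keepRecent;
  const note = { role: "system", content: `[${dropped} earlier messages compacted]` };
  return [...system, note, ...rest.slice(-keepRecent)];
}
```

Note that cutting at a fixed message count can separate a tool call from its result, so a real implementation would need to cut on a turn boundary.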
- Multi-layer error recovery: Automatic retry with exponential backoff for API errors.
- Tool ordering fixes: Intelligent message sanitization prevents "tool call result does not follow tool call" errors.
- Silent retry: Up to 3 automatic retries before surfacing errors to user.
- Emergency recovery: Automatic tool history clearing for persistent ordering issues.
- Non-terminating errors: Process continues gracefully after errors instead of stopping completely.
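The retry and sanitization behavior described above might look roughly like this. Both helpers are illustrative: `withRetry` and `sanitizeToolOrder` are hypothetical names, the 500 ms base delay is a placeholder, and the message shape assumes OpenAI-style `tool_calls` / `tool_call_id` fields.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Retry `call` silently with exponential backoff. With retries = 3 the call
// runs at most four times; the last error is rethrown so it can be surfaced.
async function withRetry(call, { retries = 3, baseDelayMs = 500, wait = sleep } = {}) {
  let attempt = 0;
  for (;;) {
    try {
      return await call(attempt);
    } catch (error) {
      if (attempt >= retries) {
        throw error; // silent retries are used up
      }
      await wait(baseDelayMs * 2 ** attempt); // 500, 1000, 2000 ms by default
      attempt += 1;
    }
  }
}

// Drop tool results whose call id was never announced by an earlier assistant
// message. This simplified check does not verify strict adjacency.
function sanitizeToolOrder(messages) {
  const announced = new Set();
  return messages.filter((m) => {
    if (m.role === "assistant") {
      for (const toolCall of m.tool_calls ?? []) {
        announced.add(toolCall.id);
      }
      return true;
    }
    if (m.role === "tool") {
      return announced.has(m.tool_call_id);
    }
    return true;
  });
}
```

The `wait` option exists so the backoff can be replaced in tests; in normal use the default `sleep` applies.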
- Clone or download, then enter the directory: `git clone <repo-url>` followed by `cd browser-ai`.
- Ensure `icons/` contains `icon16.png`, `icon48.png`, `icon128.png` (optional but recommended).
- Open `chrome://extensions`, enable Developer Mode, and load the unpacked `browser-ai` directory.
- Pin the extension to the toolbar if desired.
- Copy the Firefox manifest into place: `cp manifest.firefox.json manifest.json`
- Open `about:debugging#/runtime/this-firefox`, click "Load Temporary Add-on", and select `manifest.json`.
- Click the toolbar icon to open the sidebar panel.
Notes:
- Firefox does not support tab groups; the `groupTabs` tool is hidden and session grouping is skipped.
- The UI opens as a sidebar instead of a Chrome side panel.
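Hiding `groupTabs` can be done with feature detection rather than browser sniffing. The sketch below assumes the tool list is an array of tool names and that grouping relies on `tabGroups` plus `tabs.group`; the helper names are hypothetical.

```javascript
// True when the extension API object exposes the tab grouping calls.
function supportsTabGroups(api) {
  return Boolean(
    api && api.tabGroups && api.tabs && typeof api.tabs.group === "function"
  );
}

// Filter the advertised tools so the model never sees one it cannot use.
function availableTools(allTools, api) {
  if (supportsTabGroups(api)) {
    return allTools;
  }
  return allTools.filter((name) => name !== "groupTabs");
}

// Example with stand-in API objects.
const chromiumLike = { tabGroups: {}, tabs: { group: () => {} } };
const noGroups = { tabs: {} };
const tools = ["navigate", "groupTabs", "describeSessionTabs"];
const toolsWithoutGroups = availableTools(tools, noGroups);
```

Detecting the capability itself means the same build keeps working if a browser adds or removes the API later.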
- Open the side panel via the toolbar icon.
- Click the gear icon to open Settings.
- Choose your provider (OpenAI, Anthropic, Custom) and paste the corresponding API key.
- Configure model, temperature, timeout, and system prompt per profile.
- Toggle screenshot tools only when pairing with a vision-capable profile or vision bridge.
- Enable orchestrator mode to expose sub-agent tooling and pick an orchestrator profile if desired.
- In the Agent Teams panel, tap the Main / Vision / Orchestrator / Team pills to assign the right profiles to each role. You can select multiple profiles for auxiliary duties to keep complex workflows readable.
- Click any card to edit it directly in the Agent Library tab, then hit Save Profile to keep the change.
- Save settings and switch profiles via the dropdown.
- Ask the assistant to drive the browser: “Open example.com, find contact emails, and summarize the hero section.”
- Use Vision bridging: “Capture the modal, describe it with the vision profile, then continue filling the form.”
- Spawn a helper: “Start a sub-agent to collect pricing tiers on this page and summarize with subagent_complete.”
- Save history: toggle it on to store sessions locally, then switch to the History tab to reopen or review them.
sidepanel/ → UI (HTML/CSS/JS) with Warm Paper styling and live/history tabs
background.js → Service worker orchestrating AI calls, tool execution, security guards, and orchestrator helpers
ai/provider.js → OpenAI/Anthropic integration, tool formatting, vision descriptions, safety prompts
tools/browser-tools.js → Browser automation helpers, page glow, group-aware tab management, URL validation
content.js → DOM helpers for highlighting, hover simulation, metadata
- API keys are stored locally and only used for outbound requests to configured providers.
- The orchestrator/system prompts explicitly forbid installs, destructive history edits, or actions outside user-selected tabs.
- Navigation validates URLs and rejects unsafe schemes (`javascript:`, `data:`, `chrome:`).
- Screenshots are opt-in, and image data is never transmitted unless the vision profile is explicitly configured.
- Content scripts have no access beyond user tabs; no telemetry is collected.
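URL validation along these lines can be sketched with the standard `URL` parser. This is an illustration, not the check in `tools/browser-tools.js`: the function name is hypothetical, and the sketch goes one step further than a blocklist by allowing only `http:` and `https:`, which is an assumption about how unknown schemes are handled.

```javascript
// Schemes named explicitly in the list above, kept for clearer error messages.
const BLOCKED_SCHEMES = new Set(["javascript:", "data:", "chrome:"]);
// Assumption: anything that is not plain web navigation is refused.
const ALLOWED_SCHEMES = new Set(["http:", "https:"]);

function validateNavigationUrl(input) {
  let url;
  try {
    url = new URL(input); // throws on relative or malformed input
  } catch {
    return { ok: false, reason: "not an absolute URL" };
  }
  if (BLOCKED_SCHEMES.has(url.protocol)) {
    return { ok: false, reason: `blocked scheme ${url.protocol}` };
  }
  if (!ALLOWED_SCHEMES.has(url.protocol)) {
    return { ok: false, reason: `unsupported scheme ${url.protocol}` };
  }
  return { ok: true, url: url.href };
}
```

Parsing before checking matters because the parser normalizes case, so `JavaScript:` is caught as well as `javascript:`.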
npm test # Runs the full testing suite
npm run validate # Validates extension files and manifest
npm run test:unit # Unit tests
`npm test` currently fails when corepack cannot fetch signatures on certain Node versions; rerun after ensuring corepack can update.
- Panel not opening: reload via `chrome://extensions`, ensure it is enabled, and check console logs.
- AI not responding: verify API key, model name, provider limits, and extension permissions (tabs, storage, scripting).
- Tools failing: inspect the Console/Network tabs for script injection issues, ensure DOM selectors exist, or increase tool timeout.
- Visual selector (point, highlight, confirm).
- Workflow templates (multi-pass diaries).
- Native vision model selection UI.
- Traffic-safe DevTools Protocol integration.
- Export session history or transcripts.
Improvements welcome: more tests, additional tools, UI polish, safety audits, localization.
MIT License – see LICENSE.
Create an issue or join discussions in this repository.