Skip to content

Browser Tool

scarecr0w12 edited this page Jun 19, 2026 · 3 revisions

Browser Tool

The browser tool provides headless browser automation via Playwright, enabling agents to navigate websites, interact with pages, capture screenshots, and extract accessibility snapshots.

Architecture

The browser tool uses Playwright's Chromium engine in headless mode. All actions are gated through the security supervisor for sensitive operations (screenshots, form interactions).

Agent → browser tool call
  → Launch headless Chromium
  → Navigate to URL / Perform action
  → Security gate (screenshots, form data)
  → Return result (screenshot base64, snapshot text, or action confirmation)
  → Cleanup browser context

Available Actions

Action Description Parameters
navigate Navigate to a URL url
click Click an element target (selector or description)
type Type text into an input target, text
screenshot Capture page screenshot fullPage (boolean), format (png/jpeg)
snapshot Capture accessibility snapshot target (optional element selector)
evaluate Execute JavaScript on page code (function string)
wait Wait for text or time time (seconds) or text

Security

The browser tool triggers the security supervisor for:

  • Screenshots — may capture sensitive UI content (credentials, personal data)
  • Snapshots — accessibility tree may include sensitive form labels or data
  • Evaluate — arbitrary JavaScript execution (RCE-equivalent, requires explicit policy approval)

Each action goes through:

  1. Policy validation (regex allow/deny rules)
  2. LLM security supervisor review (for sensitive actions)
  3. Human approval if needed

Configuration

The browser tool uses a 30-second default timeout for all operations. No additional configuration is required — it uses the system's Playwright installation.

Example Usage

Agent: I need to check the CortexPrism GitHub releases page.

Tool call: browser.navigate("https://github.com/CortexPrism/cortex/releases")
Result: Navigated to https://github.com/CortexPrism/cortex/releases

Tool call: browser.snapshot()
Result: [Accessibility tree showing release list with versions and dates]

Agent: The latest release is v0.44.0 from June 18, 2026...

Capabilities

The browser tool requires these capabilities:

  • network:fetch — for page navigation
  • browser:navigate — URL navigation
  • browser:screenshot — screenshot capture
  • browser:evaluate — JavaScript execution

See Also

Clone this wiki locally