browser

A CLI for browser automation via Chrome DevTools Protocol, built on go-rod.

Launch a headless Chrome, then drive it entirely from the command line — navigate pages, fill forms, click buttons, take screenshots, run JavaScript, inspect the accessibility tree, and more. The browser runs as a persistent background process, so each CLI invocation is fast and stateless.

Install

macOS / Linux / WSL

curl -fsSL https://raw.githubusercontent.com/ShawnPana/browser/main/install.sh | sh

Windows (PowerShell)

irm https://raw.githubusercontent.com/ShawnPana/browser/main/install.ps1 | iex

Windows (CMD)

powershell -c "irm https://raw.githubusercontent.com/ShawnPana/browser/main/install.ps1 | iex"

Go install

go install github.com/ShawnPana/browser@latest

Build from source

git clone https://github.com/ShawnPana/browser.git
cd browser
go build -o browser .

Requires Chrome or Chromium installed on the system (or set BROWSER_CHROME_BIN).

Quick Start

# Launch a headless Chrome
browser start

# Navigate to a page
browser open https://example.com

# Inspect the page
browser get title
browser ax-tree --depth 3

# Interact
browser click 'a[href="/more"]'
browser input '#search' 'hello world'
browser screenshot result.png   # saves to ~/.browser/tmp/result.png; use ./result.png for CWD

# Clean up
browser stop

Commands

Browser Lifecycle

Command	Description
`browser start`	Launch headless Chrome
`browser start --show`	Launch with UI visible
`browser start -k`	Launch with `--ignore-certificate-errors`
`browser stop`	Close the active browser
`browser connect <host:port>`	Connect to an existing Chrome instance
`browser connect <https://url>`	Connect to a cloud browser
`browser connect <index>`	Switch active browser by registry index
`browser status`	Show all registered browsers with liveness

The browser persists as a background process across CLI invocations. State is stored in ~/.browser/state.json, and output files (screenshots, PDFs, downloads) default to ~/.browser/tmp/. Multiple browsers can be registered simultaneously — one is active at a time.

Navigation

Command	Description
`browser open <url>`	Navigate to URL (waits for load)
`browser back`	Go back in history
`browser forward`	Go forward in history
`browser reload`	Reload page
`browser reload --hard`	Hard reload (bypass cache)

Page Info

Command	Description
`browser get url`	Print current URL
`browser get title`	Print page title
`browser get html`	Print full page HTML
`browser get html <selector>`	Print element's outer HTML
`browser get text <selector>`	Print element's visible text
`browser get attr <selector> <name>`	Print element attribute value
`browser get value <selector>`	Print input element value
`browser get box <selector>`	Print bounding box (x, y, width, height)
`browser get styles <sel> [prop...]`	Print computed styles

Interaction

All interaction commands accept either a CSS selector or x,y coordinates.

With CSS selectors:

Command	Description
`browser click <selector>`	Click element
`browser dblclick <selector>`	Double-click element
`browser rightclick <selector>`	Right-click element
`browser input <selector> <text>`	Clear field and type text (replace)
`browser type <selector> <text>`	Type text without clearing (append)
`browser press <key>`	Press key combo (e.g. `Enter`, `Control+a`)
`browser clear <selector>`	Clear input field
`browser select <selector> <value>`	Select dropdown option by value
`browser submit <selector>`	Submit form
`browser hover <selector>`	Hover over element
`browser focus <selector>`	Focus element
`browser check <selector>`	Check checkbox
`browser uncheck <selector>`	Uncheck checkbox
`browser scrollintoview <selector>`	Scroll element into viewport

With coordinates:

Command	Description
`browser click <x> <y>`	Click at coordinates
`browser dblclick <x> <y>`	Double-click at coordinates
`browser rightclick <x> <y>`	Right-click at coordinates
`browser input <x> <y> <text>`	Click at coordinates, clear, then type
`browser type <x> <y> <text>`	Click at coordinates, then type (no clear)
`browser hover <x> <y>`	Hover at coordinates
`browser scroll <x> <y> <delta>`	Scroll at coordinates (delta in px)
`browser scroll <selector> <delta>`	Scroll inside an element
`browser drag <x1> <y1> <x2> <y2>`	Drag from one point to another
`browser element-at <x> <y>`	Describe the DOM element at coordinates

Keyboard

Send keystrokes without a selector (acts on the currently focused element):

Command	Description
`browser keyboard type <text>`	Type text with real keystrokes
`browser keyboard inserttext <text>`	Insert text without key events

JavaScript

browser eval <expression>

Evaluates a JavaScript expression in the page context. The expression is auto-wrapped in () => { return (expr); }.

browser eval 'document.title'                                    # → Example Domain
browser eval '2 + 2'                                             # → 4
browser eval 'document.querySelectorAll("a").length'             # → 12
browser eval 'JSON.stringify([...document.querySelectorAll("h2")].map(e => e.textContent))'

Output formatting: strings are printed unquoted, numbers and booleans are printed raw, objects and arrays are pretty-printed as JSON.

File Operations

Command	Description
`browser file <selector> <path>`	Upload a file to a file input
`browser file <selector> -`	Upload from stdin
`browser download <selector> [file]`	Download resource from element's `href`/`src`
`browser download <selector> -`	Download to stdout

browser file 'input[type="file"]' ./document.pdf
cat image.png | browser file 'input[type="file"]' -
browser download 'a.download-link' ./output.pdf

Waiting

Command	Description
`browser wait <selector>`	Wait for element to become visible
`browser wait-load`	Wait for page load event
`browser wait-stable`	Wait for DOM stability (300ms)
`browser wait-idle`	Wait for requestIdleCallback (5s)
`browser sleep <seconds>`	Sleep for a duration

Output

Bare filenames default to ~/.browser/tmp/. Use an explicit path (./, ../, /) to write elsewhere.

browser screenshot                          # → ~/.browser/tmp/screenshot.png (auto-named)
browser screenshot page.png                 # → ~/.browser/tmp/page.png
browser screenshot ./page.png               # → ./page.png (explicit path)
browser screenshot -w 1920 file.png         # Custom width, full-page height
browser screenshot -w 1920 -h 1080 file.png # Fixed viewport clip
browser pdf page.pdf                        # → ~/.browser/tmp/page.pdf
browser pdf ./page.pdf                      # → ./page.pdf (explicit path)

When -h is specified, the screenshot clips to the viewport height. Without it, the full scrollable page is captured.

Tabs

Command	Description
`browser tabs`	List all open tabs
`browser switch <index>`	Switch active tab
`browser new-tab [url]`	Open a new tab
`browser close-tab [index]`	Close a tab (defaults to active)

browser new-tab https://site-a.com
browser new-tab https://site-b.com
browser tabs
# * [0] about:blank - about:blank
#   [1] Site A - https://site-a.com
#   [2] Site B - https://site-b.com

browser switch 1     # switch to Site A
browser close-tab 0  # close the blank tab

Checks and Assertions

These commands use exit codes for scripting: 0 = pass, 1 = fail, 2 = error.

Command	Description
`browser exists <selector>`	Check if element exists
`browser visible <selector>`	Check if element is visible
`browser count <selector>`	Count matching elements
`browser assert <expr>`	Assert JS expression is truthy
`browser assert <expr> <expected>`	Assert JS result equals expected string
`browser assert <expr> <expected> -m <msg>`	Assert with custom failure message

browser exists 'h1' && echo "found"
browser assert 'document.title' 'My Page'
browser assert 'location.pathname' '/dashboard' -m 'should be on dashboard'

Accessibility

Inspect the accessibility tree to understand page structure without relying on CSS selectors or visual layout.

Command	Description
`browser ax-tree`	Print full accessibility tree
`browser ax-tree --depth N`	Limit tree depth
`browser ax-tree --json`	Output as JSON
`browser ax-find --name <name>`	Find nodes by accessible name
`browser ax-find --role <role>`	Find nodes by ARIA role
`browser ax-node <selector>`	Get accessibility info for a specific element
`browser ax-node <selector> --json`	Output as JSON

browser ax-tree --depth 3
# [RootWebArea] "Example" (url=https://example.com)
#   [navigation] "Main"
#     [link] "Home" (focusable=true)
#     [link] "About" (focusable=true)
#   [main] ""
#     [heading] "Welcome" (level=1)

browser ax-find --role button
browser ax-find --name "Submit"
browser ax-node '#save-btn' --json

Cloud (Browser Use API)

Manage cloud browsers via the Browser Use API.

Command	Description
`browser cloud login <api-key>`	Save API key
`browser cloud logout`	Remove API key
`browser cloud <METHOD> <path> [body]`	REST passthrough
`browser cloud poll <task-id>`	Poll a task until completion
`browser cloud --help`	Show API endpoints (fetches OpenAPI spec)

browser cloud login "$BROWSER_USE_API_KEY"
browser cloud POST /browsers '{"headless": true}'
browser connect https://<uuid>.cdp0.browser-use.com

# All commands work identically on cloud browsers
browser open https://example.com
browser screenshot page.png   # → ~/.browser/tmp/page.png

# Stop auto-detects cloud URLs and calls the API
browser stop

# Run a cloud task
browser cloud POST /tasks '{"url":"https://example.com","task":"Extract the main heading"}'
browser cloud poll <task-id>

Exit Codes

Code	Meaning
0	Success, or check passed (`exists`, `visible`, `assert`)
1	Check failed (`exists`, `visible`, `assert` returned false)
2	Error (bad arguments, no browser, timeout, network failure)

Environment Variables

Variable	Description	Default
`BROWSER_HOME`	State directory (also controls output dir: `$BROWSER_HOME/tmp/`)	`~/.browser`
`BROWSER_TIMEOUT`	Command timeout in seconds	`30`
`BROWSER_CHROME_BIN`	Chrome binary path	auto-detect
`BROWSER_USE_API_KEY`	Cloud API key (alternative to `cloud login`)	none

Architecture

browser/
├── main.go              # Entry point
├── cli/                 # Command routing and arg parsing
│   ├── root.go          # Main switch/case dispatcher, help text
│   ├── utils.go         # Output formatting helpers
│   ├── browser_cmds.go  # start, stop, connect, status
│   ├── nav_cmds.go      # open, back, forward, reload
│   ├── page_cmds.go     # get url/title/html/text/attr/value/box/styles
│   ├── interact_cmds.go # click, dblclick, rightclick, input, type, press, ...
│   ├── keyboard_cmds.go # keyboard type, keyboard inserttext
│   ├── wait_cmds.go     # wait, wait-load, wait-stable, wait-idle, sleep
│   ├── screenshot_cmds.go
│   ├── pdf_cmds.go      # pdf
│   ├── tab_cmds.go      # tabs, switch, new-tab, close-tab
│   ├── assert_cmds.go   # exists, count, visible, assert
│   ├── ax_cmds.go       # ax-tree, ax-find, ax-node
│   └── cloud_cmds.go    # cloud login, REST passthrough, poll
├── core/                # State management and browser connectivity
│   ├── context.go       # Context (state dir, timeout)
│   ├── state.go         # JSON state persistence (~/.browser/state.json)
│   ├── service.go       # WithPage / WithBrowser — connect to active browser
│   ├── launcher.go      # Chrome launch via rod/launcher
│   ├── cloud.go         # Cloud config and cache
│   └── tools/           # Stateless tool functions
│       ├── navigate.go
│       ├── pageinfo.go
│       ├── interact.go
│       ├── keyboard.go
│       ├── js.go
│       ├── pdf.go
│       ├── screenshot.go
│       ├── tabs.go
│       ├── wait.go
│       ├── assert.go
│       ├── accessibility.go
│       └── file.go

Three layers:

core/tools/ — Pure functions that take a *rod.Page or *rod.Browser and do one thing. No state, no CLI concerns.
core/ — Browser lifecycle and persistent state. Manages a JSON registry of browser connections so the process survives across CLI calls.
cli/ — Thin command layer. Parses args, calls into core, prints results.

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 19 Commits
.github/workflows		.github/workflows
cli		cli
core		core
skills		skills
.gitignore		.gitignore
.goreleaser.yaml		.goreleaser.yaml
LICENSE		LICENSE
README.md		README.md
go.mod		go.mod
go.sum		go.sum
install.ps1		install.ps1
install.sh		install.sh
main.go		main.go

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

browser

Install

macOS / Linux / WSL

Windows (PowerShell)

Windows (CMD)

Go install

Build from source

Quick Start

Commands

Browser Lifecycle

Navigation

Page Info

Interaction

Keyboard

JavaScript

File Operations

Waiting

Output

Tabs

Checks and Assertions

Accessibility

Cloud (Browser Use API)

Exit Codes

Environment Variables

Architecture

License

About

Uh oh!

Releases 5

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

browser

Install

macOS / Linux / WSL

Windows (PowerShell)

Windows (CMD)

Go install

Build from source

Quick Start

Commands

Browser Lifecycle

Navigation

Page Info

Interaction

Keyboard

JavaScript

File Operations

Waiting

Output

Tabs

Checks and Assertions

Accessibility

Cloud (Browser Use API)

Exit Codes

Environment Variables

Architecture

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 5

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages