dp_cli is an Agent-first CLI wrapper around DrissionPage.
The current MVP focuses on a Playwright-CLI-style workflow:
- `open`
- `snapshot`
- choose a `ref` from semantic nodes
- `click` / `type` by `ref`
- re-snapshot when the page changes
The key design choice: the main contract is a semantic snapshot plus refs, not hand-written Chinese descriptions of page areas.
conda activate dp-cli
pip install DrissionPage pytest langchain-openai
If you need to create the environment first:
conda create -n dp-cli python=3.11
conda activate dp-cli
pip install DrissionPage pytest langchain-openai
All commands support these options:
- `--session <name>`: Session name (default: `default`)
- `--headless`: Run browser without GUI
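Because every command shares these global options, a caller can assemble invocations with one small helper. The sketch below is illustrative only: `build_argv` is a hypothetical name and not part of dp_cli, and it assumes that command-specific options are passed as `--flag value` pairs, as in the examples in this README.

```python
import sys
from typing import Dict, List, Optional


def build_argv(action: str, *positional: str, session: str = "default",
               headless: bool = False,
               options: Optional[Dict[str, object]] = None) -> List[str]:
    """Build an argv list for `python -m dp_cli <action> ...` (hypothetical helper)."""
    argv = [sys.executable, "-m", "dp_cli", action, *positional,
            "--session", session]
    if headless:
        argv.append("--headless")
    # Assumption: every extra option is a `--flag value` pair.
    for flag, value in (options or {}).items():
        argv.extend([f"--{flag}", str(value)])
    return argv
```

The resulting list can be handed to `subprocess.run(argv, capture_output=True, text=True)`, which avoids shell quoting problems for values such as typed text.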
Every command returns the same top-level JSON shape:
{
"ok": true,
"session": "demo",
"action": "snapshot",
"data": { ... },
"error": null
}
On error:
{
"ok": false,
"session": "demo",
"action": "click",
"data": null,
"error": {
"code": "ref_stale",
"message": "Element ref 'e12' is stale for the current runtime or page.",
"details": {
"ref": "e12",
"expected_page_id": "page_xxx",
"actual_page_id": "page_yyy"
}
}
}
| Exit Code | Code | Description |
|---|---|---|
| 1 | unexpected_error | General unexpected error |
| 2 | browser_config_error | Browser configuration failed |
| 3 | element_not_found | Target element not found on page |
| 4 | invalid_input | Missing or invalid command arguments |
| 5 | ref_not_found | Ref does not exist in this session |
| 6 | ref_stale | Ref belongs to a previous page/runtime |
| 7 | invalid_ref_type | Container ref used where element ref required |
| 8 | element_not_interactable | Element exists but cannot be interacted with |
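Since every command returns the same envelope, a caller can decode it in one place. The following is a minimal sketch, not dp_cli's own code: `DpCliError` and `parse_response` are hypothetical names, the mapping is copied from the exit code table above, and falling back to exit code 1 for an unknown error code is an assumption.

```python
import json

# Error code -> exit code, copied from the table above.
EXIT_CODES = {
    "unexpected_error": 1,
    "browser_config_error": 2,
    "element_not_found": 3,
    "invalid_input": 4,
    "ref_not_found": 5,
    "ref_stale": 6,
    "invalid_ref_type": 7,
    "element_not_interactable": 8,
}


class DpCliError(Exception):
    """Hypothetical exception carrying the structured error from the envelope."""

    def __init__(self, code, message, details=None):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.details = details or {}
        # Assumption: unknown codes are treated as a general error (1).
        self.exit_code = EXIT_CODES.get(code, 1)


def parse_response(stdout: str) -> dict:
    """Return `data` on success; raise DpCliError when `ok` is false."""
    envelope = json.loads(stdout)
    if envelope["ok"]:
        return envelope["data"]
    error = envelope["error"]
    raise DpCliError(error["code"], error["message"], error.get("details"))
```

Keeping the branch on `ok` in one function means agent code can treat `ref_stale` and similar conditions as ordinary exceptions instead of inspecting raw JSON at each call site.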
Open a page in the session browser.
python -m dp_cli open https://example.com --session demo
Output:
{
"ok": true,
"session": "demo",
"action": "open",
"data": {
"page": {
"url": "https://example.com",
"title": "Example Domain"
}
},
"error": null
}
Return a structured page snapshot with semantic node discovery.
# Default planner view (low-token, agent-friendly)
python -m dp_cli snapshot --session demo
# Full discovery mode (all nodes)
python -m dp_cli snapshot --session demo --mode full
# Expand a container subtree
python -m dp_cli snapshot r5 --session demo --depth 3
# Extract mode for data extraction
python -m dp_cli snapshot --session demo --mode extract
Options:
- `[ref]`: Optional container ref to expand (e.g., `r5`)
- `--mode full|agent_summary|extract`: Snapshot mode (default: `agent_summary`)
- `--depth <N>`: Discovery depth for subtree expansion
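In the default `agent_summary` mode, an agent typically needs to turn the snapshot into a single ref to act on. The sketch below assumes the `data.index.interactable_elements` shape documented for this command; `pick_ref` is a hypothetical helper, and matching on role plus a case-insensitive name substring is only one possible selection rule.

```python
from typing import Optional


def pick_ref(snapshot_data: dict, role: str, name_contains: str) -> Optional[str]:
    """Return the first interactable element ref matching role and name, else None."""
    elements = snapshot_data.get("index", {}).get("interactable_elements", [])
    needle = name_contains.lower()
    for element in elements:
        # Compare names case-insensitively; a missing name counts as empty.
        if element.get("role") == role and needle in element.get("name", "").lower():
            return element["ref"]
    return None
```

Here `snapshot_data` is the `data` object of the snapshot response, not the whole envelope.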
Output (agent_summary mode):
{
"ok": true,
"session": "demo",
"action": "snapshot",
"data": {
"schema_version": "0.6",
"mode": "agent_summary",
"page": {
"url": "https://example.com",
"title": "Example Domain"
},
"page_identity": {
"runtime_id": "rt_abc123",
"page_id": "page_def456",
"snapshot_id": "snap_ghi789",
"snapshot_seq": 1
},
"scope": "page",
"root_ref": null,
"depth": null,
"index": {
"interactable_elements": [
{"ref": "e1", "role": "link", "name": "More information"},
{"ref": "e2", "role": "button", "name": "Submit"}
],
"surface_index": [
{"ref": "r1", "ref_type": "container", "role": "search", "name": "Search", "child_count": 3, "in_viewport": true, "interactable_now": false},
{"ref": "e1", "ref_type": "element", "role": "link", "name": "More information", "child_count": 0, "in_viewport": true, "interactable_now": true}
],
"deep_index": [
{"ref": "r2", "ref_type": "container", "role": "generic", "name": "", "text": "Footer copyright text...", "in_viewport": false}
],
"tree": {
"roots": ["r1", "r2"],
"parent_map": {"e1": "r1", "e2": "r1"},
"children_map": {"r1": ["e1", "e2"]}
},
"stats": {
"total_nodes": 42,
"surface_count": 12,
"deep_count": 30,
"in_viewport": 20,
"offscreen": 22,
"interactable_now": 8
}
}
},
"error": null
}
Find elements by CSS locator or text content.
# Find by CSS locator
python -m dp_cli find --session demo --locator "tag:a"
python -m dp_cli find --session demo --locator "#search-input"
# Find by text content
python -m dp_cli find --session demo --text "Search"
python -m dp_cli find --session demo --text "Next page"
Output:
{
"ok": true,
"session": "demo",
"action": "find",
"data": {
"page": { "url": "...", "title": "..." },
"page_identity": { "runtime_id": "...", "page_id": "..." },
"count": 3,
"nodes": [
{
"ref": "e1",
"ref_type": "element",
"tag": "a",
"role": "link",
"name": "",
"text": "More information",
"locator": "xpath:/html/body/div/p[2]/a",
"visibility": {
"visible": true,
"in_viewport": true,
"interactable_now": true
}
}
],
"query": {
"locator": "tag:a",
"text": null
}
},
"error": null
}
Click an element by ref or locator.
# Click by ref (preferred)
python -m dp_cli click --session demo --ref e12
# Click by locator
python -m dp_cli click --session demo --locator "#submit-button"
Output:
{
"ok": true,
"session": "demo",
"action": "click",
"data": {
"page": { "url": "...", "title": "..." },
"target": {
"ref": "e12",
"locator": "xpath:/html/body/form/button"
},
"target_state": {
"visible": true,
"in_viewport": true,
"interactable_now": true
}
},
"error": null
}
Error example (ref_stale):
{
"ok": false,
"session": "demo",
"action": "click",
"data": null,
"error": {
"code": "ref_stale",
"message": "Element ref 'e12' is stale for the current runtime or page. Re-run snapshot or find first.",
"details": {
"ref": "e12",
"expected_page_id": "page_xxx",
"actual_page_id": "page_yyy"
}
}
}
Type text into a form field by ref or locator.
# Type by ref (preferred)
python -m dp_cli type --session demo --ref e11 --text "Hello World"
# Type by locator
python -m dp_cli type --session demo --locator "#search-input" --text "python tutorial"
Output:
{
"ok": true,
"session": "demo",
"action": "type",
"data": {
"page": { "url": "...", "title": "..." },
"target": {
"ref": "e11",
"locator": "xpath:/html/body/form/input[1]"
},
"target_state": {
"visible": true,
"in_viewport": true,
"interactable_now": true
},
"typed_text": "Hello World"
},
"error": null
}
Expand a container ref to reveal its child nodes.
python -m dp_cli expand r5 --session demo --depth 3
Output:
{
"ok": true,
"session": "demo",
"action": "expand",
"data": {
"page": { "url": "...", "title": "..." },
"page_identity": { "runtime_id": "...", "page_id": "..." },
"target_ref": "r5",
"mode": "full",
"count": 15,
"nodes": [
{
"ref": "e20",
"ref_type": "element",
"tag": "a",
"role": "link",
"text": "Article Title",
"locator": "xpath:/html/body/div[3]/div[1]/a"
}
]
},
"error": null
}
List items within a group/container.
python -m dp_cli list-items r3 --session demo --sample-size 5
Output:
{
"ok": true,
"session": "demo",
"action": "list-items",
"data": {
"page": { "url": "...", "title": "..." },
"group_ref": "r3",
"group_kind": "list",
"item_count": 10,
"sample_items": [
{ "item_ref": "e5", "fields": {} },
{ "item_ref": "e6", "fields": {} },
{ "item_ref": "e7", "fields": {} }
],
"schema_hints": {
"title": "text",
"author": "text"
}
},
"error": null
}
Extract structured data from a group/container.
# Extract all items from a group
python -m dp_cli extract r3 --session demo
# Extract with schema hints
python -m dp_cli extract r3 --session demo --schema title author url
# Extract sample only (first 3 items)
python -m dp_cli extract r3 --session demo --sample-only
Output:
{
"ok": true,
"session": "demo",
"action": "extract",
"data": {
"group_ref": "r3",
"item_count": 10,
"fields": ["title", "author", "url"],
"items": [
{
"title": "First Article",
"author": "John Doe",
"url": "https://example.com/1"
},
{
"title": "Second Article",
"author": "Jane Smith",
"url": "https://example.com/2"
}
]
},
"error": null
}
Get locator candidates for a ref.
python -m dp_cli resolve-locator --session demo --ref e12
Output:
{
"ok": true,
"session": "demo",
"action": "resolve-locator",
"data": {
"ref": "e12",
"fingerprint": "fp_abc123",
"confidence": 0.9,
"locator_candidates": [
"xpath:/html/body/form/button",
"css:form > button[type=submit]"
],
"re_resolve_result": "matched"
},
"error": null
}
Execute JavaScript on the page.
python -m dp_cli eval "document.title" --session demo
python -m dp_cli eval "document.querySelectorAll('a').length" --session demo
Output:
{
"ok": true,
"session": "demo",
"action": "eval",
"data": {
"result": "Example Domain"
},
"error": null
}
Return agent-friendly session state.
python -m dp_cli session inspect --session demo
Output:
{
"ok": true,
"session": "demo",
"action": "session.inspect",
"data": {
"session_name": "demo",
"session_id": "sess_abc123",
"runtime": {
"runtime_id": "rt_def456",
"status": "running",
"browser_pid": 12345,
"port": 9333,
"headless": false,
"last_seen_at": "2026-04-23T10:30:00Z"
},
"page": {
"tab_id": "tab_ghi789",
"url": "https://example.com",
"title": "Example Domain",
"page_id": "page_jkl012",
"snapshot_id": "snap_mno345",
"snapshot_seq": 3
},
"ref_count": 25,
"container_ref_count": 8,
"element_ref_count": 17,
"last_snapshot_file": ".dpcli/snapshots/demo/snap_mno345.json",
"last_snapshot_mode": "agent_summary"
},
"error": null
}
Each discovered node in snapshot includes:
{
"ref": "e12",
"ref_type": "element",
"id": "search-input",
"tag": "input",
"role": "textbox",
"name": "Search",
"text": "",
"value": "",
"placeholder": "Search...",
"href": "",
"input_type": "text",
"title": "",
"aria_label": "Search",
"alt": "",
"label": "Search",
"locator": "xpath:/html/body/div/form/input",
"depth": 3,
"bounds": {
"x": 100.0,
"y": 200.0,
"width": 300.0,
"height": 40.0
},
"visibility": {
"visible": true,
"in_viewport": true,
"interactable_now": true
},
"context": {
"landmark": "search",
"heading": "",
"form": "search-form",
"list": "",
"dialog": ""
},
"states": {
"disabled": false,
"checked": false,
"selected": false,
"expanded": false
}
}
- `r*`: Semantic container ref (groups, lists, regions)
- `e*`: Interactive element ref (buttons, links, inputs)
Command constraints:
| Command | Accepts `r*` | Accepts `e*` |
|---|---|---|
| `snapshot` | Yes | Yes |
| `expand` | Yes | No |
| `list-items` | Yes | No |
| `extract` | Yes | No |
| `click` | No | Yes |
| `type` | No | Yes |
| `resolve-locator` | Yes | Yes |
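The constraints above can be checked on the caller side before a command is sent, which avoids a round trip that would end in `invalid_ref_type`. This is a sketch under one assumption: refs are a single `r` or `e` prefix followed by digits, as in the examples `r5` and `e12`. `ref_kind` and `command_accepts` are hypothetical names.

```python
import re

# Which ref prefixes each command accepts, copied from the table above.
ACCEPTED_REF_KINDS = {
    "snapshot": {"r", "e"},
    "expand": {"r"},
    "list-items": {"r"},
    "extract": {"r"},
    "click": {"e"},
    "type": {"e"},
    "resolve-locator": {"r", "e"},
}


def ref_kind(ref: str) -> str:
    """Return 'r' (container) or 'e' (element); raise ValueError otherwise."""
    # Assumption: refs look like r<digits> or e<digits>.
    match = re.fullmatch(r"([re])\d+", ref)
    if match is None:
        raise ValueError(f"unrecognized ref format: {ref!r}")
    return match.group(1)


def command_accepts(command: str, ref: str) -> bool:
    """True if `command` accepts the kind of ref given."""
    return ref_kind(ref) in ACCEPTED_REF_KINDS[command]
```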
# 1. Open a page
python -m dp_cli open https://github.com/login --session github --headless
# 2. Take a snapshot to discover elements
python -m dp_cli snapshot --session github --headless
# -> Returns e1 (username input), e2 (password input), e3 (sign-in button)
# 3. Type credentials
python -m dp_cli type --session github --headless --ref e1 --text "my-username"
python -m dp_cli type --session github --headless --ref e2 --text "my-password"
# 4. Click sign-in
python -m dp_cli click --session github --headless --ref e3
# 5. Re-snapshot after navigation
python -m dp_cli snapshot --session github --headless
# 1. Open Hacker News
python -m dp_cli open https://news.ycombinator.com --session hn --headless
# 2. Take snapshot to find the news list container
python -m dp_cli snapshot --session hn --headless
# -> Returns r1 (news list container)
# 3. List items in the container
python -m dp_cli list-items r1 --session hn --headless --sample-size 5
# 4. Extract structured data
python -m dp_cli extract r1 --session hn --headless --schema title url author
`click` and `type` do more than simple selector execution:
- Validate that the ref still belongs to the current runtime and page
- Reject stale refs with `ref_stale` (exit code 6)
- Reject container refs with `invalid_ref_type` (exit code 7)
- Verify that the target element is interactable
- Auto-scroll into view before action when needed
- Return `element_not_interactable` (exit code 8) when the element exists but cannot be acted on
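These checks suggest a simple recovery pattern on the caller side: if a click fails with `ref_stale`, take a fresh snapshot, look the element up again, and retry once. The sketch below is an assumption-laden illustration, not dp_cli behavior: `run` is a hypothetical callable that executes a command and returns the parsed JSON envelope, and re-identifying the element by exact role and name is a heuristic chosen for this example.

```python
from typing import Callable

# Hypothetical runner: run(action, **options) -> parsed JSON envelope (dict).
Runner = Callable[..., dict]


def click_with_recovery(run: Runner, ref: str, role: str, name: str) -> dict:
    """Click `ref`; on ref_stale, re-snapshot once and retry with a fresh ref."""
    response = run("click", ref=ref)
    error = response.get("error") or {}
    if response.get("ok") or error.get("code") != "ref_stale":
        return response  # success, or an error this helper does not handle

    snapshot = run("snapshot")
    if not snapshot.get("ok"):
        return snapshot
    elements = snapshot["data"]["index"]["interactable_elements"]
    fresh = next((e["ref"] for e in elements
                  if e.get("role") == role and e.get("name") == name), None)
    if fresh is None:
        return response  # nothing matched after the refresh; keep the original error
    return run("click", ref=fresh)
```

Injecting `run` keeps the retry logic testable without a browser; in practice it would wrap a subprocess call to `python -m dp_cli`.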
When you get ref_stale:
The page likely navigated or changed. Take a new snapshot to get fresh refs:
python -m dp_cli snapshot --session demo --headless
Session state lives under:
.dpcli/sessions/<session-name>/
meta.json — Session metadata (port, browser path, runtime info)
state.json — Ref mappings, active page, snapshot history
profile/ — Browser user data directory
Snapshot artifacts live under:
.dpcli/snapshots/<session-name>/
<snapshot-id>.json — Full snapshot data
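For tooling that inspects these files directly, the layout above can be expressed as two path helpers. This is a small sketch: the function names are hypothetical, and the `.dpcli` root is assumed to be relative to the current working directory.

```python
from pathlib import Path

DEFAULT_ROOT = Path(".dpcli")  # assumption: resolved against the working directory


def session_dir(session: str, root: Path = DEFAULT_ROOT) -> Path:
    """Directory holding meta.json, state.json and profile/ for a session."""
    return root / "sessions" / session


def snapshot_file(session: str, snapshot_id: str, root: Path = DEFAULT_ROOT) -> Path:
    """Path of the full snapshot JSON for a given snapshot id."""
    return root / "snapshots" / session / f"{snapshot_id}.json"
```

For example, `snapshot_file("demo", "snap_mno345")` corresponds to the `last_snapshot_file` value shown in the `session inspect` output.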
To reset a session (clear all refs and state):
rm -rf .dpcli/sessions/<session-name>
Copy .env template and fill in your API key:
cp .env .env.local
# OpenAI-compatible API configuration
OPENAI_API_KEY=your-api-key-here
OPENAI_BASE_URL=https://dashscope.aliyuncs.com/compatible-mode/v1
OPENAI_MODEL=gpt-4o-mini
Optional environment variables:
- `DPCLI_BROWSER_PATH`: Override browser executable path
- `DPCLI_RUN_PUBLIC_SMOKE`: Enable public network smoke tests
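A script that drives the agent tests might read these variables in one place. The sketch below is hypothetical and does not describe how dp_cli itself loads configuration: `Settings` and `load_settings` are illustrative names, and treating only the value `1` as enabling the public smoke tests is an assumption based on the examples in this README.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    api_key: str
    base_url: Optional[str]
    model: Optional[str]
    browser_path: Optional[str]
    run_public_smoke: bool


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Collect the documented variables; raises KeyError if the API key is missing."""
    return Settings(
        api_key=env["OPENAI_API_KEY"],
        base_url=env.get("OPENAI_BASE_URL"),
        model=env.get("OPENAI_MODEL"),
        browser_path=env.get("DPCLI_BROWSER_PATH"),
        # Assumption: only the literal "1" enables public smoke tests.
        run_public_smoke=env.get("DPCLI_RUN_PUBLIC_SMOKE") == "1",
    )
```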
Local semantic workflow smoke test:
python scripts/test_local_cli.py
Public smoke test:
python scripts/test_public_smoke.py
Agent loop test with natural language goals:
python tests/test_agent_computor.py --scenario automation
Run with visible browser:
# Edit test_agent_computor.py: set headless=False in TestRunner
python tests/test_agent_computor.py --scenario automation
Run local regression tests:
pytest -q tests/test_cli_local.py
pytest -q tests
Enable public smoke tests explicitly:
# Windows
set DPCLI_RUN_PUBLIC_SMOKE=1
pytest -q tests/test_public_smoke.py
# Linux/macOS
export DPCLI_RUN_PUBLIC_SMOKE=1
pytest -q tests/test_public_smoke.py
This version intentionally focuses on the minimum reliable contract for agents:
- Semantic snapshot with full-page discovery
- Planner projection with pinned controls
- Ref-driven interaction (`r*` containers, `e*` elements)
- Stable session identity (session_id, runtime_id, page_id, snapshot_id)
- Stale ref detection and recovery
- Full-page `find` fallback
- Visible/interactable execution safety
- Group compression and schema extraction
- Container expansion for subtree exploration
dp_cli/
├── cli.py — Argument parsing, JSON dispatch, main entry
├── service.py — CliService: command orchestration
├── adapter.py — DrissionPageAdapter: DOM snapshot via injected JS
├── session.py — SessionManager: browser lifecycle + tab restore
├── runtime.py — RuntimeContext: ref mapping + page identity
├── session_store.py — SessionStore: JSON persistence + browser discovery
├── models.py — Dataclasses: state, nodes, bounds, visibility
├── errors.py — CliError hierarchy with structured exit codes
├── compressor.py — DOM node grouping and compression
├── projector.py — Planner view and extraction projectors
├── grouper.py — Group kind detection and field schema extraction
├── locator.py — Locator candidate generation
└── fingerprint.py — Node fingerprinting for stable ref resolution