Predicate Snapshot Skill for OpenClaw

ML-powered DOM pruning that reduces browser prompt tokens by up to 99.8% while preserving actionable elements.

Quick Start

1. Install the Skill

Via ClawHub (Recommended):

npx clawdhub@latest install predicate-snapshot

Manual Installation:

git clone https://github.com/PredicateSystems/openclaw-predicate-skill ~/.openclaw/skills/predicate-snapshot
cd ~/.openclaw/skills/predicate-snapshot
npm install
npm run build

2. Get Your API Key (Optional)

For ML-powered ranking (95% token reduction), get a free API key:

Go to PredicateSystems.ai
Sign up for a free account (includes 500 free credits/month)
Navigate to Dashboard > API Keys
Click Create New Key and copy your key (starts with sk-...)

Without API key: Local heuristic-based pruning (~80% token reduction) With API key: ML-powered ranking for cleaner output (~95% token reduction)

3. Configure the API Key

Option A: Environment Variable (Recommended)

# Add to your shell profile (~/.bashrc, ~/.zshrc, etc.)
export PREDICATE_API_KEY="sk-your-key-here"

Option B: OpenClaw Config File

Add to ~/.openclaw/config.yaml:

skills:
  predicate-snapshot:
    api_key: "sk-your-key-here"
    max_credits_per_session: 100  # Optional: limit credits per session

4. Use the Skill

# In OpenClaw:
/predicate-snapshot              # Get ranked elements
/predicate-act click 42          # Click element by ID
/predicate-snapshot-local        # Free local mode (no API)

Overview

This OpenClaw skill replaces the default accessibility tree snapshot with Predicate's ML-ranked DOM elements. Instead of sending 800+ elements (~18,000 tokens) to the LLM, it sends only the 50 most relevant elements (configurable) (~500 tokens).

Real-World Demo Results

Tested with the included demo (npm run demo):

Site	OpenClaw Snapshot (A11y Tree)	Predicate Snapshot	Savings
slickdeals.net	598,301 tokens (24,567 elements)	1,283 tokens (50 elements)	99.8%
news.ycombinator.com	16,484 tokens (681 elements)	587 tokens (50 elements)	96%
example.com	305 tokens (12 elements)	164 tokens (4 elements)	46%
httpbin.org/html	1,590 tokens (34 elements)	164 tokens (4 elements)	90%
Total	616,680 tokens	2,198 tokens	99.6%

Ad-heavy sites like slickdeals.net show the most dramatic savings—from 598K tokens down to just 1.3K tokens. Simple pages like example.com have minimal elements, so savings are lower.

Why Fewer Elements Is Better

You might wonder: "Isn't 50 elements vs 24,567 elements comparing apples to oranges?"

No—and here's why:

Most elements are noise. Of those 24,567 elements on slickdeals.net, the vast majority are:
- Ad iframes and tracking pixels
- Hidden elements and overlays
- Decorative containers (<div>, <span>)
- Non-interactive text nodes
- Duplicate/redundant elements
LLMs need actionable elements and enough context to reason. For browser automation, the agent needs to:
- Click buttons and links
- Fill form fields
- Read key content for decision-making
Predicate's ML ranking identifies the ~50 most relevant elements—including both interactive controls and contextual text—while filtering out the noise.
More elements = worse performance. Sending 600K tokens to an LLM causes:
- Higher latency (slower responses)
- Higher cost ($11K+/month vs $5/month)
- Context window overflow on complex pages
- More hallucinations from irrelevant context
Quality over quantity. Predicate's snapshot includes:
- ML-ranked importance scores
- Dominant group detection (for ordinal tasks like "click 3rd item")
- Visual cues (is_primary, position)
- Semantic role information
This structured context helps LLMs make better decisions than a raw element dump.

The goal isn't to preserve all elements—it's to preserve the right elements.

Proven in Production

Small local LLM model (3B) success: The Predicate Snapshot engine powered a complex browser automation task using only a 3B parameter local model—featured on Hacker News front page
Deep dive: Read why the accessibility tree alone isn't enough for web automation: Why A11y Alone Isn't Enough

Summary

Approach	Tokens (avg)	Elements	Signal Quality
Accessibility Tree	~150,000+	~6,000+	Low (noise)
Predicate Snapshot	~500-1,300	50	High (ML-ranked)

How It Works

Does This Replace the Default A11y Tree?

No, this skill does not automatically replace OpenClaw's default accessibility tree. Instead, it provides an alternative snapshot command that you can use when you want better element ranking.

Command	What It Does
Default OpenClaw	Uses raw accessibility tree (~18,000 tokens)
`/predicate-snapshot`	Uses ML-ranked Predicate snapshot (~500 tokens)
`/predicate-snapshot-local`	Uses local heuristic ranking (free, no API)

To use Predicate snapshots in your workflow:

Use /predicate-snapshot instead of the default page observation
Use /predicate-act click <ID> to interact with elements by their ID
The element IDs from Predicate snapshots work with /predicate-act

Future: OpenClaw may add configuration to set Predicate as the default snapshot provider.

⚡ Usage with Autonomous Agents

Important: OpenClaw agents work autonomously—they don't wait for manual slash commands. Here's how to integrate Predicate snapshots into autonomous workflows.

Option 1: Include in Task Instructions (Recommended)

Add Predicate snapshot instructions directly in your task prompt:

Navigate to amazon.com and find the cheapest laptop under $500.

IMPORTANT: For page observation, use /predicate-snapshot instead of the
default accessibility tree. Use /predicate-act to interact with elements
by their ID from the snapshot.

Option 2: Modify Agent System Prompt

For consistent usage across all tasks, add to your agent's system prompt:

## Browser Observation
When observing web pages, always use /predicate-snapshot instead of the
default accessibility tree. This provides ML-ranked elements optimized
for efficient decision-making (~500 tokens vs ~18,000 tokens).

To interact with page elements:
1. Call /predicate-snapshot to get ranked elements with IDs
2. Call /predicate-act <action> <element_id> to perform actions

Option 3: OpenClaw Config (Future)

OpenClaw may add support for setting the default snapshot provider:

# ~/.openclaw/config.yaml (proposed future feature)
browser:
  snapshot_provider: predicate-snapshot

Why This Matters

Without explicit instructions, the agent will use OpenClaw's default accessibility tree, which:

Sends ~18,000 tokens per page observation
Includes thousands of irrelevant elements
Costs more and runs slower

By instructing the agent to use /predicate-snapshot, you get:

~500 tokens per observation (97% reduction)
Only the 50 most relevant elements
Faster, cheaper, more accurate automation

Usage

Capture Snapshot

/predicate-snapshot [--limit=50] [--include-ordinal]

Returns a pipe-delimited table of ranked elements:

ID|role|text|imp|is_primary|docYq|ord|DG|href
42|button|Add to Cart|0.95|true|320|1|cart-actions|
15|button|Buy Now|0.92|true|340|2|cart-actions|
23|link|Product Details|0.78|false|400|0||/dp/...

Execute Actions

/predicate-act click 42        # Click element by ID
/predicate-act type 15 "query" # Type into element
/predicate-act scroll 23       # Scroll to element

Local Mode (Free)

/predicate-snapshot-local [--limit=50]

Uses heuristic ranking without ML API calls. Lower accuracy but no credits consumed.

Example Workflow

1. /predicate-snapshot              # Get ranked elements
2. /predicate-act click 42          # Click "Add to Cart"
3. /predicate-snapshot              # Refresh after action
4. Verify cart updated

Output Format

Column	Description
ID	Unique element identifier for `/predicate-act`
role	ARIA role (button, link, textbox, etc.)
text	Visible text content (truncated to 30 chars)
imp	Importance score (0.0-1.0, ML-ranked)
is_primary	Whether element is a primary action
docYq	Vertical position in document
ord	Ordinal within dominant group
DG	Dominant group identifier
href	Link URL if applicable

Each ML-powered snapshot consumes 1 credit. Local snapshots are free.

Development

Run in Docker (Recommended for Safe Testing)

Docker provides an isolated environment for testing browser automation—no risk to your local machine, browser profiles, or credentials.

cd predicate-snapshot-skill

# Run the skill MCP tools test (no API keys required)
./docker-test.sh skill

# Run the login demo (requires LLM API key)
./docker-test.sh demo:login

Test options:

Command	What it tests	API Keys Required?
`./docker-test.sh`	Skill MCP tools & browser integration	No
`./docker-test.sh skill`	Same as above (explicit)	No
`./docker-test.sh openclaw`	OpenClaw full runtime integration	No
`./docker-test.sh demo:login`	Full 6-step login workflow	Yes (LLM)
`./docker-test.sh demo`	Basic token comparison	Yes (LLM)

Passing API Keys:

The demo:login and demo tests require at least one LLM API key (OpenAI or Anthropic) for element selection:

# Option 1: Export environment variables
export OPENAI_API_KEY="sk-..."          # OpenAI API key
export ANTHROPIC_API_KEY="sk-ant-..."   # OR Anthropic API key
export PREDICATE_API_KEY="sk-..."       # Optional: for ML-ranked snapshots

./docker-test.sh demo:login

# Option 2: Inline (single command)
OPENAI_API_KEY="sk-..." PREDICATE_API_KEY="sk-..." ./docker-test.sh demo:login

# Option 3: Using docker-compose with .env file
# Create a .env file with your keys:
echo "OPENAI_API_KEY=sk-..." >> .env
echo "PREDICATE_API_KEY=sk-..." >> .env
docker-compose up demo-login

API Key	Required?	Purpose
`OPENAI_API_KEY`	One of these required	LLM for element selection
`ANTHROPIC_API_KEY`	One of these required	LLM for element selection
`PREDICATE_API_KEY`	Optional	ML-ranked snapshots (reduces noise & tokens)

Why Docker is safer:

Concern	Docker Isolation
Browser profile	Fresh Chromium instance, no cookies or history
Network traffic	Contained, won't trigger corporate firewalls
File system	Only `./test-output/` is mounted
Credentials	None stored—test site uses fake credentials

Using docker-compose:

docker-compose up skill-test      # Skill MCP tools test
docker-compose up openclaw-test   # OpenClaw full runtime test
docker-compose up demo-login      # Login demo

The test uses a purpose-built test site (https://www.localllamaland.com/login) with fake credentials (testuser / password123)—no real accounts involved.

Run Demo (Local)

Compare token usage between accessibility tree and Predicate snapshot:

Get free credits for testing at https://www.PredicateSystems.ai

# With API key (REAL ML-ranked snapshots)
PREDICATE_API_KEY=sk-... npm run demo

# Without API key (uses extension's local ranking)
npm run demo

Example output:

======================================================================
 TOKEN USAGE COMPARISON: Accessibility Tree vs. Predicate Snapshot
======================================================================
 Mode: PredicateBrowser with extension loaded
 Snapshots: REAL (API key detected)
======================================================================

Analyzing: https://news.ycombinator.com
  Capturing accessibility tree...
  Capturing Predicate snapshot (REAL - ML-ranked via API)...

======================================================================
 RESULTS
======================================================================

news.ycombinator.com (REAL)
  +---------------------------------------------------------+
  | Accessibility Tree:   16,484 tokens (681 elements)      |
  | Predicate Snapshot:      587 tokens (50 elements)       |
  | Savings:                  96%                           |
  +---------------------------------------------------------+

======================================================================
 TOTAL: 616,680 -> 2,198 tokens (99.6% reduction)
======================================================================

 MONTHLY COST PROJECTION (5,000 tasks × 5 snapshots = 25,000 snapshots)
   Accessibility Tree: $11,562.75 (LLM tokens only)
   Predicate Snapshot: $5.12 ($1.37 LLM + $3.75 API)
   Monthly Savings:    $11,557.63

Run Login Demo (Multi-Step Workflow)

This demo demonstrates real-world browser automation with a 6-step login workflow:

Navigate to login page and wait for delayed hydration
Fill username field (LLM selects element + human-like typing)
Fill password field (button state: disabled → enabled)
Click login button
Navigate to profile page
Extract username from profile card

The Predicate overlay shows ML-ranked element IDs (green borders) that the LLM uses to select form fields.

Target site: https://www.localllamaland.com/login - a test site with intentional challenges:

Delayed hydration: Form loads after ~600ms (SPA pattern)
State transitions: Login button disabled until both fields filled
Late-loading content: Profile card loads after 800-1200ms

Setup:

Copy the example env file:

cp .env.example .env

Edit .env and add your OpenAI API key:

# .env
OPENAI_API_KEY=sk-your-openai-api-key-here

# Optional: for ML-ranked snapshots
PREDICATE_API_KEY=sk-your-predicate-api-key-here

Run the demo with visible browser and element overlay:

npm run demo:login -- --headed --overlay

Alternative LLM providers:

# Anthropic Claude
ANTHROPIC_API_KEY=sk-... npm run demo:login -- --headed --overlay

# Local LLM (Ollama)
SENTIENCE_LOCAL_LLM_BASE_URL=http://localhost:11434/v1 npm run demo:login -- --headed --overlay

# Headless mode (no browser window)
npm run demo:login

Flags:

--headed - Run browser in visible window (not headless)
--overlay - Show green borders around captured elements (requires --headed)

This demo compares A11y Tree vs Predicate Snapshot across all 6 steps, measuring:

Tokens per step: Input size for each LLM call
Latency: Time per step including form interactions
Success rate: Step completion across the workflow

Key Observations

Metric	OpenClaw A11y Tree Snapshot	Predicate Snapshot	Delta
Steps Completed	6/6	6/6	Both pass
Total Tokens	5,366	1,565	-71%
Token Savings	baseline	67-74% per step	Significant

Why Predicate Snapshot is better:

Dramatic token reduction: 71% fewer tokens across the entire workflow (5,366 → 1,565 tokens)
ML-ranked elements: Only the most relevant interactable elements are included with enough context, reducing noise
Stable identifiers: data-predicate-id attributes survive SPA re-renders
runtime.check().eventually(): Properly waits for SPA hydration before capturing snapshots

Raw Demo Logs

Full Docker demo output: pastebin.com/ksETcQ4C

Click to expand results summary

======================================================================
 RESULTS SUMMARY
======================================================================

+-----------------------------------------------------------------------+
| Metric              | A11y Tree        | Predicate        | Delta     |
+-----------------------------------------------------------------------+
| Total Tokens        |             5366 |             1565 | -71%      |
| Total Latency (ms)  |            51675 |            75555 | +46%      |
| Steps Passed        |              6/6 |              6/6 |           |
+-----------------------------------------------------------------------+

Key Insight: Predicate snapshots use 71% fewer tokens
for a multi-step login workflow with form filling.

Step-by-step breakdown:
----------------------------------------------------------------------
Step 1: Wait for login form hydration
  A11y: 0 tokens, 12060ms, PASS
  Pred: 0 tokens, 10792ms, PASS (0% savings)
Step 2: Fill username field
  A11y: 1251 tokens, 7613ms, PASS
  Pred: 361 tokens, 12324ms, PASS (71% savings)
Step 3: Fill password field
  A11y: 1305 tokens, 13691ms, PASS
  Pred: 362 tokens, 18410ms, PASS (72% savings)
Step 4: Click login button
  A11y: 1377 tokens, 7909ms, PASS
  Pred: 362 tokens, 13233ms, PASS (74% savings)
Step 5: Navigate to profile page
  A11y: 0 tokens, 1ms, PASS
  Pred: 0 tokens, 0ms, PASS (0% savings)
Step 6: Extract username from profile
  A11y: 1433 tokens, 10401ms, PASS
  Pred: 480 tokens, 20796ms, PASS (67% savings)

Summary

Step	A11y Tree	Predicate Snapshot	Token Savings
Step 1: Wait for login form hydration	PASS	PASS	-
Step 2: Fill username	1,251 tokens, PASS	361 tokens, PASS	71%
Step 3: Fill password	1,305 tokens, PASS	362 tokens, PASS	72%
Step 4: Click login	1,377 tokens, PASS	362 tokens, PASS	74%
Step 5: Navigate to profile	PASS	PASS	-
Step 6: Extract username	1,433 tokens, PASS	480 tokens, PASS	67%
Total	5,366 tokens, 6/6 steps	1,565 tokens, 6/6 steps	71%

Key Insight: Predicate Snapshot reduces tokens by 67-74% per step while maintaining the same pass rate. For multi-step workflows, this translates to significant cost savings and faster LLM inference.

Build

npm run build

Test

npm test

Why Predicate Snapshot Over Accessibility Tree?

OpenClaw and similar browser automation frameworks default to the Accessibility Tree (A11y) for navigating websites. While A11y works for simple cases, it has fundamental limitations that make it unreliable for production LLM-driven automation:

A11y Tree Limitations

Problem	Description	Impact on LLM Agents
Optimized for Consumption, Not Action	A11y is designed for assistive technology (screen readers), not action verification or layout reasoning	Lacks precise semantic geometry and ordinality (e.g., "the first item in a list") that agents need for reliable reasoning
Hydration Lag & Structural Inconsistency	In JS-heavy SPAs, A11y often lags behind hydration or misrepresents dynamic overlays and grouping	Snapshots miss interactive nodes or incorrectly label states (e.g., confusing `focused` with `active`)
Shadow DOM & Iframe Blind Spots	A11y struggles to maintain global order across Shadow DOM and iframe boundaries	Cross-shadow ARIA delegation is inconsistent; iframe contents are often missing or lose spatial context
Token Inefficiency	Extracting the entire A11y tree for small actions wastes context window and compute	Superfluous nodes (like `genericContainer`) consume tokens without helping the agent
Missing Visual/Layout Bugs	A11y trees miss rendering-time issues like overlapping buttons or z-index conflicts	Agent reports elements as "correct" but cannot detect visual collisions

Predicate Snapshot Advantages

Capability	How Predicate Solves It
Post-Rendered Geometry	Layers in actual bounding boxes and grouping missing from standard A11y representations
Live DOM Synchronization	Anchors on the live, post-rendered DOM ensuring perfect sync with actual page state
Unified Cross-Boundary Grounding	Rust/WASM engine prunes and ranks elements across Shadow DOM and iframes, maintaining unified element ordering
Token-Efficient Pruning	Specifically prunes uninformative branches while preserving all interactive elements, enabling 3B parameter models to perform at larger model levels
Deterministic Verification	Binds intent to deterministic outcomes via snapshot diff, providing an auditable "truth" layer rather than just a structural "report"

Bottom Line: A11y trees tell you what should be there. Predicate Snapshots tell you what is there—and prove it.

Architecture

predicate-snapshot-skill/
├── src/
│   ├── index.ts      # MCP tool definitions
│   ├── snapshot.ts   # PredicateSnapshotTool implementation
│   └── act.ts        # PredicateActTool implementation
├── demo/
│   ├── compare.ts    # Token comparison demo
│   ├── llm-action.ts # Simple LLM action demo (single clicks)
│   └── login-demo.ts # Multi-step login workflow demo
├── SKILL.md          # OpenClaw skill manifest
└── package.json

Dependencies

@predicatesystems/runtime - Predicate SDK with PredicateContext
playwright (peer) - Browser automation

License

(MIT OR Apache-2.0)

Support

ClawHub: clawhub.ai/rcholic/predicate-snapshot
GitHub: github.com/PredicateSystems/openclaw-predicate-skill
Documentation: predicatesystems.ai/docs
Issues: GitHub Issues

Name		Name	Last commit message	Last commit date
Latest commit History 26 Commits
.github/workflows		.github/workflows
demo		demo
docs		docs
src		src
.dockerignore		.dockerignore
.env.example		.env.example
.eslintrc.js		.eslintrc.js
.gitignore		.gitignore
.npmignore		.npmignore
Dockerfile		Dockerfile
README.md		README.md
SKILL.md		SKILL.md
docker-compose.yml		docker-compose.yml
docker-test.sh		docker-test.sh
package-lock.json		package-lock.json
package.json		package.json
test-openclaw-integration.sh		test-openclaw-integration.sh
test-skill.ts		test-skill.ts
tsconfig.json		tsconfig.json

PredicateSystems/openclaw-predicate-skill

Folders and files

Latest commit

History

Repository files navigation

Predicate Snapshot Skill for OpenClaw

Quick Start

1. Install the Skill

2. Get Your API Key (Optional)

3. Configure the API Key

4. Use the Skill

Overview

Real-World Demo Results

Why Fewer Elements Is Better

Proven in Production

Summary

How It Works

Does This Replace the Default A11y Tree?

⚡ Usage with Autonomous Agents

Option 1: Include in Task Instructions (Recommended)

Option 2: Modify Agent System Prompt

Option 3: OpenClaw Config (Future)

Why This Matters

Usage

Capture Snapshot

Execute Actions

Local Mode (Free)

Example Workflow

Output Format

Development

Run in Docker (Recommended for Safe Testing)

Run Demo (Local)

Run Login Demo (Multi-Step Workflow)

Key Observations

Raw Demo Logs

Summary

Build

Test

Why Predicate Snapshot Over Accessibility Tree?

A11y Tree Limitations

Predicate Snapshot Advantages

Architecture

Dependencies

License

Support

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases 3

Packages 0

Contributors 2

Languages

Packages