Skip to content

Compact vs Full ACT Prompt β€” Analysis

Emre Sokullu edited this page May 24, 2026 · 1 revision

The numbers

  • Full ACT prompt: ~33,000 chars (~8k tokens)
  • Compact ACT prompt: ~3,400 chars (~850 tokens)
  • Compact is ~10% the size of full β€” a 7k+ token savings on every turn

Side-by-side structure

The compact column β‰ˆ everything the model sees end-to-end. The full column lists section names only.

Topic Compact Full
Opening One line ("You are WebBrain, an AI browser agent. You interact with web pages through tools.") Full paragraph about operating env, browser session, "you ARE the user from the website's POV"
OPERATING ENVIRONMENT (refusals, scheduling, auth) Rule #1 only: "Never refuse… Just do it through the UI" 6 detailed bullets including "You CANNOT schedule, sleep, set timers", refusal patterns to avoid, legitimate decline cases
Tool list Tools NEVER listed by name. Mentions inline: read_page, click, type_text, scratchpad_write, done, download_social_media, get_accessibility_tree, download_file, upload_file β€” that's it Full tool list with one-line descriptions for ~25 tools including click_ax, type_ax, set_field, hover, drag_drop, wait_for_element, wait_for_stable, verify_form, record_tab, clarify, iframe_*, etc.
ACCESSIBILITY TREE (the preferred page-read path) Section absent. Never mentioned. No click_ax/type_ax/set_field/ref_id guidance. 30+ lines: format example, parameters, ref_id stability rules, "DEFAULT ACT LOOP" step-by-step, fallback to legacy click()
Current Page Priority Rule #2: one line Full section: "user is on this page for a reason", when to navigate away
Verify after action Rule #3: one line "CRITICAL β€” do NOT rush" section (4 bullets: don't chain calls, verify creation success, fill one field at a time, parse request first)
TYPING patterns One line at the bottom: "Click field, then type_text with NO selector" Full TYPING section with branches for: text input vs <select> vs ARIA combobox / Stripe-style picker, type-ahead, ArrowDown+Enter for portaled listboxes, rich-text editors (CodeMirror, Gmail compose), hard rule "no click_ax loops"
CLICKING preference order One line: "text > index > selector > x,y" Full CLICKING section with the order PLUS rules about jQuery pseudo-classes, ambiguity errors, data-testid guessing, coord interpretation on HiDPI displays
Index instability Last line in compact, one sentence Full INDEX INSTABILITY section: "indices are NOT stable", per-turn invalidation, "don't guess from training data"
MODALS & DIALOGS Section absent. 5 bullets: modal-scoped queries return "no match" for outer page, finish modal first, post-submit verify dialog closed, occlusion-blocked error pattern
FORMS / verify_form Section absent. 6 bullets: call verify_form before submit, check creation timestamp matches "now", CAPTCHA handling
IFRAMES (Stripe, embedded widgets) Section absent. 6 bullets: iframe_read/click/type tools, urlFilter param, cross-origin works because extension bypass
UI vs API Rule #10: one line "Interact through the visible UI" Full section: rationale, two exceptions (user override, /allow-api flag), concrete examples per site
SCROLLING Section absent. 4 bullets: scroll for below-fold buttons, check all fields before submit
SCRATCHPAD usage patterns Rule #11: one sentence Full section: 5 use cases, "after download_file IMMEDIATELY scratchpad the path"
DON'T REDO WORK Rule #12: ~3 lines Full section: download-already-done loop, fetch already returned, ref_ids stable across calls, the "doubt β†’ re-navigate" antipattern
LISTINGS & PAGINATION Short version (~3 lines) Detailed version (~6 lines with ?page=/?sd= URL patterns, maxChars overflow, terminal-list tasks)
Scheduling / wait-for-event Rule #13: 2 lines Long section in operating-env explaining no cron, no timers
Social media downloads Rule #14: one line Full SOCIAL MEDIA DOWNLOADS section with download_social_media mode/scroll options, recommendation field handling

What's actually missing in compact

In rough order of how often it bites real tasks:

  1. The whole AX-tree path. No mention of get_accessibility_tree, click_ax, type_ax, set_field, or ref_id. A model on compact has no idea these tools exist from the prompt. (They exist in the tool list β€” see "Tool access" below β€” but the model has no guidance about preferring them.)
  2. Modal handling. No occlusion / modal-scoped click guidance.
  3. iframes. No Stripe / payment-widget tooling.
  4. Form verify. No verify_form mention.
  5. Combobox handling β€” the Stripe currency picker / portaled-listbox sequence.
  6. CAPTCHA + solve_captcha.
  7. All the "new in this PR" tools β€” hover, drag_drop, wait_for_stable.

Tool access β€” compact vs full

Yes β€” same tools, no filtering. Looking at getToolsForMode(mode, opts):

export function getToolsForMode(mode, opts = {}) {
  const base = (mode === 'ask')
    ? AGENT_TOOLS.filter(t => ASK_ONLY_TOOLS.includes(t.function.name))
    : AGENT_TOOLS;
  if (!opts.strictSecretMode) return base;
  return base.map(t => (t.function.name === 'done' ? DONE_TOOL_STRICT : t));
}

Only branches on ask vs everything-else, and strictSecretMode. The compact prompt is not even an input to tool filtering. So useCompactPrompt: true β†’ same 44 tool schemas sent to the model.

But. Tool schemas alone don't teach the model when to use which tool. The full ACT prompt has 200+ lines of "prefer X over Y in case Z" guidance. The compact prompt has none of that. So in practice, a small model on compact mode:

  • Can call click_ax({ref_id: "ref_42"}) β€” the schema is there, the description is there
  • Won't think to β€” because nothing in the prompt tells it click_ax exists or that ref_ids come from get_accessibility_tree
  • Will default to the patterns the prompt does mention: click({text: "..."}), type_text, screenshot, done

This is actually the deeper reason compact is risky and why the default was flipped. The agent loses access to the AX-tree-driven workflow β€” the single biggest reliability win on modern React/portal-heavy sites β€” without any explicit filtering. The model just doesn't know to use those tools.

Summary: same tools, missing guidance, materially weaker behavior. Not a "feature toggle" so much as a "drop the playbook" toggle.