v3.1.0 — Richer clickable naming for DOM grounding
Improves the DOM-grounding clickable map the agent sees, so it can tell elements apart from the text channel instead of relying on the screenshot alone — especially anchors, JS-handler buttons, and icon-only controls.
What changed
- Detection now unions the markup heuristic with CDP's native `isClickable`, catching elements made clickable via JS event listeners (not just `<a>`/`<button>`/`onclick`).
- Naming falls back through a chain so elements stop arriving as bare `[N] tag`: `aria-label`/`alt`/`title`/`name`/`placeholder` → descendant text (e.g. `<a><span>Home</span></a>` → "Home") → `onclick` handler name (`viewBankAccount('0')` → "view bank account", verb only) → FontAwesome icon class (`fa-eye` → "view", `fa-plus-circle` → "add") → `role`.
- `role` overrides the displayed tag, so `<i role="button">` is shown as `button`, not `i`.
- Clickable map cap raised 80 → 200, sorted visible-first so on-screen elements survive truncation on dense pages (e.g. long sidebars).
- `snapshot_parser` split into `snapshot_parser` / `clickables` / `clickable_naming` to honor the 200-LOC module cap.
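
For reviewers who want a concrete picture of the fallback order and the visible-first truncation described above, here is a minimal sketch. It is not the code in this PR: the function names (`clickable_name`, `name_from_onclick`, `displayed_tag`, `truncate_visible_first`), the dict-based element shape, and the `visible` key are all assumptions made for illustration. Only the two icon mappings and the precedence order come from the notes above, and the "verb only" refinement is not modeled.

```python
import re
from typing import Optional

# Illustrative mapping; only these two entries appear in the notes above.
ICON_NAMES = {"fa-eye": "view", "fa-plus-circle": "add"}
# Attribute precedence as listed in the naming chain.
ATTR_ORDER = ("aria-label", "alt", "title", "name", "placeholder")


def name_from_onclick(handler: str) -> Optional[str]:
    """Turn "viewBankAccount('0')" into "view bank account"."""
    match = re.match(r"\s*([A-Za-z_]\w*)\s*\(", handler)
    if not match:
        return None
    # Insert a space at each lower/digit-to-upper boundary (camelCase split).
    words = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", match.group(1))
    return words.replace("_", " ").lower().strip() or None


def clickable_name(attrs: dict, text: str = "") -> Optional[str]:
    """Walk the chain: attributes, descendant text, onclick, icon class, role."""
    for key in ATTR_ORDER:
        value = (attrs.get(key) or "").strip()
        if value:
            return value
    if text.strip():
        return " ".join(text.split())  # collapse whitespace in descendant text
    derived = name_from_onclick(attrs.get("onclick") or "")
    if derived:
        return derived
    for cls in (attrs.get("class") or "").split():
        if cls in ICON_NAMES:
            return ICON_NAMES[cls]
    return attrs.get("role") or None


def displayed_tag(tag: str, attrs: dict) -> str:
    """A role, when present, replaces the tag shown in the clickable map."""
    return attrs.get("role") or tag


def truncate_visible_first(items: list, cap: int = 200) -> list:
    """Keep at most `cap` items, visible ones first.

    sorted() is stable, so document order is preserved within each group.
    """
    return sorted(items, key=lambda item: not item["visible"])[:cap]
```

Under these assumptions, `clickable_name({"onclick": "viewBankAccount('0')"})` yields "view bank account" and `displayed_tag("i", {"role": "button"})` yields "button". Because the sort is stable, off-screen elements are the first to be dropped when the cap is hit, while on-screen elements keep their document order.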
Verification
- Unit tests for the naming chain (text/onclick/icon/role precedence, FontAwesome mapping, style-token filtering).
- Live e2e against headless Chrome confirming anchor text, nested-span text, `onclick`-derived names, and `role` → `button` all resolve on a real page.
🤖 Generated with Claude Code