feat(browser): selector-first find + get/click/type/select (A2+A3) #1112

Merged
jackwener merged 2 commits into main from feat/browser-selector-first on Apr 21, 2026

@jackwener (Owner)

Summary

Lands A2 + A3 in a single PR per #opencli-browser discussion.

A2 — browser find --css <sel>: structured CSS query. Returns {matches_n, entries[]} so agents can go from a semantic selector straight to a list of candidates, without parsing free-text snapshot output. Each entry exposes nth / ref / tag / role / text / attrs / visible. Invisible elements are still returned so agents can reason about offscreen vs truly-missing targets. Attribute whitelist kept small on purpose (id, class, name, type, placeholder, aria-label, title, href, value, role, data-testid) — no style / onclick leaks, locked by test.
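As a rough illustration of the A2 output shape, the sketch below models the `{matches_n, entries[]}` envelope and the attribute whitelist. The type names (`FindEntry`, `FindResult`), the helper `pickAttrs`, and the exact field types are assumptions inferred from the field names above, not the definitions in the actual source.

```typescript
// Hypothetical types inferred from the PR description; the real ones may differ.
interface FindEntry {
  nth: number;
  ref: number;
  tag: string;
  role: string; // assumption: modeled as a plain string here
  text: string;
  attrs: Record<string, string>;
  visible: boolean; // invisible matches are still returned, flagged false
}

interface FindResult {
  matches_n: number;
  entries: FindEntry[];
}

// The eleven attributes named in the PR description.
const ATTR_WHITELIST = [
  "id", "class", "name", "type", "placeholder", "aria-label",
  "title", "href", "value", "role", "data-testid",
] as const;

// Keep only whitelisted attributes, so style / onclick never reach the output.
function pickAttrs(raw: Record<string, string>): Record<string, string> {
  const out: Record<string, string> = {};
  for (const name of ATTR_WHITELIST) {
    const value = raw[name];
    if (value !== undefined) out[name] = value;
  }
  return out;
}

const filtered = pickAttrs({
  id: "login",
  style: "color:red",
  onclick: "track()",
  "data-testid": "submit",
});
console.log(JSON.stringify(filtered)); // {"id":"login","data-testid":"submit"}
```

Iterating over the whitelist rather than over the element's attributes means an unexpected attribute can only be dropped, never leaked.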

A3 — selector-first read path: browser get text/value/attributes <target> now accept either a numeric ref (legacy snapshot index) or a CSS selector. On multi-match CSS, the first element wins; matches_n is always reported so agents notice ambiguity. --nth <n> picks a specific one.
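The read-path rule (first match wins, `matches_n` always reported, `--nth` selects a specific match) could be sketched as follows. The function name `pickForRead` and the `ReadPick` type are hypothetical, and the sketch assumes `nth` is a 0-based integer that the CLI has already validated; the PR does not state the index base.

```typescript
// Hypothetical result type for a read-path resolution.
type ReadPick<T> =
  | { ok: true; element: T; matches_n: number }
  | {
      ok: false;
      code: "selector_not_found" | "selector_nth_out_of_range";
      matches_n: number;
    };

// Assumption: nth is 0-based and already checked to be an integer upstream
// (a malformed --nth would surface as usage_error before reaching this point).
function pickForRead<T>(matches: readonly T[], nth?: number): ReadPick<T> {
  const matches_n = matches.length;
  if (matches_n === 0) {
    return { ok: false, code: "selector_not_found", matches_n };
  }
  const index = nth ?? 0; // no --nth: the first match wins
  if (index < 0 || index >= matches_n) {
    return { ok: false, code: "selector_nth_out_of_range", matches_n };
  }
  return { ok: true, element: matches[index] as T, matches_n };
}

const picked = pickForRead(["Save", "Save draft", "Save as"]);
if (picked.ok && picked.matches_n > 1) {
  // The caller still sees that the selector was ambiguous.
  console.log(`read "${picked.element}" out of ${picked.matches_n} matches`);
}
```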

Bonus scope (approved by @codex-mini1 + @First-principles-1): click / type / select share the same contract.

  • Write ops reject multi-match CSS without --nth as selector_ambiguous (clicking one of three buttons at random is almost never what the agent meant).
  • Numeric ref path unchanged.
  • Success envelopes: click → {clicked, target, matches_n}, type → {typed, text, target, matches_n, autocomplete}, select → {selected, target, matches_n}. On failure, select returns {error: { code: 'not_a_select' | 'option_not_found', available? }}.
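The write-path rule in the list above differs from reads in one respect: a multi-match selector without `--nth` is rejected rather than resolved to the first element. A minimal sketch of that guard, with a hypothetical function name (`guardWrite`) and illustrative message and hint strings that are not taken from the source:

```typescript
// Hypothetical error shape, following the envelope fields named in the PR.
interface AmbiguousError {
  error: {
    code: "selector_ambiguous";
    message: string;
    hint?: string;
    matches_n: number;
  };
}

// Returns an error envelope when a write op would have to guess, else null.
function guardWrite(
  selector: string,
  matches_n: number,
  nth?: number,
): AmbiguousError | null {
  if (matches_n > 1 && nth === undefined) {
    return {
      error: {
        code: "selector_ambiguous",
        // Wording below is illustrative only.
        message: `"${selector}" matched ${matches_n} elements`,
        hint: "pass --nth <n> to pick one",
        matches_n,
      },
    };
  }
  return null;
}

console.log(guardWrite("button.save", 3)?.error.code); // selector_ambiguous
console.log(guardWrite("button.save", 3, 1)); // null
```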

Unified error envelope — every selector-first command emits

{ error: { code, message, hint?, candidates?, matches_n? } }

codes: invalid_selector / selector_not_found / selector_ambiguous / selector_nth_out_of_range (CSS) + not_found / stale_ref (numeric ref) + usage_error (malformed --nth / --css).
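On the consuming side, an agent might model the envelope and the listed codes roughly as below. This is a sketch under assumptions: the names `ErrorEnvelope`, `isErrorEnvelope`, and `isSelectorErrorCode` are hypothetical, and `candidates` is typed as `unknown[]` because its element shape is not described here.

```typescript
// The codes listed above, grouped as in the PR description.
const SELECTOR_ERROR_CODES = [
  "invalid_selector",
  "selector_not_found",
  "selector_ambiguous",
  "selector_nth_out_of_range", // CSS path
  "not_found",
  "stale_ref", // numeric ref path
  "usage_error", // malformed --nth / --css
] as const;

type SelectorErrorCode = (typeof SELECTOR_ERROR_CODES)[number];

// code is kept as a plain string so command-specific codes
// (e.g. not_a_select from select) still fit the same envelope.
interface ErrorEnvelope {
  error: {
    code: string;
    message: string;
    hint?: string;
    candidates?: unknown[];
    matches_n?: number;
  };
}

// Structural check: does this parsed JSON value look like an error envelope?
function isErrorEnvelope(value: unknown): value is ErrorEnvelope {
  if (typeof value !== "object" || value === null) return false;
  const err = (value as { error?: unknown }).error;
  if (typeof err !== "object" || err === null) return false;
  const { code, message } = err as { code?: unknown; message?: unknown };
  return typeof code === "string" && typeof message === "string";
}

function isSelectorErrorCode(code: string): code is SelectorErrorCode {
  return (SELECTOR_ERROR_CODES as readonly string[]).includes(code);
}

const parsed: unknown = JSON.parse(
  '{"error":{"code":"selector_ambiguous","message":"3 matches","matches_n":3}}',
);
if (isErrorEnvelope(parsed) && parsed.error.code === "selector_ambiguous") {
  console.log(`retry with --nth; ${parsed.error.matches_n} candidates`);
}
```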

Shared resolver: resolveTargetJs is the single source of truth for numeric-vs-CSS classification. click / typeText / scrollTo share one runResolve helper in BasePage; the CLI reuses it via resolveRef.
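The numeric-vs-CSS split can be pictured with a small host-side function. This is only a sketch: the real `resolveTargetJs` generates JavaScript that runs in the page, whereas `classifyTarget` and the `Target` type below are hypothetical, and treating "numeric" as "digits only" is an assumption.

```typescript
// Hypothetical classification result.
type Target =
  | { kind: "ref"; ref: number }
  | { kind: "css"; selector: string };

// Assumption: a target made only of digits is a snapshot ref; anything else
// is handed to the browser's selector parser untouched.
function classifyTarget(raw: string): Target {
  const trimmed = raw.trim();
  if (/^\d+$/.test(trimmed)) {
    return { kind: "ref", ref: Number(trimmed) };
  }
  return { kind: "css", selector: trimmed };
}

console.log(classifyTarget("12")); // { kind: 'ref', ref: 12 }
console.log(classifyTarget(":root")); // { kind: 'css', selector: ':root' }
```

Leaving every non-numeric string to the selector engine keeps the classifier from second-guessing what counts as valid CSS.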

Per @WAWQAQ's directive, no back-compat shims — the new surface is the surface.

Test plan

  • src/browser/find.test.ts — locks generated-JS shape, injection-safety, attr whitelist, per-entry shape, error branches
  • src/browser/target-resolver.test.ts — updated for new CSS error codes
  • src/browser/target-errors.test.ts — updated union + matches_n field
  • src/cli.test.ts — new describe blocks for find, get text/value/attributes, click/type, select: success envelope, selector_ambiguous, selector_nth_out_of_range, not_a_select, option_not_found, usage_error on malformed --nth
  • pnpm typecheck clean
  • 125 targeted tests green

Reviewers

@codex-mini1 @First-principles-1

…envelope

A2: new `browser find --css <sel>` — structured JSON (matches_n + entries[]) so
agents can go from semantic selector directly to a list of candidates without
parsing free-text snapshot output. Per-entry shape: nth/ref/tag/role/text/attrs/
visible. Attr whitelist kept small (11 high-signal fields), invisible elements
still returned so agents can reason about offscreen vs missing.

A3: get text/value/attributes now accept a selector-first <target> (numeric ref
OR CSS) and emit `{value, matches_n}`. Bonus scope (approved by reviewers):
click/type/select share the same contract with `--nth <n>`, emitting
`{clicked|typed|selected, target, matches_n, ...}` on success.

Unified structured error envelope across all selector-first commands:
  { error: { code, message, hint?, candidates?, matches_n? } }
with codes invalid_selector / selector_not_found / selector_ambiguous /
selector_nth_out_of_range (CSS) plus not_found / stale_ref (numeric ref).

Write commands reject multi-match CSS without `--nth` as selector_ambiguous;
reads default to "first match wins" but always expose matches_n so agents
notice ambiguity. `resolveTargetJs` is the single source of truth; click /
typeText / scrollTo share a `runResolve` helper in BasePage.

No back-compat shims per design directive.

125 targeted tests green; tsc clean.

fix(browser): unify selector surface + allocate fresh refs in find

Two blockers from PR #1112 review:

1. First-principles-1 (blocker): `browser find --css` now allocates fresh
   numeric refs for untagged matches. It scans `window.__opencli_ref_identity`
   (and any stray `data-opencli-ref` attrs) for the current max, allocates
   `max+1` upward, writes `data-opencli-ref` on the element, and populates
   the identity map with the same fingerprint shape snapshot uses (tag,
   role, text, ariaLabel, id, testId). `find -> click <ref>` now works on
   fresh pages without requiring `browser state` first. Type changed from
   `ref: number | null` to `ref: number`.

2. codex-mini1 (blocker): removed the `isCssLike` regex
   (`^[a-zA-Z#.\[]`) in `resolveTargetJs`. Valid selectors like `:root`,
   `:has(...)`, `*` used to short-circuit to "Cannot parse target" before
   reaching `querySelectorAll`, so `find --css` accepted them but
   `get/click/type/select` did not. Now: numeric → ref path, everything
   else → querySelectorAll, and the browser parser decides. Same selector
   surface across all selector-first commands.
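
The ref allocation described in item 1 can be sketched as below. It is a simplified model, not the shipped code: `ElementLike`, `allocateRefs`, and `fingerprintOf` are hypothetical names, the identity map is assumed to be a plain object keyed by ref, and numbering is assumed to start at 1 when nothing is tagged yet. In a page, `alreadyTagged` would come from something like `document.querySelectorAll('[data-opencli-ref]')`.

```typescript
const REF_ATTR = "data-opencli-ref";

// Fingerprint fields as listed in the commit message.
interface Fingerprint {
  tag: string;
  role: string;
  text: string;
  ariaLabel: string;
  id: string;
  testId: string;
}

// Minimal subset of the DOM Element API used here, so the sketch runs anywhere.
interface ElementLike {
  getAttribute(name: string): string | null;
  setAttribute(name: string, value: string): void;
}

function allocateRefs(
  matches: ElementLike[],
  alreadyTagged: ElementLike[],
  identity: Record<string, Fingerprint>, // assumed shape of the identity map
  fingerprintOf: (el: ElementLike) => Fingerprint,
): number[] {
  // Find the current max across the identity map and any tagged elements.
  let max = 0;
  for (const key of Object.keys(identity)) {
    const n = Number(key);
    if (Number.isInteger(n) && n > max) max = n;
  }
  for (const el of [...alreadyTagged, ...matches]) {
    const n = Number(el.getAttribute(REF_ATTR));
    if (Number.isInteger(n) && n > max) max = n;
  }
  // Keep existing refs; give untagged matches max+1, max+2, ...
  return matches.map((el) => {
    const existing = el.getAttribute(REF_ATTR);
    if (existing !== null && /^\d+$/.test(existing)) return Number(existing);
    max += 1;
    el.setAttribute(REF_ATTR, String(max));
    identity[String(max)] = fingerprintOf(el);
    return max;
  });
}
```

Because every match leaves with a numeric ref, the entry type can drop the `null` case, which is what makes `find -> click <ref>` work without a prior snapshot.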

Tests added:
- target-resolver: pseudo-selectors flow into CSS branch (not rejection)
- find: ref allocation writes attribute + identity map; fingerprint shape matches resolver
- cli: find envelope now expects numeric refs

127 targeted tests green; tsc clean.
jackwener merged commit 8a8f048 into main on Apr 21, 2026
13 checks passed
jackwener deleted the feat/browser-selector-first branch on April 21, 2026 05:47
luxiaolei pushed a commit to luxiaolei/OpenCLI that referenced this pull request Apr 22, 2026
…ackwener#1112)

* feat(browser): selector-first find + get/click/type/select with JSON envelope

* fix(browser): unify selector surface + allocate fresh refs in find

(cherry picked from commit 8a8f048)