
[v4]: unify selector type#1860

Merged
seanmcguire12 merged 5 commits into main from unify-v4-selector-shape
Mar 19, 2026

Conversation

@seanmcguire12
Member

@seanmcguire12 seanmcguire12 commented Mar 19, 2026

why

  • to unify selector into a single field across page request bodies
  • this groups semantically similar fields into a shared & reusable type

what changed

  • replaced PageSelectorSchema (xpath-only object) with a discriminated union of 4 selector types:
    • PageXPathSelectorSchema,
    • PageCssSelectorSchema,
    • PageTextSelectorSchema,
    • PageCoordinateSelectorSchema
  • added PageElementSelectorSchema (xpath/css/text, no coordinates) for waitForSelector
  • collapsed the dual *SelectorParams / *CoordinatesParams schema pairs for click, hover, scroll, and dragAndDrop into single schemas with a unified selector field
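As a rough illustration of the shapes described above, here is a plain TypeScript sketch of the four selector variants and the two unions. The real definitions are Zod schemas in packages/server-v4/src/schemas/v4/page.ts; the type aliases and the `selectorKind` helper below are assumptions made for illustration only.

```typescript
// Assumed shapes, inferred from the schema names in this PR.
type XPathSelector = { xpath: string };
type CssSelector = { css: string };
type TextSelector = { text: string };
type CoordinateSelector = { x: number; y: number };

// Full union: usable by click, hover, and dragAndDrop.
type PageSelector =
  | XPathSelector
  | CssSelector
  | TextSelector
  | CoordinateSelector;

// Element-only union: used by waitForSelector (no coordinates).
type PageElementSelector = XPathSelector | CssSelector | TextSelector;

// Hypothetical helper: names the variant by checking which key is present.
function selectorKind(
  sel: PageSelector,
): "xpath" | "css" | "text" | "coordinates" {
  if ("xpath" in sel) return "xpath";
  if ("css" in sel) return "css";
  if ("text" in sel) return "text";
  return "coordinates";
}

// A /hover body under the unified shape (selector value is illustrative).
const hoverBody: { selector: PageSelector } = { selector: { css: "#submit" } };

// An element-only selector, as waitForSelector would accept.
const waitTarget: PageElementSelector = { text: "Done" };
```

The `in` checks let TypeScript narrow the union without a separate tag field, which matches how the variants differ only by key.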

example /hover before:

Screenshot 2026-03-19 at 1 39 06 PM

example /hover after:

Screenshot 2026-03-19 at 1 41 14 PM

test plan

  • added integration tests for css, text, and coordinate selector types

Summary by cubic

Unifies the v4 page selector into a single selector union and adds css, text, and coordinate options. This simplifies request bodies and makes click, hover, scroll, and drag‑and‑drop consistent.

  • New Features

    • One selector union for all page actions: xpath, css, text, or coordinates.
    • Introduced Selector and ElementSelector unions; split former PageSelector into XPathSelector, CssSelector, TextSelector, and CoordinateSelector.
    • Added ElementSelector (xpath/css/text) for element‑only cases like waitForSelector and scroll‑by‑percentage; exported PageScrollElementParams and PageScrollCoordinateParams.
    • dragAndDrop supports mixing selector types (e.g., xpath → coordinates).
    • Added integration tests for css, text, coordinate selectors, and mixed drag‑and‑drop.
  • Migration

    • click/hover: always send { selector: { xpath|css|text|x,y } }. Remove top‑level x/y.
    • scroll: use element selector + percentage, or coordinate selector + deltaY.
    • dragAndDrop: from/to now use the unified selector (elements or coordinates).
    • waitForSelector: use ElementSelector (no coordinates).
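The migration notes above could look roughly like this on the client side. This is a sketch, assuming the pre-change click/hover bodies were either `{ selector: { xpath } }` or top-level `{ x, y }`; `toUnifiedBody` is a hypothetical helper and all selector values are illustrative.

```typescript
type Selector =
  | { xpath: string }
  | { css: string }
  | { text: string }
  | { x: number; y: number };

// Assumed legacy click/hover shapes: xpath selector object or top-level x/y.
type LegacyPointerBody =
  | { selector: { xpath: string } }
  | { x: number; y: number };

type UnifiedPointerBody = { selector: Selector };
type DragBody = { from: Selector; to: Selector };

// Hypothetical helper: moves top-level x/y under `selector`.
function toUnifiedBody(legacy: LegacyPointerBody): UnifiedPointerBody {
  if ("selector" in legacy) return { selector: legacy.selector };
  return { selector: { x: legacy.x, y: legacy.y } };
}

// scroll: element selector + percentage, or coordinate selector + deltaY.
// The numeric values here are placeholders; units are not stated in this PR.
const scrollElement = { selector: { css: "main" }, percentage: 50 };
const scrollByDelta = { selector: { x: 100, y: 200 }, deltaY: 300 };

// dragAndDrop: from/to may mix selector types (xpath to coordinates).
const drag: DragBody = { from: { xpath: "//li[1]" }, to: { x: 400, y: 120 } };

// waitForSelector: element selectors only, no coordinates.
const waitFor = { selector: { text: "Done" } };
```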

Written for commit b0b15f5. Summary will update on new commits. Review in cubic

@changeset-bot

changeset-bot bot commented Mar 19, 2026

⚠️ No Changeset found

Latest commit: b0b15f5

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

@greptile-apps
Contributor

greptile-apps bot commented Mar 19, 2026

Greptile Summary

This PR unifies all selector types across the v4 page API into a single discriminated union (PageSelector = xpath | css | text | coordinate) and a narrower element-only variant (PageElementSelector = xpath | css | text). The paired *SelectorParams / *CoordinatesParams schema families for click, hover, scroll, and dragAndDrop are collapsed into single schemas, simplifying the request surface while preserving backwards-compatible wire formats. Non-xpath selector types currently fall back to a stub xpath in the route handlers, with that limitation documented in comments.

Key changes:

  • PageXPathSelectorSchema, PageCssSelectorSchema, PageTextSelectorSchema, PageCoordinateSelectorSchema replace the old single-field PageSelectorSchema
  • PageElementSelectorSchema (xpath/css/text only) is used for waitForSelector and scroll-by-element params to prevent coordinate selectors where they are invalid
  • PageScrollParams retains a two-branch union (PageScrollElementParams / PageScrollCoordinateParams) because the secondary fields differ (percentage vs deltaY); all other actions collapse to a single schema
  • PageDragAndDropParams now accepts PageSelector (all 4 types) for both from and to, enabling mixed drag sources/targets — tested with an xpath→coordinate combination
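To make the two-branch scroll union concrete, here is a minimal TypeScript sketch. The field names follow the class diagram in this review; the type aliases and `describeScroll` are hypothetical, and showing an omitted `deltaX` as 0 is only a display choice in this example, not documented server behavior.

```typescript
type ElementSelector = { xpath: string } | { css: string } | { text: string };
type CoordinateSelector = { x: number; y: number };

// Branch 1: scroll an element by percentage.
type ScrollElementParams = { selector: ElementSelector; percentage: number };

// Branch 2: scroll at a point by a delta.
type ScrollCoordinateParams = {
  selector: CoordinateSelector;
  deltaX?: number;
  deltaY: number;
};

type ScrollParams = ScrollElementParams | ScrollCoordinateParams;

// Hypothetical helper: the secondary field tells the branches apart.
function describeScroll(params: ScrollParams): string {
  if ("percentage" in params) {
    return `scroll element to ${params.percentage}`;
  }
  const { x, y } = params.selector;
  return `scroll at (${x}, ${y}) by (${params.deltaX ?? 0}, ${params.deltaY})`;
}
```

Because `percentage` and `deltaY` never appear together, a single collapsed schema would have to make both optional, which is presumably why scroll keeps two branches.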

Issue found:

  • PagePointSchema / PagePoint is now orphaned — PageDragAndDropCoordinatesParamsSchema (its only consumer) was removed but PagePoint is still exported and registered in both pageOpenApiComponents and openapi.v4.yaml

Confidence Score: 4/5

  • Safe to merge; changes are well-structured and the only finding is a minor orphaned schema left over from the refactor.
  • The schema unification is internally consistent between the Zod definitions, OpenAPI YAML, and route handlers. Integration tests cover the new selector variants. The one issue — PagePoint remaining registered after its only referencing schema was removed — is dead code rather than a functional bug and has no runtime impact.
  • packages/server-v4/src/schemas/v4/page.ts and packages/server-v4/openapi.v4.yaml — specifically the orphaned PagePoint/PagePointSchema that should either be removed or explicitly retained with a comment.

Important Files Changed

Filename Overview
packages/server-v4/src/schemas/v4/page.ts Core schema refactor – replaces PageSelectorSchema (xpath-only) with a 4-way discriminated union (xpath/css/text/coordinates) and a 3-way element-only variant for waitForSelector. PagePoint is now orphaned after the removal of PageDragAndDropCoordinatesParamsSchema.
packages/server-v4/openapi.v4.yaml OpenAPI spec updated to match schema changes: adds PageXPathSelector, PageCssSelector, PageTextSelector, PageCoordinateSelector, PageElementSelector anyOf unions, collapses paired *Params schemas, and moves PagePoint to the bottom – but PagePoint is no longer referenced by any $ref.
packages/server-v4/src/routes/v4/page/scroll.ts Accesses params.selector (valid on both union branches) and applies the same "xpath" in sel guard; percentage/deltaY fields are unused by the stub, which is consistent with the rest of the stub layer.
packages/server-v4/src/routes/v4/page/shared.ts Comment updated to reflect the new discriminated-union design; normalizeXPath remains exported but unused within server-v4 (pre-existing).
packages/server-v4/test/integration/v4/page.test.ts Two new integration tests added: one verifying css/text/coordinate selector acceptance for click, and one verifying mixed-type selectors for dragAndDrop. Tests correctly reflect stub-layer behavior.
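The `"xpath" in sel` guard mentioned for scroll.ts can be sketched as below. This assumes the handlers pass through a real xpath when one is given and otherwise substitute a stub; `STUB_XPATH` and `xpathOrStub` are hypothetical names, and the actual stub value is not shown on this page.

```typescript
type Selector =
  | { xpath: string }
  | { css: string }
  | { text: string }
  | { x: number; y: number };

// Hypothetical placeholder; the real stub value is not shown in this PR page.
const STUB_XPATH = "//stub";

// Returns the caller's xpath when present, otherwise the stub.
function xpathOrStub(sel: Selector): string {
  return "xpath" in sel ? sel.xpath : STUB_XPATH;
}
```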

Class Diagram

%%{init: {'theme': 'neutral'}}%%
classDiagram
    class PageSelector {
        <<anyOf>>
    }
    class PageElementSelector {
        <<anyOf>>
    }
    class PageXPathSelector {
        +xpath: string
    }
    class PageCssSelector {
        +css: string
    }
    class PageTextSelector {
        +text: string
    }
    class PageCoordinateSelector {
        +x: number
        +y: number
    }

    PageSelector --> PageXPathSelector
    PageSelector --> PageCssSelector
    PageSelector --> PageTextSelector
    PageSelector --> PageCoordinateSelector

    PageElementSelector --> PageXPathSelector
    PageElementSelector --> PageCssSelector
    PageElementSelector --> PageTextSelector

    class PageClickParams {
        +selector: PageSelector
        +button?: MouseButton
        +clickCount?: number
    }
    class PageHoverParams {
        +selector: PageSelector
    }
    class PageScrollElementParams {
        +selector: PageElementSelector
        +percentage: number
    }
    class PageScrollCoordinateParams {
        +selector: PageCoordinateSelector
        +deltaX?: number
        +deltaY: number
    }
    class PageDragAndDropParams {
        +from: PageSelector
        +to: PageSelector
        +button?: MouseButton
        +steps?: number
        +delay?: number
    }
    class PageWaitForSelectorParams {
        +selector: PageElementSelector
        +state?: WaitForSelectorState
        +timeout?: number
    }

    PageClickParams --> PageSelector
    PageHoverParams --> PageSelector
    PageScrollElementParams --> PageElementSelector
    PageScrollCoordinateParams --> PageCoordinateSelector
    PageDragAndDropParams --> PageSelector
    PageWaitForSelectorParams --> PageElementSelector

Comments Outside Diff (1)

  1. packages/server-v4/src/schemas/v4/page.ts, line 160-166 (link)

    P2 PagePoint is now orphaned

    PagePointSchema / PagePoint is still exported and registered in pageOpenApiComponents, but with the removal of PageDragAndDropCoordinatesParamsSchema it is no longer referenced by any other Zod schema or OpenAPI $ref. It has become dead code.

    If it was kept intentionally (e.g. for future use or SDK generation), a comment explaining that would help. Otherwise it can be removed alongside its entry in pageOpenApiComponents at line 1405 and its definition in openapi.v4.yaml.

Last reviewed commit: "prettier"

Contributor

@cubic-dev-ai cubic-dev-ai bot left a comment

1 issue found across 8 files

Confidence score: 4/5

  • This PR looks safe to merge with minimal risk: the reported concern is a low-to-moderate severity API/schema hygiene issue (4/10) rather than a clear runtime break.
  • The key issue is in packages/server-v4/src/schemas/v4/page.ts: internal schemas are being exposed via pageOpenApiComponents, which can leak private implementation details instead of keeping them scoped to PageScrollParamsSchema.
  • Pay close attention to packages/server-v4/src/schemas/v4/page.ts - ensure internal-only schemas are removed from pageOpenApiComponents to avoid unintended public surface area.
Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/server-v4/src/schemas/v4/page.ts">

<violation number="1" location="packages/server-v4/src/schemas/v4/page.ts:1415">
P2: Remove these internal schemas from `pageOpenApiComponents`. Since they are not exported, they should remain private implementation details of `PageScrollParamsSchema`. (Based on your team's feedback about avoiding exposing internal types as public APIs.) [FEEDBACK_USED]</violation>
</file>
Architecture diagram
sequenceDiagram
    participant Client
    participant Validator as OpenAPI / Zod Validator
    participant Handler as Route Handler (v4)
    participant Shared as Shared Utils (normalizeXPath)
    participant Browser as Stagehand Core

    Note over Client,Browser: Unified Page Action Flow (Click, Hover, Drag, Scroll)

    Client->>Validator: POST /v4/page/[action]
    Note right of Client: CHANGED: Request now uses unified 'selector' field<br/>instead of top-level x/y coordinates

    activate Validator
    Validator->>Validator: CHANGED: Match PageSelector Union
    alt Selector is XPath/CSS/Text
        Validator-->>Handler: PageElementSelector matched
    else Selector is Coordinate
        Validator-->>Handler: PageCoordinateSelector matched (x, y)
    end
    deactivate Validator

    activate Handler
    Handler->>Handler: NEW: Extract specific key from selector union

    alt Action is 'scroll'
        alt NEW: PageElementSelector used
            Handler->>Handler: Require 'percentage' param
        else NEW: PageCoordinateSelector used
            Handler->>Handler: Require 'deltaY' (and optional 'deltaX')
        end
    else Action is 'dragAndDrop'
        Note right of Handler: NEW: 'from' and 'to' can be<br/>different selector types (e.g. xpath to coord)
    else Action is 'waitForSelector'
        Note right of Handler: NEW: Restricted to PageElementSelector<br/>(Coordinates not allowed)
    end

    Handler->>Shared: normalizeXPath(params.selector)
    Shared-->>Handler: xpath (real or stub)

    Handler->>Browser: Execute Stagehand action
    Browser-->>Handler: Action Result
    Handler-->>Client: 200 OK (with PageAction result)
    deactivate Handler

    Note over Client,Validator: Unhappy Path: Validation Error
    Client->>Validator: POST /v4/page/click { selector: { invalid: 'type' } }
    Validator-->>Client: 400 Bad Request (ValidationErrorResponse)
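
The validation step in the diagram above can be approximated with a hand-rolled check. This is only a sketch: the server uses Zod schemas, and `parseSelector`, `ParseResult`, and the error message are hypothetical. How the real validator treats bodies carrying several selector keys at once is not stated here; this sketch simply takes the first match.

```typescript
type Selector =
  | { xpath: string }
  | { css: string }
  | { text: string }
  | { x: number; y: number };

type ParseResult =
  | { ok: true; selector: Selector }
  | { ok: false; status: 400; error: string };

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Hypothetical validator: accepts one of the four selector variants,
// otherwise reports a 400-style failure.
function parseSelector(input: unknown): ParseResult {
  if (isRecord(input)) {
    const xpath = input["xpath"];
    if (typeof xpath === "string") return { ok: true, selector: { xpath } };
    const css = input["css"];
    if (typeof css === "string") return { ok: true, selector: { css } };
    const text = input["text"];
    if (typeof text === "string") return { ok: true, selector: { text } };
    const x = input["x"];
    const y = input["y"];
    if (typeof x === "number" && typeof y === "number") {
      return { ok: true, selector: { x, y } };
    }
  }
  return {
    ok: false,
    status: 400,
    error: "selector must be one of xpath, css, text, or x/y",
  };
}
```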

.meta({ id: "PageCoordinateSelector" });

// Full union (all 4 types)
export const PageSelectorSchema = z
Member

Should we just call this SelectorSchema or ElementSelectorSchema or SuperSelectorSchema? Having page in the name makes me think it's a selector to select a page out of the list of all pages.

Member Author

will do!

Member

@pirate pirate left a comment

Approved either way but see my comment about naming.

@seanmcguire12 seanmcguire12 merged commit 425b18a into main Mar 19, 2026
41 checks passed