Feature Request: Add Element Locator Tool for aria-ref to ref Mapping #588

Open
@yattin

🎯 Feature Summary

Add a new tool that allows querying elements using aria-ref attributes within captured PageSnapshots and returns the corresponding MCP internal ref identifier along with traditional CSS and XPath selectors for enhanced automation workflows.

🔍 Background & Context

Currently, the Playwright MCP workflow requires:

  1. Call browser_snapshot to get an accessibility snapshot
  2. Manually identify the target element's ref from the snapshot
  3. Use the ref parameter in interaction tools like browser_click, browser_type, etc.

The server enables LLMs to interact with web pages through structured accessibility snapshots, and all interaction tools require a ref parameter which is the "Exact target element reference from the page snapshot".

However, there's currently no programmatic way to locate elements by HTML attributes (like aria-ref) and retrieve their corresponding MCP ref identifier.

🛠️ Proposed Tool

Tool Name: browser_find_element

Description

Locate elements within the current PageSnapshot using various attribute-based selectors and return their MCP ref identifiers along with standard CSS/XPath selectors.

Parameters

{
  // Query parameters (at least one required)
  "aria_ref"?: string,           // aria-ref attribute value
  "data_testid"?: string,        // data-testid attribute value  
  "id"?: string,                 // HTML id attribute
  "css_selector"?: string,       // CSS selector

  // Options
  "include_selectors"?: boolean, // Include CSS/XPath (default: true)
  "multiple"?: boolean           // Return all matches vs first match (default: false)
}
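
The "at least one required" rule and the two defaults above could be enforced roughly as follows. This is a minimal sketch, not an existing Playwright MCP API: `FindElementParams`, `NormalizedParams`, and `validateParams` are hypothetical names introduced only for illustration.

```typescript
// Hypothetical parameter shape for the proposed browser_find_element tool.
interface FindElementParams {
  aria_ref?: string;
  data_testid?: string;
  id?: string;
  css_selector?: string;
  include_selectors?: boolean; // default: true
  multiple?: boolean;          // default: false
}

// Same shape, but with the two options always filled in.
type NormalizedParams = FindElementParams & {
  include_selectors: boolean;
  multiple: boolean;
};

type Validation =
  | { ok: true; params: NormalizedParams }
  | { ok: false; error: string };

const QUERY_KEYS = ["aria_ref", "data_testid", "id", "css_selector"] as const;

function validateParams(input: FindElementParams): Validation {
  // At least one query field must be a non-empty string.
  const hasQuery = QUERY_KEYS.some((key) => {
    const value = input[key];
    return typeof value === "string" && value !== "";
  });
  if (!hasQuery) {
    return {
      ok: false,
      error: `At least one of ${QUERY_KEYS.join(", ")} is required`,
    };
  }
  // Apply the defaults stated in the parameter list.
  return {
    ok: true,
    params: {
      ...input,
      include_selectors: input.include_selectors ?? true,
      multiple: input.multiple ?? false,
    },
  };
}
```

Returning a tagged result instead of throwing keeps the failure case easy to turn into a tool-level error message.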

Expected Response

{
  "found": true,
  "elements": [
    {
      "ref": "node-123",  // MCP internal reference for use with other tools
      "element_info": {
        "role": "button",
        "name": "Submit Form", 
        "tag": "button",
        "attributes": {
          "aria-ref": "submit-btn",
          "class": "btn btn-primary"
        }
      },
      "selectors": {
        "css_path": "form.main-form button[aria-ref='submit-btn']",
        "xpath": "//form[@class='main-form']//button[@aria-ref='submit-btn']",
        "css_nth": "form.main-form button:nth-child(2)"
      }
    }
  ],
  "total_found": 1
}
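
A response of this shape could be assembled from snapshot data along these lines. The sketch assumes the snapshot is available as a flat list of nodes that carry their HTML attributes, which the issue does not specify; `SnapshotNode` and `findByAttribute` are hypothetical. It also assumes `total_found` counts every match even when only the first one is returned.

```typescript
// Hypothetical flat representation of one snapshot entry.
interface SnapshotNode {
  ref: string;
  role: string;
  name: string;
  tag: string;
  attributes: Record<string, string>;
}

interface FindElementResponse {
  found: boolean;
  elements: { ref: string; element_info: Omit<SnapshotNode, "ref"> }[];
  total_found: number;
}

function findByAttribute(
  nodes: SnapshotNode[],
  attribute: string,
  value: string,
  multiple = false
): FindElementResponse {
  // Exact match on the attribute value.
  const matches = nodes.filter((node) => node.attributes[attribute] === value);
  // Return all matches or only the first, depending on `multiple`.
  const selected = multiple ? matches : matches.slice(0, 1);
  return {
    found: matches.length > 0,
    elements: selected.map(({ ref, ...element_info }) => ({ ref, element_info })),
    total_found: matches.length,
  };
}
```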

🎯 Use Cases & Integration

1. Streamlined Automation Workflow

// Before (current workflow)
const snapshot = await browser_snapshot();
// Manual inspection needed to find ref for aria-ref="submit"
const clickResult = await browser_click({
  element: "Submit button",
  ref: "node-123" // Had to manually identify this
});

// After (with new tool)
const findResult = await browser_find_element({
  aria_ref: "submit"
});
const clickResult = await browser_click({
  element: "Submit button",
  ref: findResult.elements[0].ref // Programmatically obtained
});

2. Enhanced Test Automation

  • Generate stable selectors for CI/CD pipelines
  • Bridge the gap between accessibility-first development and automation
  • Support for multiple selector strategies in one tool

3. Debugging & Development Support

  • Quick element identification during development
  • Validation of accessibility attributes
  • Cross-reference between MCP refs and standard selectors

🏗️ Technical Implementation Considerations

Integration with Existing Architecture

  • Leverage Current PageSnapshot: Build upon existing browser_snapshot infrastructure
  • Maintain Consistency: Follow same response patterns as other MCP tools
  • Performance: Cache snapshot data to avoid redundant accessibility tree traversals
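
One way the caching point could work is to build an attribute index once per snapshot and reuse it for later lookups. This is only a sketch under assumptions: `IndexedNode`, `getIndex`, and `lookupRefs` are hypothetical, and the snapshot is assumed to be an array that is replaced (not mutated) whenever a new snapshot is taken.

```typescript
// Hypothetical minimal node: only what the index needs.
interface IndexedNode {
  ref: string;
  attributes: Record<string, string>;
}

// One index per snapshot array; entries disappear when the snapshot is garbage collected.
const indexCache = new WeakMap<readonly IndexedNode[], Map<string, string[]>>();

function getIndex(snapshot: readonly IndexedNode[]): Map<string, string[]> {
  const cached = indexCache.get(snapshot);
  if (cached) return cached;

  // Single traversal: map "attribute=value" to the refs that carry it.
  const index = new Map<string, string[]>();
  for (const node of snapshot) {
    for (const [attribute, value] of Object.entries(node.attributes)) {
      const key = `${attribute}=${value}`;
      const refs = index.get(key) ?? [];
      refs.push(node.ref);
      index.set(key, refs);
    }
  }
  indexCache.set(snapshot, index);
  return index;
}

function lookupRefs(
  snapshot: readonly IndexedNode[],
  attribute: string,
  value: string
): string[] {
  return getIndex(snapshot).get(`${attribute}=${value}`) ?? [];
}
```

Keying the cache on the snapshot object itself means a fresh snapshot naturally gets a fresh index, so stale refs are not served after the page changes.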

Selector Generation Strategy

Based on Playwright's accessibility tree structure that contains information about elements, their roles, names, values, and relationships:

function generateSelectors(element) {
  return {
    css_path: buildCSSPath(element),     // Hierarchical CSS selector
    xpath: buildXPath(element),          // XPath expression
    css_nth: buildNthChildCSS(element)   // Nth-child based selector
  };
}
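
The helper bodies are not specified above. As an illustration, two of them could be written as follows for a simplified element tree; `ElementNode` and `makeNode` are hypothetical, and a real implementation would need access to the DOM rather than only the accessibility tree. `buildCSSPath` is left out of this sketch.

```typescript
// Hypothetical simplified element tree.
interface ElementNode {
  tag: string;
  parent: ElementNode | null;
  children: ElementNode[];
}

// Convenience constructor that wires up parent links.
function makeNode(tag: string, children: ElementNode[] = []): ElementNode {
  const node: ElementNode = { tag, parent: null, children };
  for (const child of children) child.parent = node;
  return node;
}

// Positional CSS selector: each step is tag:nth-child(k), counting all siblings.
function buildNthChildCSS(element: ElementNode): string {
  const parts: string[] = [];
  let node: ElementNode | null = element;
  while (node) {
    const current = node;
    const parent = current.parent;
    if (parent) {
      const position = parent.children.indexOf(current) + 1;
      parts.unshift(`${current.tag}:nth-child(${position})`);
    } else {
      parts.unshift(current.tag); // root element has no position
    }
    node = parent;
  }
  return parts.join(" > ");
}

// Absolute XPath: each step is tag[k], counting only siblings with the same tag.
function buildXPath(element: ElementNode): string {
  const parts: string[] = [];
  let node: ElementNode | null = element;
  while (node) {
    const current = node;
    const parent = current.parent;
    if (parent) {
      const sameTag = parent.children.filter((c) => c.tag === current.tag);
      parts.unshift(`${current.tag}[${sameTag.indexOf(current) + 1}]`);
    } else {
      parts.unshift(current.tag);
    }
    node = parent;
  }
  return "/" + parts.join("/");
}
```

Positional selectors like these are unique but brittle when the page layout changes, which is why the proposal also lists an attribute-based `css_path`.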

Error Handling

  • Handle cases where attributes are not found
  • Support partial matching for dynamic content
  • Graceful degradation when selector generation fails
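
The not-found and graceful-degradation cases might look like the following sketch. The `warnings` and `error` fields and the `withSelectors` and `notFound` helpers are assumptions for illustration; they are not part of the response format proposed above.

```typescript
interface Selectors {
  css_path: string;
  xpath: string;
}

interface ElementResult {
  ref: string;
  selectors: Selectors | null; // null when generation failed
  warnings: string[];
}

// Still return the ref when selector generation throws, so interaction tools remain usable.
function withSelectors(ref: string, generate: () => Selectors): ElementResult {
  try {
    return { ref, selectors: generate(), warnings: [] };
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    return {
      ref,
      selectors: null,
      warnings: [`selector generation failed: ${reason}`],
    };
  }
}

// Structured "no match" response that echoes the query for debugging.
function notFound(query: Record<string, string>) {
  return {
    found: false,
    elements: [] as ElementResult[],
    total_found: 0,
    error: `No element matched ${JSON.stringify(query)}`,
  };
}
```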

🔄 Compatibility with Current MCP Tools

The proposed tool would complement the existing tools as follows:

| Current Tool | Integration Benefit |
| --- | --- |
| browser_snapshot | Uses same accessibility data source |
| browser_click | Provides ref parameter directly |
| browser_type | Enables attribute-based form field targeting |
| browser_hover | Supports accessible element hover actions |
| browser_drag | Facilitates drag-and-drop with accessible elements |

📋 Acceptance Criteria

  • Successfully locate elements by aria-ref, data-testid, id, and CSS selectors
  • Return valid MCP ref identifiers compatible with existing interaction tools
  • Generate stable and unique CSS Path and XPath selectors
  • Handle multiple matches with configurable return behavior
  • Maintain performance with large PageSnapshots (>1000 elements)
  • Provide comprehensive error messages for debugging
  • Include accessibility metadata in responses
  • Follow existing MCP tool parameter and response patterns

🚀 Future Enhancement Opportunities

Phase 2 Features

  • Fuzzy Matching: Support approximate text matching for accessible names
  • Selector Validation: Verify generated selectors against live page state
  • Batch Operations: Find multiple elements in single tool call
  • Performance Metrics: Include selector generation timing in responses

Integration Possibilities

  • Test Generation: Auto-generate Playwright test code using returned selectors
  • Page Object Models: Support structured element mapping for large applications
  • Accessibility Auditing: Identify elements missing required accessibility attributes


Priority: High
Complexity: Medium
Dependencies: Existing PageSnapshot infrastructure
Estimated Effort: 2-3 development cycles

Note: This tool addresses the gap between accessibility-first development practices and automation needs, making MCP more powerful for developers using semantic HTML attributes.

