Skip to content

added xpath selection optimizer#141

Merged
sauravpanda merged 4 commits intomainfrom
add-xpath-optimizer
Nov 18, 2025
Merged

added xpath selection optimizer#141
sauravpanda merged 4 commits intomainfrom
add-xpath-optimizer

Conversation

@sauravpanda
Copy link
Copy Markdown
Collaborator

@sauravpanda sauravpanda commented Nov 17, 2025

Summary by cubic

Adds an XPath selection optimizer that generates a small set of robust, prioritized XPath alternatives (plus an absolute fallback) to improve selector healing stability and reduce brittle paths. Also fixes the alternative limit to return only the absolute XPath when max_alternatives=1.

  • New Features

    • Integrates XPathOptimizer into SelectorGenerator to produce optimized XPaths, capped at 2 (optimized + absolute fallback).
    • Adds priority scoring and strategy metadata (id/name/data/aria > table/form-anchored > class/text > absolute).
    • Limits total strategies returned (default: 2) to keep selection concise and reliable.
    • Wires optimizer into HealingService with flags (enable_xpath_optimization=True).
    • Improves CLI to auto-use default values when all required inputs have defaults, and auto-run stored workflows without prompts.
  • Bug Fixes

    • Corrects max_alternatives=1 behavior to return only the absolute XPath; includes tests and a small script to verify.

Written for commit 371ab17. Summary will update automatically on new commits.

Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

4 issues found across 6 files

Prompt for AI agents (all 4 issues)

Understand the root cause of the following 4 issues and fix them.


<file name="workflows/workflow_use/healing/selector_generator.py">

<violation number="1" location="workflows/workflow_use/healing/selector_generator.py:67">
Defaulting max_total_strategies to 2 truncates the selector list to only two strategies, removing critical fallbacks like XPath and causing element healing to fail once the first strategies miss.</violation>
</file>

<file name="workflows/workflow_use/healing/xpath_optimizer.py">

<violation number="1" location="workflows/workflow_use/healing/xpath_optimizer.py:138">
When max_alternatives is 1 or less, the function still keeps one non-fallback alternative and then appends the absolute path, returning two entries even though the limit was set to 1; callers expecting the documented cap can get more selectors than requested.</violation>
</file>

<file name="workflows/cli.py">

<violation number="1" location="workflows/cli.py:1228">
Optional workflow inputs without defaults can no longer be supplied: the new check treats &quot;optional&quot; as equivalent to &quot;has a default&quot;, skips the entire input prompt, and silently drops those parameters whenever every required field has a default.</violation>

<violation number="2" location="workflows/cli.py:2505">
run-stored-workflow skips prompting for optional parameters with no defaults whenever the required parameters already have defaults, so those values cannot be provided anymore.</violation>
</file>

Reply to cubic to teach it or ask questions. Re-run a review with @cubic-dev-ai review this PR

10. Direct CSS/xpath (fallback)
"""

def __init__(self, enable_xpath_optimization: bool = True, max_xpath_alternatives: int = 2, max_total_strategies: int = 2):
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot Nov 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Defaulting max_total_strategies to 2 truncates the selector list to only two strategies, removing critical fallbacks like XPath and causing element healing to fail once the first strategies miss.

Prompt for AI agents
Address the following comment on workflows/workflow_use/healing/selector_generator.py at line 67:

<comment>Defaulting max_total_strategies to 2 truncates the selector list to only two strategies, removing critical fallbacks like XPath and causing element healing to fail once the first strategies miss.</comment>

<file context>
@@ -62,6 +64,23 @@ class SelectorGenerator:
 	10. Direct CSS/xpath (fallback)
 	&quot;&quot;&quot;
 
+	def __init__(self, enable_xpath_optimization: bool = True, max_xpath_alternatives: int = 2, max_total_strategies: int = 2):
+		&quot;&quot;&quot;
+		Initialize the SelectorGenerator.
</file context>
Suggested change
def __init__(self, enable_xpath_optimization: bool = True, max_xpath_alternatives: int = 2, max_total_strategies: int = 2):
def __init__(self, enable_xpath_optimization: bool = True, max_xpath_alternatives: int = 2, max_total_strategies: Optional[int] = None):
Fix with Cubic

if max_alternatives > 1:
unique_alternatives = unique_alternatives[: max_alternatives - 1]
else:
unique_alternatives = unique_alternatives[:1]
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot Nov 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When max_alternatives is 1 or less, the function still keeps one non-fallback alternative and then appends the absolute path, returning two entries even though the limit was set to 1; callers expecting the documented cap can get more selectors than requested.

Prompt for AI agents
Address the following comment on workflows/workflow_use/healing/xpath_optimizer.py at line 138:

<comment>When max_alternatives is 1 or less, the function still keeps one non-fallback alternative and then appends the absolute path, returning two entries even though the limit was set to 1; callers expecting the documented cap can get more selectors than requested.</comment>

<file context>
@@ -117,17 +119,29 @@ def optimize_xpath(self, absolute_xpath: str, element_info: Optional[Dict] = Non
+		if max_alternatives &gt; 1:
+			unique_alternatives = unique_alternatives[: max_alternatives - 1]
+		else:
+			unique_alternatives = unique_alternatives[:1]
+
+		# Always add original absolute path as last resort
</file context>
Suggested change
unique_alternatives = unique_alternatives[:1]
unique_alternatives = []

✅ Addressed in 42a6294

Comment thread workflows/cli.py Outdated
f', default: {typer.style(str(default_value), fg=typer.colors.BLUE)}' if default_value is not None else ''
# Check if all inputs have defaults (can run without user input)
all_inputs_have_defaults = all(
getattr(inp, 'default', None) is not None or not inp.required for inp in workflow_definition.input_schema
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot Nov 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

run-stored-workflow skips prompting for optional parameters with no defaults whenever the required parameters already have defaults, so those values cannot be provided anymore.

Prompt for AI agents
Address the following comment on workflows/cli.py at line 2505:

<comment>run-stored-workflow skips prompting for optional parameters with no defaults whenever the required parameters already have defaults, so those values cannot be provided anymore.</comment>

<file context>
@@ -2460,24 +2500,64 @@ def run_stored_workflow(
-					f&#39;, default: {typer.style(str(default_value), fg=typer.colors.BLUE)}&#39; if default_value is not None else &#39;&#39;
+			# Check if all inputs have defaults (can run without user input)
+			all_inputs_have_defaults = all(
+				getattr(inp, &#39;default&#39;, None) is not None or not inp.required for inp in workflow_definition.input_schema
+			)
+
</file context>
Suggested change
getattr(inp, 'default', None) is not None or not inp.required for inp in workflow_definition.input_schema
getattr(inp, 'default', None) is not None for inp in workflow_definition.input_schema
Fix with Cubic

Comment thread workflows/cli.py Outdated
default_value = getattr(input_def, 'default', None)
# Check if all inputs have defaults (can skip prompting)
all_inputs_have_defaults = all(
getattr(input_def, 'default', None) is not None or not input_def.required for input_def in input_definitions
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot Nov 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional workflow inputs without defaults can no longer be supplied: the new check treats "optional" as equivalent to "has a default", skips the entire input prompt, and silently drops those parameters whenever every required field has a default.

Prompt for AI agents
Address the following comment on workflows/cli.py at line 1228:

<comment>Optional workflow inputs without defaults can no longer be supplied: the new check treats &quot;optional&quot; as equivalent to &quot;has a default&quot;, skips the entire input prompt, and silently drops those parameters whenever every required field has a default.</comment>

<file context>
@@ -1223,58 +1223,78 @@ async def _run_workflow():
-				default_value = getattr(input_def, &#39;default&#39;, None)
+			# Check if all inputs have defaults (can skip prompting)
+			all_inputs_have_defaults = all(
+				getattr(input_def, &#39;default&#39;, None) is not None or not input_def.required for input_def in input_definitions
+			)
 
</file context>
Suggested change
getattr(input_def, 'default', None) is not None or not input_def.required for input_def in input_definitions
getattr(input_def, 'default', None) is not None for input_def in input_definitions
Fix with Cubic

Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 issues found across 4 files (reviewed changes from recent commits).

Prompt for AI agents (all 3 issues)

Understand the root cause of the following 3 issues and fix them.


<file name="workflows/test_max_alternatives_bug_fix.py">

<violation number="1" location="workflows/test_max_alternatives_bug_fix.py:34">
The max_alternatives=1 check only asserts the count of returned XPaths, so the regression test would still pass even if the optimizer returned a single (wrong) optimized XPath instead of the required absolute fallback.</violation>

<violation number="2" location="workflows/test_max_alternatives_bug_fix.py:50">
The max_alternatives=2 test only verifies the number of XPaths, so it would miss regressions where the absolute fallback is absent or duplicated, defeating the stated expectation of “1 optimized + 1 absolute”.</violation>

<violation number="3" location="workflows/test_max_alternatives_bug_fix.py:66">
The max_alternatives=3 regression test asserts only the total count, so it would not detect cases where the absolute fallback is missing or where fewer than two optimized selectors are produced, contradicting the expectation printed to the user.</violation>
</file>

Reply to cubic to teach it or ask questions. Re-run a review with @cubic-dev-ai review this PR

is_absolute = xpath == absolute_xpath
print(f' {i}. {xpath[:60]}{"..." if len(xpath) > 60 else ""} {"(absolute)" if is_absolute else "(optimized)"}')

if len(result_3) == 3:
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot Nov 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The max_alternatives=3 regression test asserts only the total count, so it would not detect cases where the absolute fallback is missing or where fewer than two optimized selectors are produced, contradicting the expectation printed to the user.

Prompt for AI agents
Address the following comment on workflows/test_max_alternatives_bug_fix.py at line 66:

<comment>The max_alternatives=3 regression test asserts only the total count, so it would not detect cases where the absolute fallback is missing or where fewer than two optimized selectors are produced, contradicting the expectation printed to the user.</comment>

<file context>
@@ -0,0 +1,78 @@
+	is_absolute = xpath == absolute_xpath
+	print(f&#39;      {i}. {xpath[:60]}{&quot;...&quot; if len(xpath) &gt; 60 else &quot;&quot;} {&quot;(absolute)&quot; if is_absolute else &quot;(optimized)&quot;}&#39;)
+
+if len(result_3) == 3:
+	print(&#39;   ✅ PASS: Exactly 3 XPaths returned&#39;)
+else:
</file context>

✅ Addressed in f5fdca1

is_absolute = xpath == absolute_xpath
print(f' {i}. {xpath[:60]}{"..." if len(xpath) > 60 else ""} {"(absolute)" if is_absolute else "(optimized)"}')

if len(result_2) == 2:
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot Nov 17, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The max_alternatives=2 test only verifies the number of XPaths, so it would miss regressions where the absolute fallback is absent or duplicated, defeating the stated expectation of “1 optimized + 1 absolute”.

Prompt for AI agents
Address the following comment on workflows/test_max_alternatives_bug_fix.py at line 50:

<comment>The max_alternatives=2 test only verifies the number of XPaths, so it would miss regressions where the absolute fallback is absent or duplicated, defeating the stated expectation of “1 optimized + 1 absolute”.</comment>

<file context>
@@ -0,0 +1,78 @@
+	is_absolute = xpath == absolute_xpath
+	print(f&#39;      {i}. {xpath[:60]}{&quot;...&quot; if len(xpath) &gt; 60 else &quot;&quot;} {&quot;(absolute)&quot; if is_absolute else &quot;(optimized)&quot;}&#39;)
+
+if len(result_2) == 2:
+	print(&#39;   ✅ PASS: Exactly 2 XPaths returned&#39;)
+else:
</file context>

✅ Addressed in f5fdca1

Comment thread workflows/test_max_alternatives_bug_fix.py Outdated
@sauravpanda sauravpanda merged commit 6f31bc6 into main Nov 18, 2025
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant