Chromie/stg 1209 by chromiebot · Pull Request #1640 · browserbase/stagehand

chromiebot · 2026-01-29T23:08:14Z

why

what changed

test plan

Summary by cubic

Add level-0 logging when the LLM returns an element ID with no matching xpath and handle these cases gracefully in act and observe. Implements STG-1209 to improve debuggability and prevent invalid element IDs from propagating.

Bug Fixes
- actHandler: log with category "action" when no xpath exists for the returned elementId; act() now fails gracefully with success: false and a clear message.
- observeHandler: log with category "observation" and filter out elements whose IDs are missing from the xpath map.
- Tests: added xpath-lookup-failure-logging.test to verify logging, act() failure behavior, and observe() filtering (including multiple missing IDs).
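The act-side behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code in actHandler.ts: the `LogLine`, `Logger`, and `ActResult` shapes, the `resolveXpath` and `actOn` helpers, and the use of a `Map` for the xpath map are all assumptions made for the example. Only the log category, the level, and the message wording come from this PR.

```typescript
// Hypothetical shapes for illustration; not Stagehand's actual types.
type LogLine = { category: string; message: string; level: number };
type Logger = (line: LogLine) => void;
type ActResult = { success: boolean; message: string };

// Look up the element ID returned by the LLM. On a miss, log at level 0
// and return undefined so the caller can fail gracefully.
function resolveXpath(
  elementId: string,
  xpathMap: Map<string, string>,
  logger: Logger,
): string | undefined {
  const xpath = xpathMap.get(elementId);
  if (xpath === undefined) {
    logger({
      category: "action",
      message: `LLM returned ID ${elementId}, there is no xpath keyed by this ID`,
      level: 0,
    });
  }
  return xpath;
}

// Hypothetical stand-in for act(): report failure instead of propagating
// an element ID that cannot be resolved to an xpath.
function actOn(
  elementId: string,
  xpathMap: Map<string, string>,
  logger: Logger,
): ActResult {
  const xpath = resolveXpath(elementId, xpathMap, logger);
  if (xpath === undefined) {
    return { success: false, message: "No action found" };
  }
  // A real handler would perform the browser action here.
  return { success: true, message: `Resolved ${elementId} to ${xpath}` };
}
```

Under these assumptions, calling `actOn("1-999", map, logger)` with a map that lacks "1-999" emits exactly one level-0 log line and returns `success: false`, while a known ID returns `success: true` without logging.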

Written for commit 80b6b94. Summary will update on new commits.

This commit adds tests that verify:
- act() logs at level 0 when LLM returns element ID with no xpath
- act() returns success: false when xpath lookup fails
- observe() logs at level 0 and filters out actions when xpath lookup fails
- observe() handles multiple elements with missing xpaths correctly

The tests currently fail because the logging is not yet implemented.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

When the LLM returns an element ID that doesn't exist in the xpath map, we now log a clear level 0 message: "LLM returned ID x, there is no xpath keyed by this ID"

Changes:
- actHandler.ts: Added logging in normalizeActInferenceElement when xpath lookup fails, returning success: false
- observeHandler.ts: Added logging when element ID has no xpath, filtering out those actions from the response

This improves debuggability when the LLM returns invalid element IDs.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

changeset-bot · 2026-01-29T23:08:18Z

🦋 Changeset detected

Latest commit: 80b6b94

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 3 packages

Name	Type
@browserbasehq/stagehand	Patch
@browserbasehq/stagehand-evals	Patch
@browserbasehq/stagehand-server	Patch

greptile-apps · 2026-01-29T23:10:46Z

Greptile Overview

Greptile Summary

Added level 0 (ERROR) logging when the LLM returns element IDs that have no corresponding xpath in the lookup map.

Changes

actHandler.ts: Added v3Logger call with level 0 when trimTrailingTextNode(xpath) returns undefined in normalizeActInferenceElement function (lines 471-478)
observeHandler.ts: Added identical level 0 logging when xpath lookup fails during element mapping (lines 151-159)
New test file: Comprehensive test coverage verifying that:
- Level 0 messages are logged when xpath lookup fails
- act() returns success: false when no xpath is found
- observe() filters out elements with missing xpaths
- Both handlers handle empty xpath maps correctly
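The observe-side checks listed above could be exercised along these lines. This is a hedged sketch and not the contents of xpath-lookup-failure-logging.test.ts: `filterResolvable`, the element shapes, the sample IDs and xpath, and the `Map`-based xpath map are hypothetical, and a plain array stands in for whatever logger mock the real test uses.

```typescript
// Hypothetical shapes for illustration; not Stagehand's actual types.
type LogLine = { category: string; message: string; level: number };
type ObservedElement = { elementId: string; description: string };
type ResolvedElement = { selector: string; description: string };

// Keep only elements whose IDs resolve to an xpath; log each miss at level 0.
function filterResolvable(
  elements: ObservedElement[],
  xpathMap: Map<string, string>,
  logger: (line: LogLine) => void,
): ResolvedElement[] {
  const resolved: ResolvedElement[] = [];
  for (const el of elements) {
    const xpath = xpathMap.get(el.elementId);
    if (xpath === undefined) {
      logger({
        category: "observation",
        message: `LLM returned ID ${el.elementId}, there is no xpath keyed by this ID`,
        level: 0,
      });
      continue; // drop the element from the result
    }
    resolved.push({ selector: xpath, description: el.description });
  }
  return resolved;
}

// Capture log lines in an array so they can be inspected afterwards.
const captured: LogLine[] = [];
const results = filterResolvable(
  [
    { elementId: "1-0", description: "submit button" },
    { elementId: "1-999", description: "missing element" },
    { elementId: "1-998", description: "another missing element" },
  ],
  new Map([["1-0", "/html/body/button"]]),
  (line) => { captured.push(line); },
);

console.log(results.length);  // 1: only "1-0" survives
console.log(captured.length); // 2: one level-0 line per missing ID
```

With an empty map, the same function returns an empty array and logs once per element, which corresponds to the empty-xpath-map case in the list above.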

Impact

Improves observability for debugging LLM inference issues by explicitly logging when the LLM returns invalid element IDs. The error logging will help identify when the accessibility tree snapshot and LLM response are misaligned.

Confidence Score: 5/5

Safe to merge - simple logging addition with comprehensive test coverage
The changes add defensive logging to existing error paths without modifying core logic. Tests thoroughly cover the new logging behavior and existing behavior is preserved.
No files require special attention

Important Files Changed

Filename	Overview
packages/core/lib/v3/handlers/actHandler.ts	Added level 0 error logging when LLM returns element ID with no corresponding xpath mapping
packages/core/lib/v3/handlers/observeHandler.ts	Added level 0 error logging when xpath lookup fails for element IDs returned by LLM
packages/core/tests/xpath-lookup-failure-logging.test.ts	Comprehensive test coverage for xpath lookup failure scenarios in both act and observe handlers

Sequence Diagram

sequenceDiagram
    participant Client
    participant ActHandler
    participant LLM
    participant XPathMap
    participant Logger

    Client->>ActHandler: act(instruction)
    ActHandler->>ActHandler: captureHybridSnapshot()
    ActHandler->>LLM: getActionFromLLM(instruction, domElements)
    LLM-->>ActHandler: element with ID "1-999"
    ActHandler->>ActHandler: normalizeActInferenceElement()
    ActHandler->>XPathMap: lookup elementId "1-999"
    XPathMap-->>ActHandler: undefined (not found)
    ActHandler->>Logger: v3Logger(level: 0, message: "LLM returned ID...")
    Logger-->>ActHandler: logged
    ActHandler->>ActHandler: return undefined (no action)
    ActHandler-->>Client: { success: false, message: "No action found" }

    Note over Client,Logger: Same flow applies to ObserveHandler<br/>but filters out invalid elements instead

greptile-apps

3 files reviewed, no comments

cubic-dev-ai

No issues found across 3 files

Confidence score: 5/5

Automated review surfaced no issues in the provided summaries.
No files require special attention.

Architecture diagram

sequenceDiagram
    participant User
    participant Act as ActHandler
    participant Obs as ObserveHandler
    participant Snap as Snapshot Utility
    participant LLM as Inference (LLM)
    participant Log as v3Logger

    Note over User, Log: Act Flow: Handling Invalid Element IDs
    User->>Act: act(instruction)
    Act->>Snap: captureHybridSnapshot()
    Snap-->>Act: { combinedXpathMap, combinedTree }
    Act->>LLM: act(instruction, combinedTree)
    LLM-->>Act: { elementId: "1-999", method: "click" }

    Act->>Act: Lookup "1-999" in combinedXpathMap
    alt NEW: elementId not in map
        Act->>Log: NEW: log(level: 0, category: "action")
        Note right of Log: "no xpath keyed by this ID"
        Act-->>User: CHANGED: { success: false, message: "No action found" }
    else Valid elementId
        Act->>Act: Perform browser action
        Act-->>User: { success: true }
    end

    Note over User, Log: Observe Flow: Filtering Invalid Elements
    User->>Obs: observe(instruction)
    Obs->>Snap: captureHybridSnapshot()
    Snap-->>Obs: { combinedXpathMap }
    Obs->>LLM: observe(instruction)
    LLM-->>Obs: Array of elements [{ id: "1-0" }, { id: "1-999" }]

    loop For each element returned by LLM
        Obs->>Obs: Lookup ID in combinedXpathMap
        alt NEW: ID "1-999" missing from map
            Obs->>Log: NEW: log(level: 0, category: "observation")
            Note right of Obs: Filter out element from final list
        else ID "1-0" exists
            Note right of Obs: Include element in results
        end
    end
    Obs-->>User: Return filtered elements array

Chromie Bot and others added 2 commits January 29, 2026 22:10

greptile-apps bot reviewed Jan 29, 2026

cubic-dev-ai bot reviewed Jan 29, 2026

chore: add changeset for xpath lookup failure logging

80b6b94

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

miguelg719 changed the base branch from main to contrib/1640 February 5, 2026 17:48

miguelg719 merged commit 8e4f8c2 into browserbase:contrib/1640 Feb 6, 2026
21 of 27 checks passed

miguelg719 merged 3 commits into browserbase:contrib/1640 from chromiebot:chromie/STG-1209
