-
Notifications
You must be signed in to change notification settings - Fork 390
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use 4-char element ids instead of sequential integers #361
Use 4-char element ids instead of sequential integers #361
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! Tested on my local as well, worked really well. I've couple of small comments.
There are some prettier and mypy errors as well, would be perfect if you could take a look at those. |
Fixed those, but for this skyvern/webeye/actions/handler.py:487: error: Invalid index type "int" for "dict[str, str]"; expected type "str" [index] Please met know, what was the intention behind mapping index -> id in the prompt, whether it was a typo (also considering that there is no
|
"value": str // the value of the option. MAKE SURE YOU USE THIS VALUE TO SELECT THE OPTION. DO NOT PUT ANYTHING OTHER THAN A VALID OPTION VALUE HERE | ||
"id": str // the id of the option. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we remove the option.id to prevent LLM hallucinating.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we can remove this. I don't think we're even passing the option id to the LLM.
skyvern/webeye/actions/handler.py
Outdated
@@ -484,7 +484,7 @@ async def handle_select_option_action( | |||
return [ActionFailure(e)] | |||
|
|||
try: | |||
option_xpath = scraped_page.id_to_xpath_dict[action.option.index] | |||
option_xpath = scraped_page.id_to_xpath_dict[action.option.id] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually this is a legacy bug. We didn't store the option element as a single key in the scraped_page.id_to_xpath_dict
. So the option.id
here will never be found.
skyvern/skyvern/webeye/scraper/scraper.py
Lines 257 to 263 in 418473d
id_to_xpath_dict = {} | |
id_to_element_dict = {} | |
for element in elements: | |
element_id = element["id"] | |
# get_interactable_element_tree marks each interactable element with a unique_id attribute | |
id_to_xpath_dict[element_id] = f"//*[@{SKYVERN_ID_ATTR}='{element_id}']" | |
id_to_element_dict[element_id] = element |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yeah this is a big bug 😮 We should stop using id_to_xpath
dict and deprecate it IMO
this can trigger an error. Can you remove this? skyvern/skyvern/webeye/actions/handler.py Line 776 in 418473d
|
nice catch. this is a legacy problem after we finished the HTML element tree. |
after you finish all suggestions, can you solve the conflict? |
@LawyZheng we took over this PR, feel free to merge this on a new branch and then take over. |
"text": str, // Text for INPUT_TEXT action only | ||
"file_url": str, // The url of the file to upload if applicable. This field must be present for UPLOAD_FILE but can also be present for CLICK only if the click is to upload the file. It should be null otherwise. | ||
"download": bool, // Can only be true for CLICK actions. If true, the browser will trigger a download by clicking the element. If false, the browser will click the element without triggering a download. | ||
"option": { // The option to select for SELECT_OPTION action only. null if not SELECT_OPTION action | ||
"label": str, // the label of the option if any. MAKE SURE YOU USE THIS LABEL TO SELECT THE OPTION. DO NOT PUT ANYTHING OTHER THAN A VALID OPTION LABEL HERE | ||
"index": int, // the id corresponding to the optionIndex under the the select element. | ||
"index": int, // the index corresponding to the index under the select element. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"index": int, // the index corresponding to the index under the select element. | |
"index": int, // the index corresponding to the option index under the select element. |
We add a specific option index
field to the element tree. Let's keep this wording the same.
"value": str // the value of the option. MAKE SURE YOU USE THIS VALUE TO SELECT THE OPTION. DO NOT PUT ANYTHING OTHER THAN A VALID OPTION VALUE HERE | ||
"id": str // the id of the option. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we can remove this. I don't think we're even passing the option id to the LLM.
skyvern/webeye/actions/actions.py
Outdated
@@ -93,9 +93,10 @@ class SelectOption(BaseModel): | |||
label: str | None | |||
value: str | None | |||
index: int | None | |||
id: str | None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@LawyZheng let's remove this too
skyvern/webeye/actions/handler.py
Outdated
@@ -484,7 +484,7 @@ async def handle_select_option_action( | |||
return [ActionFailure(e)] | |||
|
|||
try: | |||
option_xpath = scraped_page.id_to_xpath_dict[action.option.index] | |||
option_xpath = scraped_page.id_to_xpath_dict[action.option.id] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yeah this is a big bug 😮 We should stop using id_to_xpath
dict and deprecate it IMO
Co-authored-by: LawyZheng <lawyzheng1106@gmail.com>
Co-authored-by: LawyZheng <lawyzheng1106@gmail.com>
Problem
This issue with element ids is tightly related to the iframes support. When element tree is built by injecting
domUtils.js
into each iframe, the resulting ids are being messed up, because at first the page element tree is built and then element tree of each iframe on that page, resulting in iframe children ids duplicating some ids of elements on the page.Solution
Use random 4-char ids for elements. This way the we can manipulate the tree without worrying about uniqueness of the element ids.