🚀 Remove Annotations and Tag All text elements (optionally) #8
tarsier/core.py
Outdated
@@ -16,14 +16,15 @@ def __init__(self, ocr_service: OCRService):
         with open(self._JS_TAG_UTILS, "r") as f:
             self._js_utils = f.read()

-    async def page_to_image(self, driver: AnyDriver) -> Tuple[bytes, Dict[int, str]]:
+    async def page_to_image(self, driver: AnyDriver, tagUninteractableText: bool = False) -> Tuple[bytes, Dict[int, str]]:
-    async def page_to_image(self, driver: AnyDriver, tagUninteractableText: bool = False) -> Tuple[bytes, Dict[int, str]]:
+    async def page_to_image(self, driver: AnyDriver, tag_text_elements: bool = False) -> Tuple[bytes, Dict[int, str]]:
tarsier/core.py
Outdated
        return {int(key): value for key, value in tag_to_xpath.items()}

    async def _remove_tags(self, adapter: BrowserAdapter) -> None:
        # await adapter.run_js(self._js_utils)
delete?
Why? We call this after tagging the page
Oh, do you mean the comment? Yes, I think we could, since it's private and only ever called after running _tag_page, so the utils should already be loaded
Yeah just the comment
tarsier/core.py
Outdated
        script = "removeTags();"
        if isinstance(adapter, SeleniumAdapter):
            script = f"return window.{script}"
Abstractions shouldn't leak like this. Could make a call_method function or something in driver. Could also just pass in JS code directly to run
Hm yeah, true. Thoughts @awtkns?
done
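For readers following this thread, here is a minimal sketch of the call_method idea suggested above. It is not the code that was merged. The class names RecordingAdapter and SeleniumStyleAdapter, the format_call hook, and the recording behaviour are all hypothetical stand-ins so that the example runs without a browser. Only run_js, BrowserAdapter, and removeTags() are names taken from the diff.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import List


class BrowserAdapter(ABC):
    """Hypothetical adapter interface (an assumption, not the project's real class)."""

    @abstractmethod
    async def run_js(self, js: str) -> object:
        """Run raw JavaScript in the page."""

    def format_call(self, call: str) -> str:
        """Turn a bare call such as 'removeTags();' into driver-ready JS."""
        return call

    async def call_method(self, call: str) -> object:
        # Driver-specific formatting stays inside the adapter.
        return await self.run_js(self.format_call(call))


class RecordingAdapter(BrowserAdapter):
    """Stand-in adapter that records scripts instead of executing them."""

    def __init__(self) -> None:
        self.scripts: List[str] = []

    async def run_js(self, js: str) -> object:
        self.scripts.append(js)
        return None


class SeleniumStyleAdapter(RecordingAdapter):
    """Stand-in for a Selenium-like adapter that needs the 'return window.' prefix."""

    def format_call(self, call: str) -> str:
        return f"return window.{call}"


async def remove_tags(adapter: BrowserAdapter) -> None:
    # The caller no longer needs an isinstance check on the adapter type.
    await adapter.call_method("removeTags();")


if __name__ == "__main__":
    plain, selenium_like = RecordingAdapter(), SeleniumStyleAdapter()
    asyncio.run(remove_tags(plain))
    asyncio.run(remove_tags(selenium_like))
    print(plain.scripts, selenium_like.scripts)
```

With this shape, the caller passes the same string to every adapter, and the prefix that Selenium-style drivers need is applied in one place.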
Run black ⬛️
LGTM!
Bonus: since we now annotate using spans, we can add CSS to the annotations so that the vision model can see them more easily
fixes #5 #7