[Text Extraction] Ignore transparent (or nearly-transparent) elements when extracting text #25552

Merged 1 commit on Mar 7, 2024


whsieh commented Mar 6, 2024

e23f17a

[Text Extraction] Ignore transparent (or nearly-transparent) elements when extracting text
https://bugs.webkit.org/show_bug.cgi?id=270598
rdar://124102506

Reviewed by Megan Gardner and Abrar Rahman Protyasha.

When extracting visible text, ignore subtrees where the renderer is transparent (or nearly
transparent). To do this, we adjust `extractItemData` to return an enum (`SkipExtraction`)
indicating whether we should skip text extraction for just the current node, or for the entire
subtree; we then use this to skip subtrees where there is either no renderer (i.e. `display: none;`)
or the opacity is near 0.

* LayoutTests/fast/text-extraction/basic-text-extraction.html:
* Source/WebCore/page/text-extraction/TextExtraction.cpp:
(WebCore::TextExtraction::extractItemData):
(WebCore::TextExtraction::extractRecursive):
(WebCore::TextExtraction::extractRenderedText):

Canonical link: https://commits.webkit.org/275769@main
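
The traversal described in the commit message can be sketched roughly as follows. This is a minimal, self-contained illustration and not the WebKit implementation: the `Node` struct, the `std::variant` return type, the `extractVisibleText` and `sampleTree` helpers, and the `0.05` opacity threshold are all assumptions made for the example. Only the names `extractItemData`, `extractRecursive`, and `SkipExtraction` come from the description above.

```cpp
#include <cassert>
#include <string>
#include <variant>
#include <vector>

// Hypothetical stand-in for a DOM node plus the renderer state we care about.
struct Node {
    std::string text;           // Text contributed by this node itself (may be empty).
    bool hasRenderer;           // false models e.g. `display: none;`.
    float opacity;              // Effective opacity of the renderer, in [0, 1].
    std::vector<Node> children;
};

// Whether to skip only the current node, or the node and everything below it.
enum class SkipExtraction { Self, SelfAndSubtree };

// Assumed placeholder threshold for "nearly transparent"; the real value may differ.
constexpr float minOpacityToConsiderVisible = 0.05f;

// Returns either the extracted text, or a reason to skip.
std::variant<SkipExtraction, std::string> extractItemData(const Node& node)
{
    assert(node.opacity >= 0.0f && node.opacity <= 1.0f);
    if (!node.hasRenderer || node.opacity < minOpacityToConsiderVisible)
        return SkipExtraction::SelfAndSubtree; // Nothing below can be visible.
    if (node.text.empty())
        return SkipExtraction::Self;           // Nothing to emit here; children may still matter.
    return node.text;
}

void extractRecursive(const Node& node, std::vector<std::string>& out)
{
    auto result = extractItemData(node);
    if (auto* skip = std::get_if<SkipExtraction>(&result)) {
        if (*skip == SkipExtraction::SelfAndSubtree)
            return;                            // Prune the whole subtree.
    } else
        out.push_back(std::get<std::string>(result));

    for (const auto& child : node.children)
        extractRecursive(child, out);
}

std::vector<std::string> extractVisibleText(const Node& root)
{
    std::vector<std::string> out;
    extractRecursive(root, out);
    return out;
}

// Small example tree: one visible leaf, one transparent subtree,
// one subtree without a renderer, and one visible leaf inside a plain container.
Node sampleTree()
{
    return Node { "", true, 1.0f, {
        Node { "Visible heading", true, 1.0f, {} },
        Node { "", true, 0.01f, { Node { "Inside a transparent subtree", true, 1.0f, {} } } },
        Node { "", false, 1.0f, { Node { "Inside display: none", true, 1.0f, {} } } },
        Node { "", true, 1.0f, { Node { "Nested visible", true, 1.0f, {} } } },
    } };
}
```

Calling `extractVisibleText(sampleTree())` yields only "Visible heading" and "Nested visible". The two-valued enum matters because a container with no text of its own should not hide its children, whereas a transparent or unrendered ancestor should hide all of its descendants, even those whose own opacity is 1.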

e13df56

| Misc | iOS, tvOS & watchOS | macOS | Linux | Windows |
| --- | --- | --- | --- | --- |
| ✅ 🧪 style | ✅ 🛠 ios | ✅ 🛠 mac | ✅ 🛠 wpe | ✅ 🛠 wincairo |
| ✅ 🧪 bindings | ✅ 🛠 ios-sim | ✅ 🛠 mac-AS-debug | ✅ 🧪 wpe-wk2 | |
| ✅ 🧪 webkitperl | ✅ 🧪 ios-wk2 | ✅ 🧪 api-mac | ✅ 🧪 api-wpe | |
| | ✅ 🧪 ios-wk2-wpt | ✅ 🧪 mac-wk1 | ✅ 🛠 wpe-skia | |
| | ✅ 🧪 api-ios | ✅ 🧪 mac-wk2 | ✅ 🛠 gtk | |
| | ✅ 🛠 tv | ✅ 🧪 mac-AS-debug-wk2 | ✅ 🧪 gtk-wk2 | |
| | ✅ 🛠 tv-sim | ✅ 🧪 mac-wk2-stress | ❌ 🧪 api-gtk | |
| ✅ 🛠 🧪 merge | ✅ 🛠 watch | | | |
| | ✅ 🛠 watch-sim | | | |

@whsieh requested a review from cdumez as a code owner March 6, 2024 22:30
@whsieh self-assigned this Mar 6, 2024
@whsieh added the Platform label (Portability improvements and other general platform improvements not driven directly by site bugs.) Mar 6, 2024
@whsieh added the merge-queue label (Applied to send a pull request to merge-queue) Mar 7, 2024
whsieh commented Mar 7, 2024

Thanks for the reviews!

@webkit-commit-queue

Committed 275769@main (e23f17a): https://commits.webkit.org/275769@main

Reviewed commits have been landed. Closing PR #25552 and removing active labels.

@webkit-commit-queue merged commit e23f17a into WebKit:main Mar 7, 2024
@webkit-commit-queue removed the merge-queue label (Applied to send a pull request to merge-queue) Mar 7, 2024
Labels: Platform (Portability improvements and other general platform improvements not driven directly by site bugs.)
5 participants