[AutoFill Debugging] Add a way for clients to extract only plain text while maintaining DOM structure #59605

Merged
webkit-commit-queue merged 1 commit into WebKit:main from whsieh:eng/308832 on Feb 28, 2026

@whsieh whsieh commented Feb 27, 2026

d9fb211

[AutoFill Debugging] Add a way for clients to extract only plain text while maintaining DOM structure
https://bugs.webkit.org/show_bug.cgi?id=308832
rdar://171240021

Reviewed by Abrar Rahman Protyasha.

Make some adjustments to the `_WKTextExtraction` SPI surface; see below for more details.

Test: TextExtractionTests.MinimalHTMLOutput

* Source/WebKit/Shared/TextExtractionToStringConversion.cpp:
(WebKit::TextExtractionAggregator::addResult):
(WebKit::TextExtractionAggregator::includeRects const):
(WebKit::TextExtractionAggregator::includeURLs const):
(WebKit::TextExtractionAggregator::usePlainTextOutput const):
(WebKit::TextExtractionAggregator::addNativeMenuItemsIfNeeded):

Rename several helper methods, in light of the new plain text output type.

(WebKit::addPartsForItem):
(WebKit::addTextRepresentationRecursive):
(WebKit::TextExtractionAggregator::onlyIncludeText const): Deleted.
* Source/WebKit/Shared/TextExtractionToStringConversion.h:
* Source/WebKit/UIProcess/API/Cocoa/WKWebView.mm:
(textExtractionOutputFormat):
(-[WKWebView _extractDebugTextWithConfigurationWithoutUpdatingFilterRules:assertionScope:completionHandler:]):
* Source/WebKit/UIProcess/API/Cocoa/_WKTextExtraction.h:

Deprecate `+configurationForVisibleTextOnly`, and replace it with a new enum value which represents
only plain text: `_WKTextExtractionOutputFormatPlainText`. Previously, making a new configuration
with `configurationForVisibleTextOnly` would yield a configuration that only supported _some_ of
the relevant configuration options. However, this is inconsistent with how the rest of the
extraction formatting options work, such as markdown — configuration flags that don't make sense for
the output format are simply ignored.
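
To illustrate the "flags that don't apply are simply ignored" behavior described above, here is a minimal standalone C++ sketch. It is not the WebKit implementation: `OutputFormat`, `Options`, `Item`, `shouldEmitURLs`, and `render` are hypothetical stand-ins invented for this example, and the rule that plain text output never emits URLs is an assumption made only to show the idea.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT the real WebKit types.
enum class OutputFormat { TextTree, Markdown, PlainText };

struct Options {
    OutputFormat format { OutputFormat::TextTree };
    bool includeURLs { true };
};

struct Item {
    std::string text;
    std::string url; // Empty when the item has no link.
};

// Assumption for this sketch: URLs make no sense in plain text output,
// so the flag is ignored there instead of being rejected.
bool shouldEmitURLs(const Options& options)
{
    return options.includeURLs && options.format != OutputFormat::PlainText;
}

std::string render(const std::vector<Item>& items, const Options& options)
{
    std::string out;
    for (const auto& item : items) {
        out += item.text;
        if (shouldEmitURLs(options) && !item.url.empty())
            out += " <" + item.url + ">";
        out += '\n';
    }
    return out;
}

// Usage: the same flag value yields different output depending on the format.
void example()
{
    std::vector<Item> items { Item { "Sign in", "https://example.com/login" } };
    Options options; // includeURLs defaults to true in this sketch.
    assert(render(items, options) == "Sign in <https://example.com/login>\n");

    options.format = OutputFormat::PlainText;
    assert(render(items, options) == "Sign in\n");
}
```

The point of the design is that a client picks the output format through one enum value and never has to know which flags that format supports.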

* Source/WebKit/UIProcess/API/Cocoa/_WKTextExtraction.mm:
(-[_WKTextExtractionConfiguration init]):
(+[_WKTextExtractionConfiguration configurationForVisibleTextOnly]):
(-[_WKTextExtractionConfiguration configureForMinimalOutput]):

Add a convenience method to reset all extraction configuration parameters back to values that ensure
minimal output text. For clients that need to ensure minimal output, but still might want to
preserve specific types of data, they can use `-configureForMinimalOutput` and then enable only what
they need.
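
The "reset to minimal, then opt back in" pattern described above can be sketched in standalone C++ as follows. This is a hypothetical model, not the real `_WKTextExtractionConfiguration`: the struct name, the helper `enabledOptionCount`, and the factory function are invented for this example. The field names are borrowed from setters listed in this change, and the defaults for event listeners and accessibility attributes are assumptions.

```cpp
#include <cassert>

// Hypothetical model for illustration; NOT the real _WKTextExtractionConfiguration.
struct ExtractionConfiguration {
    // The commit message says the default configuration includes extra data such as
    // bounding rects and URLs. The other two defaults are assumptions for this sketch.
    bool includeURLs { true };
    bool includeRects { true };
    bool includeEventListeners { true };
    bool includeAccessibilityAttributes { true };

    // Turns off every optional piece of output so the result is as small as possible.
    void configureForMinimalOutput()
    {
        includeURLs = false;
        includeRects = false;
        includeEventListeners = false;
        includeAccessibilityAttributes = false;
    }

    // Hypothetical helper: counts how many optional outputs are currently enabled.
    int enabledOptionCount() const
    {
        return includeURLs + includeRects + includeEventListeners + includeAccessibilityAttributes;
    }
};

// A client that needs minimal output but still wants URLs resets first,
// then enables only the one option it needs.
ExtractionConfiguration makeMinimalConfigurationWithURLs()
{
    ExtractionConfiguration configuration;
    configuration.configureForMinimalOutput();
    assert(!configuration.enabledOptionCount());
    configuration.includeURLs = true;
    return configuration;
}
```

Resetting first means a client does not have to track every option that might be added later; anything it does not explicitly enable stays off.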

In the future, we should consider making the default initialized `_WKTextExtractionConfiguration`
start at this minimal output, rather than defaulting to text tree (with various bits of information
included, such as bounding rects and URLs).

(-[_WKTextExtractionConfiguration setIncludeEventListeners:]):
(-[_WKTextExtractionConfiguration _initForOnlyVisibleText:]): Deleted.
(-[_WKTextExtractionConfiguration setIncludeURLs:]): Deleted.
(-[_WKTextExtractionConfiguration setIncludeRects:]): Deleted.
(-[_WKTextExtractionConfiguration setNodeIdentifierInclusion:]): Deleted.
(-[_WKTextExtractionConfiguration setEventListenerCategories:]): Deleted.
(-[_WKTextExtractionConfiguration setIncludeAccessibilityAttributes:]): Deleted.
(-[_WKTextExtractionConfiguration setIncludeTextInAutoFilledControls:]): Deleted.
(-[_WKTextExtractionConfiguration setOutputFormat:]): Deleted.
(-[_WKTextExtractionConfiguration setShortenURLs:]): Deleted.
* Source/WebKit/UIProcess/API/Cocoa/_WKTextExtractionInternal.h:
* Tools/TestWebKitAPI/Tests/WebKitCocoa/TextExtractionTests.mm:
(TestWebKitAPI::TEST(TextExtractionTests, VisibleTextOnly)):
(TestWebKitAPI::TEST(TextExtractionTests, MinimalHTMLOutput)):

Canonical link: https://commits.webkit.org/308380@main

e9988d2

Misc: ✅ 🧪 style, ✅ 🧪 bindings, ✅ 🧪 webkitperl, ✅ 🛠 🧪 merge
iOS, visionOS, tvOS & watchOS: ✅ 🛠 ios, ✅ 🛠 ios-sim, ✅ 🧪 ios-wk2, ✅ 🧪 ios-wk2-wpt, ✅ 🧪 api-ios, ✅ 🛠 ios-safer-cpp, ✅ 🛠 vision, ✅ 🛠 vision-sim, ✅ 🧪 vision-wk2, ✅ 🛠 tv, ✅ 🛠 tv-sim, ✅ 🛠 watch, ✅ 🛠 watch-sim
macOS: ✅ 🛠 mac, ✅ 🛠 mac-AS-debug, ✅ 🧪 api-mac, ✅ 🧪 api-mac-debug, ✅ 🧪 mac-wk1, ✅ 🧪 mac-wk2, ✅ 🧪 mac-AS-debug-wk2, ✅ 🧪 mac-wk2-stress, ✅ 🧪 mac-intel-wk2, ✅ 🛠 mac-safer-cpp
Linux: ✅ 🛠 wpe, ✅ 🧪 wpe-wk2, ✅ 🧪 api-wpe, ✅ 🛠 gtk3-libwebrtc, ✅ 🛠 gtk, ✅ 🧪 gtk-wk2, ✅ 🧪 api-gtk, ✅ 🛠 playstation
Windows: ✅ 🛠 win, ❌ 🧪 win-tests
Apple Internal: ✅ 🛠 ios-apple, ✅ 🛠 mac-apple, ✅ 🛠 vision-apple

whsieh requested a review from cdumez as a code owner February 27, 2026 20:38
whsieh self-assigned this Feb 27, 2026
whsieh added the New Bugs label Feb 27, 2026

whsieh commented Feb 27, 2026

Thanks for the review!

Looks like I also progressed an existing layout test (which needs to be rebaselined)

whsieh added the merge-queue label Feb 28, 2026
webkit-commit-queue commented

Committed 308380@main (d9fb211): https://commits.webkit.org/308380@main

Reviewed commits have been landed. Closing PR #59605 and removing active labels.

webkit-commit-queue merged commit d9fb211 into WebKit:main Feb 28, 2026
webkit-commit-queue removed the merge-queue label Feb 28, 2026