Skip to content

AX: Crash when navigating character-by-character via VoiceOver in text with multi-byte glyphs (e.g. emojis)#46124

Merged
webkit-commit-queue merged 1 commit into
WebKit:mainfrom
twilco:eng/AX-Crash-when-navigating-character-by-character-via-VoiceOver-in-text-with-multi-byte-glyphs-e-g-emojis
Jun 2, 2025
Merged

AX: Crash when navigating character-by-character via VoiceOver in text with multi-byte glyphs (e.g. emojis)#46124
webkit-commit-queue merged 1 commit into
WebKit:mainfrom
twilco:eng/AX-Crash-when-navigating-character-by-character-via-VoiceOver-in-text-with-multi-byte-glyphs-e-g-emojis

Conversation

@twilco
Copy link
Copy Markdown
Contributor

@twilco twilco commented May 30, 2025

9a59ea9

AX: Crash when navigating character-by-character via VoiceOver in text with multi-byte glyphs (e.g. emojis)
https://bugs.webkit.org/show_bug.cgi?id=293753
rdar://152255108

Reviewed by Vitor Roriz.

In AccessibilityRenderObject::textRuns, when we build text runs for use off the main-thread, part of that process
is measuring and storing character widths. It's important that the character widths we cache match the length of the
string. This matters particularly when considering multi-byte glyphs like certain emojis, which can expand the length
of the string multiple times (e.g. some emojis are 10 characters long!) but only take up one position in the GlyphBuffer.

To handle this, we padded our cached character widths with zeros based on the result of GlyphBuffer::uncheckedStringOffsetAt,
which should tell us how big the gaps created by multi-byte glyphs are. Unfortunately, ComplexTextController did not
compute this value correctly when building the glyph buffer. It only added the glyph index as the string offset, and
the glyph index is only relevant in the context of a single ComplexTextRun, not the overall string. This is easily
fixed by also adding the ComplexTextRun::stringLocation to get the real string offset for the glyph. This fixes the
crash by ensuring we pad our character widths vector as intended, thus avoiding an out-of-bounds access assert.

This commit also fixes an issue where bounding boxes for text containing multi-byte glyphs became completely wrong.
This happened because when we converted an NSRange (a location, and length based on the whole length of the string,
including sub-characters of a multi-byte glyph), we walked over the multi-byte glyphs as one "location", meaning
we would generate a text marker with the wrong offset.

Fix this by passing a new parameter to AXTextMarker::findMarker that forces walks through sub-characters of multi-byte
glyphs, rather than walking over them atomically. We'll need to use this in any place an accessibility API takes an
NSRange.

* LayoutTests/accessibility/mac/bounds-for-range-multibyte-glyphs-expected.txt: Added.
* LayoutTests/accessibility/mac/bounds-for-range-multibyte-glyphs.html: Added.
* LayoutTests/platform/mac-wk1/TestExpectations:
* Source/WebCore/accessibility/AXTextMarker.cpp:
(WebCore::AXTextMarker::nextMarkerFromOffset const):
(WebCore::AXTextMarker::findMarker const):
* Source/WebCore/accessibility/AXTextMarker.h:
* Source/WebCore/accessibility/AXTextRun.cpp:
(WebCore::AXTextRuns::localRect const):
* Source/WebCore/accessibility/mac/WebAccessibilityObjectWrapperMac.mm:
(computeTextBoundsForRange):
* Source/WebCore/platform/graphics/ComplexTextController.cpp:
(WebCore::ComplexTextController::enclosingGlyphBoundsForTextRun):

Canonical link: https://commits.webkit.org/295703@main

d8bbd07

Misc iOS, visionOS, tvOS & watchOS macOS Linux Windows
✅ 🧪 style ✅ 🛠 ios ✅ 🛠 mac ✅ 🛠 wpe ✅ 🛠 win
✅ 🧪 bindings ✅ 🛠 ios-sim ✅ 🛠 mac-AS-debug ✅ 🧪 wpe-wk2 ⏳ 🧪 win-tests
✅ 🧪 webkitperl ✅ 🧪 ios-wk2 ✅ 🧪 api-mac ✅ 🧪 api-wpe
✅ 🧪 ios-wk2-wpt ✅ 🧪 mac-wk1 ✅ 🛠 wpe-cairo
✅ 🧪 api-ios ✅ 🧪 mac-wk2 ✅ 🛠 gtk
✅ 🛠 vision ✅ 🧪 mac-AS-debug-wk2 ✅ 🧪 gtk-wk2
✅ 🛠 vision-sim ✅ 🧪 mac-wk2-stress ❌ 🧪 api-gtk
✅ 🛠 🧪 merge ✅ 🧪 vision-wk2 ✅ 🧪 mac-intel-wk2 ✅ 🛠 playstation
✅ 🛠 tv ✅ 🛠 mac-safer-cpp
✅ 🛠 tv-sim
✅ 🛠 watch
✅ 🛠 watch-sim

@twilco twilco self-assigned this May 30, 2025
@twilco twilco added the Accessibility For bugs related to accessibility. label May 30, 2025
@twilco twilco force-pushed the eng/AX-Crash-when-navigating-character-by-character-via-VoiceOver-in-text-with-multi-byte-glyphs-e-g-emojis branch 2 times, most recently from 58083e5 to d8bbd07 Compare May 30, 2025 17:30
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why ForceSingleOffsetMovement::Yes here? Don't we always want the next glyph?

Copy link
Copy Markdown
Contributor Author

@twilco twilco May 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Take this example:

Abc⁉️def🧑‍🧑‍🧒‍🧒ghi👫Abc

String length is 27, but number of rendered glyphs is 15.

VoiceOver requests NSAccessibilityBoundsForRangeParameterizedAttribute with NSRange { location: 25, length: 1 }. This would be character "b"

To compute the start marker (markerToLocation in the PR), we call AXTextMarker { backingObject, 0 }.nextMarkerFromOffset(range.location, ForceSingleOffsetMovement::Yes).

This walks forward 25 times to try to produce the right marker. But there are only 15 glyphs. We walk past "A", "b", and "c" just fine, decrementing counter from the initial 25, to 22.

Now we get to ⁉️. This is a 2 byte / character glyph. If we walk over it atomically, we move two places forward in the string, but only decrement counter once. Now we're at counter 21.

Subtract another 3 for ASCII "def". Counter = 18

Next is 🧑‍🧑‍🧒‍🧒. This is 10 bytes / characters (or 11? can't remember). We walk it atomically, meaning our counter is 17. So despite moving forward 10 / 11 characters in the string, we're only decrementing our counter once.

"ghi" -> counter 14

👫 -> two bytes I think, counter 13 because we walked it atomically

"Abc" -> counter 10

So we were asked to provide a text marker for location 25, but we're all the way at the end of the string, and have 10 more decrements left to go. This means we produce the wrong marker (one with offset 27)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is reasonable, I was having a look if we would need something like that on WidthIterator as well, but I think not.

ComplexTextController splits the original string into substrings for each ComplexTextRun when font transition happens. Currently the offset registered in the glyphbuffer is in relation to this ComplexTextRun substring. I think it makes sense to change that as the owner of ComplexTextController has access to the glyph buffer and to the original string but not to the individual ComplexTextRuns

@twilco twilco added the merge-queue Applied to send a pull request to merge-queue label Jun 2, 2025
…t with multi-byte glyphs (e.g. emojis)

https://bugs.webkit.org/show_bug.cgi?id=293753
rdar://152255108

Reviewed by Vitor Roriz.

In AccessibilityRenderObject::textRuns, when we build text runs for use off the main-thread, part of that process
is measuring and storing character widths. It's important that the character widths we cache match the length of the
string. This matters particularly when considering multi-byte glyphs like certain emojis, which can expand the length
of the string multiple times (e.g. some emojis are 10 characters long!) but only take up one position in the GlyphBuffer.

To handle this, we padded our cached character widths with zeros based on the result of GlyphBuffer::uncheckedStringOffsetAt,
which should tell us how big the gaps created by multi-byte glyphs are. Unfortunately, ComplexTextController did not
compute this value correctly when building the glyph buffer. It only added the glyph index as the string offset, and
the glyph index is only relevant in the context of a single ComplexTextRun, not the overall string. This is easily
fixed by also adding the ComplexTextRun::stringLocation to get the real string offset for the glyph. This fixes the
crash by ensuring we pad our character widths vector as intended, thus avoiding an out-of-bounds access assert.

This commit also fixes an issue where bounding boxes for text containing multi-byte glyphs became completely wrong.
This happened because when we converted an NSRange (a location, and length based on the whole length of the string,
including sub-characters of a multi-byte glyph), we walked over the multi-byte glyphs as one "location", meaning
we would generate a text marker with the wrong offset.

Fix this by passing a new parameter to AXTextMarker::findMarker that forces walks through sub-characters of multi-byte
glyphs, rather than walking over them atomically. We'll need to use this in any place an accessibility API takes an
NSRange.

* LayoutTests/accessibility/mac/bounds-for-range-multibyte-glyphs-expected.txt: Added.
* LayoutTests/accessibility/mac/bounds-for-range-multibyte-glyphs.html: Added.
* LayoutTests/platform/mac-wk1/TestExpectations:
* Source/WebCore/accessibility/AXTextMarker.cpp:
(WebCore::AXTextMarker::nextMarkerFromOffset const):
(WebCore::AXTextMarker::findMarker const):
* Source/WebCore/accessibility/AXTextMarker.h:
* Source/WebCore/accessibility/AXTextRun.cpp:
(WebCore::AXTextRuns::localRect const):
* Source/WebCore/accessibility/mac/WebAccessibilityObjectWrapperMac.mm:
(computeTextBoundsForRange):
* Source/WebCore/platform/graphics/ComplexTextController.cpp:
(WebCore::ComplexTextController::enclosingGlyphBoundsForTextRun):

Canonical link: https://commits.webkit.org/295703@main
@webkit-commit-queue webkit-commit-queue force-pushed the eng/AX-Crash-when-navigating-character-by-character-via-VoiceOver-in-text-with-multi-byte-glyphs-e-g-emojis branch from d8bbd07 to 9a59ea9 Compare June 2, 2025 19:02
@webkit-commit-queue
Copy link
Copy Markdown
Collaborator

Committed 295703@main (9a59ea9): https://commits.webkit.org/295703@main

Reviewed commits have been landed. Closing PR #46124 and removing active labels.

@webkit-commit-queue webkit-commit-queue merged commit 9a59ea9 into WebKit:main Jun 2, 2025
@webkit-commit-queue webkit-commit-queue removed the merge-queue Applied to send a pull request to merge-queue label Jun 2, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Accessibility For bugs related to accessibility.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants