Add support for multiple-char operators #59196

Open
Ahmad-S792 wants to merge 1 commit into WebKit:main from Ahmad-S792:eng/Add-support-for-multiple-char-operators

Conversation

Ahmad-S792 (Contributor) commented Feb 22, 2026

439c815

Add support for multiple-char operators
https://bugs.webkit.org/show_bug.cgi?id=124828
rdar://170907545

Reviewed by NOBODY (OOPS!).

When an operator contains a base character followed by combining characters
(e.g., "≠" which is "=" + U+0338), use the base character for dictionary
lookup. This matches MathML Core spec behavior and other browser engines.

Previously, parseOperatorChar() only handled single code points via
convertToSingleCodePoint(), which returned nullopt for multi-character
operators, causing them to use default spacing instead of the base
character's dictionary properties.

The fix iterates through code points and checks if all characters after
the first are combining marks (Unicode categories Mn, Mc, Me). If so,
the first character is used for dictionary lookup.
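
The rule described above can be sketched in standalone C++. This is an illustration only, not the code in the patch: `baseCharacterForLookup` and `isCombiningMark` are hypothetical names, the operator text is assumed to be already decoded into code points, and the combining-mark test is a hard-coded stand-in that covers two Unicode blocks rather than a real general-category (Mn, Mc, Me) query.

```cpp
#include <optional>
#include <string_view>

// Hypothetical stand-in for a Unicode general-category check (Mn, Mc, Me).
// Assumption: only two blocks are covered, which is enough for the examples;
// a real implementation would query a Unicode library instead.
constexpr bool isCombiningMark(char32_t c)
{
    return (c >= 0x0300 && c <= 0x036F)  // Combining Diacritical Marks
        || (c >= 0x20D0 && c <= 0x20FF); // Combining Diacritical Marks for Symbols
}

// Returns the code point to use for dictionary lookup, or nullopt when the
// text is empty or any code point after the first is not a combining mark.
std::optional<char32_t> baseCharacterForLookup(std::u32string_view text)
{
    if (text.empty())
        return std::nullopt;
    for (char32_t c : text.substr(1)) {
        if (!isCombiningMark(c))
            return std::nullopt;
    }
    return text.front();
}
```

With this sketch, `baseCharacterForLookup(U"=\u0338")` yields `=`, while a two-character operator such as `U"=="` yields no base character and would keep the default spacing.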

* LayoutTests/imported/w3c/web-platform-tests/mathml/presentation-markup/operators/operator-dictionary-combining-expected.txt: Progression
* Source/WebCore/mathml/MathMLOperatorElement.cpp:
(WebCore::MathMLOperatorElement::parseOperatorChar):

Misc: ✅ 🧪 style, ✅ 🧪 bindings, ✅ 🧪 webkitperl
iOS, visionOS, tvOS & watchOS: ✅ 🛠 ios, ✅ 🛠 ios-sim, ✅ 🧪 ios-wk2, ✅ 🧪 ios-wk2-wpt, ✅ 🧪 api-ios, ✅ 🛠 ios-safer-cpp, ✅ 🛠 vision, ✅ 🛠 vision-sim, ✅ 🧪 vision-wk2, ✅ 🛠 tv, ✅ 🛠 tv-sim, ✅ 🛠 watch, ✅ 🛠 watch-sim
macOS: ✅ 🛠 mac, ✅ 🛠 mac-AS-debug, ✅ 🧪 api-mac, ❌ 🧪 api-mac-debug, ✅ 🧪 mac-wk1, ✅ 🧪 mac-wk2, ✅ 🧪 mac-AS-debug-wk2, ✅ 🧪 mac-wk2-stress, ✅ 🧪 mac-intel-wk2, ✅ 🛠 mac-safer-cpp
Linux: ✅ 🛠 wpe, ✅ 🧪 wpe-wk2, ✅ 🧪 api-wpe, ✅ 🛠 gtk3-libwebrtc, ✅ 🛠 gtk, ✅ 🧪 gtk-wk2, ✅ 🧪 api-gtk, ✅ 🛠 playstation
Windows: ✅ 🛠 win, ❌ 🧪 win-tests
Apple Internal: ✅ 🛠 ios-apple, ✅ 🛠 mac-apple, ✅ 🛠 vision-apple

Ahmad-S792 self-assigned this on Feb 22, 2026
Ahmad-S792 added the MathML label on Feb 22, 2026

// Try single code point first.
if (auto codePoint = StringView(trimmed).convertToSingleCodePoint()) {
    auto character = codePoint.value();
    // The minus sign renders better than the hyphen sign used in some MathML formulas.
Contributor

Why is this comment removed? That seems unrelated to this patch.

}

// Handle base character followed by combining character(s).
// Per MathML Core spec, use the base character for dictionary lookup.
Contributor

From https://w3c.github.io/mathml-core/#dfn-algorithm-to-determine-the-category-of-an-operator I only see the two-character case, where the second character is U+0338 COMBINING LONG SOLIDUS OVERLAY or U+20D2 COMBINING LONG VERTICAL LINE OVERLAY (other cases were not in the MathML 3 dictionary), so we can probably simplify.

That probably also means test coverage is insufficient: we are not testing that operators with multiple combining characters, or with other combining characters, get the "Default" category instead.
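
A minimal sketch of the narrower rule discussed in this comment, assuming the only multi-character form that falls back to its base character is a base followed by a single U+0338 or U+20D2 (as in the "=" + U+0338 example from the description). The name `lookupCharacterPerSpec` is hypothetical, the content is assumed to be already decoded into code points, and the remaining steps of the spec algorithm are deliberately left out.

```cpp
#include <optional>
#include <string_view>

// Hypothetical helper: returns the code point to look up, or nullopt when
// the content should keep the "Default" category.
// Assumption: only the single-character and "base + overlay" forms are
// modeled here; everything else is treated as "Default".
std::optional<char32_t> lookupCharacterPerSpec(std::u32string_view content)
{
    constexpr char32_t combiningLongSolidusOverlay = 0x0338;
    constexpr char32_t combiningLongVerticalLineOverlay = 0x20D2;

    if (content.size() == 1)
        return content[0];
    if (content.size() == 2
        && (content[1] == combiningLongSolidusOverlay
            || content[1] == combiningLongVerticalLineOverlay))
        return content[0];
    return std::nullopt;
}
```

Under this rule, content with several combining characters (`U"=\u0338\u0338"`) or with a different combining character (`U"=\u0301"`) returns nullopt, which are the cases the comment says are currently untested.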

Contributor Author

Nice! Will explore.

if (firstChar == hyphenMinus)
    firstChar = minusSign;
operatorChar.character = firstChar;
operatorChar.isVertical = isVertical(operatorChar.character);
Contributor

Would be nice to factor out the shared code into a helper function or lambda.
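
One possible shape for that refactoring, as a standalone sketch rather than the actual WebKit code. The `OperatorChar` struct, the `parseOperatorCharSketch` function, its two optional parameters, and the stubbed `isVertical` are all assumptions made so the example compiles on its own; only the hyphen-to-minus replacement and the two assignments come from the snippet under review.

```cpp
#include <optional>

// Hypothetical stand-in for the real operator-character type.
struct OperatorChar {
    char32_t character { 0 };
    bool isVertical { false };
};

constexpr char32_t hyphenMinus = 0x002D;
constexpr char32_t minusSign = 0x2212;

// Placeholder only: the real isVertical() is not reproduced here; this stub
// just lets the sketch compile.
bool isVertical(char32_t) { return false; }

// 'single' models the single-code-point path and 'base' the
// base-plus-combining path; both are assumptions of this sketch.
OperatorChar parseOperatorCharSketch(std::optional<char32_t> single, std::optional<char32_t> base)
{
    OperatorChar operatorChar;

    // Shared tail of both branches, factored into one lambda.
    auto setCharacter = [&operatorChar](char32_t character) {
        if (character == hyphenMinus)
            character = minusSign;
        operatorChar.character = character;
        operatorChar.isVertical = isVertical(operatorChar.character);
    };

    if (single)
        setCharacter(*single);
    else if (base)
        setCharacter(*base);
    return operatorChar;
}
```

Keeping the hyphen-to-minus replacement inside the lambda means both paths get it without duplicating the check.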
