Description
From MS Office Team:
The Unicode BiDi Algorithm (explained here: http://www.unicode.org/reports/tr9/tr9-31.html) is used by text rendering engines to properly display LTR and RTL text together in the same paragraph. In short, each character has a classification (mostly the strong types L and R, plus a number of weak and neutral types), and the sequence of character classifications is used to determine which runs are displayed as LTR or RTL.
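To make the per-character classification concrete, here is a minimal sketch using Python's standard `unicodedata` module, which exposes each character's Bidi_Class. The helper name `bidi_classes` and the sample string are made up for illustration. This only looks up the input classes; it does not run the BiDi algorithm itself.

```python
import unicodedata
from typing import List, Tuple


def bidi_classes(text: str) -> List[Tuple[str, str]]:
    """Pair each character with its Unicode Bidi_Class (e.g. 'L', 'R', 'AL', 'WS')."""
    return [(ch, unicodedata.bidirectional(ch)) for ch in text]


# Illustrative sample: Latin letters, a space, two Hebrew letters, a digit, punctuation.
sample = "abc \u05d0\u05d1 1!"
for ch, cls in bidi_classes(sample):
    print(f"U+{ord(ch):04X} {cls}")
```

Latin letters report `L`, Hebrew letters report `R`, Arabic letters report `AL`, and spaces, digits, and punctuation report weak or neutral classes that the algorithm resolves from context.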
Mostly what I think we would need is the ability to query, for a given character, whether its direction is LTR or RTL after the Unicode BiDi Algorithm has been applied. I don’t know where the best place to put this API would be, perhaps the Range object? Something like this:
Range.getBiDiTextDir() : int
The return value is one of the enum values LTR, RTL, or MIXED, where MIXED is returned when the Range contains both LTR and RTL characters. (The caller would need to try again with a smaller range to get something useful in this case. MIXED would never be returned for a single character.)
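As a rough illustration of the LTR / RTL / MIXED semantics described above, here is a sketch in Python. Everything in it is an assumption for illustration: the `TextDir` enum, the `get_bidi_text_dir` function, and the `base` fallback parameter are hypothetical and not part of any existing DOM API. It is also a deliberate simplification: it only looks at strong character classes and falls back to a base direction otherwise, whereas a real implementation would report the resolved direction after the full Unicode BiDi Algorithm (numbers, embeddings, and neutral resolution included).

```python
import unicodedata
from enum import Enum


class TextDir(Enum):
    LTR = 0
    RTL = 1
    MIXED = 2


# Strong right-to-left Bidi_Class values (Hebrew-like 'R', Arabic-like 'AL').
_STRONG_RTL = {"R", "AL"}


def get_bidi_text_dir(text, start, end, base=TextDir.LTR):
    """Hypothetical approximation of the proposed Range.getBiDiTextDir().

    Only strong characters are considered; weak and neutral characters
    (digits, spaces, punctuation) are ignored. If the range contains no
    strong characters, the assumed base direction is returned.
    """
    dirs = set()
    for ch in text[start:end]:
        cls = unicodedata.bidirectional(ch)
        if cls == "L":
            dirs.add(TextDir.LTR)
        elif cls in _STRONG_RTL:
            dirs.add(TextDir.RTL)
    if len(dirs) == 2:
        return TextDir.MIXED
    if dirs:
        return dirs.pop()
    return base


text = "abc \u05d0\u05d1\u05d2"
whole = get_bidi_text_dir(text, 0, len(text))   # spans both scripts
latin = get_bidi_text_dir(text, 0, 3)           # "abc" only
hebrew = get_bidi_text_dir(text, 4, 7)          # Hebrew letters only
print(whole, latin, hebrew)
```

Because a one-character range can contribute at most one strong direction, this sketch never returns MIXED for a single character, matching the behavior described above. A caller that receives MIXED can narrow the range and query again.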
I have no idea if this is the best API design but I’m just hoping to get across what would be useful to us.
For context, the reason this is important to us is that while HTML has the concept of “neutral text direction” (whenever the CSS property direction: ltr or direction: rtl is NOT specified), some editors like Word do not. Every run is marked as either LTR or RTL. If we can’t correctly detect the direction of the text as it’s being typed, then we may save it incorrectly into the document, and Word will display the text direction differently from what the user saw as they were typing in the browser. There are also some performance optimizations we could make around text selection if we knew where in a paragraph the text changed direction, but this is a less compelling reason than content fidelity.