Khmer shaping does not match Uniscribe or CoreText #667
Brought up by @mcdurdin at https://twitter.com/MarcDurdin/status/941516844195749888
Handling of consonant shifters can differ from font to font, which is why the Android test with an older version of Noto Khmer showed everything rendering exactly the same. The new version of Noto Khmer doesn't allow that to happen (only 7-12 above look correct, even though they aren't), but either way the shaper shouldn't allow every one of these to shape the same.
@MakaraSok is currently working on providing additional data for this issue. Due to some ambiguity in the Unicode and OpenType specifications for Khmer, it is not immediately clear which combinations should be permitted and which should be blocked, especially at the boundary between syllables. It is also important to consider how minority languages use the Khmer script, so that any fix does not block them.
There are also 12 more basic examples for this syllable (screenshot from presentation):
added a commit
Jan 5, 2018
The Khmer encoding was designed to have only one matra per syllable, despite having commonality with Thai writing habits in that respect. There is likely a looming issue with the Khom script (i.e. the variety of the Khmer script used in Thailand for Pali and some writing in the Tai vernaculars), as all three of the apparent sign combinations <E, [I, II, Y]> have been in use at some time since the middle of the 19th century, and it is likely that they have been used for a 2-way length contrast, as in the Lao Lao script writing system. There has also been variation in Khmer, though the temptation is to dismiss that as glyph variation.
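To make the "one matra per syllable" point concrete, here is a rough Python sketch that flags clusters carrying more than one dependent vowel sign, such as <E, I>. The code point ranges are taken from the Unicode Khmer block; the cluster splitting is a deliberately naive assumption (a new cluster starts at any consonant or independent vowel not preceded by COENG) and is not how HarfBuzz, Uniscribe, or CoreText segment text. The function names are hypothetical.

```python
COENG = 0x17D2  # KHMER SIGN COENG, introduces a subscript consonant


def is_base(cp):
    # Consonants U+1780..U+17A2 and independent vowels U+17A3..U+17B3.
    return 0x1780 <= cp <= 0x17B3


def is_vowel_sign(cp):
    # Dependent vowel signs (matras) U+17B6..U+17C5.
    return 0x17B6 <= cp <= 0x17C5


def split_clusters(text):
    # Naive segmentation (an assumption for illustration only): a base
    # letter starts a new cluster unless it directly follows COENG.
    clusters = []
    prev = None
    for ch in text:
        cp = ord(ch)
        if not clusters or (is_base(cp) and prev != COENG):
            clusters.append(ch)
        else:
            clusters[-1] += ch
        prev = cp
    return clusters


def clusters_with_multiple_matras(text):
    # Return every cluster that contains more than one vowel sign.
    return [
        cluster
        for cluster in split_clusters(text)
        if sum(is_vowel_sign(ord(ch)) for ch in cluster) > 1
    ]


if __name__ == "__main__":
    # KA + VOWEL SIGN E + VOWEL SIGN I: one cluster, two matras.
    for cluster in clusters_with_multiple_matras("\u1780\u17C1\u17B7"):
        print([f"U+{ord(ch):04X}" for ch in cluster])
```

A check like this only reports where such sequences occur; whether a shaper should block them, insert a dotted circle, or render them is exactly the open question here, given the Khom usage described above.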