Text layout should refer to character indices, not byte offsets #215
Comments
This has been fixed for a while now.
In general, all text layout should be expressed in terms of character offsets, and should consult (and store data in) the glyph store to determine ligatures, clusters, and line breaks. Line breaking is currently done on whitespace in the original string (~[u8]), which misplaces breaks whenever there is not a 1-to-1 correspondence between bytes, glyphs, and/or characters.
This should fix bad line breaking, invalid indexing into the glyph store, and unblock rendering of multi-byte characters. A Japanese test case is src/tests/mojira.html.
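
To illustrate the mismatch described above, here is a minimal sketch in current Rust (not the project's actual code; `whitespace_positions` is a hypothetical helper). It reports each whitespace break opportunity both as a byte offset and as a character index, so the two can be compared for multi-byte text. Note that "character" here means a Unicode scalar value, which is still not the same thing as a glyph or a cluster.

```rust
/// Hypothetical helper: returns (byte_offset, char_index) for every
/// whitespace character in `text`.
fn whitespace_positions(text: &str) -> Vec<(usize, usize)> {
    text.char_indices() // yields (byte offset, char)
        .enumerate() // adds the running character index
        .filter(|(_, (_, c))| c.is_whitespace())
        .map(|(char_idx, (byte_idx, _))| (byte_idx, char_idx))
        .collect()
}

fn main() {
    // Each Japanese character below occupies 3 bytes in UTF-8,
    // so byte offsets and character indices diverge immediately.
    let text = "日本語 テキスト";
    for (byte_idx, char_idx) in whitespace_positions(text) {
        println!(
            "break opportunity: byte offset {}, character index {}",
            byte_idx, char_idx
        );
    }
}
```

For this string the single space sits at byte offset 9 but character index 3, so using the byte offset to index a per-character glyph store would point at the wrong entry or run past the end.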