Can DOM ranges split grapheme clusters and surrogate pairs? #933

Open
xfq opened this issue Dec 12, 2020 · 1 comment
Labels
i18n-needs-resolution Issue the Internationalization Group has raised and looks for a response on.

Comments


xfq commented Dec 12, 2020

https://dom.spec.whatwg.org/#ranges

For Text nodes, the offset of a boundary point appears to be based on UTF-16 code units rather than grapheme clusters, so a range can split a surrogate pair.
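To make the concern concrete, here is a minimal sketch at the string level, assuming a Text node's data behaves like an ordinary JavaScript string of UTF-16 code units. The helper name `splitsSurrogatePair` is hypothetical and is not part of the DOM or of any specification.

```javascript
// Hypothetical helper (not a DOM API): reports whether `offset` falls between
// the high and low halves of a surrogate pair in `data`.
function splitsSurrogatePair(data, offset) {
  if (offset <= 0 || offset >= data.length) return false;
  const before = data.charCodeAt(offset - 1);
  const after = data.charCodeAt(offset);
  const isHigh = before >= 0xd800 && before <= 0xdbff;
  const isLow = after >= 0xdc00 && after <= 0xdfff;
  return isHigh && isLow;
}

// "a", then U+1F600 (stored as two code units), then "b".
const data = "a\u{1F600}b";

console.log(data.length);                  // 4 (code units, not code points)
console.log(splitsSurrogatePair(data, 1)); // false: boundary before the pair
console.log(splitsSurrogatePair(data, 2)); // true: boundary inside the pair
console.log(splitsSurrogatePair(data, 3)); // false: boundary after the pair
```

In a browser, a call such as `range.setStart(textNode, 2)` on a Text node holding this data would, as far as I can tell from the spec, be accepted, because the only check on the offset is that it does not exceed the node's length.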

It would be useful to add a note to remind web developers and spec writers (css-highlight-api, for example) that grapheme clusters and surrogate pairs might be split, preferably with an example. The note should contain a strong warning against splitting grapheme clusters and surrogate pairs.
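An example for such a note could look roughly like the sketch below. It assumes an environment that supports `Intl.Segmenter` from the ECMAScript Internationalization API. The helper names `graphemeBoundaries` and `splitsGraphemeCluster` are hypothetical, and the exact cluster boundaries depend on the Unicode data the engine ships.

```javascript
// Hypothetical helper: collects every code unit offset at which a grapheme
// cluster starts or ends, according to Intl.Segmenter.
function graphemeBoundaries(data) {
  const segmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });
  const boundaries = [0];
  for (const { index, segment } of segmenter.segment(data)) {
    boundaries.push(index + segment.length);
  }
  return boundaries;
}

// Hypothetical helper: true when `offset` lands inside a grapheme cluster.
function splitsGraphemeCluster(data, offset) {
  return !graphemeBoundaries(data).includes(offset);
}

// "e" followed by U+0301 COMBINING ACUTE ACCENT, then "x".
// No surrogates are involved, yet offset 1 still cuts a user-perceived character.
const text = "e\u0301x";

console.log(graphemeBoundaries(text));       // [0, 2, 3]
console.log(splitsGraphemeCluster(text, 1)); // true
console.log(splitsGraphemeCluster(text, 2)); // false
```

This shows why a surrogate-pair check alone is not enough: a boundary point can sit on a valid code point boundary and still divide a grapheme cluster.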

If possible, DOM should normatively prevent the splitting of surrogate pairs or make it non-conformant.

(This comment is part of a review on behalf of the W3C i18n WG.)

xfq added the i18n-needs-resolution label on Dec 12, 2020
annevk (Member) commented Dec 14, 2020

I'm somewhat supportive, but I also feel like this is something that should be pointed out in ECMAScript, if at all. I guess in theory one could expect the DOM to have created some higher level of abstraction, but I'm not sure why one would think that.

Speaking of surrogates, Safari seems to handle `document.body.append("\uD800", "\uDC00", "\uD800\uDC00");` somewhat poorly.
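For readers unfamiliar with the three arguments in that call, the sketch below inspects them as plain strings. It does not reproduce or explain the Safari behavior. It only shows, under the assumption that each string argument becomes its own Text node, that the first two nodes would each hold a lone surrogate while the third holds a complete pair. The helper name `hasLoneSurrogate` is hypothetical.

```javascript
// Hypothetical helper: true when `s` contains a surrogate code unit that is
// not part of a well-formed high/low pair.
function hasLoneSurrogate(s) {
  for (let i = 0; i < s.length; i++) {
    const unit = s.charCodeAt(i);
    if (unit >= 0xd800 && unit <= 0xdbff) {
      // High surrogate: it must be followed by a low surrogate.
      const next = s.charCodeAt(i + 1); // NaN at the end of the string
      if (next >= 0xdc00 && next <= 0xdfff) {
        i++; // skip the low half of a well-formed pair
        continue;
      }
      return true;
    }
    if (unit >= 0xdc00 && unit <= 0xdfff) return true; // unpaired low surrogate
  }
  return false;
}

const pieces = ["\uD800", "\uDC00", "\uD800\uDC00"];

console.log(pieces.map(hasLoneSurrogate));               // [ true, true, false ]
console.log(hasLoneSurrogate(pieces[0] + pieces[1]));    // false
console.log(pieces.join("").codePointAt(0).toString(16)); // "10000"
```

Concatenated, the first two strings form U+10000, so the rendered result arguably depends on whether an engine treats adjacent Text nodes as one run of text.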
