[css-text-3] line-break property is not CJK-specific #1252
Comments
Can you define the rules that ICU applies or point at a reference? We already allow UAs to do anything appropriate, but we have some mandated rules for CJK. We can add some for Finnish. |
ICU segmentation is implemented using language-specific manifest files, where each language can describe whether or not it supports the For example, the file for en-US says that it doesn't have any special On the other hand, the file for Finnish says that it has special handling for loose, normal, and strict behaviors. Those manifest files reference rules defined in these rules files. These rules files get compiled at build-time into bytecode which automatically gets loaded and interpreted at runtime. |
@litherum I can't reference a pile of cryptic code files, particularly ones that no longer seem to exist. Can you explain specifically what behavior you want to add to the spec? |
The spec says
And then immediately says
ICU moved the previous Finnish-language behavior into all languages, for all line-breaking modes, so this specific example is no longer relevant. However, ICU line breaking rules are different for all locales depending on whether or not you're in One example is that, in all locales, a series of adjacent U+2024 ONE DOT LEADER characters won't have line break candidates between them, but in loose line breaking mode, they do. This is true for all characters with the IN ("Inseparable") line-breaking property, which are U+2024 ONE DOT LEADER, U+2025 TWO DOT LEADER and U+2026 HORIZONTAL ELLIPSIS. My recommendation is to remove the text about "only CJK codepoints are affected". |
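For readers who want to see the behavior described in the comment above in concrete terms, here is a minimal Python sketch. It is a toy model, not ICU's rule set or API: the function name `break_candidates_between_inseparables` and the `mode` parameter are hypothetical, and the only rule modeled is the one stated in the comment (adjacent Inseparable characters stay together unless the mode is loose).

```python
# Toy model of the behavior described in the comment above.
# Assumption: this is NOT ICU code; it models only the rule for pairs of
# adjacent IN ("Inseparable") characters and ignores every other rule.

# The three characters the comment lists as having the IN property.
INSEPARABLE = {
    "\u2024",  # ONE DOT LEADER
    "\u2025",  # TWO DOT LEADER
    "\u2026",  # HORIZONTAL ELLIPSIS
}


def break_candidates_between_inseparables(text, mode="normal"):
    """Return each index i where a break is allowed between text[i-1] and
    text[i], looking only at pairs of adjacent Inseparable characters."""
    if mode not in {"loose", "normal", "strict"}:
        raise ValueError("mode must be 'loose', 'normal', or 'strict'")

    candidates = []
    for i in range(1, len(text)):
        both_inseparable = text[i - 1] in INSEPARABLE and text[i] in INSEPARABLE
        # Per the comment: no candidate between adjacent IN characters,
        # except in loose mode, where a candidate is allowed.
        if both_inseparable and mode == "loose":
            candidates.append(i)
    return candidates


if __name__ == "__main__":
    sample = "a\u2026\u2026b"
    for mode in ("strict", "normal", "loose"):
        print(mode, break_candidates_between_inseparables(sample, mode))
```

The point of the sketch is only that the difference between modes here involves no CJK code points at all, which is the source of the confusion about the note.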
That text is in a note, and refers only to the requirements listed above. In those requirements, when the language is unspecified, only CJK codepoints are indeed affected. Additionally, there is normative text saying "The precise set of rules in effect for each of loose, normal, and strict is up to the UA and should follow language conventions." Also, the very next sentence after the note where you ask for "only CJK codepoints are affected" to be removed is "As UAs can add additional distinctions between strict/normal/loose modes, these values can exhibit other differences as well." So, from a normative standpoint, what you want to allow is already allowed. From an editorial standpoint, I'd also argue that it is already clear, and I disagree that removing "only CJK codepoints are affected" is a good idea, as it would make that note nonsensical. With all that considered, if you have an editorial improvement to suggest, I think we should consider it to limit future confusion, but otherwise, I would like to close as wontfix. |
That's fair. My proposal is for editorial, not for normative text. When reading the note, I was confused for two reasons:
|
@litherum I've reworded the note. Let me know if it's satisfactory? |
Merged, thanks! |
The line-break property is currently designed for CJK. However, for browsers which use ICU (such as WebKit and Blink, and possibly others), the property is applied to Finnish too. The spec should mention this.
cc @fantasai