[css-text-3] line-break property is not CJK-specific #1252

Closed
litherum opened this issue Apr 20, 2017 · 9 comments
Labels
Closed Accepted as Editorial Commenter Satisfied Commenter has indicated satisfaction with the resolution / edits. css-text-3 Current Work i18n-tracker Group bringing to attention of Internationalization, or tracked by i18n but not needing response. Testing Unnecessary Memory aid - issue doesn't require tests Tracked in DoC

Comments

@litherum
Contributor

The line-break property is currently designed for CJK. However, for browsers which use ICU (such as WebKit and Blink, and possibly others), the property is applied to Finnish too. The spec should mention this.

cc @fantasai

@tabatkins tabatkins added the css-text-3 Current Work label Apr 20, 2017
@fantasai
Collaborator

Can you define the rules that ICU applies or point at a reference? We already allow UAs to do anything appropriate, but we have some mandated rules for CJK. We can add some for Finnish.
http://drafts.csswg.org/css-text-3/#line-break-property

@litherum
Contributor Author

litherum commented Mar 8, 2018

ICU segmentation is implemented using language-specific manifest files, where each language can describe whether or not it supports the lb behavior.

For example, the file for en-US says that it doesn't have any special lb behavior.

On the other hand, the file for Finnish says that it has special handling for loose, normal, and strict behaviors.

Those manifest files reference rules defined in these rules files. These rules files get compiled at build-time into bytecode which automatically gets loaded and interpreted at runtime.

@xfq xfq added the i18n-tracker Group bringing to attention of Internationalization, or tracked by i18n but not needing response. label Mar 13, 2018
@fantasai
Collaborator

@litherum I can't reference a pile of cryptic code files, particularly ones that no longer seem to exist. Can you explain specifically what behavior you want to add to the spec?

@litherum
Contributor Author

The spec says

only CJK codepoints are affected, unless the text is marked as Chinese or Japanese, in which case some additional common codepoints are affected.

And then immediately says

a UA ... could choose to map different levels of strictness in Thai line-breaking to these keywords

ICU moved the previous Finnish-language behavior into all languages, for all line-breaking modes, so this specific example is no longer relevant.

However, ICU line breaking rules differ in all locales depending on whether or not you're in loose mode. See the differences between this file and this file.

One example is that, in all locales, a series of adjacent U+2024 ONE DOT LEADER characters won't have line break candidates between them, but in loose line breaking mode, they do. This is true for all characters with the IN ("Inseparable") line-breaking property, which are U+2024 ONE DOT LEADER, U+2025 TWO DOT LEADER and U+2026 HORIZONTAL ELLIPSIS.
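
To make the loose-mode difference concrete, here is a minimal Python sketch of that one rule. It is not ICU's rule syntax or API: the function name `in_break_candidates` and the `INSEPARABLE` set are hypothetical, the set contains only the three characters named above, and every other line-breaking rule is ignored.

```python
from __future__ import annotations

# Assumption: only the three IN ("Inseparable") characters named in the comment.
INSEPARABLE = {"\u2024", "\u2025", "\u2026"}

VALID_MODES = {"loose", "normal", "strict"}


def in_break_candidates(text: str, mode: str) -> list[int]:
    """Return offsets i where a break candidate sits between text[i-1] and text[i].

    Only pairs of adjacent IN characters are considered. Per the behavior
    described above, such pairs get a break candidate in loose mode only.
    """
    if mode not in VALID_MODES:
        raise ValueError(f"unknown line-break mode: {mode!r}")
    if mode != "loose":
        # normal and strict: no break candidate between adjacent IN characters
        return []
    return [
        i
        for i in range(1, len(text))
        if text[i - 1] in INSEPARABLE and text[i] in INSEPARABLE
    ]
```

With three U+2026 characters in a row, this sketch reports candidates at offsets 1 and 2 for `"loose"` and none for `"normal"` or `"strict"`. None of that depends on the content language.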

My recommendation is to remove the text about "only CJK codepoints are affected".

@litherum litherum changed the title [css-text-3] line-break property should mention Finnish language [css-text-3] line-break property is not CJK-specific Sep 16, 2018
@frivoal
Collaborator

frivoal commented Oct 2, 2018

@litherum

My recommendation is to remove the text about "only CJK codepoints are affected".

That text is in a note, and refers only to the requirements listed above. In those requirements, when the language is unspecified, only CJK codepoints are indeed affected. Additionally, there is normative text saying "The precise set of rules in effect for each of loose, normal, and strict is up to the UA and should follow language conventions." Also, the very next sentence after the note where you ask for "only CJK codepoints are affected" to be removed is "As UAs can add additional distinctions between strict/normal/loose modes, these values can exhibit other differences as well."

So, from a normative standpoint, what you want to allow is already allowed. From an editorial standpoint, I'd also argue that it is already clear, and I disagree that removing "only CJK codepoints are affected" is a good idea, as it would make that note nonsensical.

With all that considered, if you have an editorial improvement to suggest, I think we should consider it to limit future confusion, but otherwise, I would like to close as wontfix.

@litherum
Contributor Author

litherum commented Nov 1, 2018

That text is in a note, and refers only to the requirements listed above. In those requirements, when the language is unspecified, only CJK codepoints are indeed affected.

That's fair. My proposal is for editorial, not for normative text.

When reading the note, I was confused for two reasons:

  1. It says non-CJK text doesn't make a distinction among the levels of strictness, except for the C and J cases, which is 2/3 of CJK.
  2. It's unclear whether or not "the requirements listed above" includes "The precise set of rules in effect for each of loose, normal, and strict is up to the UA." That sentence is, after all, above the note.

@fantasai
Collaborator

fantasai commented Dec 4, 2018

@litherum I've reworded the note. Let me know if it's satisfactory?

@litherum
Contributor Author

#3420

@fantasai
Collaborator

Merged, thanks!

@fantasai fantasai added Commenter Satisfied Commenter has indicated satisfaction with the resolution / edits. and removed Commenter Response Pending labels Dec 11, 2018
@frivoal frivoal added the Testing Unnecessary Memory aid - issue doesn't require tests label Apr 25, 2019