diff --git a/css-text-3/Overview.bs b/css-text-3/Overview.bs
index 88bbc7d3792..53968540d3b 100644
--- a/css-text-3/Overview.bs
+++ b/css-text-3/Overview.bs
@@ -2048,13 +2048,18 @@ Order of Operations
 white-space-processing-018.xht
-

For other values of 'white-space', segment breaks are collapsible. - Any collapsible segment break immediately following another collapsible segment break +

For other values of 'white-space', segment breaks are collapsible, + and are collapsed as follows: + +

    +
  1. First, any collapsible segment break immediately following another collapsible segment break is removed. - Then any remaining segment break is +
  2. Then any remaining segment break is either transformed into a space (U+0020) or removed - depending on the context before and after the break: + depending on the context before and after the break. + The rules for this operation are UA-defined in this level. +
  3. Otherwise, the [=segment break=] is converted to a space (U+0020). @@ -2183,7 +2187,6 @@ Order of Operations - - Note: The white space processing rules have already + + + ISSUE(5086): Should space-discarding punctuation have a stronger influence over mismatched before/after contexts? + + ISSUE(5017): Should we classify punctuation and/or symbols as a category of space-ambiguous characters? (Currently spaces are discarded only if both sides are space-discarding; ambiguous characters would defer to the other side.) + +CUT SEGMENT BREAK TRANSFORM --> + + Note: The white space processing rules have already removed any [=tabs=] and [=spaces=] around the [=segment break=] - before these checks take place. + before this context is evaluated. +
The purpose of the segment break transformation rules @@ -2210,9 +2221,10 @@ Order of Operations Here is an English paragraph that is broken into multiple lines in the source code so that it can - more easily read in a text editor. + be more easily read and edited + in a text editor. -

Here is an English paragraph that is broken into multiple lines in the source code so that it can be more easily read in a text editor.

+

Here is an English paragraph that is broken into multiple lines in the source code so that it can be more easily read and edited in a text editor.

Eliminating a line break in English requires maintaining a [=space=] in its place.
@@ -2233,21 +2245,16 @@ Order of Operations - The segment break transformation rules thus use adjacent context + The segment break transformation rules can use adjacent context to either transform the segment break into a space or eliminate it entirely.
-

Comments on how well these rules would work in practice would - be very much appreciated, particularly from people who work with - Thai and similar scripts. - Note that browser implementations do not currently follow these rules consistently - (although IE does in some cases transform the break, - and Firefox follows the first two bullet points).

- - ISSUE(5086): Should space-discarding punctuation have a stronger influence over mismatched before/after contexts? - - ISSUE(5017): Should we classify punctuation and/or symbols as a category of space-ambiguous characters? (Currently spaces are discarded only if both sides are space-discarding; ambiguous characters would defer to the other side.) + Note: Historically, HTML and CSS have unconditionally converted [=segment breaks=] to spaces, + which has prevented content authored in languages such as Chinese + from being able to break lines within the source. + Thus UA heuristics need to be conservative about where they discard [=segment breaks=] + even as they strive to improve support for such languages.
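
The collapsing steps introduced by this change can be illustrated with a minimal, non-normative sketch. It assumes that a segment break is represented by a single `"\n"` and that tabs and spaces around each break have already been removed. The names `transform_segment_breaks`, `should_discard`, and `both_cjk` are hypothetical and appear nowhere in the specification. In particular, `both_cjk` is only one conceivable conservative UA heuristic, since the actual rules are UA-defined in this level.

```python
import re
from typing import Callable, Optional

# Hypothetical hook: receives the character before and after a segment break
# (empty string at either end of the text) and returns True to discard the break.
DiscardRule = Callable[[str, str], bool]


def both_cjk(before: str, after: str) -> bool:
    """Assumed example heuristic: discard only between two CJK Unified Ideographs."""
    def is_cjk(ch: str) -> bool:
        return "\u4e00" <= ch <= "\u9fff"
    return is_cjk(before) and is_cjk(after)


def transform_segment_breaks(
    text: str, should_discard: Optional[DiscardRule] = None
) -> str:
    # Step 1: a segment break immediately following another one is removed.
    collapsed = re.sub(r"\n+", "\n", text)

    out = []
    for i, ch in enumerate(collapsed):
        if ch != "\n":
            out.append(ch)
            continue
        before = collapsed[i - 1] if i > 0 else ""
        after = collapsed[i + 1] if i + 1 < len(collapsed) else ""
        # Step 2: the UA may remove the break depending on its context.
        if should_discard is not None and should_discard(before, after):
            continue
        # Step 3: otherwise the break is converted to a space (U+0020).
        out.append(" ")
    return "".join(out)
```

Called without a rule, the function reproduces the historical behavior of always converting a break to a space. Passing `both_cjk` removes a break only when both neighbors are ideographs, so mixed contexts still fall back to a space.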

Tab Character Size: the 'tab-size' property

@@ -5921,6 +5928,7 @@ Characters and Properties but take their other properties from the first combining character in the sequence. + +CUT SEGMENT BREAK TRANSFORM --> -

Appendix G. +

Appendix F. Identifying the Content Writing System

This appendix is normative.

@@ -6187,7 +6195,7 @@ Identifying the Content Writing System Note: Mere omission of the [=writing system=] information when the [=content language=] is declared means that the [=writing system=] is implied, not unknown. -

Appendix H. +

Appendix G. Small Kana Mappings