[css-text-3] Switch line-breaking handling of atomic inlines #4949

fantasai · 2020-04-14T01:03:32Z

CSS Text tried to define that atomic inlines behave like ID characters with respect to line breaking (e.g., a break between an atomic inline and a closing parenthesis or comma is forbidden). We had to alter that to allow breaks between atomic inlines and nbsp due to compat. More recently, #4576 found that there were sites depending on the "always breakable" behavior of atomic inlines with punctuation as well. This means that, by default, breaks must always be allowed before and after an atomic inline, regardless of the adjacent character.

The problem is that this is an unnatural line-breaking pattern for things like emoticons, gaiji, or other images that are intended to behave like text. We need some way to switch atomic inlines into the text-like (ID) mode.

Two proposals were raised on the thread in #4576:

1. Re-use the inheritable line-break property for this purpose: values other than auto (the initial value) treat atomic inlines like ID.
2. Introduce a new value for the non-inherited wrap-before and wrap-after properties in CSS Text Level 4 to make this distinction.
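
As a rough sketch of what the first proposal would mean for authors: the `.prose` class name is hypothetical, and the ID-like treatment of atomic inlines is the proposed behavior, not something this stylesheet is known to trigger in browsers today.

```css
/* Hypothetical container for running text that includes inline gaiji images. */
.prose {
  /* Proposal 1: any value other than the initial `auto` (here `strict`)
     would additionally make atomic inlines inside .prose break like class ID,
     e.g. no break between an image and a following ")" or ",". */
  line-break: strict;
}
```

Because line-break is inherited, this would apply to every atomic inline inside the container, and it also changes the strictness of the text's own line breaking.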

A third option would be to introduce yet another line-breaking property dedicated to this problem. A fourth option is not to solve this at all. (Personally, I don't favor either of these: we have way too many line-breaking controls already, and I do think this is a problem worth solving.)

What do we want to do here?

fantasai · 2020-04-14T01:04:29Z

@frivoal opened PR #4755 for the first option, fwiw.

kojiishi · 2020-04-14T07:55:00Z

My preference is 3rd atm.

I'm also fine with the 4th: defer until we hear from authors who actually want it. We have already heard some requests to customize line-breaking behavior. By deferring, we will have the opportunity to design features that can satisfy more of those requests.

litherum · 2020-04-14T15:36:52Z

I have a slight preference for 2 because these existing properties are already designed to control line breaking around atomic inlines, so they are a natural place to put this functionality.

fantasai · 2020-04-14T19:29:42Z

@litherum Note, the wrap properties control line breaking around any inline, not just atomic inlines.

kojiishi · 2020-04-20T05:15:01Z

I'm not confident that wrap-before is the right way to go. Defining or disallowing break opportunities at element boundaries is a complex problem.

I would prefer to customize the line-breaking class of a character (or, in this case, of a meta character that represents atomic inlines) and then use the UAX#14 rules as is. If that's not good enough, we can consider customizing the UAX#14 rules. But I'm hesitant to define behavior at element boundaries.

@jfkthame I think what I described above is the same as what you described in some other issues about line breaking behavior at element boundaries, but I'm not certain if I understood your opinion there. WDYT about this issue?

jfkthame · 2020-04-20T10:15:27Z

I started writing a long comment here that I thought was going to be in support of PR #4755, but by the end I found myself thinking that maybe the wrap-* properties are the right solution. So I guess I'm leaning towards proposal 2.

Could we define additional values for wrap-* that don't directly force or prohibit a break, but instead override the line-breaking class of the inline? So wrap-before: ideographic would mean the element behaves as class ID for the purpose of determining whether a break is allowed before it, etc.

So the initial values wrap-before: auto; wrap-after: auto; would give the web-compatible legacy "always breakable" behavior for an inline image; wrap-before: ideographic; wrap-after: ideographic; would give the behavior CSS Text tried to specify as default (without needing any specific line-break value); and other values could be defined to correspond to other line-breaking classes if there's any demand for them.

(It'd be nice to have a shorthand that sets both wrap-before and wrap-after to a single value: img { wrap: ideographic; } to opt in to the ID-like behavior. But maybe wrap is too short and non-specific, given how many assorted line-breaking/wrapping controls we have.)
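
Spelled out as a stylesheet, the suggestion above might look like the following. This is only a sketch: the `ideographic` value and the `wrap` shorthand are proposals from this comment rather than specified features, and the `gaiji` and `emoticon` class names are hypothetical.

```css
/* Initial values: the web-compatible legacy "always breakable" behavior. */
img {
  wrap-before: auto;
  wrap-after: auto;
}

/* Proposed opt-in: the image is treated as class ID when deciding
   whether a break is allowed before it and after it. */
img.gaiji {
  wrap-before: ideographic;
  wrap-after: ideographic;
}

/* The shorthand floated above, setting both sides at once.
   The name `wrap` is not settled. */
img.emoticon {
  wrap: ideographic;
}
```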

kojiishi · 2020-04-30T03:43:38Z

I'm good with that too if wrap-* becomes what @jfkthame suggested. One minor thing though: can it be wrap-before: 'A' to make it "wrap like the character 'A'", instead of mapping to the line-breaking class? This is easier for the ICU line breaker to handle.

fantasai · 2020-05-06T14:48:34Z

Maybe wrap-as for the shorthand?

fantasai · 2020-05-06T14:50:11Z

I think picking a character to emulate can be an implementation detail; as long as the UA picks a character from the correct category, there's no difference in behavior.

jfkthame · 2020-05-06T14:51:09Z

wrap-as sounds workable to me.

css-meeting-bot · 2020-05-06T15:06:19Z

The CSS Working Group just discussed Switch line-breaking handling of atomic inlines.

RESOLVED: add "wrap-as" and values, details TBD later

The full IRC log of that discussion

<astearns> topic: Switch line-breaking handling of atomic inlines
<astearns> github: https://github.com//issues/4949
<fremy> fantasai: we had defined atomic inlines to work like ideographic characters
<fremy> fantasai: but that is unfortunately not web compatible
<fremy> fantasai: even if this would be a nicer behavior
<fremy> fantasai: but since forever, atomic inlines have allowed breaking opportunities
<fremy> fantasai: so we accepted our fate
<fremy> fantasai: but there are use cases for the correct behavior though
<fremy> fantasai: so there was a question of how to switch to that behavior
<fremy> fantasai: line-break not being auto ===> atomic treated as ID
<fremy> fantasai: another option: wrap-before/after to control wrapping before a particular inline, so you could have values to prevent/avoid
<fremy> fantasai: one of them could be this smart behavior
<fremy> fantasai: so, do we want to introduce a switch of behavior toggle
<fremy> fantasai: and if so, which option?
<fremy> fantasai: an issue would be that this won't be very visible to most languages
<fremy> fantasai: and koji was afraid some people might set it, then have big effects for CJK languages
<fremy> fantasai: the other option is more targeted
<fantasai> s/big/subtle/
<myles> q+
<fremy> fantasai: but it has the downside that you have to target each element independently
<astearns> q?
<astearns> ack fantasai
<Zakim> fantasai, you wanted to ask if 'contain:layout' trapping scroll snapping is actually what we wnt
<fremy> florian: one other issue is that the line breaking properties currently don't exist anywhere
<fremy> florian: so adding new behavior to them is wishful thinking
<koji> +q
<xfq_> ack my
<fremy> myles: in all the ebooks that use images-as-text I have seen, they use a class on these images
<astearns> ack myles
<faceless2_> +1 to myles
<fremy> myles: so the rule to target them all is very easy
<astearns> ack koji
<fremy> koji: in the github issue, we said it's fine with the property, but we want a different feature
<fremy> koji: the proposal was to pretend that atomic inline was a line-breaking class
<fremy> koji: and as we discussed in other issues, we have to resolve the ambiguity between elements boundaries
<fremy> koji: and maybe that should be discussed in that context
<fremy> koji: I like that idea that was proposed on github
<fremy> koji: I talked to ICU people to see if that would be possible
<fremy> koji: but that didn't get an approval
<faceless2_> q+
<fremy> koji: so they suggested to pick a specific character instead
<fremy> fantasai: I'm fine with selecting one specific character we consider to be representative of ID
<fremy> fantasai: it would be confusing for people to have to pick on char
<fremy> fantasai: the mapping can be implementation detail
<fantasai> s/on/a/
<fremy> koji: i agree
<astearns> ack faceless2_
<fremy> faceless2: I agree with koji, that proposal is quite flexible
<fremy> florian: if this means we are going to prioritize implementing these properties, I agree
<fremy> florian: but this is a very useful case for us
<fremy> florian: and just pushing it to a new level doesn't do much for us
<faceless2_> We've implemented already I believe.
<faceless2_> pending testing, of course...
<fremy> myles: priority of the feature > stage of the spec
<fremy> myles: we should design the feature well, not worry too much about which spec level we put things in
<fantasai> +1
<fremy> florian: yes, but what we are wanting to do is tie this to a new property nobody implemented
<fremy> florian: and we don't know if that property itself will survive or still function in the same way
<fremy> myles: I think it's true, but if this happens, we can revisit later
<fremy> astearns: I agree with myles here
<fremy> astearns: also, it's very separate to how line-break works today
<fremy> astearns: this extra switch doesn't sound like very good design to me
<fremy> florian: ok, I rescind my earlier comment
<fremy> astearns: sounds like we are in agreement to resolve to add one more value to wrap-before/after, which would specify which character we want to emulate
<fremy> astearns: is that correct?
<fremy> faceless2_: does that make sense as a single property?
<fremy> koji: yes, maybe we want only one property, a "wrap" shorthand
<fremy> fantasai: but he also mentioned that it was rather non-specific as a name
<fremy> fantasai: and could be confusing
<fremy> fantasai: also, this wouldn't encompass "wrap-inside"
<fremy> fantasai: but maybe "wrap-as: ideographic"
<fremy> koji: I like that naming
<fremy> koji: maybe we can have different ideas
<fremy> koji: but one nice thing is if you apply on an inline box, we can have each side apply to the first/last character of the inline
<fremy> fantasai: yes
<fremy> florian: I don't like wrap-as: avoid
<fremy> florian: maybe wrap-outside: ideographic/avoid ?
<fremy> fantasai: I like that
<fremy> fantasai: I am worried about changing the class of the chars though
<fremy> fantasai: because it also affects the breaking between first and second
<fremy> fantasai: so I would say "for the purpose of breaking before" the first character
<fremy> fantasai: (abc) + wrap-outside: avoid should not affect breaking between a and b
<fremy> koji: not sure I see what is wrong
<fremy> fantasai: because that is affecting the inside of the element
<fremy> fantasai: while we are trying to change the behavior outside
<fremy> koji: yeah i understood correctly
<fremy> koji: I have use case for that I think
<fremy> koji: elements never break, unless it's inline block
<fremy> myles: but this issue is about atomics?
<fantasai> https://www.w3.org/TR/css-text-4/#wrap-before
<fremy> fantasai: yeah but wrap-before applies to inlines too
<fremy> fantasai: so we need to define an effect for them as well
<fremy> koji: hence what I proposed
<fremy> fantasai: then I would prefer another property
<fremy> fantasai: I really don't find the proposal to change the breaking inside for changing the behavior outside
<fremy> astearns: and that would allow combinations too?
<fremy> fantasai: yes, but there is no combination that makes sense
<fremy> fantasai: (flex is special, and the others don't care about character class)
<fremy> fantasai: but if that's not possible to implement
<fremy> fantasai: then we need another property
<fremy> myles: yes, it's worth talking about implementability
<fremy> myles: when we compute the line breaking opportunities, we have a big string, and opportunities
<fremy> myles: the model we propose with before/after is not compatible with how line breakers work today
<fremy> myles: so I am in favor of a single property that works on both sides
<faceless2_> q+
<fremy> astearns: if it doesn't really make sense to have separate switches for both sides
<fremy> astearns: then a new property that affects both is better
<faceless2_> a &#0a; b
<astearns> ack faceless2_
<fremy> astearns: correct?
<fremy> faceless2_: we had one use case where this didn't apply to an atomic inline
<fremy> faceless2_: (...)_
<fremy> fantasai: yeah, I don't think we were proposing to remove the properties altogether
<fremy> fantasai: just that for the specific use case of atomic inlines, we should have a separate one
<faceless2_> My example above was a case where suppressing line-breaking before a non-atomic inline was useful - in that example we would want to prevent the break before the , due to the force break inside it.
<fremy> astearns: ok, so what I am hearing is support for "wrap-as" with values for atomic inline
<fremy> myles: and editors need to figure out interactions with the rest
<fremy> fantasai: I don't think it
<fremy> fantasai: .... is too difficult
<fremy> koji: what about the values? a string would be nice?
<fremy> fantasai: I am ok with the spec behavior described as that
<fremy> fantasai: but I would rather specify keywords
<fremy> fantasai: that would map to some specific strings
<fremy> myles: was the proposal for the string to be a single char?
<fremy> myles: or "ideographic"
<fremy> koji: no, the char between quotes
<fremy> myles: then I think I agree with florian and fantasai
<fremy> fantasai: and I don't think people will even see this behavior as using ideographic
<faceless2_> -1000 to nomal
<fremy> florian: "normal"?
<fremy> astearns: doesn't mean much to me though
<fremy> fantasai: I think it's decent name; "normal" is ID just because
<fremy> fantasai: it happens ID is the best char to map to to have the desired behavior
<fremy> astearns: proposed resolution is to add "wrap-as" and values, details TBD later
<fremy> astearns: RESOLVED: add "wrap-as" and values, details TBD later
<fremy> florian: level 3?
<fremy> fantasai: no ^_^

css-meeting-bot · 2020-05-06T15:16:02Z

The CSS Working Group just discussed Break.

The full IRC log of that discussion

<fremy> Topic: Break
<fantasai> I was thinking 'wrap-as: break-all | normal'
<fantasai> with break-all as the initial value
<fantasai> or something like that I guess it's not clear it only applies to objects
<fantasai> :/
<myles> i disagree with these names
<myles> we can discuss it in github i guess
<fantasai> myles, basically I think we should be clear with the initial value that it breaks everything
<fantasai> and that the other value is treating it as text-like
<myles> fantasai: how about `break-all | ideographic`
<fantasai> I don't like using ideographic because it sounds like the wrong thing to use for most people who will want it
<fantasai> It sounds like only CJK will want to use that value, but in fact it's useful in many more contexts...
<myles> i expect most people will want to use the break-all value
<fantasai> we didn't choose to emulate ID because of CJK, we chose to emulate ID because it happened to have the correct line-breaking behavior
<fantasai> myles, I don't think so
<fantasai> break-all is the default, but it doesn't give sensible behavior in running text
<fantasai> it breaks against nbsp
<fantasai> it breaks against )
<fantasai> it results in very awkward breaks if you actually use it in running text
<myles> right, most images are images. most images don't look like inline text
<myles> they should break on both sides by default
<jfkthame> advantage of `ideographic` is the clear mapping to the unicode line-break algorithm
<fantasai> jfkthame, yes, but that's helpful to implementers not to users :)
<jfkthame> we could use `ID` if you don't want it to sound so clearly CJK-ish
<myles> i think it's helpful to users. it tells them "what kind of text this image should behave as"
<fantasai> myles, most images aren't used as inline-level content in effect
<fantasai> myles, most people don't know about line-breaking rules for languages other than their own
<astearns> github: https://github.com//issues/4949
<fantasai> myles, ideographic is extremely cryptic
<jfkthame> in the event we add more values (e.g. like closing-punctuation, opening-punct, etc) we'll care about that mapping being clear
<myles> we may want to add "alphabetic" one day, and having it be `break-all | normal | alphabetic` doesn't make any sense
<fantasai> myles, to the extent that images are mixed just with other images, they will continue to break
<myles> right, and that's not a bug
<jfkthame> i fear that if we try to do something other than follow the unicode classes we may paint ourselves into an awkward corner
<fantasai> myles, to the extent that they're mixed with punctuation, they should follow kinsoku rules
<fantasai> myles, treating as ID does both of these things
<myles> only if they're supposed to be texty
<fantasai> myles, breaking "([image])" inside the parens is never ok
<myles> disagree
<TabAtkins> ScribeNick: TabAtkins
<myles> if the image is a picture of a tree
<myles> then i want it to break on both sides
<fantasai> why????
<TabAtkins> don't break the forest for the trees
<fantasai> that makes no sense
<myles> cause it doesn't look like text
<fantasai> you put it in parens
<fantasai> don't care what it looks like, I can't imagine anyone wanting that to break
<myles> that is how all browsers behave on all content today. hard to argue it isn't a sensible default
<fantasai> if you didn't put it in parens, whatever.
<TabAtkins> Yeah, having a ( at the end of a line, then the tree and ) at the start of the next line, seems like it would be broken-looking

litherum · 2020-05-06T15:22:41Z

I'd like to propose the syntax wrap-as: normal | ideographic where normal is the initial value.

kojiishi · 2020-05-07T16:12:19Z

> I'd like to propose the syntax wrap-as: normal | ideographic where normal is the initial value.

I like it. How about adding a few more? Values that prevent a break before or after might be useful for some types of images.

It would also be great to define whether the table cell width calculation quirk applies for values other than normal.

fantasai · 2020-05-14T23:44:14Z

I don't like it, because ideographic does not convey anything useful to authors who don't use CJK. We chose to match the ID class because it has the right behavior, not because we wanted to match CJK. This kind of naming is helpful to people who implement a line breaker, not to authors using the property.

fantasai · 2020-05-14T23:47:02Z

Alternate syntax: wrap-as: break-all | letter | word where 'letter' behaves like AL and 'word' behaves like ID.

'break-all' is the initial value for legacy reasons, and breaks against everything including nbsp. (I would not call the initial value's behavior "normal" except insofar as it's legacy behavior, it does very weird things when mixed with text. Breaking within "IMGnbspIMG" or "(IMG)" is very very weird.)
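
To make the alternate syntax concrete, here is a sketch of how an author might use it. The `wrap-as` property was only resolved as "details TBD", so these value names are the ones proposed in this comment and may not survive. The `letter-glyph` and `emoticon` class names are hypothetical.

```css
/* Proposed initial value: legacy behavior. Breaks are allowed on both
   sides of the image, even next to nbsp or inside "(IMG)". */
img {
  wrap-as: break-all;
}

/* An image standing in for a letter: proposed to behave like class AL. */
img.letter-glyph {
  wrap-as: letter;
}

/* A text-like image such as an emoticon: proposed to behave like class ID,
   so it follows the usual rules against adjacent punctuation. */
img.emoticon {
  wrap-as: word;
}
```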

kojiishi · 2020-05-15T07:27:38Z

I'm fine with either break-all or normal, but letter and word don't work well for scripts that do not use spaces to delimit words. Alphabetic and ideographic look more correct to me. @r12a, do you have suggestions?

As above, I would like to add open, close, and exclamation. Many chat apps use images for emoji, and exclamation mark images are common. There are ~40 classes in UAX#14; supporting all of them is probably too much, but these 3 are useful.

kojiishi · 2020-05-15T07:32:40Z

> I don't like it, because ideographic does not convey the useful things to authors who don't use CJK.

We can add an emoji alias if people outside CJK don't understand how ideographic would behave.

kojiishi · 2020-05-15T07:59:22Z

graphic-symbol, from wikipedia.

frivoal · 2023-10-20T01:26:52Z

@kojiishi Do you think we need the subtle differences between the CL, CP, and EX classes? Theoretically, I suppose we could come up with use cases for most UAX14 classes (as you could have picture-based representation of mostly anything), but in practice, it seems to me that one type of closing (probably based on CL) is good enough, and it is much simpler for authors to only need to deal with one.

fantasai added Agenda+ F2F css-text-3 Current Work css-text-4 labels Apr 14, 2020

fantasai mentioned this issue Apr 14, 2020

[css-text] Atomic inlines being equivalent to ID for line breaking is not web-compatible #4576

Closed

xfq added the i18n-tracker label Apr 14, 2020

plehegar mentioned this issue Apr 15, 2020

[css-text-3] Switch line-breaking handling of atomic inlines w3c/i18n-activity#883

Open

frivoal mentioned this issue May 6, 2020

[css-text-3] replaced elements and atomic inlines with non default line-break #4755

Closed

astearns removed the Agenda+ F2F label May 6, 2020

xfq mentioned this issue May 7, 2020

[css-text-3] replaced elements and atomic inlines with non default line-break w3c/i18n-activity#859

Closed

fantasai removed the css-text-3 Current Work label May 25, 2020

fantasai added the Needs Edits label Jun 15, 2023

fantasai added the Needs Thought label Oct 20, 2023

frivoal removed the Needs Edits label May 29, 2024
