[css-text-4] Allow for paragraph-level line breaking #672
There are a few algorithms which attempt to choose line breaks within a paragraph in order to maximize the beauty of the paragraph as a whole. Such algorithms try to do some subset of:
Obviously, it is impossible to satisfy all these desires simultaneously for arbitrary paragraphs. The exact algorithm should not be specced for these reasons:
Instead, there should be a way for a web author to opt-in to paragraph-level layout for beautiful paragraphs.
I'm not sure what the best mechanism for this is. Perhaps a new value to the text-wrap property? Perhaps a new property? Perhaps something else?
Do you think there is anything in the specification currently preventing browsers from implementing paragraph-level layout already? I know they don't, but if they could, maybe we don't need an opt in at all, and any browser willing to go through the necessary implementation complexity can just turn it on by default.
I think he meant it to be an opt-in, since turning this on will hit performance quite severely. Adobe InDesign has a concept of pluggable composers, and provides 4 different composers: combinations of English/Japanese and line/paragraph levels. AFAIU, the composer in InDesign provides the line breaker and the justification algorithm.
@kojiishi is correct. It would be incorrect to do this automatically for both performance and correctness. Opting everything in automatically would be a performance regression, and the web would be upset if we just started changing every line break on every page to be dramatically different.
Line breaks aren't controlled directly by CSS anyway - line breaks change browser to browser and platform to platform and that's OK. So I think it would be fine for browsers to incrementally improve their line breaking without an opt-in.
But I agree that the performance characteristics of full-paragraph composers are likely to require an opt-in. InDesign doesn't actually go full-paragraph, it only does a set number of lines at a time. I think browsers should be able to experiment with composers, and even disagree about what works best. So an opt-in would probably need to be pretty generic, perhaps a single-line composer as a default and a multi-line composer as the opt-in.
Point taken about performance. However I think if we introduce a switch, it should be between auto and on, not between off and on: CSS UAs generally do greedy line breaking as the default, and that's ok, but if an implementer wants to provide a better default (e.g. it is a print UA not concerned with performance, or it has a not-great-but-better-than-greedy algorithm that performs fine...), that should be allowed.
With that in mind:
Houdini should not be required to have beautiful paragraphs.
I also don't see a use case for
Adopting more customizability for smart paragraph breaking is totally reasonable in a later level, but it's too early to spec something like that in this level. The keyword "on" is forward compatible, though, because in later levels we could list additional keywords which would appear after the "on" value.
It sounds like Florian is leaning toward a new property. Therefore, the current proposal is:
paragraph-layout: auto | on
initial value: auto
Obviously we can bike-shed the name too.
Right, that's kind of what I have in mind. At the same time, your suggestion to use text-wrap is quite reasonable also, as it seems the separate property would not do anything when
So, to decide whether we need more values on the existing property or a new one:
All in all, I think I could go either way, but I now lean a bit more towards a value to text-wrap. And given that text-wrap has not been implemented anywhere yet I think, we can bikeshed the property name and existing values if needed to make everything fit better together.
I'd like to take my comments above "prefer a separate property" part back; i.e., I'm fine with a new value to
I admit I had a wrong understanding of how
So the block-level
I think we're all good.
"Smart" is not a good name because it doesn't have any semantic meaning. I like Koji's "multi-line" idea because it is more accurate than "paragraph" (because I expect most implementations will do this in a sliding window rather than for the entire paragraph)
So the new proposal is:
The conceptual difference between "balance" and "multi-line" is that "balance" is intended for titles, whereas "multi-line" is intended for body text.
The proposal sounds sane to me. I’d like also to make it clear that browsers should be allowed (but not required to) make
Basically, I think that the two values should have the following meaning:
Well, to do anything useful, sure, but since the value implies a UA defined algorithm, they can do whatever they want anyway, and we cannot test the difference. So putting must on a non testable statement isn't doing very much.
If an implementation does not support the value, it will fall back to whatever the cascade says it should fall back to, which may or may not be
Allowing implementations to support
It will probably be possible to come up with a very basic test - given a width constraint of about ten characters and some content that looks like:
a a a a a
a multi-line composer is going to choose to break before the last 'a' in order to avoid the short second line length.
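The kind of basic test described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the sample content (`"aaaa aaaa a"`), the 10-character width, and the cost function (sum of squared unused space per line, with the last line counted too) are all made up for this sketch, and the function names `greedy_breaks` and `balanced_breaks` are hypothetical. A real UA-defined composer could weigh things quite differently.

```python
from functools import lru_cache


def greedy_breaks(words, width):
    """First-fit: put each word on the current line if it still fits."""
    lines, current = [], []
    for word in words:
        candidate = current + [word]
        if current and len(" ".join(candidate)) > width:
            lines.append(" ".join(current))
            current = [word]
        else:
            current = candidate
    if current:
        lines.append(" ".join(current))
    return lines


def balanced_breaks(words, width):
    """Whole-paragraph search minimizing the sum of squared slack.

    Assumption for this sketch: every line, including the last one,
    is charged (width - line length) squared.
    """
    n = len(words)

    @lru_cache(maxsize=None)
    def best(i):
        # Best (cost, lines) for the suffix words[i:].
        if i == n:
            return (0, ())
        result = None
        for j in range(i + 1, n + 1):
            line = " ".join(words[i:j])
            # An overlong single word is allowed on a line by itself.
            if len(line) > width and j > i + 1:
                break
            slack = max(width - len(line), 0)
            rest_cost, rest_lines = best(j)
            total = slack * slack + rest_cost
            if result is None or total < result[0]:
                result = (total, (line,) + rest_lines)
        return result

    return list(best(0)[1])


sample = "aaaa aaaa a".split()
print(greedy_breaks(sample, 10))    # ['aaaa aaaa', 'a']
print(balanced_breaks(sample, 10))  # ['aaaa', 'aaaa a']
```

With this input the greedy breaker leaves a one-character second line, while the paragraph-level search breaks one word earlier, which is an observable difference a test could check for.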
And I disagree that we need to get a resolution on a conference call to add something to a working draft. I see enough consensus in this thread to make the change. Informing those in the group not following this thread that there's something new is worthwhile, but that's best done with the edits in place. Asking anyone to resolve on something that's not yet written down isn't fair.
I had a proposal adopted by CSS at one point to add a couple of properties for this... One was to say whether you expect text to be editable in the future. The reason for this is that some line-breaking algorithms are not suitable for interactive use. They might be slow, or, worse, adding a word might affect the position of the insertion point, moving it backwards or forwards by one or more (horizontal or vertical) lines. So you need a hint to say, although this text isn't marked as editable now, the app might change that, and if it does, don't reflow the page to make it editable! The other was to let designers specify a preferred algorithm and give parameters to it; this might be better left not done for now, because I think we need experiments. My own research in the past has suggested the best compromise in many circumstances is a modified first-fit that operates on an n-line window. This works massively better than Knuth-Plass for unattended operation because it doesn't have the poor edge-case behaviours of Knuth-Plass found e.g. in TeX.
Liam
If we want to follow through with this, we either need to define the initial value (
That's for the
For the fast-and-stable case, should we just leave it up to browsers, or require a particular approach, or leave it up to browsers and suggest a particular approach? If we suggest/require something, should it be greedy line breaking, or some variant of greedy line-breaking that allows for prioritization of soft wrap opportunities, or something else?
I don't yet have a strong opinion on what the best answer is, but I'm leaning toward
On Tue, 2016-11-29 at 17:23 -0800, Florian Rivoal wrote:
> > One was to say whether you expect text to be editable in the future.
> If we want to follow through with this, we either need to define the initial value (`wrap`) to be that value, or to have 3 values:
> - one that's guaranteed stable and fast (`wrap`?)
> - one that promises nice layout (`multi-line`?)
> - an initial value that lets the browser pick where it wants to be on the stability-performance-niceness spectrum (`auto`?)
I think that future-editability should probably not be conflated with wrapping. An alternate approach might be for HTML to offer values for content-editable that included "never" (read-only DOM subtree), "scripted" to mean it could happen, and "yesbaby". But that makes it harder to write an application for editing arbitrary documents. So previously I'd proposed a separate property for it. But yes, all I really care about here is that browsers can improve paragraph layout without screwing up Web editing applications.
My original proposal: https://lists.w3.org/Archives/Public/www-style/2013Mar/0183.html
As adopted: https://lists.w3.org/Archives/Public/www-style/2013Apr/0246.html
I didn't do this in the end mostly because of Houdini (and talking to people) but looks like it's time. What's important seems to be (1) authors can demand a stable algorithm for an element/subtree so that starting to edit doesn't cause a reflow; (2) page authors/designers can express that good line-breaking is important; (3) people can experiment with algorithms (e.g. polyfills, prefixed values) until we learn better what works.
There's been a lot of research in the past on this stuff, but the area of automatic layout (i.e. the page author isn't there to make changes and try again) combined with interactive screen display makes line-breaking very different than for paper, and very different from e.g. TeX's environment.
> > [...] modified first-fit that operates on an n-line window.
> That's for the `multi-line` case where we're trying to get the nicest result, right?
Yes. It's almost as fast as simple wrapping, slightly worse than linear on the number of words in the paragraph, but also slightly harder to code :-) as you need to handle n previous lines (an n as small as 2 or 3 makes a huge improvement over "first fit/greedy", though).
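One possible reading of an n-line window is sketched below. It is not the algorithm from the thread, which is never spelled out in detail; it is a guess at the general shape. The assumptions: the window is the set of words that first-fit would put on the next `n` lines, the best breaking inside the window minimizes the sum of squared unused space (every line counted), and only the first line of that breaking is committed before the window slides down. All function names are hypothetical.

```python
def greedy_line_end(words, start, width):
    """Index one past the last word first-fit puts on the line starting at `start`."""
    end, length = start + 1, len(words[start])
    while end < len(words) and length + 1 + len(words[end]) <= width:
        length += 1 + len(words[end])
        end += 1
    return end


def best_first_break(words, start, stop, width):
    """Break words[start:stop] with least squared slack; return the end of its first line."""
    cost = {stop: 0}   # cost[i]: best cost for words[i:stop]
    first_end = {}     # first_end[i]: where the first line of that solution ends
    for i in range(stop - 1, start - 1, -1):
        best, length = None, -1
        for j in range(i + 1, stop + 1):
            length += 1 + len(words[j - 1])
            # An overlong single word is allowed on a line by itself.
            if length > width and j > i + 1:
                break
            slack = max(width - length, 0)
            total = slack * slack + cost[j]
            if best is None or total < best:
                best, first_end[i] = total, j
        cost[i] = best
    return first_end[start]


def windowed_breaks(words, width, n=3):
    """Slide an n-line window down the paragraph, committing one line per step."""
    lines, start = [], 0
    while start < len(words):
        stop = start
        for _ in range(n):
            if stop == len(words):
                break
            stop = greedy_line_end(words, stop, width)
        end = best_first_break(words, start, stop, width)
        lines.append(" ".join(words[start:end]))
        start = end
    return lines


print(windowed_breaks("aaaa aaaa a".split(), 10, n=3))  # ['aaaa', 'aaaa a']
print(windowed_breaks("aaaa aaaa a".split(), 10, n=1))  # ['aaaa aaaa', 'a']
```

With `n=1` the window holds a single first-fit line, so the result matches greedy wrapping. Each step only examines a bounded number of lines, which is what keeps the cost close to that of simple wrapping in this sketch.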
For the fast-and-stable case, should we just leave it up to browsers, or require a particular approach, or leave it up to browsers *and* suggest a particular approach?
I think the latter. As Fantasai said when it was discussed before, authors should not have to opt in to higher quality. I suggest an n-line optimization. It's fast and does a good job, *but* it is not monotonic: adding a word at the end (or anywhere) can in some cases reduce the number of lines in a paragraph. That means the insertion point can jump backwards as you're editing. That's only mildly disruptive on a graphic designer's 30" page layout display but no good on a mobile device or in a Web app where there might be limited display space for the text, hence wanting for authors to be able to opt out. Another approach that's been tried is to add a delay, so you don't reflow the paragraph while it's being edited, but that's a nightmare for copy-fitting and also has accessibility problems for people who lose their place when it does finally reflow. Note that the TeX Knuth-Plass algorithm is worse than polynomial (they say it's NP-complete) on the number of words in the paragraph, which might be risky for a Web browser. [...]
- the algorithm is UA defined. It should bias towards nice layout, if necessary at the expense of speed and stability. Liam's favorite suggestion is offered in a note as one reasonable approach to do that.
Hmmm. I'm a bit confused. What you're saying seems to be arguments supporting what I proposed (“leaning toward[...]”) or something close, but I can't really tell if you are indeed supporting it or explaining why you think we need something else.
What I suggested doesn't really do anything specific about that, but it seems perfectly compatible with doing that via Houdini.
@astearns I have two issues with your edit. It seems to ignore the part of the discussion after Liam chimed in.
Unfortunately, you haven't replied to the comments proposing these, so it is difficult to know from what angle to argue.
I can open new issues and repeat the argument there if you want, but at the same time, the context is here, so it seems easier to discuss here.
I agree with Liam that "future-editability should probably not be conflated with wrapping," and more strongly that editability concerns should not constrain wrapping choices. In some cases you want to edit with stable upstream line breaks, and in other cases you want to edit with the best line breaks over all the content. So a separate property expressing editing preferences is warranted. It should be possible to satisfy both text-wrap:multiline and an edit preference set to stable by starting an n-line window with the line just above the cursor.
I didn't consider adding a note with algorithm suggestions. I expect that Liam's n-line optimization is the right suggestion, but maybe someone can come up with something better. Perhaps a separate issue would help?
I don't see how this follows from that. A separate value would do as well.
Reopening for discussion. I stand by the position that new values here should get WG discussion and consensus. Consensus in an issue is enough only if the only people who care are paying attention in the issue, and I don't believe that's the case here and it usually isn't for adding new features.
I'm trying to get my head around where/when/if
Because line breaking and multi-line justification methods are inherently linked, I turn on justification with
...which to my mind implies that
I think what I'm trying to ask is: should
The Working Group just discussed `Allow for paragraph-level line breaking`.
The full IRC log of that discussion
<dael> Topic: Allow for paragraph-level line breaking
<dael> github: https://github.com//issues/672#issuecomment-379723234
<dael> fantasai: astearns added a new feature and there was limited discussion. I feel we should have more check in with the group.
<dael> astearns: Issue is pretty long.
<dael> fantasai: Discussion was from myles asking to opt into more expensive line breaking algo with better results. Discussion landing on adding a switch. text-wrap property in L4 is a longhand of white space that only controls wrapping. Has values to say wrap, don't wrap, try to balance lines for same length. Proposal was add a value called multi-line that does one of these more expensive algos.
<dael> fantasai: We've talked about fancy wrapping algorithms before.
<dael> eae: It's not that we don't want to impl, it's not a priority.
<dael> astearns: I added it to the spec as a "this would be nice to have".
<dael> fantasai: Alternative is you do it anyways and you don't need an opt in.
<dael> eae: Opt in is nice since it's quite a bit more costly.
<dael> myles: You can't have this on by default.
<dael> myles: This will be way slower.
<dael> fantasai: If concern is about perf we might want a note to say if someone impl and discovers no perf issues.
<dael> florian: If you want to do it and have made it fast you can apply to both.
<dael> astearns: When I added the value the default says you may consider multi lines and the UA may optimize for speed. And for multiline algo should consider multiple lines and should bias for layout over speed.
<dael> Rick: UA could decide this is giant and we won't do it anyway?
<dael> myles: Yes. There's 100 criteria for a beautiful paragraph and it's impossible to do all. Browsers should be free to pick and choose.
<dael> fantasai: Sounds like what you want is text-wrap:expensive
<dael> fantasai: If that's what it's conveying let's convey that. There are lots of other things to consider.
<dael> myles: But balance is also expensive. Different algo for different reasons.
<dael> fantasai: Wrap and multiline are trying for the same effect.
<dael> florian: You're not asking for slow. You're asking for pretty even if it's slow.
<dael> myles: It isn't one spectrum, there are different algo with different purpose
<dael> jpamental: As a designer there's many things to optimize on. I'd like the pretty to have everything laid out, but I also recognize that's hard. Maybe target which things you want to prioritize but that's a different solution. Assigning priority to the thing you want to address.
<dael> myles: We could make it more complicated but there's 0 impl so simple would be good.
<dael> astearns: Concerns with leaving this prospective value in?
<dael> fantasai: If it's not clear what we're aiming at I don't know if it makes sense to have a keyword. If someone wants to impl a fancy line wrapping algo they can come back and ask for a switch and what they optimize for and then we can discuss what switch we want.
<dael> astearns: I think it's a case that a shared characteristic is that it considers more than one line at a time. A multiline value that can encompass a range of strategies is appropriate. I feel that deciding on this is the pretty strategy for the web and getting everyone to agree on an algo won't happen. I wanted to leave it a little open so people can experiment with what they'd trade.
<dael> fantasai: I think multiline isn't clear.
<dael> myles: Let's call it pretty
<dael> fantasai: Sure.
<dael> florian: We started with paragraph and we shifted to multiline because it doesn't always consider the whole paragraph.
<dael> myles: maybe not pretty because titles don't want this.
<dael> astearns: I'm happy to bikeshed on name. multiline is pretty well known by the breed of people that use indesign.
<dael> florian: There was a thing for a switch...for editing purposes you don't want the algo.
<dael> astearns: Separate issue for that.
<dael> fantasai: In that case you want stable and something else.
<dael> astearns: And there's a note at the end of the section on that problem.
<dael> myles: If we were to impl we prob would make multiline same as wrap.
<dael> florian: We're pretty much repeating this issue and the issue says we don't want a sep property from wrap but wrap-stable and wrap-pretty.
<dael> fantasai: I like it as a modifier.
<dael> dbaron: Agree wrap should be in the same.
<dael> myles: Should balance also have wrap?
<dael> Rick: multi-line line breaks would be just fine
<dael> florian: If you say wrap UA must not consider multiple lines. If you call for stable and you don't want regiggle you shouldn't consider. If you turn of the editing you want stable.
<dael> astearns: We're at time. I've got some of this discussed. Do you want a resolution fantasai ?
<dael> fantasai: I think we agree it's a modifier on wrap not a sep keyword.
<dael> astearns: I believe there is a sep issue on editing and multi line
<dael> astearns: Resolution that we should change value to wrap-nice, wrap-better...something?
<dael> astearns: Issue is open we can bikeshed there.
<dael> fantasai: We can make it a modifier on wrap.
<dael> koji: If multiline is modifier balance should be one too.
<dael> astearns: We're at time. I don't hear consensus and I would object to putting wrap as a modifier on all of the properties.
On Wed, 2018-04-11 at 09:02 -0700, CSS Meeting Bot wrote:
> The Working Group just discussed `Allow for paragraph-level line breaking`.
Just a couple of points to add that may help:
(1) Knuth-Plass is not suitable for content-editable text without a LOT of work on UX, because editing a word even slightly might make the text format into a different number of lines and move the insertion point somewhat distressingly. It's been done e.g. years ago in InterViews, but it's really best just to use the normal first-fit algorithm for editable content. I'd suggested at one point a property to say that an element's content might be edited in the future even if it doesn't contain content-editable, so that setting content-editable wouldn't need to trigger a reflow. That suggestion was adopted by the CSS WG but never made it into text 4, and I didn't push because of the Houdini work just getting started.
(2) Using first fit with a 2-line buffer, and moving a single word down from a tight line onto the next line if the next line is looser, is really super fast, makes a huge improvement, and plays much more nicely with content-editable.
(2b) You can extend the 2-line buffer to n lines and do a better job of averaging out the line lengths (or the space sizes for justified text), but it's not necessary to use large values of n (7 is probably enough), especially if it's a floating window (lines 1...n, then 2...n+1, and so on, or move down by max(n/2 - 1, 1) lines each time), and it's still linear on the number of lines, where Knuth-Plass is NP-complete on the number of words in the block and can quickly get expensive.…
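Point (2) might look something like the sketch below. This is an interpretation, not a confirmed implementation: "tight" and "looser" are read here in terms of unused space on a ragged-right line, a word is moved down only when that lowers the summed squared slack of the two lines, and the names `first_fit` and `two_line_refine` are hypothetical.

```python
def first_fit(words, width):
    """Plain greedy wrapping; returns a list of lines, each a list of words."""
    lines, current, length = [], [], 0
    for word in words:
        needed = len(word) if not current else length + 1 + len(word)
        if current and needed > width:
            lines.append(current)
            current, length = [word], len(word)
        else:
            current.append(word)
            length = needed
    if current:
        lines.append(current)
    return lines


def line_len(line):
    """Rendered length of a line: word lengths plus single spaces between them."""
    return sum(len(w) for w in line) + max(len(line) - 1, 0)


def two_line_refine(words, width):
    """Greedy wrap, then one top-down pass over adjacent line pairs.

    Assumption for this sketch: move the last word of the upper line down
    when it fits and reduces the squared slack summed over the two lines.
    """
    lines = first_fit(words, width)
    for i in range(len(lines) - 1):
        upper, lower = lines[i], lines[i + 1]
        if len(upper) < 2:
            continue  # never empty a line
        moved = upper[-1]
        new_upper = line_len(upper) - len(moved) - 1
        new_lower = line_len(lower) + len(moved) + 1
        if new_lower > width:
            continue  # the word does not fit on the next line
        before = (width - line_len(upper)) ** 2 + (width - line_len(lower)) ** 2
        after = (width - new_upper) ** 2 + (width - new_lower) ** 2
        if after < before:
            lower.insert(0, upper.pop())
    return [" ".join(line) for line in lines]


print(two_line_refine("aaaa aaaa a".split(), 10))  # ['aaaa', 'aaaa a']
```

In this sketch the pass never adds or removes lines, since a word only moves down when it fits and the upper line always keeps at least one word. The line count therefore stays whatever first fit produced, which is one way to read the remark about playing nicely with content-editable.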
I'm using the Knuth-Plass algorithm with only some small modifications. The main issues and sources of overhead I've encountered with a user-space solution are:
What I'm planning to do next is to take a look at the CSS Layout API to understand whether it is possible to use that to reduce the duplication of work.
On Wed, 2018-08-08 at 18:14 -0700, fantasai wrote:
> @liamquin Thanks very much for your detailed comment on line-breaking algorithms in #672 (comment)! Do you have any good references we can link to from the spec, or should I try to summarize your points in a note?
I don't have references i'm afraid. I've never seen any on the approach i describe - i came up with it myself based on how hand composition worked, but i'm sure many other people have too. I understand that InDesign uses an n-line moving window with n probably 9 or so. I certainly don't mind reviewing a note, but no longer have anyone paying my way to participate in the CSS WG. There are plenty of references on problems with Knuth-Plass and interactive editing, and people have tried lots of experiments to try and make it work, but the combination of (1) it not being stable (the insertion point moves around distressingly) and (2) corner cases that for TeX-for-print the author has to correct by hand, make it not really ideal. In an editor the insertion point can be kept stable most of the time by not reflowing the paragraph immediately, e.g. not until a paragraph loses focus, and that compromise might work in a browser. It's mostly a problem when you are editing in the middle of existing text. Knuth and Plass published a paper, probably early 1980s, proving their modified algorithm was NP-complete.…