
{granularity: "line"} promotes reimplementing paragraph layout in script #49

Closed
litherum opened this issue Nov 2, 2018 · 41 comments

Comments

@litherum

litherum commented Nov 2, 2018

The only use case I can imagine for line break iterators would be people trying to do their own paragraph layout themselves (e.g. eventually painting into a canvas).

The best way to perform paragraph layout in a browser is to use HTML elements and CSS. An author trying to do it themselves with JavaScript would almost certainly end up with something slower, less correct, and less accessible than doing it with the browser's engine.

This probably isn't true for the other segmenters - I can think of plenty of use cases for the other ones, but if there is wide adoption of line breaking, specifically, it would be unfortunate for the Web.

@littledan
Member

littledan commented Nov 2, 2018

This is indeed a major goal of the API.

I thought it would be useful for the cases where HTML and CSS don't quite cut it by themselves, for example certain rich text editors. I hope people don't use it for cases where it is unnecessary. Maybe we can be more explicit about this in the documentation.

In the context of the ongoing development of Houdini custom layout APIs, it seems only right if we expose the primitives for line breaking.

About accessibility, it is definitely true that extra work would be required to retain accessibility if a site is doing custom line breaking, but do you see the platform as missing any primitives to implement it?

@vsemozhetbyt

This can be used not only in Web APIs but also in Node.js for line-by-line text processing, if I understand correctly.

@litherum
Author

litherum commented Nov 2, 2018

Node.js modules are not web standards.

Historically, “I hope people don’t use it for cases that are unnecessary” hasn’t worked on the Web. People abuse our APIs all the time.

One of the differences between the Houdini APIs and this is that the Houdini APIs improve your ability to do something you were already able to do before. Before custom paint, there was canvas; before custom layout, there was absolute positioning, etc. There are currently no facilities on the web for performing line breaking.

@littledan
Member

@litherum Well, this subsumes Chrome's non-standard Intl.v8BreakIterator, which I was unable to remove previously because its usage was too high. That's one reason I started pursuing this standard, unlike other non-standard Chrome Intl features which I just removed.

It's possible for JavaScript itself to implement line breaking without platform support, and there are many npm modules which do so, such as css-line-break.

@vsemozhetbyt

I do not think the Web platform should monopolize JavaScript or veto new features unless they break the Web. JavaScript is a general-purpose language now, and maybe one of the Perl replacements for text processing. So it would be a pity to limit its possibilities with a perpetual caution against possible misuse on the Web.

@littledan
Member

I don't think we have to think of this as either-or. We have strong motivating use cases both on and off the web platform.

@FrankYFTang
Contributor

FrankYFTang commented Nov 26, 2018

Consider the following scenario: assume we eventually remove {granularity: "line"} from this API. Those who WANT to lay out lines themselves in JS will still be able to misuse {granularity: "word"} in place of the missing line granularity. They will still be able to lay out text the way they want, but with worse results: it will sort of work on English / Latin-based pages but poorly support Japanese/Chinese pages. Isn't that an even worse API to promote?

@FrankYFTang
Contributor

FrankYFTang commented Nov 27, 2018

The objection to supporting {granularity: "line"} is based on the assumption that ECMA402 always operates in a world that also has HTML and CSS. But that is simply not true. ECMA402 as a whole (see https://ecma-international.org/ecma-402/) has no notion of CSS or HTML; it is a lower-level library with fewer constraints than functionality in a CSS/HTML-based environment. For example, all ECMA 402 functionality is accessible inside a web worker, and a web worker has no access to the DOM or CSS. With {granularity: "line"} support in Intl.Segmenter, JavaScript in a Web Worker could break lines based on Intl.Segmenter and other constraints (not necessarily font metrics or width in pixels).

Also notice that not all web applications operate under HTML and CSS, and in those environments HTML+CSS is simply not accessible. For example, let's say a user wants to support Chinese / Japanese rendering in https://delphic.me.uk/tutorials/webgl-text
Currently that page has a hack that line-breaks English text on whitespace, and the reason they do so is that there is no other line breaking support in that environment. With an Intl.Segmenter that supports {granularity: "line"}, we can change the implementation to use Intl.Segmenter instead. Notice it is not "reimplementing", since there is NO implementation of line layout accessible in that context. Without our Intl.Segmenter, how could the caller properly lay out lines of text on that rotating 3D object?

@FrankYFTang
Contributor

FrankYFTang commented Nov 27, 2018

Another example is JavaScript in a Web Worker receiving text from the UI thread and creating a PDF file as output. The rendering context will be PDF, not HTML with CSS. It requires line breaking support for line layout, but it cannot depend on HTML/CSS, since there is no need for HTML + CSS to generate a PDF file. For example, take a look at https://parall.ax/products/jspdf . jsPDF needs to lay out lines of text into PDF, not into HTML, and jsPDF is implemented in JavaScript and can therefore access ECMA 402. I am not sure whether it currently supports multi-line layout, but if one day it wants to, it will need line breaking support that is not bound to HTML+CSS. For example, switch to the "String Splitter" example on that page to see such usage. It currently won't display Chinese/Japanese correctly, but that is due to the lack of Chinese / Japanese fonts pre-installed with the PDF viewer and could be addressed by embedding fonts.

@litherum
Author

litherum commented Nov 27, 2018

those who WANT to layout the line themselves in JS will then be forced to and still can misuse the {granularity: "word"} in place of the lacking of {type: "line"} in this API

This sounds like an argument for not having {granularity: "word"} in the API either.

You're right that people can do line segmentation badly even without {granularity: "line"}. That's currently true today; people can do line segmentation badly by implementing it in JavaScript.

My proposal doesn't remove functionality that apps have today. It simply doesn't cater to the problematic use cases.

The objection of supporting {granularity: "line"} is based on the assumption that ECMA402 is always operated under in a world which also have HTML and CSS.

We can all agree that the Web is a major client (certainly the most major) of ECMA402. The standard library of JavaScript should be compatible with all major clients. An API that is not compatible with the major clients should not be in the standard library, but should instead be in another library.

I'm not arguing against having libraries that do line breaking. I'm arguing that it shouldn't be part of the standard library.

Without our Intl.Segmenter, how could the caller properly line layout the text in that 3D rotation object?

By using the transform CSS property.

Another example is considering javascript in a Web Worker receiving text from the UI thread and create PDF file as output.

All major browsers / operating systems support printing to PDF. I don't understand this use case.

@FrankYFTang
Contributor

FrankYFTang commented Nov 28, 2018

You're right that people can do line segmentation badly even without ...

So you now agree that your concern that "users may misuse it to do line segmentation badly" is NON-UNIQUE, because such behavior can happen with or without this function in the library, right? In other words, regardless of whether we add this function to the library or NOT, the addition won't produce any worse result for the web.

The objection of supporting {granularity: "line"} is based on the assumption that ECMA402 is always operated under in a world which also have HTML and CSS.

We can all agree that the Web is a major client (perhaps certainly the most major) of ECMA402.

  1. Your position is that the Web is the same as HTML + CSS only, but that is NOT true. fillText in Canvas IS a W3C standard for the Web (see https://www.w3.org/TR/2dcontext/#dom-context-2d-filltext ), so allowing JS to support necessary functionality in Canvas IS supporting the "Web". Is your position that fillText() in Canvas is NOT part of the "Web"?

An API that is not compatible with the major clients should not be in the standard library, but should instead be in another library.

  1. Could you point out in which way this function is "not compatible" with any pre-existing feature in the web specifications? Which part of HTML and CSS is this function "not compatible" with?

  2. Earlier, in the meeting, you said your concern was "misuse" by web developers, and you never mentioned any "incompatibility" issue during the meeting. Are you shifting your position to also claim an "incompatibility" of this API with pre-existing web specs?

Without our Intl.Segmenter, how could the caller properly line layout the text in that 3D rotation object?

By using the transform CSS property.

  1. Could you point out how to use the transform CSS property on 3D canvas fillText()? Is there any pre-existing W3C spec that already supports transform on 3D canvas fillText()? Or is there any W3C standards activity proposing to support it? Also, I am not aware that "transform" controls any line breaking of text in CSS.

  2. Is your position that WebGL is NOT part of the web? Could you show me how WebGL can render text with multi-line breaking/layout?

All major browsers / operating systems support printing to PDF. I don't understand this use case.

  1. I am not talking about "printing PDF" or "rendering PDF" here. I am talking about "generating PDF" from JS inside a browser and feed to the browser to display here. I am talking about empowering JS to take text and GENERATE a PDF here.

@sffc

sffc commented Nov 30, 2018

I prefer keeping line break in the proposal.

Reasons for my position include:

  1. Having a line break option keeps ECMAScript features consistent with the features available in the Unicode standard.
  2. Line break requires a lot of data; people currently doing line breaking have to either load the full data or delegate to a server, which is bad for the web.
  3. Practically, it's a lot of work to make one of these proposals. It's basically free to add the feature right now. If we don't add the feature now, it becomes a lot more work to add later.

The objection of supporting {granularity: "line"} is based on the assumption that ECMA402 is always operated under in a world which also have HTML and CSS.

We can all agree that the Web is a major client (perhaps certainly the most major) of ECMA402. The standard library of Javascript should be compatible with all major clients. An API that is not compatible with the major clients should not be in the standard library, but should instead be in another library.

Our official charter says:

This Standard defines the application programming interface for ECMAScript objects that support programs that need to adapt to the linguistic and cultural conventions used by different human languages and countries.

It says nothing about the browser versus Node.js, just "programs".

@littledan
Member

littledan commented Nov 30, 2018

A number of us from Google, Apple, Mozilla and Igalia got together and discussed this issue. We concluded that line breaking would be better developed as a Houdini API, since it inherently has to do with paragraph layout. I plan to follow up with a PR to remove line breaking from this proposal.

@sffc

sffc commented Nov 30, 2018

It sounds like a lot of smart people reached this conclusion, so I'll defer to that, even though I'm not sure I agree.

However, in general I'm not comfortable with a subcommittee of a subcommittee making a major decision without conferring back for consensus to the full subcommittee.

@gsathya
Member

gsathya commented Nov 30, 2018

However, in general I'm not comfortable with a subcommittee of a subcommittee making a major decision without conferring back for consensus to the full subcommittee.

+1 we need to discuss this in the Intl meeting before making the spec change.

mathiasbynens added a commit to mathiasbynens/proposal-intl-segmenter that referenced this issue Nov 30, 2018
@jungshik

I won't repeat the points made by @FrankYFTang and @vsemozhetbyt .

An API that is not compatible with the major clients should not be in the standard library, but should instead be in another library.

I don't get why having line-breaking support is incompatible with web. I'd not characterize opening up a door for a possible misuse as 'incompatible'.

@mathiasbynens
Member

There’ll be less risk of such an API being misused once the Web Platform provides a dedicated text/font layout API (outside of ECMAScript). If we exclude { granularity: 'line' } now, we can move forward with Intl.Segmenter. Then, once such a text/font layout API becomes widely available, we can possibly revisit adding { granularity: 'line' } to Intl.Segmenter.

@FrankYFTang
Contributor

If the conclusion is to remove it, I would propose we also change
https://ecma-international.org/ecma-402/#conformance
to include
"The options property granularity in the Segmenter constructor."

at the end, so that any JavaScript engine that decides to still ship { granularity: 'line' } will still be considered as conforming to the ECMA402 spec if the browser chooses to do so.

@mathiasbynens
Member

in the end so any JavaScript engine decide to still ship with { granularity: 'line' } will still be considered as conforming to the ECMA402 spec the the browser choose to do so.

I strongly object to this. Interoperability of implementations is critical.

@FrankYFTang
Contributor

FrankYFTang commented Dec 2, 2018

in the end so any JavaScript engine decide to still ship with { granularity: 'line' } will still be considered as conforming to the ECMA402 spec the the browser choose to do so.

I strongly object to this. Interoperability of implementations is critical.

Then we must fight tooth and nail to keep { granularity: 'line' } in the spec so our browser can ship it with conformance and interoperability while also fulfilling customer needs.

We really have to ask ourselves one important question: are we taking a humble position that empowers and trusts developers to do the right thing when calling these APIs, or an arrogant position that assumes developers are likely to misuse them?

I believe depending on CSS/HTML to solve every line layout problem is reasonable in most cases, but in reality it does not necessarily fulfill every possible requirement of our developers. For example, some developers may need to lay out lines of text themselves because they have to mimic the line layout behavior of proprietary software that has dominated the market for several decades, and it may not be practical to turn all the behavior (even buggy behavior) of that software into a standard (since it is buggy and we should not encourage its misuse).

@litherum
Author

litherum commented Dec 2, 2018

Then we must fight tooth and nail to keep { granularity: 'line' } in the spec so our browser can ship with it with both conformance, interoperability and also fulfill customer need.

IIRC your browser already ships this without conformance or interoperability.

@littledan
Member

OK, let's discuss this further in the next Intl meeting. Let's see if we can bring in some more attendees of that breakout to explain the logic. Cc @tabatkins

Note that supporting an additional granularity would be a spec violation, as GetOption is specified to throw an exception on an invalid option (but browsers have tons of known spec violations, so maybe this isn't the end of the world). If we end up going with the line segmentation API deferred until Houdini, I would suggest that Chrome might want to keep its legacy Intl.v8BreakIterator around a bit longer, rather than adding an additional non-standard API.
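
To make the GetOption point concrete, here is a minimal sketch of how a script could probe whether an engine accepts a given granularity. It assumes an engine that ships Intl.Segmenter with only the "grapheme", "word", and "sentence" granularities; `rejectsGranularity` is a hypothetical helper name, not part of any API.

```javascript
// Hypothetical helper: reports whether the engine rejects a granularity
// value when constructing an Intl.Segmenter.
function rejectsGranularity(granularity) {
  try {
    new Intl.Segmenter("en", { granularity });
    return false; // the option value was accepted
  } catch (err) {
    // An invalid option value is expected to surface as a RangeError.
    return err instanceof RangeError;
  }
}

console.log(rejectsGranularity("word"));
console.log(rejectsGranularity("line"));
```

An engine that accepted "line" here would be the kind of non-conforming extension being discussed.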

@bathos-wistia

If this functionality ends up being part of Houdini, it seems to follow that it won’t be available natively in non-browser envs, correct? It’d continue being userland implementations?

@mathiasbynens
Member

IIRC your browser already ships this without conformance or interoperability.

You recall incorrectly. Intl.Segmenter hasn't yet shipped in Chrome; see the V8 feature flag which is "staged", not "shipped".

@littledan
Member

@mathiasbynens I read that as a reference to Intl.v8BreakIterator.

@FrankYFTang
Contributor

So... one main reason @litherum proposes to drop "line" is that he is afraid of "misuse by web developers" and of it being "incompatible with the web platform".
I want to clarify something here:

  1. The "HTML Canvas 2D Context" API https://www.w3.org/TR/2dcontext/ IS a W3C API for the "web platform" and an official "W3C Recommendation 19 November 2015", and IN that API there is ALREADY a measureText API available.
    https://www.w3.org/TR/2dcontext/#dom-context-2d-measuretext
    There is no reason to wait for a new API to justify the need for line in Intl.Segmenter. Anyone who is using "HTML Canvas 2D Context" will need to use Intl.Segmenter with line to properly lay out lines in the canvas.
  2. All major browsers already support the "HTML Canvas 2D Context" API.
  3. Different people may have different views of the "HTML Canvas 2D Context" API. But their opinions won't change the TWO important FACTS above.
  4. Chrome has already shipped v8BreakIterator for a while. So... IF the assumption that some developers may misuse a line break facility is TRUE, then we should be able to see the sky falling, right? Could someone point out where such misuse occurs on the web? If we cannot observe such misuse, won't that also be counter-evidence showing that people won't misuse Intl.Segmenter either?

@tabatkins

So, here's the deal. Of the four possible break/segment types, three of them give a useful semantic meaning to the segments between breaks - graphemes, words, and sentences are all meaningful units on their own. One, the line-break type, does not have a meaningful semantic for the segmentation; the stuff between successive breaks is just "fragments of text that linebreak atomically".

The upshot of this is that the three "semantic" categories are useful for lots of things beyond layout. You can do word counting, or highlight the entire sentence that a find-in-page match is in, etc. A lot of these things can be done purely in JS or with DOM operations, not invoking the layout engine at all.
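
As a rough illustration of the word-counting use case, here is a minimal sketch using the "word" granularity. `countWords` is a hypothetical helper; it assumes an engine where Intl.Segmenter is available and relies on the `isWordLike` flag that word segments carry.

```javascript
// Hypothetical helper: counts word-like segments in a string.
// Segments that are only whitespace or punctuation have isWordLike === false.
function countWords(text, locale = "en") {
  const segmenter = new Intl.Segmenter(locale, { granularity: "word" });
  let count = 0;
  for (const { isWordLike } of segmenter.segment(text)) {
    if (isWordLike) count += 1;
  }
  return count;
}

console.log(countWords("Hello, world! How are you?"));
```

No layout engine is involved: the result depends only on the text and the locale.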

The line segments, however, don't have this. The sole use for these segments is to collect a sequence of them, see if they'll fit into a container without breaking, and then add more until they don't (then back off one to have "a line worth" of text). This operation has no meaning without layout of some kind; you can't tell just by looking at the segments how many you'll need, you need to lay out the text and take measurements. Unless you have monospace text (or something like a dot-font for an LED display that has known character widths and no kerning), doing this requires you to invoke a text layout engine, and thus (generally speaking) requires a browser doing layout.
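
The accumulate-and-back-off loop described above can be sketched as follows. This is only an illustration under stated assumptions: `fillLines`, its `segments` input, and the `measure` callback are hypothetical names, not part of Intl.Segmenter or any proposal. The default `measure` assumes monospace text where each UTF-16 code unit is one column wide, and the sketch ignores bidi, hyphenation, and emergency breaking.

```javascript
// Greedy line filling over pre-computed break-opportunity segments.
// `segments`: hypothetical input, the pieces of text between successive
//             line-break opportunities, in logical order.
// `measure`:  returns the width of a string; the default is a monospace
//             assumption (one column per UTF-16 code unit).
function fillLines(segments, maxWidth, measure = (s) => s.length) {
  const lines = [];
  let current = "";
  for (const segment of segments) {
    const candidate = current + segment;
    // Trailing whitespace is not counted against the available width.
    if (current !== "" && measure(candidate.trimEnd()) > maxWidth) {
      // The new segment does not fit: back off and start a new line.
      lines.push(current.trimEnd());
      current = segment;
    } else {
      current = candidate;
    }
  }
  if (current !== "") lines.push(current.trimEnd());
  return lines;
}

const segments = ["The ", "over-", "world ", "beckons."];
console.log(fillLines(segments, 10));
// → [ 'The over-', 'world', 'beckons.' ]
```

A segment wider than `maxWidth` simply overflows its own line here, and for variable-width text the `measure` callback would have to call into a real text measurement facility.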

Further, if you are doing text layout like this, you can't even use these line-break segments for it. Actually laying out text properly requires, for example, properly handling bidi - this means you'll need to skip around in the segment list in a somewhat complicated way that does not match the indexing. Giving people an API that acts like you can just accumulate segments in index order is handing them a footgun.

Even further in the same concern, note that there is a special argument (line break style) intended solely for configuring the "line" break type; it has no meaning for the other break types. This reproduces the effect of one of CSS's properties for controlling line-breaking; as you can see from https://drafts.csswg.org/css-text/ and https://drafts.csswg.org/css-text-4/, however, there are many options that you have to worry about if you want to linebreak and lay out text professionally - whitespace handling, emergency breaking, hyphenation, etc. So the proposal as it stands carves out a funny special-case for the "line" break style only, and does so very incompletely. This is not what we want to leave authors with!

This is the difference Apple is concerned about. If the sole use-case for the "line" break type is to do text layout, and there are zero use-cases for it outside of this, and it doesn't even let you do text layout well in the first place, then why is it here? This functionality belongs in a properly designed text-layout-related API, like Houdini's inline layout stuff coming down the pipeline, which can handle all the weird corner cases properly for you and provide you with all the information you actually need (instead of silly hacks like doing actual text measurement in an off-screen iframe...). Then Intl.Segmenter can remain a more narrowly-focused and coherent design capturing actual semantic groupings of text, each with a multitude of uses outside of text layout.

@FrankYFTang
Contributor

FrankYFTang commented Dec 13, 2018

there are many options that you have to worry about if you want to linebreak and lay out text professionally - whitespace handling, emergency breaking, hyphenation, etc.

Notice that when the web was invented and used with HTML3, none of these "many options that you have to worry about" were addressed, yet it was still widely used, and it did handle line breaking for different languages. This history proves it is far more important to break lines in a linguistically correct way than to handle all the secondary style issues you mentioned above. So it is surely important to provide this functionality alone for the use case of an important existing web API, https://www.w3.org/TR/2dcontext/ . That does not mean the issues you mentioned are unimportant, but they are far less important than breaking lines in a linguistically correct way. They also do not need to be addressed at the same time (just as browsers supporting HTML3, not even HTML4, were widely accepted without them 20 years ago).

litherum's starting argument is "The only use case I can imagine for line break iterators would be people trying to do their own paragraph layout themselves (e.g. eventually painting into a canvas)."

I cannot see why that use case is unimportant or inappropriate. All major browsers support canvas, and it is a W3C standard. Why should we remove this facility and stop supporting a use case that all browsers, including Apple's, already support? What is wrong with increasing such usage on the web? There are complex use cases and there are simple use cases. The existence of complex use cases should not be a reason to dismiss the need to address simple ones. And most users have simple use cases anyway.

litherum said "The best way to perform paragraph layout in a browser ... "
Notice this is his personal opinion, stated as a blanket statement without considering the context of each situation. We should empower web developers to decide what "the best way" is for their application. We should not speak for them. It is not our job, as the ECMA402 committee, to decide what is best for them. They may have other constraints (and other facilities to support) that they need to consider for their line layout.

@tabatkins

@FrankYFTang

Note that measureText() is a genuinely bad API for text layout purposes. It is not suitable for text layout purposes! It only handles single-directional text, all in a single font and style, laid out in one line with no breaks.

measureText() works fine for its actual purpose - figuring out the dimensions of the canvas text you just drew (because canvas text drawing is limited in the exact same way). But it's not reasonable to talk about as something that, combined with a line-break iterator, will let you do any sort of professional text layout. You can't use it if you want to handle italics or bold, or sub/sup, or size variations, or changing fonts, or bidi, or all the sophisticated linebreaking options needed for good text layout, like hanging punctuation or hyphenation. Most or all of these are things that, say, Google Docs needs to worry about!

Notice, when the web was invented and used with html

If we're shipping this feature with the goal of matching HTML3-era text layout, then, uh, I think we're doing everyone a huge disservice. That said, HTML3 still had far more sophisticated text layout than what measureText() + line-break iterator allows, since it at bare minimum had <font>, <i>, <b>, <sup>, and <sub>. So this API is actually only serviceable for something even weaker than HTML3-era layout. It's absolutely not useful for something like Slides or Docs; they'll have to continue doing their current line-measuring hacks.

Remember, we're not saying never do text layout in JS! We have proper text-layout APIs coming down the pipeline right now - they'll likely be finished and shipping in a year or two! It's just that this specific API is very, very weak for text-layout use-cases, but has no other use beyond text-layout, so it doesn't actually pay for itself, and doesn't have a growth path to become more suitable in the future. (That is, an actual usable text-layout API will look totally different; there's no evolutionary path available here.)

@vsemozhetbyt

vsemozhetbyt commented Dec 13, 2018

FWIW, I would not say line breaks are all layout and no semantics. As for human-readable texts, I can think of poetry (where not so rarely line breaks are the only formal prosodic, semantic and syntactic delimiters). Also, think of various lists, tables of contents etc. As for semi-human-semi-machine-readable texts, you can think of various line-delimited configs or DSL formats (for example, in some digital dictionary formats, line breaks have key meaning). And JavaScript is not a web-only language anymore.

@litherum
Author

{granularity: "line"} does not describe line breaks. It describes line break opportunities.

@vsemozhetbyt

I do not think this is an all-changing difference in the mentioned contexts. We cannot predict all the possible ways and links between opportunities and realization.

@vsemozhetbyt

What can I use to reformat hard-broken plain text outside of Web rendering instead of this API?

@tabatkins

It is indeed an all-changing difference, because you're talking about 100% different things. "Line breaks", as in, a place where a line is purposely broken, are indeed meaningful; they're also indicated by a line-break character.

This API under discussion has nothing to do with that; it tells you where, in a string of text, it's appropriate to insert a soft line break, and continue on a further line.

In English, for example, if we ignore hyphenation entirely, this will roughly divide up a string into words, plus some additional breaking around punctuation like dashes.

Like, given the string "The over-world beckons.", it would return a sequence like ["The", "over-", "world", "beckons."]. That has nothing to do with meaningful linebreaks as in poetry, DSLs, etc.
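
To illustrate the distinction, purposeful line breaks can be recovered from the line-break characters already in the text, with no segmenter involved. The sketch below is a simplification: `hardLines` is a hypothetical helper, the sample text is made up, and only the common `\n`, `\r\n`, and `\r` terminators are handled.

```javascript
// Hypothetical helper: splits text at explicit (hard) line breaks.
// Simplification: only \r\n, \n and \r are treated as line terminators.
function hardLines(text) {
  return text.split(/\r\n|\n|\r/);
}

const poem = "Roses are red,\nViolets are blue.";
console.log(hardLines(poem));
// → [ 'Roses are red,', 'Violets are blue.' ]
```

Break opportunities, by contrast, say nothing about where the author chose to end a line; they only mark where a renderer may wrap.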

@tabatkins

What can I use to reformat hard-broken plain text outside of Web rendering instead of this API?

Not this, except in special circumstances.

If you're trying to reformat monospace "plain text", with absolutely no style variations or bidi or anything, then this API is sufficient. You just collect as many segments as will fit on a line.

If your font is variable-width, this is insufficient. You have to pair it with measureText() so you can tell how wide the segments end up.

If you have anything more complex than this, this API does not help at all. The future Houdini Text Layout API will do the job.

@vsemozhetbyt

So the question is: is a possible danger of web abuse more significant than the mentioned API sufficiency for common cases in the wider context to such an extent that we need to remove it from the language level and outsource to some web framework?

@tabatkins

"common cases" is overselling, I think. ^_^

And no, abuse is one aspect of it. It's also simply insufficient to do reasonable text layout. The core thing it's moderately useful for would be to lay out text in a monospace console, and even then, as stated up in my first post, it doesn't provide enough knobs to do that well. (It only has a single knob to twiddle, the equivalent of the line-break CSS property.) It doesn't expose the equivalent of word-break or overflow-wrap, or any hyphenation functionality, which is needed for good linebreaking even in a console environment. It also doesn't handle unbreakable segments, like a chunk of code that should stay together (important for console-printing use-cases!), because it only cares about raw text, not a higher-level markup.

This API simply isn't fit for purpose, as far as I can tell. It exposes a single aspect of a larger problem, but without providing a more complete solution for that larger problem, this doesn't do anything sufficiently useful to justify specifying, implementing, testing, and shipping it.

@bathos-wistia

bathos-wistia commented Dec 14, 2018

@tabatkins thanks for explaining the reasons so clearly.

Incidentally,

If you're trying to reformat monospace "plain text", with absolutely no style variations or bidi or anything, then this API is sufficient. You just collect as many segments as will fit on a line.

is a good description of typical box-drawing in a terminal, which is what made this interesting to me for Node usage. But the argument that it doesn’t belong alongside the other segmentation functionality on Intl (and may be inadequate in any case) is convincing.

@sffc

sffc commented Dec 14, 2018

It's absolutely not useful for something like Slides or Docs; they'll have to continue doing their current line-measuring hacks.

The Docs team has told us in the past that they want the line-break API, and it's part of the reason why V8 has been shipping it already as a non-standard feature in Intl.v8BreakIterator.

If this feature does not make it into the standard now, I think it's safe to say that until the Houdini text layout API gets revisited at some undetermined point in the future, Chromium users will continue to get better performance than other browsers on sites like Google Docs.

From a theoretical point of view, I can see tabatkins and litherum's points, but on this issue I tend to feel that the demonstrated practical needs outweigh the theoretical costs.

EDIT: See following comment. It appears that the Docs team had used v8BreakIterator for this purpose in the past, but is no longer using it for line breaks.

@jungshik

Line breaking iterator alone is never sufficient for line-layout. It's just one piece of information for properly laying out paragraphs and other forms of text. Obviously, it has to be used together with font / text measurement.

Those against including line breaking argued that including line breaking would lead to misuse/abuse. Well, my prediction is that NOT including line breaking will lead some folks to come up with their own device using word/grapheme break iterators included in Intl.Segmenter. And, the result would be worse.

As for https://drafts.css-houdini.org/, when is it expected to be spec'd out and implemented by major players? Font metrics API proposals have come and gone since 2010...

BTW, the current CSS line breaking does not work well for multi-line heading, movie/song/book titles, product names, ad copy for CJK because for those applications, part-of-speech tag is also necessary and browsers are not likely to have that info available anytime soon, which means whatever Houdini does is not sufficient.

Segmentation along with PoS has to be combined with font/text measurement to support a satisfactory line-breaking for multi-line heading, song, movie titles and ad copy.

See some examples at https://github.com/google/budou .

@sffc

sffc commented Dec 15, 2018

The Docs team has told us in the past that they want the line-break API, and it's part of the reason why V8 has been shipping it already as a non-standard feature in Intl.v8BreakIterator.

Correction: with Jungshik's help, I did some more investigation of Google code, and it seems that although Docs may have used v8BreakIterator for line breaking in the past, I cannot find any instances right now where they are currently using it in line break mode.


10 participants