Incremental/async line breaking #610

raphlinus · 2018-04-10T22:56:38Z

This is a design discussion for changes around line breaking. It's closely related to #565 (width measurement) and #58 (word wrap). I wanted to split out the discussion regarding generation and rendering of soft line breaks from width measurement, as the latter involves the front-end and the former is almost entirely a core consideration.

Current state

The current prototype of line breaking is (mostly) incremental, but synchronous. Basically, that means that the recomputation of breaks might be inexpensive, as the "small delta" property is preserved, but the computation is done on every edit, before rendering. Incremental computation is not always possible, for example on document load and when changing the wrap width. In the current prototype, wrapping is pretty fast, but incorporating width measurement will slow things down.

To summarize very briefly, there's an optional Breaks structure in view. When not present, hard breaks are extracted from the text rope. When present, it overrides those breaks. The after_edit method in View updates the breaks, keeping them synchronized. The Breaks structure is an instance of rope, which is conceptually equivalent to a sequence of offsets for the breaks.

On edit, operational transformation preserves the breaks outside the edited range, deleting all the breaks inside. Then the rewrap method recomputes the breaks inside that range. It's a fairly clever algorithm that tries to touch only the lines that are absolutely needed, ie in the case of a very long paragraph it does not rewrap the entire paragraph.

See rope science 6 for a little discussion of the thinking that went into this algorithm, as well as plans for async.

Going async

Opening or rewrapping a large document will be a poor experience, as the core will block for potentially very large amounts of time. Our approach is to go async, as is done for syntax highlighting.

In an async model, rewrapping is computed in chunks. For pending chunks, the breaks are not necessarily accurate. The algorithm is eventually consistent, so at some point the set of pending chunks becomes empty. In these intermediate states, the document is still rendered, and interaction is still possible.

Deleting all the breaks for newly inserted text would be a very bad experience, as very long lines would result. Therefore, one change I propose is to separate hard and soft breaks. Hard breaks would be derived from the text rope, as is done in the no-breaks case, and the Breaks data structure would only contain soft breaks. The two sources of breaks would be merged on rendering (and for related computations such as converting between offsets and visual line numbers). One consequence is that there would no longer be two code paths (corresponding to the fact that Breaks is an Option); the non wrapped case would be represented by Breaks simply being empty.

Other than that, the breaks state for a view (and it is view specific, multiple views into the same buffer would each have their own breaks state) would grow to contain a "frontier," representing pending rewrap computation. The frontier consists of a number of ranges. The beginning of the range is always a break (whether hard or soft). The breaks within a range are not guaranteed to be valid. Re-wrapping may cascade past the end of a range, but once a break is found that synchronizes with a break outside the frontier (including any hard break), rewrapping may end. This logic is actually quite similar to the existing
rewrap logic.

TODO: it could be useful to write down a mathematical predicate for the invariant.

Progress is made by advancing the beginning of some range, computing valid breaks and editing the breaks structure. To rewrap the entire document (for example, on load), the invariant is established by setting a range for the entire document.

Prioritization

How many ranges? The existing rewrap logic has just one, and this is an appealing simplification, but to provide the best experience it makes sense to consider more than one range, and also prioritize the processing. The number of ranges might still be bounded (ie on multiple cursor edits, we might not get a fine-grained frontier).

This is all motivated by drag-resizing of the window. If the document is scrolled near the beginning, then rewraps will happen with nonzero but small lag, which should be a decent experience. However, in a large document scrolled near the end, the lag might be significant.

A solution is to break the frontier into ranges, corresponding to the visible viewport, the area above, and the area below. These would be prioritized so that the viewport would be processed first.

An important refinement would be stabilizing the scroll position so that jumping up and down would be minimized when the region above the viewport is rewrapped.

Prioritization can be done later, as a refinement; usability should be pretty good for small documents, and it's also not critical for initial document load.

The text was updated successfully, but these errors were encountered:

raphlinus mentioned this issue Apr 19, 2018

Add experimental width-based wrapping #607

Merged

cmyr added the discussion label May 19, 2018

raphlinus mentioned this issue Oct 11, 2018

Remove Interval data structure #914

Closed

This was referenced Oct 17, 2018

word wrap destroys the paragraph breaks below xi-editor/xi-mac#301

Closed

Roadmap (October 2018) #937

Open

This was referenced Nov 30, 2018

Unify width and column wrapping #1021

Merged

Word wrap checkpoint 2 (visual line iterator) #1027

Merged

This was referenced Dec 15, 2018

Incremental word wrap initial impl #1041

Merged

Make word wrap incremental #494

Closed

cmyr closed this as completed in #1041 Dec 17, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Incremental/async line breaking #610

Incremental/async line breaking #610

raphlinus commented Apr 10, 2018

Incremental/async line breaking #610

Incremental/async line breaking #610

Comments

raphlinus commented Apr 10, 2018

Current state

Going async

Prioritization