
buffer improvement with piece tree#41172

Merged
rebornix merged 55 commits into master from rebornix/buffer-pt
Jan 19, 2018
Conversation

@rebornix
Member

@rebornix rebornix commented Jan 5, 2018

This is an early version of adopting a piece table with a red-black tree as the new text buffer. It still contains quite a few hacks (model builder, textsource), and applyEdits does not yet return good change info, which leads to whole-page re-tokenization, but it still gives an idea of how a chunk-based, tree-structured buffer works in vscode.

Under the hood it's a piece table with two string buffers: one holds the original content and the other the newly added content. Instead of using a doubly linked list to store the pieces, I use a red-black tree here in order to get good average time complexity for most operations.

In addition to the basic red-black tree node properties, each node stores:

  1. a reference to a Piece (a mapping into one of the buffers)
  2. the total content size of its left subtree
  3. the line-break count of its left subtree

I didn't use a piece's offset in the document as the key because that would require an in-order traversal of many nodes on every insert/delete. The latter two properties above let us find the position at which to insert/delete a node, and ensure that after each operation we only need to update metadata along the path from the modified node to the tree root, at most.
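To make the left-subtree metadata concrete, here is a minimal sketch of such a node and an offset lookup. The names (`Piece`, `TreeNode`, `size_left`, `lf_left`, `nodeAt`) are illustrative assumptions for this sketch, not the actual vscode implementation:

```typescript
// Illustrative sketch of a piece-tree node; not the actual vscode code.
interface Piece {
  bufferIndex: number; // 0 = original buffer, 1 = added buffer
  start: number;       // offset into that buffer
  length: number;      // length of the piece in characters
  lineFeedCnt: number; // number of line breaks inside the piece
}

class TreeNode {
  piece: Piece;
  size_left: number;   // total character count of the left subtree
  lf_left: number;     // total line-break count of the left subtree
  left: TreeNode | null = null;
  right: TreeNode | null = null;

  constructor(piece: Piece, size_left = 0, lf_left = 0) {
    this.piece = piece;
    this.size_left = size_left;
    this.lf_left = lf_left;
  }
}

// Find the node containing a document offset without visiting most of the
// tree: compare against size_left and descend, O(log n) on a balanced tree.
function nodeAt(
  root: TreeNode | null,
  offset: number
): { node: TreeNode; remainder: number } | null {
  let node = root;
  while (node) {
    if (offset < node.size_left) {
      node = node.left;
    } else if (offset < node.size_left + node.piece.length) {
      return { node, remainder: offset - node.size_left };
    } else {
      offset -= node.size_left + node.piece.length;
      node = node.right;
    }
  }
  return null;
}
```

Because each node only caches aggregates of its *left* subtree, an insert or delete only needs to adjust `size_left`/`lf_left` on the ancestors of the modified node, which is the point made above.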


Todo:

  • Clean up hacks
  • Fix tokenization
  • Add test cases and benchmarks
  • Optimize for bulk edits
  • Optimize for Replace All

Right now the bottleneck of this implementation is CRLF handling. Since we don't split chunks by /\r\n|\r|\n/ and the system relies heavily on line-break counts, each operation needs some validation, because an edit may split a CRLF pair into a separate CR and LF. When the changed buffer is large, this validation is costly.
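A minimal sketch of why this validation is needed (illustrative helpers, not the actual vscode code): if per-piece line-break counts are simply summed, a `\r\n` pair split across two adjacent pieces gets counted twice, so the boundary has to be checked and corrected:

```typescript
// Count line breaks in a chunk, treating \r\n as a single break.
function countLineBreaks(text: string): number {
  const matches = text.match(/\r\n|\r|\n/g);
  return matches ? matches.length : 0;
}

// Validation step for two adjacent chunks: a CR ending the left chunk
// followed by an LF starting the right chunk is ONE break, not two.
function joinedLineBreaks(left: string, right: string): number {
  let total = countLineBreaks(left) + countLineBreaks(right);
  if (left.endsWith("\r") && right.startsWith("\n")) {
    total -= 1; // merge the split CRLF pair back into a single break
  }
  return total;
}
```

Doing this boundary check on every edit is cheap for one pair of pieces, but the cost adds up when many pieces are touched, which is the bottleneck described above.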

Recording of opening some crazy files (large.c is 50 MB, 1.4 million lines; Heap-xyz is 50 MB, 3.3 million lines)

Left: Piece Table, Right: Insider
(GIF: file opening comparison)

@alexdima
Member

alexdima commented Jan 5, 2018

@rebornix I wonder if all of the initial loading speed advantage is due to the memory optimization (avoiding sliced strings) that I pushed a while ago. How does it look when setting AVOID_SLICED_STRINGS to false for the current implementation?
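For context, a sketch of the idea behind such an optimization (my reading of the flag's name; the helper below is hypothetical, not the actual vscode code): in V8, `substring()` can return a "sliced string" that keeps the entire parent string alive, so splitting a huge file into line substrings may retain the whole file in memory. Forcing a flat copy trades CPU at load time for lower retained memory:

```typescript
// Hypothetical helper: force a flat, independent copy of a string so it
// does not retain its (possibly huge) parent via V8's sliced strings.
function flattenString(s: string): string {
  // Concatenating and re-slicing defeats the sliced-string optimization;
  // the result no longer references the original parent string.
  return (" " + s).substring(1);
}
```

The extra copy is exactly the kind of per-line cost that could slow down file opening, which is the trade-off being asked about.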

@rebornix
Member Author

rebornix commented Jan 5, 2018

@alexandrudima you are right: the memory optimization is the major thing that makes the line buffer slow at file opening.

(screenshot)

Disabling AVOID_SLICED_STRINGS for files with a lot of lines improves opening speed. After disabling it, the major cost becomes split: splitting the text into lines is more costly than simply constructing line starts.
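To illustrate the cheaper alternative, here is a sketch of "constructing line starts": instead of materializing a string per line, record only the offset at which each line begins. The helper is a hypothetical illustration, not the actual vscode implementation:

```typescript
// Compute the start offset of every line in a chunk without splitting it.
// Handles \r\n, \r, and \n; \r\n counts as a single line break.
function computeLineStarts(chunk: string): number[] {
  const starts = [0];
  for (let i = 0; i < chunk.length; i++) {
    const code = chunk.charCodeAt(i);
    if (code === 13 /* \r */) {
      if (i + 1 < chunk.length && chunk.charCodeAt(i + 1) === 10 /* \n */) {
        i++; // consume the \n of a \r\n pair
      }
      starts.push(i + 1);
    } else if (code === 10 /* \n */) {
      starts.push(i + 1);
    }
  }
  return starts;
}
```

A single pass over the chunk with one small number array avoids both the per-line string allocations of splitting and the sliced-string retention problem discussed above.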

Line Buffer
(screenshot)

PT
(screenshot)

They are close, so from a performance perspective the current implementation is good; the cost shifts to memory.

@rebornix rebornix changed the title [WIP] buffer improvement with new model buffer improvement with piece tree Jan 18, 2018
@rebornix rebornix merged commit e54a781 into master Jan 19, 2018
@github-actions github-actions bot locked and limited conversation to collaborators Mar 27, 2020