MergeTree: Reimplement list #11625

Merged
merged 6 commits into from Aug 23, 2022

anthony-murphy
Contributor

Description

The pre-existing list class was inconsistent in its typing, naming, and semantics, which made it very hard to use and extend. It also had a performance issue: checking whether an item was in a list, an operation local references use frequently, was O(N).

This change fully reimplements the list class. It remains a circularly linked list, but separates the list and node implementations for better typing, and adds a reference to the head node on each list node so that determining whether a node is in a particular list is O(1). Additionally, all methods have been implemented to match array methods in shape and semantics as closely as possible, to ease usage for those familiar with JavaScript arrays. The new implementation does regress the unused clear method, which is now O(N) rather than O(1), as each node must also be detached from the head node.
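
As a rough illustration of the design described above (not the code in this PR), here is a minimal sketch of a circular list whose nodes carry a reference to their head node. The names `SketchList`, `head`, and `data` are assumptions made for this example; only the `HeadNode`/`DataNode` split is taken from the review discussion.

```typescript
// Hypothetical sketch; names and shapes are assumptions, not the merge-tree API.
class HeadNode<T> {
    // Circular links: an empty list's head points at itself.
    next: HeadNode<T> = this;
    prev: HeadNode<T> = this;
    // The head of the list this node currently belongs to.
    head: HeadNode<T> = this;
}

class DataNode<T> extends HeadNode<T> {
    readonly data: T;
    constructor(head: HeadNode<T>, data: T) {
        super();
        this.head = head;
        this.data = data;
    }
}

class SketchList<T> {
    private readonly head = new HeadNode<T>();
    private count = 0;

    get length(): number {
        return this.count;
    }

    // Append at the tail, mirroring Array.prototype.push in spirit.
    push(data: T): DataNode<T> {
        const node = new DataNode<T>(this.head, data);
        const last = this.head.prev;
        node.prev = last;
        node.next = this.head;
        last.next = node;
        this.head.prev = node;
        this.count++;
        return node;
    }

    // O(1) membership: compare the node's head reference with ours.
    has(node: DataNode<T>): boolean {
        return node.head === this.head;
    }

    remove(node: DataNode<T>): boolean {
        if (!this.has(node)) {
            return false;
        }
        node.prev.next = node.next;
        node.next.prev = node.prev;
        // Detach: the node becomes its own one-element ring.
        node.next = node.prev = node.head = node;
        this.count--;
        return true;
    }

    // O(N): every node has to drop its reference to this list's head.
    clear(): void {
        let current = this.head.next;
        while (current !== this.head) {
            const next = current.next;
            current.next = current.prev = current.head = current;
            current = next;
        }
        this.head.next = this.head.prev = this.head;
        this.count = 0;
    }
}
```

In this sketch `has` never walks the list, which is the O(1) check the description refers to, while `clear` must visit every node to reset its head reference, which is the O(N) regression it mentions.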

@anthony-murphy anthony-murphy requested a review from a team as a code owner August 22, 2022 22:25
@github-actions github-actions bot added the labels area: dds (Issues related to distributed data structures), public api change (Changes to a public API), and base: next (PRs targeted against next branch) Aug 22, 2022
@@ -3,142 +3,127 @@
* Licensed under the MIT License.
*/

Contributor Author

This file is best viewed in split mode, or by viewing the file directly, as it is a complete rewrite.

@msfluid-bot
Collaborator

msfluid-bot commented Aug 22, 2022

@fluid-example/bundle-size-tests: +17.53 KB

| Metric Name | Baseline Size | Compare Size | Size Diff |
| --- | --- | --- | --- |
| aqueduct.js | 392.45 KB | 394.76 KB | +2.32 KB |
| connectionState.js | 680 Bytes | 680 Bytes | No change |
| containerRuntime.js | 191.92 KB | 197.48 KB | +5.55 KB |
| loader.js | 151.12 KB | 151.06 KB | -57 Bytes |
| map.js | 42.63 KB | 47.38 KB | +4.75 KB |
| matrix.js | 131.63 KB | 134.98 KB | +3.35 KB |
| odspDriver.js | 150.23 KB | 150.11 KB | -127 Bytes |
| odspPrefetchSnapshot.js | 38.39 KB | 38.35 KB | -41 Bytes |
| sharedString.js | 152.42 KB | 154.2 KB | +1.77 KB |
| Total Size | 1.25 MB | 1.27 MB | +17.53 KB |

Baseline commit: cbb9ed6

Generated by 🚫 dangerJS against 40a67ef

Contributor

@Abe27342 Abe27342 left a comment

looks good, mostly nits.

public headNode: HeadNode<T>;
private readonly _list?: List<T>;
constructor(list: List<T> | undefined) {
this.headNode = this;
Contributor

is this different from the sugar you use to initialize _next and _prev? assuming not, i'd prefer consistent style

Contributor Author

Yes. When it is called by the DataNode, which inherits from the head node, the DataNode will pass this and set it to whatever head it is bound to.

Contributor Author

Ha, but it looks like I don't even use it that way 🤦‍♂️. I wrote this a while back and am just peeling off changes. I'll make it more consistent.

packages/dds/merge-tree/src/collections/list.ts (outdated)
@@ -279,8 +279,9 @@ export class LocalReferenceCollection {
public removeLocalRef(lref: LocalReferencePosition): LocalReferencePosition | undefined {
if (this.has(lref)) {
assertLocalReferences(lref);
// eslint-disable-next-line @typescript-eslint/no-non-null-assertion
ListRemoveEntry(lref.getListNode()!);
lref.getListNode()?.list?.remove(
Contributor

consider assigning lref.getListNode() to a local
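
A small sketch of what that suggestion might look like. The interfaces `SketchListNode` and `SketchRef` and the function `detachFromList` are hypothetical stand-ins; the real signatures of `getListNode` and `remove` are not shown in this excerpt and are assumed here.

```typescript
// Hypothetical shapes, assumed for illustration only.
interface SketchListNode {
    list: { remove(node: SketchListNode): void } | undefined;
}

interface SketchRef {
    getListNode(): SketchListNode | undefined;
}

// Call getListNode() once and reuse the result, instead of calling it
// once to reach the list and again to pass the node.
function detachFromList(ref: SketchRef): boolean {
    const node = ref.getListNode();
    if (node === undefined || node.list === undefined) {
        return false;
    }
    node.list.remove(node);
    return true;
}
```

Holding the node in a local also lets the compiler narrow its type once, so no repeated optional chaining or non-null assertion is needed.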

return false;
}, true);
const taken: SegmentGroup[] = [];
while (taken.length < count && node) {
Contributor

we should implement this in O(N) since it does look like it's used in potential production codepaths, even if for potentially rare scenarios (group ops--not sure how much people use that). Should just be able to use .push and then .reverse at the end.

Contributor Author

Groups are mainly used for reconnecting, as things could split during rebase.

Is the problem that unshift enumerates all elements, which requires another loop? I think I can even skip the reverse by using an initialized array and a for loop.

Contributor

yeah, unshift is O(N) as it moves all of the array elements over. Using an initialized array + for loop is fine too.
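
The two options discussed in this thread can be sketched as follows. This is an illustrative sketch, not the code that was merged: `SketchNode`, `takeLastPreallocated`, and `takeLastReversed` are names invented for the example, and the list is modelled as a simple chain of `prev` links walked from the tail.

```typescript
// Hypothetical node shape: each node links to the one before it.
interface SketchNode<T> {
    data: T;
    prev: SketchNode<T> | undefined;
}

// Option 1: size the array up front and fill it from the back,
// so the result is already in list order and no reverse is needed.
function takeLastPreallocated<T>(tail: SketchNode<T> | undefined, count: number): T[] {
    // First pass: how many items are actually available, up to count.
    let available = 0;
    for (let n = tail; n !== undefined && available < count; n = n.prev) {
        available++;
    }
    const taken = new Array<T>(available);
    // Second pass: walk backwards again, writing from the last slot down.
    let node = tail;
    let i = available - 1;
    while (node !== undefined && i >= 0) {
        taken[i--] = node.data;
        node = node.prev;
    }
    return taken;
}

// Option 2: push while walking backwards, then reverse once at the end.
function takeLastReversed<T>(tail: SketchNode<T> | undefined, count: number): T[] {
    const taken: T[] = [];
    let node = tail;
    while (taken.length < count && node !== undefined) {
        taken.push(node.data);
        node = node.prev;
    }
    return taken.reverse();
}
```

Both variants do work proportional to the number of items taken, whereas calling unshift inside the loop shifts the existing elements on every iteration and so can be quadratic overall.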

anthony-murphy and others added 2 commits August 22, 2022 18:41
Co-authored-by: Abram Sanderson <Abram.sanderson@gmail.com>
@@ -113,18 +113,18 @@ export class Client {
* It is used to get the segment group(s) for the previous operations.
* @param count - The number of segment groups to peek from the tail of the queue. Default 1.
Contributor

Not related to your change, but I wouldn't know without looking at the code what order the returned SegmentGroups were in. It would be helpful to update the comment, optionally.

@anthony-murphy anthony-murphy merged commit d69ff5a into microsoft:next Aug 23, 2022
@anthony-murphy anthony-murphy deleted the reimplement-list branch August 23, 2022 20:31