Skip to content

Tree: EditManager summary#12513

Merged
yann-achard-MS merged 23 commits into
microsoft:mainfrom
yann-achard-MS:edit-manager-summary-take2
Oct 21, 2022
Merged

Tree: EditManager summary#12513
yann-achard-MS merged 23 commits into
microsoft:mainfrom
yann-achard-MS:edit-manager-summary-take2

Conversation

@yann-achard-MS
Copy link
Copy Markdown
Contributor

@yann-achard-MS yann-achard-MS commented Oct 17, 2022

Changes

This PR adds EditManager data to the set of data being summarized.

Motivation

This is necessary in order to ensure that clients which load a document while it is being edited are able to interpret the edits correctly.

Implementation Notes

This PR does not add support for generating summaries from clients with local edits. EditManager could be updated to support this (see comments in editManager.ts).

The size of this additional summary data is currently unbounded because the EditManager data is unbounded (grows for each edit). This is being addressed separately, and the fix will result in the additional summary data becoming bounded as well.

@github-actions github-actions Bot added area: dds Issues related to distributed data structures base: main PRs targeted against main branch labels Oct 17, 2022
@yann-achard-MS yann-achard-MS marked this pull request as ready for review October 18, 2022 17:18
@yann-achard-MS yann-achard-MS requested a review from a team as a code owner October 18, 2022 17:18

private localSessionId?: SessionId;

readonly computationName: string = "EditManager";
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing access modifier here

super();
}

public setSessionId(id: SessionId): void {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could call this initSessionId to make it more implicit that it can only be set once, but it's totally fine as-is too.

/**
* The storage key for the blob in the summary containing EditManager data
*/
const blobKey = "EditManagerBlob";
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's nothing wrong with this as they are, but you don't have to worry about these keys colliding with other indexes since they get scoped to each index. So if you wanted to, you could just call them "Blob" and "String", for example.

}
}

export function encodeSummary<TChange>(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we call this stringifySummary? Fluid summarizers use encode in a way that doesn't include stringifying. I don't know if it's a good naming convention but maybe we can avoid confusion by matching it.

trackState?: boolean,
telemetryContext?: ITelemetryContext,
): ISummaryTreeWithStats {
const builder = new SummaryTreeBuilder();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method can use the createSingleBlobSummary helper

telemetryContext?: ITelemetryContext,
): Promise<ISummaryTreeWithStats> {
const editDataBlobHandle = await this.editDataBlob.get();
const builder = new SummaryTreeBuilder();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method can use the createSingleBlobSummary helper

services: IChannelStorageService,
parse: SummaryElementParser,
): Promise<void> {
const [hasString, hasBlob] = await Promise.all([
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it will really make a noticeable difference in practice, but I think if you did:

if (await services.contains(blobKey)) {
    const handleBuffer = await services.readBlob(blobKey);
    ...
} else {
    assert(await services.contains(stringKey), "EditManager data is required in summary");
    schemaBuffer = await services.readBlob(stringKey);
}

The flow is a little bit more minimal. Or maybe even:

...
} else {
    schemaBuffer = await services.readBlob(stringKey).catch(() => undefined) ?? fail("EditManager data is required in summary"));
}


await provider.ensureSynchronized();

// Stop the processing of incoming changes on tree2 so that it does not learn about the deletion of Z
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tree2 -> tree3

Copy link
Copy Markdown
Contributor

@noencke noencke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summarization code looks good!

Comment on lines +183 to +192
validateTree(tree1, ["A", "B", "C"]);
validateTree(tree2, ["A", "B", "C"]);
validateTree(tree3, ["A", "B", "C"]);
// tree4 should only get the correct end state if was able to get the adequate
// EditManager state from the summary. Specifically, in order to correctly rebase the insert
// of B, tree4 needs to have a local copy of the edit that deleted Z, so it can
// rebase the insertion of B over that edit.
// Without that, it will interpret the insertion of B based on the current state, yielding
// the order ACB.
validateTree(tree4, ["A", "B", "C"]);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
validateTree(tree1, ["A", "B", "C"]);
validateTree(tree2, ["A", "B", "C"]);
validateTree(tree3, ["A", "B", "C"]);
// tree4 should only get the correct end state if was able to get the adequate
// EditManager state from the summary. Specifically, in order to correctly rebase the insert
// of B, tree4 needs to have a local copy of the edit that deleted Z, so it can
// rebase the insertion of B over that edit.
// Without that, it will interpret the insertion of B based on the current state, yielding
// the order ACB.
validateTree(tree4, ["A", "B", "C"]);
const expectedValues = ["A", "B", "C"];
validateTree(tree1, expectedValues);
validateTree(tree2, expectedValues);
validateTree(tree3, expectedValues);
// tree4 should only get the correct end state if was able to get the adequate
// EditManager state from the summary. Specifically, in order to correctly rebase the insert
// of B, tree4 needs to have a local copy of the edit that deleted Z, so it can
// rebase the insertion of B over that edit.
// Without that, it will interpret the insertion of B based on the current state, yielding
// the order ACB.
validateTree(tree4, expectedValues);

@yann-achard-MS yann-achard-MS merged commit 5fc5083 into microsoft:main Oct 21, 2022
sharptrip pushed a commit to sharptrip/FluidFramework that referenced this pull request Oct 28, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area: dds Issues related to distributed data structures base: main PRs targeted against main branch

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants