New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix issues related to Removed Segments in MergeTree #11678
Fix issues related to Removed Segments in MergeTree #11678
Conversation
…to main-fix-mt-remove
…y/FluidFramework-1 into main-fix-mt-remove
…to main-fix-mt-remove
…to main-fix-mt-remove
…to next-fix-mt-remove
|
this is basically the same change that has been in draft for a while now, i just reduced to scope a bit, and put it under an option for easier rollout control. |
⯅ @fluid-example/bundle-size-tests: +17.29 KB
Baseline commit: b7b2bd3 |
| @@ -81,7 +81,7 @@ function runConflictFarmTests(opts: IConflictFarmConfig, extraSeed?: number): vo | |||
| } | |||
| mt.seedWithArray(seedArray); | |||
|
|
|||
| const clients: TestClient[] = [new TestClient()]; | |||
| const clients: TestClient[] = [new TestClient({ mergeTreeUseNewLengthCalculations: true })]; | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we be concerned about losing coverage for the non-mergeTreeUseNewLengthCalculations case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We shouldn't mix it, however the results of mixing it won't be worse that than the results of having it off, as without it in the rare cases this fixes, we will already have decoherence. I've turned it on here just to keep coverage, it is off for the replay tests which cover very similar cases, and on in only a couple specific regression cases in applyMsg.spec.ts.
…to next-fix-mt-remove
Co-authored-by: Abram Sanderson <Abram.sanderson@gmail.com>
…y/FluidFramework-1 into next-fix-mt-remove
| // ensure all clients have seen the same ops | ||
| assert.equal( | ||
| c.getCurrentSeq(), | ||
| this.clients[0].getCurrentSeq(), | ||
| `Client ${c.longClientId} current seq does not match client ${this.clients[0].longClientId}`, | ||
| ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any value in printing the seq numbers in the message in addition to the client ids?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i don't think so. this should only ever fire in test development, and signals that the test code the processes messages is incorrect. the sequence numbers themselves won't tell the developer much without debugging
| it("Local insert after acked local delete", () => { | ||
| const clients = createClientsAtInitialState( | ||
| { initialState: "ZZ", options: { mergeTreeUseNewLengthCalculations: true } }, | ||
| "A", "B", "C"); | ||
|
|
||
| const logger = new TestClientLogger(clients.all); | ||
|
|
||
| let seq = 0; | ||
|
|
||
| const op1 = clients.C.makeOpMessage(clients.C.removeRangeLocal(0, 1), ++seq); | ||
| clients.C.applyMsg(op1); | ||
|
|
||
| const op2 = clients.B.makeOpMessage(clients.B.removeRangeLocal(1, 2), ++seq); | ||
|
|
||
| const op3 = clients.C.makeOpMessage(clients.C.insertTextLocal(0, "C"), ++seq); | ||
|
|
||
| const op4 = clients.B.makeOpMessage(clients.B.insertTextLocal(1, "B"), ++seq); | ||
|
|
||
| clients.A.applyMsg(op1); | ||
| clients.B.applyMsg(op1); | ||
|
|
||
| const messages = [op2, op3, op4]; | ||
| while (messages.length > 0) { | ||
| const msg = messages.shift()!; | ||
| clients.all.forEach((c) => c.applyMsg(msg)); | ||
| } | ||
|
|
||
| logger.validate(); | ||
| }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Am I understanding correctly, that in these tests we don't check against a hardcoded expected final state/string? logger.validate() only seems to check that all clients have the same end state as the first one... but do we have any guarantees that that one is right in the first place?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's a correct understanding. we have other test like the replay tests that capture the state and ensure it doesn't diverge over time.
it wouldn't be a bad idea to capture the end state in these cases as we can know it and validate against it. Locally i actually have a change to the logger that takes in a base text. I'll file a bug to get these tests updated once that capability is added.
AB#1803
Handling removed segments in MergeTree was incorrect, specifically when to consider them in the case of conflict. This case was complicated by the fact that clients can have different removed segments depending on their lifetime. For instance, a new client loading from summary could have no removed segments, while long running clients could have many and different removed segments yet to be cleaned up by zamboni.
Previously there was logic added to treat the length of some deleted segments as undefined. This change was made as any segment of length 0 is considered when resolving a conflict, and this allowed the merge tree to differentiate removed segments from segments with sequence numbers greater than the reference sequence number of some incoming operation. This improved the situation, as it allowed us to exclude those segments from conflict resolution.
However, the heuristic to determine when to consider a removed segment for conflict resolution or not was incorrect, and that is what is improved upon here. Specifically, removed segments who's removed sequence number in within the collab window, are now consider for conflict resolution, and those outside the collab window are not. One key point is we do not consider the removed sequence number itself in conflict resolution, we only consider the sequences insert sequence number as segment order has always been determined by insert order which is an invariant we cannot break.
This works because all clients, regardless of lifetime must agree on the state of all segments in the collab window, as these segments are not yet finalized in the sense that they could be above or below the reference sequence number for other clients, so all clients must be able to reason over them.
This change also brings in some new unit tests that would previously fail, and does a bit of unit test refactoring to make them more consistent.
For now, i've put this new length calculation under and option, as it will regress some existing assumptions around undo redo, but to fix undo redo i want to be able to test with this change
fixes #9703
related #1009 AB#45 AB#841