Skip to content

Override minimum sequence number of deltamanager#20856

Merged
tyler-cai-microsoft merged 11 commits into
microsoft:mainfrom
tyler-cai-microsoft:override-minimum-sequence-number-of-deltaManager
May 6, 2024
Merged

Override minimum sequence number of deltamanager#20856
tyler-cai-microsoft merged 11 commits into
microsoft:mainfrom
tyler-cai-microsoft:override-minimum-sequence-number-of-deltaManager

Conversation

@tyler-cai-microsoft
Copy link
Copy Markdown
Contributor

@tyler-cai-microsoft tyler-cai-microsoft commented Apr 25, 2024

AB#7843

Override the minimumSequenceNumber of the DeltaManager from the ContainerRuntime to the layers below so that DDSes maintain the proper local state window so that they can properly apply and process local ops when rebasing or applying stashed ops.

@github-actions github-actions Bot added base: main PRs targeted against main branch area: runtime Runtime related issues labels Apr 25, 2024
@msfluid-bot
Copy link
Copy Markdown
Collaborator

msfluid-bot commented Apr 25, 2024

Could not find a usable baseline build with search starting at CI a0b0357

Generated by 🚫 dangerJS against 94b136c

Comment thread packages/runtime/container-runtime/src/containerRuntime.ts Outdated
Comment thread packages/runtime/container-runtime/src/containerRuntime.ts Outdated
Comment thread packages/runtime/container-runtime/src/deltaManagerProxies.ts Outdated
Comment thread packages/runtime/container-runtime/src/containerRuntime.ts Outdated
}

export class DeltaManagerPendingOpsProxy extends BaseDeltaManagerProxy {
public get minimumSequenceNumber(): number {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is only one method where MSN shows up. I think we need to review API and see what else needs to be faked.
For example, DM.lastMessage.minimumSequenceNumber should match what this thing returns (although it's worth pointing out that it will "magically" change its value once we reconnect, as MSN will move forward).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated this, do you think there are other parts? I did an investigation based on looking at minimumSequenceNumber and ISequenceDocumentMessage usage in the codebase.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My investigation was based on two main paths - summarization and processing ops. Do you think the ScheduleManager needs to have this "proxied" view of the world?

Comment thread packages/runtime/container-runtime/src/pendingStateManager.ts Outdated
Comment thread packages/runtime/container-runtime/src/deltaManagerProxies.ts Outdated
Comment thread packages/runtime/container-runtime/src/pendingStateManager.ts Outdated
@github-actions github-actions Bot added the area: tests Tests to add, test infrastructure improvements, etc label May 1, 2024
@github-actions github-actions Bot added the public api change Changes to a public API label May 1, 2024
// It allows us to lie to the layers below so that they can maintain enough local state for rebasing ops.
if (useDeltaManagerOpsProxy) {
const pendingOpsDeltaManagerProxy = new DeltaManagerPendingOpsProxy(
summarizerDeltaManagerProxy,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: use outerDeltaManager instead of summarizerDeltaManagerProxy here. Allows easier recording of adapters (if needed)

const { message, local } = messageWithContext;

// Intercept to reduce minimum sequence number to the delta manager's minimum sequence number.
messageWithContext.message.minimumSequenceNumber = this.deltaManager.minimumSequenceNumber;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we ensure that we have enough validation in place that everything works fine?
At minimum, I'd assert here that MSN moves (with this change) only backwards. I.e. we could replace it with smaller number only on message.
Ideally, we would have an assert somewhere that it's noop if we have no pending messages or already connected and went through resubmit flow.

const minPendingSeqNum = this.pendingStateManager.minimumPendingMessageSequenceNumber;
if (
minPendingSeqNum !== undefined &&
minPendingSeqNum < this.deltaManager.minimumSequenceNumber
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can it happen that minPendingSeqNum > this.deltaManager.minimumSequenceNumber ?
I think not, and if it's the case - we should assert that.

Copy link
Copy Markdown
Contributor Author

@tyler-cai-microsoft tyler-cai-microsoft May 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely, adding a comment to explain a scenario for such an occurrence. Just submitting a regular op should do the trick, as this is based on ref seq number. Any other lagging client can cause min seq number to be lesser than expected.

@tyler-cai-microsoft tyler-cai-microsoft enabled auto-merge (squash) May 6, 2024 20:41
// D-C-AB
// E-HIJ-FG-D-C-AB
// ^----------^
sharedString3.insertText(0, "AB");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fwiw, I think this test may be sensitive to sequencing order of the ops in each segment, which is no longer deterministic due to the e2e nature (you can take pains to make it so, but I'm not sure).

I think we should probably lean more heavily on unit test coverage rather than e2e in this realm anyway, and I see you added some to pending state manager, so I'm not super concerned.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I can see what I can do to fix this. You're right this can lead to test flakiness

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A PR CI hit this failure. I'll send you the result. This might need to be disabled.

@tyler-cai-microsoft tyler-cai-microsoft enabled auto-merge (squash) May 6, 2024 21:24
@tyler-cai-microsoft tyler-cai-microsoft merged commit 9ed3bf3 into microsoft:main May 6, 2024
tyler-cai-microsoft added a commit that referenced this pull request May 8, 2024
There was a flakey test created by this,
#20856

Since the test really isn't to test the service, and we require the
service to behave in a certain way when it comes to ordering ops (the
service does not need to promise a specific order, just that they are
ordered), we will just use local server as we can predict the behavior
quite predictably.
tyler-cai-microsoft added a commit that referenced this pull request Aug 29, 2024
There is no task associated with this.

Relevant reference PR:
#20856

@agarwal-navin and I had an offline conversation about this comment, and
it was hard to understand. I've now improved the comment for it to be
easier to understand.

---------

Co-authored-by: Navin Agarwal <45832642+agarwal-navin@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area: runtime Runtime related issues area: tests Tests to add, test infrastructure improvements, etc base: main PRs targeted against main branch public api change Changes to a public API

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants