Skip to content

fix(Staging-Mode): Handle rollback case for DocumentSchemaMessages #24776

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jun 6, 2025

Conversation

markfields
Copy link
Member

@markfields markfields commented Jun 6, 2025

Description

A DocumentSchema message could be sent in Staging Mode*, which means we need to handle what happens if it's rolled back.

Turns out it's easy - no internal state changes when generating the op to be submitted, only when processing. Except for a boolean flag indicating whether we are waiting for a pending op, or if we should still try to generate one. So all that's needed is to reset this flag when rolling back.
 
* e.g. if the very first thing the user does on boot (in a session where a schema upgrade is needed) is to enter staging mode. Document Schema messages aren't submitted until another op is, so it would end up staged along with that op in this case.

Reviewer Guidance

This replaces #24729. This branch includes #24774 as well, so blocking this on merging that first.

@Copilot Copilot AI review requested due to automatic review settings June 6, 2025 18:18
@github-actions github-actions bot added base: main PRs targeted against main branch area: runtime Runtime related issues labels Jun 6, 2025
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes the rollback handling for DocumentSchema messages sent in staging mode by resetting the op pending flag instead of using the previous onDisconnect behavior. Key changes include:

  • Updating tests to use pendingOpNotAcked instead of onDisconnect.
  • Refactoring DocumentsSchemaController to use an opPending flag for controlling schema message generation.
  • Adjusting ContainerRuntime logic to properly resubmit or rollback DocumentSchemaChange ops and improving related telemetry and summarizer client handling.

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
packages/runtime/container-runtime/src/test/documentSchema.spec.ts Tests updated to call pendingOpNotAcked to reset pending op state.
packages/runtime/container-runtime/src/summary/documentSchema.ts Replaces generateOp with opPending flag and updates related method logic.
packages/runtime/container-runtime/src/pendingStateManager.ts Adds documentation for opMetadata, localOpMetadata, and sequenceNumber.
packages/runtime/container-runtime/src/gc/garbageCollection.ts Enhances comments regarding telemetry and GC op behavior.
packages/runtime/container-runtime/src/containerRuntime.ts Refactors summarizer client flag usage and op resubmission, including squashing logic.
Comments suppressed due to low confidence (1)

packages/runtime/container-runtime/src/summary/documentSchema.ts:691

  • Consider expanding the method documentation for pendingOpNotAcked to clarify that it resets the opPending flag for rollback operations in staging mode, differentiating it from the older onDisconnect behavior.
public pendingOpNotAcked(): void {

@markfields markfields marked this pull request as draft June 6, 2025 18:19
Comment on lines -2735 to -2747
if (!canSendOps) {
this.documentsSchemaController.onDisconnect();
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vladsud - Can you confirm that this call was only needed in the case where we had a pending Document Schema op which failed to make it to the ordering service? i.e. the resubmit case? I moved this call (and renamed it) to resubmit, just wanted to make sure I don't also need it here for some reason I can't see yet.

cc @scottn12 and @ChumpChief as other people familiar with Document Schema

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to read the code to refresh my memory, but yes, your assumption looks correct.

@markfields markfields merged commit 8edb885 into microsoft:main Jun 6, 2025
35 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: runtime Runtime related issues base: main PRs targeted against main branch Feature_StagingMode
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants