-
Notifications
You must be signed in to change notification settings - Fork 550
improvement(staging-mode): Clarify Squash/Rollback behavior for ContainerMessageType.GC #24774
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR clarifies how GC operations behave in Staging Mode and refactors the resubmit logic to treat squash support as an exceptional case.
- Enhance
IPendingMessage
with clearer metadata fields for op submission and sequencing. - Add explanatory comments around GC tombstone telemetry and staging behavior.
- Refactor
ContainerRuntime
to introduceisSummarizerClient
, wire it through lifecycle paths, and separate squash vs. non-squash resubmit flows.
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
File | Description |
---|---|
packages/runtime/container-runtime/src/pendingStateManager.ts | Added localOpMetadata , opMetadata , and sequenceNumber fields with comments. |
packages/runtime/container-runtime/src/gc/garbageCollection.ts | Expanded comment to list the telemetry events preceding tombstone load errors. |
packages/runtime/container-runtime/src/containerRuntime.ts | Introduced isSummarizerClient ; updated monitoring context; refactored reSubmitBatch , reSubmitWithSquashing , and reSubmit to distinguish squash paths and GC/DocumentSchemaChange handling. |
Comments suppressed due to low confidence (4)
packages/runtime/container-runtime/src/pendingStateManager.ts:57
- The
localOpMetadata
field is typed asunknown
. Consider defining and using a more specific interface to improve type safety and documentation of expected metadata shape.
localOpMetadata: unknown;
packages/runtime/container-runtime/src/gc/garbageCollection.ts:1048
- [nitpick] The comment starts with lowercase "i.e."; for consistency and readability, consider capitalizing the first letter or rephrasing to begin a new sentence.
// i.e. this will be preceded by one of these telemetry events;
packages/runtime/container-runtime/src/containerRuntime.ts:1189
- Public or private properties injected in multiple lifecycle paths benefit from a doc comment explaining their purpose, especially
isSummarizerClient
since it alters teardown and resubmit behavior.
private readonly isSummarizerClient: boolean;
packages/runtime/container-runtime/src/containerRuntime.ts:4680
- The new squash vs. non-squash code paths (
reSubmitWithSquashing
vs.reSubmit
) introduce conditional behavior for GC and schema ops. Ensure unit tests cover each branch, including GC being forwarded or dropped under squash.
const resubmitFn = squash
* for correlation to detect container forking. | ||
*/ | ||
private reSubmitBatch( | ||
batch: PendingMessageResubmitData[], | ||
{ batchId, staged, squash }: PendingBatchResubmitMetadata, | ||
): void { | ||
assert( | ||
this._summarizer === undefined, | ||
0x8f2 /* Summarizer never reconnects so should never resubmit */, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How come you added a hex code here? Is this copilot generated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's just a move from LHS line 4687
// Just drop it, but log an error, this is not expected and not ideal, but not critical failure either. | ||
// Currently the only expected type here is TombstoneLoaded, which will have been preceded by one of these events as well: | ||
// GC_Tombstone_DataStore_Requested, GC_Tombstone_SubDataStore_Requested, GC_Tombstone_Blob_Requested | ||
this.mc.logger.sendErrorEvent({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to log more details here such as the id of the object, pkg, etc? That will help debugging if and when we see this log. As of now, debugging this will be very hard.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it'll be ok because there will be a corresponding GC_Tombstone_DataStore_Requested
(or other object) event with more info
Description
GC ops can be made while in Staging Mode, and we need to be very clear with how they're handled.
GC_OpDiscarded
to highlight this caseinStagingMode
to all ContainerRuntime logs, includingGC_Tombstone_DataStore_Requested
etc, to further aid investigationsAlso refactor the Resubmit code so it's clear that being able to support squash is an exceptional case - most op types don't need to deal with it.
Reviewer Guidance
This was pulled out of #24729 since it's really a separate change.