Implement insertion order support for SharedMap #17580
Conversation
⯅ @fluid-example/bundle-size-tests: +32.91 KB
Baseline commit: af58ac2
Got through reading the existing tests today; I'll start looking through the implementation tomorrow. One comment on the test plan: we should add ordering consistency validation to the map fuzz tests' assertMapsAreEquivalent.
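For illustration, a minimal sketch of what such an ordering check could look like (the helper name and signature are hypothetical, not the actual fuzz-test code; it assumes both maps enumerate keys in insertion order once this feature lands):

```typescript
// Hypothetical ordering check; assumes keys() enumerates entries in insertion order.
function assertKeyOrderMatches(a: ReadonlyMap<string, unknown>, b: ReadonlyMap<string, unknown>): void {
    const aKeys = [...a.keys()];
    const bKeys = [...b.keys()];
    if (aKeys.length !== bKeys.length) {
        throw new Error(`Key count mismatch: ${aKeys.length} vs ${bKeys.length}`);
    }
    for (let i = 0; i < aKeys.length; i++) {
        if (aKeys[i] !== bKeys[i]) {
            throw new Error(`Key order mismatch at index ${i}: "${aKeys[i]}" vs "${bKeys[i]}"`);
        }
    }
}
```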
Have some comments on top-level state that I'd like clarifications on before thinking too deeply about the logic changes to mapKernel.
/**
 * Object to store the message Id's for all set op's
 */
private readonly pendingSetTracker: Map<string, number[]> = new Map();
How is this different from pendingKeys? Seems like it serves a similar purpose and it'd be great to only store one.
If you can unify them, please update the doc comment on pendingKeys to note the meaning of number[] as you've done here.
The purpose of pendingKeys is to maintain message IDs associated with all pending ops on a specific key, including both set and delete. In contrast, pendingSetTracker is dedicated solely to tracking pending set messages. There is some overlap in functionality between these two data structures.
If we want to unify them, one option is to modify the design of pendingDeleteTracker from a Map<string, number> (which currently stores only the count of pending delete messages) to a Map<string, number[]>. This change would allow pendingDeleteTracker to also store the IDs of pending delete messages. As a result, the combination of pendingSetTracker and the enhanced pendingDeleteTracker could replace the need for pendingKeys completely.
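As a rough sketch of that unified shape (illustrative declarations only, not the actual mapKernel fields; the helper assumes pending message ids increase in submission order):

```typescript
// Illustrative declarations only, not the actual mapKernel fields.
// Message ids of pending set ops per key (what pendingSetTracker already stores).
const pendingSetTracker: Map<string, number[]> = new Map();
// Proposed change: store the message ids (not just the count) of pending delete ops per key.
const pendingDeleteTracker: Map<string, number[]> = new Map();

// Together the two trackers can answer what pendingKeys answers today.
// Assumes pending message ids are assigned in increasing submission order.
function pendingIdsForKey(key: string): number[] {
    const ids = [...(pendingSetTracker.get(key) ?? []), ...(pendingDeleteTracker.get(key) ?? [])];
    return ids.sort((a, b) => a - b);
}
```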
I feel like it's probably conceptually simpler to organize the state more like you've suggested. I'd propose a slightly different organization: map from key to an ordered list of 'events' that occurred at that key. We could use pending message id for those events. Then we'd also maintain lookup from message id to whatever op metadata we need.
I'd rather tackle this kind of thing in a follow-up PR though (maybe before we bring it to main, maybe after, not sure)
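A minimal sketch of that alternative organization (type and variable names are hypothetical, illustrative only):

```typescript
// Hypothetical names; illustrative only.
interface PendingOpMetadata {
    type: "set" | "delete";
    // ...whatever resubmit/rollback need (previous value, previous index, etc.)
}

// Each key maps to the ordered list of pending 'events', identified by pending message id.
const pendingEventsPerKey: Map<string, number[]> = new Map();
// Separate lookup from pending message id to the op metadata for that event.
const pendingOpMetadata: Map<number, PendingOpMetadata> = new Map();
```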
Overall looking pretty good. I've reviewed the logic for set/delete and took a cursory glance at summarization concerns. Still need to dig deeper into:
- clear
- resubmit
- local rollback
- stashed ops (would be OK settling that in a different work item; the logic in the message handlers looks sane, I just doubt we have good coverage of it)
- finish summarization review
previousValue: ILocalValue;

/**
 * All associated pending message id's of a local set op, or the insertion index of an ack'd key, or both
It's not obvious how this description of the state maps to the type presented. Could you clarify more?
In general nested arrays are a bit of a code smell without qualification (except when dealing with fundamentally N-D data)
Added examples to explain it
packages/dds/map/src/mapKernel.ts (Outdated)
if (op.type === "set") {
    this.ackPendingSetOp(op, pendingMessageId);
} else if (op.type === "delete") {
    // Adjust the keys order if it is already ack'd
This comment doesn't look like it applies here.
Added more detailed comments
}

pendingSetIds.shift();
if (pendingSetIds.length === 0) {
leftover comment
packages/dds/map/src/mapKernel.ts (Outdated)
if (this.pendingKeys.size > 0) {
    this.clearExceptPendingKeys();
    return;
}
this.clearCore(local);
this.localKeysIndexTracker.clear();
this.pendingSetTracker.clear();
Nit: it's kinda pre-existing, but I feel like if-else control flow here is simpler than the early return for this.pendingKeys.size > 0.
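For reference, the quoted snippet restructured with if/else instead of the early return (purely illustrative; behavior unchanged):

```typescript
// The snippet above with if/else instead of an early return; behavior unchanged.
if (this.pendingKeys.size > 0) {
    this.clearExceptPendingKeys();
} else {
    this.clearCore(local);
    this.localKeysIndexTracker.clear();
    this.pendingSetTracker.clear();
}
```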
@@ -386,16 +602,44 @@ export class MapKernel {

/**
 * Populate the kernel with the given map data.
 *
 * If there exists data created during the detached stage without assigned creation index, it is essential to populate these indices
Rather than populating implicitly in summary load here, we should populate this explicitly while generating the attach summary. This will make sure we're able to distinguish documents generated with modern versions of the code from legacy ones once we move to using something attributable (seq) rather than an index.
I suppose it is already covered in the "scenario 2" mentioned in the comments of the function getSerializedStorage. The creation indices are populated explicitly when generating the attach summary.
 * If there exists data created during the detached stage without assigned creation index, it is essential to populate these indices
 * (starting from 0) and keep track of their count.
 *
 * Additionally, for loading data in the old format lacking an index field, it is necessary to populate the corresponding creation index
great, agreed this is important.
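As a rough illustration of the populate step described in that doc comment (all names here are hypothetical, not the actual kernel code):

```typescript
// Hypothetical sketch: assign creation indices (starting from 0) to entries loaded
// without one, and return how many were populated, per the doc comment above.
interface LoadedEntry {
    key: string;
    creationIndex?: number;
}

function populateMissingCreationIndices(entries: LoadedEntry[]): number {
    let nextIndex = 0;
    for (const entry of entries) {
        if (entry.creationIndex === undefined) {
            entry.creationIndex = nextIndex++;
        }
    }
    return nextIndex;
}
```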
 *
 * There are two scenarios in which the creation index may be absent:
 * 1. When the client is detached, we defer assigning the creation index until the data is about to be summarized.
 * 2. When the client is attached, but data was created during the detached stage, the creation indices for that data are still missing.
Interesting. I think that might be an inaccuracy in the harness: the attach summary should be generated before attaching.
type: "delete",
pendingMessageId,
previousValue: localOpMetadata.previousValue,
previousIndex: localOpMetadata.previousIndex,
"firstPendingSetId" is probably better
even if it's maybe a little inaccurate (it's the first pending set after the most recently submitted delete)
pendingMessageId,
previousValue: localOpMetadata.previousValue,
previousIndex: localOpMetadata.previousIndex,
};
That this is not undefined was already checked above.
};
if (pendingSetIds.length === 0) {
previousValue: localOpMetadata.previousValue,
previousIndex: localOpMetadata.previousIndex,
};

this.submitMessage(op, localMetadata);
}
It seems wrong that we're using a message id rather than an index here.
…even on setup (microsoft#17964) ## Description The initial implementation of the performance runs would skip drivers only on the load and summarize events but would still run once during initial setup. AFR is timing out even on that step for Matrix tests. In order to avoid this timeout, we are now enforcing the skip even during the initial setup.
## Description In order to better understand the scenarios where T9S is triggering NoJoinOps due to clients disappearing, we need the verbose logs to be logged for all the containers and not just 50% of them. When we have a fix for it, or if the amount of space on Kusto increases dramatically, we might revisit and go back to 50%.
… Connection Manager being disposed before new connection setup (microsoft#17962) ## Description The latest event (fluid:telemetry:RouterliciousDriver:DeltaConnection:ConnectDocumentSuccess) added to try to address the NoJoinOps happening in tinylicious seems to indicate a new client is added and a join op is sent over the wire, but there is no additional logging for that clientid. The current theory is that the connection manager is being disposed of while waiting for the await docService.connectToDeltaStream to return. We are also adding additional verbose logs to better understand if it still happens.
Fix a few errors found during alpha/beta publishing. The changes are around: 1. the setPackageTypesField flub command 2. Resolving an error validation check.
Update api-extractor-base config to not generate rollups by default. Enable rollups in tree2, devtools, api-extractor-documenter
It turned out that the `lib` build is missing for the `@fluid-experimental/property-proxy` package because of a misconfiguration. This PR fixes the issue.
…n garbage collector (microsoft#17948) This PR moves the gc tombstone configs initialization into one place in the garbage collector configs. Currently, the same configs are initialized in DataStores, DataStoreContext and GarbageCollector, making it possible for them to get out of sync. Also, changing them requires changes everywhere, making changes more complex and prone to errors.
…#17972) SharedListNode.insert*() needed adjustment to accept the new factory-created content. SharedListNode.move*() needed adjustment to accept a SharedListNode as source. Testing also uncovered and fixed a couple of minor bugs in the move implementation: * Use 'boxedAt()' when validating that the items in the source list comply with the destination type. * The adjustment of the destination index wasn't quite right. --------- Co-authored-by: Noah Encke <78610362+noencke@users.noreply.github.com>
## Description Related to ADO:6011. Also, it was brought to my attention by @clarenceli-msft. The issues here were: - `this.clientSequenceNumber` was only incremented at read time when grouped batching was enabled. This means that when batches are ungrouped, there can be two messages with the same client sequence number, the same sequence number, and different content. - Grouped batching uses fake client sequence numbers, just to differentiate between messages within the same batch.
… to 2.0.3 (microsoft#17975) Picks up this fix: microsoft#17947
Some fixes for the Node.js cluster module POC from microsoft#17243. ## Description There were 2 primary bugs: 1. When not setting up a server to perform sticky-session load balancing for Socket.io, we can't just have a blank server. So I added `NullHttpServer` and `NullWebServer` as dummy placeholders for various listeners assigned in `runner.ts` since the primary thread in this case isn't actually doing anything but load-balancing. 2. The config defaults were all over the place for the NodeClusterWebServerFactory, so I unified those in the constructor. This solved a bug where workers were timing out instantly, because there was no timeout config.
🔗 Found some broken links! 💔 Run a link check locally to find them. See linkcheck output
@clarenceli-msft, what kind of feedback are you looking for on this review since it's marked as Draft?
I am closing this PR. Originally, I intended to merge the updates into a feature branch instead of main. While resolving conflicts, I fetched from the main branch, but it appears I forgot to synchronize the target branch with main. Consequently, the fetch/pull operation brought all the commits targeting the main branch (since the last update time) into this PR. I tried using cherry-pick but was unable to clean it up, so I have opened a new PR #18376 to exclusively include the contents that I intend to commit.
AB#5618