fix(tree): support attach/detach edits on multi-node chunks#27153
Conversation
|
Hey! You look nice today! Want me to review this PR? Based on the diff (274 lines, 4 files), I've queued these reviewers:
Toggle checkboxes to adjust, then reply yes to start — or ask me anything! |
yes |
There was a problem hiding this comment.
Pull request overview
Fixes ChunkedForest attach/detach behavior when edits target a node index that falls inside a multi-node chunk (e.g. UniformChunk) by ensuring edits operate on chunk boundaries.
Changes:
- Added
splitFieldAtIndexhelper to split/re-chunk the chunk containing a given node index so edits can splice on chunk boundaries. - Updated
ChunkedForestattach/detach edit paths to usesplitFieldAtIndexbefore splicing the field’schunksarray. - Added targeted unit tests covering attach/detach in the middle of a multi-node uniform chunk and the new helper’s boundary cases.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| packages/dds/tree/src/feature-libraries/chunked-forest/chunkedForest.ts | Use splitFieldAtIndex to ensure splice indices align to chunk boundaries for attach/detach. |
| packages/dds/tree/src/feature-libraries/chunked-forest/chunkTree.ts | Introduce splitFieldAtIndex to split/re-chunk a field’s chunk list at a node index. |
| packages/dds/tree/src/test/feature-libraries/chunked-forest/chunkedForest.spec.ts | Add regression tests for attach/detach targeting the middle of a uniform chunk. |
| packages/dds/tree/src/test/feature-libraries/chunked-forest/chunkTree.spec.ts | Add unit tests for splitFieldAtIndex across boundary and empty-field scenarios. |
…ee.ts Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
| cursor.firstNode(); | ||
| const before = chunkRange(cursor, policy, remaining, false); | ||
| const after = chunkRange(cursor, policy, total - remaining, true); | ||
| // TODO: this could fail for really long chunks being split (due to argument count limits). |
There was a problem hiding this comment.
Should we address this now?
There was a problem hiding this comment.
currently, and like for all time, we will avoid making chunks long enough this is would be an issue since such cases would be really inefficient.
This really is just a case that the splice API is badly design (given the implementation limits in real JS implementations). Might be time to write our own splice if we don't have one already and use that. Annoyingly I think it would be slower than the built in one for the cases we care about.
For now I'm fine having this as is: if it crashes at runtime, we have a bug making massively long chunks on accident and perf is goanna suck.
Fleet Review — CleanNo issues found across the reviewer fleet for this run. |
Co-authored-by: Craig Macomber (Microsoft) <42876482+CraigMacomber@users.noreply.github.com>
|
There are some serious memory regressions with this change and should not be merged until the cause is found and addressed. It looks like it's a consequence of having multi-node chunks now and the general cost of editing them. Specifically with sequences of edits and how deltaVisitor deals with them. Changes to the loop in splitFieldAtIndex has accounted for most regression, but follow up work could help close the gap with ObjectForest.
|
| * {@link splitFieldAtIndex} bisects a chunk at its midpoint and | ||
| * descends into the half holding the target rather than splitting directly at the requested | ||
| * index. Bisecting shrinks the cost of repeated splits within a large chunk. A sequence of N | ||
| * splits inside an N-sized chunk does O(N log N) total work, at the price of producing a few | ||
| * extra intermediate chunks during descent. Smaller chunks are split exactly so we don't pay | ||
| * the bisection overhead when the linear walk is already cheap. |
There was a problem hiding this comment.
This description never actually mentions what this constant does. Starting with that, then putting the explanation of why in a @remarks block would be better.
Also its unclear if this threshold is about the top level length, or the total node count.
I suspect we will also want a similar threshold for the work on growing existing chunks instead of creating adjacent tiny ones.
I think adding another property to ChunkPolicy might make more sense here.
maybe something like uniformChunkNodeCountDynamicTargetMax: "When a uniformChunk is being edited, attempt to leave chunks with a total node count around this or lower. This means splitting chunks larger than this evenly (for example to avoid NM behavior in cases like splitFieldAtIndex when making N small edits one at a time to an M sized chunk), and merging/extending chunks when doing so does not exceed this (to avoid many tiny chunks while avoiding MN costs related to editing large chunks many times)."
Maybe not something about how uniformChunkNodeCount has a similar role, but for the first time something is chunked, where we can heuristically pick a larger size only risking a one time O(size) cost to split rather than N*size like we have with uniformChunkNodeCountDynamicTargetMax.
There was a problem hiding this comment.
I'll go ahead and update the ChunkPolicy with these changes. I imagine this is going to require some ChunkPolicy changes in other test files with the new property being added. Do we want to make this new property optional? I don't know if we have any consumer reaching into our internals for a custom ChunkPolicy that this change would break.
…Forest.ts Co-authored-by: Craig Macomber (Microsoft) <42876482+CraigMacomber@users.noreply.github.com>
Co-authored-by: Craig Macomber (Microsoft) <42876482+CraigMacomber@users.noreply.github.com>
Summary
Currently there are no multi-node chunks in chunked forest. This PR is part of the effort to support them before enabling the chunker to batch nodes into multi top-level node chunks, specifically
UniformChunks. This PR fixes howdetachEditandattachEdithandle thechunksarray atnodeIndexwhen in the middle of a multi-node chunk.What surfaces the bug
The bug that surfaced when enabling multi-node chunks is in
attachEditanddetachEditinchunkedForest.ts. For both methods they requirednodeIndexto fall on a chunk boundary in the fieldschunksarray. Once multi-node chunks were enabled this allowed node indexing into the middle of a chunk(not allowed). Example:splitFieldAtIndex and how it resolves the bug
A new helper in
chunkTreetakes a field'schunksarray and splits the chunk containingnodeIndexso thatnodeIndexlands on a chunk boundary, returning the index of the chunk that now begins atnodeIndexfor attach/detach to use. IfnodeIndexalready falls on an existing chunk boundary (including a single-node chunk),chunksis not mutated.nodeIndexmust satisfy0 <= nodeIndex <= totalNodes, wheretotalNodesis the sum oftopLevelLengthacross all chunks.nodeIndex === totalNodesis supported because it is the valid and necessary splice point for attach/detach at the end of the field. Each half of a split chunk is re-chunked throughchunkRangeusing the provided policy so the resulting chunks follow the same shape rules as the rest of the field.detachEditcalls this helper method twice, once at each end of the detach range, to isolate the nodes to splice out ofchunks.Testing
These tests are written to allow policy changes to happen, but if detach/attach edit logic changes (don't need to be on a node boundary anymore), then these tests will need to change with that logic change
chunksarray should not be mutated. 2. AtnodeIndex === totalNodes-chunksarray should not be mutated. 3. multi-node chunk - split should occur as intended. Final test callssplitFieldAtIndexon an empty field