Skip to content

fix(tree): Optionally use UUIDs for non-finalized identifiers in tree attach summaries#27278

Draft
Abe27342 wants to merge 2 commits into
mainfrom
fix/attach-summary-finalized-ids
Draft

fix(tree): Optionally use UUIDs for non-finalized identifiers in tree attach summaries#27278
Abe27342 wants to merge 2 commits into
mainfrom
fix/attach-summary-finalized-ids

Conversation

@Abe27342
Copy link
Copy Markdown
Contributor

@Abe27342 Abe27342 commented May 11, 2026

When a SharedTree is attached to an already-attached container, tree identifier values generated locally before the IdCompressor has been allocated a cluster can sometimes be serialized as negative op-space IDs. This is a bug which will be fixed in a separate PR (near-term, by using the long id in these cases similarly to what we do in some other codec paths).

When other clients attempt to load SharedTrees with such summaries, they currently crash as the decode path attempts to use the local client's session ID as the originating ID, which is incorrect. In the general case, the information about which client generated the ID has been lost at this point.

This PR adds a feature flag to configuredSharedTree which can potentially be enabled (with some risk, see notes) to allow documents that have reached this state to continue to operate anyway. It does so by generating a v5 uuid based on the SharedObject ID + op-space unfinalized ID. This provides similar collision-resistant properties as the original intent of the ID provided unfinalized ids are only currently written at attach summary generation time (the known bug). Depending on how the application used the ID, this may break stability guarantees (e.g. if application assumed the id would remain stable and wrote it to an external store or a different data store within Fluid, but did so using the previous long form). Thus this flag should only be enabled with those risks in mind.

…summaries

When a SharedTree is attached to an already-attached container, tree
identifier values generated locally before the IdCompressor has been
allocated a cluster were serialized as negative op-space IDs. Other
clients could decode the attach op because it carries originatorSessionId,
but once the summarizer reuses the blob as a handle in a later summary
with no intervening ops to finalize, the persisted IdCompressor state
(serialized with withSession=false) has no record of those local IDs and
new clients fail to load.

Thread an `idsMustBeFinalized` flag from ForestSummarizer (set when
incrementalSummaryContext === undefined) through FieldBatchEncodingContext
into the schema-based encoder. When the flag is set and an identifier
would normalize to a negative op-space ID, emit the stable UUID instead.
The identifier-field decode path already accepts either a number or a
string, so no format-version bump is required.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 11, 2026

Hi! Thank you for opening this PR. Want me to review it?

Based on the diff (323 lines, 20 files), I've queued these reviewers:

  • Correctness — logic errors, race conditions, lifecycle issues
  • Security — vulnerabilities, secret exposure, injection
  • API Compatibility — breaking changes, release tags, type design
  • Performance — algorithmic regressions, memory leaks
  • Testing — coverage gaps, hollow tests

How this works

  • Adjust the reviewer set by ticking/unticking boxes above. Reviewer toggles alone don't trigger anything.

  • Tick Start review below to dispatch the review fleet.

  • After review finishes, tick Start review again to request another run — it auto-resets after each dispatch.

  • This comment updates as new commits land; your reviewer selections are preserved.

  • Start review

@Abe27342 Abe27342 changed the title fix(tree): emit stable UUIDs for non-finalized identifiers in attach … fix(tree): Optionally use UUIDs for non-finalized identifiers in tree attach summaries May 12, 2026
@Abe27342 Abe27342 marked this pull request as ready for review May 12, 2026 01:28
@Abe27342 Abe27342 requested a review from a team as a code owner May 12, 2026 01:28
Copilot AI review requested due to automatic review settings May 12, 2026 01:28
@Abe27342
Copy link
Copy Markdown
Contributor Author

Abe27342 commented May 12, 2026

I haven't fully reviewed this on my end yet, but if someone wants to take a stab before I have feel free. I'm at the point I want CI checks to start running so I can get it in a mergeable (pending approval) state. Oops, wrong PR. Thought this was #27281

@Abe27342 Abe27342 marked this pull request as draft May 12, 2026 01:29
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces an idsMustBeFinalized encoding knob that forces attach/first-summary paths to serialize non-finalized (negative) compressed IDs as stable UUID strings, preventing load-time failures when session-local IdCompressor state is unavailable.

Changes:

  • Add idsMustBeFinalized to the change/summary encoding contexts and thread it through edit manager, forest summary, and chunked-forest encoding pipelines.
  • Update RevisionTagCodec and identifier-field encoding to emit stable UUIDs instead of negative op-space numbers when idsMustBeFinalized is enabled.
  • Update and add tests, including a regression repro covering identifier encoding under attach-summary conditions.

Reviewed changes

Copilot reviewed 20 out of 20 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
packages/dds/tree/src/test/rebase/revisionTagCodec.spec.ts Update tests for new RevisionTagCodec.encode(tag, context) signature.
packages/dds/tree/src/test/feature-libraries/sequence-field/sequenceFieldCodecs.test.ts Pass encoding context through revision tag encoding in sequence-field codec tests.
packages/dds/tree/src/test/feature-libraries/chunked-forest/codec/nodeEncoder.spec.ts Add regression coverage for identifier encoding with idsMustBeFinalized.
packages/dds/tree/src/shared-tree/treeCheckout.ts Provide encoding context when encoding revision tags for consistency comparisons.
packages/dds/tree/src/shared-tree-core/editManagerSummarizer.ts Enable idsMustBeFinalized when incrementalSummaryContext is absent.
packages/dds/tree/src/shared-tree-core/editManagerCodecsVSharedBranches.ts Thread encoding context into shared-branches edit manager summary encoding.
packages/dds/tree/src/shared-tree-core/editManagerCodecsV1toV4.ts Extend summary encoding context type with idsMustBeFinalized.
packages/dds/tree/src/shared-tree-core/editManagerCodecsCommons.ts Propagate idsMustBeFinalized into revision tag encoding throughout shared-branch summaries.
packages/dds/tree/src/shared-tree-core/editManagerCodecs.ts Extend summary encoding context type with idsMustBeFinalized.
packages/dds/tree/src/feature-libraries/modular-schema/modularChangeCodecV1.ts Propagate idsMustBeFinalized into detached-node encoding contexts.
packages/dds/tree/src/feature-libraries/forest-summary/forestSummarizer.ts Set idsMustBeFinalized for forest encoding when incrementalSummaryContext is absent.
packages/dds/tree/src/feature-libraries/chunked-forest/codec/schemaBasedEncode.ts Add idsMustBeFinalized plumbed parameter into schema-based encoding entrypoints.
packages/dds/tree/src/feature-libraries/chunked-forest/codec/nodeEncoder.ts Emit stable UUID for identifier fields when idsMustBeFinalized and op-space would be negative.
packages/dds/tree/src/feature-libraries/chunked-forest/codec/compressedEncode.ts Add idsMustBeFinalized to EncoderContext.
packages/dds/tree/src/feature-libraries/chunked-forest/codec/codecs.ts Add idsMustBeFinalized to FieldBatchEncodingContext and pass to encoders.
packages/dds/tree/src/core/tree/detachedFieldIndexCodecV2.ts Update revision tag encode/decode to handle stable UUIDs explicitly and refine type checks.
packages/dds/tree/src/core/tree/detachedFieldIndexCodecV1.ts Update revision tag encode/decode callsites for new signature and refine type checks.
packages/dds/tree/src/core/rebase/types.ts Broaden EncodedRevisionTag/schema to include string stable-UUID fallback and document behavior.
packages/dds/tree/src/core/rebase/revisionTagCodec.ts Add stable-UUID fallback encode/decode behavior guarded by idsMustBeFinalized.
packages/dds/tree/src/core/change-family/changeFamily.ts Document and add idsMustBeFinalized to ChangeEncodingContext.

Comment on lines +54 to 65
// vSharedBranches stores the summary's originator at the top level,
// and the decode path threads that originator down to each child codec.
// Stable-UUID fallbacks are therefore unnecessary here even when this is
// an attach summary, so suppress `idsMustBeFinalized` for the inner codec
// calls regardless of what the caller passed.
const innerContext = { ...context, idsMustBeFinalized: false };
const mainBranch = encodeSharedBranch(
changeCodec,
revisionTagCodec,
data.main,
context,
innerContext,
data.originator,
Comment on lines +50 to 54
export type EncodedRevisionTag = Brand<OpSpaceCompressedId, "EncodedRevisionTag"> | string;
export const RevisionTagSchema = Type.Union([
Type.Literal("root"),
brandedNumberType<Exclude<EncodedRevisionTag, string>>(),
Type.String(),
brandedNumberType<Brand<OpSpaceCompressedId, "EncodedRevisionTag">>(),
]);
Comment on lines +167 to +171
// An undefined `incrementalSummaryContext` indicates this is an attach summary (or a GC
// pass). Attach summary blobs can be reused as handles in later summaries with no
// intervening ops to finalize local IDs, after which the originating session's local
// ID state is no longer available to readers — so any non-finalized identifier must
// be encoded as its stable UUID instead.
Comment on lines +106 to +110
// An undefined `incrementalSummaryContext` indicates this is an attach summary (or a GC
// pass). Attach summary blobs can be reused as handles in later summaries with no
// intervening ops to finalize local IDs, after which the originating session's local
// ID state is no longer available to readers — so any compressed ID we persist must be
// in a form (stable UUID) that does not depend on session state.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants