Skip to content

fix(tree): Add option to heal unresolvable identifier IDs at doc load time#27281

Merged
Abe27342 merged 18 commits into
microsoft:mainfrom
Abe27342:heal-attach-summary-problematic-docs
May 12, 2026
Merged

fix(tree): Add option to heal unresolvable identifier IDs at doc load time#27281
Abe27342 merged 18 commits into
microsoft:mainfrom
Abe27342:heal-attach-summary-problematic-docs

Conversation

@Abe27342
Copy link
Copy Markdown
Contributor

@Abe27342 Abe27342 commented May 11, 2026

Description

When a SharedTree is attached to an already-attached container, tree identifier values generated locally before the IdCompressor has been allocated a cluster can sometimes be serialized as negative op-space IDs. This is a bug which will be fixed in a separate PR (near-term, by using the long id in these cases similarly to what we do in some other codec paths).

When other clients attempt to load SharedTrees with such summaries, they currently crash as the decode path attempts to use the local client's session ID as the originating ID, which is incorrect. In the general case, the information about which client generated the ID has been lost at this point.

This PR adds a feature flag to configuredSharedTree which can potentially be enabled (with some risk, see notes) to allow documents that have reached this state to continue to operate anyway. It does so by generating a v5 uuid based on the SharedObject ID + op-space unfinalized ID. This provides similar collision-resistant properties as the original intent of the ID provided unfinalized ids are only currently written at attach summary generation time (the known bug). Depending on how the application used the ID, this may break stability guarantees (e.g. if application assumed the id would remain stable and wrote it to an external store or a different data store within Fluid, but did so using the previous long form). Thus this flag should only be enabled with those risks in mind.

Behavior:

  • chunkDecoding.readValue (the SpecialField.Identifier path) now wraps the normalizeToSessionSpace + decompress call in try/catch. On failure, when healUnresolvableIdsOnDecode is enabled and a sharedObjectId is available, it returns a deterministic stable UUID computed as uuidv5("${sharedObjectId}|${opSpaceId}", healingNamespace). All clients loading the same blob agree on the resulting value, and the existing chunked-forest encode path (also from d43d50d) writes the string back out at the next summary so the healing does not need to be re-applied on subsequent loads.

Plumbing:

  • FieldBatchEncodingContext and IdDecodingContext gain healUnresolvableIdsOnDecode?: boolean and sharedObjectId?: string.
  • ChangeEncodingContext and EditManagerEncodingContext carry the same fields so the flag reaches chunked-forest decode for builds/refreshers inside trunk-commit changes (via modularChangeCodecV1.decodeDetachedNodes) in addition to ForestSummarizer.load.
  • EditManagerSummarizer accepts the flag and shared-object id as constructor args.
  • SharedTreeOptionsBeta exposes healUnresolvableIdsOnDecode?: boolean (default false) via configuredSharedTreeBetaLegacy. Off by default so it does not mask genuine decode failures in documents that aren't actually corrupt.

The RevisionTagCodec and DetachedFieldIndex v1/v2 codecs are left unchanged: v2 already emits long IDs when summarizing in detached state, so those codecs do not produce unresolvable persisted IDs to heal.

Adds an end-to-end test that reproduces the SharedTree-attaches-to-attached- container scenario and verifies healed loading succeeds when the option is enabled.

Breaking Changes

N/A -- all behavioral changes are opt-in

…t decode

A SharedTree that attaches to an already-attached container could persist
identifier-field values as negative op-space integers — IDs the originating
session had locally minted but had not yet had a cluster allocated for. The
attach summary's id-compressor is serialized with `withSession=false`, so
when the attach blob is later reused as a handle in a subsequent summary
(no intervening ops to finalize the session-local IDs), readers loading from
the new summary have no way to resolve those negatives and fail to load.

Commit d43d50d fixed the encoder so new attach summaries no longer write
non-final IDs. This commit adds the decoder-side complement so existing
already-broken documents can be recovered.

Behavior:
- `chunkDecoding.readValue` (the `SpecialField.Identifier` path) now wraps
  the `normalizeToSessionSpace` + `decompress` call in try/catch. On
  failure, when `healUnresolvableIdsOnDecode` is enabled and a
  `sharedObjectId` is available, it returns a deterministic stable UUID
  computed as `uuidv5("${sharedObjectId}|${opSpaceId}", healingNamespace)`.
  All clients loading the same blob agree on the resulting value, and the
  existing chunked-forest encode path (also from d43d50d) writes the
  string back out at the next summary so the healing does not need to be
  re-applied on subsequent loads.

Plumbing:
- `FieldBatchEncodingContext` and `IdDecodingContext` gain
  `healUnresolvableIdsOnDecode?: boolean` and `sharedObjectId?: string`.
- `ChangeEncodingContext` and `EditManagerEncodingContext` carry the same
  fields so the flag reaches chunked-forest decode for builds/refreshers
  inside trunk-commit changes (via `modularChangeCodecV1.decodeDetachedNodes`)
  in addition to `ForestSummarizer.load`.
- `EditManagerSummarizer` accepts the flag and shared-object id as
  constructor args.
- `SharedTreeOptionsBeta` exposes `healUnresolvableIdsOnDecode?: boolean`
  (default `false`) via `configuredSharedTreeBetaLegacy`. Off by default
  so it does not mask genuine decode failures in documents that aren't
  actually corrupt.

The `RevisionTagCodec` and `DetachedFieldIndex` v1/v2 codecs are left
unchanged: v2 already emits long IDs when summarizing in detached state,
so those codecs do not produce unresolvable persisted IDs to heal.

Adds an end-to-end test that reproduces the SharedTree-attaches-to-attached-
container scenario and verifies healed loading succeeds when the option is
enabled.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 11, 2026

Hi! Thank you for opening this PR. Want me to review it?

Based on the diff (670 lines, 49 files), I've queued these reviewers:

  • Correctness — logic errors, race conditions, lifecycle issues
  • Security — vulnerabilities, secret exposure, injection
  • API Compatibility — breaking changes, release tags, type design
  • Performance — algorithmic regressions, memory leaks
  • Testing — coverage gaps, hollow tests

How this works

  • Adjust the reviewer set by ticking/unticking boxes above. Reviewer toggles alone don't trigger anything.

  • Tick Start review below to dispatch the review fleet.

  • After review finishes, tick Start review again to request another run — it auto-resets after each dispatch.

  • This comment updates as new commits land; your reviewer selections are preserved.

  • Start review

Comment thread packages/dds/tree/src/feature-libraries/chunked-forest/codec/chunkDecoding.ts Outdated
@Abe27342 Abe27342 marked this pull request as ready for review May 12, 2026 01:30
@Abe27342 Abe27342 requested a review from a team as a code owner May 12, 2026 01:30
Copilot AI review requested due to automatic review settings May 12, 2026 01:30
@Abe27342
Copy link
Copy Markdown
Contributor Author

Abe27342 commented May 12, 2026

Haven't finished reviewing this myself yet but it's in a state where I want agentic review + PR checks running on it.

Made a first pass. OK with the state.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds an opt-in recovery path for SharedTree summary decoding when persisted identifier values contain unresolvable (not-yet-finalized) op-space IDs, by “healing” them into deterministic stable UUIDs during summary loads (not op processing).

Changes:

  • Introduces healUnresolvableIdsOnDecode (default false) and plumbs it (plus sharedObjectId) through encoding/decoding contexts so it reaches chunked-forest decoding during summary loads.
  • Adds an isSummary flag to relevant codec contexts to gate healing to summary decode paths only.
  • Adds unit coverage for the heal behavior in chunkDecoding, plus a new end-to-end test exercising the “detached datastore attached via op” scenario.

Reviewed changes

Copilot reviewed 42 out of 42 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
packages/test/test-end-to-end-tests/src/test/tree.spec.ts New E2E scenario to reproduce attach-op summary load behavior with healing enabled.
packages/dds/tree/src/test/testChange.spec.ts Updates dummy codec context to include isSummary.
packages/dds/tree/src/test/shared-tree/sharedTreeChangeCodec.spec.ts Updates decode contexts to include isSummary.
packages/dds/tree/src/test/shared-tree-core/messageCodec.spec.ts Updates dummy context to include isSummary.
packages/dds/tree/src/test/shared-tree-core/edit-manager/editManagerCodecs.test.ts Updates dummy context to include isSummary.
packages/dds/tree/src/test/rebase/revisionTagCodec.spec.ts Updates decode contexts to include isSummary.
packages/dds/tree/src/test/feature-libraries/sequence-field/sequenceFieldSnapshots.test.ts Updates snapshot context to include isSummary.
packages/dds/tree/src/test/feature-libraries/sequence-field/sequenceFieldCodecs.test.ts Updates codec context to include isSummary.
packages/dds/tree/src/test/feature-libraries/optional-field/optionalFieldSnapshots.test.ts Updates snapshot context to include isSummary.
packages/dds/tree/src/test/feature-libraries/optional-field/optionalFieldChangeCodecs.test.ts Updates codec context to include isSummary.
packages/dds/tree/src/test/feature-libraries/modular-schema/modularChangeFamily.spec.ts Updates codec context to include isSummary.
packages/dds/tree/src/test/feature-libraries/modular-schema/genericFieldSnapshots.test.ts Updates snapshot context to include isSummary.
packages/dds/tree/src/test/feature-libraries/modular-schema/genericFieldKind.spec.ts Updates codec context to include isSummary.
packages/dds/tree/src/test/feature-libraries/forest-summary/forestSummarizerCodec.spec.ts Updates encoding context to include isSummary.
packages/dds/tree/src/test/feature-libraries/forest-summary/forestSummarizer.spec.ts Updates encoding context to include isSummary.
packages/dds/tree/src/test/feature-libraries/chunked-forest/codec/uncompressedEncode.spec.ts Updates encoding context to include isSummary.
packages/dds/tree/src/test/feature-libraries/chunked-forest/codec/schemaBasedEncode.spec.ts Updates encoding context to include isSummary.
packages/dds/tree/src/test/feature-libraries/chunked-forest/codec/compressedEncode.spec.ts Updates decode contexts to include isSummary.
packages/dds/tree/src/test/feature-libraries/chunked-forest/codec/codecs.spec.ts Updates encoding contexts to include isSummary.
packages/dds/tree/src/test/feature-libraries/chunked-forest/codec/chunkDecodingGeneric.spec.ts Updates id decoding context to include isSummary.
packages/dds/tree/src/test/feature-libraries/chunked-forest/codec/chunkDecoding.spec.ts Adds unit tests for heal-on-decode behavior; updates id decoding contexts.
packages/dds/tree/src/test/feature-libraries/chunked-forest/codec/checkEncode.ts Updates decode contexts to include isSummary.
packages/dds/tree/src/test/feature-libraries/chunked-forest/chunkEncodingEndToEnd.spec.ts Updates encoding/decoding contexts to include isSummary.
packages/dds/tree/src/shared-tree/treeCheckout.ts Ensures op encode/decode contexts set isSummary: false.
packages/dds/tree/src/shared-tree/treeAlpha.ts Sets isSummary: false for ad-hoc (non-summary) encode paths.
packages/dds/tree/src/shared-tree/independentView.ts Sets isSummary: false for synthetic decode of inline content.
packages/dds/tree/src/shared-tree/sharedTreeChangeCodecs.ts Plumbs isSummary into downstream modular change codec usage.
packages/dds/tree/src/shared-tree/sharedTree.ts Adds option to SharedTreeOptionsBeta, defaults it off, and passes summary context flags to summarizer paths.
packages/dds/tree/src/shared-tree-core/sharedTreeCore.ts Adds internal option and wires it into EditManagerSummarizer construction.
packages/dds/tree/src/shared-tree-core/messageCodecVSharedBranches.ts Ensures op message encode/decode contexts set isSummary: false.
packages/dds/tree/src/shared-tree-core/messageCodecV1ToV4.ts Ensures op message encode/decode contexts set isSummary: false.
packages/dds/tree/src/shared-tree-core/editManagerSummarizer.ts Carries heal/sharedObjectId into edit manager summary encode/decode contexts.
packages/dds/tree/src/shared-tree-core/editManagerCodecsVSharedBranches.ts Extends edit manager codec context with isSummary + heal inputs.
packages/dds/tree/src/shared-tree-core/editManagerCodecsV1toV4.ts Extends edit manager codec context with isSummary + heal inputs.
packages/dds/tree/src/shared-tree-core/editManagerCodecsCommons.ts Threads isSummary and heal inputs through branch/commit encode/decode helpers.
packages/dds/tree/src/shared-tree-core/editManagerCodecs.ts Extends edit manager codec context with isSummary + heal inputs.
packages/dds/tree/src/feature-libraries/modular-schema/modularChangeCodecV1.ts Plumbs isSummary/heal/sharedObjectId into detached-node decode path.
packages/dds/tree/src/feature-libraries/chunked-forest/codec/codecs.ts Adds isSummary + heal/sharedObjectId to FieldBatchEncodingContext and passes them to decoding.
packages/dds/tree/src/feature-libraries/chunked-forest/codec/chunkDecoding.ts Implements heal-on-summary-decode for negative op-space identifier values using uuidv5.
packages/dds/tree/src/core/tree/detachedFieldIndexCodecV2.ts Marks summarizer-only codec contexts with isSummary: true.
packages/dds/tree/src/core/tree/detachedFieldIndexCodecV1.ts Marks summarizer-only codec contexts with isSummary: true.
packages/dds/tree/src/core/change-family/changeFamily.ts Adds isSummary + heal/sharedObjectId to the shared ChangeEncodingContext.

Comment thread packages/test/test-end-to-end-tests/src/test/tree.spec.ts
Comment thread packages/test/test-end-to-end-tests/src/test/tree.spec.ts Outdated
Comment thread packages/dds/tree/src/shared-tree/sharedTree.ts Outdated
@Abe27342 Abe27342 requested a review from a team as a code owner May 12, 2026 02:51
Abram Sanderson (He/Him) added 3 commits May 11, 2026 22:04
@Abe27342 Abe27342 requested a review from a team as a code owner May 12, 2026 16:49
/**
* If `true`, identifier values that the local id-compressor cannot resolve
* during decode are healed into deterministic stable UUIDs instead of
* throwing. See `FieldBatchEncodingContext.healUnresolvableIdsOnDecode`.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Can we use a {@link} here? Could also put this note in a @see block.

*/
originatorId: SessionId;
/**
* See {@link FieldBatchEncodingContext.isSummary}.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: {@inheritDoc} instead of See {@link} for these?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does inheritdoc work on some version of VSCode I'm not on or do I have the wrong syntax in screenshot below? If so, sure, sounds good. If not, I think leaving it as links is better with the dev tooling we actually have. Links at least render inline and are clickable, whereas inheritdocs do this:

image

I'd get it if this were exposed publicly (I'm sure API reports do a better job at doing reasonable things with that tag), but this is an internal type.

(still thanks for making me check that the link resolved; it did not previously but will with next update).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VSCode intellisense doesn't support TSDoc inheritDoc syntax, unfortunately. So it ends up being a glorified link in that context. But the generated docs support it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, not taking this suggestion then since the generated docs shouldn't include this interface.

Comment thread packages/dds/tree/src/feature-libraries/chunked-forest/codec/chunkDecoding.ts Outdated
@CraigMacomber CraigMacomber added the release-blocking Must be addressed before we cut and publish the next release label May 12, 2026
Comment thread .changeset/petite-pillows-wave.md Outdated
Comment thread .changeset/petite-pillows-wave.md Outdated
* would mask genuine bugs that otherwise surface as decode failures.
*
* This mitigation is also not perfect as the client that originated the non-finalized
* short ID will not apply the same "healing" and will continue to use the original ID.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens in that case? How important is it for the originating client to no longer be joined?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typical decoherence. Nothing meaningful on the Fluid side, so it's pretty much whatever the implications are to the calling application.

Comment thread .changeset/petite-pillows-wave.md Outdated
Comment thread .changeset/petite-pillows-wave.md Outdated
Comment thread packages/dds/tree/src/shared-tree/sharedTree.ts Outdated
Comment thread .changeset/petite-pillows-wave.md Outdated
Comment thread .changeset/petite-pillows-wave.md Outdated
Comment thread packages/dds/tree/src/feature-libraries/chunked-forest/codec/chunkDecoding.ts Outdated
return idCompressor.decompress(
idCompressor.normalizeToSessionSpace(
streamValue as OpSpaceCompressedId,
idDecodingContext.originatorId,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
idDecodingContext.originatorId,
// TODO: due to a bug (See SharedTreeOptionsBeta.healUnresolvableIdentifiersOnDecode),
// this this is actually the wrong identifier for summaries (current session not the one that encoded it):
// the "if" above mitigates that case.
idDecodingContext.originatorId,

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't really the right place to put that comment (or at least not the single right place). One spot is in this bit of ForestSummarizer:

image

but there might be some other places that need similar changes (EditManager summary skirts by right now for other reasons but I'd expect some of the plumbing there is wrong too).

Not going to include as part of this PR.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are multiple places where we need to do follow-up work, and this is one of them since this code will have to change to tolerate not having a session id here.

Personally I think having a TODO here makes since this code can be wrong if it runs (that's why we need the if above) and will have to change in the future.

I think having more TODO's and comments about places where code has know to be wrong behaviors and thus doesn't perform as documented is better than having too few.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from what I can tell we don't actually necessarily need to touch this code or context object when we fix up our write format to include originator's session ID where we need it (I think at each summary blob suffices). We just need to supply the correct value at the call site.

I recognize it's probably dependent on exactly what your plans are in the area though.

Comment thread packages/dds/tree/src/feature-libraries/chunked-forest/codec/chunkDecoding.ts Outdated
Comment thread packages/dds/tree/src/shared-tree/sharedTree.ts Outdated
Comment thread packages/dds/tree/src/shared-tree/sharedTree.ts
Comment thread packages/dds/tree/src/shared-tree/sharedTree.ts
Comment thread packages/dds/tree/src/feature-libraries/chunked-forest/codec/chunkDecoding.ts Outdated
@github-actions
Copy link
Copy Markdown
Contributor

🔗 No broken links found! ✅

Your attention to detail is admirable.

linkcheck output


> fluid-framework-docs-site@0.0.0 ci:check-links /home/runner/work/FluidFramework/FluidFramework/docs
> start-server-and-test "npm run serve -- --no-open" 3000 check-links

1: starting server using command "npm run serve -- --no-open"
and when url "[ 'http://127.0.0.1:3000' ]" is responding with HTTP status code 200
running tests using command "npm run check-links"


> fluid-framework-docs-site@0.0.0 serve
> docusaurus serve --no-open

[SUCCESS] Serving "build" directory at: http://localhost:3000/

> fluid-framework-docs-site@0.0.0 check-links
> linkcheck http://localhost:3000 --skip-file skipped-urls.txt

Crawling...

Stats:
  288641 links
    1922 destination URLs
    2172 URLs ignored
       0 warnings
       0 errors


@Abe27342 Abe27342 enabled auto-merge (squash) May 12, 2026 21:29
@Abe27342 Abe27342 merged commit d9205dd into microsoft:main May 12, 2026
36 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

release-blocking Must be addressed before we cut and publish the next release

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants