feat(container-loader): add captureFullContainerState free function#27100
feat(container-loader): add captureFullContainerState free function#27100anthony-murphy wants to merge 10 commits intomicrosoft:mainfrom
Conversation
Adds a driver-only free function that captures a container's current state in the IPendingContainerState wire format using only an IDocumentServiceFactory and IUrlResolver. Unlike Container.getPendingLocalState(), no runtime or codeLoader is instantiated: the function fetches the latest snapshot, reads the authoritative sequence number from the snapshot's attributes blob, drains ops from delta storage from that sequence number, and serializes the result. pendingRuntimeState is undefined, so the output is intended for state relay, inspection, and durable-state snapshot use cases rather than rehydrating in-flight DDS changes. The output can be fed back into loadExistingContainer or loadFrozenContainerFromPendingState as pendingLocalState. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…tainerPendingState Extends captureContainerPendingState so the driver-level state is fully self-contained for blob reads as well. Attachment blob bytes are fetched and added to snapshotBlobs keyed by storage ID, which ContainerStorageAdapter already serves through its cache — no wire-format change required. GC state is consulted when present: blobs GC has explicitly marked unreferencedTimestampMs, tombstoned, or deleted are skipped. Blobs absent from the GC graph are kept, since GC lag can leave recently-attached blobs off the graph and dropping them would lose live data. When the snapshot has no GC tree (GC disabled or pre-GC document), every attachment blob from the BlobManager redirect table is included. The relevant blob manager / GC constants and the minimal parsing logic are duplicated locally to avoid a loader → runtime dependency; comments point back to the canonical definitions in container-runtime and runtime-definitions. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ware
Extends the driver-only capture to cover the whole referenced graph of the
container, not just attachment blobs.
- Honours ISnapshotTree.unreferenced: a shared tree walker skips any subtree
flagged unreferenced by the summarizer (which sets the flag from GC state)
and inlines contents of every other blob it reaches. Replaces the
unfiltered getBlobContentsFromTree path.
- Pre-fetches loading-group snapshots: enumerates groupIds on the base
snapshot (skipping unreferenced subtrees), fetches each via
IDocumentStorageService.getSnapshot({ versionId, loadingGroupIds }), runs
the fetched snapshot through the same tree walker, and serialises the
result into IPendingContainerState.loadedGroupIdSnapshots. If the driver
lacks getSnapshot support or no groupIds are declared, no groups are
included.
- GC parsing is done once and shared between tree-level and attachment-blob
filtering. captureReferencedAttachmentBlobs now takes pre-parsed GC data.
Renames captureAttachmentBlobs.ts to captureReferencedContents.ts to reflect
the broader scope.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Covers the reachability filtering and groupId-fetch paths that the local end-to-end tests can't easily exercise (no summarizer runs in the local test server, and TestFluidObjectFactory doesn't produce loading-group datastores out of the box). Tests construct ISnapshotTree fixtures directly and back readBlob/getSnapshot with an in-memory shim. 15 cases across readReferencedSnapshotBlobs, parseGcSnapshotData, captureReferencedAttachmentBlobs, and captureGroupIdSnapshots — covering unreferenced subtree skip, root .blobs special-casing, ISnapshot vs ISnapshotTree input, GC lag tolerance (blobs absent from gcNodes are kept), tombstone/deletedNodes skip, and groupId enumeration/dedup/fetch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ptureFullContainerState The function captures the whole referenced graph of the container (snapshot, loading-group snapshots, inlined structural blobs, inlined attachment blobs, trailing ops) — not just "pending state." The new name matches the scope. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Adds a new @legacy @alpha free function captureFullContainerState to @fluidframework/container-loader to capture an attached container’s referenced state using only driver-level services, producing a serialized IPendingContainerState suitable for frozen/offline reload scenarios.
Changes:
- Export
captureFullContainerState(+ props interface) fromcontainer-loader. - Implement driver-only capture: fetch latest snapshot, inline referenced snapshot blobs + attachment blobs (GC-filtered), prefetch loading-group snapshots, and fetch post-snapshot ops.
- Add unit tests for referenced-content capture and E2E local-server round-trip tests.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| packages/test/local-server-tests/src/test/captureFullContainerState.spec.ts | Adds E2E tests validating capture + frozen rehydration, ops replay, handle/blob references, and attachment inlining. |
| packages/loader/container-loader/src/test/captureReferencedContents.spec.ts | Adds unit tests for GC parsing/filtering, attachment selection, snapshot walking, and loading-group snapshot capture. |
| packages/loader/container-loader/src/index.ts | Exports the new captureFullContainerState API and its props type. |
| packages/loader/container-loader/src/createAndLoadContainerUtils.ts | Implements captureFullContainerState and defines ICaptureFullContainerStateProps. |
| packages/loader/container-loader/src/captureReferencedContents.ts | New helper module for snapshot walking, GC parsing, attachment capture, and group snapshot prefetch. |
| packages/loader/container-loader/api-report/container-loader.legacy.alpha.api.md | Updates API report for the new exported function and interface. |
| const documentService = await documentServiceFactory.createDocumentService( | ||
| resolvedUrl, | ||
| logger, | ||
| ); | ||
| const storage = await documentService.connectToStorage(); | ||
|
|
||
| const versions = await storage.getVersions( | ||
| // `null` signals "latest" | ||
| // eslint-disable-next-line unicorn/no-null | ||
| null, | ||
| 1, | ||
| "captureFullContainerState", | ||
| FetchSource.noCache, | ||
| ); | ||
| const version = versions[0]; | ||
| const snapshot: ISnapshot | ISnapshotTree | undefined = | ||
| storage.getSnapshot === undefined | ||
| ? ((await storage.getSnapshotTree(version)) ?? undefined) | ||
| : await storage.getSnapshot({ | ||
| versionId: version?.id, | ||
| scenarioName: "captureFullContainerState", | ||
| }); | ||
| if (snapshot === undefined) { | ||
| throw new GenericError("Failed to fetch snapshot for captureFullContainerState"); | ||
| } | ||
|
|
||
| const baseSnapshot = getSnapshotTree(snapshot); | ||
| const attributes = await getDocumentAttributes(storage, baseSnapshot); | ||
| const gcData = await parseGcSnapshotData(baseSnapshot, storage); | ||
| const [structuralBlobs, attachmentBlobs, loadedGroupIdSnapshots] = await Promise.all([ | ||
| readReferencedSnapshotBlobs(snapshot, storage), | ||
| captureReferencedAttachmentBlobs(baseSnapshot, storage, gcData), | ||
| captureGroupIdSnapshots(baseSnapshot, storage, version?.id, "captureFullContainerState"), | ||
| ]); | ||
| const snapshotBlobs = { ...structuralBlobs, ...attachmentBlobs }; | ||
|
|
||
| const deltaStorage = await documentService.connectToDeltaStorage(); | ||
| const opsStream = deltaStorage.fetchMessages( | ||
| attributes.sequenceNumber + 1, | ||
| undefined, | ||
| undefined, | ||
| false, | ||
| "captureFullContainerState", | ||
| ); | ||
| const savedOps: ISequencedDocumentMessage[] = []; | ||
| let opsResult = await opsStream.read(); | ||
| while (!opsResult.done) { | ||
| savedOps.push(...opsResult.value); | ||
| opsResult = await opsStream.read(); | ||
| } | ||
|
|
||
| const pendingState: IPendingContainerState = { | ||
| attached: true, | ||
| baseSnapshot, | ||
| snapshotBlobs, | ||
| loadedGroupIdSnapshots: | ||
| Object.keys(loadedGroupIdSnapshots).length > 0 ? loadedGroupIdSnapshots : undefined, | ||
| pendingRuntimeState: undefined, | ||
| savedOps, | ||
| url: resolvedUrl.url, | ||
| }; | ||
| return JSON.stringify(pendingState); |
There was a problem hiding this comment.
captureFullContainerState creates an IDocumentService but never calls documentService.dispose(). IDocumentService.dispose() is part of the driver contract and is expected to be called when the consumer is done; otherwise drivers may leak sockets/caches. Consider wrapping the capture logic in a try/finally and disposing the document service in the finally block (optionally passing through the caught error).
| async function walkForBlobs( | ||
| tree: ISnapshotTree, | ||
| isRoot: boolean, | ||
| blobs: ISerializableBlobContents, | ||
| read: BlobReader, | ||
| ): Promise<void> { | ||
| if (tree.unreferenced === true) { | ||
| return; | ||
| } | ||
| const promises: Promise<unknown>[] = []; | ||
| for (const blobId of Object.values(tree.blobs)) { | ||
| promises.push(readAndStore(blobId, blobs, read)); | ||
| } | ||
| for (const [key, subTree] of Object.entries(tree.trees)) { | ||
| if (isRoot && key === blobsTreeName) { | ||
| // Attachment blob contents are captured separately; only inline | ||
| // the redirect table so BlobManager can rehydrate identity | ||
| // mappings. | ||
| const tableBlobId = subTree.blobs[redirectTableBlobName]; | ||
| if (tableBlobId !== undefined) { | ||
| promises.push(readAndStore(tableBlobId, blobs, read)); | ||
| } | ||
| } else { | ||
| promises.push(walkForBlobs(subTree, false, blobs, read)); | ||
| } | ||
| } | ||
| await Promise.all(promises); | ||
| } |
There was a problem hiding this comment.
readReferencedSnapshotBlobs currently schedules all blob reads (and subtree walks) into a single Promise.all per node. For large snapshots this can create unbounded parallel readBlob calls, which can overwhelm the driver/service or spike memory. It would be safer to limit concurrency (e.g., batch reads, or walk depth-first and read blobs sequentially like getBlobContentsFromTreeCore does) while still respecting the unreferenced filter.
| const storageIdsToFetch = new Set<string>(); | ||
| for (const [localId, storageId] of localIdToStorageId) { | ||
| if (unreferencedLocalIds?.has(localId) !== true) { | ||
| storageIdsToFetch.add(storageId); | ||
| } | ||
| } | ||
|
|
||
| const contents: ISerializableBlobContents = {}; | ||
| await Promise.all( | ||
| [...storageIdsToFetch].map(async (storageId) => { | ||
| const buffer = await storage.readBlob(storageId); | ||
| contents[storageId] = bufferToString(buffer, "utf8"); | ||
| }), | ||
| ); | ||
| return contents; |
There was a problem hiding this comment.
captureReferencedAttachmentBlobs fetches all referenced attachment blobs with a single Promise.all([...storageIdsToFetch]). If a document has many attachments, this can trigger a very large number of concurrent readBlob requests and increase the risk of throttling/OOM. Consider adding a concurrency limit (or chunking) when reading attachment blobs.
| function collectUnreferencedBlobLocalIds(gcData: IGcSnapshotData): Set<string> | undefined { | ||
| if (gcData.gcState === undefined) { | ||
| return undefined; | ||
| } | ||
| const blobPathPrefix = `/${blobManagerBasePath}/`; | ||
| const unreferenced = new Set<string>(); | ||
| for (const [nodePath, nodeData] of Object.entries(gcData.gcState.gcNodes)) { | ||
| if ( | ||
| nodePath.startsWith(blobPathPrefix) && | ||
| nodeData.unreferencedTimestampMs !== undefined | ||
| ) { | ||
| unreferenced.add(nodePath.slice(blobPathPrefix.length)); | ||
| } | ||
| } | ||
| for (const nodePath of [...(gcData.tombstones ?? []), ...(gcData.deletedNodes ?? [])]) { | ||
| if (nodePath.startsWith(blobPathPrefix)) { | ||
| unreferenced.add(nodePath.slice(blobPathPrefix.length)); | ||
| } | ||
| } | ||
| return unreferenced; | ||
| } |
There was a problem hiding this comment.
collectUnreferencedBlobLocalIds returns undefined when gcData.gcState is undefined, which means tombstones/deletedNodes are ignored even if parseGcSnapshotData populated them. This contradicts the function/doc comments that tombstoned/deleted blobs are always skipped, and can cause attempts to read blobs that GC has explicitly tombstoned/deleted. Consider always applying tombstones/deletedNodes filtering regardless of whether gcState is present.
| import type { SerializedSnapshotInfo } from "./serializedStateManager.js"; | ||
| import { getDocumentAttributes } from "./utils.js"; | ||
|
|
||
| // The following names are defined authoritatively in container-runtime and |
There was a problem hiding this comment.
we probably want to move all this stuff to somewhere else in our common-definitions. it would couple the code a bit more, but these can't be easily changed today give its all data format. also thought about a potential runtime-protocol package.
…e.spec Collapse a multi-line chained .get() call onto a single line to satisfy biome's formatter — CI was failing on `biome check .` in local-server-tests. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…inerState Four issues flagged by Copilot review on PR microsoft#27100: 1. captureFullContainerState created an IDocumentService but never called dispose(). Wrap the capture body in try/finally and dispose in the finally to release driver-held resources (sockets/caches). 2. readReferencedSnapshotBlobs fanned every blob read at every tree level into a single Promise.all, giving unbounded concurrency on large snapshots. Refactor into a collect-then-fetch pipeline: walk the tree synchronously to gather referenced blob ids, then fetch via a new mapWithConcurrency helper capped at 32 in-flight reads. 3. captureReferencedAttachmentBlobs had the same unbounded-parallel issue over all referenced attachment storage ids. Route through the same mapWithConcurrency helper. 4. collectUnreferencedBlobLocalIds returned undefined when gcData.gcState was undefined, silently dropping tombstones/deletedNodes filtering even when those lists were populated. Contradicted the function docs. Now always applies tombstones/deletedNodes regardless of gcState presence, and returns a (possibly empty) Set rather than undefined. Added a unit test covering the gcState-undefined-but-tombstones-present case. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Follow-up to 8a748cc: captureGroupIdSnapshots still fanned every getSnapshot call into a single Promise.all. Route through mapWithConcurrency with a lower limit (4) since each call pulls a whole snapshot tree, not a single blob. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
| versionId: string | undefined, | ||
| scenarioName: string, | ||
| ): Promise<Record<string, SerializedSnapshotInfo>> { | ||
| const getSnapshot = storage.getSnapshot; |
There was a problem hiding this comment.
Deep Review: storage.getSnapshot is extracted into a local variable here, which strips the this binding. When real driver implementations call this (e.g., LocalDocumentStorageService.getSnapshot which references this.id, this.getVersions(...), etc.), this will be undefined in strict mode, causing a TypeError.
This is confirmed by the existing .bind() pattern elsewhere in the codebase — see protocolTreeDocumentStorageService.ts:31 which binds the same method.
Unit tests don't catch this because mockStorage is a plain object with arrow-function-like properties that don't depend on this. Integration tests don't hit this path because test containers lack loading-group subtrees.
Fix: Either bind when extracting:
const getSnapshot = storage.getSnapshot?.bind(storage);Or call the method directly on the object:
await storage.getSnapshot?.({ cacheSnapshot: false, versionId, loadingGroupIds: [groupId] });There was a problem hiding this comment.
Deep Review: Retracting this finding — the code already handles this binding correctly at line 371 with storage.getSnapshot?.bind(storage) and an explanatory comment. The original analysis missed the .bind() call. Apologies for the false positive.
…ip cache Two issues from PR review: 1. Bind getSnapshot when extracting it from the storage service. Real driver implementations reference `this` (e.g., LocalDocumentStorageService.getSnapshot reads this.id), so calling the detached method would TypeError in strict mode. Mirrors the bind pattern in protocolTreeDocumentStorageService.ts:31. Added a class-based unit test stub whose getSnapshot touches `this` — would have caught this. 2. Pass cacheSnapshot: false on every getSnapshot call we make from the capture path. This capture is transient; we don't want to pollute the driver's snapshot cache with it. Covered by a unit test asserting the option is forwarded. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
🔗 No broken links found! ✅ Your attention to detail is admirable. linkcheck output |
Deep ReviewReviewed commit Readiness: 5/10 — 🔨 MAKING PROGRESS
The prior 7/10 summary classified the binary-attachment-blob concern as out-of-scope ("pre-existing pattern"). On re-review that classification was wrong: the new Findings
Questions for Author
Path to ReadyTo move from 5 → 7-8 (ALMOST READY):
To move from 7-8 → 9-10 (READY):
Context for Reviewers
Review history (1 prior review)
|
Description
Adds a new
@legacy @alphafree function,captureFullContainerState, that captures the full referenced state of an attached container using only driver-level services — anIDocumentServiceFactory, anIUrlResolver, and a request — with no runtime, codeLoader, or liveContainerinstance. The output is a stringifiedIPendingContainerStatein the same wire format produced by a live container's pending-state serialization, and can be handed toloadExistingContainerorloadFrozenContainerFromPendingStateaspendingLocalState.The captured state is a self-contained view of the container's referenced graph:
ISnapshotTree.unreferenced(set by the summarizer from GC state) is honoured — dead subtrees are not read.IDocumentStorageService.getSnapshot({ versionId, loadingGroupIds })and walked with the same filter. PopulatesIPendingContainerState.loadedGroupIdSnapshots.unreferencedTimestampMs, tombstoned, or deleted are skipped; blobs absent from the GC graph are kept (GC lag tolerance). If the snapshot has no GC tree, every attachment blob is included..attributesblob).Attachment blob inlining piggybacks on the existing
ContainerStorageAdaptercache: it consultssnapshotBlobsby id before falling through to storage, so a frozen loader can serve the full referenced graph without a live storage service. No wire-format change.pendingRuntimeStateisundefined— no runtime is instantiated — so the output cannot carry DDS-level in-flight changes. Intended for state relay, inspection, and durable-state snapshot use cases.Implementation notes:
captureReferencedContents.tsholds the tree walker, GC parser, attachment-blob capture, and groupId fetcher. A handful of blob-manager / GC constants are duplicated locally with a comment pointing to the canonical definitions incontainer-runtime/runtime-definitions, to avoid a loader → runtime dependency.ISnapshotTreefixtures + mocked storage cover the GC filtering and groupId paths; 4 end-to-end scenarios againstLocalDeltaConnectionServerinlocal-server-testscover the round-trip throughloadFrozenContainerFromPendingState.Reviewer Guidance
The review process is outlined on this wiki page.
container-loaderrather than taking a dependency oncontainer-runtime/runtime-definitions. Comment in the file points to the canonical sources. If you'd prefer an extracted shared package, flag it.gcState.gcNodesare kept, not skipped — intentional, to tolerate GC lag for recently-attached blobs. OnlyunreferencedTimestampMs, tombstones, and deletedNodes filter content out.ISnapshotTree.unreferencedis never set end-to-end;TestFluidObjectFactorydoesn't produce loading-group datastores).