Snapshot name collisions #4485

Merged
merged 18 commits into from
Jan 4, 2021
Changes from 11 commits
@@ -5,3 +5,4 @@

export * from "./containerRuntime";
export * from "./containerRuntimeDirtyable";
export * from "./snapshot";
69 changes: 69 additions & 0 deletions packages/runtime/container-runtime-definitions/src/snapshot.ts
@@ -0,0 +1,69 @@
/*!
* Copyright (c) Microsoft Corporation. All rights reserved.
* Licensed under the MIT License.
*/

import { ISnapshotTree } from "@fluidframework/protocol-definitions";
import { channelsTreeName } from "@fluidframework/runtime-definitions";

export type ContainerRuntimeSnapshotFormatVersion =
/**
* Format version is missing from snapshot.
* This indicates it is an older version.
*/
| undefined
/**
* Introduces .metadata blob and .channels trees for isolation of
* data store trees from container-level objects.
*/
| "0.1";

export type DataStoreSnapshotFormatVersion =
/**
* Format version is missing from snapshot.
* This indicates it is an older version.
*/
| undefined
/**
* From this version the pkg within the data store
* attributes blob is a JSON array rather than a string.
*/
| "0.1"
/**
* Introduces .channels trees for isolation of
* channel trees from data store objects.
*/
| "0.2";

export const metadataBlobName = ".metadata";
export const chunksBlobName = ".chunks";
export const blobsTreeName = ".blobs";

export interface IContainerRuntimeMetadata {
snapshotFormatVersion: ContainerRuntimeSnapshotFormatVersion;
}

export const protocolTreeName = ".protocol";

/**
* List of tree IDs at the container level which are reserved.
* This is for older versions of snapshots that do not yet have an
* isolated data stores namespace. Without the namespace, this must
* be used to prevent name collisions with data store IDs.
*/
export const nonDataStorePaths = [protocolTreeName, ".logTail", ".serviceProtocol", blobsTreeName];

export const dataStoreAttributesBlobName = ".component";

export interface IRuntimeSnapshot {
Contributor Author

I was thinking of more strongly typing the other layers as well (and actually wrote out the types, as seen in the first commit), but it requires more work across the boundaries for not much gain right now. I will look into it more after fixing SummarizerNode.

Contributor

I think id (at that level) should always be null, no?
Based on the latest discussion with SPO, I think we should start asserting in our layers that we provide either a value (content, with id === null) or a reference (id !== null, with no other fields provided) for trees, similar to how we do for blobs: we either reuse an existing one or write out a new one.
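
A minimal sketch of the value-or-reference rule suggested above. The names `TreeValue`, `TreeReference`, `TreeNode`, and `assertValueOrReference` are hypothetical and only illustrate the idea; they are not types from the Fluid Framework packages, and the actual write-side types may look different.

```typescript
// Hypothetical types for illustration only; not the Fluid Framework definitions.

// A "value" tree carries its content inline and has no ID yet.
interface TreeValue {
    id: null;
    blobs: { [path: string]: string };
    trees: { [path: string]: TreeNode };
}

// A "reference" tree points at an already-stored tree and carries nothing else.
interface TreeReference {
    id: string;
}

type TreeNode = TreeValue | TreeReference;

function isTreeReference(node: TreeNode): node is TreeReference {
    return node.id !== null;
}

// Throws if any node mixes the two shapes (an ID together with inline content).
function assertValueOrReference(node: TreeNode): void {
    if (isTreeReference(node)) {
        const extra = Object.keys(node).filter((key) => key !== "id");
        if (extra.length > 0) {
            throw new Error(`Reference tree ${node.id} must not carry: ${extra.join(", ")}`);
        }
        return;
    }
    // Value trees are checked recursively.
    for (const name of Object.keys(node.trees)) {
        assertValueOrReference(node.trees[name]);
    }
}
```

With a union like this, the compiler rejects a literal that sets both `id: "abc"` and `blobs`, and the runtime check catches the same mistake in data that arrives untyped.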

Contributor Author

My understanding was that storage always returns ID + all tree contents (i.e. this is the full skeleton). I think it could make sense for them to sometimes return partial trees, but I don't know exactly how that would be specified right now.

If there's a better way to type it that you know of, I can change it, but I know just from debugging that currently it will be id !== null and other fields are provided so that type would be misleading or at least limiting (maybe intentionally?).

Contributor

Sorry, I thought this interface was for writing snapshots (summaries), not for reading.
The runtime never reads "snapshots"; it operates on individual (shallow) trees that always have content and an ID.

BTW, we should try to eliminate the word "snapshot" from our code base. It is used as in "snapshot tree", but I think the naming is from the past. I think they are always shallow trees (i.e. they only contain one level of data), right?

id: string | null;
blobs: {
[chunksBlobName]: string;
[metadataBlobName]: string;
};
trees: {
[protocolTreeName]: ISnapshotTree;
[blobsTreeName]: ISnapshotTree;
[channelsTreeName]: ISnapshotTree;
},
}
40 changes: 30 additions & 10 deletions packages/runtime/container-runtime/src/containerRuntime.ts
@@ -29,9 +29,13 @@ import {
AttachState,
} from "@fluidframework/container-definitions";
import {
blobsTreeName,
chunksBlobName,
IContainerRuntime,
IContainerRuntimeDirtyable,
IContainerRuntimeEvents,
IContainerRuntimeMetadata,
metadataBlobName,
} from "@fluidframework/container-runtime-definitions";
import {
assert,
@@ -90,6 +94,7 @@ import {
IChannelSummarizeResult,
CreateChildSummarizerNodeParam,
SummarizeInternalFn,
channelsTreeName,
} from "@fluidframework/runtime-definitions";
import {
addBlobToSummary,
@@ -114,11 +119,7 @@ import { SummaryCollection } from "./summaryCollection";
import { PendingStateManager } from "./pendingStateManager";
import { pkgVersion } from "./packageVersion";
import { BlobManager } from "./blobManager";
import { DataStores } from "./dataStores";

const chunksBlobName = ".chunks";
const blobsTreeName = ".blobs";
export const nonDataStorePaths = [".protocol", ".logTail", ".serviceProtocol", blobsTreeName];
import { BaseSnapshotType, DataStores } from "./dataStores";

export enum ContainerMessageType {
// An op to be delivered to store
@@ -475,14 +476,21 @@ export class ContainerRuntime extends TypedEventEmitter<IContainerRuntimeEvents>

const registry = new ContainerRuntimeDataStoreRegistry(registryEntries);

const chunkId = context.baseSnapshot?.blobs[chunksBlobName];
const chunks = context.baseSnapshot && chunkId ? context.storage ?
await readAndParse<[string, string[]][]>(context.storage, chunkId) :
readAndParseFromBlobs<[string, string[]][]>(context.baseSnapshot.blobs, chunkId) : [];
const tryFetchBlob = async <T>(blobName: string): Promise<T | undefined> => {
const blobId = context.baseSnapshot?.blobs[blobName];
if (context.baseSnapshot && blobId) {
return context.storage ?
readAndParse<T>(context.storage, blobId) :
Contributor

Curious, should it be reversed? I.e., if we have it in the snapshot, why do we read it from storage?

Contributor Author (@arinwt, Dec 7, 2020)

Would be an unrelated change I think, but it makes sense to me. I don't know as much about what it means when the blobs are stored directly in the snapshot though, maybe we prioritize the ones in storage for some reason? i.e. out of date or something?

@jatgarg would know I think.
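
For reference, a self-contained sketch of the lookup order being discussed: resolve the blob name to an ID through the base snapshot, then prefer storage when it exists and fall back to inline contents otherwise. `BlobStorage`, `SnapshotLike`, and the `inlineContents` parameter are hypothetical stand-ins, not the real `IDocumentStorageService`, `readAndParse`, or `readAndParseFromBlobs`; the sketch also assumes blob contents are plain JSON strings rather than encoded buffers.

```typescript
// Hypothetical stand-ins for illustration; not the Fluid Framework APIs.
interface BlobStorage {
    // Assumed to return the blob content as a JSON string.
    read(blobId: string): Promise<string>;
}

interface SnapshotLike {
    // Maps blob name to blob ID.
    blobs: { [name: string]: string };
}

async function tryFetchBlob<T>(
    blobName: string,
    snapshot: SnapshotLike | undefined,
    storage: BlobStorage | undefined,
    inlineContents: { [blobId: string]: string },
): Promise<T | undefined> {
    const blobId = snapshot?.blobs[blobName];
    if (blobId === undefined) {
        // No snapshot, or the snapshot has no such blob (e.g. no .metadata in older formats).
        return undefined;
    }
    // Same priority as in the diff: storage first, inline contents only when there is no storage.
    const raw = storage !== undefined ? await storage.read(blobId) : inlineContents[blobId];
    return raw === undefined ? undefined : (JSON.parse(raw) as T);
}
```

Reversing the priority, as asked above, would only mean swapping the two branches of the conditional; whether that is safe depends on whether inline contents can be stale, which this sketch does not answer.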

readAndParseFromBlobs<T>(context.baseSnapshot.blobs, blobId);
}
};
const chunks = await tryFetchBlob<[string, string[]][]>(chunksBlobName) ?? [];
const metadata = await tryFetchBlob<IContainerRuntimeMetadata>(metadataBlobName);

const runtime = new ContainerRuntime(
context,
registry,
metadata,
chunks,
runtimeOptions,
containerScope,
@@ -645,6 +653,7 @@ export class ContainerRuntime extends TypedEventEmitter<IContainerRuntimeEvents>
private constructor(
private readonly context: IContainerContext,
private readonly registry: IFluidDataStoreRegistry,
metadata: IContainerRuntimeMetadata = { snapshotFormatVersion: undefined },
chunks: [string, string[]][],
private readonly runtimeOptions: IContainerRuntimeOptions = {
generateSummaries: true,
@@ -692,8 +701,19 @@ export class ContainerRuntime extends TypedEventEmitter<IContainerRuntimeEvents>
},
);

// back-compat before namespaces were improved
let dataStoresSnapshot = context.baseSnapshot;
let dataStoresSnapshotType: BaseSnapshotType = "legacy";

if (!!dataStoresSnapshot && metadata.snapshotFormatVersion !== undefined) {
dataStoresSnapshot = dataStoresSnapshot.trees[channelsTreeName];
dataStoresSnapshotType = "next";
assert(!!dataStoresSnapshot, "expected .channels tree in snapshot");
}

this.dataStores = new DataStores(
context.baseSnapshot,
dataStoresSnapshot,
dataStoresSnapshotType,
this,
(attachMsg) => this.submit(ContainerMessageType.Attach, attachMsg),
(id: string, createParam: CreateChildSummarizerNodeParam) =>
64 changes: 38 additions & 26 deletions packages/runtime/container-runtime/src/dataStoreContext.ts
@@ -18,7 +18,7 @@ import {
BindState,
AttachState,
} from "@fluidframework/container-definitions";
import { Deferred, assert, TypedEventEmitter } from "@fluidframework/common-utils";
import { Deferred, assert, TypedEventEmitter, unreachableCase } from "@fluidframework/common-utils";
import { IDocumentStorageService } from "@fluidframework/driver-definitions";
import { readAndParse } from "@fluidframework/driver-utils";
import { BlobTreeEntry } from "@fluidframework/protocol-base";
@@ -30,8 +30,13 @@ import {
ITree,
ITreeEntry,
} from "@fluidframework/protocol-definitions";
import { IContainerRuntime } from "@fluidframework/container-runtime-definitions";
import {
dataStoreAttributesBlobName,
DataStoreSnapshotFormatVersion,
IContainerRuntime,
} from "@fluidframework/container-runtime-definitions";
import {
channelsTreeName,
CreateChildSummarizerNodeFn,
CreateChildSummarizerNodeParam,
FluidDataStoreRegistryEntry,
@@ -51,22 +56,17 @@ import {
import { addBlobToSummary, convertSummaryTreeToITree } from "@fluidframework/runtime-utils";
import { ContainerRuntime } from "./containerRuntime";

// Snapshot Format Version to be used in store attributes.
export const currentSnapshotFormatVersion = "0.1";

const attributesBlobKey = ".component";

function createAttributes(pkg: readonly string[], isRootDataStore: boolean): IFluidDataStoreAttributes {
const stringifiedPkg = JSON.stringify(pkg);
return {
pkg: stringifiedPkg,
snapshotFormatVersion: currentSnapshotFormatVersion,
snapshotFormatVersion: "0.1",
isRootDataStore,
};
}
export function createAttributesBlob(pkg: readonly string[], isRootDataStore: boolean): ITreeEntry {
const attributes = createAttributes(pkg, isRootDataStore);
return new BlobTreeEntry(attributesBlobKey, JSON.stringify(attributes));
return new BlobTreeEntry(dataStoreAttributesBlobName, JSON.stringify(attributes));
}

/**
@@ -76,7 +76,7 @@ export function createAttributesBlob(pkg: readonly string[], isRootDataStore: bo
*/
export interface IFluidDataStoreAttributes {
pkg: string;
readonly snapshotFormatVersion?: string;
readonly snapshotFormatVersion: DataStoreSnapshotFormatVersion;
/**
* This tells whether a data store is root. Root data stores are never collected.
* Non-root data stores may be collected if they are not used. If this is not present, default it to
@@ -395,7 +395,7 @@ export abstract class FluidDataStoreContext extends TypedEventEmitter<IFluidData
// eslint-disable-next-line @typescript-eslint/no-non-null-assertion
const summarizeResult = await this.channel!.summarize(fullTree, trackState);
const attributes: IFluidDataStoreAttributes = createAttributes(pkg, isRootDataStore);
addBlobToSummary(summarizeResult, attributesBlobKey, JSON.stringify(attributes));
addBlobToSummary(summarizeResult, dataStoreAttributesBlobName, JSON.stringify(attributes));
return { ...summarizeResult, id: this.id };
}

@@ -554,7 +554,7 @@ export class RemotedFluidDataStoreContext extends FluidDataStoreContext {

constructor(
id: string,
private readonly initSnapshotValue: Promise<ISnapshotTree> | string | null,
private readonly initSnapshotValue: Promise<ISnapshotTree> | string | undefined,
runtime: ContainerRuntime,
storage: IDocumentStorageService,
scope: IFluidObject,
@@ -586,12 +586,12 @@ export class RemotedFluidDataStoreContext extends FluidDataStoreContext {
// pkg can never change for a store.
protected async getInitialSnapshotDetails(): Promise<ISnapshotDetails> {
if (!this.details) {
let tree: ISnapshotTree | null;
let tree: ISnapshotTree | undefined;
let isRootStore: boolean | undefined;

if (typeof this.initSnapshotValue === "string") {
const commit = (await this.storage.getVersions(this.initSnapshotValue, 1))[0];
tree = await this.storage.getSnapshotTree(commit);
tree = await this.storage.getSnapshotTree(commit) ?? undefined;
} else {
tree = await this.initSnapshotValue;
}
@@ -605,24 +605,36 @@ export class RemotedFluidDataStoreContext extends FluidDataStoreContext {
this.pending = loadedSummary.outstandingOps.concat(this.pending!);
}

if (tree !== null && tree.blobs[attributesBlobKey] !== undefined) {
if (!!tree && tree.blobs[dataStoreAttributesBlobName] !== undefined) {
// Need to rip through snapshot and use that to populate extraBlobs
const { pkg, snapshotFormatVersion, isRootDataStore } =
await localReadAndParse<IFluidDataStoreAttributes>(tree.blobs[attributesBlobKey]);
await localReadAndParse<IFluidDataStoreAttributes>(tree.blobs[dataStoreAttributesBlobName]);

let pkgFromSnapshot: string[];
// Use the snapshotFormatVersion to determine how the pkg is encoded in the snapshot.
// For snapshotFormatVersion = "0.1", pkg is jsonified, otherwise it is just a string.
if (snapshotFormatVersion === undefined) {
if (pkg.startsWith("[\"") && pkg.endsWith("\"]")) {
// For snapshotFormatVersion = "0.1" or "0.2", pkg is jsonified, otherwise it is just a string.
switch (snapshotFormatVersion) {
case undefined: {
if (pkg.startsWith("[\"") && pkg.endsWith("\"]")) {
pkgFromSnapshot = JSON.parse(pkg) as string[];
} else {
pkgFromSnapshot = [pkg];
}
break;
}
case "0.2": {
tree = tree.trees[channelsTreeName];
// Intentional fallthrough, since package is still JSON
}
case "0.1": {
pkgFromSnapshot = JSON.parse(pkg) as string[];
} else {
pkgFromSnapshot = [pkg];
break;
}
default: {
unreachableCase(
snapshotFormatVersion,
`Invalid snapshot format version ${snapshotFormatVersion}`);
}
} else if (snapshotFormatVersion === currentSnapshotFormatVersion) {
pkgFromSnapshot = JSON.parse(pkg) as string[];
} else {
throw new Error(`Invalid snapshot format version ${snapshotFormatVersion}`);
}
this.pkg = pkgFromSnapshot;
isRootStore = isRootDataStore;
@@ -637,7 +649,7 @@ export class RemotedFluidDataStoreContext extends FluidDataStoreContext {
// eslint-disable-next-line @typescript-eslint/no-non-null-assertion
pkg: this.pkg!,
isRootDataStore: isRootStore ?? true,
snapshot: tree ?? undefined,
snapshot: tree,
};
}

27 changes: 17 additions & 10 deletions packages/runtime/container-runtime/src/dataStores.ts
@@ -37,18 +37,22 @@ import { BlobCacheStorageService, buildSnapshotTree, readAndParseFromBlobs } fro
import { assert, Lazy } from "@fluidframework/common-utils";
import { v4 as uuid } from "uuid";
import { TreeTreeEntry } from "@fluidframework/protocol-base";
import {
nonDataStorePaths,
} from "@fluidframework/container-runtime-definitions";
import { normalizeAndPrefixGCNodeIds } from "@fluidframework/garbage-collector";
import { DataStoreContexts } from "./dataStoreContexts";
import { ContainerRuntime, nonDataStorePaths } from "./containerRuntime";
import { ContainerRuntime } from "./containerRuntime";
import {
FluidDataStoreContext,
RemotedFluidDataStoreContext,
IFluidDataStoreAttributes,
currentSnapshotFormatVersion,
LocalFluidDataStoreContext,
createAttributesBlob,
LocalDetachedFluidDataStoreContext,
} from "./dataStoreContext";
} from "./dataStoreContext";

export type BaseSnapshotType = "legacy" | "next";
Contributor

Maybe put a comment here describing what legacy and next mean in practice?
Should we give them more descriptive names? While this is only runtime data, I imagine that we might have a "next next" format some day :)
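
As a rough illustration of what the two values mean in practice, going by this diff: with "legacy" the data store trees sit beside container-level trees, so reserved names must be skipped; with "next" the caller passes the isolated .channels subtree, so every child is a data store. The helper `listDataStoreIds` and the `TreeLike` type below are hypothetical; only the reserved names are copied from `nonDataStorePaths` in the diff.

```typescript
type BaseSnapshotType = "legacy" | "next";

// Reserved container-level tree names, copied from nonDataStorePaths in this diff.
const reservedContainerPaths = [".protocol", ".logTail", ".serviceProtocol", ".blobs"];

// Hypothetical minimal tree shape; the real ISnapshotTree has more fields.
interface TreeLike {
    trees: { [name: string]: TreeLike };
}

// Hypothetical helper: returns the child tree names that are treated as data store IDs.
function listDataStoreIds(root: TreeLike, type: BaseSnapshotType): string[] {
    // "legacy": data stores share a namespace with container-level trees, so skip reserved names.
    // "next": root is already the isolated data store namespace, so nothing is reserved.
    const excluded = type === "legacy" ? reservedContainerPaths : [];
    return Object.keys(root.trees).filter((name) => excluded.indexOf(name) === -1);
}
```

Under this reading, a data store whose ID happens to be ".protocol" is silently dropped in the legacy layout but is listed normally in the isolated one, which is the name collision the PR title refers to.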


/**
* This class encapsulates data store handling. Currently it is only used by the container runtime,
@@ -66,6 +70,7 @@ export class DataStores implements IDisposable {

constructor(
private readonly baseSnapshot: ISnapshotTree | undefined,
baseSnapshotType: BaseSnapshotType,
private readonly runtime: ContainerRuntime,
private readonly submitAttachFn: (attachContent: any) => void,
private readonly getCreateChildSummarizerNodeFn:
@@ -75,11 +80,12 @@ export class DataStores implements IDisposable {
) {
this.logger = ChildLogger.create(baseLogger);
// Extract stores stored inside the snapshot
const fluidDataStores = new Map<string, ISnapshotTree | string>();
const fluidDataStores = new Map<string, ISnapshotTree>();

if (typeof baseSnapshot === "object") {
const nonDataStorePathsToUse = baseSnapshotType === "legacy" ? nonDataStorePaths : [];
if (baseSnapshot) {
Object.keys(baseSnapshot.trees).forEach((value) => {
if (!nonDataStorePaths.includes(value)) {
if (!nonDataStorePathsToUse.includes(value)) {
const tree = baseSnapshot.trees[value];
fluidDataStores.set(value, tree);
}
@@ -110,10 +116,11 @@ export class DataStores implements IDisposable {
snapshotTree.blobs,
snapshotTree.blobs[".component"]);
// Use the snapshotFormatVersion to determine how the pkg is encoded in the snapshot.
// For snapshotFormatVersion = "0.1", pkg is jsonified, otherwise it is just a string.
// For snapshotFormatVersion = "0.1" or "0.2", pkg is jsonified, otherwise it is just a string.
// However the feature of loading a detached container from snapshot, is added when the
// snapshotFormatVersion is "0.1", so we don't expect it to be anything else.
if (snapshotFormatVersion === currentSnapshotFormatVersion) {
// snapshotFormatVersion is at least "0.1", so we don't expect it to be anything else.
if (snapshotFormatVersion === "0.1"
|| snapshotFormatVersion === "0.2") {
pkgFromSnapshot = JSON.parse(pkg) as string[];
} else {
throw new Error(`Invalid snapshot format version ${snapshotFormatVersion}`);
@@ -163,7 +170,7 @@ export class DataStores implements IDisposable {

const flatBlobs = new Map<string, string>();
let flatBlobsP = Promise.resolve(flatBlobs);
let snapshotTreeP: Promise<ISnapshotTree> | null = null;
let snapshotTreeP: Promise<ISnapshotTree> | undefined;
if (attachMessage.snapshot) {
snapshotTreeP = buildSnapshotTree(attachMessage.snapshot.entries, flatBlobs);
// flatBlobs' validity is contingent on snapshotTreeP's resolution
@@ -251,6 +251,7 @@ describe("Data Store Context Tests", () => {
it("Check RemotedDataStore Attributes without version", async () => {
dataStoreAttributes = {
pkg: "TestDataStore1",
snapshotFormatVersion: undefined,
};
const buffer = IsoBuffer.from(JSON.stringify(dataStoreAttributes), "utf-8");
const blobCache = new Map<string, string>([["fluidDataStoreAttributes", buffer.toString("base64")]]);