Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ISnapshotTree.blobs has unwanted usage - used to store blobID mapping, as well as blob content #4746

Closed
vladsud opened this issue Jan 6, 2021 · 3 comments
Assignees
Labels
area: driver Driver related issues bug Something isn't working
Milestone

Comments

@vladsud
Copy link
Contributor

vladsud commented Jan 6, 2021

There are multiple (but not that many) places in our repo where we rely on a fact that ISnapshotTree.blobs contains two non-overlapping and distinct set of data:

  1. Mapping of path (like "header" or ".attributes") to storage blobId.
  2. Mapping of blobId to actual content of the blob.

This looks totally wrong for two reasons:

  1. There is actually no guarantee that these two sets (path names and blobId names) have no overlap. Nothing prevents storage to generate blobId = "header". This will result in user data loss as we will not be able to open a file.
  2. As we change how data flow through storage/driver/runtime, blobs will become binary, i.e. not strings but ArrayBufferLike. We will have to split these two sets or have a pain of continuous cases and suppressing compiler warnings / adding a lot of unneeded IFs.

I believe we are in this situation only because unsaved changes (i.e. features like Draft mode, and container runtime reload due to code proposal) store data in right there in blobs property, instead of using separate array.

loadAndInitializeProtocolState() is a good example here - instead of using same path (i.e. always have storage interface, and readiing from it as we do on "normal" load, we added another branch that assumes blobs themselves can be fetched from same ISnapshotTree. We get here due to convertProtocolAndAppSummaryToSnapshotTreeCore() making that design choice of stuffing everything everything into same ISnapshotTree, I believe it should not do it. I believe this is the only place where we do it (based on scanning of who's using fromUtf8ToBase64)

Places where we rely on that behavior (reading) are easy to find after https://github.com/microsoft/FluidFramework/pull/4530/files makes it into main - one can search for few remaining usages of fromBase64ToUtf8() to find them.

Examples:

readAndParseFromBlobs

export function readAndParseFromBlobs<T>(blobs: {[index: string]: string}, id: string): T {
    const encoded = blobs[id];
    const decoded = fromBase64ToUtf8(encoded);
    return JSON.parse(decoded) as T;
}

convertSnapshotTreeToSummaryTree():

    // The entries in blobs are supposed to be blobPath -> blobId and blobId -> blobValue
    // and we want to push blobPath to blobValue in tree entries.
    if (snapshot.blobs[value] !== undefined) {
        const decoded = fromBase64ToUtf8(snapshot.blobs[value]);
        builder.addBlob(key, decoded);
    }

getDocumentAttributes():

    // Back-compat: old docs would have ".attributes" instead of "attributes"
    const attributesHash = ".protocol" in tree.trees
        ? tree.trees[".protocol"].blobs.attributes
        : tree.blobs[".attributes"];

    const attributes = storage !== undefined ? await readAndParse<IDocumentAttributes>(storage, attributesHash)
        : readAndParseFromBlobs<IDocumentAttributes>(tree.trees[".protocol"].blobs, attributesHash);

loadAndInitializeProtocolState():

            members = readAndParseFromBlobs<[string, ISequencedClient][]>(snapshot.trees[".protocol"].blobs,
                baseTree.blobs.quorumMembers);
            proposals = readAndParseFromBlobs<[number, ISequencedProposal, string[]][]>(
                snapshot.trees[".protocol"].blobs, baseTree.blobs.quorumProposals);
            values = readAndParseFromBlobs<[string, ICommittedProposal][]>(snapshot.trees[".protocol"].blobs,
                baseTree.blobs.quorumValues);

convertProtocolAndAppSummaryToSnapshotTreeCore:

function convertProtocolAndAppSummaryToSnapshotTreeCore(
            case SummaryType.Blob: {
                const blobId = uuid();
                treeNode.blobs[key] = blobId;
                const content = typeof summaryObject.content === "string" ?
                    summaryObject.content : Uint8ArrayToString(summaryObject.content, "base64");
                treeNode.blobs[blobId] = fromUtf8ToBase64(content);
                break;
            }
@vladsud vladsud added the bug Something isn't working label Jan 6, 2021
@ghost ghost added the triage label Jan 6, 2021
@vladsud vladsud added this to the February 2021 milestone Jan 6, 2021
@ghost ghost added triage and removed triage labels Jan 6, 2021
@vladsud
Copy link
Contributor Author

vladsud commented Jan 6, 2021

Related item: #4695

@vladsud
Copy link
Contributor Author

vladsud commented Feb 25, 2021

Please note that addressing this issue properly will likely mean introduction of storage object where previously we have not had one.
Note that we were already pretty bad at managing when we have storage and when we do not have it and transitions. I've added this comment in code- it's easy to see here by adding assert and hitting it right away:

    public get storage(): IDocumentStorageService {
        // This code is plain wrong. It lies that it never returns undefined!!!
        // All callers should be fixed, as this API is called in detached state of container when we have
        // no storage and it's passed down the stack without right typing.
        if (!this._storage && this.context.storage) {
            // Note: BlobAggregationStorage is smart enough for double-wrapping to be no-op
            this._storage = BlobAggregationStorage.wrap(this.context.storage, this.logger);
        }
        // eslint-disable-next-line @typescript-eslint/no-non-null-assertion
        return this._storage!;
    }

It would be great to re-evaluate how we deal with transitions of no storage -> storage, and in order to fix this bug it likely needs to be temp storage -> permanent storage.
For most part that can be abstracted from layers below by always giving storage object and doing switch under the covers, though it begs question - can we do it safely? Or it needs to be an addition (i.e. old blobs are always available, but now we gained write capabilities).
I guess first part on solving problems here - understanding how code works today and how we can remove eslint violation suppressions and make code work correctly with property types, learn and move to fixing this issue.

@jatgarg
Copy link
Contributor

jatgarg commented Jul 30, 2021

Closing with all above mentioned PR.
Following up with this to remove leftover back compat code in future. #6938

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: driver Driver related issues bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants