Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

onSnapshot performance degrades "exponentially" as size of snapshot increase (e.g. >5k) #2668

Closed
tonyxiao opened this issue Feb 24, 2020 · 4 comments

Comments

@tonyxiao
Copy link

Environment

  • Operating System version: macOS Catalina 10.15.3 (19D76)
  • Browser version: Chrome Version 80.0.3987.116 (Official Build) (64-bit)
  • Firebase SDK version: 7.8.2
  • Firebase Product: firestore (auth, database, storage, etc)

Describe the problem

When I have an onSnapshot open with a large number of documents (in our case it was 4.7k, as new document gets added server-side, it causes the client to grind to a complete halt. My mac Activity Manager shows 150% CPU usage.

Here's a screenshot of the chrome profiler
image

I've identified the cause to be inside watch_change.ts, specifically

private targetContainsDocument(
    targetId: TargetId,
    key: DocumentKey
  ): boolean {
    const existingKeys = this.metadataProvider.getRemoteKeysForTarget(targetId);
    return existingKeys.has(key);
  }

This function calls getRemoteKeysForTarget inside sync_engine.ts, the implementation of which is as follow

getRemoteKeysForTarget(targetId: TargetId): DocumentKeySet {
    const limboResolution = this.limboResolutionsByTarget[targetId];
    if (limboResolution && limboResolution.receivedDocument) {
      return documentKeySet().add(limboResolution.key);
    } else {
      let keySet = documentKeySet();
      const queries = this.queriesByTarget[targetId];
      if (!queries) {
        return keySet;
      }
      for (const query of queries) {
        const queryView = this.queryViewsByQuery.get(query);
        assert(!!queryView, `No query view found for ${query}`);
        keySet = keySet.unionWith(queryView.view.syncedDocuments);
      }
      return keySet;
    }
  }

The problem is that keySet is a SortedSet, and calling unionWith on it is a very expensive operation (at least n Log(n) if I still remember from algorithm class). Given that targetContainsDocument gets called for every document change, we end up re-sorting the entire keySet just to handle each change, resulting in a performance of roughly m * n * log(n) where m is the # of changes and n is the number of existing documents in the snapshot.

I have successfully worked around this problem for the time being by patching the node_module and replace SortedSet with just a plain Set

inside sync_engine.ts

getRemoteKeysForTarget(targetId: TargetId): DocumentKeySet {
// ...
 let globalOrWindow = typeof global !== 'undefined' ? global : window
      if (globalOrWindow.HACK__getRemoteKeysForTarget === true) {
        const notSortedKeys = new Set()
        for (const query of queries) {
          const queryView = this.queryViewsByQuery.get(query);
          for (const doc of queryView.view.syncedDocuments.toArray()) {
            notSortedKeys.add(doc)
            return notSortedKeys as any
          }
        }
      }
// ...

And inside watch_change.ts

   let globalOrWindow = typeof global !== 'undefined' ? global : window
        global.HACK__getRemoteKeysForTarget = true
        var existingKeys = this.metadataProvider.getRemoteKeysForTarget(targetId);
        global.HACK__getRemoteKeysForTarget = false

I'm not familiar with the firestore architecture enough to propose the "right" solution yet, however it seems that as a starter we should not be sorting the entire key set for every new document change from the server and can add a short-circuit for this special case of detecting has.

@tonyxiao
Copy link
Author

Ooops, just found this #2620 might be either related or duplicate.

@tonyxiao
Copy link
Author

Seeing this commit got merged. https://github.com/firebase/firebase-js-sdk/pull/2624/files

Would the sortedSet performance be similar to Set after this? Or would it still be worth having a separate codepath for detecting targetContainsDocument, which does not care about sort order?

@wu-hui
Copy link
Contributor

wu-hui commented Feb 24, 2020

Hi @tonyxiao

Very impressive analysis and write up!

You are right about the cause, and #2620 is an earlier report of the same issue. The fix is also in the latest version, please give it a try to see if your problem goes away (they should).

The issue is our SortedSet.unionWith is not optimized, which is fixed in the pr you linked.

Thanks.

@wu-hui wu-hui closed this as completed Feb 24, 2020
@tonyxiao
Copy link
Author

Thank you @wu-hui . I applied a patch and can verify that performance is indeed much better and app no longer gets completely stuck. It's still sluggish when receiving changes and blocks the UI, so we're thinking of moving firebase into a webworker to work around that. Would you recommend that approach or do you think there's still a lot more low hanging fruits in the firebase codebase that would obviate that need? One particular issue I ran into while trying this so far was that LocalStorage isn't supported inside webworker.

Oh also, according to https://github.com/firebase/firebase-js-sdk/blob/master/packages/firestore/CHANGELOG.md#unreleased , it's not yet released. Am I missing something?

@firebase firebase locked and limited conversation to collaborators Mar 26, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

3 participants