This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Initial /sync isn't as fast as it should be when LL is enabled #3720

Closed
ara4n opened this issue Aug 20, 2018 · 8 comments
Labels
z-major (Deprecated Label)

Comments

@ara4n
Member

ara4n commented Aug 20, 2018

I haven't had a chance to profile on my account using the test synchrotron yet, but using sync7 it takes ~3m to calculate a normal initial sync but ~4.5m to calculate one with the LL filter turned on.

I'm going to assume this is because we blow out the state group cache by forcing it to reload everything from the DB, because the cache isn't able to handle wildcards (i.e. the bit where we say "get any state... other than members, which should be limited to this subset") due to #3383 (I think?).

I had an idea for a solution to this though which I wanted to jot down here for feedback:

AIUI, DictionaryCache currently lets you query a subset of dict_keys for a given cache entry - e.g. for state_group_cache the entry key would be a group ID, and the value would be a dict of form { (m.room.member, user_id1): { state_event1 }, (m.room.member, user_id2): { state_event2 } }. If you try to get the cache for a given group ID with dict_keys of [(m.room.member, user_id1)] then you'll just get that subset of the cache entry.
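Roughly, the subset-lookup semantics are as follows (a minimal illustrative sketch, not Synapse's actual DictionaryCache implementation, which carries extra metadata and eviction logic):

```python
# Illustrative sketch of DictionaryCache-style subset lookup. Names and
# structure are simplified; the real code lives in
# synapse/util/caches/dictionary_cache.py.

class DictCache:
    def __init__(self):
        # entry key (e.g. a state group ID) -> dict of (type, state_key) -> event
        self._cache = {}

    def update(self, key, value):
        self._cache[key] = dict(value)

    def get(self, key, dict_keys=None):
        """Return the cached dict for `key`, restricted to `dict_keys` if given."""
        entry = self._cache.get(key)
        if entry is None:
            return None
        if dict_keys is None:
            return dict(entry)
        # Subset query: only the requested (event_type, state_key) pairs.
        return {k: entry[k] for k in dict_keys if k in entry}


cache = DictCache()
cache.update(42, {
    ("m.room.member", "@user1:example.org"): "state_event1",
    ("m.room.member", "@user2:example.org"): "state_event2",
})
subset = cache.get(42, dict_keys=[("m.room.member", "@user1:example.org")])
```

The catch is that a wildcard query ("everything except members") can't be expressed as a list of dict_keys, so it falls through to the DB.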

However: can we fix this by just splitting the cache into two halves - one for members, and the other for non-members state events? Then our query becomes one request to load all state (for non-member state, which can be cached efficiently) - and another to load specific members keys (which lacks a wildcard, and so will also be cached efficiently).

@richvdh, does this sound plausible?
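For what it's worth, the split proposed above might look roughly like this (a sketch with invented names, not the actual #3726 code): partition state by event type, so the non-member half can be fetched wholesale and the member half by explicit keys, neither needing a wildcard.

```python
# Sketch of splitting a state dict into the two proposed cache halves.
# Function and variable names are illustrative, not Synapse's.

def split_state(state):
    """Partition a state dict into member and non-member halves."""
    members, non_members = {}, {}
    for key, event in state.items():
        etype, _state_key = key
        (members if etype == "m.room.member" else non_members)[key] = event
    return members, non_members


state = {
    ("m.room.create", ""): "create_event",
    ("m.room.member", "@a:example.org"): "member_event_a",
    ("m.room.name", ""): "name_event",
}
members, non_members = split_state(state)
```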

@ara4n
Member Author

ara4n commented Aug 20, 2018

I went ahead anyway and implemented a WIP of this over at #3726.

@richvdh
Member

richvdh commented Aug 21, 2018

e.g. for state_group_cache the entry key would be a group ID, and the value would be a dict of form { (m.room.member, user_id1): { state_event1 }, (m.room.member, user_id2): { state_event2 } }

Fwiw the value will be a DictionaryEntry (whose value will be what you describe): it just allows the cache to store some extra metadata.
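At the time, DictionaryEntry was (roughly) a namedtuple wrapping the value with metadata about completeness; something like the following sketch, which approximates rather than reproduces Synapse's definition:

```python
from collections import namedtuple

# Approximate shape of Synapse's DictionaryEntry: the cached value dict
# plus metadata recording whether the entry holds the *full* state for the
# group, and which keys are known to be absent.
DictionaryEntry = namedtuple("DictionaryEntry", ("full", "known_absent", "value"))

entry = DictionaryEntry(
    full=False,
    known_absent={("m.room.member", "@gone:example.org")},
    value={("m.room.member", "@user1:example.org"): "state_event1"},
)
```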

@ara4n
Member Author

ara4n commented Aug 22, 2018

#3726 seems to have sped things up so that LL is now about 30% faster than non-LL (based on my primary matrix.org account). However, it should be way faster than this, so I suspect something is still going wrong - it needs to be profiled.

@ara4n ara4n changed the title Initial /sync is about 1.5x slower when LL is enabled Initial /sync isn't as fast as it should be when LL is enabled Aug 22, 2018
@ara4n
Member Author

ara4n commented Aug 22, 2018

Some datapoints, testing against sync7 with limit 1 and LL on/off:

@matthew2:matrix.org: 0:03s for LL (3.2MB); 0:08s for non-LL (18.7MB)
@matthew:matrix.org: 2:57 for LL (23.3MB); 4:05 for non-LL (142MB)

I think @matthew:matrix.org might be blowing out the state event cache for the whole synchrotron, though. I suspect we can improve this by reducing the synchrotron's RAM usage (i.e. py3) and then tuning the cache sizing.

@ara4n
Member Author

ara4n commented Aug 22, 2018

(These figures should be taken with a pinch of salt, as they were taken against the live sync7 and there will be large caching artefacts. I just reran the LL one for matthew2 and it took 18s, for instance.)

@ara4n
Member Author

ara4n commented Aug 22, 2018

Uncached (by waiting long enough between syncs to let the cache churn), I'm seeing 23s for non-LL and 17s for LL on matthew2, which matches the same ~30% speedup I got for the matthew account.
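For the record, the "same 30%" claim roughly checks out on the numbers quoted in this thread (just arithmetic, nothing more):

```python
# Time reduction of LL relative to non-LL, from the figures above.
def reduction(non_ll_s, ll_s):
    return (non_ll_s - ll_s) / non_ll_s

matthew2 = reduction(23, 17)                   # uncached matthew2: 23s vs 17s
matthew = reduction(4 * 60 + 5, 2 * 60 + 57)   # matthew: 4:05 vs 2:57
# Both come out in the 26-28% range, i.e. "about 30% faster".
```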

@neilisfragile neilisfragile added p1 z-minor (Deprecated Label) labels Aug 28, 2018
@ara4n
Member Author

ara4n commented Sep 4, 2018

Tested again against sync-test: the original #2970 code ran 5x faster with LL than with a full sync. By now, however, it runs at the same speed with & without LL.

It looks like one major factor is that:

        # FIXME: it feels very heavy to load up every single membership event
        # just to calculate the counts.
        member_events = yield self.store.get_events(member_ids.values())

is a massive issue for the room summary behaviour: it loads every membership event from the DB, which eliminates all the benefit of avoiding loading membership events in the first place.
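One way to avoid loading full membership events just to compute counts would be to aggregate at the SQL layer instead. A hypothetical sketch (the table and column names here are illustrative, not Synapse's actual schema or store API):

```python
# Hypothetical sketch: compute membership counts in the database rather
# than loading every membership event into Python. Schema is invented
# for illustration.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE room_memberships (room_id TEXT, user_id TEXT, membership TEXT)"
)
conn.executemany(
    "INSERT INTO room_memberships VALUES (?, ?, ?)",
    [
        ("!room:example.org", "@a:example.org", "join"),
        ("!room:example.org", "@b:example.org", "join"),
        ("!room:example.org", "@c:example.org", "invite"),
    ],
)

# Counts per membership state, computed with GROUP BY instead of
# fetching and deserialising each event.
counts = dict(
    conn.execute(
        "SELECT membership, COUNT(*) FROM room_memberships "
        "WHERE room_id = ? GROUP BY membership",
        ("!room:example.org",),
    )
)
```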

Separately, incremental syncs seem to be much slower than they should be too.

It feels like something else has slowed things down during the review process of #2970 - possibly the split cache isn't helping either in terms of txn count.

@ara4n ara4n added z-major (Deprecated Label) and removed z-major (Deprecated Label) labels Sep 4, 2018
@ara4n
Member Author

ara4n commented Sep 4, 2018

Re-triaging, given the feature doesn't work properly otherwise.
