`have_seen_events` is eating up time while processing state and auth events while backfilling #13625

MadLittleMods · 2022-08-25T02:08:44Z

Mentioned in internal doc. Part of #13356

Optimize `have_seen_events`: when backfilling `#matrix:matrix.org`, 20s is spent just calling `have_seen_events` on the 200k state and auth events in the room.

have_seen_events (157 db.have_seen_events) takes 6.62s to process 77k events
have_seen_events (246 db.have_seen_events) takes 13.19s to process 122k events

The benchmark and timings are from the in-flight PR, #13561.
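The call counts above (157 database calls for 77k events, 246 for 122k) work out to roughly 490 events per call, which is consistent with the lookup being done in batches. As a rough illustration only, here is a sketch of a batched existence check against SQLite. The `events` table layout, the `BATCH_SIZE` of 500, and the `have_seen_events_sketch` helper are all assumptions made for this example, not Synapse's actual schema or code.

```python
import sqlite3
from typing import Iterable, Iterator, List, Set

BATCH_SIZE = 500  # assumed; picked to be close to the ~490 events per DB call above


def batch_iter(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start : start + size]


def have_seen_events_sketch(
    conn: sqlite3.Connection, room_id: str, event_ids: Iterable[str]
) -> Set[str]:
    """Return the subset of `event_ids` already stored for `room_id` (hypothetical helper)."""
    seen: Set[str] = set()
    for chunk in batch_iter(list(event_ids), BATCH_SIZE):
        # One query per batch, with one placeholder per event ID in the batch.
        placeholders = ",".join("?" for _ in chunk)
        rows = conn.execute(
            "SELECT event_id FROM events "
            f"WHERE room_id = ? AND event_id IN ({placeholders})",
            [room_id, *chunk],
        )
        seen.update(row[0] for row in rows)
    return seen


# Toy data: only the even-numbered events are stored.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (room_id TEXT, event_id TEXT, PRIMARY KEY (room_id, event_id))"
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("!room:test", f"$event{i}") for i in range(0, 2000, 2)],
)

queried = [f"$event{i}" for i in range(2000)]
seen = have_seen_events_sketch(conn, "!room:test", queried)
print(len(seen), "of", len(queried), "events already seen")
```

With batching like this, the number of round trips grows with `len(event_ids) / BATCH_SIZE`, so any per-event overhead layered on top (such as a per-key cache) is paid on every one of the 200k events.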

The `@cachedList` is so slow 🐌 that we're better off removing it at this point. It would be good to see if `@cachedList` can be improved or if we can improve things further with a better cache.

| # events | Benchmark run | Timing (s) | Timing with the `@cachedList` cache removed (s) |
| --- | --- | --- | --- |
| 50k | 1, cold cache | 3.7170820236206055 | 0.3248419761657715 |
| 50k | 2, warm cache | 0.2985079288482666 | 0.32351016998291016 |
| 50k | 3, warm cache | 0.28847789764404297 | 0.3136260509490967 |
| 50k | 4, odds | 0.1537461280822754 | 0.15899014472961426 |
| 50k | 5, odds | 0.14780497550964355 | 0.15054106712341309 |
| 50k | 6, evens | 0.1475691795349121 | 0.15465688705444336 |
| 50k | 7, evens | 0.14868617057800293 | 0.1412408351898193 |
| 100k | 1, cold cache | 8.10055136680603 | 0.8466510772705078 |
| 100k | 2, warm cache | 0.6121761798858643 | 0.8022150993347168 |
| 100k | 3, warm cache | 0.6093218326568604 | 0.7888422012329102 |
| 100k | 4, odds | 0.29950785636901855 | 0.3941817283630371 |
| 100k | 5, odds | 0.3049640655517578 | 0.416118860244751 |
| 100k | 6, evens | 0.3025388717651367 | 0.42328405380249023 |
| 100k | 7, evens | 0.29833483695983887 | 0.3695280551910400 |
| 200k | 1, cold cache | 19.106724977493286 | 1.328582763671875 |
| 200k | 2, warm cache | 22.98161005973816 | 1.279066801071167 |
| 200k | 3, warm cache | 23.126408100128174 | 1.2781598567962646 |
| 200k | 4, odds | 11.401129007339478 | 0.6520607471466064 |
| 200k | 5, odds | 0.6159579753875732 | 0.647273063659668 |
| 200k | 6, evens | 12.087002992630005 | 0.6393017768859863 |
| 200k | 7, evens | 0.6241748332977295 | 0.6427278518676758 |
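To make the cold-cache gap easier to read, the short script below works out the speed-up and the per-event cost from the benchmark numbers above. It only does arithmetic on the reported timings; the variable names are made up for this sketch.

```python
# (number of events, cold-cache seconds with @cachedList, cold-cache seconds without it)
cold_cache_timings = [
    (50_000, 3.7170820236206055, 0.3248419761657715),
    (100_000, 8.10055136680603, 0.8466510772705078),
    (200_000, 19.106724977493286, 1.328582763671875),
]

speedups = {}
for n_events, with_cache, without_cache in cold_cache_timings:
    # How many times faster the cold run is once the cache is removed.
    speedup = with_cache / without_cache
    # Average cost per event, in microseconds, with the cache in place.
    per_event_us = with_cache / n_events * 1e6
    speedups[n_events] = speedup
    print(
        f"{n_events:>7} events: {speedup:.1f}x faster without the cache, "
        f"{per_event_us:.0f} µs/event with it"
    )
```

On these numbers the cold-cache run is somewhere between 9x and 15x faster with the cache removed, and the gap widens as the number of events grows.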


Fix #13856
Fix #13865

> Discovered while trying to make Synapse fast enough for [this MSC2716 test for importing many batches](matrix-org/complement#214 (comment)). As an example, disabling the `have_seen_event` cache saves 10 seconds for each `/messages` request in that MSC2716 Complement test because we're not making as many federation requests for `/state` (speeding up `have_seen_event` itself is related to #13625)
>
> But this will also make `/messages` faster in general so we can include it in the [faster `/messages` milestone](https://github.com/matrix-org/synapse/milestone/11).
>
> *-- #13856*

### The problem

`_invalidate_caches_for_event` doesn't run in monolith mode, which means we never even tried to clear the `have_seen_event` cache and other caches. And even in worker mode, it only runs on the workers, not the master (AFAICT).

Additionally, there was a bug with the key being wrong, so `_invalidate_caches_for_event` never invalidates the `have_seen_event` cache even when it does run.

Because we were using `@cachedList` incorrectly, it was putting items in the cache under keys like `((room_id, event_id),)`, a tuple nested in a tuple (ex. `(('!TnCIJPKzdQdUlIyXdQ:test', '$Iu0eqEBN7qcyF1S9B3oNB3I91v2o5YOgRNPwi_78s-k'),)`), and we were trying to invalidate with just `(room_id, event_id)`, which did nothing.
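The key mismatch described above is easy to reproduce with a plain dictionary. This is only a minimal sketch of the failure mode: the `cache` dict stands in for the real cache, and nothing here is Synapse's actual cache implementation.

```python
from typing import Dict

# Stand-in for the cache: maps a key to "have we seen this event?"
cache: Dict[tuple, bool] = {}

room_id = "!TnCIJPKzdQdUlIyXdQ:test"
event_id = "$Iu0eqEBN7qcyF1S9B3oNB3I91v2o5YOgRNPwi_78s-k"

# The entry is stored under a key wrapped in an extra 1-tuple.
nested_key = ((room_id, event_id),)
cache[nested_key] = False  # cached as "not seen"

# Invalidation uses the flat key, which does not match the stored key.
flat_key = (room_id, event_id)
removed_with_flat_key = cache.pop(flat_key, None)

# The stale entry is still there after the attempted invalidation.
still_cached = nested_key in cache

# Invalidating with the same shape of key that was stored does work.
removed_with_nested_key = cache.pop(nested_key, None)

print(removed_with_flat_key, still_cached, removed_with_nested_key)
```

Because `(room_id, event_id)` and `((room_id, event_id),)` are different tuples with different hashes, the flat-key invalidation silently removes nothing and the stale "not seen" entry keeps being served.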

MadLittleMods added the A-Messages-Endpoint /messages client API endpoint (`RoomMessageListRestServlet`) (which also triggers /backfill) label Aug 25, 2022

MadLittleMods self-assigned this Aug 25, 2022

MadLittleMods added this to the Q3 2022 - Faster /messages milestone Aug 25, 2022

MadLittleMods mentioned this issue Aug 25, 2022

Optimize have_seen_events #13561

Closed

6 tasks

This was referenced Sep 21, 2022

have_seen_event cache is not invalidated when we persist an event #13856

Closed

Fix have_seen_event cache not being invalidated #13863

Merged

kittykat mentioned this issue Oct 24, 2022

Profile /messages and make it faster v1 #14284

Closed

7 tasks

MadLittleMods removed their assignment Jan 31, 2023

kittykat mentioned this issue Mar 2, 2023

Profile /messages and make it even faster v2 #15182

Closed

8 tasks

matrixbot mentioned this issue Dec 21, 2023

have_seen_events is eating up time while processing state and auth events while backfilling element-hq/synapse#13625

Open
