
/sync performance slow since v0.10.0 #2777

Closed
raidokaldma opened this issue Oct 7, 2022 · 15 comments
Labels
C-Sync-API, T-Task, X-Performance

Comments

@raidokaldma

Background information

  • v0.10.1
  • Monolith
  • Postgres
  • Running in Docker

Description

I'm running Dendrite in Docker and noticed a huge difference in performance when doing initial sync when switching from v0.9.9 to v0.10.0 or v0.10.1

Response times in my server:
With docker image matrixdotorg/dendrite-monolith:v0.9.9

# With existing filter
$ curl -o /dev/null -s -w 'Total: %{time_total}s\n' 'https://[my-domain]/_matrix/client/r0/sync?filter=7&timeout=0&_cacheBuster=1665145519640&access_token=[redacted]'
Total: 1,113253s

# With inline filter
$ curl -o /dev/null -s -w 'Total: %{time_total}s\n' 'https://[my-domain]/_matrix/client/r0/sync?filter=%7B%22room%22%3A%7B%22timeline%22%3A%7B%22limit%22%3A100%7D%7D%7D&timeout=0&access_token=[redacted]'
Total: 6,983575s

With docker image matrixdotorg/dendrite-monolith:v0.10.1

# With existing filter
$ curl -o /dev/null -s -w 'Total: %{time_total}s\n' 'https://[my-domain]/_matrix/client/r0/sync?filter=7&timeout=0&_cacheBuster=1665145519640&access_token=[redacted]'
Total: 5,740234s

# With inline filter
$ curl -o /dev/null -s -w 'Total: %{time_total}s\n' 'https://[my-domain]/_matrix/client/r0/sync?filter=%7B%22room%22%3A%7B%22timeline%22%3A%7B%22limit%22%3A100%7D%7D%7D&timeout=0&access_token=[redacted]'
Total: 15,450809s
@Mic92

Mic92 commented Oct 8, 2022

I can confirm. With weechat-matrix I am no longer able to log in for this reason.

@neilalexander
Contributor

How are your database connections configured? Are you using the global database section or per-component? If so, what are your max_open_conns values?

@raidokaldma
Author

I'm using one global database connection pool with the following config:

  database:
    connection_string: "postgresql://[redacted]"
    max_open_conns: 10
    max_idle_conns: 5
    conn_max_lifetime: -1

@neilalexander
Contributor

With a total connection count of 10 shared across the entirety of Dendrite, I'm not at all surprised to hear things are slow. Are you using such a low limit for a specific reason?

@raidokaldma
Author

I did an experiment and set max_open_conns: 90, as in the default config, and ran the same requests, but I did not notice any difference in response times.

The server I'm running does not get a lot of traffic, and I'm almost the only one running queries, so the available connections do not seem to be exhausted. Judging from the metrics that PostgreSQL reports, Dendrite uses 5 connections or fewer.
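
(As a cross-check, the number of connections Dendrite is holding open can be read straight from Postgres; a minimal sketch, assuming the Dendrite database is simply named "dendrite":)

-- Count the connections currently open against the Dendrite database,
-- grouped by state. The database name below is an assumption; adjust to your setup.
SELECT state, count(*)
FROM pg_stat_activity
WHERE datname = 'dendrite'
GROUP BY state;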

@Mic92

Mic92 commented Oct 11, 2022

After increasing per-component connections to 100, I was able to log in again: Mic92/dotfiles@9f6de28
I don't consistently see that many connections being used, so I am not sure if they are only needed for a short amount of time.

@raidokaldma
Author

raidokaldma commented Oct 19, 2022

I created a database dump, imported the data into a new database and enabled statement logging in Postgres for that database only. Then I ran Docker containers with versions 0.9.1 and 0.10.3 (latest). Here's what I found:

matrixdotorg/dendrite-monolith:v0.9.1

curl -o /dev/null -sS -w 'Total: %{time_total}s\n' 'http://localhost:8008/_matrix/client/r0/sync?filter=7&timeout=0&access_token=[redacted]'
Total: 0,846131s

With a single HTTP request, Dendrite made 858 database queries, the top 10 being:

441 SELECT event_type_nid, event_state_key_nid, event_nid FROM roomserver_events WHERE event_nid = ANY($1) AND (CARDINALITY($2::bigint[]) = 0 OR event_type_nid = ANY($2)) AND (CARDINALITY($3::bigint[]) = 0 OR event_state_key_nid = ANY($3)) ORDER BY event_type_nid, event_state_key_nid ASC
 42 SELECT COUNT(*) FROM ( SELECT DISTINCT ON (room_id, user_id) room_id, user_id, membership FROM syncapi_memberships WHERE room_id = $1 AND stream_pos <= $2 ORDER BY room_id, user_id, stream_pos DESC) t WHERE t.membership = $3
 25 SELECT content FROM account_data WHERE localpart = $1 AND room_id = $2 AND type = $3
 21 SELECT topological_position, stream_position FROM syncapi_output_room_events_topology WHERE event_id = $1
 21 SELECT state_snapshot_nid, state_block_nids FROM roomserver_state_snapshots WHERE state_snapshot_nid = ANY($1) ORDER BY state_snapshot_nid ASC
 21 SELECT state_block_nid, event_nids FROM roomserver_state_block WHERE state_block_nid = ANY($1) ORDER BY state_block_nid ASC
 21 SELECT room_version, room_nid, state_snapshot_nid, latest_event_nids FROM roomserver_rooms WHERE room_id = $1
 21 SELECT room_nid, room_version FROM roomserver_rooms WHERE room_nid = ANY($1)
 21 SELECT latest_event_nids, state_snapshot_nid FROM roomserver_rooms WHERE room_nid = $1
 21 SELECT event_type, event_type_nid FROM roomserver_event_types WHERE event_type = ANY($1)

matrixdotorg/dendrite-monolith:v0.10.3 (latest)

curl -o /dev/null -sS -w 'Total: %{time_total}s\n' 'http://localhost:8008/_matrix/client/r0/sync?filter=7&timeout=0&access_token=[redacted]'
Total: 13,016279s

With a single HTTP request, Dendrite made 3504 database queries, the top 10 being:

2056 SELECT event_type_nid, event_state_key_nid, event_nid FROM roomserver_events WHERE event_nid = ANY($1) AND (CARDINALITY($2::bigint[]) = 0 OR event_type_nid = ANY($2)) AND (CARDINALITY($3::bigint[]) = 0 OR event_state_key_nid = ANY($3)) ORDER BY event_type_nid, event_state_key_nid ASC
 185 SELECT room_nid, room_version FROM roomserver_rooms WHERE room_nid = ANY($1)
 185 SELECT event_nid, room_nid FROM roomserver_events WHERE event_nid = ANY($1)
 185 SELECT event_nid, event_json FROM roomserver_event_json WHERE event_nid = ANY($1) ORDER BY event_nid ASC
 185 SELECT event_nid, event_id FROM roomserver_events WHERE event_nid = ANY($1)
 167 SELECT event_nid, state_snapshot_nid FROM roomserver_events WHERE event_id = $1
 127 SELECT state_block_nid, event_nids FROM roomserver_state_block WHERE state_block_nid = ANY($1) ORDER BY state_block_nid ASC
  42 SELECT state_snapshot_nid, state_block_nids FROM roomserver_state_snapshots WHERE state_snapshot_nid = ANY($1) ORDER BY state_snapshot_nid ASC
  42 SELECT room_version, room_nid, state_snapshot_nid, latest_event_nids FROM roomserver_rooms WHERE room_id = $1
  42 SELECT event_state_key, event_state_key_nid FROM roomserver_event_state_keys WHERE event_state_key = ANY($1)

EDIT: I additionally tested with the database (from the same dump) and the Dendrite container (0.10.3) running on the same host, and noticed that HTTP request times were quite OK (around 0.6 s). If Dendrite and PostgreSQL run on different hosts, network latency becomes an issue and request times are as slow as described above.

Any thoughts or suggestions on what to try next?
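
(For reference, the per-database statement logging used for the analysis above can be toggled with something like the following; the database name is a placeholder for whatever the Dendrite database is called:)

-- Log every statement for this database only (database name is a placeholder).
ALTER DATABASE dendrite SET log_statement = 'all';
-- Revert to the server-wide default once the analysis is done.
ALTER DATABASE dendrite RESET log_statement;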

neilalexander added a commit that referenced this issue Nov 1, 2022
This optimizes history visibility checks by (mostly) avoiding database
hits.
Possibly solves #2777

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
@S7evinK
Contributor

S7evinK commented Nov 1, 2022

@raidokaldma mind giving main/latest a try?

@raidokaldma
Author

raidokaldma commented Nov 2, 2022

Edit: I removed my previous response, which was incorrect.

Since last time I have removed the previous database and created a new one, so the response times are not 1:1 comparable with my earlier comments, but regardless, here is what I found:

I had matrixdotorg/dendrite-monolith:v0.10.4 running, so I did a measurement to get the baseline response time:

curl -o /dev/null -sS -w 'Total: %{time_total}s\n' 'http://localhost:8008/_matrix/client/r0/sync?filter=4&timeout=0&access_token=[omitted]'
Total: 13,093631s

Then I ran matrixdotorg/dendrite-monolith:v0.10.6, which is the latest right now, and repeated the same request:

curl -o /dev/null -sS -w 'Total: %{time_total}s\n' 'http://localhost:8008/_matrix/client/r0/sync?filter=4&timeout=0&access_token=[omitted]'
Total: 6,367924s

The improvement is noticeable indeed, but it is still slower than what I had before 0.10.0.

That being said, I'm doing other sorts of optimization in my JS code (matrix-js-sdk) to work around the slowness, for example by making use of IndexedDBStore. So the current performance is not a big problem any more.

Also, a big thank you for investing the time and shipping improvements.


@kegsay kegsay added the X-Needs-Info label and removed the needs investigation label Dec 5, 2022
@kegsay
Member

kegsay commented Dec 6, 2022

@raidokaldma thank you very much for your detailed and comprehensive reporting. I have a few follow-up questions:

  • How large is the account being synced in terms of number of rooms on the account?
  • What is the timeline limit set to in the stored filter?
  • Would you be interested in trying out experimental branches which may improve sync performance?

Thanks!

@kegsay kegsay added the C-Sync-API, T-Task and X-Performance labels and removed the X-Needs-Info label Dec 6, 2022
@raidokaldma
Author

Hey, thanks for dedicating time to the issue.

I still have my database dump. To answer your questions in order:

1. A summary of what my database contains:

select relname, n_live_tup from pg_stat_all_tables where schemaname = 'public' order by relname;
┌──────────────────────────────────────────────┬────────────┐
│                   relname                    │ n_live_tup │
├──────────────────────────────────────────────┼────────────┤
│ db_migrations                                │         18 │
│ federationsender_blacklist                   │          0 │
│ federationsender_inbound_peeks               │          0 │
│ federationsender_joined_hosts                │        216 │
│ federationsender_notary_server_keys_json     │          0 │
│ federationsender_notary_server_keys_metadata │          0 │
│ federationsender_outbound_peeks              │          0 │
│ federationsender_queue_edus                  │          0 │
│ federationsender_queue_json                  │          0 │
│ federationsender_queue_pdus                  │          0 │
│ keydb_server_keys                            │          0 │
│ keyserver_cross_signing_keys                 │          6 │
│ keyserver_cross_signing_sigs                 │          6 │
│ keyserver_device_keys                        │        589 │
│ keyserver_key_changes                        │         92 │
│ keyserver_one_time_keys                      │          0 │
│ keyserver_stale_device_lists                 │          0 │
│ mediaapi_media_repository                    │          0 │
│ mediaapi_thumbnail                           │          0 │
│ roomserver_event_json                        │       2098 │
│ roomserver_event_state_keys                  │         93 │
│ roomserver_event_types                       │         11 │
│ roomserver_events                            │       2098 │
│ roomserver_invites                           │        237 │
│ roomserver_membership                        │        329 │
│ roomserver_previous_events                   │       2001 │
│ roomserver_published                         │          0 │
│ roomserver_redactions                        │          0 │
│ roomserver_room_aliases                      │         64 │
│ roomserver_rooms                             │         97 │
│ roomserver_state_block                       │       1985 │
│ roomserver_state_snapshots                   │       1986 │
│ syncapi_account_data_type                    │        256 │
│ syncapi_backward_extremities                 │          0 │
│ syncapi_current_room_state                   │        791 │
│ syncapi_filter                               │         26 │
│ syncapi_ignores                              │          0 │
│ syncapi_invite_events                        │       1021 │
│ syncapi_memberships                          │        470 │
│ syncapi_notification_data                    │        273 │
│ syncapi_output_room_events                   │       2098 │
│ syncapi_output_room_events_topology          │       2098 │
│ syncapi_peeks                                │          0 │
│ syncapi_presence                             │         12 │
│ syncapi_receipts                             │        157 │
│ syncapi_relations                            │          0 │
│ syncapi_send_to_device                       │          0 │
│ userapi_account_datas                        │        256 │
│ userapi_accounts                             │         92 │
│ userapi_daily_stats                          │          0 │
│ userapi_daily_visits                         │       2917 │
│ userapi_devices                              │        583 │
│ userapi_key_backup_versions                  │          0 │
│ userapi_key_backups                          │          0 │
│ userapi_login_tokens                         │          0 │
│ userapi_notifications                        │         34 │
│ userapi_openid_tokens                        │          1 │
│ userapi_profiles                             │         92 │
│ userapi_pushers                              │          0 │
│ userapi_threepids                            │          0 │
└──────────────────────────────────────────────┴────────────┘
2. The timeline limit in the stored filter is 8:

# From the dendrite DB:
select filter from syncapi_filter where id = 4;
{"account_data":{},"presence":{},"room":{"account_data":{},"ephemeral":{},"state":{},"timeline":{"limit":8}}}
3. Sure, if I have time. It would be nice if experimental builds were available as a Docker image too.

@S7evinK
Contributor

S7evinK commented Feb 4, 2023

@raidokaldma it's not yet merged, but PR #2961 should significantly reduce the number of queries and be much more performant. Your filter might cause problems because you're fetching the whole state for each room; enabling lazy loading will drastically improve the response time with that PR.
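
(For illustration, the stored filter quoted in the earlier comment could enable lazy loading roughly like this; lazy_load_members is the standard Matrix client-server filter field, and this is a sketch rather than a recommendation taken from the PR:)

{
  "account_data": {},
  "presence": {},
  "room": {
    "account_data": {},
    "ephemeral": {},
    "state": { "lazy_load_members": true },
    "timeline": { "limit": 8 }
  }
}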

S7evinK added a commit that referenced this issue Feb 7, 2023
Should fix the following issues, or at least make them a lot less severe, when using Postgres:

The main issue behind #2911: the client gives up after a certain time, causing a cascade of context errors, because the response couldn't be built up fast enough. This mostly happens on accounts with many rooms, due to the inefficient way we're getting recent events and the current state.

For #2777: the queries for getting the membership events for history visibility were being executed for each room (I think 185?), resulting in a whopping 2k queries for membership events. (Getting the state snapshot -> block NIDs -> actual wanted membership event.)

Both should now be better by:
- Using a LATERAL join to get all recent events for all joined rooms in one go (TODO: maybe do the same for room summary, current state, etc.)
- If we're lazy loading on initial syncs, we no longer fetch the whole current state just to drop the majority of it because we're lazy loading members - instead we add a filter to exclude membership events on the first call to `CurrentState`.
- Using an optimized query to get the membership events needed to calculate history visibility

---------

Co-authored-by: kegsay <kegan@matrix.org>
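
(To illustrate the LATERAL-join idea mentioned in that commit message, a minimal sketch; the table and column names below are simplified placeholders rather than Dendrite's exact schema:)

-- For one user, fetch the most recent timeline events of every joined room
-- in a single round trip instead of one query per room.
SELECT r.room_id, recent.*
FROM (
    SELECT DISTINCT room_id
    FROM syncapi_memberships
    WHERE user_id = $1 AND membership = 'join'
) AS r
CROSS JOIN LATERAL (
    SELECT o.*
    FROM syncapi_output_room_events AS o
    WHERE o.room_id = r.room_id
    ORDER BY o.id DESC
    LIMIT $2
) AS recent;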
@raidokaldma
Author

I finally found time to test the latest changes. In my case an initial sync on v0.10.8 took 5s, while with v0.11.1 the same request responds in 0.7s.

The difference is noticeable.

Thank you @S7evinK!

Feel free to close the issue if no further work is planned.

@kegsay
Member

kegsay commented Feb 13, 2023

We will be adding in more comprehensive profiling via runtime/trace over the next few weeks, so we should hopefully be able to identify this sort of thing much more easily. Closing this.

@kegsay kegsay closed this as completed Feb 13, 2023