This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Very slow deletes from device_inbox during /sync requests #16479

Closed
richvdh opened this issue Oct 12, 2023 · 12 comments

@richvdh
Member

richvdh commented Oct 12, 2023

Description

Synchrotron workers are maxing out db connections, resulting in multi-minute scheduling time:

[screenshot: graph of db connection usage and scheduling time]

Inspection shows that these are operations of the shape

                DELETE FROM device_inbox WHERE ctid IN (
                  SELECT ctid FROM device_inbox
                  WHERE user_id = '@<user_id>' AND device_id = 'LTHNOYWSMJ' AND stream_id <= 7526430
                  LIMIT 100
                )

Some queries have LIMIT 1000 rather than LIMIT 100.

Steps to reproduce

  • have lots of messages in device_inbox
  • /sync

Homeserver

EMS host

Synapse Version

1.94.0

Installation Method

Docker (matrixdotorg/synapse)

Database

postgres 11

Workers

Multiple workers

Platform

k8s

Configuration

No response

Relevant log output

...

Anything else that would be useful to know?

No response

@richvdh
Member Author

richvdh commented Oct 12, 2023

an EXPLAIN on the queries in question shows that they are sequence-scanning the table to do the delete:
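
(Presumably obtained with something along these lines, with placeholder values:)

EXPLAIN DELETE FROM device_inbox WHERE ctid IN (
  SELECT ctid FROM device_inbox
  WHERE user_id = '<user_id>' AND device_id = 'SAJAWKWGHO' AND stream_id <= 7535635
  LIMIT 100
);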

Delete on device_inbox  (cost=8.60..2559182.13 rows=1 width=36)
  ->  Nested Loop  (cost=8.60..2559182.13 rows=1 width=36)
        Join Filter: (device_inbox.ctid = "ANY_subquery".ctid)
        ->  Unique  (cost=8.60..8.61 rows=1 width=36)
              ->  Sort  (cost=8.60..8.61 rows=1 width=36)
                    Sort Key: "ANY_subquery".ctid
                    ->  Subquery Scan on "ANY_subquery"  (cost=0.56..8.59 rows=1 width=36)
                          ->  Limit  (cost=0.56..8.58 rows=1 width=6)
                                ->  Index Scan using device_inbox_user_stream_id on device_inbox device_inbox_1  (cost=0.56..8.58 rows=1 width=6)
                                      Index Cond: ((user_id = '<user_id>'::text) AND (device_id = 'SAJAWKWGHO'::text) AND (stream_id <= 7535635))
        ->  Seq Scan on device_inbox  (cost=0.00..2376344.90 rows=14626290 width=6)

Apparently the query optimiser isn't smart enough to know that it would do better searching for each of the 100 ctids individually than scanning the whole table.

@richvdh
Member Author

richvdh commented Oct 12, 2023

This code changed recently in #16240 and again in #16312.

A couple of thoughts:

  • I'm surprised it was necessary to change the code in the SyncHandler at all. We know that will be deleting at most 100 rows, because they are exactly the rows that were served to the client on the previous /sync. AFAICT there is no need to LIMIT the delete, and therefore no need to mess about with ctid and no need for the fallback-to-async-task (a sketch follows this list).
  • For the cleanup when a device is deleted, we should probably pick a stream_id to delete up to for each batch, rather than messing with ctid.
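
For the first point, a minimal sketch of what the /sync-path delete could reduce to (placeholder parameters; not the actual Synapse code):

-- The upper bound is the stream_id of the last message already served to
-- the client on the previous /sync, so this matches at most ~100 rows and
-- needs neither a LIMIT nor ctid:
DELETE FROM device_inbox
WHERE user_id = ? AND device_id = ? AND stream_id <= ?;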

@MatMaul MatMaul self-assigned this Oct 13, 2023
@MatMaul
Contributor

MatMaul commented Oct 13, 2023

Apparently the query optimiser isn't smart enough to know that it would do better searching for each of the 100 ctids individually than scanning the whole table.

Sadness...
https://postgrespro.com/list/thread-id/2446912

For the cleanup when a device is deleted, we should probably pick a stream_id to delete up to for each batch, rather than messing with ctid.

Something along these lines?

DELETE FROM device_inbox WHERE user_id = ? AND device_id = ? AND stream_id <= (
  SELECT MAX(stream_id) FROM device_inbox
  WHERE user_id = '@<user_id>' AND device_id = 'LTHNOYWSMJ' AND stream_id <= 7526430
  LIMIT 100
)

I'm surprised it was necessary to change the code in the SyncHandler at all. We know that will be deleting at most 100 rows, because they are exactly the rows that were served to the client on the previous /sync. AFAICT there is no need to LIMIT the delete, and therefore no need to mess about with ctid and no need for the fallback-to-async-task.

🤦 Indeed, I just didn't think about it.

@kegsay
Member

kegsay commented Oct 13, 2023

The query planner woes sound similar to matrix-org/sliding-sync@fa67467, except in that case we wanted a seq scan.

@richvdh
Member Author

richvdh commented Oct 13, 2023

Apparently the query optimiser isn't smart enough to know that it would do better searching for each of the 100 ctids individually than scanning the whole table.

Sadness... https://postgrespro.com/list/thread-id/2446912

For the cleanup when a device is deleted, we should probably pick a stream_id to delete up to for each batch, rather than messing with ctid.

Something along these lines?

DELETE FROM device_inbox WHERE user_id = ? AND device_id = ? AND stream_id <= (
  SELECT MAX(stream_id) FROM device_inbox
  WHERE user_id = '@<user_id>' AND device_id = 'LTHNOYWSMJ' AND stream_id <= 7526430
  LIMIT 100
)

Yes. Except you need an ORDER BY stream_id in the inner query.

And I don't think you need stream_id <= 7526430 for device deletions (we need to delete all messages for the given device, irrespective of stream_id). It's harmless, but redundant.

@MatMaul
Contributor

MatMaul commented Oct 13, 2023

And I don't think you need stream_id <= 7526430 for device deletions (we need to delete all messages for the given device, irrespective of stream_id). It's harmless, but redundant.

Someone can create a device with the same ID, and without the check it would then delete the messages intended for the new device.

@richvdh
Member Author

richvdh commented Oct 13, 2023

well, that would be a stupid thing to do :)

But fair enough. Stick a comment in, though - that's quite subtle!
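
Putting the thread's corrections together, the device-deletion batch query presumably ends up shaped something like this sketch (placeholder parameters; not necessarily the exact query in #16492):

DELETE FROM device_inbox
WHERE user_id = ? AND device_id = ? AND stream_id <= (
  SELECT MAX(stream_id) FROM (
    SELECT stream_id FROM device_inbox
    -- The stream_id bound looks redundant when the device has been deleted,
    -- but a device with the same ID may be created later; without the bound
    -- we could delete messages intended for the new device.
    WHERE user_id = ? AND device_id = ? AND stream_id <= ?
    ORDER BY stream_id
    LIMIT 100
  ) AS batch
);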

@MatMaul
Contributor

MatMaul commented Oct 14, 2023

I don't get the same plan with my deploy on pg15:

 Delete on device_inbox  (cost=1801.45..1962.75 rows=0 width=0)
   ->  Nested Loop  (cost=1801.45..1962.75 rows=40 width=36)
         ->  HashAggregate  (cost=1801.45..1801.85 rows=40 width=36)
               Group Key: "ANY_subquery".ctid
               ->  Subquery Scan on "ANY_subquery"  (cost=1641.08..1801.35 rows=40 width=36)
                     ->  Limit  (cost=1641.08..1800.95 rows=40 width=6)
                           ->  Bitmap Heap Scan on device_inbox device_inbox_1  (cost=1641.08..1800.95 rows=40 width=6)
                                 Recheck Cond: ((user_id = '@mxid'::text) AND (device_id = 'DEVICEID'::text))
                                 ->  Bitmap Index Scan on device_inbox_user_stream_id  (cost=0.00..1641.07 rows=40 width=0)
                                       Index Cond: ((user_id = '@mxid'::text) AND (device_id = 'DEVICEID'::text))
         ->  Tid Scan on device_inbox  (cost=0.00..4.01 rows=1 width=6)
               TID Cond: (ctid = "ANY_subquery".ctid)
(12 rows)

With #16492, I get:

 Delete on device_inbox  (cost=2178.99..2231.08 rows=0 width=0)
   InitPlan 1 (returns $0)
     ->  Aggregate  (cost=1646.15..1646.16 rows=1 width=8)
           ->  Limit  (cost=0.67..1645.65 rows=40 width=8)
                 ->  Index Only Scan using device_inbox_user_stream_id on device_inbox device_inbox_1  (cost=0.67..1645.65 rows=40 width=8)
                       Index Cond: ((user_id = '@mat:tout.im'::text) AND (device_id = 'CCUJZYMJLP'::text))
   ->  Bitmap Heap Scan on device_inbox  (cost=532.83..584.92 rows=13 width=6)
         Recheck Cond: ((user_id = '@mat:tout.im'::text) AND (device_id = 'CCUJZYMJLP'::text) AND (stream_id <= $0))
         ->  Bitmap Index Scan on device_inbox_user_stream_id  (cost=0.00..532.83 rows=13 width=0)
               Index Cond: ((user_id = '@mat:tout.im'::text) AND (device_id = 'CCUJZYMJLP'::text) AND (stream_id <= $0))
(10 rows)

So it's probably a tad slower on more modern PG, but at least we will (?) not get a seqscan on older ones.

@richvdh could you do a comparison with your test DB to check the new query is ok on PG11?

Also note that my new query deletes 10 rows and not 12; I need to check why.

@MatMaul
Contributor

MatMaul commented Oct 14, 2023

The indicated rows is the number of lines in the plan output, not the number of deleted records 😭.

I have executed the two inner SELECTs and they do return the same number of rows for my test case (125 rows).
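
(Side note: plain EXPLAIN only shows the planner's estimates; EXPLAIN ANALYZE reports actual per-node row counts, though it really executes the statement, so a DELETE should be wrapped in a transaction and rolled back. A sketch with placeholder values:)

BEGIN;
EXPLAIN ANALYZE
DELETE FROM device_inbox
WHERE user_id = '@mxid' AND device_id = 'DEVICEID' AND stream_id <= 7526430;
ROLLBACK;  -- EXPLAIN ANALYZE executes the DELETE for real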

@clokep
Contributor

clokep commented Oct 17, 2023

I think this was fixed by #16491?

@clokep clokep closed this as completed Oct 17, 2023
@richvdh
Member Author

richvdh commented Oct 18, 2023

#16491 will only have fixed half the problem.

To-device message deletions happen in two places: 1) after they are successfully sent to the target device via /sync; 2) when the target device is deleted. Both were using delete_messages_for_device_txn, which uses a very inefficient query plan on Postgres 11.

#16491 changes the /sync case to no longer use delete_messages_for_device_txn and therefore fixes case 1, but does not change the query plan in that function so does not fix case 2.

@richvdh richvdh reopened this Oct 18, 2023
@richvdh
Member Author

richvdh commented Oct 18, 2023

oh, in fairness, the subject of this issue is just about the /sync requests, so yes, I guess it is fixed. Sorry.

For completeness, the device-deletion side looks to be covered in #16492.
