[Messenger] Bad performances with the Doctrine/MySQL transport

### Symfony version(s) affected

5.4.13 (but I guess it happens with 6.* as well as the code is the same)

### Description

We have the following setup
* MySQL 8.0.15 with an InnoDb engine and `READ-COMMITTED` as transaction isolation level
* 4 Ubuntu-based workers running supervisor which runs 2 processes
* Command to consume: `php bin/console messenger:consume async_email --limit=100 --time-limit=300 --memory-limit=480M`

We experienced a lag of up to ~80k messages in the queue because workers spend most of their time waiting to lock a message to process.
A colleague tried to increase the number of processes from 8 to 80 but it made things worst.
Our processing tops at 3 or 4 consumed messages per seconds on this specific queue

Here are my findings so far:

The MySQL command `SHOW FULL PROCESSLIST` shows a lot of the `SELECT ... FOR UPDATE` requests are in the `Sending data` state.

The `SELECT ... FOR UPDATE` uses the index to find and lock a message:
```
EXPLAIN SELECT m.id FROM messenger_email m WHERE (m.delivered_at is null OR m.delivered_at < '2022-10-10 20:16:29') AND (m.available_at <= '2022-10-10 21:16:29') AND (m.queue_name = 'email') ORDER BY available_at ASC LIMIT 1 FOR UPDATE
```

id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra
-- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | --
1 | SIMPLE | m | NULL | range | IDX_DC4C2A4016BA31DB,IDX_DC4C2A40E3BD61CE,IDX_DC4C2A40FB7336F0 | IDX_DC4C2A40E3BD61CE | 5 | NULL | 37451 | 25.00 | Using index condition; Using where

_IDX_DC4C2A40E3BD61CE is on the `available_at` column_

Because it uses an index and we are using `READ-COMMITTED` as the transaction isolation level, then the following happens:
> If the WHERE condition includes an indexed column, and InnoDB uses the index, only the indexed column is considered when taking and retaining record locks. In the following example, the first [UPDATE](https://dev.mysql.com/doc/refman/8.0/en/update.html) takes and retains an x-lock on each row where b = 2. The second [UPDATE](https://dev.mysql.com/doc/refman/8.0/en/update.html) blocks when it tries to acquire x-locks on the same records, as it also uses the index defined on column b.
```
CREATE TABLE t (a INT NOT NULL, b INT, c INT, INDEX (b)) ENGINE = InnoDB;
INSERT INTO t VALUES (1,2,3),(2,2,4);
COMMIT;

# Session A
START TRANSACTION;
UPDATE t SET b = 3 WHERE b = 2 AND c = 3;

# Session B
UPDATE t SET b = 4 WHERE b = 2 AND c = 4;
```
Source: https://dev.mysql.com/doc/refman/8.0/en/innodb-transaction-isolation-levels.html

So the bottleneck in my case happens when workers search for a message to lock because the query is blocking.
Once locked, the message is very very quick to handle.




### How to reproduce

I guess it's quite difficult to reproduce as I think there is a balance between:
* The number of processes consuming the queue
* The time spend to consume one message
* The load on the MySQL server

### Possible Solution

# Solution 1 - Remove the indexes
As suggested by https://github.com/symfony/symfony/issues/42868 removing the indexes actually removes the blockings issue as per MySQL documentation because without indexes:
> If READ COMMITTED is used instead, the first [UPDATE](https://dev.mysql.com/doc/refman/8.0/en/update.html) acquires an x-lock on each row that it reads and releases those for rows that it does not modify:

```
x-lock(1,2); unlock(1,2)
x-lock(2,3); update(2,3) to (2,5); retain x-lock
x-lock(3,2); unlock(3,2)
x-lock(4,3); update(4,3) to (4,5); retain x-lock
x-lock(5,2); unlock(5,2)
```
> For the second UPDATE, InnoDB does a “semi-consistent” read, returning the latest committed version of each row that it reads to MySQL so that MySQL can determine whether the row matches the WHERE condition of the [UPDATE](https://dev.mysql.com/doc/refman/8.0/en/update.html):

```
x-lock(1,2); update(1,2) to (1,4); retain x-lock
x-lock(2,3); unlock(2,3)
x-lock(3,2); update(3,2) to (3,4); retain x-lock
x-lock(4,3); unlock(4,3)
x-lock(5,2); update(5,2) to (5,4); retain x-lock
```

However, when there are a lot of messages, it leads to slow queries because MySQL uses a "filesort" because of the `ORDER BY available_at` clause.
I think this clause could be removed if losing the FIFO-like strategy is acceptable

# Solution 2 - Use SKIP LOCKED
MySQL has a `SKIP LOCKED` clause that could be used here: https://dev.mysql.com/doc/refman/8.0/en/innodb-locking-reads.html
> A locking read that uses SKIP LOCKED never waits to acquire a row lock. The query executes immediately, removing locked rows from the result set.

I measured these performances for the SELECT query:
* Without `FOR UPDATE` & `SKIP LOCKED`: about 0.002 seconds to return a row
* With `FOR UPDATE` & without `SKIP LOCKED`: about 2 seconds to return a row
* With `FOR UPDATE`& with `SKIP LOCKED`: about 0.2 seconds to return a row

=> x10 faster with the `SKIP LOCKED` option

As far as I know, `doctrine/dbal` doesn't support the `SKIP LOCKED` option.
An issue has been around for years about it with `doctrine/orm`: https://github.com/doctrine/orm/issues/7746

I suggest here that the Doctrine transport "manually" adds the `SKIP LOCKED` option when the database platform is MySQL 8.0 or later (maybe others that support it)

This is my preferred solution

### Additional Context

_No response_

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

[Messenger] Bad performances with the Doctrine/MySQL transport #47828

Symfony version(s) affected

Description

How to reproduce

Possible Solution

Solution 1 - Remove the indexes

Solution 2 - Use SKIP LOCKED

Additional Context

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

[Messenger] Bad performances with the Doctrine/MySQL transport #47828

Description

Symfony version(s) affected

Description

How to reproduce

Possible Solution

Solution 1 - Remove the indexes

Solution 2 - Use SKIP LOCKED

Additional Context

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions