@erickskrauch
Contributor

| Q | A |
| --- | --- |
| Is bugfix? | ✔️ |
| New feature? | |
| Breaks BC? | |
| Fixed issues | I didn't create 🙈 |

In production, our worker was down for a while, but jobs kept being pushed to the queue. By the time the problem was fixed, there were already over 300k jobs in the queue. We restarted the worker, but processing was very slow. With the help of a profiler, we discovered that the query that releases unfinished (expired) reservations was taking the longest. Because it runs inside the mutex lock, it also prevented the jobs from being processed in parallel. After investigating how MariaDB executes the query, I came up with a solution: add a simple pre-filter that makes the database apply the indexed condition first and only then apply the more complex filters to the remaining rows.

We tested this solution in production and now everything works for large data volumes.
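
For readers who want to check that the pre-filter does not change which rows are matched, here is a minimal sketch. It assumes the old condition was the same query without `reserved_at is not null` (inferred from the discussion, not copied from the diff), uses an in-memory SQLite database in place of MariaDB, and a trimmed-down table with a hypothetical index name. It only compares result sets; it says nothing about MariaDB's query plan.

```php
<?php
// Sketch only: SQLite stands in for MariaDB, and the table keeps just the
// columns used by the UPDATE condition. The index name is hypothetical.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE queue (
    id INTEGER PRIMARY KEY,
    ttr INTEGER NOT NULL,
    reserved_at INTEGER NULL,
    done_at INTEGER NULL
)');
$pdo->exec('CREATE INDEX idx_reserved_at ON queue (reserved_at)');

$now = 1649649756; // timestamp taken from the query quoted in this thread
$rows = [
    [600, null, null],            // waiting job, never reserved
    [600, $now - 700, null],      // reserved, TTR exceeded: should be released
    [600, $now - 100, null],      // reserved, still within TTR
    [600, $now - 700, $now - 50], // finished job kept in the table
];
$insert = $pdo->prepare('INSERT INTO queue (ttr, reserved_at, done_at) VALUES (?, ?, ?)');
foreach ($rows as $row) {
    $insert->execute($row);
}

// Returns the ids of the rows matched by the given WHERE condition.
function expiredIds(PDO $pdo, string $where, int $now): array
{
    $stmt = $pdo->prepare("SELECT id FROM queue WHERE $where ORDER BY id");
    $stmt->bindValue(':time', $now, PDO::PARAM_INT);
    $stmt->execute();
    return array_map('intval', $stmt->fetchAll(PDO::FETCH_COLUMN));
}

// Assumed old condition (no pre-filter) versus the condition with the pre-filter.
$without = expiredIds($pdo, 'reserved_at < :time - ttr AND done_at IS NULL', $now);
$with = expiredIds(
    $pdo,
    'reserved_at IS NOT NULL AND reserved_at < :time - ttr AND done_at IS NULL',
    $now
);

var_dump($without === $with); // both conditions match the same rows
```

A row with `reserved_at` set to NULL can never satisfy `reserved_at < :time - ttr`, so the extra condition is logically redundant and only serves as a hint that lets the planner start from the index.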

Member

@samdark samdark left a comment

Please add a line for CHANGELOG. Thanks.

@samdark samdark added this to the 2.3.4 milestone Mar 29, 2022
@samdark samdark merged commit 0ec084a into yiisoft:master Mar 30, 2022
@samdark
Member

samdark commented Mar 30, 2022

👍

@gustavovendramini

I've been getting a few deadlocks since the release that includes this PR.
Is there anything I can do?

Next yii\db\Exception: SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to get lock; try restarting transaction
The SQL being executed was: UPDATE `queue` SET `reserved_at`=NULL WHERE `reserved_at` is not null and `reserved_at` < 1649649756 - `ttr` and `done_at` is null in /home/debian/http/vendor/yiisoft/yii2/db/Schema.php:676
Stack trace:
#0 /home/debian/http/vendor/yiisoft/yii2/db/Command.php(1307): yii\db\Schema->convertException(Object(PDOException), 'UPDATE `queue` ...')
#1 /home/debian/http/vendor/yiisoft/yii2/db/Command.php(1102): yii\db\Command->internalExecute('UPDATE `queue` ...')
#2 /home/debian/http/vendor/yiisoft/yii2-queue/src/drivers/db/Queue.php(250): yii\db\Command->execute()
#3 /home/debian/http/vendor/yiisoft/yii2-queue/src/drivers/db/Queue.php(182): yii\queue\db\Queue->moveExpired()
#4 [internal function]: yii\queue\db\Queue->yii\queue\db\{closure}(Object(yii\db\Connection))
#5 /home/debian/http/vendor/yiisoft/yii2/db/Connection.php(1129): call_user_func(Object(Closure), Object(yii\db\Connection))
#6 /home/debian/http/vendor/yiisoft/yii2-queue/src/drivers/db/Queue.php(212): yii\db\Connection->useMaster(Object(Closure))
#7 /home/debian/http/vendor/yiisoft/yii2-queue/src/drivers/db/Queue.php(78): yii\queue\db\Queue->reserve()
#8 [internal function]: yii\queue\db\Queue->yii\queue\db\{closure}(Object(Closure))
#9 /home/debian/http/vendor/yiisoft/yii2-queue/src/cli/Queue.php(117): call_user_func(Object(Closure), Object(Closure))
#10 /home/debian/http/vendor/yiisoft/yii2-queue/src/drivers/db/Queue.php(93): yii\queue\cli\Queue->runWorker(Object(Closure))
#11 /home/debian/http/vendor/yiisoft/yii2-queue/src/drivers/db/Command.php(56): yii\queue\db\Queue->run(false)
#12 [internal function]: yii\queue\db\Command->actionRun()
#13 /home/debian/http/vendor/yiisoft/yii2/base/InlineAction.php(57): call_user_func_array(Array, Array)
#14 /home/debian/http/vendor/yiisoft/yii2/base/Controller.php(178): yii\base\InlineAction->runWithParams(Array)
#15 /home/debian/http/vendor/yiisoft/yii2/console/Controller.php(182): yii\base\Controller->runAction('run', Array)
#16 /home/debian/http/vendor/yiisoft/yii2/base/Module.php(552): yii\console\Controller->runAction('run', Array)
#17 /home/debian/http/vendor/yiisoft/yii2/console/Application.php(180): yii\base\Module->runAction('queue/run', Array)
#18 /home/debian/http/vendor/yiisoft/yii2/console/Application.php(147): yii\console\Application->runAction('queue/run', Array)
#19 /home/debian/http/vendor/yiisoft/yii2/base/Application.php(384): yii\console\Application->handleRequest(Object(yii\console\Request))
#20 /home/debian/http/yii(27): yii\base\Application->run()
#21 {main}
Additional Information:
Array
(
    [0] => 40001
    [1] => 1213
    [2] => Deadlock found when trying to get lock; try restarting transaction
)
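
The error message itself suggests restarting the transaction. As a general illustration of that advice, here is a minimal sketch of a retry wrapper. `retryOnDeadlock()` is a hypothetical helper, not part of Yii or yii2-queue, and where such a wrapper could be hooked into a yii2-queue worker is not established in this thread. It inspects the `errorInfo` array, which both `PDOException` and `yii\db\Exception` expose, and retries only on SQLSTATE 40001.

```php
<?php
/**
 * Runs $operation and retries it when the database reports a
 * serialization failure / deadlock (SQLSTATE 40001).
 * Hypothetical helper for illustration only.
 */
function retryOnDeadlock(callable $operation, int $maxAttempts = 3, int $baseDelayMs = 50)
{
    for ($attempt = 1; ; $attempt++) {
        try {
            return $operation();
        } catch (\Exception $e) {
            // errorInfo[0] holds the SQLSTATE code when the driver provides it.
            $sqlState = $e->errorInfo[0] ?? null;
            if ((string) $sqlState !== '40001' || $attempt >= $maxAttempts) {
                throw $e; // not a deadlock, or out of attempts
            }
            // Linear backoff before the next attempt.
            usleep($baseDelayMs * 1000 * $attempt);
        }
    }
}

// Usage: a simulated operation that deadlocks twice and then succeeds.
$calls = 0;
$result = retryOnDeadlock(function () use (&$calls) {
    $calls++;
    if ($calls < 3) {
        $e = new PDOException('Deadlock found when trying to get lock; try restarting transaction');
        $e->errorInfo = ['40001', 1213, 'Deadlock found when trying to get lock'];
        throw $e;
    }
    return 'ok';
}, 3, 1);

var_dump($result, $calls);
```

Retrying hides the symptom rather than explaining why the deadlock occurs, so it does not settle whether the change in this PR is the cause.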

@erickskrauch
Contributor Author

@gustavovendramini, I'm not sure that this deadlock is related to the changes introduced in this PR. It only forces the DB to use an index; it doesn't introduce any additional queries or touch any columns that weren't used before. Can you confirm that the problem doesn't occur with the downgraded version (2.3.3)?

@gustavovendramini

@erickskrauch I'll check. I'll downgrade in production, wait a few days to see how it behaves, and then report back here.

@gustavovendramini

gustavovendramini commented Jul 2, 2022

@erickskrauch I'm late, but I downgraded to 2.3.3 and the deadlocks have stopped. So it is either related to this improvement or related to my infrastructure.

I run the queue from crontab like this: * * * * * php /var/www/html/yii queue/run, as shown in https://www.yiiframework.com/extension/yiisoft/yii2-queue/doc/guide/2.0/en/worker#cron

And my job has these methods to retry failed jobs:

    public function getTtr()
    {
        return 10 * 60; // time to reserve: 10 minutes, in seconds
    }

    public function canRetry($attempt, $error)
    {
        return true;
    }
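
To make the interaction between this TTR and the query from the stack trace concrete, here is a minimal sketch. It mirrors the WHERE clause of the UPDATE quoted earlier in the thread; `isReservationExpired()` is a hypothetical function written for illustration and is not part of yii2-queue.

```php
<?php
// Mirrors the condition of the UPDATE quoted earlier in the thread:
// reserved_at is not null and reserved_at < :time - ttr and done_at is null
// Hypothetical helper, not library code.
function isReservationExpired(?int $reservedAt, int $ttr, ?int $doneAt, int $now): bool
{
    return $reservedAt !== null
        && $reservedAt < $now - $ttr
        && $doneAt === null;
}

$ttr = 10 * 60;    // the value returned by getTtr() above
$now = 1649649756; // timestamp taken from the quoted query

var_dump(isReservationExpired($now - 599, $ttr, null, $now)); // bool(false): still within TTR
var_dump(isReservationExpired($now - 601, $ttr, null, $now)); // bool(true): reservation is released
var_dump(isReservationExpired(null, $ttr, null, $now));       // bool(false): never reserved
```

With a 10-minute TTR, a reservation is cleared by that UPDATE only once it is more than 600 seconds old and the job has not been marked as done.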

@nadar
Contributor

nadar commented Jul 15, 2022

@gustavovendramini @samdark
We have also downgraded to 2.3.3 and haven't seen the "Error: Has not waited the lock" exception since. It used to happen multiple times a day, forcing the workers to restart. So I assume this PR introduced the problem.

@rob006

rob006 commented Jul 15, 2022

@nadar Do you keep finished jobs in the queue?

@nadar
Contributor

nadar commented Jul 15, 2022

I have to admit, @rob006, I don't know. I don't think so. We just use the queue as in the example configuration. The driver is the DB queue, and the table no longer contains jobs after they have been processed. So I believe the answer is no.

Or could you give me an example of what it would look like when we have finished jobs in the queue?

@rob006

rob006 commented Jul 15, 2022

I'm not sure why adding the reserved_at is not null condition would slow anything down. Normally only a few records should pass this condition, so forcing the use of the index on reserved_at should speed this query up.

@samdark samdark mentioned this pull request Aug 1, 2022
@gb5256

gb5256 commented May 8, 2023

Hi, we have downgraded as well, but the error keeps appearing every 15 to 30 minutes.
This also happens on our dev site, where there are ZERO jobs in the queue.

6 participants