
[9.x] Prevent redis crash when large number of jobs are scheduled for a specific time #43310

Merged

Conversation

AbiriAmir
Contributor

Currently, scheduling a large number of jobs for a specific time causes Redis to halt, since the migrate script is heavy.
This pull request adds support for limiting the number of jobs migrated from the delayed queue to the main queue in each call.

Steps to reproduce the problem:

  • Schedule >10000 delayed jobs for a specific time.
  • Then run php artisan queue:work

When that time arrives, Redis will be busy moving these jobs for minutes.


Limiting the number of jobs migrated per call easily fixes the problem.
It is also fully backward compatible, since the default limit is -1, which means no limit.

… in redis queue in order to prevent redis crash when large number of jobs are scheduled for a specific time
… in redis queue in order to prevent redis crash when large number of jobs are scheduled for a specific time
@rezakeramati

Really needed this feature.
My Redis gets stuck when lots of jobs are scheduled for a specific time, which results in downtime for the whole website.

@taylorotwell
Member

Need much more explanation about how this works, what every change does, etc.

@taylorotwell
Member

Please mark as ready for review when that is provided.

@taylorotwell taylorotwell marked this pull request as draft July 20, 2022 13:45
@AbiriAmir
Contributor Author

AbiriAmir commented Jul 20, 2022

In the Laravel Redis queue, delayed jobs and ready-to-run jobs are stored in different data structures. Assume we have a queue named default. Ready-to-run jobs are stored in a Redis list named default, and delayed jobs are stored in a Redis zset named default:delayed, with the score equal to the timestamp of the expected run time and the member equal to the job payload itself.
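To make the two structures concrete, here is a minimal pure-Python model of them. This is only an illustration, not Laravel's actual code; the helper names (push, later, migrate_expired) are hypothetical, and a min-heap stands in for the Redis zset:

```python
import heapq

ready = []    # models the Redis LIST "default" (ready-to-run jobs, FIFO)
delayed = []  # models the ZSET "default:delayed" as a min-heap of (score, job)

def push(job):
    """Like RPUSH default <job>: enqueue a job that is ready now."""
    ready.append(job)

def later(run_at, job):
    """Like ZADD default:delayed <run_at> <job>: schedule a job for later."""
    heapq.heappush(delayed, (run_at, job))

def migrate_expired(now):
    """Move every job whose score is <= now onto the ready list."""
    moved = 0
    while delayed and delayed[0][0] <= now:
        _, job = heapq.heappop(delayed)
        ready.append(job)
        moved += 1
    return moved

later(800, "notify:alice")
later(800, "notify:bob")
push("mail:carol")
migrate_expired(799)  # nothing is due yet
migrate_expired(800)  # both delayed jobs become ready
```

Note that migrate_expired drains every due job in one go, which is exactly the behaviour this PR bounds.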

Then, before popping each job from the main queue, the worker checks whether any overdue delayed jobs exist and, if so, migrates them to the main queue.

Now, imagine you schedule thousands of jobs for a specific time. For example, in our use case, we calculate user balances every night and schedule notifications for 8:00 in the morning.
At 8:00, the migrate script suddenly runs over the entire backlog.

This heavy script blocks all other Redis operations and can take your entire application down. According to the Redis documentation:

Redis guarantees the script's atomic execution. While executing the script, all server activities are blocked during its entire runtime. These semantics mean that all of the script's effects either have yet to happen or had already happened.

Hence, doing a large amount of work in a single EVAL blocks all other Redis operations, including all of the queues, the cache (if it lives on the same Redis server), and so on, and can bring the whole application down.

Now, the solution is quite simple: chunking the expired (ready-to-migrate) jobs fixes the problem.
The migrate function uses Redis's zrangebyscore command, which accepts an optional limit clause, just like SQL. Passing the limit option is enough to solve the problem.

Hence:

-- Get all of the jobs with an expired "score"...
local val = redis.call('zrangebyscore', KEYS[1], '-inf', ARGV[1])

is changed to:

-- Get all of the jobs with an expired "score"...
local val = redis.call('zrangebyscore', KEYS[1], '-inf', ARGV[1], 'limit', 0, ARGV[2])

(The limit option is passed to zrangebyscore.)
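The effect of that one-argument change can be sketched with a small pure-Python stand-in. This is not the actual Lua script; it is a hypothetical model assuming the LIMIT 0 count semantics, where each call moves at most batch_size due jobs, so no single atomic step has to touch the whole backlog:

```python
def migrate_batch(delayed, ready, now, batch_size=-1):
    """Move due jobs (score <= now) from the sorted 'delayed' list of
    (score, job) pairs onto 'ready'. batch_size == -1 mirrors the PR's
    default: migrate everything at once (the old, unbounded behaviour)."""
    due = [entry for entry in delayed if entry[0] <= now]
    if batch_size >= 0:
        due = due[:batch_size]  # models the LIMIT 0 <count> clause
    for entry in due:
        delayed.remove(entry)
        ready.append(entry[1])
    return len(due)

delayed = sorted((800, f"job{i}") for i in range(10))
ready = []
# With a batch size, each call does a bounded amount of work; repeated
# calls (one per worker loop iteration) drain the backlog incrementally.
moved = migrate_batch(delayed, ready, now=800, batch_size=3)
```

Each worker iteration then pays a small, predictable cost instead of one huge blocking EVAL.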

A migrationBatchSize option, read from the Redis queue config, is passed into the function.
The default value is -1 to preserve backward compatibility when no config value is set.

The optional LIMIT argument can be used to only get a range of the matching elements (similar to SELECT LIMIT offset, count in SQL). A negative count returns all elements from the offset. Keep in mind that if offset is large, the sorted set needs to be traversed for offset elements before getting to the elements to return, which can add up to O(N) time complexity.
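The offset/count semantics quoted above can be illustrated with a tiny pure-Python model of zrangebyscore. This is an assumption-laden sketch for illustration only (the real command operates on a Redis sorted set, not a Python list):

```python
def zrangebyscore(zset, min_score, max_score, offset=0, count=-1):
    """Model of ZRANGEBYSCORE with LIMIT offset count.
    zset: a list of (score, member) pairs kept sorted by score."""
    matching = [m for s, m in zset if min_score <= s <= max_score]
    if count < 0:
        return matching[offset:]  # negative count: all elements from offset
    return matching[offset:offset + count]

zset = sorted((i, f"job{i}") for i in range(5))
first_two = zrangebyscore(zset, 0, 3, offset=0, count=2)  # a bounded batch
rest = zrangebyscore(zset, 0, 3, offset=1, count=-1)      # everything after the first
```

The PR always uses offset 0, which avoids the O(N) traversal cost the documentation warns about for large offsets.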


The option can be set just like any other Redis queue configuration value.

@AbiriAmir AbiriAmir marked this pull request as ready for review July 20, 2022 14:42
@nunomaduro nunomaduro changed the title Prevent redis crash when large number of jobs are scheduled for a specific time [9.x] Prevent redis crash when large number of jobs are scheduled for a specific time Jul 26, 2022
@taylorotwell taylorotwell merged commit 5582614 into laravel:9.x Jul 29, 2022
Ken-vdE pushed a commit to Ken-vdE/framework that referenced this pull request Aug 9, 2022
… a specific time (laravel#43310)

* Added support for limiting the number of jobs to migrate in each call in the Redis queue, in order to prevent a Redis crash when a large number of jobs are scheduled for a specific time

* Added support for limiting the number of jobs to migrate in each call in the Redis queue, in order to prevent a Redis crash when a large number of jobs are scheduled for a specific time

* formatting

Co-authored-by: amir <amir@abiri.info>
Co-authored-by: Taylor Otwell <taylor@laravel.com>
4 participants