This repository has been archived by the owner on Jul 16, 2021. It is now read-only.

Queue Multi-Task Handling #2460

Open
aarreedd opened this issue Dec 31, 2020 · 3 comments

Comments

@aarreedd

Laravel "Job Batching" allows you to dispatch multiple tasks at once. But I want to do the opposite: Have one handler receive multiple tasks at once.

For example, the ProcessPodcast class from the docs could be called ProcessPodcasts, and its constructor would receive an array of Podcast models instead of just one.
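A rough sketch of what that job could look like (the ProcessPodcasts name and the array constructor are part of this proposal, not an existing Laravel API):

```php
<?php

namespace App\Jobs;

use App\Models\Podcast;
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;

class ProcessPodcasts implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    /** @var Podcast[] */
    protected array $podcasts;

    /**
     * Instead of a single Podcast (as in the docs' ProcessPodcast example),
     * the handler receives a whole group of them at once.
     *
     * @param Podcast[] $podcasts
     */
    public function __construct(array $podcasts)
    {
        $this->podcasts = $podcasts;
    }

    public function handle(): void
    {
        // Expensive setup happens once for the whole group...

        foreach ($this->podcasts as $podcast) {
            // ...then each podcast is processed using the shared setup.
        }
    }
}
```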

Configuration

There are two main configurations: "group size" and "wait time".

Group Size would be the maximum number of Podcast models the ProcessPodcasts job can receive at once.

Wait Time is how long the queue will wait to accumulate a full group. If, after that period (e.g. one minute), there are still fewer jobs than a full group, the queue just sends all waiting jobs to the handler.
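As a rough illustration only (these keys do not exist in Laravel's queue configuration today), the two settings could sit alongside the existing connection options in config/queue.php:

```php
// config/queue.php — hypothetical keys, sketch of the proposal only
'connections' => [
    'redis' => [
        'driver' => 'redis',
        'connection' => 'default',
        'queue' => env('REDIS_QUEUE', 'default'),

        // Proposed: hand the handler up to 16 jobs at once...
        'group_size' => 16,
        // ...but never wait longer than 60 seconds for a full group.
        'group_wait' => 60,
    ],
],
```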

Why would you want this?

When processing a job has significant setup overhead it may be more efficient to process multiple jobs at once.

Adding more queue workers does not solve this problem. Multiple workers could allow you to process the jobs more quickly, but not more efficiently.

Example

Imagine that the ProcessPodcast job launches an EC2 instance to process each podcast. That setup takes a lot of time. It would be more efficient for the ProcessPodcasts class to receive a group of Podcasts so it can reuse the same instance to process all of them.

Implementation in AWS Lambda

This feature already exists in AWS when using SQS and Lambda. You can configure each SQS message to trigger a separate Lambda invocation, or you can configure a "batch size" so that when your Lambda function is invoked it receives an array of SQS messages.
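For reference, a batched SQS invocation hands the function a single event whose Records array holds up to "batch size" messages. A minimal, framework-agnostic PHP sketch of handling such an event (with the event already decoded from JSON, and processPodcast() as a hypothetical per-message handler) might look like:

```php
<?php

/**
 * Sketch of handling a batched SQS -> Lambda event.
 * The "Records" array of messages, each with a "body", is the
 * standard SQS event shape; processPodcast() is hypothetical.
 */
function handleSqsEvent(array $event): void
{
    // With batch size > 1, Records can contain many messages per invocation.
    foreach ($event['Records'] as $record) {
        $payload = json_decode($record['body'], true);

        processPodcast($payload); // hypothetical per-message handler
    }
}
```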

@mfn

mfn commented Jan 4, 2021

Another example supporting this idea:
I'm using ElasticSearch and need to sync many model changes.

If you have to sync a LOT of models constantly (i.e. hundreds of changes per second), it is super efficient to have a fully batched process:

  • fetch multiple models (and their related necessary data) at once
    (i.e. this is what this feature would enable)
  • make use of ElasticSearch's own bulk capability

Especially the reduction in SQL queries for simply fetching data can be an incredible improvement: being able to e.g. grab 8 or 16 same-class jobs, extract their model IDs, and fetch the data in one go instead of issuing 2 or more SQL statements per model (once related data is taken into consideration).
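A sketch of what that could look like once a handler receives a group of same-class jobs (the $jobs input, the App\Models\Article model, and syncToElasticsearch() are hypothetical placeholders; the bulk payload follows Elasticsearch's documented _bulk format):

```php
<?php

use App\Models\Article;

/**
 * Sketch: given a group of jobs that each carry a model ID,
 * fetch everything in one query and index it in one bulk request.
 */
function syncBatch(array $jobs): void
{
    // One SQL query (plus eager loads) for the whole group
    // instead of 2+ queries per model.
    $ids = array_map(fn ($job) => $job->articleId, $jobs);

    $articles = Article::with('author', 'tags')
        ->whereIn('id', $ids)
        ->get();

    // Build a single Elasticsearch _bulk payload:
    // one action/metadata line followed by one document per model.
    $body = [];
    foreach ($articles as $article) {
        $body[] = ['index' => ['_index' => 'articles', '_id' => $article->id]];
        $body[] = [
            'title'  => $article->title,          // example fields only
            'author' => $article->author->name,
        ];
    }

    syncToElasticsearch($body); // e.g. $client->bulk(['body' => $body])
}
```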


Years ago, due to performance issues back then, I manually built such a solution using Redis, modelled on how Laravel worker daemons work, using Redis BLPOP / LPOP. But the Redis keys were very simple and purpose-built only for those types of model IDs, so nothing that would make sense as a broader concept / for code sharing at this point. But I would be happy to get rid of it :)

One other requirement for my solution was to allow easy updating of ES types from non-Laravel applications, so naturally the "exchange format" used within Redis was also much simpler than the serialized data the framework uses to manage jobs in general.

@aarreedd
Author

aarreedd commented Jan 4, 2021

@mfn Thanks for adding your use-case.

When you have a lot of jobs and each one requires a database lookup, it is a lot faster to group them.


For your specific use-case with ElasticSearch, there may be an easier way. If you are using Laravel Scout, you can pause indexing and then do a batch import periodically with a scheduled command; see the links below and the sketch that follows them.

https://laravel.com/docs/8.x/scout#pausing-indexing
https://laravel.com/docs/8.x/scout#batch-import
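A minimal sketch of the scheduled-import half of that approach (the Podcast model name and the hourly cadence are just examples; scout:import and the pausing mechanism, Model::withoutSyncingToSearch(), are the Scout features described in the links above):

```php
<?php

// app/Console/Kernel.php — sketch only; assumes Laravel Scout
// and a searchable App\Models\Podcast model.

namespace App\Console;

use Illuminate\Console\Scheduling\Schedule;
use Illuminate\Foundation\Console\Kernel as ConsoleKernel;

class Kernel extends ConsoleKernel
{
    protected function schedule(Schedule $schedule)
    {
        // Bulk-index everything on a fixed cadence instead of
        // syncing each individual model change as it happens.
        $schedule->command('scout:import', [\App\Models\Podcast::class])->hourly();
    }
}
```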

@mfn

mfn commented Jan 4, 2021

@aarreedd thanks, I'm aware, but Scout only covers fairly basic use cases; we need a custom ES implementation.
