Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Reserver concept to allow different behaviours for reservation of jobs. #338

Open
wants to merge 10 commits into
base: master
Choose a base branch
from

Conversation

rayward
Copy link

@rayward rayward commented Apr 13, 2017

Currently php-resque is capable of reserving jobs in two ways:

  1. The default: Iterate through each queue in the order they are specified, processing all jobs in each queue before moving to the next
  2. If BLOCKING=1 is configured in the environment: The redis blpop command is used to do a blocking wait for a job to be enqueued on the configured queues. The order of processing is the same as above.

The behaviour of processing a queue entirely before moving to the next is restrictive. Additionally, if resque is configured to dynamically retrieve the available queues (by specifying QUEUE=*), then there is even less capability to prioritise as the retrieved queues are sorted alphabetically.

We have resque workers configured to process multiple queues but we consider them all equal priority. There is no way to configure this right now and it means that a single queue can dominate all the workers entirely, ignoring all others until it is processed. We occasionally experience surges where a queue may be loaded with 10's of thousands (or even more) of jobs which be result in a substantial delay on being able to process the other queues.

This PR introduces the concept of a Reserver (via ReserverInterface). A reserver exposes a reserve() method which is called by the Resque_Worker. Various behaviours can then be implemented in the reserve() method, such as to allow more fair queue processing for example.

Three Reservers are implemented:

  • QueueOrderReserver - This reimplements the default behaviour described above, processing a higher priority queue entirely before moving to the next.
  • BlockListPopReserver - This reimplements the behaviour that is toggled by specifying BLOCKING=1.
  • RandomQueueOrderReserver - Shuffles the list of queues on each call to reserve(). This is a simple way to allow an equal amount of processing on all the queues.

New and interesting ways of reservation could be implemented in the future such as:

  • Oldest job first - inspect all the queues and find the job with the oldest enqueue_time

ReserverInterface is now a required dependency of Resque_Worker and is supplied in its constructor. The $blocking parameter to the work() and reserve() methods are removed.

A reserver is chosen in bin/resque based off the environment configuration:

  • BLOCKING - A non empty value will select the BlockListPopReserver
  • RESERVER - The name of any implemented reserver can be specified in snake case format, eg. queue_order, random_queue_order, blocking_list_pop

If neither is specified, the default QueueOrderReserver is used.

While I've tested this a fair bit myself, it would be great if others could give this some verification also.

@danhunsaker
Copy link
Contributor

danhunsaker commented Apr 13, 2017

While an interesting concept, and I'm by no means expressing disapproval of the feature, the existing behavior is standard in both Ruby Resque (of which this is a port, albeit currently mirroring an early version), and in most other queueing engines available for most languages. Though there likely exceptions, of course.

The way to process jobs in multiple queues at more-or-less equal priorities (why multiple queues at all, then, as the only purpose they serve internally is prioritizing work?) is to run multiple workers (which you seem to already be doing) with different orders for their queue lists (which isn't possible with just *, of course). Or, in many cases, with just a single queue in each list, so each worker is dedicated to a single queue (though you can still have multiple workers per queue, of course).

And again, this is true of most queueing libraries for most languages. Queues exist, in these cases, to prioritize work.

That said, I still find this approach interesting, and it could be used to add further value to this library. I'd be interested in seeing a Reserver that allows queues to be assigned priorities in addition to names, so that queues can still be prioritized higher than others, but so that multiple queues of equal priority have a roughly equal chance of having jobs processed. That way, you can keep the benefits of splitting jobs across queues for reasons other than prioritizing them (though I'm still not clear on what those are), as well as the benefits of higher/lower priority queues. Add that, and I'd be jumping to recommend a merge.

@rayward
Copy link
Author

rayward commented Apr 14, 2017

@danhunsaker thanks for the feedback.

Our infrastructure configuration necessitates a complex worker/queue setup and unfortunately we can't prioritise by just adding more workers for each individual queue (due to the load it would add to other infrastructure components).

I'd be interested in seeing a Reserver that allows queues to be assigned priorities in addition to names

Yeh I also considered that sort of implementation and I agree that it probably would supersede the RandomQueueOrderReserver. It would however require a more complex environmental configuration of the queues to be able to specify those priorities and I feel that sort of change is a bit outside the scope of this PR. The foundations are here though for others to add those new behaviours.

@chrisboulton
Copy link
Owner

From a first pass at this - the changes look good, and I can't pick any backwards incompatibility issues we should be aware of.

These changes also address a common message I'm hearing - usually via email from someone using php-resque who doesn't fully understand how dynamic queues work. They'll typically ask my why they have jobs in queues that aren't being processed yet, and it's because they have a queue in their list sooner that has a lot of backed up work. It's pretty simple to say "break that out into its own worker", but there are use cases like we've uncovered where you really do what a single worker (or multiple) to be working a whole list of queues, "fairly" - prioritization of what's in those queues is less important, but sharding and distribution is.

I'm okay with a change like this because it's not breaking any compatibility with Resque itself. The web UI, etc all still work as expected, as does the ability to enqueue a job or process a job in another language runtime. The default reservation behaviour still matches that which Resque guarantees (ordered emptying and FIFO job processing).

I see this as a good start for other possible reservation/prioritization strategies as both of you guys have already mentioned.

I need to put a more thorough set of eyes over this before we bring it down, but I think I want that to be soon.

@rayward: can we get this branch of this deployed into our environment to give it a good beating before I merge this down?

@rayward
Copy link
Author

rayward commented Apr 18, 2017

I've also included a fix for autoloading.

Composer prepends autoloaders such that the last autoloader always takes precedence. If APP_INCLUDE is used and that app itself requires php-resque, then the app's autoloader will result in the classes being loaded from the app rather than from here.

If the version of resque included by the app is older than this version, then many unintended problems can occur, such as:

  • fatal errors due to missing methods
  • behavioural differences or bugs being present that were subsequently fixed

The change in the last commit causes php-resque's autoaloader to be re-registered so that when Resque_Worker and other classes are instantiated, they will use the correct definitions.

@rayward
Copy link
Author

rayward commented Apr 18, 2017

Also looks like there's a pre-existing issue with hhvm that's causing the travis build to fail - facebook/hhvm#6286

Will see if I can implement a work-around.

This interface allows implementations to define different behaviours
for reserving jobs from the queues.
This reserver checks queues in the order they were specified.
As long as jobs exist on higher priority queues, they will be reserved
before moving to the next lowest priority queue.

This implements the current default job reservation behaviour.
This reserver simply shuffles the list of available queues
(whether configured or retrieved dynamically from redis) on each
call to reserve() meaning a random queue is used to reserve a
job from.
This behaves similarly to QueueOrderReserver, but uses the redis
blpop command to do a blocking wait for a job to come onto the
specified queues.
This creates the various reserver implementations.
Resque_Worker now takes an instance of ReserverInterface as a dependency.

The body of the Resque_Worker::reserve() method has been replaced
with a call to ReserverInterface::reserve().

Blocking and non-blocking reservation behaviour is now implemented
in BlockingListPopReserver and QueueOrderReserver respectively.

Logic that decides whether the worker should sleep
after an attempt to reserve a job has been replaced with a call
to ReserverInterface::waitAfterReservationAttempt().
…s used and not another version from APP_INCLUDE.

Requiring composer from external apps causes their autoloader to register
which takes precedence over any prior registered autoloaders.

This may cause issues when using an APP_INCLUDE that depends on an older
version of Resque which ends up replacing this version.
@danhunsaker
Copy link
Contributor

@rayward The changes here are too extensive for me to comfortably merge into the new upstream directly. Is there any chance you could open a new PR against resque/php-resque with these changes, so I can get more eyes on it in the new context?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants