
[Proposal] Allow queue jobs to be released back to the queue without increasing the attempt count #1381

Open
matthew-muscat opened this issue Nov 6, 2018 · 9 comments

@matthew-muscat matthew-muscat commented Nov 6, 2018

This idea is related to #735

For the sake of clarity — I'm referring to this as "requeuing"

Background

Currently, when using release() within a job, the job is released back onto the queue with its attempt count incremented.

The idea would be to introduce a method similar to release() that releases the job back into the queue without incrementing its attempt count.

Why

Our use case is that jobs handle API interactions for an integration service, with queues providing a mechanism for exponential backoff in case of failures (via release() calls with an increasing delay value) as well as a guarantee that the job is attempted at least X times before failing permanently. This is currently handled via release() calls when the job fails, as well as a max attempt count within the job specification (ie: via the tries property).

To improve the queue processing logic in relation to API calls, we want to be able to:

  • Ensure a job is only attempted up to X times before failing permanently (ie: via a job's tries property - this is already possible)
  • Ensure a job's API logic is only called when it is within the available API rate limits (ie: via throttling)
  • Avoid a job run being counted as an attempt when it fails due to throttling, since the API attempt was never actually made

Example

Redis::throttle('key')->allow(10)->every(60)->then(function () {
    try {
        // API logic
    } catch (Exception $e) {
        // Release the job back into the queue (counts as an attempt)
        return $this->release(10);
    }
}, function () {
    // Could not obtain lock... requeue without counting an attempt
    return $this->requeue(10);
});

Approach / Naming

Some ideas on the approach / naming of this method..

New Method added to trait InteractsWithQueue
Create a new method with functionality similar to release(), but which releases the job back into the queue without increasing the attempt counter.

The method's signature will be similar to that of release(), with a single delay parameter being available.

  • requeue()
  • retry()
  • reattempt()

Adjust release method to accept additional parameter
Adjust the release() method to accept a parameter that determines whether an attempt count increase is applied for the job release.

ie:

/**
 * Release the job back into the queue.
 *
 * @param  int    $delay
 * @param  bool   $reattempt
 * @return void
 */
public function release($delay = 0, $reattempt = false);

Additional notes

Attempt count increments
Attempt counts on jobs are incremented when they are "reserved" in the queue — this is good practice, but it does mean that we would likely need to decrease the attempt count when requeuing a job.

This is because we only know whether a job's run should count as an attempt from within the job handler logic.
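To make the reserve/requeue accounting concrete, here is a minimal self-contained sketch (FakeJob and its methods are hypothetical stand-ins, not framework code) showing how decrementing on requeue leaves the attempt count net-unchanged, since reserving already incremented it:

```php
<?php

// Hypothetical illustration: a minimal in-memory job showing the proposed
// attempt accounting. Reserving increments; requeue() undoes the increment.
class FakeJob
{
    public $attempts = 0;

    // Queue drivers increment the attempt count when marking a job reserved,
    // before handing it to a worker.
    public function reserve(): void
    {
        $this->attempts++;
    }

    // release(): the reserve-time increment stands, so the run counts.
    public function release(): void
    {
        // nothing to undo
    }

    // requeue(): undo the reserve-time increment so the run does not count.
    public function requeue(): void
    {
        $this->attempts--;
    }
}

$job = new FakeJob();

$job->reserve();   // attempts = 1
$job->release();   // attempts stays at 1: a real attempt was made

$job->reserve();   // attempts = 2
$job->requeue();   // attempts back to 1: throttled run not counted

echo $job->attempts, "\n"; // 1
```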

Attempt limit + time expiry
If introduced, it is recommended that its usage be constrained by both an "attempt limit" and a "time expiry" — ensuring the job does not remain in the queue indefinitely if it is unable to obtain a lock for a period of time.

@matthew-muscat matthew-muscat changed the title Allow queue jobs to be released back to the queue without increasing the attempt count [Proposal] Allow queue jobs to be released back to the queue without increasing the attempt count Nov 6, 2018
@matthew-muscat matthew-muscat commented Nov 6, 2018

Approach

Abstract Classes, Contract and Trait Changes

Job Contract

https://github.com/laravel/framework/blob/master/src/Illuminate/Contracts/Queue/Job.php#L34

  • Introduce new requeue() or adjust existing release() method

InteractsWithQueue Trait

https://github.com/laravel/framework/blob/master/src/Illuminate/Queue/InteractsWithQueue.php#L57

  • Introduce new requeue() or adjust existing release() method

Job Abstract Class

https://github.com/laravel/framework/blob/master/src/Illuminate/Queue/Jobs/Job.php#L112

  • Introduce new requeue() or adjust existing release() method
  • Introduce new property / method on the job class to indicate the job was requeued?

Queue Implementations

Database

Job Class
https://github.com/laravel/framework/blob/master/src/Illuminate/Queue/Jobs/DatabaseJob.php#L50

  • Introduce new requeue() or adjust existing release() method

Job Record https://github.com/laravel/framework/blob/master/src/Illuminate/Queue/Jobs/DatabaseJobRecord.php#L34

  • Introduce decrement() method
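The proposed decrement() could mirror the existing increment() on DatabaseJobRecord. Below is a self-contained sketch using a stand-in class (FakeDatabaseJobRecord and its structure are hypothetical, assumed from the linked source rather than verified):

```php
<?php

// Hypothetical stand-in for DatabaseJobRecord, holding the underlying
// database row as a plain object with an `attempts` column.
class FakeDatabaseJobRecord
{
    public function __construct(public object $record) {}

    // Mirrors increment(): bump attempts when the job is reserved.
    public function increment(): int
    {
        return ++$this->record->attempts;
    }

    // Proposed counterpart: undo the reserve-time bump on requeue,
    // never going below zero.
    public function decrement(): int
    {
        $this->record->attempts = max(0, $this->record->attempts - 1);

        return $this->record->attempts;
    }
}

$record = new FakeDatabaseJobRecord((object) ['attempts' => 0]);
$record->increment();            // reserved: attempts = 1
echo $record->decrement(), "\n"; // requeued: prints 0
```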

Queue Implementation
https://github.com/laravel/framework/blob/master/src/Illuminate/Queue/DatabaseQueue.php#L144

  • Introduce new requeue() or adjust existing release() method

Redis

Job Class
https://github.com/laravel/framework/blob/master/src/Illuminate/Queue/Jobs/RedisJob.php#L93

  • Introduce new requeue() or adjust existing release() method

Queue Implementation
https://github.com/laravel/framework/blob/master/src/Illuminate/Queue/RedisQueue.php#L274

  • Introduce new requeue() or adjust existing release() method
    • Requeue would likely use the job's original payload prior to being marked as reserved
    • ie: instead of $job->getReservedJob(), using $job->getPayload() during the release process
      • need to consider differences between the reserved job and payload data, if any (currently assumed to include the adjusted attempt counts)

Sync

Job Class
https://github.com/laravel/framework/blob/master/src/Illuminate/Queue/Jobs/SyncJob.php#L47

  • Introduce new requeue() or adjust existing release() method

Others (SQS, Beanstalkd)
I'm not too sure how this will be handled in the SQS and Beanstalkd implementations. I'll need to take a closer look at how these implement attempt counters and whether they can be manipulated.

@mfn mfn commented Nov 6, 2018

> but releases the job back into the queue without an attempt counter increase.

> Avoid job processing being considered as an attempt when it fails due to throttling, as the api attempt was not made

> If introduced, it would be recommended that it's usage is constrained by both a "attempt limit" and "time expiry"

This would add quite some complexity to a system already having lots of variables.

One thing is for sure: you can't introduce such a feature without additional safeguard limits, so I understand this.

But: can't you achieve the same with a high number of retries (e.g. 100)?

@matthew-muscat matthew-muscat commented Nov 6, 2018

> This would add quite some complexity to a system already having lots of variables.

We're not really adding complexity here — this is already available within the framework.

For example — allowing jobs to fail on both max attempts and timeouts would look like the following in the job class:

<?php

namespace App\Jobs;

use Illuminate\Contracts\Queue\ShouldQueue;

class ProcessItem implements ShouldQueue
{
    // Limit the job to 5 attempts
    public $tries = 5;

    // ...

    public function retryUntil()
    {
        // Allow the job to retry until two hours have elapsed
        return now()->addHours(2);
    }
}

> But: can't you achieve the same with a high number of retries (i.e. 100) ?

In our use case, we want to avoid making 100+ API calls on retry if the API call were to fail for some reason, while still guaranteeing at least 5 actual attempts of the API call being made.

@ghost ghost commented Dec 17, 2018

I was looking for the same thing. For now, I am deleting the job and dispatching it again from inside the job.
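For reference, that workaround might look roughly like the following inside a job's handle() method. This is an unverified sketch assuming a standard Laravel job using the Dispatchable and InteractsWithQueue traits; 'key', the limiter settings, and the 10-second delay are placeholder values:

```php
public function handle()
{
    Redis::throttle('key')->allow(10)->every(60)->then(function () {
        // API logic
    }, function () {
        // Could not obtain a lock: delete this job instance so its
        // reserve-time attempt is discarded...
        $this->delete();

        // ...and dispatch a fresh copy with a delay; the new job starts
        // with a clean attempt count.
        static::dispatch()->delay(now()->addSeconds(10));
    });
}
```

One caveat with this approach is that the re-dispatched job is a brand-new job, so the tries limit never accumulates across requeues — which is exactly why the proposal pairs requeuing with a retryUntil time expiry.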

@sebastianwebb sebastianwebb commented Feb 19, 2019

@matthew-muscat I would love to see your suggestion added. I posted in the Laravel forum requesting advice on a workaround for this, along with my review of other solutions that aren't quite there (such as using a database queue driver or re-dispatching jobs).

I've just reread your posts in more detail, and I think you've touched on the main issue I'm currently struggling with:

> need to consider differences of reserved job and payload data, if any (currently assumed to include the adjusted attempt counts)

I'm having trouble figuring out how to decrement the attempt count on my Horizon/Redis managed job payload. I tried creating a function to update the payload string in Redis directly, but it is getting overwritten / not working for some reason.

public static function adjustRedisJobAttempts($change, $job_id)
{
    // job key
    $key = 'horizon:'.$job_id;

    // get the Redis job hash
    $job_meta = Redis::hgetall($key);

    // convert the payload JSON string to a PHP object
    $payload = json_decode($job_meta['payload']);

    // increment/decrement attempts
    $payload->attempts = Helper::adjustNumber($change, $payload->attempts);

    // set new attempt number
    Redis::hmset($key, 'payload', json_encode($payload));

    // return revised attempts number
    return $payload->attempts;
}

I'm feeling around in the dark here; I'm new to Laravel. Did you manage to figure out a way to decrement the attempts? I also don't want to make far more API calls than necessary, and I feel your solution is the only one with no downsides.

Thanks,
Sebastian

@StefanPrintezis StefanPrintezis commented Feb 22, 2019

As stated earlier in the thread, we would need some kind of new function that doesn't increase the attempt count. Since the attempt is determined when popping the job off the queue, we would need to decrement. Decrementing conditionally usually means we shouldn't increment automatically, but I'm not sure that's what we want here.

In the DurationLimiterBuilder, the LimiterTimeoutException is thrown if the maximum number of locks is reached within the decay period. The $failure closure in the then() function is called, but it's actually not a failure, it's a "throttled" case. The job didn't fail; it just didn't run because it was throttled.

I'm thinking this functionality should be added at the Queue level, instead of as an implementation in a job's handle() method. It's a different approach, but right now it's kind of a wacky integration.

If a Queue is throttle-aware, it could hand out a job only when it is within the limit when getNextJob is requested.

I could make a pull request, but it feels like an entirely different approach compared to adding an alternative to the release function. Any thoughts on the "throttle-aware Queue" approach?

@taylorotwell taylorotwell (Member) commented May 27, 2019

I'm OK with this concept in general and would be OK with it being PR'd. As noted, you would definitely want to time-constrain the re-attempts using retryUntil. I'm fine with a new method like requeue() or something, I think?

@shawnhind shawnhind commented Jun 12, 2019

@matthew-muscat I was looking into testing your proposed implementation by overriding the queues with my own custom ones. One problem I've come across is that the Lua scripts match based on the payload. For example, the release Lua script seems to delete the job and then add the same payload onto another queue. If you simply pass $job->getPayload() instead of $job->getReservedJob() during the release process, as you suggested, it seems it won't delete the reserved job. I'm having a problem where jobs are still being tried multiple times even after failing, and so far my best guess is that this is what's causing it.

I could be very wrong, but it seems to me that an extra Lua script call would be needed to specifically requeue instead of release as well.

@judgej judgej commented Nov 20, 2019

With Laravel 6 job middleware, it makes sense for the job attempt count to be incremented only when the job actually runs. Rate limiting implemented in the middleware should not count up the attempts if it decides no attempt is to be performed. However, I understand the count increment is performed when the job is popped from the queue, so the attempt count has already increased before the middleware even decides whether the job can run.

Is that how it works? Is the middleware considered to be a part of the job, rather than a precursor to the job? Or have I got my assumptions wrong?
