Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent score removal jobs from overlapping #11162

Merged

Conversation

notbakaneko
Copy link
Collaborator

@notbakaneko notbakaneko commented Apr 17, 2024

Use WithoutOverlapping middleware to lock the job. There will be a bogus try added (and also gets counted as successful run) if the job takes longer than retry_after to run and the job gets moved to a :delayed queue. InteractsWithQueue is needed for the actual $job->release() behaviour of the queue. There's still the problem if something kills the worker before it can run any fail handlers and the job won't be retried until the timeout/lock expires, but that's not any different than if retry_after is changed.


The issue is workers will automatically retry reserved jobs after retry_after has expired even it they're still running, so the same job starts being run by multiple workers simultaneously and have their attempt counts incremented, eventually causing MaxAttemptsExceededException to be thrown (while the job is still running on the first worker). If the job does fail now, it won't be retried anymore.

@notbakaneko notbakaneko self-assigned this Apr 17, 2024
@notbakaneko notbakaneko changed the title Run remove scores jobs on separate connection Run score removal jobs on separate connection Apr 17, 2024
@nanaya
Copy link
Collaborator

nanaya commented Apr 25, 2024

There is WithoutOverlapping middleware for the job, but that just causes workers trying to run the job while the lock is active to immediate complete the job and release it, completely removing the job from the queues instead of leaving it in the reserved queue.

from what I gather it's more like the lock is just held forever unless expiresAfter is specified? Would specifying it to be the same as timeout help?

Also the "release" is more like sending it back to queue?

@notbakaneko
Copy link
Collaborator Author

notbakaneko commented Apr 25, 2024

If the lock hasn't expired yet, other workers will still run the job and immediately complete it and remove it from the reserved queue, and it doesn't get picked up again. If the original worker gets killed by OOM and doesn't fire the error handler, the job is gone.

@nanaya
Copy link
Collaborator

nanaya commented Apr 25, 2024

where did you get that? That's not what the handle function says at least

https://github.com/laravel/framework/blob/ad758500b47964d022addf119600a1b1b0230733/src/Illuminate/Queue/Middleware/WithoutOverlapping.php#L70-L85

$job->release() is "Release the job back into the queue after (n) seconds."

(and if needed, there's dontRelease() option) actually, it needs to be released for it to be retried.

@notbakaneko
Copy link
Collaborator Author

oh, it also needs the InteractsWithQueue trait added to the job to get the default release behaviour :|

It's still not great:

worker 1 dequeues job, increments attempts, acquires lock
retries_after expires
worker 2 dequeues job, increments attempts, fails to lock, places on delayed queue, returns job as done
releaseAfter expires
worker 2 dequeues job, increments attempts...

It might be passable but it's also still a roundabout way of having another setting that acts like retries_after and doesn't apply the first time; you'd think it should always use the releaseAfter value 🤔

@nanaya
Copy link
Collaborator

nanaya commented Apr 25, 2024

it's probably still better than adding different queue just to have different timeout

@notbakaneko notbakaneko changed the title Run score removal jobs on separate connection Prevent score removal jobs from overlapping Apr 26, 2024

public function middleware(): array
{
return [new WithoutOverlapping($this->beatmapset->getKey(), $this->timeout, $this->timeout)];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

string key?

@nanaya nanaya enabled auto-merge April 27, 2024 15:04
@nanaya nanaya merged commit 0444f77 into ppy:master Apr 27, 2024
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

None yet

2 participants