Prevent score removal jobs from overlapping #11162

notbakaneko · 2024-04-17T08:47:00Z

Use the WithoutOverlapping middleware to lock the job. If the job takes longer than retry_after to run, a bogus try is still added (and counted as a successful run) and the job gets moved to the :delayed queue. InteractsWithQueue is needed for the actual $job->release() behaviour of the queue. There's still a problem if something kills the worker before it can run any failure handlers: the job won't be retried until the timeout/lock expires, but that's no different from changing retry_after.

The issue is that workers will automatically retry reserved jobs after retry_after has expired even if they're still running, so the same job starts being run by multiple workers simultaneously and has its attempt count incremented each time, eventually causing MaxAttemptsExceededException to be thrown (while the job is still running on the first worker). If the job does fail now, it won't be retried anymore.
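A toy model of that failure mode, in plain PHP rather than Laravel's actual queue code. `ToyQueue`, its methods and the numbers used are made up for illustration; the only behaviour it assumes is the one described above, namely that a reservation older than retry_after is handed out again.

```php
<?php
// Toy model only: NOT Laravel's queue implementation.
final class ToyQueue
{
    /** @var array<string, array{attempts: int, reservedAt: ?int}> */
    private array $jobs = [];

    public function __construct(private int $retryAfter)
    {
    }

    public function push(string $id): void
    {
        $this->jobs[$id] = ['attempts' => 0, 'reservedAt' => null];
    }

    // Hands out a job that is unreserved, or whose reservation is at
    // least retryAfter seconds old, and counts that as a new attempt.
    public function pop(int $now): ?string
    {
        foreach ($this->jobs as $id => $job) {
            $expired = $job['reservedAt'] !== null
                && $now - $job['reservedAt'] >= $this->retryAfter;

            if ($job['reservedAt'] === null || $expired) {
                $this->jobs[$id] = [
                    'attempts' => $job['attempts'] + 1,
                    'reservedAt' => $now,
                ];

                return $id;
            }
        }

        return null;
    }

    public function attempts(string $id): int
    {
        return $this->jobs[$id]['attempts'];
    }
}

$queue = new ToyQueue(60);       // hypothetical retry_after of 60 seconds
$queue->push('remove-scores');

$first = $queue->pop(0);         // worker 1 starts a long-running job
$idle = $queue->pop(30);         // still reserved, nothing to hand out
$second = $queue->pop(60);       // worker 2 gets the same job while worker 1 is still busy
```

After the third call the job has two attempts recorded even though nothing has failed, which is how the attempt limit ends up being hit while the first worker is still running.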

nanaya · 2024-04-25T06:24:38Z

> There is WithoutOverlapping middleware for the job, but that just causes workers trying to run the job while the lock is active to immediately complete the job and release it, completely removing the job from the queues instead of leaving it in the reserved queue.

from what I gather it's more like the lock is just held forever unless expiresAfter is specified? Would specifying it to be the same as timeout help?

Also the "release" is more like sending it back to queue?

notbakaneko · 2024-04-25T07:22:20Z

If the lock hasn't expired yet, other workers will still run the job and immediately complete it and remove it from the reserved queue, and it doesn't get picked up again. If the original worker gets killed by OOM and doesn't fire the error handler, the job is gone.

nanaya · 2024-04-25T07:53:24Z

where did you get that? That's not what the handle function says at least

https://github.com/laravel/framework/blob/ad758500b47964d022addf119600a1b1b0230733/src/Illuminate/Queue/Middleware/WithoutOverlapping.php#L70-L85

$job->release() is "Release the job back into the queue after (n) seconds."

~~(and if needed, there's dontRelease() option)~~ actually, it needs to be released for it to be retried.

notbakaneko · 2024-04-25T11:13:17Z

oh, it also needs the InteractsWithQueue trait added to the job to get the default release behaviour :|

It's still not great:

1. worker 1 dequeues job, increments attempts, acquires lock
2. retry_after expires
3. worker 2 dequeues job, increments attempts, fails to lock, places on delayed queue, returns job as done
4. releaseAfter expires
5. worker 2 dequeues job, increments attempts...
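
The sequence above can be walked through with a small plain-PHP model. This is a sketch, not the framework's code: `ToyLockedJob`, its fields and the timings are hypothetical, and it assumes the worker holding the lock never finishes within the window shown. What the last dequeue does depends only on whether the lock is still held at that point.

```php
<?php
// Toy walk-through only: NOT Laravel's queue or lock implementation.
final class ToyLockedJob
{
    public int $attempts = 0;
    private int $availableAt = 0;        // when the job can next be dequeued
    private ?int $lockExpiresAt = null;  // null means nobody holds the lock

    public function __construct(
        private int $retryAfter,
        private int $releaseAfter,
        private int $expiresAfter
    ) {
    }

    // Returns 'run' if the worker got the lock, 'released' if it put the
    // job on the delayed queue, or null if the job is not due yet.
    public function dequeue(int $now): ?string
    {
        if ($now < $this->availableAt) {
            return null;
        }

        $this->attempts++; // every dequeue counts as an attempt

        if ($this->lockExpiresAt !== null && $now < $this->lockExpiresAt) {
            // Lock is held elsewhere: delay the job by releaseAfter.
            $this->availableAt = $now + $this->releaseAfter;

            return 'released';
        }

        $this->lockExpiresAt = $now + $this->expiresAfter;
        // The reservation itself lapses after retryAfter.
        $this->availableAt = $now + $this->retryAfter;

        return 'run';
    }
}

$job = new ToyLockedJob(60, 300, 300); // hypothetical retry_after, releaseAfter, expiresAfter

$a = $job->dequeue(0);       // worker 1: attempt 1, acquires lock
$between = $job->dequeue(30); // not due: reservation has not lapsed yet
$b = $job->dequeue(60);      // worker 2: attempt 2, lock held, job delayed
$c = $job->dequeue(360);     // attempt 3 once releaseAfter has passed
```

The first retry arrives after retry_after (60 here) and only the later ones are spaced by releaseAfter, which is the mismatch described in this comment.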

It might be passable, but it's still a roundabout way of having another setting that acts like retry_after and doesn't apply the first time; you'd think it should always use the releaseAfter value 🤔

nanaya · 2024-04-25T14:00:50Z

it's probably still better than adding a different queue just to have a different timeout

nanaya · 2024-04-26T12:07:32Z

app/Jobs/RemoveBeatmapsetBestScores.php

+
+    public function middleware(): array
+    {
+        return [new WithoutOverlapping($this->beatmapset->getKey(), $this->timeout, $this->timeout)];


string key?

notbakaneko added 2 commits April 17, 2024 15:09

run on different connection and queue

337ad7d

use sync queue for tests

9ebe441

notbakaneko added the type:reliability label Apr 17, 2024

notbakaneko self-assigned this Apr 17, 2024

notbakaneko changed the title ~~Run remove scores jobs on separate connection~~ Run score removal jobs on separate connection Apr 17, 2024

use WithoutOverlapping middleware and just eat the extra try

7ee752e

notbakaneko changed the title ~~Run score removal jobs on separate connection~~ Prevent score removal jobs from overlapping Apr 26, 2024

nanaya requested changes Apr 26, 2024

cast key to string

92ea188

nanaya approved these changes Apr 27, 2024

Merge branch 'master' into feature/remove-scores-separate-connection

257525a

nanaya enabled auto-merge April 27, 2024 15:04

nanaya merged commit 0444f77 into ppy:master Apr 27, 2024
3 checks passed
