until_and_while_executing and lock_ttl: jobs silently dropped #788
Comments
I get your problem, but if I allowed that, there would be no point in having the gem. This gem is about preventing duplicate jobs. Given a TTL, there is simply no way to sort this out: when the TTL expires, the lock does too. If you want to make sure no job is lost, I would recommend removing the TTL, delegating cleanup to the reaper, and using a conflict resolution strategy to ensure the job isn't ignored.
But for the until_and_while_executing lock, isn't it better to allow the job to run and only drop it (or apply the defined conflict strategy) if another job is running? The lock for this job expired during the "until_executing" phase. I guess that trying to run it won't cause problems, because the "while_executing" part will ensure that it's unique.
We previously tried that, but at least for our use case, with thousands of jobs enqueued/scheduled, the reaper is too slow to be effective. When we force it to run by adjusting timeouts/count, it takes 12+ hours to check all keys.
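The proposal above (let the job run after the queue-time lock has expired, and rely on the "while_executing" part for uniqueness) can be sketched as a plain-Ruby toy. This is not gem code: `ToyServer`, `start`, `finish`, the `queue_lock_expired` argument, and the `:started` / `:conflict` results are all hypothetical names, and the sketch assumes that a single runtime check on the job digest is enough to decide whether a job may run.

```ruby
require "set"

# Toy model only: this is NOT how sidekiq-unique-jobs is implemented.
class ToyServer
  def initialize
    @running = Set.new # digests of jobs that are executing right now
  end

  # `queue_lock_expired` says whether the until_executing lock had already
  # expired when the job was picked up. In this sketch it is deliberately
  # not used for the run/skip decision: only the runtime check counts.
  def start(digest, queue_lock_expired:)
    # A duplicate is only rejected while another job with the same digest
    # is running. This is where a conflict strategy would be applied.
    return :conflict if @running.include?(digest)

    @running.add(digest)
    :started
  end

  def finish(digest)
    @running.delete(digest)
  end
end

server = ToyServer.new
server.start("digest-a", queue_lock_expired: true) # runs despite the expired lock
server.start("digest-a", queue_lock_expired: true) # duplicate of a running job
server.finish("digest-a")
```

The trade-off in this toy is the one discussed in the thread: uniqueness while the job sits in the queue is no longer guaranteed once the TTL has passed, but the job is not lost.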
Reaper performance will be improved by only checking digests/locks known to have a chance of expiring. It was brought to my attention that Sidekiq only checks in every 10 seconds; during those seconds, jobs might appear to be missing, and I clean up active locks. This should be fixed in the main branch, but I need help testing my assumptions. #830, in particular, was affecting you, @fernandomm.
Describe the bug
When using lock_ttl with until_and_while_executing locks, jobs are silently dropped if the lock expires before the job can be processed.
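To make the failure mode concrete, here is a toy model in plain Ruby. It does not use sidekiq-unique-jobs or Redis: `TtlLock`, `ToyQueue`, and the explicit `now` argument are made-up names for illustration, and the TTL of 3 seconds is an arbitrary value. It assumes the simplest reading of the report, namely that the lock is taken at enqueue time and the server skips the job when that lock has already expired at pickup time.

```ruby
# Toy model only: none of these classes exist in sidekiq-unique-jobs.
# Time is passed in explicitly so the example is deterministic.

# A lock that is only valid for `ttl` seconds after it was taken.
class TtlLock
  def initialize(taken_at:, ttl:)
    @expires_at = taken_at + ttl
  end

  def held?(now)
    now < @expires_at
  end
end

# A queue that takes a lock on push and checks it again when working.
class ToyQueue
  attr_reader :processed, :dropped

  def initialize(lock_ttl:)
    @lock_ttl = lock_ttl
    @jobs = []
    @processed = []
    @dropped = []
  end

  def push(job_id, now:)
    @jobs << [job_id, TtlLock.new(taken_at: now, ttl: @lock_ttl)]
  end

  # Mimics the reported behavior: a job whose lock has expired is
  # skipped without raising or logging anything.
  def work_off(now:)
    until @jobs.empty?
      job_id, lock = @jobs.shift
      if lock.held?(now)
        @processed << job_id
      else
        @dropped << job_id
      end
    end
  end
end

late = ToyQueue.new(lock_ttl: 3)
late.push("job-1", now: 0)
late.work_off(now: 5)    # worker starts after the TTL has passed

on_time = ToyQueue.new(lock_ttl: 3)
on_time.push("job-2", now: 0)
on_time.work_off(now: 1) # worker starts while the lock is still held
```

In this model, `late.dropped` ends up holding the job and `late.processed` stays empty, which corresponds to the "silently dropped" case; the `on_time` queue processes its job normally.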
Expected behavior
It should allow jobs to be processed, even if their uniqueness can no longer be ensured.
Current behavior
Jobs are dropped here.
Worker class
Run:
Wait 5 seconds before starting a sidekiq worker. When it starts processing, you will get this:
If the job is processed before the lock expires, it works as expected:
At the moment, the only workaround I have found is: