duplicate scheduling when running on multiple host #332
Comments
@rvyas you might want to take a look at sidekiq-uniq
Thanks @bpo, I will try this gem. I was trying to do the same in the worker (checking other jobs in the queue, etc.), but there is some delay before those jobs show up in the Sidekiq APIs.
@rvyas I don't use that one myself, but I know others have had similar issues. Sidekiq Enterprise has built-in support for unique jobs, which would probably be your best choice if that's an option for you.
@rvyas Did you solve your problem?
No, it did not solve the problem. Even with sidekiq-uniq I see the same duplicate scheduling, although the rate has been reduced.
@rvyas Were you able to solve your issue with sidekiq-cron?
@rvyas @mgoggin I haven't looked at this gem https://github.com/mhenrixon/sidekiq-unique-jobs myself, but maybe it can help.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
ok, I think one solution for those problems will be to use a Redis-based lock instead of rufus-scheduler's default file lock (https://github.com/jmettraux/rufus-scheduler#advanced-lock-schemes). This would allow us to use something like https://github.com/kenn/redis-mutex to create a single lock shared by all schedulers running on different hosts, which would ensure that only one scheduler is able to schedule a job at a given time. I'll investigate more and run some tests; this may land as an experimental feature in v4.1.
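A minimal sketch of that idea, assuming the redis-mutex gem and the `#lock`/`#unlock` interface described in the rufus-scheduler README. The class and key names here are illustrative, not from either gem:

```ruby
# Sketch: a cross-host scheduler lock built on redis-mutex.
# rufus-scheduler expects an object responding to #lock (truthy when
# acquired) and #unlock; only the process holding the lock schedules.
class CrossHostSchedulerLock
  def initialize(key: :sidekiq_scheduler_leader, expire: 300)
    # block: 0 => try once and return immediately instead of waiting;
    # expire => the lock is released automatically if the holder dies.
    @mutex = RedisMutex.new(key, block: 0, expire: expire)
  end

  def lock
    @mutex.lock
  end

  def unlock
    @mutex.unlock
  end
end
```

The expire option covers the crash case: a dead host's lock simply times out and another host's scheduler takes over on its next attempt.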
@marcelolx this seems like a tricky thing for sidekiq-scheduler to handle well, as it's still always possible for the app to enqueue jobs externally from the scheduler.
@bpo Could you give an example? PS: The only thing we want is to avoid sidekiq-scheduler enqueueing duplicate jobs on multiple hosts.
I think there may be 3 issues here:

A) sidekiq-scheduler can enqueue a job when it's already running
B) Sidekiq can enqueue two copies of a job
C) two copies of a job can run at once

The initial description of the issue here is all 3 of those: the overall system has performance problems because two copies of a job are running at once (C), caused by Sidekiq enqueuing two copies of a job (B), and sidekiq-scheduler handled the actual enqueueing (A). A is clearly a subset of B, which makes me think that a more general plugin to Sidekiq might be the right way to solve the whole problem rather than adding the complexity within sidekiq-scheduler.
I think this means you are saying "we only want to solve A". I think you're right that A is solvable within sidekiq-scheduler. I wanted to raise a caution about added complexity vs value (since B/C will remain unaddressed) but it sounds like you've already thought that through.
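The B-level guard such a general plugin would provide is, in rough shape, an atomic SET NX before every push. This is only a sketch with hypothetical names (UniqueEnqueue, dedup_ttl), with the real Sidekiq push left as a comment; Sidekiq Enterprise's unique jobs and sidekiq-unique-jobs are production-grade versions of the same idea:

```ruby
# Sketch of an enqueue-time uniqueness guard (names are illustrative).
module UniqueEnqueue
  # redis: anything responding to set(key, value, nx:, ex:) like redis-rb.
  # Returns true if the job was pushed, false if a duplicate was skipped.
  def self.push(redis, worker_class, args, dedup_ttl: 300)
    key = "unique:#{worker_class}:#{args.inspect}"
    # SET NX is atomic: only one caller wins for a given class+args
    # within the TTL window, regardless of which host it runs on.
    return false unless redis.set(key, "1", nx: true, ex: dedup_ttl)
    # Sidekiq::Client.push("class" => worker_class, "args" => args)
    true
  end
end
```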
got it @bpo, that makes sense! I agree that adding too much complexity is not a path we want to follow, and yeah, what I am looking for is to solve A, just to avoid sidekiq-scheduler enqueueing the same job when running on multiple hosts.
I would like to share one of our solutions to this problem (we ended up with a different solution because we have a lot of dynamically scheduled tasks):

```ruby
require "singleton"
require "socket"
require "securerandom"
# Assumes a Rails app where Sidekiq, ActiveSupport (5.minutes)
# and Honeycomb are already available.

class RedisLock
  include Singleton

  LOCK_NAME = "rufus:trigger"
  INSTANCE_IDENTIFIER = Socket.gethostname || SecureRandom.hex(10)

  # Truthy when this host just acquired, or already holds, the lock.
  def lock
    Sidekiq.redis do |r|
      # NX: only set if absent; EX: auto-expire after 5 minutes
      res = r.set(LOCK_NAME, INSTANCE_IDENTIFIER, nx: true, ex: 5.minutes.to_i)
      res == false ? r.get(LOCK_NAME) == INSTANCE_IDENTIFIER : res
    end
  end

  def unlock
    true
  end

  def remove_instance_from_queue
    removed = Sidekiq.redis { |r| r.del(LOCK_NAME) }
    Honeycomb.add_field("scheduler_removed", removed == 1)
  end
end
```
```ruby
# Monkey patch SidekiqScheduler::Scheduler since we don't have a way
# to inject scheduler_lock
module SidekiqScheduler
  class Scheduler
    def initialize(options = {})
      self.enabled = options[:enabled]
      self.dynamic = options[:dynamic]
      self.dynamic_every = options[:dynamic_every]
      self.listened_queues_only = options[:listened_queues_only]
      self.rufus_scheduler_options = options[:rufus_scheduler_options] || {}
      self.rufus_scheduler_options =
        rufus_scheduler_options.merge({ scheduler_lock: RedisLock.instance })
    end
  end
end
```

This will make sure only one host can schedule. Each host needs to create a thread to check if the lock key is still set.
@eldemcan You don't have to patch sidekiq-scheduler, it is just a matter of passing the lock through `rufus_scheduler_options` in the config. One question: what your team wants is a single scheduler running, right? BTW, what happens if your server crashes and `remove_instance_from_queue` is never called? Also, I suggest looking at how Square's https://github.com/square/sqeduler has implemented its locking strategy using Redis, which may be helpful. (My plan is to introduce support for that at some point, as something you can opt in to, but first we want to solve #396)
How can I pass options to the rufus_scheduler instance inside SidekiqScheduler::Scheduler?
We want one instance to be able to schedule at a time, but if that instance goes away, another instance should be able to take over that role.
The key will expire in 5 min (this could be shorter), so if remove_instance_from_queue is not called, Redis will remove the key anyway. We could also create a thread that just keeps checking whether the key is set; if not, the instance will set the key, meaning it declares itself the scheduler among the other hosts.
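The watcher thread described above might look roughly like this. This is a sketch: the interval and the rounds: parameter (added only so the loop can terminate in tests) are illustrative, not part of any gem:

```ruby
# Sketch: each host periodically retries the lock, so that when the
# current holder's key expires, another host claims the scheduler role.
def watch_scheduler_lock(lock, interval: 60, rounds: Float::INFINITY)
  results = []
  n = 0
  while n < rounds
    # lock.lock follows the RedisLock#lock contract above: truthy once
    # the key is free (or already owned by this host).
    results << lock.lock
    n += 1
    sleep interval if n < rounds
  end
  results
end

# In the app this would run forever in a background thread, e.g.:
# Thread.new { watch_scheduler_lock(RedisLock.instance) }
```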
Thank you, I will take a look into this.
Doesn't this work? ⬇️

```yaml
rufus_scheduler_options:
  scheduler_lock: <%= RedisLock.instance %>
```
Then I think that you want to pass along a trigger_lock rather than a scheduler_lock: the scheduler_lock is checked once to decide whether a scheduler instance starts at all, while the trigger_lock is consulted each time a job is about to trigger. So I think that what you're looking for is a trigger lock and not a scheduler lock...
🤦‍♂️ Didn't see the expire!
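A Redis-based trigger lock following that #lock/#unlock interface could be sketched like this. The key name and TTL are illustrative; note the caveat that deleting the key in #unlock reopens the race for late triggers on other hosts:

```ruby
require "socket"

# Sketch: rufus-scheduler consults a trigger_lock's #lock before every
# firing, so each individual trigger goes to exactly one host.
class RedisTriggerLock
  def initialize(redis, key: "rufus:trigger", ttl: 60)
    @redis, @key, @ttl = redis, key, ttl
  end

  # Truthy only for the host that wins this trigger; the TTL frees
  # the key even if the winning process dies mid-job.
  def lock
    @redis.set(@key, Socket.gethostname, nx: true, ex: @ttl)
  end

  def unlock
    @redis.del(@key)
  end
end
```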
@marcelolx thanks for the engagement,
It doesn't work because
We tried
Thanks for the feedback @eldemcan, I'll keep an eye on that issue. And once I work on supporting a Redis lock strategy, I'll see if we can allow users to set a trigger_lock/scheduler_lock without having to patch sidekiq-scheduler!
I recently experienced this same issue, although things had been working correctly despite having multiple hosts. This occurred after I adjusted my Overall, I deployed a wide array of changes to my scheduled jobs, so I'm wondering if this has anything to do with the scheduler queuing duplicates? Perhaps each change + deploy to the scheduler populates additional scheduled jobs without clearing out the old ones?
I've made one for myself. You may be interested :)
This gem works fine under the general use case. However, we have some jobs that take a longer time (20-25 min) and increase the queue latency.
In such scenarios, jobs are scheduled twice, which causes other issues in the app. There is a section in the readme that warns about this, but is there a way I can minimize duplication? Like tweaking certain parameters or limiting certain things?
Or can this issue be fixed in the gem itself?
Current config: