Ways to solve multiple schedulers. #161

Closed
MichaelXavier opened this Issue Jun 5, 2012 · 4 comments

Hi. I recently got bitten by having multiple schedulers running at a time. I came up with a strategy to prevent this in future, but I wondered if you'd be interested in solving this in the library. I could try my hand at doing it myself as a pull request if you'd like. I figured running two schedulers on the same resque namespace/server is never what you want so it would be nice if we could trivially prevent it.

Basically, my hack works like so:

desc "Spawn an auto-expiring lock for scheduler (run as a dependency)"
task :scheduler_lock do
  redis    = Resque.redis
  lock_key = "scheduler:lock"
  timeout  = 90
  interval = 60

  if redis.setnx(lock_key, Time.now.to_i)
    Thread.new do
      loop do
        redis.expire(lock_key, timeout)
        sleep(interval)
      end
    end
  else
    puts "Another scheduler is already running. Aborting."
    exit 1
  end
end

# add scheduler lock dependency
task :scheduler => :scheduler_lock

So it basically spins off another thread that pokes the lock's expiry every once in a while, then forgets about the thread. I thought this would be better than a job because it's independent of queues being backed up and it dies with the process, so the lock naturally expires if the scheduler dies in a fire. Thoughts?
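
One gap worth noting in the snippet above: `setnx` and the first `expire` are separate commands, so a process that dies between them would leave a lock with no TTL. A minimal sketch of closing that gap, assuming Redis 2.6.12 or later and a redis-rb client whose `set` accepts the `nx:` and `ex:` options. The helper name `acquire_scheduler_lock` is made up for illustration and is not part of resque-scheduler:

```ruby
# Hypothetical helper, not part of resque-scheduler.
# Sets the lock and its TTL in one Redis command (SET key value NX EX ttl),
# so the key can never exist without an expiry.
def acquire_scheduler_lock(redis, key: "scheduler:lock", timeout: 90)
  # redis-rb returns true when the key was set and false when it already existed.
  redis.set(key, Time.now.to_i, nx: true, ex: timeout) ? true : false
end
```

The heartbeat thread would still be needed afterwards to keep extending the TTL while the scheduler is alive.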

Contributor

bvandenbos commented Jun 5, 2012

I've actually been thinking about doing something similar to support redundancy (though in the scheduler process itself). Basically, when new schedulers spin up, they use setnx to attempt to acquire the "master" lock (a key that expires after a period of time and stores the machine name and pid as the value). Rather than aborting if they fail to acquire it, they simply continue to attempt to acquire it every so often. If the master ever fails to re-grab the lock before expiration (say, if the machine became unresponsive), a new master would grab the lock and begin actively scheduling jobs (and delayed jobs). If the original master happens to become responsive again, it will check the lock key, notice it is no longer the master, and begin operating as a master-to-be, periodically checking to see if the new master has failed to re-acquire the lock. Seems to work in theory (and has the side effect of addressing your problem).
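
The failover idea described above can be sketched roughly as follows. This is an illustration under stated assumptions, not resque-scheduler's implementation: `FakeStore` is a hypothetical in-memory stand-in for the handful of Redis operations involved, and `SchedulerCandidate`, the key name, and the timeout are all invented for the example.

```ruby
# Hypothetical in-memory stand-in for Redis (set-if-absent with TTL, GET, EXPIRE).
class FakeStore
  def initialize(clock)
    @clock = clock   # callable returning the current time in seconds
    @data  = {}      # key => [value, expires_at]
  end

  # Writes the key only if it is absent or expired; true when the write happened.
  def set_if_absent(key, value, ttl)
    purge(key)
    return false if @data.key?(key)
    @data[key] = [value, @clock.call + ttl]
    true
  end

  def get(key)
    purge(key)
    entry = @data[key]
    entry && entry[0]
  end

  # Extends the TTL; false if the key no longer exists.
  def expire(key, ttl)
    purge(key)
    return false unless @data.key?(key)
    @data[key][1] = @clock.call + ttl
    true
  end

  private

  # Drops the key once its expiry time has passed.
  def purge(key)
    entry = @data[key]
    @data.delete(key) if entry && entry[1] <= @clock.call
  end
end

# One scheduler process competing for the master lock.
class SchedulerCandidate
  LOCK_KEY = "scheduler:master_lock" # assumed key name
  TIMEOUT  = 90                      # assumed lock lifetime in seconds

  def initialize(store, id)
    @store = store
    @id    = id # e.g. "hostname:pid"
  end

  # Call periodically. Returns true if this process is master after the call.
  # Against real Redis the check-then-extend step would need to be atomic
  # (for example via a Lua script); the fake store sidesteps that here.
  def tick
    if @store.get(LOCK_KEY) == @id
      @store.expire(LOCK_KEY, TIMEOUT)               # still master: renew
    else
      @store.set_if_absent(LOCK_KEY, @id, TIMEOUT)   # standby: try to take over
    end
  end
end
```

Each scheduler would call `tick` on an interval shorter than the timeout and only enqueue jobs while it returns true; a standby takes over once the current master misses a renewal.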

But alas, I have been too busy and have been neglecting this project.

Contributor

meatballhat commented Nov 9, 2013

@MichaelXavier still valid?

Nope. I think we switched from our hand-rolled solution to this some time ago and I haven't seen the issue reappear.

Contributor

meatballhat commented Nov 9, 2013

@MichaelXavier much thanks!
