Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with HTTPS or Subversion.

Download ZIP
Browse files

Fix a race condition encountered with pruning dead workers

On a single machine, starting multiple workers at the same time would cause the same job to have
it's failure hook called multiple times. In this patch I'm only calling job.Fail when we've
successfully removed the worker from the :workers set.

Effectively this synchronizes the job fail
on the first worker to successfully remove itself from the set.
  • Loading branch information...
commit 7996c63c4f82b5010b5705a87e7a74877fe21add 1 parent c773e57
@raykrueger raykrueger authored
Showing with 17 additions and 10 deletions.
  1. +14 −9 lib/resque/worker.rb
  2. +3 −1 test/worker_test.rb
View
23 lib/resque/worker.rb
@@ -369,17 +369,22 @@ def run_hook(name, *args)
# Unregisters ourself as a worker. Useful when shutting down.
def unregister_worker
- # If we're still processing a job, make sure it gets logged as a
- # failure.
- if (hash = processing) && !hash.empty?
- job = Job.new(hash['queue'], hash['payload'])
- # Ensure the proper worker is attached to this job, even if
- # it's not the precise instance that died.
- job.worker = self
- job.fail(DirtyExit.new)
+
+ # Multiple workers on a single machine can fail a job, causing
+ # duplicate retries when using retry plugin. Here we're effectively
+ # synchronizing on the srem call to prevent that.
+ if redis.srem(:workers, self)
+ # If we're still processing a job, make sure it gets logged as a
+ # failure.
+ if (hash = processing) && !hash.empty?
+ job = Job.new(hash['queue'], hash['payload'])
+ # Ensure the proper worker is attached to this job, even if
+ # it's not the precise instance that died.
+ job.worker = self
+ job.fail(DirtyExit.new)
+ end
end
- redis.srem(:workers, self)
redis.del("worker:#{self}")
redis.del("worker:#{self}:started")
View
4 test/worker_test.rb
@@ -9,6 +9,7 @@
Resque.after_fork = nil
@worker = Resque::Worker.new(:jobs)
+ @worker.register_worker
Resque::Job.create(:jobs, SomeJob, 20, '/tmp')
end
@@ -307,7 +308,8 @@ def self.exception
workerB.instance_variable_set(:@to_s, "#{`hostname`.chomp}:2:high,low")
workerB.register_worker
- assert_equal 2, Resque.workers.size
+ # should be 3 counting @worker
+ assert_equal 3, Resque.workers.size
# then we prune them
@worker.work(0) do
Please sign in to comment.
Something went wrong with that request. Please try again.