reschedule_at not working correctly #36

Closed
whitslar opened this Issue Feb 22, 2013 · 1 comment

Comments

Projects
None yet
3 participants
@whitslar

Im having problems with reschedule_at working properly. My worker will always make two attempts in quick succession upon failure no matter what I set reschedule_at to. I've done some research and it it looks like there's an issue with locked_by being set when it shouldn't. Either that or the logic in ready_to_run should include some extra parenthesis so that locked_by doesnt override run_at being > db_time_now.

Here's an example, where I included some debugging lines

in worker.rb:

def reschedule(job, time = nil)
  if (job.attempts += 1) < max_attempts(job)
    puts "Rescheduling"
    time ||= job.reschedule_at
    job.run_at = time
    puts "just set run at to #{time}"
    puts "locket_at: #{job.locked_at}"
    puts "locket_by: #{job.locked_by}"
    job.unlock
    puts 'just unlocked'
    job.save!
    puts 'just saved'
    puts "locket_at: #{job.locked_at}"
    puts "locket_by: #{job.locked_by}"
  else
    say "PERMANENTLY removing #{job.name} because of #{job.attempts} consecutive failures.", Logger::INFO
    failed(job)
  end
end

in active_record.rb:

def self.ready_to_run(worker_name, max_run_time)
  puts "Ready to run"
  puts "db_time_now: #{db_time_now}"
  puts "run_at: #{where('(run_at <= ? AND (locked_at IS NULL OR locked_at < ?) OR locked_by = ?) AND failed_at IS NULL', db_time_now, db_time_now - max_run_time, worker_name).first.run_at rescue nil}"
  puts "locked_at: #{where('(run_at <= ? AND (locked_at IS NULL OR locked_at < ?) OR locked_by = ?) AND failed_at IS NULL', db_time_now, db_time_now - max_run_time, worker_name).first.locked_at rescue nil}"
  puts "locked_by: #{where('(run_at <= ? AND (locked_at IS NULL OR locked_at < ?) OR locked_by = ?) AND failed_at IS NULL', db_time_now, db_time_now - max_run_time, worker_name).first.locked_by rescue nil}"
  where('(run_at <= ? AND (locked_at IS NULL OR locked_at < ?) OR locked_by = ?) AND failed_at IS NULL', db_time_now, db_time_now - max_run_time, worker_name)
end

When I run a single delayed job that intentionally raises an exception, I get output that looks like this:

Ready to run
db_time_now: 2013-02-22 11:55:10 -0500
run_at: 2013-02-22 11:54:58 -0500
locked_at:
locked_by:

(job fails so it reschedules...)

Rescheduling
just set run at to 2013-02-22 11:58:20 -0500
locket_at: 2013-02-22 11:55:10 -0500
locket_by: host:Matts-iMac.local pid:32904
just unlocked
just saved
locket_at:
locket_by:

Ready to run
db_time_now: 2013-02-22 11:55:12 -0500
run_at: 2013-02-22 11:58:20 -0500
locked_at:
locked_by: host:Matts-iMac.local pid:32904

(hmmm.. why is locked_at nil but locked_by isn't. Also run_at is > db_time_now so this shouldnt have ran, but it did)

(job fails again...)
Rescheduling
just set run at to 2013-02-22 12:32:39 -0500
locket_at: 2013-02-22 11:55:12 -0500
locket_by: host:Matts-iMac.local pid:32904
just unlocked
just saved
locket_at:
locket_by:

Ready to run
db_time_now: 2013-02-22 11:55:12 -0500
run_at:
locked_at:
locked_by:

(weird.. after a second call to reschedule_at, locked_by is now nil like it probably should be and ready_to_run doesn't find any jobs)

So I'm not sure why locked_by isn't nil, maybe its on purpose, maybe not, but according to the logic in ready_to_run, if locked_by = worker_name, it will override run_at <= db_time_now like so:

where('(run_at <= ? AND (locked_at IS NULL OR locked_at < ?) OR locked_by = ?) AND failed_at IS NULL', db_time_now, db_time_now - max_run_time, worker_name)

in another form could be:

(A && (B || C) || D) && E

Since D is chained on the end like that a true value will override A && (B || C)

So is locked_by being set incorrectly, or should the logic be more like (A && ((B || C) || D)) && E ??

@sferik

This comment has been minimized.

Show comment Hide comment
@sferik

sferik Apr 2, 2013

Collaborator

Can you please verify whether this is fixed in the latest gem version (0.4.4)?

Collaborator

sferik commented Apr 2, 2013

Can you please verify whether this is fixed in the latest gem version (0.4.4)?

@albus522 albus522 closed this Sep 25, 2014

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment