lost resque jobs #42

Closed
zoras opened this Issue Sep 8, 2011 · 7 comments


zoras commented Sep 8, 2011

If a new job comes in while I restart the Resque workers during a deploy, the job is marked as queued in resque-status but is never actually queued in Resque, so the job is lost.

queued

ruby-1.9.2-p180 :027 > status=Resque::Status.get("f050dd20bc45012e1e77723c9193eb99")
 => #<Resque::Status {"time"=>1315485749, "status"=>"queued", "uuid"=>"f050dd20bc45012e1e77723c9193eb99"}>

ruby-1.9.2-p180 :022 > status.status
 => "queued"

ruby-1.9.2-p180 :033 > Resque.info
 => {:pending=>0, :processed=>12943, :queues=>9, :workers=>10, :working=>0, :failed=>8911, :servers=>["redis://192.168.###.###:6379/0"], :environment=>"production"} 

As you can see, Resque reports no pending jobs, yet there are 12 jobs with queued status that don't show up in the Queues tab.
Please take a look at this behaviour.

Is there any way to requeue these jobs?

+1

dacort commented Sep 13, 2011

Starting to see this as well. You can re-enqueue if you know the class of the job with something like this:
Resque.enqueue(JobClass, uuid, job_args)

I don't know if this is associated with restarts or not at this point.
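
A minimal sketch of how the stuck jobs could be picked out before re-enqueueing them, assuming status hashes shaped like the one in the console output above (`"status"`, `"time"` and `"uuid"` keys). `stale_queued_uuids` and the `older_than` threshold are hypothetical names, not part of resque-status, and the job class and arguments still have to be known by the caller.

```ruby
# Hypothetical helper, not part of resque-status: returns the UUIDs of statuses
# that are still "queued" and were last updated more than `older_than` seconds ago.
def stale_queued_uuids(statuses, now: Time.now.to_i, older_than: 300)
  statuses
    .select { |s| s["status"] == "queued" && (now - s["time"].to_i) > older_than }
    .map { |s| s["uuid"] }
end

# Plain hashes standing in for status objects; the last two UUIDs are made up.
statuses = [
  { "time" => 1315485749, "status" => "queued",    "uuid" => "f050dd20bc45012e1e77723c9193eb99" },
  { "time" => 1315485749, "status" => "completed", "uuid" => "hypothetical-uuid-1" },
  { "time" => 1315486040, "status" => "queued",    "uuid" => "hypothetical-uuid-2" }
]

stale = stale_queued_uuids(statuses, now: 1315486049, older_than: 60)

# With Resque loaded, each UUID could then be pushed back as suggested above:
#   stale.each { |uuid| Resque.enqueue(JobClass, uuid, job_args) }
```

The age threshold is there so that jobs which were enqueued a moment ago and are legitimately waiting are not enqueued a second time.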

I'm about to use this gem in a project and this issue worries me. Has anyone found a cause and a fix for this?

Owner

quirkey commented Jan 23, 2012

Are you sure that these jobs haven't just ended up as failed in Resque? If you look at the code of Resque and Resque::Status, it's pretty much impossible to 'lose' jobs. Resque::Status doesn't actually change how jobs are enqueued or processed. Rather, it wraps the perform method of your job class to give it a unique ID and a JSON/Hash in Redis to store metadata in. If you enqueue a job with Resque::Status but it fails in Resque before ever getting to perform (classes unavailable, load errors, Redis connection errors), then the status object will still exist (hence the list of queued jobs), but the job will just be in the Failed list in Resque.
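
One way to check this, as a rough sketch: compare the queued UUIDs against the failure records. It assumes each failure record is a hash whose `"payload"` holds the job's `"args"` with the status UUID as the first argument, matching the `Resque.enqueue(JobClass, uuid, job_args)` layout mentioned earlier in the thread. `queued_but_failed` is a hypothetical helper name, and the sample records are made up.

```ruby
# Hypothetical helper: returns the queued UUIDs that also appear as the first
# argument of a failed job's payload.
def queued_but_failed(queued_uuids, failures)
  failed = failures.map { |f| Array(f.dig("payload", "args")).first }.compact
  queued_uuids & failed
end

# Made-up failure records, shaped like the hashes Resque stores for failed jobs.
failures = [
  { "payload" => { "class" => "JobClass",
                   "args"  => ["f050dd20bc45012e1e77723c9193eb99", { "hypothetical_option" => 1 }] } },
  { "payload" => { "class" => "OtherJob", "args" => [] } }
]

result = queued_but_failed(
  ["f050dd20bc45012e1e77723c9193eb99", "hypothetical-uuid-2"],
  failures
)

# With Resque loaded, the records would presumably come from something like:
#   failures = Resque::Failure.all(0, Resque::Failure.count)
```

Any UUID in `result` is a job that shows as queued in its status but actually failed before reaching perform.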

@quirkey quirkey closed this Jan 23, 2012

I agree with @quirkey
@zoras @mhaylock Resque::Plugins::Status::Hash.expire_in does the job. If you set it to, for instance, 30 seconds (`Resque::Plugins::Status::Hash.expire_in = 30`), the status of the "missing" job will expire from Redis after that time.

zoras commented Jul 23, 2013

FYI, this is fixed in 0.3.3 dde6dca

I'm seeing the exact same problem now with Resque 1.24.1 and Resque-status 0.4.3. Job is "queued", but Resque considers it "processed". The job never ran.
