deleting lock for master? #152

Closed
afrozl opened this issue Aug 11, 2016 · 10 comments

@afrozl

afrozl commented Aug 11, 2016

I just noticed that delayed jobs are no longer getting scheduled. Is there a way to check and see if a scheduler lock is 'stuck'? I suspect that clearing the redis db would restart the scheduler polling, but I would like to avoid that, if possible.

@evantahler
Member

I'm going to bet that that is not the issue. If you are running more than one scheduler, yes, one will take the "master" role and lock the others out. However, that lock only exists for 3 minutes (unless you override the default).
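If you do want to override it, that's a scheduler constructor option. Roughly like this (just a sketch; the option is called masterLockTimeout in the node-resque versions I've looked at, and the connection details here are placeholders, so check against your own version and config):

// Sketch: start a scheduler with an explicit master-lock TTL.
const NodeResque = require("node-resque");

async function startScheduler() {
  const scheduler = new NodeResque.Scheduler({
    connection: { host: "127.0.0.1", port: 6379, namespace: "resque" },
    masterLockTimeout: 60 * 3, // seconds before the master lock expires (the 3-minute default)
  });
  await scheduler.connect();
  await scheduler.start();
  return scheduler;
}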

If you want to inspect the key in redis, it takes the form of self.connection.key('resque_scheduler_master_lock'), so with the default options that would be resque:resque_scheduler_master_lock.
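For example, with a plain redis client (this sketch assumes ioredis and the default resque namespace; use whatever client and namespace you actually have):

// Sketch: inspect the scheduler master lock.
const Redis = require("ioredis");

async function inspectMasterLock() {
  const redis = new Redis(); // defaults to 127.0.0.1:6379
  const key = "resque:resque_scheduler_master_lock";
  const holder = await redis.get(key); // which scheduler holds the lock, or null
  const ttl = await redis.ttl(key);    // seconds until it expires; -2 means the key is gone
  console.log({ holder, ttl });
  await redis.quit();
}

(A ttl of -1 means the key has no expiry set at all, which would line up with a lock that never goes away.)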

@afrozl
Author

afrozl commented Aug 11, 2016

I'm not sure how it got 'stuck' but that was indeed the case. I took a look at the redis key and it was locked by a scheduler that no longer existed. As soon as I deleted the key, all the delayed jobs kicked in.
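For anyone else who hits this: clearing it is just a single redis DEL on that key. Something along these lines works (same caveats as above; ioredis and the default namespace are assumptions):

// Sketch: clear a stuck master lock so the next live scheduler can take it.
const Redis = require("ioredis");

async function clearMasterLock() {
  const redis = new Redis();
  await redis.del("resque:resque_scheduler_master_lock");
  await redis.quit();
}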

afrozl closed this as completed Aug 11, 2016
@davbeck

davbeck commented Aug 4, 2017

I also ran into this. The lock had been stuck for a week.

@maxschmeling

My lock gets stuck all the time, and it's a major problem. I'll be digging into the code to see why, but if anyone has any thoughts on what's happening, I would appreciate it.

@evantahler
Member

9 times out of 10 it is improper shutdown behavior. How are you running your workers, and how long do you give them before the SIGKILL signal and a hard shutdown (kill -9)? How long is your average job duration?

@maxschmeling

@evantahler I'm quite certain I have invalid shutdown behavior, but I thought the timeout on the scheduler was to prevent improper shutdown from being an issue.

Some of our jobs are a couple minutes long, but most are less than a minute (and 10% or so are very short).

I'm running on Heroku. I'll look at my shutdown handling and see how it can be improved.

@maxschmeling

maxschmeling commented Jul 12, 2018

I ended up with something like this and it seems to be working OK for now. Just thought I'd share; I'm not at all saying this is the best or most correct way.

// Called from SIGINT and SIGTERM
async function gracefulShutdown(worker, scheduler, queue, librato) {
  // If shutdown hangs longer than shutdownTimeout (ms, defined elsewhere),
  // throw so the process dies instead of lingering.
  const stopProcessTimeout = function() {
    throw new Error("process stop timeout reached. Terminating now.");
  };
  setTimeout(stopProcessTimeout, shutdownTimeout);

  // Exit the process once the worker emits its exit event.
  worker.on("exit", process.exit);

  // Stop the worker, scheduler, and queue connections in parallel;
  // ending the scheduler releases the master lock.
  await Promise.all([worker.end(), scheduler.end(), queue.end()]);
}

@evantahler
Member

That's exactly what you should do if your application is controlled by signals (which it certainly is on Heroku). You should also stop your HTTP server and close any other connections you have open.
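Roughly, the wiring looks like this (an untested sketch; gracefulShutdown is the function from the comment above, and httpServer, worker, scheduler, queue, and librato stand in for whatever your app actually creates):

// Sketch: hook process signals up to the graceful shutdown above.
async function shutdown(signal) {
  console.log(`received ${signal}, shutting down`);
  // Stop accepting new HTTP connections before draining resque.
  await new Promise((resolve) => httpServer.close(resolve));
  await gracefulShutdown(worker, scheduler, queue, librato);
  process.exit(0);
}

process.on("SIGTERM", () => shutdown("SIGTERM")); // what Heroku sends on dyno shutdown
process.on("SIGINT", () => shutdown("SIGINT"));   // Ctrl-C locally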

Actionhero does something similar https://github.com/actionhero/actionhero/blob/master/initializers/resque.js#L160-L166 (where those async stop() methods are in a signal catch: https://github.com/actionhero/actionhero/blob/master/bin/methods/start.js#L133-L135)

@evantahler
Member

... would you mind contributing something to the README about this?

@maxschmeling

Absolutely. I'll send a pull request tomorrow or Monday.
