TTL for job in active queue #1504

Closed
ms10398 opened this issue Oct 14, 2019 · 9 comments

Comments

ms10398 commented Oct 14, 2019

How can I specify a TTL for a job? I looked through the docs but couldn't find anything about it, though I read that Bull has this feature: if a job stays in the active state for the specified time, it is terminated automatically.

I ran into this because the processor suddenly stops processing jobs and only resumes after a restart. No log or error is produced, but the processor gets stuck and jobs start piling up in the waiting queue.

Bull version

v3.11

Contributor

stansv commented Oct 14, 2019

Hi! An execution timeout is set with the timeout option; see the JobOpts description here: https://github.com/OptimalBits/bull/blob/develop/REFERENCE.md#queueadd

But hitting the timeout doesn't mean your processing procedure will be interrupted. Bull just registers a rejection callback for the job promise with setTimeout(), and that's all. If you have a long-running loop in your processor, it's better to check for the timeout manually on each iteration and exit explicitly.
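A minimal sketch of such a manual check, in plain JavaScript with no Bull dependency. The helper name processWithDeadline and its parameters are hypothetical, chosen only for illustration:

```javascript
// Hypothetical helper: handle `items` one by one, but stop as soon as
// `timeoutMs` milliseconds have passed since the call started.
function processWithDeadline(items, timeoutMs, handleItem) {
  const deadline = Date.now() + timeoutMs;
  const results = [];

  for (const item of items) {
    // Check the clock before every iteration and exit explicitly.
    if (Date.now() > deadline) {
      throw new Error("Processing timed out after " + timeoutMs + " milliseconds");
    }
    results.push(handleItem(item));
  }

  return results;
}

// Example: doubles each number well within the one-second budget.
console.log(processWithDeadline([1, 2, 3], 1000, (n) => n * 2));
```

If something like this is called from an async process callback, the thrown error rejects the job promise, so the job fails and the loop really does stop instead of running on in the background.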

There's also the possibility of executing jobs in a child Node process (https://github.com/OptimalBits/bull/blob/develop/docs/README.md#sandboxed-processors).
In this case you can try to kill the child process, but this is not implemented in Bull yet, so it would require manually tracking the mapping between each PID and job id.

Author

ms10398 commented Oct 14, 2019

Actually there is no long-running loop, but the jobs get stuck in the active state even though there is no error and the jobs are processed.

I am not sure right now why this is happening.

So the timeout will put the job in the failed queue?

Contributor

stansv commented Oct 14, 2019

Yes. Here is a working example:

import Bull from "bull";

const queue = new Bull('test');

// The processor takes about 10 seconds per job.
queue.process(async job => {
  console.log("Executing job " + job.id);
  await new Promise((resolve) => { setTimeout(resolve, 10000); });
  console.log("Finishing execution of job " + job.id);
});

// The job is allowed only 3 seconds before Bull marks it as failed.
queue.add({ foo: 'bar' }, { timeout: 3000 });

queue.on('completed', (job) => { console.log("Job " + job.id + " is completed"); });
queue.on('failed', (job) => { console.log("Job " + job.id + " is failed"); });

Output

Executing job 1
Job 1 is failed
Finishing execution of job 1

The job's failedReason in Redis is Promise timed out after 3000 milliseconds.

As you can see, Bull fails the job while the callback is still executing; the result is then ignored even if the callback completes successfully.
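The same effect can be reproduced without Bull. The sketch below is not Bull's actual code; withTimeout is a hypothetical helper that races a piece of work against a timer, and the delays are shortened to 30 and 100 milliseconds:

```javascript
// Hypothetical helper, NOT Bull's implementation: reject after `ms`
// milliseconds unless `work` settles first.
function withTimeout(work, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error("Promise timed out after " + ms + " milliseconds")),
      ms
    );
  });
  // Whichever promise settles first decides the outcome. The loser is
  // not cancelled; its eventual result is simply ignored.
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}

const events = [];

// Work that needs 100 ms, raced against a 30 ms timeout.
const slowWork = new Promise((resolve) => {
  setTimeout(() => {
    events.push("work finished");
    resolve("done");
  }, 100);
});

const raced = withTimeout(slowWork, 30).catch((err) => {
  events.push("timed out");
  return err;
});
```

Once both timers have fired, events contains "timed out" followed by "work finished": the timeout settles the outcome, but the underlying work still runs to the end.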

Author

ms10398 commented Oct 14, 2019

Thanks for the help!

Can you help me with the problem that led me to use timeouts?

The queue gets stuck with some job in the active state and the waiting queue starts piling up.

Contributor

stansv commented Oct 14, 2019

Debugging a particular case can be quite tricky, depending on what you're doing in the process callback. Are you using an async callback as in my example, or calling done() once the work is completed?

Anyway, the general advice is to add more logging to your procedure. Most likely a promise is getting abandoned somewhere (i.e. neither reject nor resolve is called).

Author

ms10398 commented Oct 14, 2019

I am calling done() when the process is completed.

Contributor

stansv commented Oct 14, 2019

I guess something goes wrong and your procedure fails before done() is called. The error is swallowed for some reason. Try wrapping your callback code in try-catch:


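queue.process((job, done) => {
  try {
    // ... your business logic goes here ...
    const result = null; // placeholder: whatever your logic produces

    // Success: hand the result to Bull so the job is marked completed.
    done(null, result);
  } catch (error) {
    // Failure: report the error so the job is marked failed, not left active.
    done(error);
  }
});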
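One caveat: a plain try-catch only sees errors thrown synchronously. If the business logic returns a promise, a rejection slips past it and done() is never called. A rough sketch of a wrapper that covers both cases follows; withDone is a hypothetical name, not part of Bull, and passing the wrapped function to queue.process() is an assumption about how you would wire it up:

```javascript
// Hypothetical wrapper, not part of Bull: calls `done` when the handler
// returns, throws synchronously, or returns a promise that settles.
function withDone(handler) {
  return (job, done) => {
    try {
      // Promise.resolve() puts sync and async handlers on the same path.
      Promise.resolve(handler(job)).then(
        (result) => done(null, result),
        (error) => done(error)
      );
    } catch (error) {
      // Synchronous throw from the handler.
      done(error);
    }
  };
}

// Example with a handler that rejects asynchronously.
const wrapped = withDone(async (job) => {
  throw new Error("failed while handling job " + job.id);
});

wrapped({ id: 1 }, (error) => {
  console.log("done called with: " + error.message);
});
```

This still cannot help when the promise never settles at all; that case is what the timeout option covers.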

Author

ms10398 commented Oct 14, 2019

I will try that and follow up if it solves our problem.

Thanks a lot for the quick response, really appreciated @stansv 👏

Author

ms10398 commented Oct 17, 2019

Right now it is working fine for us, so I will go ahead and close the issue. If any further help is needed, I will reopen it.

Thanks! 😄

ms10398 closed this as completed Oct 17, 2019