
Queue's asynchronous hooks aren't waiting to be finished! #2358

Closed
yagnesh-s-crest opened this issue May 20, 2022 · 5 comments
yagnesh-s-crest commented May 20, 2022

Description

When hooks are triggered on a job's lifecycle change, a triggered hook does not wait for its callback to finish before the next hook fires.

What is happening?

Currently, on a lifecycle change, the next hook is triggered without waiting for previously triggered hooks. For example:

  • I have an asynchronous worker function that returns its result in 5 seconds. The hooks I'm listening to are "active", "completed", and "failed".
  • I create a log entry in the database when the job starts (the "active" hook) and then update the job to add the newly created document id (logId) into the job data.
  • Say the worker callback finishes in 5 seconds while the "active" hook's callback takes 7 seconds. The next hook ("completed") is triggered right after 5 seconds even though the "active" hook's callback hasn't finished!

Expected behavior

I think it really should wait for the previous hook's callback before starting the next hook, if that fits the design. I'd like to know whether this is intended behavior or whether I'm doing something wrong. 👍

Minimal, working test code to reproduce the issue:

// Queue file - I have around 23 queues similar to this!
// (I needed to split them only to force concurrency to 1, but I'm still getting 3 active
// jobs at a time on the queue - well, that's a completely different issue though)

const Queue = require('bull')
const { pushWorker } = require('../../workers')

const { jobStarted, jobCompleted, jobFailed } = require('../../utils/queue.utils.js')

const queue = new Queue('some_queue_name', {
  redis: REDIS_URL, // REDIS_URL is defined elsewhere
})

queue.process(1, pushWorker.pushAbcToXyz)

queue.on('active', jobStarted)
queue.on('completed', jobCompleted)
queue.on('failed', jobFailed)

// and then I'm using queue.add somewhere else to bulk-push jobs into the queue (anywhere from 10 to over 10,000)


// pushWorker file
const pushAbcToXyz = async (job) => {
  // a code block with lots of db operations that take some time (let's assume 5 seconds)
  await new Promise((resolve) => setTimeout(resolve, 5000))
}

// queue.utils file (TL;DR)
const jobStarted = async (job) => {
  // a code block that creates a new log entry in the database along with some other
  // async work (let's assume it takes 7 seconds)
  await new Promise((resolve) => setTimeout(resolve, 7000))
  // then we're updating job's data
  job.data.logId = 'generated logId'
  await job.update(job.data)
}

const jobCompleted = async (job) => {
  const { logId } = job.data
  // where I actually need to use logId added from the jobStarted hook!
}

const jobFailed = async (job) => {}

Bull version - ^4.6.2

Additional information

  • Also, it would be nice if anyone could confirm the concurrency issue I mentioned just above 😃
manast commented May 20, 2022

This works as designed. I cannot imagine it working any other way: these are not "hooks" like those you may be familiar with from other frameworks such as Angular or Vue. Bull is a distributed system and these are "events", and they are triggered concurrently since you can potentially have thousands of jobs being processed in parallel.

Having said that, you are not supposed to use the events for updating database statuses and the like; that is not a robust way to design a system. You should update things inside the processor function so that you get guarantees that you do not get with events. I recommend you check some of the tutorials I wrote here: https://blog.taskforce.sh (they are for BullMQ but the principles apply to Bull too).
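A minimal sketch of that advice, with hypothetical createLog/completeLog/failLog helpers standing in for the real DB calls (none of these names come from Bull itself):

```javascript
// All lifecycle bookkeeping happens inside the processor, so each step is awaited
// in order and the job only completes/fails after the bookkeeping is done.
const makeProcessor = (work, { createLog, completeLog, failLog }) => async (job) => {
  const logId = await createLog(job) // replaces the "active" event handler
  try {
    const result = await work(job) // the actual job
    await completeLog(logId, result) // replaces the "completed" event handler
    return result
  } catch (err) {
    await failLog(logId, err) // replaces the "failed" event handler
    throw err // rethrow so Bull still marks the job as failed
  }
}

// e.g. queue.process(1, makeProcessor(pushAbcToXyz, { createLog, completeLog, failLog }))
```

Because everything is awaited inside one async function, the "completed" bookkeeping can never start before the "active" bookkeeping has finished.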

manast closed this as completed May 20, 2022
yagnesh-s-crest (Author) commented:

@manast, thanks for clearing that up quickly. I'll update my design to perform the DB updates inside the processor.

However, could you mention a few use cases where events are the right fit? Simply wrapping the processor in a try/catch handles the equivalent of the queue's completed/failed events, and executing something at the start of the processor is the equivalent of the active event, right?

manast commented May 20, 2022

You do not need to wrap the processor in try/catch, Bull takes care of it already and will complete or fail the job depending on any exceptions thrown.
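Conceptually, Bull awaits the promise returned by the processor and derives the job's final state from how that promise settles. A Bull-free sketch of that equivalence (runJob here is purely illustrative, not a Bull API):

```javascript
// Illustrative only: how a queue runtime can derive completed/failed from the
// processor's promise, so the processor itself needs no try/catch.
const runJob = async (processor, job) => {
  try {
    const result = await processor(job)
    return { status: 'completed', result }
  } catch (err) {
    return { status: 'failed', reason: err.message }
  }
}
```

A processor that resolves produces a completed job; one that throws (or rejects) produces a failed job, with the error message as the failure reason.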

yagnesh-s-crest (Author) commented:

@manast, yes, I understand that clearly. It's just that I need to mimic all three of those events inside the processor, as you mentioned (create a log when the worker starts and update the log status to complete/fail based on the processor's execution).

And I can simply retain Bull's default behavior by rethrowing the error I caught once I'm done with my DB action! So that's the plan for now.

I was just wondering about some examples where using events is beneficial for real-world problems!

manast commented May 20, 2022

Maybe you should take a look at Flows: https://docs.bullmq.io/guide/flows
