Jobs are undefined in the queue #1386

Closed
akirilyuk opened this issue Jul 15, 2019 · 12 comments

Comments

@akirilyuk
Contributor

Description

Hi, we started noticing errors in our production environment. Unfortunately this is the only piece of information I can share with you:

TypeError: client.isJobInList is not a function
at Object.isJobInList (/app/node_modules/bull/lib/scripts.js:11:19)
at Job._isInList (/app/node_modules/bull/lib/job.js:511:18)
at Job.isActive (/app/node_modules/bull/lib/job.js:350:15)
at result.then.state (/app/node_modules/bull/lib/job.js:383:27)
at process._tickCallback (internal/process/next_tick.js:68:7)

We also noticed that some jobs in the queue turn into empty objects.

Minimal code to reproduce

const jobs = await queue.getJobs();
try {
  const jobsWithState = await Promise.all(
    jobs.map(async job => {
      try {
        const jobState = await job.getState();
        return { ...job, jobState };
      } catch (err) {
        // this is thrown all the time, however the jobId is undefined
        warn('bull:getJobMetrics: Could not get Job State', {
          jobId: job.jobId
        });
        return { ...job, jobState: undefined };
      }
    })
  );
} catch (err) {
  // this is not thrown
  error('bull:getJobMetrics:', {
    errorMessage: err.message,
    errorTrace: err.stack
  });
  return null;
}

Bull version

3.10.0

@stansv
Contributor

stansv commented Jul 15, 2019

I cannot reproduce this. Can you please also attach the code that creates this queue?

Is this error thrown for all jobs returned by a single queue.getJobs() call, or does it appear only occasionally, on some of the jobs? Are the faulty jobs the same every time, or does the error appear randomly?

@manast
Copy link
Member

manast commented Jul 15, 2019

@stansv please contact me at manast@taskforce.sh

@akirilyuk
Contributor Author

Hi, sorry for the late reply, the last few days have been very busy.

This is how we initialise the queue.

const queue = new Queue(
  'jobs',
  `redis://${process.env.REDIS_HOST}:${process.env.REDIS_PORT}`,
  {
    settings: {
      stalledInterval: 5000,
      maxStalledCount: 9999,
      lockDuration: 60000,
      lockRenewTime: 30000
    }
  }
)

More information: our setup consists of two different systems. One is an API which simply creates job tasks and pushes them to Bull. The other is a worker server which processes the jobs.

We stop running jobs via the moveToCompleted function which is called with ignoreLock: true on the API side. Then the worker server listens to a global:completed event and calls the done callback to finish job processing on the worker side.

When the job is restarted (meaning there is already a done job in Redis), we make these calls:

await job.moveToCompleted('removed', true);
await job.remove();
await queue.add(
  {
    ...payload
  },
  {
    jobId: payload.id,
    attempts: 99999,
    backoff: {
      type: 'customBackoff'
    }
  }
)

@stansv
Contributor

stansv commented Jul 18, 2019

I still cannot reproduce your error.
Try adding a filter to queue.getJobs(), for example queue.getJobs(["active"]) or queue.getJobs(["completed"]); maybe this would help.

There are a number of pitfalls around the use of remove and moveToCompleted that may also lead to problems. I noticed that my sample code fails on remove if I don't specify the third parameter of moveToCompleted, notFetch=true, and also if the job is locked, so I'm not sure how your code works if you had to use ignoreLock on moveToCompleted.

I believe the best option for you would be to reorganize your code so that there is no need to forcibly "stop" jobs and remove them. Implement a check in the process function so that the job can tell it has been asked to complete, and let it finish normally.
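
A minimal sketch of the kind of check described above, assuming the work can be split into steps. The names runUntilStopped, steps, and isStopRequested are hypothetical and are not part of Bull; the helper only shows the pattern of checking a stop flag between steps and returning normally rather than being stopped from outside.

```javascript
// Hypothetical helper, not part of Bull: runs async steps in order and
// checks a caller-supplied stop flag before each one.
async function runUntilStopped(steps, isStopRequested) {
  const results = [];
  for (const step of steps) {
    // Cooperative check: if a stop was requested, finish normally
    // with whatever has been produced so far.
    if (await isStopRequested()) {
      return { stopped: true, results };
    }
    results.push(await step());
  }
  return { stopped: false, results };
}
```

A processor passed to queue.process could return the result of such a helper, with isStopRequested reading whatever signal the API side sets (for example a key in Redis). How that signal is stored is an assumption left open here.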

@akirilyuk
Contributor Author

We have a check in the process function. We listen to the global:completed event and call the done callback as soon as the event fires with the jobId of the currently executing job.

@Ginden

Ginden commented Oct 15, 2019

My failing code is:

const j: Job = await transporter.queue.getJob(jobId);
if (j && await j.isActive()) { // this line fails
  return;
}
queue.add(...)

I get the following stack trace:

TypeError: client.isJobInList is not a function
at Object.isJobInList (/home/futuremind_admin/services/zabka-csv-parser/node_modules/bull/lib/scripts.js:11:19)
at Job._isInList (/home/futuremind_admin/services/zabka-csv-parser/node_modules/bull/lib/job.js:511:18)
at Job.isActive (/home/futuremind_admin/services/zabka-csv-parser/node_modules/bull/lib/job.js:350:15)
at [my code]
at process._tickCallback (internal/process/next_tick.js:68:7)

I'm not sure what is happening here and how to debug it.

@manast
Member

manast commented Oct 15, 2019

You must wait for the queue to be ready. Try:

await transporter.queue.isReady();
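
Put together with the failing snippet, the workaround might look like the sketch below. The helper name isJobActive is hypothetical; it only relies on queue.isReady(), queue.getJob() and job.isActive(), which all appear in this thread, and it takes the queue as a parameter so it can be tried without a Redis connection.

```javascript
// Hypothetical helper: waits for the queue before reading job state.
async function isJobActive(queue, jobId) {
  // Workaround from this thread: make sure the queue has finished
  // initializing before calling getJob / isActive.
  await queue.isReady();
  const job = await queue.getJob(jobId);
  // getJob resolves to a falsy value when the job does not exist.
  return job ? job.isActive() : false;
}
```

On versions that include the fix for this issue the explicit isReady call should be redundant, but keeping it is harmless.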

@entrptaher

This should be added in the docs. ^ @manast ❤️

@stale

stale bot commented Jul 12, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Jul 12, 2021
@manast
Member

manast commented Jul 13, 2021

@entrptaher in reality the call to isReady should not be needed. This is actually a bug: getJob should check that the queue is ready before trying to do its thing.

@stale stale bot removed the wontfix label Jul 13, 2021
@manast manast added the bug label Jul 13, 2021
@dbousamra

We are seeing this too.

@manast manast closed this as completed in 2f27faa Aug 26, 2021
github-actions bot pushed a commit that referenced this issue Aug 26, 2021
## [3.29.1](v3.29.0...v3.29.1) (2021-08-26)

### Bug Fixes

* protect getJob with isReady, fixes [#1386](#1386) ([2f27faa](2f27faa))
@manast
Member

manast commented Aug 26, 2021

🎉 This issue has been resolved in version 3.29.1 🎉

The release is available on:

Your semantic-release bot 📦🚀
