Elasticache Jobs not always starting #658

Closed
dolpm opened this issue Jul 23, 2021 · 10 comments

@dolpm

dolpm commented Jul 23, 2021

I am using Bull to handle a checkout flow. Oddly enough, when using ElastiCache my queued jobs do not always start. It is a pretty basic configuration (shown below). The jobs are always added, and when one does not start, a user can restart the checkout flow, at which point there is again roughly a 50/50 chance that the new job will begin.

image

My development environment (redislabs) works just fine...

For ElastiCache I am connecting directly to a single shard. I have also tried connecting in cluster mode, and the same issue occurs. I use this version of IORedis in a number of other applications on this shard without issue.

Versions:
BullMQ: v1.40.0
IORedis: v4.27.6
ElastiCache engine version: v5.0.6

Update: getting this error after some time
image

@manast
Contributor

manast commented Jul 23, 2021

It seems to me like there is some kind of connection issue. Can you provide more information on what you mean by "restart the checkout flow"?

@dolpm
Author

dolpm commented Jul 23, 2021

In the context of Bull, restarting the checkout flow will mean that a new task gets added to the queue.

This code gets called and succeeds every time, BUT it is a toss-up whether the task will actually start:
image

Nothing is blocking the queue, though. All tasks will run immediately IF they are able to start (even after another doesn't start).

I agree that it is a connection issue; however, I have tried a number of different connections, all with the same result.

Cluster implementation:
image

@manast
Contributor

manast commented Jul 23, 2021

I still do not understand. Does "restart the checkout flow" mean adding a job? So you mean that after adding a new job, it may start processing older jobs again?

@dolpm
Author

dolpm commented Jul 23, 2021

No. It will not start processing older jobs, which is why it is confusing. Sometimes jobs that get added to the queue just never start.

Let's say a user is checking out an item, but the process breaks. The job will be added to the queue but never start processing.

After waiting for it to finish, they decide to try again. Let's say it is successful this time. In this case, a new job is added to the queue and processes immediately, as if the original never existed in the first place.

The original job that didn't start doesn't block the queue at all.

@manast
Contributor

manast commented Jul 23, 2021

But the Worker class is independent from the Queue class used for adding jobs, so that does not make any sense. I suggest the following: write the simplest code you can that just adds and processes dummy jobs (jobs that do not do anything). Since you say there is a 50% chance of this happening, you should be able to reproduce it easily. If you cannot reproduce it, then you know the issue is in the more complex code.

@dolpm
Author

dolpm commented Jul 23, 2021

Will do. Is it possible that this has something to do with re-using the same connection for the queue and the worker?
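One way to test the connection re-use question is to give the queue and the worker separate connections and see whether the behaviour changes. The following is only a sketch: `buildOptions` and its `createConnection` parameter are hypothetical names, not part of BullMQ. In the setup described in this thread, `createConnection` would presumably wrap something like `new IORedis(process.env.REDIS_URI)`.

```javascript
// Hypothetical helper (not a BullMQ API): builds option objects for a queue
// and a worker so that each one receives its own connection.
// `createConnection` is an assumed factory supplied by the caller; it must
// return a fresh connection on every call.
function buildOptions(prefix, createConnection) {
  return {
    // Options intended for the Queue constructor.
    queueOptions: { prefix, connection: createConnection() },
    // Options intended for the Worker constructor, with a separate connection.
    workerOptions: { prefix, connection: createConnection() }
  };
}
```

The two option objects would then be passed as the last argument to `new Queue(...)` and `new Worker(...)` respectively. If jobs still fail to start with separate connections, connection sharing is unlikely to be the cause.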

@dolpm
Author

dolpm commented Jul 23, 2021

image

It works well until it doesn't.

@manast
Contributor

manast commented Jul 23, 2021

Can you provide the code that reproduces it?

@dolpm
Author

dolpm commented Jul 23, 2021

/* eslint-disable import/prefer-default-export */
import { Queue, Worker } from 'bullmq';
import IORedis from 'ioredis';

import processCheckout from './worker';

const QUEUE_NAME = 'checkouts';
const QUEUE_PREFIX = '{checkouts}';

let connection;
if (
  process.env.NODE_ENV === 'production' ||
  process.env.NODE_ENV === 'staging'
) {
  connection = new IORedis.Cluster([{ host: process.env.REDIS_URI }], {
    slotsRefreshTimeout: 1500,
    scaleReads: 'all',
    redisOptions: {
      showFriendlyErrorStack: true,
      lazyConnect: true
    }
  });
} else {
  connection = new IORedis(process.env.REDIS_URI);
}

const CheckoutQueue = new Queue(QUEUE_NAME, {
  prefix: QUEUE_PREFIX,
  connection
});

/*
const worker = new Worker(QUEUE_NAME, processCheckout, {
  prefix: QUEUE_PREFIX,
  connection
});
*/

const testWorker = new Worker(
  QUEUE_NAME,
  async (job) => {
    console.log('job started:', job.id, '\n');
  },
  {
    prefix: QUEUE_PREFIX,
    connection
  }
);

setInterval(async () => {
  const job = await CheckoutQueue.add(new Date().toString(), {}, {});
  console.log('job added:', job.id);
}, 5000);
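
To see exactly which jobs were added but never started, the repro above could be paired with a small bookkeeping helper. This is a minimal sketch: `StartTracker` is a hypothetical class, not part of BullMQ or IORedis, and the timeout value is an arbitrary assumption.

```javascript
// Hypothetical helper (not part of BullMQ): remembers when each job was added
// and reports the ids that have not been seen starting within `timeoutMs`.
class StartTracker {
  constructor(timeoutMs) {
    this.timeoutMs = timeoutMs;
    this.pending = new Map(); // job id (as string) -> time it was added, in ms
  }

  // Call right after a job has been added to the queue.
  added(jobId, now = Date.now()) {
    this.pending.set(String(jobId), now);
  }

  // Call at the top of the worker's processor function.
  started(jobId) {
    this.pending.delete(String(jobId));
  }

  // Returns ids of jobs added at least `timeoutMs` ago that never started.
  neverStarted(now = Date.now()) {
    const ids = [];
    for (const [id, addedAt] of this.pending) {
      if (now - addedAt >= this.timeoutMs) ids.push(id);
    }
    return ids;
  }
}
```

In the repro, `tracker.added(job.id)` would go after `CheckoutQueue.add(...)` and `tracker.started(job.id)` inside the `testWorker` processor. Logging `tracker.neverStarted()` on each interval tick would then list the job ids that were queued but not picked up.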

@dolpm
Author

dolpm commented Jul 24, 2021

Ended up following the issue back to webpack.

@dolpm dolpm closed this as completed Jul 24, 2021