Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

random retryDelay #94

Closed
sylvainlap opened this issue Sep 30, 2022 · 4 comments
Closed

random retryDelay #94

sylvainlap opened this issue Sep 30, 2022 · 4 comments

Comments

@sylvainlap
Copy link

Hi.

I want to use redis-smq in my project, and wrote some tests to test the lib, but I got some random behavior with retryDelay.

In my config file, I have:

consumeOptions: {
    ttl: 0,
    retryThreshold: 3,
    retryDelay: 2000,
  },

I wrote an handler that will always fail:

async function handler(message, ack) {
  try {
    throw new Error('error');
  } catch (err) {
    debug(err);
    return ack(err);
  }
}

and register it
rsmq.consumer.consume(config.get('rsmq.queueName'), handler, () => {});

With the retryDelay, the handler should be called 3 times, at 2sec intervals. But the behavior is:

  • it is called the first time immediatly
  • then it is called a second time at a random moment, but I have too wait far more than 2sec
  • then it is called a third time at a random moment, but I have too wait far more than 2sec

So the retryThreshold is respected, but not the retryDelay. How comes ?

@weyoss
Copy link
Owner

weyoss commented Oct 2, 2022

@sylvainlap Thank you for reporting this issue.

I believe that retryDelay should be also respected. But from your experience, that you reported, it sounds to me that something has definitely went wrong regarding the retryDelay option.

I will investigate the issue ASAP.

Further updates will be posted later, so keep tuned.

@weyoss weyoss added bug and removed bug labels Oct 2, 2022
@weyoss
Copy link
Owner

weyoss commented Oct 2, 2022

This is not a bug.

After a quick checkup, I came to the conclusion that issue you reported is actually a normal behavior and it is not a bug. Let me make it more clear to you.

Here the script I used for testing:

import { Consumer, Producer,  Message, QueueManager } from 'redis-smq';

Message.setDefaultConsumeOptions({
  ttl: 0,
  retryThreshold: 5,
  retryDelay: 2000,
});

const queue = `queue_${Date.now()}`;

const consume = () => {
  const consumer = new Consumer();
  consumer.consume(
    queue,
    (msg) => {
      console.log(`Received message ID ${msg.getId()} at ${Date.now()}`);
      throw new Error('Explicit error');
    },
    (err) => {
      if (err) console.log(err);
    },
  );
  consumer.run();
};

const produce = () => {
  const producer = new Producer();
  producer.run((err) => {
    if (err) console.log(err);
    else {
      const msg = new Message().setBody('test message').setQueue(queue);
      producer.produce(msg, (err) => {
        if (err) console.log(err);
        else
          console.log(
            `Message ID ${msg.getId()} has been published at ${Date.now()}.`,
          );
      });
    }
  });
};

QueueManager.createInstance({}, (err, queueManager) => {
  if (err) console.log(err);
  else {
    queueManager?.queue.create(queue, false, (err) => {
      if (err) console.log(err);
      else {
        produce();
        consume();
      }
    });
  }
});

You should first understand the way RedisSMQ works.

RedisSMQ uses workers for various background tasks and one of those workers is the DelayWorker which responsibility is to handle unacknowledged messages that need to be re-queued with a certain delay, given that retryThreshold is > 1 and retryDelay > 0.

Another crucial moment that needs to be taken into account is that the workers are allowed to run only from a single consumer instance, given many consumer instances. So for each consumer instance, before running background workers, it tries to indefinitely acquire a global lock. Once the lock is acquired, the workers are launched. An acquired lock is set to expire after 60 seconds.

When you run the script above for the first time, a fresh lock is acquired and the DelayWorker is immediately launched. As a result the retryDelay is respected. If you terminate the running script by ^C (control+c) from the console, the consumer instance crashes and the lock is not released.

After that if you run the script for the second time, the lock is not acquired immediately because the previous lock, which was not released, has not yet expired. Once the lock is expired, the consumer acquires a new lock and from then the message is re-queued with respect to retryDelay.

Additionally, it is worth to mention that JavaScript clocks are not precise and are allowed to drift randomly. So when you set the retryDelay to some value like 2000, it is not guaranteed that the message will be re-queued exactly after 2 secs of delay.

@weyoss
Copy link
Owner

weyoss commented Oct 3, 2022

Closing as resolved.

@weyoss weyoss closed this as completed Oct 3, 2022
@sylvainlap
Copy link
Author

Great response ! Thanks !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants