We recently switched from a self-hosted Redis instance to the Azure Cache for Redis SaaS offering.
At first, everything worked as expected. A couple of days ago, the queue started behaving in weird ways: sometimes adding a job works flawlessly, but adding the exact same job again shortly afterwards sporadically results in the job failing immediately.
We added console logging to the worker events (`active`, `error`, etc.) as well as to the processing code, but we get absolutely no output. It looks like the job processing code doesn't run at all. What makes this even weirder is that the job fails with an error message that is clearly from our job processing code, yet logging something right before throwing that error does not yield any output. We also added `job.log()` calls in our worker event handlers, but the log on the failed job is empty.
So, to sum this up: The jobs seem to fail sporadically without even invoking the processing code and pulling some (maybe old?) error message seemingly out of thin air.
I switched back to the self-hosted Redis instance and the problem went away immediately.
I've used bullmq (always with self-hosted Redis instances) for quite some time in different projects now and never saw anything like this.
Is it possible that the bullmq data in the Azure Redis instance somehow got corrupted or something like that? Does this ring any bell for the maintainers?
Thank you! 🙏
It is difficult to make a good assessment without more information, such as the actual error messages on the failed jobs. One thing I would check in your case is that the eviction policy (`maxmemory-policy`) of the hosted Redis instance is set to `noeviction` (https://redis.io/docs/manual/eviction/). Otherwise Redis will behave like a cache and evict random keys, which can of course make the queue behave very strangely.
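As a quick check, something like the following sketch can be used with `redis-cli` (the hostname and credentials are placeholders; note that some managed offerings, including Azure Cache for Redis, restrict the `CONFIG` command, in which case the policy has to be inspected in the provider's portal settings instead):

```shell
# Query the eviction policy of the Redis instance.
# Replace host, port, and password with your own values.
redis-cli -h your-cache.redis.cache.windows.net -p 6380 --tls -a "$REDIS_PASSWORD" \
  CONFIG GET maxmemory-policy
# For BullMQ, the reply should be "noeviction"; any other policy
# (e.g. "allkeys-lru") means Redis may silently delete queue keys.
```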
If the setting is correct, I advise you to create a smaller test app using the hosted Redis instance and keep adding features until you trigger the strange behaviour. Usually that gives you the information needed to sort out the problem.
Of course it was all our fault: A second (older) instance of our application was still running elsewhere and was pointing to the same Azure Redis instance, basically snatching jobs from the other system. Doh! 😂