[Bug]: moveToFailed causes Out of memory on millions of keys #2424
Hi @manast, thanks for your quick reply. A couple of things to add:
Ok, if using Dragonfly it could run the same script in parallel, but even in this case it would not be more than the number of threads used by Dragonfly, which would still be a relatively low number (as they normally match the number of cores in the CPU). I will try to bring somebody from Dragonfly over to see if they can shed some light on what could cause this out-of-memory.
Hey there! I'm Shahar from the Dragonfly team :)
Hi @chakaz! Thanks for jumping in. I'll try to give you as much detail as possible. Yes, our Dragonfly is dockerized; we run it in an orchestrated environment on a bare-metal machine with 32 GB of RAM.
This is its PS: we need to use the extra We were running 2 queues in parallel that had already digested something like: That's when the trouble comes: we re-run the whole thing with Queue4. Hope this helps.
Another log, this one from Dragonfly itself:
@ivnnv I think it is worth creating an issue in Bull Arena's repo then, or using Taskforce.sh instead 😉
@chakaz we have upgraded Dragonfly to the latest version, increased memory to 10 GB, and got rid of bull-arena (and removed the
@manast thanks for the suggestion and for sharing and maintaining Bull for years. We will definitely consider giving Taskforce a try once we make our first dollar out of this still-side-project of ours.
Best of luck in your endeavor, keep us posted :) |
Version
5.1.9
Platform
NodeJS
What happened?
We have a large server running a BullMQ instance with usually millions of records in "waiting" or "prioritized", which eventually get completed and deleted, but new ones are being created in waiting/prioritized all the time.
It eventually crashes giving this stacktrace:
Trying to follow the rabbit hole (correct me if I'm wrong), it appears that many of the scripts load the whole set of keys into memory, instead of just taking the specific job entry from the `active` set and swapping it to `failed`. Obviously this is just my initial impression, as I didn't dig too deeply into the code. But anyway, long story short: it seems that `moveToFailed` causes Out of Memory on a server with millions of keys (even if the server has lots of memory available for both Redis and Node).
How to reproduce.
No response
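The reporter's suspicion (a whole-set load versus a single-member move) can be illustrated with a small plain-JavaScript sketch. This is not BullMQ's actual Lua code; `moveAllNaive`, `moveOne`, and the `Map` stand-in for a Redis sorted set are hypothetical, and only contrast the two access patterns in question:

```javascript
// Illustrative sketch only: contrasts the two Redis access patterns the
// report suspects. Not BullMQ code; all names here are hypothetical.

// Stand-in for a Redis sorted set: Map of jobId -> score.
function makeSet(n) {
  const s = new Map();
  for (let i = 0; i < n; i++) s.set(`job:${i}`, i);
  return s;
}

// Pattern A (suspected): materialize EVERY member before acting on one.
// Peak memory grows linearly with the key count (like ZRANGE 0 -1).
function moveAllNaive(active, failed, jobId) {
  const everything = [...active.entries()]; // O(n) copy of the whole set
  for (const [id, score] of everything) {
    if (id === jobId) {
      failed.set(id, score);
      active.delete(id);
    }
  }
  return everything.length; // how many entries were materialized
}

// Pattern B (desired): touch only the one job (like ZSCORE + ZADD + ZREM).
function moveOne(active, failed, jobId) {
  const score = active.get(jobId); // O(1) lookup
  if (score === undefined) return 0;
  failed.set(jobId, score);
  active.delete(jobId);
  return 1; // only one entry touched
}

const active = makeSet(1_000_000);
const failed = new Map();
console.log('naive touched:', moveAllNaive(active, failed, 'job:42'));
console.log('single touched:', moveOne(active, failed, 'job:43'));
```

With a million jobs, pattern A materializes a million entries to move one of them, which matches the symptom described: memory pressure that scales with the total number of keys rather than with the work being done.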
Relevant log output
No response
Code of Conduct