Worker count for a code package? #64
Comments
Hi @calebclark, we will soon have a feature made just for this (scaling workers based on queue size), but for now, would a scheduled worker do the trick? For example, schedule a worker every minute and have it run until the queue is empty. If the queue grows, you may get more than one running at once (if the first one doesn't finish within a minute), which is actually good, as they will go through the queue faster. Sort of an auto-scaling hack.
No. Our volume is going to be very uneven: we'll likely have long stretches of time with no tasks and then suddenly need to fire up 10 or 20 workers to handle the volume. A scheduled worker would do little for us during the stretches of no work and underperform during the heavy volume.
How about a master/slave pattern (very common)? The scheduled worker (the master) starts every minute, checks the queue-info endpoint to get the queue size, and then spawns some number of slave workers: if the queue is small, queue up one slave; if it's large, queue up a bunch.
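A minimal sketch of that master logic (Python; `get_queue_size` and `spawn_slave` are hypothetical stand-ins for the IronMQ queue-info and IronWorker task-queue calls, and the 100-messages-per-worker ratio is just an assumed example):

```python
MESSAGES_PER_WORKER = 100  # assumed ratio: one slave per 100 queued messages


def slaves_to_spawn(queue_size, messages_per_worker=MESSAGES_PER_WORKER):
    """Return how many slave workers the master should queue up."""
    if queue_size <= 0:
        return 0
    # One slave per full (or partial) batch of messages (ceiling division).
    return -(-queue_size // messages_per_worker)


def run_master(get_queue_size, spawn_slave):
    """Scheduled master: check the queue size, spawn the matching slaves."""
    count = slaves_to_spawn(get_queue_size())
    for _ in range(count):
        spawn_slave()
    return count
```

In a real worker, `spawn_slave` would queue an IronWorker task for the slave code package rather than call a local function.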
It seems that under sustained volume, if the master executes that logic with no knowledge of how many slaves are already running, it's only a matter of time before the ratio approaches one worker per message.
How about a simple slave counter in IronCache?
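One way to make the master aware of running slaves, along the lines of that counter suggestion, is a shared atomic counter that each slave increments on start and decrements on exit. This is a sketch under assumptions: `InMemoryCache` stands in for a remote shared cache such as IronCache, and the batch size and cap are illustrative numbers, not anything from this thread.

```python
class InMemoryCache:
    """Stand-in for a shared cache (e.g. IronCache) with an atomic counter.
    In a real deployment every worker would hit the same remote cache."""

    def __init__(self):
        self.counts = {}

    def increment(self, key, amount=1):
        self.counts[key] = self.counts.get(key, 0) + amount
        return self.counts[key]


def spawn_with_cap(cache, queue_size, messages_per_worker=100, max_slaves=20):
    """Spawn only enough slaves to cover the backlog, capped, and adjusted
    for slaves already running (each slave decrements the counter on exit)."""
    running = cache.counts.get("slaves", 0)
    wanted = -(-queue_size // messages_per_worker)  # ceiling division
    to_spawn = max(0, min(wanted, max_slaves) - running)
    for _ in range(to_spawn):
        cache.increment("slaves", 1)  # the slave calls increment("slaves", -1) when done
    return to_spawn
```

With the counter in place, a second master run against the same unchanged backlog spawns nothing, which avoids the runaway ratio described above.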
I don't think it would ever get to one worker per message unless you had some bad scheme for doing this. I don't know your use case, but here's a simple example. Let's say you spawn one worker for every 100 messages in the queue. Then:

- 100 messages in the queue → the master spawns 1 worker
- 500 messages → 5 workers
- 2,000 messages → 20 workers

And so on, and so on.
+1 to Travis's last solution. One note: have your slaves loop with a "while the queue is not empty" condition, rather than processing a fixed N messages and exiting — but still spawn slaves by the number of messages per worker, as Travis recommended. That gives you automatic scaling on the worker side. The right number of messages per worker (or the number of slaves to launch) depends on how long each message takes to process. If the tasks are really heavy, you may hit your concurrency limit (which depends on the plan you're using). So if you can predict your average high message count and the time to process each message (or each batch of N messages), you can calculate the maximum concurrency you'll need during queue spikes. And keep in mind that's an average high, so adding a safety factor to the concurrency is probably a good idea.
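The "run until the queue is empty" slave loop described above might look like this (Python sketch; `dequeue` is a hypothetical stand-in for an IronMQ get-message call that returns `None` once the queue is empty):

```python
def run_slave(dequeue, process):
    """Drain the queue: keep pulling messages until none remain, then exit."""
    handled = 0
    while True:
        msg = dequeue()
        if msg is None:  # queue is empty -> slave exits
            break
        process(msg)
        handled += 1
    return handled
```

Because every slave exits as soon as the queue is drained, over-spawning a slave or two during a spike costs little — they just find an empty queue and stop.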
Concurrency calculation example: start from your expected peak queue size and processing time per message, then divide the total work by how fast you need the spike cleared, given N messages per worker.
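As a worked version of that calculation, under assumed numbers (the peak size, processing time, and drain target below are illustrative, not figures from this thread):

```python
# Assumed figures -- replace with your own measurements.
peak_queue_size = 6000      # messages in the queue during a spike
seconds_per_message = 2     # average processing time per message
target_drain_seconds = 600  # want the spike cleared within 10 minutes

# Total work during the spike, in worker-seconds:
total_work_seconds = peak_queue_size * seconds_per_message  # 12000 s

# Concurrency needed to drain the spike in time (ceiling division):
concurrency = -(-total_work_seconds // target_drain_seconds)

# Safety factor, as suggested above, since the peak is only an average high:
concurrency_with_margin = int(concurrency * 1.5)
```

With these numbers that's 20 concurrent workers, or 30 with the safety margin — which is the figure to check against the concurrency limit of your plan.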
Of course you can extend your master worker with more complex prediction. IronCache would be a good place to store that kind of information, and a free account will be enough. ;)
We are following Iron's suggestion of pushing high-volume tasks to a queue on IronMQ and then firing up workers to process those messages. However, there is no way to easily see how many workers are running or queued for a specific code package. The only way we could find is to request all tasks and loop through them, counting the tasks whose code_id matches the one we're looking for.
This doesn't seem very efficient, either for us or for Iron. What would you recommend? We don't want to fire up a new worker whenever there are messages in IronMQ, because we will have a high volume of messages that each require a very small amount of processing.
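The workaround described above — listing every task and counting matches client-side — can be sketched as follows. `list_tasks` is a hypothetical stand-in for the IronWorker task-list API call, and the status names are assumed:

```python
def count_tasks_for_code(list_tasks, code_id, states=("queued", "running")):
    """Count tasks belonging to one code package by scanning the full task
    list. Inefficient: every task must be fetched and inspected client-side."""
    return sum(
        1
        for task in list_tasks()
        if task["code_id"] == code_id and task["status"] in states
    )
```

The cost is proportional to the total task count across all code packages, which is exactly why it feels wasteful for both sides.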