Worker count for a code package? #64

Open
calebjclark opened this issue Apr 11, 2013 · 8 comments

@calebjclark

We are following Iron's suggestion of pushing high-volume tasks to a queue on IronMQ and then firing up workers to process those messages. However, there is no easy way to see how many workers are running or queued for a specific code package. The only way we could find is to request all tasks and loop through them, counting how many contain the code_id we're looking for.

This doesn't seem very efficient, either for us or for Iron. What would you recommend? We don't want to fire up a new worker whenever there are messages in IronMQ, because we will have a high volume of messages that each require a very small amount of processing.

@treeder (Contributor) commented Apr 11, 2013

Hi @calebjclark, we will soon have a feature made just for this (scaling workers based on queue sizes), but for now, would a scheduled worker do the trick? Like schedule a worker every minute and have it run until the queue is empty. If the queue grows, you may get more than one running at once (if the first one doesn't finish within a minute), which is good, though, as it will go through the queue faster. Sort of an auto-scaling hack.
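
A minimal sketch of that drain-until-empty worker in Python, using the standard library's queue as a stand-in for IronMQ (the real IronMQ client calls will differ):

```python
import queue

# Stand-in for an IronMQ queue; swap in the real IronMQ client in production.
mq = queue.Queue()

def process(message):
    print("processing", message)  # application-specific work goes here

# The scheduled worker: runs every minute, drains the queue, then exits.
def run():
    while True:
        try:
            msg = mq.get_nowait()  # grab the next message if there is one
        except queue.Empty:
            break                  # queue is empty: let this run die
        process(msg)
        mq.task_done()             # acknowledge (with IronMQ: delete the message)
```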

@calebjclark (Author)

No. Our volume is going to be very uneven. We'll likely have long stretches of time with no tasks and then suddenly need to fire up 10 or 20 workers to handle the volume. A scheduled worker would do little for us during the stretches of no work and underperform during heavy volume.

@treeder (Contributor) commented Apr 11, 2013

How about a master/slave pattern (very common)? The scheduled worker, the master, starts every minute, checks the queue info endpoint to get the queue size, then spawns some number of slave workers. If the number is small, it queues up just one slave; if it's large, it queues up a bunch.
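
A sketch of that master in Python, assuming hypothetical queue_info_size() and spawn_slave() helpers standing in for the IronMQ queue-info and IronWorker task-queuing calls (not the actual client APIs), with one slave per 100 queued messages as in the example further down:

```python
import math

MSGS_PER_SLAVE = 100  # one slave per 100 queued messages; tune per workload

def queue_info_size():
    """Hypothetical: hit the IronMQ queue info endpoint, return the size."""
    ...

def spawn_slave():
    """Hypothetical: queue one slave task via the IronWorker API."""
    ...

# The master, scheduled to run every minute.
def run_master():
    size = queue_info_size() or 0
    n_slaves = math.ceil(size / MSGS_PER_SLAVE)  # 200 msgs -> 2, 1000 -> 10
    for _ in range(n_slaves):
        spawn_slave()
```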

@calebjclark (Author)

It seems that under sustained volume, if the master executes the logic you describe with no knowledge of how many slaves are already running, it's only a matter of time before the ratio approaches one worker per message.

@carimura (Contributor)

A simple slave counter in IronCache?
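
A sketch of that idea in Python, assuming hypothetical cache_increment() and cache_get() wrappers over IronCache's atomic increment (not the actual client API): each slave bumps a shared counter on start and decrements it on exit, and the master subtracts the running count before spawning.

```python
RUNNING_KEY = "slaves_running"  # counter key in IronCache

def cache_increment(key, amount):
    """Hypothetical wrapper over IronCache's atomic increment."""
    ...

def cache_get(key):
    """Hypothetical wrapper: current counter value, or None if unset."""
    ...

def drain_queue():
    """The while-the-queue-is-not-empty loop sketched earlier."""
    ...

# Slave side: bracket the work so the counter survives failures.
def run_slave():
    cache_increment(RUNNING_KEY, 1)
    try:
        drain_queue()
    finally:
        cache_increment(RUNNING_KEY, -1)

# Master side: only spawn enough slaves to cover the shortfall.
def slaves_to_spawn(queue_size, msgs_per_slave=100):
    needed = -(-queue_size // msgs_per_slave)  # ceiling division
    running = cache_get(RUNNING_KEY) or 0
    return max(needed - running, 0)
```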

@treeder (Contributor) commented Apr 11, 2013

I don't think it would ever get to one worker per message unless you had a bad scheme for doing this. I don't know your use case, but here's a simple example:

Let's say you spawn one worker for every 100 messages in queue. Then:

time 0, queue size 0: no slaves
time 60, queue size 500: +5 slaves = 5 slaves
# if the queue keeps growing even with 5 workers cranking through it, you must have a lot of messages coming in, so case A:
time 120, queue size 1000: +10 slaves = 15 slaves
# if 5 workers can reduce the queue, then case B:
time 120, queue size 200: +2 slaves = 7 slaves
# now either 7 workers can eat through the queue or they can't, but let's say they can, so they all die off and we're at 0 slaves by the time the next master runs:
time 180, queue size 0: +0 slaves = 0 slaves

And so on, and so on.

@featalion

+1 to Travis' last solution. One note: have your slaves loop with a "while the queue is not empty" condition rather than taking a fixed N messages and exiting, but spawn slaves based on the number of messages per worker, as Travis recommended. That way you get automatic scaling on the worker side. The number of messages per worker (and therefore the number of slaves to launch) depends on how long each message takes to process. If the tasks are really heavy, you may hit your concurrency limit (which depends on the plan you're using). So if you can predict the average high number of messages and the time to process each one, you can calculate the maximum concurrency you'll need during message-queue spikes (and keep in mind that it's an average high, so adding a safety factor to the concurrency is probably a good idea).

@featalion

Concurrency calculation example:

  • Assume the queue's average spike level is 3000 msgs/s
  • Assume one worker can process 10 msgs/s
  • Assume your master worker is scheduled to run every minute (60 s)
  • Assume a safety factor of 0.15 (15%) for concurrency

Then peak concurrency:

concurrency = (1 + 0.15) * 3000 / 10 = 345

Each worker handles 10 msgs/s * 60 s = 600 messages per master run, so the number of slaves to spawn each minute is:

N_slaves = N_msgs_now_in_queue / 600

Of course, you can extend your master worker with more complex prediction. IronCache would be a pretty good place to store that kind of information, and a free account will be enough ;)
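
The same arithmetic as a quick Python sketch, with the assumed numbers above as inputs (the 4200 queued messages is just a hypothetical snapshot):

```python
# Assumed inputs from the example above.
spike_msgs_per_sec = 3000   # average spike inflow
worker_msgs_per_sec = 10    # throughput of one worker
master_period_sec = 60      # the master runs once a minute
safety_factor = 0.15        # 15% headroom

# Peak concurrency needed to keep up with the spike.
peak_concurrency = (1 + safety_factor) * spike_msgs_per_sec / worker_msgs_per_sec
print(peak_concurrency)     # 345.0

# Messages one worker gets through per master run.
msgs_per_worker = worker_msgs_per_sec * master_period_sec
print(msgs_per_worker)      # 600

# Slaves to spawn for the current queue depth (hypothetical snapshot).
n_msgs_now_in_queue = 4200
n_slaves = -(-n_msgs_now_in_queue // msgs_per_worker)  # ceiling division
print(n_slaves)             # 7
```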
