Pooling tasks issue #16

zatsepinvl · 2019-04-25T07:28:07Z

Hi! I have noticed that the implementation of Pooler assumes it works on only one task at a time:
https://github.com/piercus/step-function-worker/blob/master/lib/pooler.js#L71
'pool should not be called when task on going'

What is the reason for this approach?

piercus · 2019-04-25T09:30:40Z

@zatsepinvl Thanks for your question

You can run multiple tasks at a time using the worker's concurrency option:

var worker = new StepFunctionWorker({
  activityArn : '<activity-ARN>',
  workerName : 'workerName',
  fn : fn,
  concurrency : 2 // default is 1
});

You can increase it, for example to concurrency: 100, to run many tasks in parallel.

1 worker = multiple poolers = multiple parallel tasks.

1 pooler = 0 or 1 task

Horizontal scaling (running tasks in parallel) is managed by the number of poolers in the worker, which the worker's concurrency parameter controls; see https://github.com/piercus/step-function-worker/blob/master/lib/worker.js#L88

The word "pooler" comes from the concept of long polling, which is used to request tasks from AWS Step Functions.

The pooler's role is to ask the AWS Step Functions activity for a task. Once a task is associated with the pooler, the pooler is "taken" and should not call the "pool" method again until that task is done.
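
To make the relationship concrete, here is a minimal sketch of the model described above, assuming a plain in-memory array stands in for the AWS activity. SketchPooler, SketchWorker, finish and poolAll are hypothetical names for illustration and are not the library's API; only the error message comes from the pooler.js line linked in the question.

```javascript
// Simplified, hypothetical model of "1 pooler = 0 or 1 task".
// This is not the library's code; the queue array stands in for the AWS activity.
class SketchPooler {
  constructor(queue) {
    this.queue = queue; // pending tasks shared by all poolers
    this.task = null;   // a pooler holds either no task or exactly one
  }

  // Ask the queue for a task. Refuses to run while a task is still attached.
  pool() {
    if (this.task !== null) {
      throw new Error('pool should not be called when task on going');
    }
    this.task = this.queue.length > 0 ? this.queue.shift() : null;
    return this.task;
  }

  // Mark the current task as finished, which frees the pooler.
  finish() {
    const done = this.task;
    this.task = null;
    return done;
  }
}

class SketchWorker {
  constructor(queue, concurrency = 1) {
    // One pooler per unit of concurrency.
    this.poolers = Array.from({ length: concurrency }, () => new SketchPooler(queue));
  }

  // Let every free pooler try to pick up one task.
  poolAll() {
    for (const p of this.poolers) {
      if (p.task === null) p.pool();
    }
  }

  get runningTasks() {
    return this.poolers.filter(p => p.task !== null).map(p => p.task);
  }
}

const queue = ['t1', 't2', 't3', 't4', 't5'];
const worker = new SketchWorker(queue, 2);
worker.poolAll();
console.log(worker.runningTasks); // [ 't1', 't2' ]: two in flight, three still queued
```

With concurrency set to 2, no more than two tasks can be in flight, however many are waiting in the queue.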

Any suggestion is welcome

zatsepinvl · 2019-04-25T21:53:43Z

Great explanation, thanks!

My issue is about the following case.

Imagine that about 10k step function executions are running at the same time. Every execution includes an activity task, and handling each task may mean waiting a long time for user input (up to the activity task timeout). So, if each pooler holds either 0 or 1 task, the actual number of concurrent executions is limited by the worker's concurrency configuration. Is it right?

What is the maximum value of concurrency?

And more generally, can it be implemented as 1 pooler to 0 or n tasks so execution concurrency is limited only by runtime performance capabilities?

piercus · 2019-04-26T08:21:24Z

> Is it right?

Yes, that is right.

> What is the maximum value of concurrency?

Each pooler creates an HTTP request, so concurrency is limited by the maximum number of HTTP requests your environment allows.

> And more generally, can it be implemented as 1 pooler to 0 or n tasks so execution concurrency is limited only by runtime performance capabilities?

I agree we need to change the design.

Use case infos

I'd like to know more about your use case. step-function-worker is useful when the processing does not "fit" into a Lambda function. In my personal use cases, the processing was CPU-intensive and I could not run 10k tasks in parallel, so I used very low concurrency values (1, 2, 3 max). Can you please explain your use case in more detail? Do you prefer activities over Lambda, or is there a specific reason that makes Lambda unusable in your case?

Proposal

Here is a proposal for a new design. Please confirm that it addresses your concerns.

Actual

- The worker's concurrency parameter limits 2 different things:
  - The number of parallel pooling requests made to the AWS Step Functions activity
  - The number of parallel tasks
- It is not possible to have 'unlimited' parallel tasks
- The default number of parallel tasks is 1
- It is not possible to have more tasks than the maximum number of HTTP connections
- A task is a 0-1 child of a pooler

Expected

- 2 different concurrency parameters should be used:
  - concurrency should be marked as deprecated and replaced by taskConcurrency
  - taskConcurrency (1 by default, for backward compatibility) will manage the number of concurrent tasks
  - poolConcurrency (1 by default) will manage the number of parallel poolers
- It is possible to have 'unlimited' parallel tasks (by using taskConcurrency: null)
- The default number of parallel tasks is 1
- It is possible to have as many parallel tasks as needed
- Tasks and poolers are 0-n children of the worker (worker.tasks and worker.poolers)
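
The split proposed above could look roughly like the following sketch. It is only an illustration under stated assumptions: taskConcurrency and poolConcurrency are the names from the proposal, while SplitWorker, poolRound, hasCapacity and finish are hypothetical, and an in-memory array again stands in for the AWS activity.

```javascript
// Hypothetical sketch of the proposed design, not the released implementation.
// Tasks belong to the worker, so the number of running tasks no longer
// depends on the number of poolers.
class SplitWorker {
  constructor(queue, { poolConcurrency = 1, taskConcurrency = 1 } = {}) {
    this.queue = queue;                     // stand-in for the AWS activity
    this.poolConcurrency = poolConcurrency; // parallel polling requests
    this.taskConcurrency = taskConcurrency; // null means no cap on running tasks
    this.tasks = [];                        // running tasks, children of the worker
  }

  hasCapacity() {
    return this.taskConcurrency === null || this.tasks.length < this.taskConcurrency;
  }

  // One polling round: each pooler fetches at most one task, and only while
  // the worker still has room for another running task.
  poolRound() {
    for (let i = 0; i < this.poolConcurrency; i++) {
      if (!this.hasCapacity() || this.queue.length === 0) break;
      this.tasks.push(this.queue.shift());
    }
  }

  // Remove a finished task so its slot can be reused.
  finish(task) {
    this.tasks = this.tasks.filter(t => t !== task);
  }
}

// No task cap: running tasks keep growing with every polling round.
const pending = Array.from({ length: 10 }, (_, i) => `task-${i}`);
const unlimited = new SplitWorker(pending, { poolConcurrency: 2, taskConcurrency: null });
unlimited.poolRound();
unlimited.poolRound();
unlimited.poolRound();
console.log(unlimited.tasks.length); // 6

// With a cap of 3, polling stops as soon as three tasks are running.
const jobs = Array.from({ length: 10 }, (_, i) => `job-${i}`);
const capped = new SplitWorker(jobs, { poolConcurrency: 2, taskConcurrency: 3 });
capped.poolRound();
capped.poolRound();
console.log(capped.tasks.length); // 3
```

The point of the sketch is that the two limits move independently: a small poolConcurrency keeps the number of open polling requests low while taskConcurrency decides how many tasks may wait on slow work.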

Next step

Please confirm or comment on this proposal.
If you want to open a PR for this, I would be glad to review it, but please include unit tests.
Otherwise I will do this redesign myself, but I can't start working on it before mid-May 2019.

zatsepinvl · 2019-04-26T09:37:37Z

Use case infos

My use case is a chatbot. I have a step function that describes the flow and an activity that acts as a message dispatcher. The dispatcher is responsible for listening to user messages and responding to them. The process can be described like this: [task] Some question for the user to fill in -> [activity] (dispatcher polls for a task -> sends the message from the task to the user -> waits for the response -> returns the user's response as the result of the task). Assuming the chatbot is used by 1m users, 10k active executions is quite reasonable. In this case the number of active executions is the number of users that the service can process.
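
As a rough illustration of this flow, the sketch below keeps one open task per user who has been asked a question and has not answered yet. Everything in it is an assumption made for the example: the (input, cb) handler shape, the input fields userId and question, and the helpers sendToUser and onUserMessage, with an in-memory outbox standing in for the chat channel.

```javascript
// Hypothetical dispatcher sketch; names and the handler shape are assumptions.
const outbox = [];                // stand-in for messages sent to the chat channel
const waitingForUser = new Map(); // userId -> callback that completes the task

function sendToUser(userId, text) {
  outbox.push({ userId, text });
}

// Activity handler: forward the question, then keep the task open
// until the user replies.
function dispatcherFn(input, cb) {
  sendToUser(input.userId, input.question);
  waitingForUser.set(input.userId, cb);
}

// Called when a chat message arrives; completes the matching task, if any.
function onUserMessage(userId, text) {
  const cb = waitingForUser.get(userId);
  if (!cb) return false;
  waitingForUser.delete(userId);
  cb(null, { answer: text });
  return true;
}

const results = [];
const collect = (err, output) => results.push(output);
dispatcherFn({ userId: 'u1', question: 'What is your name?' }, collect);
dispatcherFn({ userId: 'u2', question: 'What is your name?' }, collect);
console.log(waitingForUser.size); // 2: both tasks stay open while the users type
onUserMessage('u1', 'Alice');
console.log(waitingForUser.size); // 1
```

Each open entry in waitingForUser corresponds to one task occupying a slot, which is why the number of users served at once is bounded by the number of tasks the worker may hold.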

Proposal

The expected design looks appropriate for my use case.

piercus · 2019-05-20T08:17:59Z

Hello @zatsepinvl

I have released 3.0 alpha on https://github.com/piercus/step-function-worker/releases/tag/v3.0-alpha.

Does it address your concerns?

zatsepinvl · 2019-06-05T13:03:42Z

Thank you! It looks appropriate.

piercus · 2021-10-19T20:01:28Z

@zatsepinvl It's been a while, but for information, v3.0 has been released today

Thank you for your help on this

piercus mentioned this issue Apr 30, 2019

I encountered the following error in docker. #17

Closed

piercus mentioned this issue May 17, 2019

Redesign #19

Merged

zatsepinvl closed this as completed Jun 5, 2019

piercus added a commit that referenced this issue Oct 19, 2021

BREAKING CHANGE: Redesign the concurrency architecture for 3.0 follow…

73b9e04

…ing #16

piercus added a commit that referenced this issue Oct 19, 2021

feat: Redesign

d24c32f

BREAKING CHANGE: this is a breaking change Redesign the concurrency architecture for 3.0 following #16

piercus added a commit that referenced this issue Oct 19, 2021

Merg master\n\nBREAKING CHANGE: Redesign the concurrency architecture…

fa252fa

… for 3.0 following #16
