Fan-out / Fan-in is slow due to internal overhead #81

Closed
johnvarney opened this Issue Oct 29, 2017 · 5 comments

johnvarney commented Oct 29, 2017

I am running a test based on the fan-out/fan-in sample, using a 1,600-item array of 3- or 4-character strings.
The activity function just returns the string that was passed in as input, taking between 0 and 94 ms per call.
There appears to be a huge amount of overhead: the orchestrator function replays many times, and the whole run takes about 180 to 240 seconds to complete.
I can see the history in the Azure storage account when I run it locally for testing, but I do not see the history when it runs in Azure.
I was expecting completion to take a small multiple of the individual activity function time, since the activities would be spawned very quickly and would run in parallel.
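
For context, the test follows the documented fan-out/fan-in pattern, roughly along these lines (a minimal sketch — the `FanOutFanIn` and `Echo` names are just placeholders, not my exact code):

```csharp
using System.Linq;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;

public static class FanOutFanInTest
{
    [FunctionName("FanOutFanIn")]
    public static async Task<string[]> Run(
        [OrchestrationTrigger] DurableOrchestrationContext context)
    {
        // The input is the ~1,600-item array of short strings.
        string[] items = context.GetInput<string[]>();

        // Fan out: schedule one activity call per item.
        var tasks = items
            .Select(item => context.CallActivityAsync<string>("Echo", item))
            .ToList();

        // Fan in: wait for every activity to complete.
        await Task.WhenAll(tasks);

        return tasks.Select(t => t.Result).ToArray();
    }

    [FunctionName("Echo")]
    public static string Echo([ActivityTrigger] string input)
    {
        // The activity just returns its input; each call takes 0–94 ms.
        return input;
    }
}
```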

cgillum commented Oct 30, 2017

Thanks for opening this issue. I suspect the bottleneck isn't so much the parallelism as the "fan-in" part, which is single-threaded and currently involves a large amount of storage I/O.

Just for context, we haven't done any performance optimization yet. I suspect the thing that's killing perf in this specific scenario is the fact that we're re-fetching the instance history every time the orchestrator receives a batch of response messages. Some internal caching could really help us here because there is a ton of Azure Storage I/O that we're doing unnecessarily (we wanted to focus on correctness of behavior before doing perf work).
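
In rough pseudocode, the current dispatch loop behaves something like the sketch below. This is purely illustrative — `LoadHistoryFromStorageAsync`, `ReplayOrchestrator`, and the cache are made-up names, not the actual Durable Task Framework APIs:

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// Illustrative only: shows the I/O pattern described above, not real framework code.
public class DispatchSketch
{
    // Hypothetical in-memory history cache keyed by orchestration instance ID.
    private readonly ConcurrentDictionary<string, List<string>> historyCache =
        new ConcurrentDictionary<string, List<string>>();

    public async Task ProcessResponseBatchAsync(string instanceId, IReadOnlyList<string> batch)
    {
        // Today: each batch of activity responses triggers a full re-read of the
        // instance history from Azure Storage before the orchestrator is replayed.
        List<string> history = await LoadHistoryFromStorageAsync(instanceId);

        // With an in-memory cache, the history replayed last time could be reused and
        // only the new events appended, eliminating most of the unnecessary storage I/O:
        //   List<string> history = historyCache.GetOrAdd(instanceId, _ => new List<string>());

        ReplayOrchestrator(history, batch);
    }

    private Task<List<string>> LoadHistoryFromStorageAsync(string instanceId)
    {
        // Placeholder for the round trip to Azure Table storage.
        return Task.FromResult(new List<string>());
    }

    private void ReplayOrchestrator(List<string> history, IReadOnlyList<string> newEvents)
    {
        // Placeholder for the single-threaded orchestrator replay.
    }
}
```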

I've opened a separate issue to track the performance work I mentioned: #82. I'll keep this issue open as well to track making sure we improve your specific scenario.

cgillum self-assigned this Oct 30, 2017

cgillum changed the title from "Fan-in fan-out test" to "Fan-out / Fan-in is slow due to internal overhead" on Oct 30, 2017

NTTLuke commented Feb 26, 2018

Hi @cgillum,
I have the same issue (#92) with the fan-out/fan-in pattern.
I saw it was closed with a reference to this one.
Any news?
Thank you

cgillum commented Feb 26, 2018

Nothing to share yet. Things have been a little slow on this front but will be picking up again very soon. The main point of contention for this work item is deciding whether to do a partial tactical performance fix or do a larger, longer-term performance boost on the Durable Task Framework itself. Stay tuned. This will definitely be addressed for GA.

andreujuanc commented Apr 12, 2018

I would prefer a long-term solution for GA.

We are dealing with a beta anyway, so it's understandable.

cgillum commented May 1, 2018

A fix has been made that will vastly improve performance in the next release. See this PR: #289

cgillum closed this May 1, 2018
