CRDT Slower as workers are added. #5
Comments
That sounds like a great plan! If it is helpful, the timely logging infrastructure produces streams of scheduling events, message communication and receipt events, stuff like that. It can be helpful to determine what is on the critical path. For inspiration, maybe check out https://github.com/MaterializeInc/materialize/tree/master/src/dataflow/src/logging
Thanks for the tips! I'll look into it.
The reason for this seems to be that the program as written exhibits two iterative scopes that need to perform many thousands of iterations. The ancestor collapsing takes over 7000 iterations, and the "blank star" collapsing takes over 13000 iterations. The control flow aspects of these iterations take some time (tens of microseconds, it seems) and they are inherently sequential rather than parallel. The problem can be fixed by using a different algorithm for these stages. If you collapse these paths using an iterated contraction algorithm, as in the
So, certainly some better scaling here, and just generally better performance as well (the overhead of the loops is large, even ignoring the lack of scaling).
I'm going to close this out, as I believe the mystery has been resolved!
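To illustrate why the choice of algorithm matters for the iteration counts mentioned above, here is a minimal sketch in plain Rust (not differential dataflow, and not the code from this issue). It compares following one parent link per round against a contraction that doubles each pointer's reach per round. The function names and the chain-shaped input are assumptions made for the example.

```rust
/// Follow one parent link per round until nothing changes.
/// Returns the resolved roots and the number of rounds that changed something.
/// The round count grows linearly with the longest path.
fn collapse_one_link_per_round(parent: &[usize]) -> (Vec<usize>, usize) {
    let mut current = parent.to_vec();
    let mut rounds = 0;
    loop {
        // Every node moves to the parent of the node it currently points at.
        let next: Vec<usize> = current.iter().map(|&c| parent[c]).collect();
        if next == current {
            return (current, rounds);
        }
        current = next;
        rounds += 1;
    }
}

/// Contract paths by pointer doubling: each round, every node jumps to the
/// node its current target points at, so the distance covered doubles.
/// The round count grows logarithmically with the longest path.
fn collapse_by_doubling(parent: &[usize]) -> (Vec<usize>, usize) {
    let mut jump = parent.to_vec();
    let mut rounds = 0;
    loop {
        let next: Vec<usize> = jump.iter().map(|&j| jump[j]).collect();
        if next == jump {
            return (jump, rounds);
        }
        jump = next;
        rounds += 1;
    }
}

/// Hypothetical worst-case input: one chain 0 <- 1 <- ... <- n-1,
/// where node 0 is its own parent (the root).
fn chain(n: usize) -> Vec<usize> {
    (0..n).map(|i| i.saturating_sub(1)).collect()
}
```

On `chain(7001)` the first function needs 6,999 rounds and the second needs 13, and both arrive at the same roots. In a dataflow setting each round carries fixed coordination overhead, so fewer rounds matters more than the extra work done per round.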
Following up on TimelyDataflow/differential-dataflow#273 and giving a more concrete example. CRDT seems to exhibit particularly poor scaling. Adding additional workers results in worse run times:
Looking at some perf flame graphs of one worker versus eight workers:
One Worker
Eight Workers
For eight workers, it seems a lot of time is spent in `step_or_park` but not actually stepping. Instrumenting the `advance` function with an atomic counter:

I'll try to figure out why `advance` is being called so much with eight workers.
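For readers who want to try the same kind of measurement, the following is a minimal sketch of counting calls with an atomic counter using only the Rust standard library. The `advance` function here is a hypothetical stand-in, not the function from timely or differential dataflow, and `run_workers` is an assumed test harness.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

// Global call counter shared by all worker threads.
static ADVANCE_CALLS: AtomicUsize = AtomicUsize::new(0);

// Hypothetical stand-in for the function being instrumented.
fn advance(state: &mut u64) {
    // Relaxed ordering is enough for a pure event count.
    ADVANCE_CALLS.fetch_add(1, Ordering::Relaxed);
    *state = state.wrapping_add(1);
}

// Spawn `workers` threads that each call `advance` `calls_per_worker` times,
// then report the total number of calls observed.
fn run_workers(workers: usize, calls_per_worker: usize) -> usize {
    ADVANCE_CALLS.store(0, Ordering::Relaxed);
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            thread::spawn(move || {
                let mut state = 0u64;
                for _ in 0..calls_per_worker {
                    advance(&mut state);
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    // Joining the threads makes all their increments visible here.
    ADVANCE_CALLS.load(Ordering::Relaxed)
}
```

Comparing the counter's value between a one-worker run and an eight-worker run shows whether the extra time corresponds to extra calls or to slower individual calls.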