Is this a bug, feature request, or feedback?
Enhancement/performance improvement
What is the current behavior?
The collective muting & unmuting behavior of DataChannel actors needs to be reduced. The effective startup time of Wallaroo can be dramatically affected by small changes in pipeline structure, such as where a sub-pipeline is `.merge()`d into a main pipeline.

What is the expected behavior?
Less-than-exponential(-seeming) growth in startup time as the number of Wallaroo workers grows.
What OS and version of Wallaroo are you using?
Ubuntu Bionic/18.04 LTS + Wallaroo @ commit 35d2038
Steps to reproduce?
See README.md in tarball at http://wallaroolabs-dev.s3.amazonaws.com/scott/count2.tar.gz. Instructions include options for building & running a demonstration test via a VM or Docker.
Test output such as https://gist.github.com/slfritchie/5d6cbcb243d8197f61c659445feb9454 shows a set of results from starting a 2-, 4-, or 8-worker cluster whose application pipeline includes 3 sources and two `.merge()` operations on sub-pipelines. As soon as worker0 polls ready, a single operation is sent to the first source, and the sink's output is shown with a timestamp.

Relevant times to examine (and note that each test was run only once):

- Time from the first worker logging `|~~ INIT PHASE IV: Cluster is ready to work! ~~|` to the last worker logging the same line.
- Time from the last worker's ready log to the processing time of the single work item processed by Wallaroo.

In the source as-is, those times are given in the gist linked above.
If the source is modified as suggested in the README.md file, moving the `.merge(other)` call from line 57 to line 62, i.e., immediately before the `.to_sink()` call, then the times drop significantly.

The timing with the `(*)` label is the most anomalous: it is too large compared to the typical latency. But even these simple results show that a small change in the pipeline definition can cut the end-to-end time for the first processed item from roughly 10 seconds down to about 1.3 seconds.
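For orientation, the shape of the change is roughly the following illustrative pseudocode. This is not the actual count2 source; the stage and variable names are placeholders, and only the `.merge(other)` and `.to_sink()` calls are taken from the report:

```
# Before: the sub-pipeline is merged early (around line 57 of the
# count2 source), ahead of the remaining stages of the main pipeline.
main = (source
        .merge(other)      # sub-pipeline merged early
        .later_stages()    # placeholder for the remaining steps
        .to_sink(sink))

# After: the merge is moved to immediately before the sink (around line 62).
main = (source
        .later_stages()    # placeholder for the remaining steps
        .merge(other)      # merge just before to_sink()
        .to_sink(sink))
```

The observed timings are consistent with the merge point mattering: merging earlier places the join point behind more downstream stages, which appears to increase the DataChannel mute/unmute coordination during startup.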