This fork builds upon the original repository and implements further algorithms for analyzing stream processors. It currently focuses on version 0.9 of Timely Dataflow and Differential Dataflow and won't refrain from breaking existing upstream abstractions (even though these should be relatively easy to add back in later).
To try out SnailTrail, decide between online (via TCP) and offline (from file) mode.

### Offline
1. Start a computation you would like to log. For example, run `cargo run --example triangles <input file> <batch size> <load balance factor> <#computation workers>` from within the adapter crate. E.g., `cargo run --example triangles livejournal.graph 100 3 -w2` will load `livejournal.graph` for use in the triangles computation, which is started with a batch size of 100. It is distributed over two workers, each of which writes its events out to three files.
   If you don't have the triangles computation ready, you can instead try a very basic, easily tweakable log with `cargo run --example custom_operator 3 -- -w2`.
2. Run SnailTrail to create a PAG: `cargo run --example inspect <# SnailTrail workers> <# of (simulated) source computation workers> <from-file?>`. For example, `cargo run --example inspect 2 6 f` will run SnailTrail with two workers, reading from the 6 files (2 workers * 3 load balance factor) generated in step 1.
   This creates a PAG as a differential `Collection`; to log and use it, tweak the `inspect` example.
### Online

1. Start SnailTrail similarly to Offline step 2, but without the `from-file?` flag set, e.g. `cargo run --example inspect 2 6`.
2. Start the computation you would like to log with the `SNAILTRAIL_ADDR` env variable set, e.g. just like in offline mode: `env SNAILTRAIL_ADDR=localhost:8000 cargo run --example triangles livejournal.graph 100 3 -w2`
   This creates a PAG; to log and use it, tweak the `inspect` example.
Further debug logging of the examples and SnailTrail is provided via Rust's `log` and `env_logger` crates. Setting the `RUST_LOG` env variable (e.g. to `info`, `debug`, or `trace`) when running the examples writes additional logging to your console.
## Show me the code!

Check out the Structure section of this README for a high-level overview.
The "magic" mostly happens in `timely-adapter/src/connect.rs` for (1) logging a computation and (2) connecting to it from SnailTrail, in the PAG creation code, and in the `custom_operator.rs` example tying it all together.
Disclaimer: The PAG creation currently only creates local edges. (I still need to update the remote edge creation for all the new stuff I added; it is currently commented out.)
## Structure

(Roughly in order of appearance)

### In this repository

| Crate | Description |
| ----- | ----------- |
| `timely-adapter` | timely / differential 0.9 adapter |
|  | Shared definitions of core data types and serialization of traces. |
|  | PAG generation & algorithms for timely 0.9 with epochal semantics. |

### From the original repository

| Crate | Description | Availability |
| ----- | ----------- | ------------ |
| adapter | Flink adapter | not publicly available |
| adapter | Adapter for Timely < 0.9 | not publicly available |
| adapter | Heron adapter | not publicly available |
|  | Shared definitions of core data types and serialization of traces (in Rust, Java). | |
|  | Constructs the Program Activity Graph (PAG) from a flat stream of events which denote the start/end of computation and communication. Also has scripts to generate various plots. | |
|  | Calculates a ranking for PAG edges by computing how many times an edge appears in the set of all-pairs shortest paths ("critical participation", cf. the paper). | |
Adapters read log traces from a stream processor (or a serialized representation) and convert the logged messages to a `LogRecord` representation. This representation can then be used for PAG construction.
Depending on the stream processor, window semantics also come into play here. For example, the `timely-adapter` currently uses an epoch-based window, which should make many algorithms on the PAG easier than working on a fixed-window PAG.
Glue code, type definitions, (de)serialization, and intermediate representations that connect adapters to algorithms.
Implementation of various algorithms that run on top of the PAG to provide insights into the analyzed distributed dataflow's health and performance.
See the `docs` subfolder for additional documentation. Of course, also check out the examples and the code documentation built with `cargo doc`.
- Hoffmann et al.: SnailTrail Paper (NSDI '18)
- Malte Sandstede: A Short Introduction to SnailTrail (ETH '19)
- Vasia Kalavri: Towards self-managed, re-configurable streaming dataflow systems (UGENT '19)
- Moritz Hoffmann: SnailTrail: Generalizing Critical Paths for Online Analysis of Distributed Dataflows (NSDI '18)
- Vasia Kalavri: Online performance analysis of distributed dataflow systems (O'Reilly Velocity London '18)
SnailTrail is primarily distributed under the terms of both the MIT license and the Apache License (Version 2.0), with portions covered by various BSD-like licenses.
See LICENSE-APACHE and LICENSE-MIT for details.