Skip to content

Commit

Permalink
documentation
Browse files Browse the repository at this point in the history
  • Loading branch information
frankmcsherry committed May 12, 2019
1 parent b7b1465 commit 6693238
Showing 1 changed file with 39 additions and 0 deletions.
39 changes: 39 additions & 0 deletions src/operators/arrange/mod.rs
@@ -1,4 +1,43 @@
//! Types and traits for arranging collections.
//!
//! Differential dataflow collections can be "arranged" into maintained, worker-local
//! indices that can be re-used by other dataflows at relatively low cost.
//!
//! The `arrange` operator, and its variants, takes a `Collection` and produces as an
//! output an instance of the `Arrangement` type. An arrangement is logically equivalent
//! to its input collection, but it is distributed across workers and maintained in a
//! way that makes it easy to re-use.
//!
//! The `arrange` operator receives update triples `(data, time, diff)` from its input,
//! and responds to changes in its input frontier, which as it advances signals further
//! times that will no longer be observed in input updates. For each frontier advance,
//! the operator creates a new "batch", containing exactly those updates whose times are
//! in advance of the previous frontier but not in advance of the new frontier. Updates
//! are partitioned among workers by a key, and each batch is indexed by this key.
//!
//! This sequence of batches defines a continually expanding view of committed updates
//! in the collection.
//! The sequence is presented by the `Arrangement` in two forms (its fields):
//!
//! 1. A timely dataflow `Stream` of batch elements.
//!
//! The stream is used by operators that want to exploit the arranged structure of
//! batches, but want the push-based computational model of timely dataflow.
//! Many differential dataflow operators can consume streams of batches, although
//! they may also rely on access to the second representation of the sequence.
//!
//! 2. A `Trace` type that provides a compact representation of the accumulated batches.
//!
//! A trace is logically equivalent to a sequence of batches, but it is able to alter
//! the representation for efficiency. In particular, the trace may merge batches so
//! that the total number is kept small, and it may merge logical times if it able to
//! determine that no trace users can distinguish between them.
//!
//! Importantly, the `Trace` type has no connection to the timely dataflow runtime.
//! This means a trace can be used in a variety of contexts where a `Stream` would not be
//! appropriate, for example outside of the dataflow in which the arragement is performed.
//! Traces may be directly inspected by any code with access to them, and they can even be
//! used to introduce the batches to other dataflows with the `import` method.

use std::rc::{Rc, Weak};
use std::cell::RefCell;
Expand Down

0 comments on commit 6693238

Please sign in to comment.