An ERDOS operator receives data on :pyReadStreams <erdos.ReadStream>
, and sends processed data on :pyWriteStreams <erdos.WriteStream>
. We provide a standard library of operators for common dataflow patterns under :pyerdos.operators
. While the standard operators are general and versatile, some applications may implement custom operators to better optimize performance and take fine-grained control over exection.
All operators must inherit from the :py~erdos.Operator
base class and implement :py~erdos.Operator.__init__
and :py~erdos.Operator.connect
methods.
- :py
~erdos.Operator.__init__
takes all :pyReadStreams <erdos.ReadStream>
from which the operator receives data, all :pyWriteStreams <erdos.WriteStream>
on which the operator sends data, and any other arguments passed when calling :py~erdos.Operator.connect
. Within :py~erdos.Operator.__init__
, the state should be initialized, and callbacks may be registered across :pyReadStreams <erdos.ReadStream>
. - The :py
~erdos.Operator.connect
method takes :pyReadStreams <erdos.ReadStream>
and returns :pyWriteStreams <erdos.WriteStream>
which are all later passed to :py~erdos.Operator.__init__
by ERDOS. The :pyReadStreams <erdos.ReadStream>
and :pyWriteStreams <erdos.WriteStream>
must appear in the same order as in :py~erdos.Operator.__init__
.
While ERDOS manages the execution of callbacks, some operators require more finegrained control. Operators can take manual control over the thread of execution by implementing :pyOperator.run() <erdos.Operator.run>
, and pulling data from :pyReadStreams <erdos.ReadStream>
. Callbacks are not invoked while run executes.
erdos.Operator
erdos.OperatorConfig
Full example at python/examples/simple_pipeline.py.
_literalinclude/python_examples/simple_pipeline.py
_literalinclude/python_examples/simple_pipeline.py
_literalinclude/python_examples/simple_pipeline.py