## Use gQuant to Build cuStreamz Examples

The stream processing market is growing rapidly as businesses rely more heavily on real-time analytics, inferencing, monitoring, and more. [cuStreamz](https://medium.com/rapids-ai/gpu-accelerated-stream-processing-with-rapids-f2b725696a61) is the first GPU-accelerated streaming data processing library. Written in Python, it is built on top of RAPIDS, the GPU-accelerated suite of data science libraries. cuStreamz accelerates [Streamz](https://streamz.readthedocs.io/en/latest/), a Python library that helps you build pipelines to manage continuous streams of data, by using RAPIDS cuDF under the hood to perform computations on streaming data on GPUs.

It is easy to wrap cuStreamz streaming pipelines into gQuant nodes and visualize them. In this notebook, we build a stream pipeline example in gQuant step by step. You will learn how to write a basic gQuant node, register it for the gQuant JupyterLab plugin, and use the GPU to accelerate the computations.

Let's first import the streamz libraries and gQuant libraries.

In [1]:
import streamz 
import streamz.dataframe
import cudf

In [2]:
from gquant.dataframe_flow import Node
from gquant.dataframe_flow import NodePorts, PortsSpecSchema, TaskSpecSchema
from gquant.dataframe_flow import ConfSchema
from gquant.dataframe_flow import TaskGraph

## Double the Streaming Numbers by Streamz 

A gQuant node is defined by inheriting from the `Node` class and overriding methods `init`, `meta_setup`, `ports_setup`, `conf_schema`, `process`.

The `ports_setup` must return an instance of `NodePorts` which encapsulates the ports specs. Ports specs are dictionaries with port attributes/options per `PortsSpecSchema`.

The `process` method receives an input dictionary where keys are input ports and values are input data. It returns a dictionary where the keys correspond to the output ports.

The `meta_setup` is used to compute the output meta information. It returns a dictionary where keys correspond to the output ports.

The `conf_schema` method defines the Node configuration [JSON schema](https://json-schema.org/). The gQuantlab UI uses the [RJSF](https://github.com/rjsf-team/react-jsonschema-form) project to generate HTML form elements based on the JSON schema. The [RJSF playground](https://rjsf-team.github.io/react-jsonschema-form/) is a good place to learn how to write a JSON schema and visualize it. `conf_schema` returns a `ConfSchema`, which encapsulates the JSON schema and the UI schema together.
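To make the JSON schema and UI schema pair more concrete, here is a minimal sketch of the two dictionaries for a node with a single integer option. The property name `window`, its title, default, and minimum are hypothetical values chosen for illustration; `ui:widget` with the value `updown` is an RJSF UI schema option. In a node, dictionaries like these would be passed to `ConfSchema(json=..., ui=...)`.

```python
import json

# Hypothetical configuration schema for a node with one integer option.
# "title", "type", "properties", "default", "minimum" and "required" are
# standard JSON Schema keywords; the property name "window" is an example.
json_schema = {
    "title": "Example node configure",
    "type": "object",
    "properties": {
        "window": {
            "type": "integer",
            "title": "Window size",
            "default": 50,
            "minimum": 1,
        }
    },
    "required": ["window"],
}

# RJSF UI schema: render the integer field with the "updown" spinner widget.
ui_schema = {
    "window": {"ui:widget": "updown"}
}

# Both parts are plain, JSON-serializable dictionaries.
print(json.dumps(json_schema, indent=2))
print(json.dumps(ui_schema, indent=2))
```

Pasting the two printed documents into the RJSF playground is a quick way to preview the generated form.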

The `column` and `port_types` information is sometimes determined dynamically. gQuant provides a few utility functions to help get dynamic graph information. `self.get_connected_inports()` returns a dictionary where keys are connected inport names and values are inport types.
`self.get_input_meta()` returns a dictionary where keys are connected inport names and values are column name/type pairs from the parent node. `self.outport_connected(port_name)` returns a boolean indicating whether the output port `port_name` is connected.

As a `hello-world` example for streamz, we will build an example that doubles a continuous stream of numbers and prints them. We need three nodes for it: a source, a map stream, and a sink.

### Define the Source Stream Node

In [3]:
class StreamNode(Node):

    def ports_setup(self):
        input_ports = {}
        output_ports = {
            'stream_out': {
                PortsSpecSchema.port_type: streamz.Stream
            }
        }
        return NodePorts(inports=input_ports, outports=output_ports)

    def conf_schema(self):
        json = {
            "title": "Source node configure",
            "type": "object",
            "properties": {}
        }
        ui = {}
        return ConfSchema(json=json, ui=ui)

    def init(self):
        self.required = {}

    def meta_setup(self):
        columns_out = {
            'stream_out': {'element': 'number'}
        }
        return columns_out

    def process(self, inputs):
        output = {'stream_out': streamz.Stream()}
        return output


As shown above, this source node has one output port, which outputs a `streamz.Stream` object. It has no configuration, so we leave the schema empty. In the `meta_setup` method, we specify that the element type in the stream is a number. A downstream node can connect to it only if its `self.required` has the same key-value pair. A `Stream` can carry many types of data, so knowing that something is a `streamz.Stream` is not enough to know what is actually being streamed. The columns setup enables meta type-checking enforcement: above, the stream output is expected to contain `{'element': 'number'}`, which is just a custom type specification. When the outputs are dataframes (pandas, cudf, dask dataframes), the columns have a concrete meaning, i.e. which columns are present in the dataframes and what their types are.

In the `process` method, `StreamNode` outputs a `streamz.Stream()` as the source end of the pipeline. Later, we will use the `emit` method to add numbers to it.
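The matching rule between a downstream node's `self.required` and an upstream node's output meta can be sketched in plain Python. This is a simplified illustration that assumes a port is compatible when every required key-value pair appears in the provided meta. The helper `meta_matches` is hypothetical and is not part of gQuant, whose actual validation may differ.

```python
def meta_matches(required, provided):
    """Return True if every required key/value pair appears in provided.

    Hypothetical helper for illustration only; not the gQuant implementation.
    """
    return all(
        key in provided and provided[key] == value
        for key, value in required.items()
    )


# Output meta declared by the source node above.
source_meta = {'element': 'number'}

print(meta_matches({'element': 'number'}, source_meta))   # True
print(meta_matches({'element': 'numbers'}, source_meta))  # False
print(meta_matches({}, source_meta))                      # True
```

An empty requirement matches anything, which is why a node that requires nothing on a port can connect to any upstream stream.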

### Define the Double Element Node

In [4]:
class TransformNode(Node):

    def ports_setup(self):
        input_ports = {
            'stream_in': {
                PortsSpecSchema.port_type: streamz.Stream
            }
        }
        output_ports = {
            'stream_out': {
                PortsSpecSchema.port_type: streamz.Stream
            }
        }
        return NodePorts(inports=input_ports, outports=output_ports)

    def conf_schema(self):
        json = {
            "title": "Transform Node configure",
            "type": "object",
            "properties": {}
        }
        ui = {}
        return ConfSchema(json=json, ui=ui)

    def init(self):
        self.required = {}

    def meta_setup(self):
        self.required = {
            'stream_in': {'element': 'number'}
        }
        columns_out = {
            'stream_out': {'element': 'number'}
        }
        return columns_out

    def process(self, inputs):
        in_stream = inputs['stream_in']

        def double(ele):
            return ele*2

        output = {'stream_out': in_stream.map(double)}
        return output


The `TransformNode` definition is similar to `StreamNode`, but it has an input port `stream_in`. It defines the key-value pair `element->number` in the `self.required` dictionary, so it is compatible with the source node's output.

In the `process` method, it maps each element in the stream through the `double` function.

### Define the Print Elements Node

In [5]:
class SinkNode(Node):

    def ports_setup(self):
        input_ports = {
            'stream_in': {
                PortsSpecSchema.port_type: streamz.Stream
            }
        }
        output_ports = {
            'stream_out': {
                PortsSpecSchema.port_type: streamz.Stream
            }
        }
        return NodePorts(inports=input_ports, outports=output_ports)

    def conf_schema(self):
        json = {
            "title": "Simple SinkNode configure",
            "type": "object",
            "properties": {}
        }
        ui = {}
        return ConfSchema(json=json, ui=ui)

    def init(self):
        self.required = {}

    def meta_setup(self):
        icols = self.get_input_meta()
        try:
            icoltype = icols['stream_in']['element']
        except KeyError:
            icoltype = None

        self.required = {'stream_in': {}}
        if icoltype in ('number', 'numbers'):
            self.required['stream_in']['element'] = icoltype
        elif icoltype is not None:
            self.required['stream_in']['element'] = 'number'

        columns_out = {
            'stream_out': {'element': 'number'}
        }
        return columns_out

    def process(self, inputs):
        in_stream = inputs['stream_in']

        def frame_counter(frame_counter, element):
            return (frame_counter + 1, (frame_counter, element))

        def print_sink(counter_element):
            counter = counter_element[0]
            element = counter_element[1]
            print('Frame {}: {}'.format(counter, element))            
        
        stream_out = in_stream\
            .accumulate(frame_counter, returns_state=True, start=0)\
            .sink(print_sink)
        output = {'stream_out': stream_out}
        # output = {'stream_out': in_stream.sink(print)}
        return output

The `SinkNode` `process` is relatively simple. An accumulate is used to keep a frame counter, and the sink prints the data with its corresponding frame counter.
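To see what the accumulate step does, here is a plain-Python sketch that emulates it on a list instead of a live stream. It assumes that with `returns_state=True` the function returns a `(new_state, result)` pair and only `result` is passed downstream. The `accumulate` generator below is a stand-in written for illustration, not the streamz method.

```python
def accumulate(elements, func, start):
    """Emulate accumulate(func, returns_state=True, start=start) on a list."""
    state = start
    for element in elements:
        # func returns the next state and the value to emit downstream.
        state, result = func(state, element)
        yield result


def frame_counter(counter, element):
    # Next state is counter + 1; the emitted value pairs the current
    # counter with the element.
    return (counter + 1, (counter, element))


frames = list(accumulate([0, 2, 4], frame_counter, start=0))
for counter, element in frames:
    print('Frame {}: {}'.format(counter, element))
# Frame 0: 0
# Frame 1: 2
# Frame 2: 4
```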

The `meta_setup` is slightly more involved. `SinkNode` also supports streaming `numbers`, i.e. a set of numbers, as well as one number at a time, and it does so dynamically. Initially its required setup is `{'stream_in': {}}`, so it can connect to anything that produces a `streamz.Stream`. However, once it is connected, the required columns setting is enforced to be either `'number'` or `'numbers'`, whichever matches the connected port. If neither `'number'` nor `'numbers'` matches the connected port, then `'number'` is set. Unless the connected port specifies `'number'` or `'numbers'` in its output meta, this node will raise a `LookupError` for the unexpected columns type. This validation is done when the taskgraph run is invoked, during an intermediate stage after the taskgraph build but prior to calling the process computations of the nodes within the taskgraph.

### Define the Graph Programmatically 

With these three nodes, we can construct a simple task graph that connects them. We add an `Output_Collector` node so that we can get a handle to both the source and the sink.

In [6]:
module_name = 'streamz'
source_spec = {
    TaskSpecSchema.task_id: 'source',
    TaskSpecSchema.node_type: StreamNode,
    TaskSpecSchema.conf: {},
    TaskSpecSchema.inputs: {},
    TaskSpecSchema.module: module_name
}


task_spec = {
    TaskSpecSchema.task_id: 'double',
    TaskSpecSchema.node_type: TransformNode,
    TaskSpecSchema.conf: {},
    TaskSpecSchema.inputs: {
        'stream_in': 'source.stream_out'
    },
    TaskSpecSchema.module: module_name
}

sink_spec = {
    TaskSpecSchema.task_id: 'sink',
    TaskSpecSchema.node_type: SinkNode,
    TaskSpecSchema.conf: {},
    TaskSpecSchema.inputs: {
        'stream_in': 'double.stream_out'
    },
    TaskSpecSchema.module: module_name
}

out_spec = {
    TaskSpecSchema.task_id: '',
    TaskSpecSchema.node_type: "Output_Collector",
    TaskSpecSchema.conf: {},
    TaskSpecSchema.inputs: {
        'in0': 'source.stream_out',
        'in1': 'sink.stream_out',        
    }
}

task_list = [source_spec, task_spec, sink_spec, out_spec]

### Dynamically Register the GQuant Nodes

The task graph defined above is ready to be evaluated programmatically:

In [7]:
task_graph = TaskGraph(task_list)
task_graph.run()

Results([('source.stream_out', <Stream>), ('sink.stream_out', <sink: print_sink>)])

The result has both the source and sink nodes stored as a dictionary.

To use these nodes in the gQuant JupyterLab plugin, we can either put them in a Python file and edit the `gquantrc` file to register them (you can find an example in the `05_customize_nodes_with_ports` notebook), or register them dynamically with the `register_lab_node` method.

In [8]:
TaskGraph.register_lab_node(module_name, StreamNode)
TaskGraph.register_lab_node(module_name, TransformNode)
TaskGraph.register_lab_node(module_name, SinkNode)

GQuantWidget(sub=HBox())

We can visualize this simple task graph:

In [9]:
task_graph = TaskGraph(task_list)
task_graph.draw()

GQuantWidget(sub=HBox(), value=[OrderedDict([('id', 'source'), ('type', 'StreamNode'), ('conf', {}), ('inputs'…

Let's evaluate it and get the source node. We will stream numbers by calling the `emit` method on the `source` stream.

In [10]:
r = task_graph.run()

The result is saved to variable `r`. It can be used as a dictionary. Let's find the keys:

In [11]:
r.get_keys()

('source.stream_out', 'sink.stream_out')

We can emit some numbers to the source and see the stream processing take effect.

In [12]:
for i in range(10):
    r['source.stream_out'].emit(i)

Frame 0: 0
Frame 1: 2
Frame 2: 4
Frame 3: 6
Frame 4: 8
Frame 5: 10
Frame 6: 12
Frame 7: 14
Frame 8: 16
Frame 9: 18


## Show the Doubling Results in a Sliding Window

Imagine you want to build a monitor that watches the latest 50 numbers in the stream. The Streamz library provides a few stream processing nodes that make this easy. Let's add a sliding window node that groups the numbers in a sliding window of a defined length.

### Define the Sliding Window Node

In [13]:
class SlideWindowNode(Node):

    def ports_setup(self):
        input_ports = {
            'stream_in': {
                PortsSpecSchema.port_type: streamz.Stream
            }
        }
        output_ports = {
            'stream_out': {
                PortsSpecSchema.port_type: streamz.Stream
            }
        }
        return NodePorts(inports=input_ports, outports=output_ports)

    def conf_schema(self):
        json = {
            "title": "SlideWindow Node configure",
            "type": "object",
            "properties": {
                "window": {
                    "type": "integer"
                }
            }
        }
        ui = {}
        return ConfSchema(json=json, ui=ui)

    def init(self):
        self.required = {}

    def meta_setup(self):
        self.required = {
            'stream_in': {'element': 'number'}
        }
        columns_out = {
            'stream_out': {'element': 'numbers'}
        }
        return columns_out

    def process(self, inputs):
        in_stream = inputs['stream_in']
        stream_out = in_stream.sliding_window(self.conf['window'],
                                              return_partial=False)
        output = {'stream_out': stream_out}
        return output

It is similar to the `TransformNode`. In the `process` method, it uses the `in_stream.sliding_window` method to group the elements in a sliding window. In the `conf_schema`, we define a `window` field of type `integer`, which sets the sliding window size. Because the node buffers the numbers in a window, the `columns_out` has element type `numbers` instead of `number`. Note that the key-value strings can be arbitrary as long as compatible input/output ports share the same key-value pairs.
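The grouping behavior can be sketched without streamz by using a bounded `collections.deque`. This standalone generator is an illustration only. It assumes that with `return_partial=False` a window is emitted only once the buffer holds the full number of elements, and that each emitted window is a tuple.

```python
from collections import deque


def sliding_window(elements, n, return_partial=False):
    """Yield tuples holding the latest n elements seen so far."""
    buffer = deque(maxlen=n)  # the oldest element drops out automatically
    for element in elements:
        buffer.append(element)
        # Emit partial windows only when explicitly requested.
        if return_partial or len(buffer) == n:
            yield tuple(buffer)


windows = list(sliding_window(range(5), 3))
print(windows)
# [(0, 1, 2), (1, 2, 3), (2, 3, 4)]
```

With a window of 50, nothing reaches the downstream node until 50 numbers have been emitted. After that, every new number produces a fresh tuple of the latest 50.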

### Define the Plot Node

Define a node to visualize the numbers as they arrive:

In [14]:
import bqplot
import numpy as np
import bqplot.pyplot as plt

class PlotSinkNode(Node):

    def ports_setup(self):
        input_ports = {
            'stream_in': {
                PortsSpecSchema.port_type: streamz.Stream
            }
        }
        output_ports = {
            'fig_out': {
                PortsSpecSchema.port_type: bqplot.Figure
            }
        }
        return NodePorts(inports=input_ports, outports=output_ports)

    def conf_schema(self):
        json = {
            "title": "Plot configure",
            "type": "object",
            "properties": {}
        }
        ui = {}
        return ConfSchema(json=json, ui=ui)

    def init(self):
        self.required = {}

    def meta_setup(self):
        self.required = {
            'stream_in': {'element': 'numbers'}
        }
        columns_out = {'fig_out': {}}
        return columns_out

    def process(self, inputs):
        in_stream = inputs['stream_in']
        axes_options = {'x': {'label': 'x'}, 'y': {'label': 'y'}}
        x = []
        y = []
        fig = plt.figure(animation_duration=10)
        lines = plt.plot(x=x, y=y, colors=['red', 'green'], axes_options=axes_options)
        output = {}

        def update(numbers):
            y = np.array(numbers)
            elements = y.shape[-1]
            x = np.arange(elements)
            lines.x = x
            lines.y = y

        in_stream.sink(update)
        output.update({'fig_out': fig})
        return output

This `PlotSinkNode` is a sink of the stream that uses the `update` function to plot the numbers in the sliding window, so we can see an animation of the numbers changing as new ones are added.

Register these two new nodes:

In [15]:
TaskGraph.register_lab_node(module_name, PlotSinkNode)
TaskGraph.register_lab_node(module_name, SlideWindowNode)

If you draw a taskgraph you can right-click and select "Create Taskgraph from this Cell" so that you can just load the taskgraph next time. We already drew the graph and saved it in the `plot.gq.yaml` file.  Load it from the disk:

In [16]:
task_graph = TaskGraph.load_taskgraph('../taskgraphs/streamz/plot.gq.yaml')
task_graph.draw()

GQuantWidget(sub=HBox(), value=[OrderedDict([('id', 'source'), ('type', 'StreamNode'), ('conf', {}), ('inputs'…

It applies `sliding_window` after doubling the numbers, then feeds them to the `plot` sink node.

Run it to get the source and sink:

In [17]:
r = task_graph.run()

Show the plot:

In [18]:
r['plot.fig_out']

Figure(animation_duration=10, axes=[Axis(label='x', scale=LinearScale(), side='bottom'), Axis(label='y', orien…

Let's add `sine` wave numbers to the stream:

In [19]:
for i in range(1000):
    r['source.stream_out'].emit(np.sin(i/3.14))

You can see the moving sine waves in the plot. Try changing the sliding window size and play with it!

## Process Two Branches of the Stream

The Streamz library can handle complicated streams in the pipeline. For illustration purposes, we add a `ZipNode` that merges two streams together so that the elements from the two streams are zipped into a tuple.
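As a rough picture of what zipping two branches produces, the sketch below uses Python's built-in `zip` on plain lists. It only illustrates the pairing of elements. A live stream zip works incrementally as elements arrive, which this list-based stand-in does not model.

```python
# Two hypothetical branches derived from the same source values.
source = list(range(4))
doubled = [x * 2 for x in source]      # first branch
quadrupled = [x * 2 for x in doubled]  # second branch doubles again

# Pair the i-th element of each branch into a tuple.
zipped = list(zip(doubled, quadrupled))
print(zipped)
# [(0, 0), (2, 4), (4, 8), (6, 12)]
```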

### Define the Zip Node to Combine Branches

In [20]:
class ZipNode(Node):

    def ports_setup(self):
        input_ports = {
            'stream1_in': {
                PortsSpecSchema.port_type: streamz.Stream
            },
            'stream2_in': {
                PortsSpecSchema.port_type: streamz.Stream
            }
        }
        output_ports = {
            'stream_out': {
                PortsSpecSchema.port_type: streamz.Stream
            }
        }

        # dynamic ports
        # add one additional port
        iports_conn = self.get_connected_inports()
        nports = len(iports_conn)
        if nports >= 2:
            input_ports = {'stream{}_in'.format(ii+1): {
                PortsSpecSchema.port_type: streamz.Stream
            } for ii in range(nports+1)}
        
        return NodePorts(inports=input_ports, outports=output_ports)

    def conf_schema(self):
        json = {
            "title": "Zip Node configure",
            "type": "object",
            "properties": {}
        }
        ui = {}
        return ConfSchema(json=json, ui=ui)

    def init(self):
        self.required = {}

    def meta_setup(self):
        self.required = {
            'stream1_in': {'element': 'numbers'},
            'stream2_in': {'element': 'numbers'}
        }
        iports_conn = self.get_connected_inports()
        nports = len(iports_conn)
        if nports > 2:
            self.required = {
                iport: {
                    'element': 'numbers'
                } for iport in iports_conn
            }
        columns_out = {
            'stream_out': {'element': 'numbers'}
        }
        return columns_out

    def process(self, inputs):
        streams_list = [istream for istream in inputs.values() if istream]
        zip_stream = streamz.zip(*streams_list)
        output = {'stream_out': zip_stream}
        return output

`ZipNode` defines two required input ports: `stream1_in` and `stream2_in`. Once these two ports are connected another port is dynamically added and so on. In the `process` method all the streams are zipped together.
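The dynamic port rule in `ports_setup` can be isolated into a small standalone function to see which port names appear as connections are made. The helper `dynamic_inports` is hypothetical. It mirrors the name-generation logic above but returns only the port names rather than full port specs.

```python
def dynamic_inports(n_connected):
    """Return the input port names for a given number of connected ports."""
    # Two ports are always present.
    names = ['stream1_in', 'stream2_in']
    if n_connected >= 2:
        # Once at least two ports are connected, expose one spare port.
        names = ['stream{}_in'.format(ii + 1) for ii in range(n_connected + 1)]
    return names


print(dynamic_inports(0))  # ['stream1_in', 'stream2_in']
print(dynamic_inports(2))  # ['stream1_in', 'stream2_in', 'stream3_in']
print(dynamic_inports(3))  # ['stream1_in', 'stream2_in', 'stream3_in', 'stream4_in']
```

There is always exactly one unconnected port available once the first two are wired, so another branch can be attached at any time.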

Register this node:

In [21]:
TaskGraph.register_lab_node(module_name, ZipNode)

We wired the graph in the `two_branches.gq.yaml` file. Load it from the disk:

In [22]:
task_graph = TaskGraph.load_taskgraph('../taskgraphs/streamz/two_branches.gq.yaml')
task_graph.draw()

GQuantWidget(sub=HBox(), value=[OrderedDict([('id', 'source'), ('type', 'StreamNode'), ('conf', {}), ('inputs'…

It first doubles the elements. Then it branches off. In one branch, it doubles the elements again. After merging these two branches by `ZipNode`, it sends the stream of tuples to the `plot` sink node. 

Let's run this graph:

In [23]:
r = task_graph.run()

Show the plot:

In [24]:
r['plot.fig_out']

Figure(animation_duration=10, axes=[Axis(label='x', scale=LinearScale(), side='bottom'), Axis(label='y', orien…

Let's add sine wave numbers to the stream:

In [25]:
for i in range(1000):
    r['source.stream_out'].emit(np.sin(i/3.14))

It shows two sine waves: one is doubled and the other is quadrupled.

## Use GPU to Accelerate

Stream processing can be accelerated on the GPU with RAPIDS cuDF dataframes. Streamz works with both Pandas and cuDF dataframes to provide sensible streaming operations on continuous tabular data.

We will take the above streaming pipeline and convert it to run on the GPU.

### Define the Node to Convert a Tuple of Numbers to Cudf Dataframe

The sliding window node aggregates the numbers in a tuple. We can define a Node to convert the tuple of numbers to a cudf dataframe:

In [26]:
class TupleToCudf(Node):

    def ports_setup(self):
        input_ports = {
             'stream_in': {
                PortsSpecSchema.port_type: streamz.Stream
            }
        }
        output_ports = {
            'stream_out': {
                PortsSpecSchema.port_type: streamz.Stream
            }
        }
        return NodePorts(inports=input_ports, outports=output_ports)

    def conf_schema(self):
        json = {
            "title": "Tuple of data to Cudf dataframe Node configure",
            "type": "object",
            "properties": {}
        }
        ui = {}
        return ConfSchema(json=json, ui=ui)

    def init(self):
        self.required = {}

    def meta_setup(self):
        self.required = {
            'stream_in': {'element': 'numbers'}
        }
        columns_out = {
            'stream_out': {'element': 'cudf'}
        }
        return columns_out

    def process(self, inputs):
        in_stream = inputs['stream_in']

        def convert(ele):
            df = cudf.DataFrame({'x': ele})
            return df

        output = {'stream_out': in_stream.map(convert)}
        return output

It maps each tuple of numbers to a cudf dataframe element with the `in_stream.map` method.

### Define the Node to Convert Cudf stream to streamz.DataFrame 

The `streamz.dataframe` module provides a streaming dataframe object that implements many dataframe methods. It provides a Pandas-like interface on streaming data. Let's write a `ToDataFrame` Node that converts the cudf dataframe stream to a streaming dataframe.

In [27]:
# Wrap a stream of cudf dataframes in a streamz streaming dataframe
class ToDataFrame(Node):

    def ports_setup(self):
        input_ports = {
            'stream_in': {
                PortsSpecSchema.port_type: streamz.Stream
            }
        }
        output_ports = {
            'df_out': {
                PortsSpecSchema.port_type: streamz.dataframe.DataFrame
            },
        }
        return NodePorts(inports=input_ports, outports=output_ports)

    def conf_schema(self):
        json = {
            "title": "streamz Dataframe Node configure",
            "type": "object",
            "properties": {}
        }
        ui = {}
        return ConfSchema(json=json, ui=ui)

    def init(self):
        self.required = {}

    def meta_setup(self):
        self.required = {
            'stream_in': {'element': 'cudf'}
        }
        columns_out = {
            'df_out': {'x': 'float64'}
        }
        return columns_out

    def process(self, inputs):
        in_stream = inputs['stream_in']
        sdf = streamz.dataframe.DataFrame(
            in_stream,
            example=cudf.DataFrame({'x':[]})
        )
        output = {'df_out': sdf}
        return output

In the `process` method, it calls the `streamz.dataframe.DataFrame` constructor to convert a stream of cudf dataframes into a streamz dataframe.

### Define the Node to Double the numbers via streamz.DataFrame API

It is straightforward to write a Node that doubles one column in a dataframe.

In [28]:
class GPUDouble(Node):

    def ports_setup(self):        
        input_ports = {
            'df_in': {
                PortsSpecSchema.port_type: streamz.dataframe.DataFrame
            }   
        }
        output_ports = {
            'df_out': {
                PortsSpecSchema.port_type: streamz.dataframe.DataFrame
            }
        }
        return NodePorts(inports=input_ports, outports=output_ports)

    def conf_schema(self):
        json = {
            "title": "GPU double Node configure",
            "type": "object",
            "properties": {}
        }
        ui = {}
        return ConfSchema(json=json, ui=ui)

    def init(self):
        self.required = {}

    def meta_setup(self):
        self.required = {
            'df_in': {'x': 'float64'}
        }
        columns_out = {
            'df_out': {'x': 'float64'}
        }
        return columns_out

    def process(self, inputs):
        in_df = inputs['df_in']
        in_df['x'] = in_df['x'] * 2
        output = {'df_out': in_df}
        return output

### Define the Node to Convert streamz.DataFrame to Cudf.Series stream

Once the streamz DataFrame has finished the computation, we can convert it back to a normal stream for display.

In [29]:
class ToStream(Node):

    def ports_setup(self):
        input_ports = {
            'df_in': {
                PortsSpecSchema.port_type: streamz.dataframe.DataFrame
            }
        }
        output_ports = {
            'stream_out': {
                PortsSpecSchema.port_type: streamz.Stream
            }
        }
        return NodePorts(inports=input_ports, outports=output_ports)

    def conf_schema(self):
        json = {
            "title": "df to stream Node configure",
            "type": "object",
            "properties": {}
        }
        ui = {}
        return ConfSchema(json=json, ui=ui)

    def init(self):
        self.required = {}

    def meta_setup(self):
        self.required = {
            'df_in': {'x': 'float64'}
        }
        columns_out = {
            'stream_out': {'element': 'numbers'}
        }
        return columns_out

    def process(self, inputs):
        in_df = inputs['df_in']

        def toarray(df_series):
            return df_series.to_array()

        outstream = in_df.stream.pluck('x').map(toarray)
        output = {'stream_out': outstream}
        return output

The stream can be accessed through the `stream` property of the streamz DataFrame. We call the `pluck` method to return the `x` column of the dataframe and convert the resulting series to a numpy array. This numpy array is compatible with the `{'element': 'numbers'}` meta type-check used by the plot and print sink nodes.

Register the newly added Nodes:

In [30]:
TaskGraph.register_lab_node(module_name, TupleToCudf)
TaskGraph.register_lab_node(module_name, ToDataFrame)
TaskGraph.register_lab_node(module_name, GPUDouble)
TaskGraph.register_lab_node(module_name, ToStream)

We already wired the graph in the `gpu_double_two_branches.gq.yaml` file. Load it from the disk:

In [31]:
task_graph = TaskGraph.load_taskgraph('../taskgraphs/streamz/gpu_double_two_branches.gq.yaml')

In [32]:
task_graph.draw()

GQuantWidget(sub=HBox(), value=[OrderedDict([('id', 'source'), ('type', 'StreamNode'), ('conf', {}), ('inputs'…

It first aggregates the numbers into a tuple with the sliding window node, then converts the tuples into a stream of cudf dataframes. After converting that stream to a streamz DataFrame, we can use the normal dataframe API to transform the data. Finally, it converts the result back to a stream of numpy arrays of numbers for plotting and printing.

Let's run the TaskGraph:

In [33]:
r = task_graph.run()

Show the plot:

In [34]:
r['plot.fig_out']

Figure(animation_duration=10, axes=[Axis(label='x', scale=LinearScale(), side='bottom'), Axis(label='y', orien…

Let's add `sine` wave numbers to the stream:

In [35]:
for i in range(200):
    r['source.stream_out'].emit(np.sin(i/3.14))

Frame 0: (array([ 0.        ,  0.62623027,  1.18948077,  1.63310562,  1.9124896 ,
        1.99953517,  1.88548819,  1.58181832,  1.11906554,  0.54376872,
       -0.08621472, -0.70752758, -1.25768464, -1.68135679, -1.93593548,
       -1.99581782, -1.85498145, -1.52759025, -1.04656983, -0.46029625,
        0.17226916,  0.78750952,  1.32355033,  1.72648214,  1.95578225,
        1.98839002,  1.8210261 ,  1.47052223,  0.97212844,  0.37596804,
       -0.25800333, -0.8660274 , -1.3869554 , -1.76839778, -1.97199301,
       -1.97726559, -1.78368527, -1.41072034, -0.89587976, -0.29094087,
        0.34325784,  0.94293524,  1.44778196,  1.80702576,  1.98453762,
        1.96246521,  1.74302837,  1.34829577,  0.81796554,  0.2053728 ]), array([ 0.        ,  1.25246053,  2.37896155,  3.26621123,  3.82497919,
        3.99907034,  3.77097638,  3.16363664,  2.23813107,  1.08753745,
       -0.17242944, -1.41505516, -2.51536928, -3.36271358, -3.87187096,
       -3.99163563, -3.7099629 , -3.0551805 , -2.093

In [36]:
## Clean up

In [37]:
import IPython
app = IPython.Application.instance()
app.kernel.do_shutdown(True)

{'status': 'ok', 'restart': True}