# Streaming Analytics for Life Event Prediction

This notebook provides the main streaming analytics pipeline that operationalizes the Life Event Prediction model in a real-time data flow. As new life events occur, they are passed through the model to determine whether they are significant enough to warrant action.

The README notebook (README.ipynb) in this project provides context and instructions for configuring the environment and executing this notebook. It also includes references to the Life Event Prediction model, which is used to score customer interaction events in this notebook.

**This project contains Sample Materials, provided under license.  
Licensed Materials - Property of IBM.  
© Copyright IBM Corp. 2019. All Rights Reserved.  
US Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.**


## Configuration Variables

The following variables can be adjusted to match your environment, depending on how the Streams Add-on, the remote data set (database connection), and Eventstreams/Kafka were configured (see the README notebook for details). In particular, `STREAMS_INSTANCE_NAME`, `CUSTOMER_SCORE_EVENTS_DATASET`, `EVENTSTREAMS_CREDENTIALS`, and `SIGNIFICANT_EVENTS_EVENTSTREAMS_TOPIC` may need to change for your setup. If you used the default setup described in the README notebook, the values below should work. The other variables should not need to change unless you are using a different set of live or historical events, or a different replay speed.
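
Before submitting any jobs, it can help to confirm that the file-based settings point at files that actually exist. The following is a minimal sketch using only the standard library; `missing_paths` is a hypothetical helper for illustration and is not part of the project scripts.

```python
import os

def missing_paths(named_paths):
    """Return the names of settings whose path is set but does not exist on disk."""
    return [name for name, path in named_paths.items()
            if path is not None and not os.path.exists(path)]

# Illustrative input: one path that exists, one that does not, one disabled setting.
example = {
    "EXISTING": os.getcwd(),
    "ABSENT": os.path.join(os.getcwd(), "no_such_file_for_this_example.csv"),
    "DISABLED": None,  # None means the related job is skipped, so it is not checked
}
print(missing_paths(example))
```

In this notebook the same helper could be given `REAL_TIME_EVENTS_DATASET`, `HISTORICAL_EVENTS_DATASET`, and `EVENTSTREAMS_CREDENTIALS` once the configuration cell has run.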

In [None]:
import os
import getpass

# The streams instance where the jobs will be submitted
STREAMS_INSTANCE_NAME = "streams"
# The local dataset filename/path to replay "real-time" customer events from, and how fast to replay them (in events per second)
# Note that the replay speed can be set much higher than the model can generate new scores, which will limit the effective
# throughput of the system unless the scores can be sped up or parallelized.
REAL_TIME_EVENTS_DATASET = os.environ['DSX_PROJECT_DIR'] + '/datasets/live_event.csv'
REAL_TIME_EVENT_GENERATION_RATE = 2

# The local dataset filename/path to load historical customer events from
HISTORICAL_EVENTS_DATASET = os.environ['DSX_PROJECT_DIR'] + '/datasets/historical_event.csv'

# The remote dataset to store new customer scores to.  Should be defined in the Project as a remote datasource and associated remote dataset.
# If set to None, the job that saves the customer scores to the database will not be started.
CUSTOMER_SCORE_EVENTS_DATASET = "LFE_SCORES"

# The filename/path for the credentials to the remote EventStreams instance that significant events will be published to, and the topic to use.
# If either the credentials path or the topic is set to None, the job that publishes significant events to EventStreams will not be started.
EVENTSTREAMS_CREDENTIALS = os.environ['DSX_PROJECT_DIR'] + '/datasources/credentials/eventstreams_' + getpass.getuser() + '.json'
SIGNIFICANT_EVENTS_EVENTSTREAMS_TOPIC = "SIGNIFICANT_LFE_SCORES"

# Streams Job Names for the various microservices
REAL_TIME_EVENT_GENERATION_JOB_NAME = 'real_time_event_generation_' + getpass.getuser()
SCORE_EVENTS_JOB_NAME = 'score_events_' + getpass.getuser()
RECORD_SCORE_HISTORY_JOB_NAME = 'record_score_history_' + getpass.getuser()
SIGNIFICANT_EVENT_GENERATION_JOB_NAME = 'significant_event_generation_' + getpass.getuser()
PUBLISH_SIGNIFICANT_EVENTS_JOB_NAME = 'publish_significant_events_' + getpass.getuser()

## Python Module Installation

As a convenience, if your ICP4D cluster can install Python modules using pip, the following cell can be executed to ensure the correct modules are installed and available. This should only need to be done once per environment. Alternatively, these modules can be installed into the ICP4D Python environment in some other way (see the ICP4D documentation).

In [None]:
# Cell to potentially install required modules in the environment
!pip install --upgrade --user streamsx --no-warn-script-location
!pip install streamsx.database
!pip install streamsx.eventstreams
!pip install kafka-python

!pip show streamsx

!pip install --upgrade scikit-learn==0.21.2


## Python Imports

The following cell imports the needed Python modules for this notebook, including various general utilities, Streams support, and specific modules for executing the LFE models.


In [None]:
# Cell to prep the kernel, importing modules, doing basic housekeeping, setting helper variables from the environment, etc

# General Imports
import sys
import os, requests, urllib3
import getpass
import io
import csv
import time
import glob
import shutil
import json
import collections
import math
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

# Import pandas
import pandas as pd

# Import Streams topology packages
from streamsx.topology.topology import Topology
from streamsx.topology import context
from streamsx.topology.schema import StreamSchema

# Import other Streams helper packages
import streamsx.ec
import streamsx.eventstreams as eventstreams
import streamsx.database as db

print("INFO: streamsx package version: " + context.__version__)
print("INFO: streamsx.database package version: " + db.__version__)
print("INFO: streamsx.eventstreams package version: " + eventstreams.__version__)

# Import scripts from this project
if '../scripts' not in sys.path:
    sys.path.insert(0, '../scripts')
import StreamsScoringPipeline
import memory
import submit_support



## Streams Instance Connection

This cell connects to the Streams instance, and configures some basic parameters on that connection for later use.


In [None]:
# Cell to grab Streams instance config object and REST reference
from icpd_core import icpd_util
streams_cfg=icpd_util.get_service_instance_details(name=STREAMS_INSTANCE_NAME)

import streamsx.rest as rest
streams_cfg[context.ConfigParams.SSL_VERIFY] = False
streams_instance = rest.Instance.of_service(streams_cfg)


## Cancel Existing Streams Jobs

The following cell can be executed to find all existing Streams jobs and cancel them, so you can re-execute the notebook from scratch and be sure the old jobs have been terminated before starting the new jobs or clearing the database tables.


In [None]:
# Cancel existing jobs so that there is a clean slate to start from
for job in streams_instance.get_jobs():
    if job.name.endswith(getpass.getuser()):
        print("Cancelling existing job:", job.name)
        job.cancel()

## Database Connection

If using a remote dataset to store customer score results, the following cell uses the information in the remote dataset and datasource to configure the connection to the database for later use.


In [None]:
import dsx_core_utils

if CUSTOMER_SCORE_EVENTS_DATASET is not None:
    # Setup remote DB connectors and credentials
    remote_dataSet = None
    remote_dataSource = None
    remote_db2credentials = None
    remote_table_name = None
    try:
        remote_dataSet = dsx_core_utils.get_remote_data_set_info(CUSTOMER_SCORE_EVENTS_DATASET)
        remote_dataSource = dsx_core_utils.get_data_source_info(remote_dataSet['datasource'])
        remote_db2credentials = { 'username': remote_dataSource['user'], 'password': remote_dataSource['password'], 'jdbcurl': remote_dataSource['URL'] }
        remote_table_name = remote_dataSet['table']
    except Exception:
        print("Unable to retrieve dataset or datasource information for the dataset given by CUSTOMER_SCORE_EVENTS_DATASET.  It may not be defined.")


## Preparing the Database Tables

If using a remote dataset to store customer scores, the following cell deletes the database table and re-creates it with the correct schema. In a more realistic scenario, you wouldn't want to delete the old data, since you would just be adding new scores to an existing table. In this accelerator, however, the same live scores are replayed, so the data in the database would be duplicated if the jobs were re-executed.
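
The drop-and-recreate pattern described above can be tried without a Db2 connection. The sketch below uses the standard library's `sqlite3` as a stand-in database, which is an assumption made purely for illustration: SQLite's type handling and its `DROP TABLE IF EXISTS` and `CURRENT_TIMESTAMP` syntax may differ from the remote database used by this notebook. `reset_scores_table` is a hypothetical helper name.

```python
import sqlite3

def reset_scores_table(conn, table_name):
    """Drop the table if present, then recreate it with an index on CUSTOMER_ID."""
    cur = conn.cursor()
    cur.execute("DROP TABLE IF EXISTS " + table_name)
    cur.execute(
        "CREATE TABLE " + table_name + " ("
        "CUSTOMER_ID INTEGER NOT NULL, ERROR_CODE INTEGER, EVENT_DATE DATE, "
        "EVENT_TYPE VARCHAR(255), HOME_PURCHASE_PROB REAL, RELOCATE_PROB REAL, "
        "INSERTION_TIME TIMESTAMP NOT NULL)"
    )
    cur.execute("CREATE INDEX " + table_name + "_INDEX ON " + table_name + " (CUSTOMER_ID)")
    conn.commit()
    cur.close()

conn = sqlite3.connect(":memory:")  # in-memory stand-in, not the remote datasource
reset_scores_table(conn, "LFE_SCORES")
# Made-up row representing a score written during an earlier replay
conn.execute(
    "INSERT INTO LFE_SCORES VALUES (1, 0, '2019-01-01', 'x', 0.1, 0.2, CURRENT_TIMESTAMP)"
)
reset_scores_table(conn, "LFE_SCORES")  # a second run clears the replayed rows
row_count = conn.execute("SELECT COUNT(*) FROM LFE_SCORES").fetchone()[0]
print(row_count)
```

Because the table is rebuilt on every run, replaying the same live events does not accumulate duplicate rows.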


In [None]:
# Import database utilities
import jaydebeapi

# Easy way to drop and recreate the DB table
if CUSTOMER_SCORE_EVENTS_DATASET is not None and remote_dataSource is not None:
    if (sys.version_info >= (3, 0)):
        conn = jaydebeapi.connect(remote_dataSource['driver_class'], remote_dataSource['URL'], {'user': remote_dataSource['user'], 'password': remote_dataSource['password'], 'clientProgramName': "pipeline-prep-" + getpass.getuser()})
    else:
        conn = jaydebeapi.connect(remote_dataSource['driver_class'], [remote_dataSource['URL'], remote_dataSource['user'], remote_dataSource['password']])

    curs = conn.cursor()
    try:
        curs.execute('DROP TABLE ' + remote_table_name)
    except jaydebeapi.DatabaseError:
        print("Error dropping table.  Probably doesn't exist yet.")
    else:
        print("Table dropped.")
    try:
        curs.execute('CREATE TABLE ' + remote_table_name + ' (CUSTOMER_ID INTEGER NOT NULL, ERROR_CODE INTEGER, EVENT_DATE DATE, EVENT_TYPE VARCHAR(255), HOME_PURCHASE_PROB REAL, RELOCATE_PROB REAL, INSERTION_TIME TIMESTAMP NOT NULL)')
        curs.execute('CREATE INDEX ' + remote_table_name + '_INDEX ON ' + remote_table_name + ' (CUSTOMER_ID)')
    except Exception as e:
        print("Problem creating table: " + str(e))
    else:
        print("Table created.")
    curs.close()
    conn.close()
else:
    print("Skipping DB Table creation/clearing, since CUSTOMER_SCORE_EVENTS_DATASET was None, or the dataset didn't exist.")


## Streams Submission Helper Function

The `submitToStreams()` helper function here cancels a job, if it is already running, and then re-builds the topology into a Streams bundle and submits the job to the Streams instance.
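
The cancel-before-resubmit step can be illustrated with stand-in objects. In this sketch `FakeJob` and `cancel_jobs_named` are hypothetical names; `FakeJob` only mimics the `name` attribute and `cancel()` method that the notebook uses on real job objects.

```python
class FakeJob:
    """Stand-in for a Streams job: just a name and a cancel() method."""
    def __init__(self, name):
        self.name = name
        self.cancelled = False

    def cancel(self):
        self.cancelled = True

def cancel_jobs_named(jobs, name):
    """Cancel every job whose name matches exactly; return the cancelled names."""
    cancelled = []
    for job in jobs:
        if job.name == name:
            job.cancel()
            cancelled.append(job.name)
    return cancelled

jobs = [FakeJob("score_events_alice"), FakeJob("publish_alice"), FakeJob("score_events_bob")]
cancelled = cancel_jobs_named(jobs, "score_events_alice")
print(cancelled)
```

Matching on the exact job name, which already carries the user suffix, leaves other users' jobs with the same prefix untouched.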


In [None]:
# Definitions of general helper functions to help submit jobs, query jobs, query job health, cancel jobs, etc
def submitToStreams(topology):
    # Create local copy of the streams config so this can be thread-safe
    local_streams_cfg = dict(streams_cfg)

    # Cancel the job from the instance if it is already running...
    for job in streams_instance.get_jobs():
        if job.name == topology.name:
            print("Cancelling old job:", job.name)
            job.cancel()
    
    # Set the job config
    job_config = context.JobConfig(job_name = topology.name, tracing = "debug")
    job_config.add(local_streams_cfg)
    
    # Actually submit the job
    print("Building and submitting new job:", topology.name)
    submission_result = context.submit('DISTRIBUTED', topology, local_streams_cfg)
    return submission_result


## Streams Operator Classes

The following class definitions define the behavior of some Streams operators specific to the accelerator's needs.

### CSVFileReader

This operator class acts as a Streams source operator, reading in a CSV file from the bundle and emitting each line as a tuple. This is used to replay the live events file. A helper function, `delay()`, is also defined, which is used in the Streams topology as a trivial operator to put some timing delay between the tuples emitted from this source.

Streams source operators must execute _outside_ the notebook environment. Hence the `CSVFileReader` class is defined in the Python script `submit_support.py`.
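
Since `submit_support.py` is not shown here, the following is only a sketch of what a CSV source callable of this kind might look like; the real `CSVFileReader` may differ. It assumes the file has a header row and the `[cust_id, date, event]` column layout mentioned in the `StoreEvents` docstring. The class name `CsvLineSource` and the sample rows are made up for illustration.

```python
import csv
import os
import tempfile

class CsvLineSource:
    """Callable that yields each CSV row as a list of strings, skipping the header."""
    def __init__(self, path, skip_header=True):
        self.path = path
        self.skip_header = skip_header

    def __call__(self):
        with open(self.path, newline="") as f:
            reader = csv.reader(f)
            if self.skip_header:
                next(reader, None)  # discard the header row if there is one
            for row in reader:
                yield row

# Write a small made-up file, read it back through the source, then clean up.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as tmp:
    tmp.write("cust_id,date,event\n"
              "1001,2019-01-05,ADDRESS_CHANGE\n"
              "1002,2019-01-06,LOAN_INQUIRY\n")
rows = list(CsvLineSource(tmp.name)())
os.remove(tmp.name)
print(rows)
```

A callable object works as a source because the topology only needs something it can call to obtain an iterable of tuples.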


### StoreEvents

This operator class loads all the historical events into the operator's memory at operator/job startup time, and then stores each new customer event into the operator's memory as it flows through, enabling the LFE scoring model to have access to a customer's full event history to compute the scores.
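
The idea of preloading history and then appending live events per customer can be sketched with plain Python. Here `events` is a local stand-in for the shared store; that the project's `memory.events` behaves like a `defaultdict(list)` is an assumption, and the sample rows are made up.

```python
import collections
import csv
import io

events = collections.defaultdict(list)  # stand-in for the shared per-customer store

def preload(history_csv_text):
    """Load historical rows [cust_id, date, event] into the per-customer lists."""
    reader = csv.reader(io.StringIO(history_csv_text))
    next(reader, None)  # skip the header row
    for cust_id, date, event in reader:
        events[int(cust_id)].append([int(cust_id), date, event])

def store(event):
    """Append one live event and return it as a dict for the scoring step."""
    events[event[0]].append(event)
    return {"cust_id": event[0], "sc_end_date": event[1], "event_type_id": event[2]}

preload("cust_id,date,event\n1001,2018-11-02,A\n1001,2018-12-15,B\n")
out = store([1001, "2019-01-05", "C"])
print(len(events[1001]), out)
```

After the live event is stored, the customer's list holds the two historical events plus the new one, which is the full history a scoring model would need.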


In [None]:
class StoreEvents(object):
    """Store each cust_id's events in its own list.
    Notes:
        The historical data is read in on the notebook side and populates the
        memory.events on the Streams side during the __enter__(). Memory.events
        is colocated, refer to <topology>.colocate([]).
    Args:
        history_csv : file of historical events [cust_id, date, event]

    """

    def __init__(self, history_csv=None):
        self.dh = pd.read_csv(history_csv)

    def __enter__(self):
        for historical in self.dh.values:
            lst = list(historical)
            memory.events[lst[0]].append(lst)

    def __exit__(self, exc_type, exc_value, traceback):
        pass  # exit required

    def __call__(self, tuple: list):
        """Store an event in memory shared by multiple operators.
        Note:
            memory is shared between operators
        Args:
            tuple :  tuple is the event : [cust_id, date, event]
        Return:
            dict events
        """
        memory.events[tuple[0]].append(tuple)

        return {"cust_id": tuple[0], "sc_end_date": tuple[1], 'event_type_id': tuple[2]}


### SignificantEvents

This operator class is used to determine if a new customer score is "Significant" or not, based on a set of thresholds with a deadband in between, and whether that customer's prior score was in the same threshold region.
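
The deadband behavior can be looked at in isolation. The sketch below uses the same default thresholds as the class (0.50 and 0.40) on a made-up sequence of probabilities; `deadband_alert` is a hypothetical helper, not part of the notebook.

```python
def deadband_alert(prob, prior, high=0.50, low=0.40):
    """Return (alert, new_prior); alert is "High", "Low", or None."""
    if prob > high and prior in (None, "Low"):
        return "High", "High"
    if prob < low and prior == "High":
        return "Low", "Low"
    # Inside the deadband, or no change of region: keep the prior state
    return None, prior

prior = None
alerts = []
for p in [0.30, 0.55, 0.45, 0.60, 0.35, 0.38, 0.52]:
    alert, prior = deadband_alert(p, prior)
    alerts.append(alert)
print(alerts)
# → [None, 'High', None, None, 'Low', None, 'High']
```

Scores between the two thresholds never raise an alert, so a probability that hovers around a single cut-off does not produce a stream of alternating alerts.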


In [None]:
class SignificantEvents:
    def __init__(self, high_threshold=0.50, low_threshold=0.40):
        self.high_threshold = high_threshold
        self.low_threshold = low_threshold
        self.prior_alerts = collections.defaultdict(object)
        self.idx = 0

    def check_update_alert(self, tpl, lfe_class):
        cust_id = tpl["cust_id"]
        old_alert = self.prior_alerts[cust_id][lfe_class]
        
        if tpl[lfe_class]['error_code'] > 0:
            try:
                cur_prob = tpl[lfe_class]['probabilities'][0][1]
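            except (KeyError, IndexError, TypeError):
                # Missing or malformed probabilities: NaN fails both threshold comparisons below
                cur_prob = math.nan
                
            # Compare strings with ==; identity (`is`) is not guaranteed to hold for equal str values
            if cur_prob > self.high_threshold and (old_alert is None or old_alert == "Low"):
                self.prior_alerts[cust_id][lfe_class] = "High"
                return "High"
            elif cur_prob < self.low_threshold and old_alert == "High":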
                self.prior_alerts[cust_id][lfe_class] = "Low"
                return "Low"
            else:
                return None         
        else:
            # Ignore failed score events
            return None
        
        
    def __call__(self, tpl):
        cust_id = tpl['cust_id']
        ntuple = {'cust_id': cust_id, 'idx' : self.idx}
        self.idx += 1
        ntuple['event'] = tpl
        
        # A customer with no prior scores will alert only if a probability is above the threshold
        if cust_id not in self.prior_alerts:
            # Set the prior alerts to None for this customer
            self.prior_alerts[cust_id] = {'lfe_home_purchase': None, 'lfe_relocation': None}
        
        # Check to see if we need to send any alerts, and update the prior alerts for next time
        home_purchase_alert = self.check_update_alert(tpl, 'lfe_home_purchase')
        relocation_alert = self.check_update_alert(tpl, 'lfe_relocation')
        
        if home_purchase_alert is None and relocation_alert is None:
            # No alerts to send
            return None
        else:
            # Something significant happened.
            ntuple['message'] = ""
            if home_purchase_alert is not None:
                ntuple['message'] += "Change to a " + home_purchase_alert + " Home Purchase Probability; "
            if relocation_alert is not None:
                ntuple['message'] += "Change to a " + relocation_alert + " Relocation Probability"
            return {'new':ntuple, 'all':self.prior_alerts}


## Streams Topology Creation

The following cells define functions to create the various Streams Topology graphs for each Streams job.

### Real-Time Event Generation

This job replays the live events, and its topology is simply a `CSVFileReader` source operator, delayed to the desired event generation rate, and published to a Streams topic `real_time_events`.
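
The delay is implemented as a filter that always passes. This works because `time.sleep()` returns `None`, so `time.sleep(d) or True` evaluates to `True` after sleeping. A minimal sketch in plain Python, with a hypothetical `make_pacer` helper and a deliberately high rate so that it runs quickly:

```python
import time

def make_pacer(events_per_second):
    """Build a predicate that sleeps for one event interval and then passes the tuple."""
    delay = 1.0 / events_per_second
    return lambda t: time.sleep(delay) or True

pacer = make_pacer(200)  # 5 ms per tuple keeps this demonstration fast
start = time.monotonic()
passed = [t for t in ["a", "b", "c"] if pacer(t)]
elapsed = time.monotonic() - start
print(passed, round(elapsed, 3))
```

Every tuple is kept; the filter only spaces them out in time.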


In [None]:
def createRealTimeEventGenerationTopology():
    topo = Topology(REAL_TIME_EVENT_GENERATION_JOB_NAME)

    # add files to be contained in the archive which is deployed to the node running the application
    # in this sample we need the `dataset` with sample data to be present at the worker node
    filename_in_bundle = topo.add_file_dependency(REAL_TIME_EVENTS_DATASET, 'etc')

    # Let the CSV file reader be the source/edge node in our topology, producing the stream of event records
    real_time_events_input = topo.source(submit_support.CSVFileReader(filename_in_bundle))

    # Insert a delay between record replay
    delay = 1.0 / REAL_TIME_EVENT_GENERATION_RATE   # computed once here; the lambda below captures this value
    real_time_events = real_time_events_input.filter(lambda t: time.sleep(float(delay)) or True)

    real_time_events.view(name="real_time_events", description="Real Time Customer Events")

    # publish results
    real_time_events.publish(topic="real_time_events")
    
    return topo


### Score Events

This job performs the actual scoring of customer event histories, as each new event comes in.  It subscribes to the Streams topic `real_time_events`, and uses the `StoreEvents` operator to store them in operator memory for future scoring use, with the pre-loaded historical customer events, and then feeds them into the actual LFE model scoring code (defined in the `StreamsScoringPipeline.py` script).  Finally, the resulting customer scores are published to the Streams topic `new_scores`.


In [None]:
def createScoreEventsTopology():
    # Stage some files needed by the model scoring code into the expected subdirectory needed by the scoring code, but that can still be added to the Streams bundle.
    stage_directory = os.path.join(os.environ.get("DSX_PROJECT_DIR"), "datasets", "datasets")
    os.makedirs(stage_directory, exist_ok=True)
    for stage_file in glob.glob(os.path.join(os.environ.get("DSX_PROJECT_DIR"), "datasets", "*.json")):
        staged = shutil.copy(stage_file, stage_directory)
        print("Stage file '{}' for Streams".format(staged))

    topo = Topology(SCORE_EVENTS_JOB_NAME)
    
    topo.exclude_packages.add('pandas')
    topo.add_pip_package('scikit-learn')  # PyPI distribution name; the 'sklearn' alias is deprecated
    topo.add_pip_package('scipy')
    
    topo.add_file_dependency('../datasets/datasets', 'etc')  # Specify staged files to move.
    
    #load the history + real flow into memory
    real_time_events = topo.subscribe(topic="real_time_events")
    
    store_events = real_time_events.map(StoreEvents(history_csv=HISTORICAL_EVENTS_DATASET), name="store_events")
    store_events.view(name="store_events", description="store real time events")
    
    # score & save
    new_scores = store_events.map(StreamsScoringPipeline.score_life_events(project_path=os.environ.get("DSX_PROJECT_DIR")), name="new_scores")
    new_scores.view(name="new_scores", description="New Customer Scores")
    new_scores.publish(topic="new_scores")

    ## colocate shared memory 
    store_events.colocate([store_events, new_scores])

    return topo


### Record Score History

This job saves the customer scores into the database for off-line analysis of customer scores.  It subscribes to the Streams topic `new_scores`, converts the tuple schema to match the database table schema, and inserts the customer score into the database table.
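
The schema conversion amounts to flattening the nested score dictionary into one flat row per event. The sketch below assumes the tuple shape that the notebook's own `convert()` function reads (a `probabilities` list and an `error_code` of 1 for a successful score); `flatten_score` is a hypothetical name and the sample values are made up.

```python
def flatten_score(tpl):
    """Flatten a nested score tuple into (cust_id, error_code, date, type, home, reloc)."""
    def prob(key):
        part = tpl.get(key) or {}
        # Only trust the probability when this model reported success (assumed to be 1)
        if part.get("error_code") == 1:
            return part["probabilities"][0][1]
        return None

    return (tpl["cust_id"], tpl["error_code"], tpl["sc_end_date"],
            tpl["event_type_id"], prob("lfe_home_purchase"), prob("lfe_relocation"))

sample = {
    "cust_id": 1001, "error_code": 0, "sc_end_date": "2019-01-05", "event_type_id": "C",
    "lfe_home_purchase": {"error_code": 1, "probabilities": [[0.3, 0.7]]},
    "lfe_relocation": {"error_code": 0},  # failed score: becomes a NULL column
}
row = flatten_score(sample)
print(row)
```

A failed score maps to `None`, which ends up as a NULL in the corresponding probability column rather than blocking the insert.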


In [None]:
def createRecordScoreHistoryTopology():
    topo = Topology(RECORD_SCORE_HISTORY_JOB_NAME)

    # SQL statements
    sql_insert = 'INSERT INTO ' + remote_table_name + ' (CUSTOMER_ID, ERROR_CODE, EVENT_DATE, EVENT_TYPE, HOME_PURCHASE_PROB, RELOCATE_PROB, INSERTION_TIME) VALUES (?, ?, ?, ?, ?, ?, CURRENT TIMESTAMP)'
    sql_select = 'SELECT * FROM ' + remote_table_name

    new_scores = topo.subscribe(topic="new_scores")
    
    # convert it to SPL schema for the database operator run_statement
    tuple_schema = StreamSchema("tuple<int64 cust_id, int64 error_code, rstring event_date, rstring event_type_id, float64 home_prob, float64 reloc_prob>")

    def convert(tpl):
        try:
            result = (tpl["cust_id"], \
                      tpl["error_code"], \
                      tpl["sc_end_date"], \
                      tpl["event_type_id"], \
                      tpl["lfe_home_purchase"]["probabilities"][0][1] if tpl["lfe_home_purchase"]["error_code"] == 1 else None, \
                      tpl["lfe_relocation"]["probabilities"][0][1] if tpl["lfe_relocation"]["error_code"] == 1 else None)
        except Exception as e:
            print("Got exception: ", e, flush=True)
            print("Tuple was: ", tpl, flush=True)
            result = (tpl["cust_id"], tpl["error_code"], tpl["sc_end_date"], tpl["event_type_id"], None, None)
        return result
    
    db_events = new_scores.map(convert, name="db_events", schema=tuple_schema)
    db_events.view(name="db_events", description="storing events to db")

    insert_results = db.run_statement(name="INSERT", stream=db_events, sql=sql_insert, sql_params="cust_id, error_code, event_date, event_type_id, home_prob, reloc_prob" ,credentials = remote_db2credentials)

    return topo


### Significant Event Generation

This job looks at new customer scores and determines if a new score is 'significant'.  It subscribes to the Streams topic `new_scores` and uses the `SignificantEvents` operator (defined above), to determine the significance of this score.  If it is significant, it publishes it to the Streams topic `significant_events`.
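
In `streamsx.topology`, when the callable passed to `Stream.map` returns `None`, no tuple is submitted downstream, so an operator can act as a map and a filter at once. The plain-Python sketch below emulates that behavior; `map_dropping_none` and `significant` are hypothetical stand-ins, and the single-threshold rule is a simplification of the real operator.

```python
def map_dropping_none(func, tuples):
    """Apply func to each tuple and pass on only the non-None results."""
    for t in tuples:
        result = func(t)
        if result is not None:
            yield result

def significant(score):
    """Simplified stand-in: emit a result only for probabilities above 0.5."""
    if score["prob"] > 0.5:
        return {"new": {"cust_id": score["cust_id"], "message": "High"}, "all": {}}
    return None

scores = [{"cust_id": 1, "prob": 0.2}, {"cust_id": 2, "prob": 0.7}, {"cust_id": 3, "prob": 0.4}]
full = map_dropping_none(significant, scores)
new_only = list(map_dropping_none(lambda t: t["new"], full))
print(new_only)
```

Only the second score survives both stages, mirroring how non-significant scores never reach the `significant_events` topic.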


In [None]:
def createSignificantEventGenerationTopology():
    topo = Topology(SIGNIFICANT_EVENT_GENERATION_JOB_NAME)

    new_scores = topo.subscribe(topic="new_scores")

    full_significant_events = new_scores.map(SignificantEvents(), name="full_significant_events")
    full_significant_events.view( name="full_significant_events", description="significant event and history")

    significant_events = full_significant_events.map(lambda t: t["new"], name="significant_events")
    significant_events.view(name="significant_events", description="significant event only")
    significant_events.publish(topic="significant_events")
    
    return topo


### Publish Significant Events

This job publishes the significant scoring events to an external Kafka/Eventstreams bus, where other business logic can consume or display them. The topology here subscribes to the Streams topic `significant_events` and re-publishes to the chosen external Kafka topic.

A similar streams job could be added, subscribing to the same Streams topic, to do some other filtering or actions based on these significant events, such as automatically sending an email, or doing deeper customer analysis when significant events have occurred for that customer.


In [None]:
def createPublishSignificantEventsTopology():
    topo = Topology(PUBLISH_SIGNIFICANT_EVENTS_JOB_NAME)

    significant_events = topo.subscribe(topic="significant_events")
    significant_events_as_json = significant_events.as_json()

    with open(EVENTSTREAMS_CREDENTIALS) as f:
        eventstreams.publish(significant_events_as_json, topic=SIGNIFICANT_EVENTS_EVENTSTREAMS_TOPIC, credentials=json.load(f))
   
    return topo


## Job Submission

In the following cells, the various Streams topologies are built and jobs submitted to the Streams instance for continual execution.  To speed up the topology building process, many of the jobs are built and submitted at once, using Python threads.  However, building and submitting the final job, to actually replay the live events, is delayed until all the main Streams jobs have been submitted, so that they are fully up and running before any new live events are fed into the system.
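
The submit-in-parallel-then-wait pattern can be tried with stand-in work. In this sketch `fake_submit` is a hypothetical function that sleeps briefly in place of a real build and submission, and the job names are placeholders.

```python
import concurrent.futures
import time

def fake_submit(name, build_seconds):
    """Stand-in for building and submitting one job; returns the job name."""
    time.sleep(build_seconds)
    return name

order = []
with concurrent.futures.ThreadPoolExecutor() as executor:
    futures = {executor.submit(fake_submit, n, 0.05)
               for n in ["publish", "significant", "record", "score"]}
    # Barrier: block until every parallel submission has finished
    done, not_done = concurrent.futures.wait(futures)
    # result() re-raises any exception that occurred inside the worker thread
    order.extend(sorted(f.result() for f in done))

# Only now start the job that feeds events into the pipeline
order.append(fake_submit("replay", 0.0))
print(order)
```

Calling `result()` on each completed future is worth doing in practice, because `concurrent.futures.wait()` alone does not surface exceptions raised in the worker threads.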


In [None]:
import concurrent.futures

# Thread executor so we can build and submit streams jobs in parallel.
executor = concurrent.futures.ThreadPoolExecutor()
futureset = set()

In [None]:
if EVENTSTREAMS_CREDENTIALS is not None and SIGNIFICANT_EVENTS_EVENTSTREAMS_TOPIC is not None:
    sr1 = executor.submit(submitToStreams,createPublishSignificantEventsTopology())
    futureset.add(sr1)
    time.sleep(0.5)  # Sleep for a bit to give the thread a chance to start and create the progress widget in this cell
else:
    print("Skipping " + PUBLISH_SIGNIFICANT_EVENTS_JOB_NAME + " job creation and submission because either EVENTSTREAMS_CREDENTIALS or SIGNIFICANT_EVENTS_EVENTSTREAMS_TOPIC was set to None.")


In [None]:
sr2 = executor.submit(submitToStreams,createSignificantEventGenerationTopology())
futureset.add(sr2)
time.sleep(0.5)  # Sleep for a bit to give the thread a chance to start and create the progress widget in this cell

In [None]:
if CUSTOMER_SCORE_EVENTS_DATASET is not None and remote_db2credentials is not None:
    sr3 = executor.submit(submitToStreams,createRecordScoreHistoryTopology())
    futureset.add(sr3)
    time.sleep(0.5)  # Sleep for a bit to give the thread a chance to start and create the progress widget in this cell
else:
    print("Skipping " + RECORD_SCORE_HISTORY_JOB_NAME + " job creation and submission because CUSTOMER_SCORE_EVENTS_DATASET was None, or the dataset didn't exist.")


In [None]:
sr4 = executor.submit(submitToStreams,createScoreEventsTopology())
futureset.add(sr4)
time.sleep(0.5)  # Sleep for a bit to give the thread a chance to start and create the progress widget in this cell

Here we wait for the Streams jobs above to finish building and submitting before starting the Real-Time Event Generation job.

In [None]:
# Barrier to wait for all the currently running submission jobs before we move on to start the final real-time event generation job, below
concurrent.futures.wait(futureset)
futureset.clear()

In [None]:
sr5 = submitToStreams(createRealTimeEventGenerationTopology())
if(sr5.job):
    print("JobId: ", sr5.job.id, " Name: ", sr5.job.name)


## Job Listing, Status, and Control

The following cell defines some notebook widgets to display the currently submitted Streams jobs, with their status, and allows you to cancel them all, if you want.
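
A background polling loop like the one used for this status display can be made stoppable with a `threading.Event`. This is a sketch of one possible variation, not what the notebook cell does; `poll` is a hypothetical helper and the update callback just counts its calls.

```python
import threading
import time

def poll(update, stop_event, interval_s):
    """Call update() repeatedly until stop_event is set."""
    while not stop_event.is_set():
        update()
        # wait() returns early once the event is set, so shutdown is prompt
        stop_event.wait(interval_s)

calls = []
stop = threading.Event()
worker = threading.Thread(target=poll, args=(lambda: calls.append(1), stop, 0.01), daemon=True)
worker.start()
time.sleep(0.05)   # let the loop run a few times
stop.set()         # ask the loop to finish
worker.join(timeout=1.0)
print(len(calls) >= 1, worker.is_alive())
```

Marking the thread as a daemon and giving it a stop event means the loop neither keeps the kernel alive on shutdown nor has to be interrupted by hand.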


In [None]:
import ipywidgets as widgets
import threading
import pandas as pd
import time
import getpass

# Check for the job lists and their current health, and update the output
def updater(ow):
    df = pd.DataFrame(index=['Name','Health'])
    for job in streams_instance.get_jobs():
        if job.name.endswith(getpass.getuser()):
            df[job.id] = [job.name, job.health]
    ow.append_display_data(df.transpose())
    ow.clear_output(wait=True)

# The basic thread loop
def threadfunc(ow):
    while(True):
        updater(ow)
        time.sleep(5)

# Respond when they ask to cancel the jobs
def canceller(button, ow):
    button.disabled = True
    button.description = 'Cancellation in Progress ...'
    for job in streams_instance.get_jobs():
        if job.name.endswith(getpass.getuser()):
            try:
                job.cancel()
            except Exception:
                # Best effort: the job may already be gone, so keep cancelling the rest
                pass
    updater(ow)
    button.description = 'Cancel All Jobs'
    button.disabled = False
    
# Layout the widgets
label = widgets.HTML(value='<h3>Current State of Running Streams Jobs</h3>')
o = widgets.Output(layout={'border': '1px solid black'})
b = widgets.Button(description='Cancel All Jobs', button_style='danger', layout={'width': '25%'})
b.on_click(lambda w: canceller(w, o))
vbox = widgets.VBox([label, o, b])

# Start up the thread
t = threading.Thread(target=threadfunc, args=(o,))
t.start()


## List of Current Streams Jobs

Below, the current set of running Streams jobs will be listed, along with their 'health' state.  A job may be unhealthy if it is starting up, or shutting down, or has a run-time error that could not be recovered.  To see more information about the Streams jobs, go to the 'My Instances' panel in the Cloud Pak for Data navigation menu, and select the 'Jobs' tab.  From there, the operator graph of each job can be viewed, and job logs can be downloaded.  Individual jobs can be cancelled there, or all currently running jobs can be cancelled by clicking the button below.


In [None]:
# Display the Widgets
display(vbox)
