# How to Score JMP Models in ESP

## 0. Setting Environment Variables

JMP can export its models as Python scripts, which SAS Event Stream Processing (ESP) can run with the help of SAS Micro Analytic Service (MAS). To enable Python in MAS, you must set the following two environment variables on the system where you run your ESP server.

<code>export MAS_PYPATH=<*pathto*>/python</code>

<code>export MAS_M2PATH=/opt/sas/viya/home/SASFoundation/misc/embscoreeng/mas2py.py</code>
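Before you start the ESP server, it can help to confirm that both variables are visible to the process. The following is a minimal sketch and not part of the original example: the helper name `missing_mas_variables` and the sample path are hypothetical, and the check only verifies that the variables are set and non-empty, not that the paths are valid.

```python
import os

REQUIRED_VARIABLES = ("MAS_PYPATH", "MAS_M2PATH")

def missing_mas_variables(environ=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARIABLES if not env.get(name)]

# Example with an explicit mapping instead of the real environment.
# The path below is illustrative only.
example_env = {"MAS_PYPATH": "/usr/bin/python"}
print(missing_mas_variables(example_env))  # ['MAS_M2PATH']
```

Calling `missing_mas_variables()` with no argument inspects the environment of the current Python process.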

This example uses the following supporting files:
* mnist_input_data.py
* mnist_jmp_test_red.csv
* In the MNIST_data directory: 
 * t10k-images-idx3-ubyte.gz
 * t10k-labels-idx1-ubyte.gz
 * train-images-idx3-ubyte.gz
 * train-labels-idx1-ubyte.gz
* In the demo_models/JMP directory: NN20_20red.py, NN40_40.py, NN50_50.py, NN60_60.py, NN100_100.py, and the contents of the \_pycache\_ subdirectory
* The contents of the \_pycache\_ subdirectory

Ensure that you specify the correct paths to these referenced files throughout the notebook.
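One way to catch a wrong path early is to check the supporting files before running the rest of the notebook. The sketch below assumes a hypothetical helper named `missing_files`; the demonstration runs in a temporary directory with a placeholder file, so substitute your own base directory and file list.

```python
import tempfile
from pathlib import Path

def missing_files(base_dir, relative_paths):
    """Return the relative paths that do not exist under base_dir."""
    base = Path(base_dir)
    return [rel for rel in relative_paths if not (base / rel).exists()]

# Demonstration in a temporary directory that holds only one of two files.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "mnist_input_data.py").write_text("# placeholder\n")
    demo_result = missing_files(tmp, ["mnist_input_data.py", "mnist_jmp_test_red.csv"])

print(demo_result)  # ['mnist_jmp_test_red.csv']
```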

In [1]:
import sys
sys.path.append("<pathto>python-esppy") # This is unique for each user

## 1. Loading Data

Begin by importing the mnist_input_data module and loading the MNIST data. This data contains 60,000 training examples and 10,000 test examples of handwritten digits.

You extract the test portion of this data into two arrays: test_images and test_labels. Later, you use test_images to display the digits that the model scores.

In [2]:
import mnist_input_data
mnist = mnist_input_data.read_data_sets("<path>/MNIST_data/", one_hot=True)
test_images = mnist.test.images
test_labels = mnist.test.labels
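Because the data is loaded with `one_hot=True`, each row of test_labels is expected to be a 10-element vector with a single 1 at the position of the digit, assuming the loader follows the usual one-hot convention. The sketch below shows how such a vector maps back to a digit; the helper name `one_hot_to_digit` and the sample label are hypothetical.

```python
def one_hot_to_digit(label):
    """Return the index of the largest entry in a one-hot label vector."""
    return max(range(len(label)), key=lambda i: label[i])

# A hand-written example label, not taken from the data set.
example_label = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
print(one_hot_to_digit(example_label))  # 3
```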


## 2. Creating Demo Project

To create a SAS Event Stream Processing project, you first need to import the esppy library. 

Ensure that you have the latest version of SAS Event Stream Processing on your machine by running <code>git pull</code> in the ESP directory on your system. (The earliest version that you can use is 6.1.)

You then run <code>esppy.ESP</code> to establish a connection with your ESP server. You must specify a host and port to successfully establish a server connection.

In [None]:
import esppy

In [None]:
esp = esppy.ESP('http://<host>:<port>')
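The connection string has the form `http://<host>:<port>`. As a small sketch, the hypothetical helper `esp_url` below assembles that string and verifies that it parses back to the same host and port; the values `localhost` and `8777` are placeholders, not defaults taken from this example.

```python
from urllib.parse import urlsplit

def esp_url(host, port):
    """Build 'http://<host>:<port>' and check that it parses back correctly."""
    url = "http://{}:{}".format(host, int(port))
    parts = urlsplit(url)
    if parts.hostname != host.lower() or parts.port != int(port):
        raise ValueError("unexpected host or port in {!r}".format(url))
    return url

print(esp_url("localhost", 8777))  # http://localhost:8777
```

The resulting string can then be passed to <code>esppy.ESP</code> in place of the literal shown above.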

You create a SAS Event Stream Processing project by running <code>esp.create_project(*project*)</code>. Here, you specify *esp_mnist* as your project and name it proj. 

In [None]:
proj = esp.create_project('esp_mnist')

Now you create a schema. A schema is used to ensure that the data types you want processed in your model match the type of data contained in the data set that you loaded from *mnist_input_data*.

The schema you define contains a key field *id* of type *int64*, 255 fields named *v(number)* of type *double*, where *(number)* ranges from 1 to 255, and a *digit* field of type *string*.

In [None]:
schema = ['id*:int64']
for i in range(255):
    schema.append('v{}'.format(i+1)+':'+'double')
schema.append('digit:string')
schema = tuple(schema)
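To see what the loop above produces, the following self-contained sketch rebuilds the same tuple and inspects its ends. The function name `build_schema` is hypothetical and is used here only for illustration.

```python
def build_schema(n_features=255):
    """Return the field definitions: a key, n_features doubles, and a label."""
    fields = ['id*:int64']
    fields.extend('v{}:double'.format(i) for i in range(1, n_features + 1))
    fields.append('digit:string')
    return tuple(fields)

schema = build_schema()
print(len(schema))   # 257
print(schema[:2])    # ('id*:int64', 'v1:double')
print(schema[-2:])   # ('v255:double', 'digit:string')
```

The total of 257 fields is the key field, the 255 numeric fields, and the label field.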

You use this schema to feed the data you downloaded from *mnist_input_data* into a Source window you name *JMP_src*. The schema examines the data and formats it correctly for Python to interpret.

In [None]:
JMP_src = esp.SourceWindow(schema=schema, index_type='empty', insert_only=True)
proj.windows['w_data1'] = JMP_src

Here, you store the path to a previously exported model file in *JMP_model_file* and register that model under the name *JMP_NN*.

You define a Calculate window and name it *JMP_win*. The Calculate window is where the analytical part of the model runs. Data from the Source window flows into the Calculate window, where it is analyzed to produce an output event.

You must specify the **path** to the Python file that contains the model.

In [None]:
JMP_model_file = '<path>/NN60_60.py'
JMP_win = esp.CalculateWindow.JMPHelper(copy_vars = ('digit:string'))
JMP_win.add_model_info(model_name='JMP_NN', 
                       model_file=JMP_model_file, source='w_data1')

An edge is used to connect two windows. In this case, you use an edge with the role of data to connect the *JMP_src* data window to *JMP_win*. For more information on using edges, see [Edge Roles](https://go.documentation.sas.com/?cdcId=espcdc&cdcVersion=6.1&docsetId=espan&docsetTarget=p0v2sood1298h8n10tvox93xh2tb.htm).

In [None]:
proj.windows["w_JMP"] = JMP_win

JMP_src.add_target(JMP_win, role='data')

You create a Calculate window that computes the model's fit statistics, commonly referred to as FitStat. You use <code>esp.calculate.FitStat</code> and name this Calculate window *JMP_fitstat*. You must specify several parameters, such as <code>schema</code>, <code>classLabels</code>, and <code>windowLength</code>. You must also map the inputs and outputs. For more information on FitStat windows, see [Computing Fit Statistics for Scored Results](https://go.documentation.sas.com/?cdcId=espcdc&cdcVersion=6.1&docsetId=espan&docsetTarget=p1k5j3rok1x59on15i884xa66ajq.htm&locale=e).

In [None]:
JMP_fitstat = esp.calculate.FitStat(schema=('id*:int64','mceOut:double'),
                                      classLabels='0,1,2,3,4,5,6,7,8,9',
                                      windowLength=100)

inputs = tuple(['Probability__digit_{}__:double'.format(i) for i in range(10)])

JMP_fitstat.set_inputs(inputs=inputs, 
                         response=('digit:string'))
JMP_fitstat.set_outputs(mceOut='mceOut:double')
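As a conceptual illustration only, the sketch below computes a misclassification rate over a sliding window of recent events. It assumes that *mceOut* is a classification error rate and that <code>windowLength</code> limits how many recent events contribute. It is not the FitStat implementation, and the function name `windowed_error_rates` and the sample values are hypothetical.

```python
from collections import deque

def windowed_error_rates(predicted, actual, window_length=100):
    """Return the error rate over the most recent events after each event."""
    recent = deque(maxlen=window_length)  # old entries drop out automatically
    rates = []
    for p, a in zip(predicted, actual):
        recent.append(0 if p == a else 1)
        rates.append(sum(recent) / len(recent))
    return rates

# Hand-written example with a window of 2 events.
example_rates = windowed_error_rates(['1', '2', '3', '4'],
                                     ['1', '0', '3', '0'],
                                     window_length=2)
print(example_rates)  # [0.0, 0.5, 0.5, 0.5]
```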

Here, you use an edge to connect the *JMP_win* window to *JMP_fitstat* with the role of data. 

In [None]:
proj.windows['w_JMP_fitstat'] = JMP_fitstat

JMP_win.add_target(JMP_fitstat, role='data')
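At this point the three windows form a simple chain. The sketch below models that chain as a plain dictionary to make the flow of events explicit. It is an illustration of the topology only and does not use esppy; the `edges` mapping and the `flow_from` helper are hypothetical names.

```python
# Each key is a window name; each value lists the windows it feeds.
edges = {
    "w_data1": ["w_JMP"],
    "w_JMP": ["w_JMP_fitstat"],
    "w_JMP_fitstat": [],
}

def flow_from(start, graph):
    """Follow the first outgoing edge from each window and return the path."""
    path = [start]
    while graph[path[-1]]:
        path.append(graph[path[-1]][0])
    return path

print(flow_from("w_data1", edges))  # ['w_data1', 'w_JMP', 'w_JMP_fitstat']
```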

Here, you print the project's XML definition so that you can review it. This step is optional and does not affect whether the model runs correctly.

In [None]:
print(proj.to_xml(pretty=True))

## 3. Loading the Project into ESP

You load your project to the ESP server using <code>esp.load_project</code>.

In [None]:
esp.load_project(proj)

## 4. Publishing Data and Subscribing to Results

To view results, you must subscribe to the JMP windows and dataframes you have created. 

In [None]:
JMP_win.subscribe()
JMP_src.subscribe()

To read the data contained in the CSV file, you must import the pandas library.

In [None]:
import pandas as pd

Start a new thread to continuously read in and publish data to your notebook. 

First, you define a function named *publish_thread* for the new thread to run. The function reads *mnist_jmp_test_red.csv* with <code>pd.read_csv</code> and stores the result in *mnist_jmp_test_red*.

You also must specify how to publish the data, which you do by providing arguments to <code>window.publish_events</code>. Here, you tell the thread to publish the first 500 rows of the file, pausing 50 milliseconds between injections, with a maximum of 10 events per second.

In [None]:
def publish_thread(window):
    mnist_jmp_test_red = pd.read_csv('./mnist_jmp_test_red.csv')
    window.publish_events(mnist_jmp_test_red.head(500), pause=50, rate=10)
    
from threading import Thread
thread = Thread(target = publish_thread, args = (JMP_src, ))
thread.start()
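The threading pattern can be tried without an ESP server. The following sketch replaces the Source window with a hypothetical `RecordingWindow` class that only records the rows it receives and ignores `pause` and `rate`; the row data is also made up. It shows that the work runs in a background thread and that `join` waits for it to finish.

```python
import threading

class RecordingWindow:
    """Hypothetical stand-in for a Source window; it only stores the rows."""
    def __init__(self):
        self.received = []

    def publish_events(self, rows, pause=0, rate=0):
        # pause and rate are accepted but ignored in this stand-in.
        self.received.extend(rows)

def publish_thread(window, rows, limit=500):
    """Publish at most `limit` rows to the given window."""
    window.publish_events(rows[:limit], pause=50, rate=10)

window = RecordingWindow()
rows = [{"id": i} for i in range(600)]  # made-up rows
thread = threading.Thread(target=publish_thread, args=(window, rows))
thread.start()
thread.join()  # wait until publishing has finished
print(len(window.received))  # 500
```

In the notebook itself you typically do not call `join`, so that later cells can inspect the windows while publishing continues.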

You can use the <code>.tail</code> method to print rows of the *JMP_src* and *JMP_win* dataframes that you have created to your screen. By default, <code>.tail</code> prints the last 5 rows.

In [None]:
JMP_src.tail()

In [None]:
JMP_win.tail()

## 5. Displaying Results

Use the matplotlib.pyplot library to display images of the handwritten digits from the MNIST data set. To use this library, you must first import it.

In [None]:
import matplotlib.pyplot as plt

The following block of code creates two images, working from the bottom of the dataframe that you created earlier. The first image shows a correct prediction from your model, while the second image shows an incorrect prediction. There are several pieces of this block of code that are important to understand.

First, <code>%matplotlib inline</code> allows for images to be displayed in the Jupyter Notebook. This line must be included to view the two graphs you create.

Second, you use <code>fig.add_subplot</code> to describe how you would like your plots to be arranged and which position each plot occupies. For example, <code>ax1 = fig.add_subplot(121)</code> specifies a grid of 1 row and 2 columns for the two plots you are creating, and places the first plot at index 1.

Third, you create two <code>if</code> statements that handle the correct identification and the incorrect identification separately. Each image is drawn only if a matching row exists.
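The three-digit argument to <code>fig.add_subplot</code> packs the number of rows, the number of columns, and the plot index into one integer. The sketch below unpacks such a code in plain Python; the helper name `decode_subplot_code` is hypothetical and is not part of matplotlib.

```python
def decode_subplot_code(code):
    """Split a three-digit subplot code into (rows, columns, index)."""
    if not 111 <= code <= 999:
        raise ValueError("expected a three-digit integer")
    rows, remainder = divmod(code, 100)
    columns, index = divmod(remainder, 10)
    return rows, columns, index

print(decode_subplot_code(121))  # (1, 2, 1)
print(decode_subplot_code(122))  # (1, 2, 2)
```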

In [None]:
%matplotlib inline

fig = plt.figure(figsize=(7,3), dpi=80)
plt.tight_layout()

ax1 = fig.add_subplot(121)
ax2 = fig.add_subplot(122)
fig.canvas.draw()

n = len(JMP_win)
tmp = JMP_win[:n]

index = tmp[tmp['Most_Likely_digit'] == tmp['digit']].tail(1).index.values
correct_id = index[0] if len(index) > 0 else None

index = tmp[tmp['Most_Likely_digit'] != tmp['digit']].tail(1).index.values
incorrect_id = index[0] if len(index) > 0 else None

if correct_id is not None:
    ax1.clear() 
    ax1.imshow(test_images[correct_id].reshape(28,28), cmap='gray', interpolation='nearest')
    ax1.set_title("JMP Correct Prediction: {}".format(JMP_win.loc[correct_id][10]), fontsize=10)
        
if incorrect_id is not None:
    ax2.clear() 
    ax2.imshow(test_images[incorrect_id].reshape(28,28), cmap='gray', interpolation='nearest')
    ax2.set_title("JMP Incorrect Prediction: {}".format(JMP_win.loc[incorrect_id][10]), fontsize=10)
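The selection logic in the cell above picks the most recent row where the predicted digit matches the label, and the most recent row where it does not. The following sketch expresses the same idea with a plain list of dictionaries instead of a dataframe. The helper name `last_correct_and_incorrect` and the sample rows are hypothetical.

```python
def last_correct_and_incorrect(rows):
    """Return (id of last correct row, id of last incorrect row), or None."""
    correct_id = None
    incorrect_id = None
    for row in rows:
        if row["Most_Likely_digit"] == row["digit"]:
            correct_id = row["id"]
        else:
            incorrect_id = row["id"]
    return correct_id, incorrect_id

# Made-up scored rows, in arrival order.
sample_rows = [
    {"id": 0, "digit": "7", "Most_Likely_digit": "7"},
    {"id": 1, "digit": "2", "Most_Likely_digit": "3"},
    {"id": 2, "digit": "1", "Most_Likely_digit": "1"},
]
print(last_correct_and_incorrect(sample_rows))  # (2, 1)
```

A value of `None` for either element corresponds to the case where the matching plot is skipped.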

## 6. Cleanup

Finally, it is a good practice to clean up your workspace. Here, you unsubscribe from *JMP_win* and *JMP_src*, delete the project, and optionally shut down your ESP server.

In [None]:
JMP_win.unsubscribe()
JMP_src.unsubscribe()

esp.delete_project("esp_mnist")

After you finish running your ESP project, you might want to shut down your ESP server. Uncomment the code below and run <code>esp.shutdown()</code> to shut down your server.

In [None]:
#esp.shutdown()