## Map UDF Testing & Error Handling

Errors are common when using a Map UDF. Because a Map UDF runs on every row of a table, each row that contains invalid data throws an exception. The UDF continues to run on the remaining rows, but the engine must process every exception, which causes a significant slowdown. Adding error handling prevents this performance degradation and helps you troubleshoot potential problems with your data.
<html>
    <br>
This Jupyter Notebook guides you through writing a simple function and then adding error handling to it. This tutorial uses the data generated in <a href="./2%20-%20Import%20UDF%20Parser%20Debugging%20and%20Troubleshooting.ipynb" target="_self">Import UDF: Parser Troubleshooting and Debugging</a>.<br>
</html>


### The Bid-Ask Spread

We will continue with the bid-ask spread example from the previous Jupyter Notebook. As a reminder, the bid-ask spread is the relationship between the highest price a buyer is willing to pay for a stock (the bid) and the lowest price a seller is willing to accept (the ask). It is expressed either as the difference between the ask and the bid, or as that difference as a percentage of the ask. The spread metric is a useful tool for determining if a stock will trend upwards or downwards.

In [1]:
def bid_ask_spread(bid, ask):
    spread = (float(ask) - float(bid)) * 100 / float(ask)
    return str(spread)

If the above function is used on a non-numeric column, it will throw an exception. 
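As a quick illustration of these failure modes, the sketch below calls the unprotected function with a valid pair, a non-numeric bid, and an ask of zero. The sample values are hypothetical and are not rows from the tutorial dataset.

```python
def bid_ask_spread(bid, ask):
    spread = (float(ask) - float(bid)) * 100 / float(ask)
    return str(spread)

# Hypothetical sample inputs, not rows from the tutorial dataset.
samples = [("99.0", "100.0"), ("n/a", "100.0"), ("99.0", "0")]

outcomes = []
for bid, ask in samples:
    try:
        outcomes.append(bid_ask_spread(bid, ask))
    except (ValueError, ZeroDivisionError) as e:
        # Record the type of exception raised by the unprotected function.
        outcomes.append(type(e).__name__)

print(outcomes)  # ['1.0', 'ValueError', 'ZeroDivisionError']
```

In a Map UDF, each of the two failing rows would raise an exception that the engine has to process.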

### Adding Error Logging

One technique for troubleshooting UDFs is the Xcalar Logger. Xcalar provides a logging library that writes UDF output to a dedicated directory in Xcalar. This [discourse article](https://discourse.xcalar.com/t/white-check-mark-having-a-hard-time-with-debugging-my-udf/291) contains more information about the Xcalar Logger.

<html><br>Please review the <a href="./2%20-%20Import%20UDF%20Parser%20Debugging%20and%20Troubleshooting.ipynb" target="_self">Import UDF: Simple Parser with Troubleshooting and Debugging</a> tutorial for information about accessing logs and other troubleshooting details.</html> 

In [2]:
def bid_ask_spread(bid, ask):
    import datetime
    import logging
    try:
        spread = (float(ask) - float(bid)) * 100 / float(ask)
        return str(spread)
    except ZeroDivisionError:
        # Add error logging. These logs will appear in the file xpu.out.
        log_dict = {}
        log_dict["Time Stamp"] = datetime.datetime.utcnow().strftime("%I:%M%p on %B %d, %Y")
        log_dict["Source UDF"] = "bid_ask_spread"
        log_dict["Description"] = "Error: Ask is 0"
        logging.error(log_dict)
    except Exception as e:
        # Log any other failure, such as a non-numeric bid or ask.
        log_dict = {}
        log_dict["Time Stamp"] = datetime.datetime.utcnow().strftime("%I:%M%p on %B %d, %Y")
        log_dict["Source UDF"] = "bid_ask_spread"
        log_dict["Description"] = "Error: " + str(e)
        logging.error(log_dict)
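To check locally what such a log entry looks like, one option is to attach an in-memory handler to Python's standard `logging` module and inspect the captured text. This is only a sketch for testing outside Xcalar: the in-memory buffer stands in for the real log file, the log dictionary is trimmed to two fields, and the input values are hypothetical.

```python
import io
import logging

def bid_ask_spread(bid, ask):
    try:
        spread = (float(ask) - float(bid)) * 100 / float(ask)
        return str(spread)
    except Exception as e:
        # Trimmed version of the log dictionary, for illustration only.
        logging.error({"Source UDF": "bid_ask_spread",
                       "Description": "Error: " + str(e)})

# Capture log output in memory (assumption: local testing only).
buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
root = logging.getLogger()
root.addHandler(handler)

result = bid_ask_spread("99.0", "abc")  # hypothetical non-numeric ask
root.removeHandler(handler)
captured = buffer.getvalue()

print(result)    # None, because the except branch does not return a value
print(captured)  # the logged dictionary, rendered as text
```

Note that the logging version returns `None` for failing rows, since nothing is returned after the error is logged.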

### Inserting Errors into the Xcalar Table


An alternative way to troubleshoot your UDF is to return the error from the map operation in place of the expected result. Map UDFs are executed for every row in a table, so careless use of the logger may flood the logging buffer. When the error is returned instead, it appears in the output field of your Xcalar table, where users can filter on it.

In [3]:
def bid_ask_spread(bid, ask):
    try:
        spread = (float(ask) - float(bid)) * 100 / float(ask)
        return str(spread)
    except ZeroDivisionError:
        return "Error: Ask is 0"  # The error appears in the result table instead of flooding the logger
    except Exception as e:
        return "Error: " + str(e)
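The filtering idea can be sketched in plain Python, with a list of dictionaries standing in for table rows. The rows and the `Spread` field name below are hypothetical. In Xcalar the filter would be applied to the result column of the table itself.

```python
def bid_ask_spread(bid, ask):
    try:
        spread = (float(ask) - float(bid)) * 100 / float(ask)
        return str(spread)
    except ZeroDivisionError:
        return "Error: Ask is 0"
    except Exception as e:
        return "Error: " + str(e)

# Hypothetical rows standing in for a table.
rows = [
    {"Bid": "99.0", "Ask": "100.0"},
    {"Bid": "99.0", "Ask": "0"},
    {"Bid": "abc", "Ask": "100.0"},
]

# Apply the UDF to every row, as a map operation would.
for row in rows:
    row["Spread"] = bid_ask_spread(row["Bid"], row["Ask"])

# Separate rows that carry an error message from rows with a valid result.
error_rows = [r for r in rows if r["Spread"].startswith("Error:")]
valid_rows = [r for r in rows if not r["Spread"].startswith("Error:")]

print(len(valid_rows), "valid,", len(error_rows), "with errors")
```

Using a fixed prefix such as `Error:` keeps the failing rows easy to select with a simple string filter.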


## Testing Python code in Jupyter Notebook

To test your Python code in Jupyter, highlight the code cell and click the <b>run</b> button. Jupyter invokes the Python interpreter on the cell and prints the results, including any errors, under the code cell. This lets you debug and test your UDF functions before uploading them to the Xcalar UDF editor for storage in your Workbook.
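Before connecting to Xcalar, the function can be exercised in a cell on its own. The sketch below runs it against a few hand-written inputs (hypothetical values chosen to hit each branch) and checks that every call returns a string, which is the requirement for Map UDFs.

```python
def bid_ask_spread(bid, ask):
    try:
        spread = (float(ask) - float(bid)) * 100 / float(ask)
        return str(spread)
    except ZeroDivisionError:
        return "Error: Ask is 0"
    except Exception as e:
        return "Error: " + str(e)

# Hypothetical inputs: valid pair, zero ask, empty string, missing value.
test_inputs = [("10", "12.5"), ("10", "0"), ("", "12.5"), (None, "12.5")]

results = [bid_ask_spread(bid, ask) for bid, ask in test_inputs]
for inputs, result in zip(test_inputs, results):
    # A Map UDF must always return a string, even for bad input.
    assert isinstance(result, str)
    print(inputs, "->", result)
```

If any assertion fails, Jupyter prints the traceback under the cell, which points to the input that was not handled.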

In [4]:
# Xcalar Notebook Connector
# 
# Connects this Jupyter Notebook to the Xcalar Workbook <wb-1>
#
# To use any data from your Xcalar Workbook, run this snippet before other 
# Xcalar Snippets in your workbook. 
# 
# A best practice is not to edit this cell.
#
# If you wish to use this Jupyter Notebook with a different Xcalar Workbook 
# delete this cell and click CODE SNIPPETS --> Connect to Xcalar Workbook.

%matplotlib inline

# Importing third-party modules to facilitate data work. 
import pandas as pd
import matplotlib.pyplot as plt

# Importing Xcalar packages and modules. 
# For more information, search and post questions on discourse.xcalar.com
from xcalar.compute.api.XcalarApi import XcalarApi
from xcalar.compute.api.Session import Session
from xcalar.compute.api.WorkItem import WorkItem
from xcalar.compute.api.ResultSet import ResultSet

# Create a XcalarApi object
xcalarApi = XcalarApi()
# Connect to current workbook that you are in
workbook = Session(xcalarApi, "xdpadmin", "xdpadmin", 4399150, True, "TutorialNotebooks-HelloUDF-Full")
xcalarApi.setSession(workbook)

<HTML>
As explained in <a href="./4%20-%20Map%20UDF%20Example.ipynb" target="_self">Map UDF Example</a>, connect your Jupyter Notebook to Xcalar, and create a Map UDF template. Note that you will be using the *stocks_bad* table created in <a href="./2%20-%20Import%20UDF%20Parser%20Debugging%20and%20Troubleshooting.ipynb" target="_self">Import UDF: Simple Parser with Troubleshooting and Debugging</a>.
</HTML>

1. Open the <b>CODE SNIPPETS</b> dropdown menu again.
2. This time select <b>Create Map UDF</b>.
3. Fill out the following fields in the <b>UDF Template</b> modal:
  * <b>Module Name</b>: Choose a name for your UDF.
  * <b>Function Name</b>: Enter 'bid_ask_spread' (the name of your UDF function).
  * <b>Table Name</b>: Choose the *stocks* table (stocks#6 in the example below, but you should generate your own template that will come with a different name/suffix).
  * <b>Columns</b>: Select the bid column, and the ask column.
4. Click the <b>CONFIRM</b> button. A Map UDF template will be added to your workbook.

Replace the template function with your version. Running your code will upload your UDF. 

In [6]:
# Xcalar Map UDF Template
#
# This is a function definition for a Python Map UDF written to apply to 
# table: <stocks#6> columns: <stocks__Bid, stocks__Ask>.
#
# Module name: <bid_ask_spread_check>
# Function name: <bid_ask_spread>
#
# REQUIREMENTS: Map UDF functions take one or more columns as arguments, and
# return a string. 
#
# To create a map UDF, edit the function definition below, named <bid_ask_spread>. 
#
# To test your map UDF, run this cell. (Hit <control> + <enter>.) 
#
# To apply the <bid_ask_spread_check> module to your table <stocks#6> 
# click the "Use UDF on Table stocks#6" button. 
#
# NOTE: Use discipline before replacing this module. Consider whether previous 
# uses of this map UDF could be broken by new changes. If so, versioning this 
# module may be appropriate. 
#
# Best practice is to name helper functions by starting with __. Such 
# functions will be considered private functions and will not be directly 
# invokable from Xcalar tools.
## Map UDF function definition.
def bid_ask_spread(bid, ask):
    try:
        spread = (float(ask) - float(bid)) * 100 / float(ask)
        return str(spread)
    except ZeroDivisionError:
        return "Error: Ask is 0"  # The error appears in the result table instead of flooding the logger
    except Exception as e:
        return "Error: " + str(e)

### WARNING DO NOT EDIT CODE BELOW THIS LINE ###
from xcalar.compute.api.Dataset import *
from xcalar.compute.coretypes.DataFormatEnums.ttypes import DfFormatTypeT
from xcalar.compute.api.Udf import Udf
from xcalar.compute.coretypes.LibApisCommon.ttypes import XcalarApiException
import random

def uploadUDF():
    import inspect
    sourceCode = "".join(inspect.getsourcelines(bid_ask_spread)[0])
    try:
        Udf(xcalarApi).add("bid_ask_spread_check", sourceCode)
    except XcalarApiException as e:
        if e.status == StatusT.StatusUdfModuleAlreadyExists:
            Udf(xcalarApi).update("bid_ask_spread_check", sourceCode)


# Publish Table to Jupyter Notebook
# 
# This snippet is configured to load <100> rows of Xcalar table <stocks#6> into a pandas dataframe named
# <stocks#6_pd>.
#
# To instantiate or refresh your pandas dataframe, run the Connect snippet, 
# and then run this snippet. 
#
# Best Practice is not to edit this code. 
#
# To use different data with this Jupyter Notebook:
# 1) Go to the table in your Xcalar Workbook.
# 2) From the table menu, click Publish to Jupyter.
# 3) Click full table or enter a number of rows and click submit.

# Imports data into a pandas dataframe.
def getDataFrameFromDict():
    from collections import OrderedDict
    resultSetPtr_6 = ResultSet(xcalarApi, tableName="stocks#17", maxRecords=100)
    stocks_6 = []
    for row in resultSetPtr_6:
        col_list = ["stocks::security","stocks::date","stocks::Bid","stocks::Ask","stocks::Bid size","stocks::Ask size","stocks::Last Sale","stocks::Last size","stocks::Volume","stocks::Total Sale",]
        kv_list = []
        for k in col_list:
            if k not in row:
                kv_list.append((k, None))
            else:
                kv_list.append((k, row[k]))
                if type(row[k]) is list:
                    for i in range(len(row[k])):
                        subKey = k + "[" + str(i) + "]"
                        if subKey in col_list:
                            row[subKey] = row[k][i]
        filtered_row = OrderedDict(kv_list)

        stocks_6.append(filtered_row)
    return pd.DataFrame.from_dict(stocks_6)
stocks_6_pd = getDataFrameFromDict()
for index, row in stocks_6_pd.iterrows():
    assert(type(bid_ask_spread(row["stocks::Bid"], row["stocks::Ask"])).__name__ == 'str')
    print(bid_ask_spread(row["stocks::Bid"], row["stocks::Ask"]))

uploadUDF()

41.829488258198936
4.89157322601784
-35.504574594263644
52.694848374308464
66.99142009778338
89.85617826632138
-35.89257526964109
43.55145633313592
75.14235520572845
93.38578651371826
-35.522211970039635
-65.07921091618199
-59.361583911698794
19.792538316037138
28.474250766294404
91.4524856682751
-62.28771657526619
-60.88464614862101
-71.26662210840908
-51.269992779906346
-68.6062369866209
-52.597833118792046
-12.90575194478862
-10.901533947318214
24.446799086707927
-32.45979824933984
-54.84567763689259
33.7747163341336
-9.246595698073726
44.48630397650268
88.99021916706484
-47.65642919699263
-64.57878365392305
70.77163771593949
-33.69780625700948
-56.20147394532976
-41.62794650796541
56.96815654165252
68.2660718715413
-63.97805934581686
30.878092847526077
67.64869716243395
44.80836604180834
-60.73361666005653
-5.132652331800202
-44.99518240931987
28.04841318125183
-32.054034405150205
91.3406203856994
-14.819505143056816
-35.427932545635144
89.25039322326785
-43.9569765080728
-39.74083

<html>
 Next: <a href="./6%20-%20Map%20UDF%20Business%20Rules.ipynb" target="_self">6 - Map UDF Business Rules</a><br>
 Back to <a href="./0%20-%20Introduction.ipynb" target="_self">Introduction</a><br>
</html>