# Map UDFs and Business Rules
 
<HTML>
<br>
In this tutorial we will use a Map UDF to validate and handle data quality constraints and to apply business rules when calculating bid-ask spread, extending the example developed in the previous tutorial. We will be using the stocks table created in <a href="./1%20-%20Import%20UDF%20Simple%20Parser.ipynb" target="_self">Import UDF: Simple Parser</a>.
    <br><br>
    <div style="position: relative;
    padding: 10px 10px 20px 100px;
    border: 1px solid #BFBFBF;
    box-shadow: 5px 5px 5px #aaaaaa;">
    <img src="xi-questionmark_yellow.png" 
         style="position: absolute; top: 0px; left: 10px; width:50px; height:50px" />
    <font style="font-size:20px">
How to Write a Map UDF</font>
<br>For more information about writing a Map UDF, refer to the previous tutorial: <a href="./4%20-%20Map%20UDF%20Example%201.ipynb" target="_self">4 - Map UDF Example 1</a>.
    
</div>
<br>
</HTML>



### Business Rules
In the 'stocks' table, the ask price for some records might be 0 or negative. In business terms, this means the field has no current data, or the data is invalid or corrupted. In this case, fall back on the 'Last Sale' field to get the most recent valid price you have. If that field is also unavailable (0 or negative), return the value 1. A negative bid is treated as 0.

Below we enhance the function from our previous example to handle this business logic: 

In [1]:
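def bid_ask_spread(bid, ask, last_sale):
    ask = float(ask)
    # A negative bid is invalid; treat it as 0.
    bid = max(0, float(bid))
    # An ask of 0 or below is missing or invalid: fall back on the last sale.
    if ask <= 0:
        ask = float(last_sale)
    if ask > 0:
        # Spread expressed as a percentage of the ask price.
        spread = (ask - bid) * 100 / ask
    else:
        # Neither ask nor last sale is usable.
        spread = 1
    return str(spread)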

You can use this function on your Xcalar table to create a new column that holds the bid-ask spread, calculated from the ask price when it is valid and from the last sale price otherwise.
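
To see the three branches of the business rules side by side, here is a minimal sketch in plain Python. It restates the rules from the paragraph above under a hypothetical name, `spread_with_fallback`, and the prices are made-up values for illustration, not rows from the stocks table.

```python
def spread_with_fallback(bid, ask, last_sale):
    """Illustrative restatement of the business rules (hypothetical name)."""
    bid = max(0.0, float(bid))
    ask = float(ask)
    if ask <= 0:
        # Ask is missing or invalid: use the last sale price instead.
        ask = float(last_sale)
    if ask <= 0:
        # Last sale is unavailable as well: return the default value.
        return "1"
    return str((ask - bid) * 100 / ask)


# Made-up sample values, one per branch.
print(spread_with_fallback("9.5", "10", "9.8"))  # valid ask is used
print(spread_with_fallback("9.5", "0", "10"))    # ask is 0, last sale is used
print(spread_with_fallback("9.5", "-1", "0"))    # neither is usable, prints 1
```

The first two calls print the same spread because the fallback price in the second call equals the ask price in the first.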



### Connecting to Xcalar

Connect to Xcalar either by using the <b>CODE SNIPPETS</b> menu or by running the following cell:


In [2]:
# Xcalar Notebook Connector
# 
# Connects this Jupyter Notebook to the Xcalar Workbook <wb-1>
#
# To use any data from your Xcalar Workbook, run this snippet before other 
# Xcalar Snippets in your workbook. 
# 
# A best practice is not to edit this cell.
#
# If you wish to use this Jupyter Notebook with a different Xcalar Workbook 
# delete this cell and click CODE SNIPPETS --> Connect to Xcalar Workbook.

%matplotlib inline

# Importing third-party modules to facilitate data work. 
import pandas as pd
import matplotlib.pyplot as plt

# Importing Xcalar packages and modules. 
# For more information, search and post questions on discourse.xcalar.com
from xcalar.compute.api.XcalarApi import XcalarApi
from xcalar.compute.api.Session import Session
from xcalar.compute.api.WorkItem import WorkItem
from xcalar.compute.api.ResultSet import ResultSet

# Create a XcalarApi object
xcalarApi = XcalarApi()
# Connect to current workbook that you are in
workbook = Session(xcalarApi, "xdpadmin", "xdpadmin", 4399150, True, "TutorialNotebooks-HelloUDF-Full")
xcalarApi.setSession(workbook)

### Creating your Map UDF

As covered in the previous tutorial, create a Map UDF template in Jupyter.

1. Open the <b>CODE SNIPPETS</b> dropdown menu.
2. This time select <b>Create Map UDF</b>.
3. Fill out the following fields in the <b>UDF Template</b> modal:
  * <b>Module Name</b>: Enter 'business_rules' (the name of your UDF module).
  * <b>Function Name</b>: Enter 'bid_ask_spread' (the name of your UDF function).
  * <b>Table Name</b>: Choose the stocks table you created.
  * <b>Columns</b>: Select the 'Bid', 'Ask' and 'Last Sale' columns.
4. Click the <b>CONFIRM</b> button. A Map UDF template will be added to your Jupyter Notebook.
5. Finally, replace the 'bid_ask_spread' function stub in the template with the 'bid_ask_spread' function defined above.



### Running your UDF
With the above steps done, you are ready to use your UDF. <b>Run</b> the newly created cell and you will see the bid-ask spread printed for each row of the sample table. You can also use your Map UDF directly in XD by right-clicking a table column. The Map UDF template in Jupyter works only on the sample table, but when running from Xcalar you can apply the UDF to any table.

In [4]:
# Xcalar Map UDF Template
#
# This is a function definition for a Python Map UDF written to apply to 
# table: <stocks#6> columns: <stocks__Bid, stocks__Ask, stocks__Last_Sale>.
#
# Module name: <business_rules>
# Function name: <bid_ask_spread>
#
# REQUIREMENTS: Map UDF functions take one or more columns as arguments, and
# return a string. 
#
# To create a map UDF, edit the function definition below, named <bid_ask_spread>. 
#
# To test your map UDF, run this cell. (Hit <control> + <enter>.) 
#
# To apply the <business_rules> module to your table <stocks#6> 
# click the "Use UDF on Table stocks#6" button. 
#
# NOTE: Use discipline before replacing this module. Consider whether previous 
# uses of this map UDF could be broken by new changes. If so, versioning this 
# module may be appropriate. 
#
# Best practice is to name helper functions by starting with __. Such 
# functions will be considered private functions and will not be directly 
# invokable from Xcalar tools.
## Map UDF function definition.
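def bid_ask_spread(bid, ask, last_sale):
    ask = float(ask)
    # A negative bid is invalid; treat it as 0.
    bid = max(0, float(bid))
    # An ask of 0 or below is missing or invalid: fall back on the last sale.
    if ask <= 0:
        ask = float(last_sale)
    if ask > 0:
        # Spread expressed as a percentage of the ask price.
        spread = (ask - bid) * 100 / ask
    else:
        # Neither ask nor last sale is usable.
        spread = 1
    return str(spread)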

### WARNING DO NOT EDIT CODE BELOW THIS LINE ###
from xcalar.compute.api.Dataset import *
from xcalar.compute.coretypes.DataFormatEnums.ttypes import DfFormatTypeT
from xcalar.compute.api.Udf import Udf
from xcalar.compute.coretypes.LibApisCommon.ttypes import XcalarApiException
import random

def uploadUDF():
    import inspect
    sourceCode = "".join(inspect.getsourcelines(bid_ask_spread)[0])
    try:
        Udf(xcalarApi).add("business_rules", sourceCode)
    except XcalarApiException as e:
        if e.status == StatusT.StatusUdfModuleAlreadyExists:
            Udf(xcalarApi).update("business_rules", sourceCode)


# Publish Table to Jupyter Notebook
# 
# This snippet is configured to load <100> rows of Xcalar table <stocks#6> into a pandas dataframe named
# <stocks#6_pd>.
#
# To instantiate or refresh your pandas dataframe, run the Connect snippet, 
# and then run this snippet. 
#
# Best Practice is not to edit this code. 
#
# To use different data with this Jupyter Notebook:
# 1) Go to the table in your Xcalar Workbook.
# 2) From the table menu, click Publish to Jupyter.
# 3) Click full table or enter a number of rows and click submit.

# Imports data into a pandas dataframe.
def getDataFrameFromDict():
    from collections import OrderedDict
    resultSetPtr_6 = ResultSet(xcalarApi, tableName="stocks#17", maxRecords=100)
    stocks_6 = []
    for row in resultSetPtr_6:
        col_list = ["stocks::security","stocks::date","stocks::Bid","stocks::Ask","stocks::Bid size","stocks::Ask size","stocks::Last Sale","stocks::Last size","stocks::Volume","stocks::Total Sale",]
        kv_list = []
        for k in col_list:
            if k not in row:
                kv_list.append((k, None))
            else:
                kv_list.append((k, row[k]))
                if type(row[k]) is list:
                    for i in range(len(row[k])):
                        subKey = k + "[" + str(i) + "]"
                        if subKey in col_list:
                            row[subKey] = row[k][i]
        filtered_row = OrderedDict(kv_list)

        stocks_6.append(filtered_row)
    return pd.DataFrame.from_dict(stocks_6)
stocks_6_pd = getDataFrameFromDict()
for index, row in stocks_6_pd.iterrows():
    assert(type(bid_ask_spread(row["stocks::Bid"], row["stocks::Ask"], row["stocks::Last Sale"])).__name__ == 'str')
    print(bid_ask_spread(row["stocks::Bid"], row["stocks::Ask"], row["stocks::Last Sale"]))

uploadUDF()

1.3574217650268632
2.942638707485879
61.86401838991184
1.9888191323883326
0.9291894895445227
1.5311497692985014
2.88433720655365
1.1126191019758385
1.5802154785087639
0.883083217294609
3.2365825683544283
11.04388270121492
6.901891051544763
1.1146222683149105
2.364731285654689
1.3236100108410884
9.137725118136341
5.606079752763914
11.547988934556109
4.079331244512697
10.204521883756723
5.025585882475603
2.400679385020525
1.4548698548521142
1.811540434375807
1.5602663182430208
37.293968240442496
1.9465612053149692
87.78165524708035
1.406678015273137
1.0377897239823615
6.497490488643412
9.292956134769772
1.4784841649736427
3.688980360578879
4.096984616074888
2.58479144372402
0.6708848020559482
1.1830896317426787
7.7261402929900225
1.883967595038418
1.2688858591534222
1.6618517526636118
8.238441816596902
3.0099897514719514
5.020750351768567
1.1962988229235818
1.79029075218029
0.9749375074108885
2.1247632054333376
2.9260359547785746
0.9175932316051701
4.31253313738658
2.250450443254395
1.47
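
The template fills a column with `None` when a row does not contain it, and `float(None)` raises a `TypeError`. The following is a minimal sketch of one way to guard against that, assuming missing or unparsable fields should be treated like an unavailable price. The helper `__to_float`, the variant name `bid_ask_spread_safe`, and the sample rows are hypothetical; plain dictionaries stand in for the rows read from the table.

```python
def __to_float(value):
    """Hypothetical helper: map missing or unparsable input to 0.0."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return 0.0


def bid_ask_spread_safe(bid, ask, last_sale):
    """Illustrative variant of bid_ask_spread that tolerates missing fields."""
    bid = max(0.0, __to_float(bid))
    ask = __to_float(ask)
    if ask <= 0:
        # Ask is missing or invalid: use the last sale price instead.
        ask = __to_float(last_sale)
    if ask <= 0:
        return "1"
    return str((ask - bid) * 100 / ask)


# Made-up rows keyed by the column names used in the template.
rows = [
    {"stocks::Bid": "19.0", "stocks::Ask": "20.0", "stocks::Last Sale": "19.5"},
    {"stocks::Bid": "19.0", "stocks::Ask": None, "stocks::Last Sale": "20.0"},
    {"stocks::Bid": None, "stocks::Ask": "n/a", "stocks::Last Sale": None},
]

results = [
    bid_ask_spread_safe(r["stocks::Bid"], r["stocks::Ask"], r["stocks::Last Sale"])
    for r in rows
]
for value in results:
    # Same check as the template: a Map UDF must return a string.
    assert isinstance(value, str)
    print(value)
```

The helper name starts with `__`, following the template's note that such functions are treated as private and are not directly invokable from Xcalar tools.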

<html>
 Next: <a href="./7%20-%20Export%20UDF.ipynb" target="_self">7 - Export UDF</a><br>
 Back to <a href="./0%20-%20Introduction.ipynb" target="_self">Introduction</a><br>
</html>