# Creating a Map UDF with Jupyter
<HTML>
   <br> 
As you learned in the <a href="./0%20-%20Introduction.ipynb" target="_self">Introduction</a>, a UDF can be specified as a Map function when executing a Map operation. As a reminder, Map operations apply a function to every row in the table, and use the output of this function to populate a new column, creating an extra field for every row. <br> <br>
    In this tutorial we will implement a Map UDF that calculates the [bid-ask spread](https://www.investopedia.com/terms/b/bid-askspread.asp), which is defined as the amount by which the ask price exceeds the bid price for an asset in the market. It represents the difference between the highest price a buyer is willing to pay for a stock and the lowest price a seller is willing to accept. The spread is widely used as an indicator of how liquid a stock is. The bid-ask spread can be calculated as a percentage in the following way:<br>
  <ol>
      `100 * (ask - bid) / ask` <br>
    </ol>
If, for example, the ask price is `$20` and the bid price is `$19`, the bid-ask spread is `5%` (1 / 20 x 100).
<br>
<br>

</HTML>
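
As a quick check of the formula, the arithmetic from the example above can be reproduced in a few lines of Python. The function name `spread_percent` is chosen here for illustration only; the UDF itself is written in Step 2.

```python
# Worked example of the bid-ask spread formula: 100 * (ask - bid) / ask.
# spread_percent is an illustrative helper name, not part of Xcalar.
def spread_percent(bid, ask):
    return 100 * (ask - bid) / ask

example_spread = spread_percent(bid=19, ask=20)
print(example_spread)  # 5.0
```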


<HTML>

<div style="background-color : blue; color : white
    width: 284px;
    padding: 20px 20px 20px 100px;
    border: 1px solid #BFBFBF;
    background-color: white;box-shadow: 0px 0px 0px 0px #aaaaaa;"><font style="font-size:20px">
What is a map UDF?</font>
    <br>A Map UDF performs an operation on the given columns and returns a string, which a Xcalar Map operation then inserts into a new column. To learn more about UDFs, refer to
[Understanding UDFs](https://www.xcalar.com/documentation/help/XD/1.3.1/Content/C_AdvancedTasks/B_UDFUnderstand.htm?Highlight=MAP%20UDFs/). It may also be useful to watch an introductory video tutorial [How-to Series (UDFs) - How to write Map UDFs](https://www.youtube.com/watch?v=76BrJZtAcmc) on the Xcalar YouTube channel.
    <img src="xi-questionmark_yellow.png" 
         style="position: absolute;top: 40px;left: 125px;width:40px ;height:40px" />
</div>

</HTML>


While designing your Map UDFs, keep in mind that a Map UDF is executed once for each row in the base table.
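
To make the per-row behavior concrete, here is a minimal sketch in plain Python that mimics what a Map operation does. It is not Xcalar code: the `apply_map` helper, the `to_upper` function, the sample rows, and the column names are all made up for illustration.

```python
# A plain-Python imitation of a Map operation (illustrative only, not the Xcalar API).
def apply_map(rows, func, input_columns, new_column):
    """Call func once per row and store its result in new_column."""
    result = []
    for row in rows:
        args = [row[name] for name in input_columns]
        new_row = dict(row)  # copy the row so the base table stays untouched
        new_row[new_column] = func(*args)
        result.append(new_row)
    return result

def to_upper(value):
    # Map UDFs return strings, so this toy function does too.
    return str(value).upper()

# Hypothetical sample rows standing in for a table.
table = [{"Security": "abc"}, {"Security": "xyz"}]
mapped = apply_map(table, to_upper, ["Security"], "Security_upper")
for row in mapped:
    print(row)
```

Because the function runs once for every row, any expensive work inside it is repeated for every row, so it usually pays to keep the function body small.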

<HTML>
<div style="background-color : blue; color : white
    width: 284px;
    padding: 20px 20px 20px 100px;
    border: 1px solid #BFBFBF;
    background-color: white;box-shadow: 0px 0px 0px #aaaaaa;"><font style="font-size:20px">
    Authoring UDFs</font>
    <br>There are two ways of creating a UDF: using Xcalar's native UDF panel, or using Jupyter Notebook. This tutorial covers only authoring UDFs with Jupyter. To learn more about the native Xcalar Design UDF panel, refer to 
[Creating a UDF in Xcalar Design](https://www.xcalar.com/documentation/help/XD/1.3.1/Content/C_AdvancedTasks/C_UDFTasks.htm).
    <img src="xi-unlock icon_blue.png" 
         style="position: absolute;top: 26px;left: 125px;width:35px ;height:45.94px" />
</div>

</HTML>





### Step 1 - Connecting to Xcalar Session:
As we have seen in the earlier examples of Import UDFs, to author a UDF from Jupyter Notebook, you need to connect to the Xcalar session.

1. Click the <b>CODE SNIPPETS</b> dropdown menu in the top right corner.
2. Select <b>Connect to Xcalar Workbook</b>.
3. You should see a new cell (similar to the one below) added to your Jupyter Notebook. Click on the cell to highlight it, and click the <b>Run</b> button.

In [1]:
# Xcalar Notebook Connector
# 
# Connects this Jupyter Notebook to the Xcalar Workbook <Fixed>
#
# To use any data from your Xcalar Workbook, run this snippet before other 
# Xcalar Snippets in your workbook. 
# 
# A best practice is not to edit this cell.
#
# If you wish to use this Jupyter Notebook with a different Xcalar Workbook 
# delete this cell and click CODE SNIPPETS --> Connect to Xcalar Workbook.

%matplotlib inline

# Importing third-party modules to facilitate data work. 
import pandas as pd
import matplotlib.pyplot as plt

# Importing Xcalar packages and modules. 
# For more information, search and post questions on discourse.xcalar.com
from xcalar.compute.api.XcalarApi import XcalarApi
from xcalar.compute.api.Session import Session
from xcalar.compute.api.WorkItem import WorkItem
from xcalar.compute.api.ResultSet import ResultSet

# Create a XcalarApi object
xcalarApi = XcalarApi()
# Connect to current workbook that you are in
workbook = Session(xcalarApi, "xdpadmin", "xdpadmin", 4399150, True, "TutorialNotebooks-HelloUDF-Full")
xcalarApi.setSession(workbook)

### Step 2 - Implementing your map UDF function:

In [2]:
def bid_ask_spread (bid, ask):
    spread = (float(ask) - float(bid)) * 100 / float(ask)
    return str(spread)
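
The function above assumes that both arguments can be converted to numbers and that `ask` is not zero. If your data may contain empty fields or zero prices, a more defensive variant could look like the sketch below. The name `bid_ask_spread_safe` and the choice to return an empty string for unusable rows are assumptions made for this example, not Xcalar conventions.

```python
# A defensive variant of the Map UDF (illustrative sketch).
def bid_ask_spread_safe(bid, ask):
    try:
        bid_value = float(bid)
        ask_value = float(ask)
    except (TypeError, ValueError):
        # Missing or non-numeric input: no spread can be computed.
        return ""
    if ask_value == 0:
        # Avoid dividing by zero when no ask price is quoted.
        return ""
    spread = (ask_value - bid_value) * 100 / ask_value
    return str(spread)

print(bid_ask_spread_safe("19", "20"))
print(bid_ask_spread_safe(None, "20") == "")
print(bid_ask_spread_safe("19", "0") == "")
```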

### Step 3 - Pasting the Map UDF into the Xcalar Template:
<HTML>
    <br>
    Now you will create the template for your Map UDF function. <b>Note</b>: This section expects that you have completed <a href="./1%20-%20Import%20UDF%20Simple%20Parser.ipynb" target="_self">Import UDF Simple Parser</a> and generated the *stocks.csv* dataset.
    <br>
</HTML>

1. Open the <b>CODE SNIPPETS</b> dropdown menu again.
2. This time select <b>Create Map UDF</b>.
3. Fill out the following fields in the <b>UDF Template</b> modal:
  * <b>Module Name</b>: Choose a name for your UDF module, for example, 'map_example'.
  * <b>Function Name</b>: Enter 'bid_ask_spread' (the name of your UDF function).
  * <b>Table Name</b>: Choose the stocks table created by your Parser UDF. In the following example it is `stocks#6`.
  * <b>Columns</b>: Select the 'bid' and 'ask' columns.
4. Click the <b>CONFIRM</b> button. A Map UDF template will be added to your Jupyter Notebook.
5. Replace the body of the template function with the `bid_ask_spread` function we designed above and press Shift+Enter to test your code.

In [None]:
# Xcalar Map UDF Template
#
# This is a function definition for a Python Map UDF written to apply to 
# table: <stocks#2> columns: <stocks__Bid, stocks__Ask>.
#
# Module name: <map_example>
# Function name: <bid_ask_spread>
#
# REQUIREMENTS: Map UDF functions take one or more columns as arguments, and
# return a string. 
#
# To create a map UDF, edit the function definition below, named <bid_ask_spread>. 
#
# To test your map UDF, run this cell. (Hit <control> + <enter>.) 
#
# To apply the <map_example> module to your table <stocks#2> 
# click the "Use UDF on Table stocks#2" button. 
#
# NOTE: Use discipline before replacing this module. Consider whether previous 
# uses of this map UDF could be broken by new changes. If so, versioning this 
# module may be appropriate. 
#
# Best practice is to name helper functions by starting with __. Such 
# functions will be considered private functions and will not be directly 
# invokable from Xcalar tools.
## Map UDF function definition.
def bid_ask_spread (bid, ask):
    spread = (float(ask) - float(bid)) * 100 / float(ask)
    return str(spread)

### WARNING DO NOT EDIT CODE BELOW THIS LINE ###
from xcalar.compute.api.Dataset import *
from xcalar.compute.coretypes.DataFormatEnums.ttypes import DfFormatTypeT
from xcalar.compute.api.Udf import Udf
from xcalar.compute.coretypes.LibApisCommon.ttypes import XcalarApiException
import random

def uploadUDF():
    import inspect
    sourceCode = "".join(inspect.getsourcelines(bid_ask_spread)[0])
    try:
        Udf(xcalarApi).add("map_example", sourceCode)
    except XcalarApiException as e:
        if e.status == StatusT.StatusUdfModuleAlreadyExists:
            Udf(xcalarApi).update("map_example", sourceCode)


# Publish Table to Jupyter Notebook
# 
# This snippet is configured to load <100> rows of Xcalar table <stocks#2> into a pandas dataframe named
# <stocks#2_pd>.
#
# To instantiate or refresh your pandas dataframe, run the Connect snippet, 
# and then run this snippet. 
#
# Best Practice is not to edit this code. 
#
# To use different data with this Jupyter Notebook:
# 1) Go to the table in your Xcalar Workbook.
# 2) From the table menu, click Publish to Jupyter.
# 3) Click full table or enter a number of rows and click submit.

# Imports data into a pandas dataframe.
def getDataFrameFromDict():
    from collections import OrderedDict
    resultSetPtr_2 = ResultSet(xcalarApi, tableName="stocks#17", maxRecords=100)
    stocks_2 = []
    for row in resultSetPtr_2:
        col_list = ["stocks::Security","stocks::date","stocks::Bid","stocks::Ask","stocks::Bid size","stocks::Ask size","stocks::Last Sale","stocks::Last size","stocks::Volume","stocks::Total Sale",]
        kv_list = []
        for k in col_list:
            if k not in row:
                kv_list.append((k, None))
            else:
                kv_list.append((k, row[k]))
                if type(row[k]) is list:
                    for i in range(len(row[k])):
                        subKey = k + "[" + str(i) + "]"
                        if subKey in col_list:
                            row[subKey] = row[k][i]
        filtered_row = OrderedDict(kv_list)

        stocks_2.append(filtered_row)
    return pd.DataFrame.from_dict(stocks_2)
stocks_2_pd = getDataFrameFromDict()
for index, row in stocks_2_pd.iterrows():
    assert(type(bid_ask_spread(row["stocks::Bid"], row["stocks::Ask"])).__name__ == 'str')
    print(bid_ask_spread(row["stocks::Bid"], row["stocks::Ask"]))

uploadUDF()



### Step 4 - Testing your Map UDF:
1. Highlight and run the template cell. Observe that the results of applying the function to the bid and ask columns are printed.
2. The first time you run the cell, a <b>Use UDF on Table ...</b> button appears under the template code. Click it to open the XD Map panel with your Map UDF selected.
3. Select the bid and ask columns from your stocks table and click the <b>Map</b> button.
4. Observe that a new column containing the bid-ask spread is created in your table.
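
The check that the template cell performs in step 1 (every result must be a string) can also be sketched without a Xcalar connection. The sample rows below are invented for illustration, and the `stocks::Bid` and `stocks::Ask` keys simply mirror the column names used in the template.

```python
# Offline version of the template's per-row type check (illustrative sketch).
def bid_ask_spread(bid, ask):
    spread = (float(ask) - float(bid)) * 100 / float(ask)
    return str(spread)

# Invented sample rows; real rows come from the published Xcalar table.
sample_rows = [
    {"stocks::Bid": "19", "stocks::Ask": "20"},
    {"stocks::Bid": "45", "stocks::Ask": "50"},
]

results = []
for row in sample_rows:
    value = bid_ask_spread(row["stocks::Bid"], row["stocks::Ask"])
    # Map UDFs must return a string.
    assert isinstance(value, str)
    results.append(value)
    print(value)
```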

<HTML>
<br>
<div style="background-color : blue; color : white
    width: 284px;
    padding: 20px 20px 20px 100px;
    border: 1px solid #BFBFBF;
    background-color: white;box-shadow: 0px 0px 0px #aaaaaa;"><font style="font-size:20px">
Applying Map UDF from XD</font>
<br>To apply your Map function from XD, select one of the columns you want to apply the Map operation to. Open the Map panel by clicking the <b>Map</b> button in the column options menu. Then select your Map function, select the second operand, and specify the result column name. Click the <b>Map</b> button. Observe that a new column containing the bid-ask spread is created in your table.
    <img src="xi-checkmark icon_green.png" 
         style="position: absolute;top: 225px;left: 130px;width:40px ;height:40px" />
    <img src="Map Operation.png" 
         style="width:640px ;" />
</div>
    
    

</HTML>


<html>
 Next: <a href="./5%20-%20Map%20UDF%20Testing.ipynb" target="_self">5 - Map UDF Testing</a><br>
 Back to <a href="./0%20-%20Introduction.ipynb" target="_self">Introduction</a><br>
</html>