# Intraday Seasonality Analysis Using DB2 and Python

This notebook shows how to extract and visualize intraday seasonality data for financial securities from DB2 using Python. It covers the following steps:

1. Install the `ibm-db` Python client library
1. Connect to a DB2 database
1. Implement a Python function that:
    * Queries a DB2 database for intraday seasonality data for a security
    * Plots the query results
1. Plot intraday seasonality for select securities

## Install the `ibm-db` Python Client Library

IBM provides the [ibm-db Python library](https://code.google.com/p/ibm-db/) to facilitate connecting to IBM DB2. You can install the `ibm-db` library in your IBM Knowledge Anyhow Workbench environment using the `pip` installer:

In [None]:
!pip install ibm_db

The above command installs the `ibm_db` package in your IBM Knowledge Anyhow Workbench, making it available to all notebooks.

<div class="alert alert-block alert-info" style="margin-top: 20px">**Note:** You only need to run this step once. Rerunning the command simply detects that the <span style="white-space: pre;font-family: monospace;">ibm_db</span> library is already installed in your Knowledge Anyhow Workbench and skips the installation.

Also note that connecting to DB2 requires a DB2 driver, which is pre-installed in your Knowledge Anyhow Workbench at <span style="white-space: pre;font-family: monospace;">/opt/db2_v10.5_linuxamd64/clidriver/lib/libdb2.so</span>.</div>
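
If you want to confirm from Python that the package is visible before importing it, the following minimal sketch uses only the standard library. The helper name `is_installed` is made up for this illustration and is not part of `ibm_db`.

```python
import importlib.util


def is_installed(module_name):
    """Return True if `module_name` can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Prints True once `pip install ibm_db` has completed in this environment
print(is_installed("ibm_db"))
```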

## Connect to a DB2 Database

If you have not previously done so as part of another tutorial, **[click here to import the credentials file](/tutorials/eurex/db2_bludb_credentials.json)** into your workbench.

You should now see the `db2_bludb_credentials.json` file in your "Recent Data" panel.

Create a database connection `conn`:

In [None]:
# import ibm_db driver modules, used to create DB2 connection from Python
import ibm_db
import ibm_db_dbi
# import JSON module, used to read JSON credentials file
import json

# load the database credentials JSON file
with open('/resources/db2_bludb_credentials.json') as f:
    db_credentials = json.load(f)
    # create a DB2 DSN connection string, using the db credentials
    dsn = '''DRIVER={{{dsn_driver}}};\
             DATABASE={dsn_database};\
             HOSTNAME={dsn_hostname};\
             PORT={dsn_port};\
             PROTOCOL={dsn_protocol};\
             UID={dsn_uid};\
             PWD={dsn_pwd};'''.format(**db_credentials)
    # create DB2 connection, using the dsn
    raw_conn = ibm_db.connect(dsn, db_credentials['dsn_uid'], db_credentials['dsn_pwd'])
    conn = ibm_db_dbi.Connection(raw_conn)

<div class="alert alert-block alert-info" style="margin-top: 20px">**Note:** You can find details about this code and an in-depth description of creating a DB2 database connection in the **Tutorial - Access DB2 Using Python**, which you can download from our [Welcome](/pages/welcome) page.
</div>
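
To make the shape of the DSN string easier to see, here is a minimal sketch that assembles it from a credentials dictionary without the line continuations. The helper name `build_dsn` and all values in `example_credentials` are placeholders invented for illustration. Only the key names are taken from the code above.

```python
def build_dsn(credentials):
    """Join the credential fields into a single-line DSN string."""
    fields = [
        # The driver name is wrapped in literal braces, as in the cell above
        ("DRIVER", "{" + credentials["dsn_driver"] + "}"),
        ("DATABASE", credentials["dsn_database"]),
        ("HOSTNAME", credentials["dsn_hostname"]),
        ("PORT", credentials["dsn_port"]),
        ("PROTOCOL", credentials["dsn_protocol"]),
        ("UID", credentials["dsn_uid"]),
        ("PWD", credentials["dsn_pwd"]),
    ]
    return "".join("{0}={1};".format(key, value) for key, value in fields)


# Placeholder values only; real values come from the credentials JSON file
example_credentials = {
    "dsn_driver": "IBM DB2 ODBC DRIVER",
    "dsn_database": "BLUDB",
    "dsn_hostname": "db.example.com",
    "dsn_port": "50000",
    "dsn_protocol": "TCPIP",
    "dsn_uid": "user",
    "dsn_pwd": "secret",
}

dsn = build_dsn(example_credentials)
print(dsn)
```

Building the string from a list of pairs avoids the indentation whitespace that backslash continuations leave inside a triple-quoted string.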

## Implement `analyse_seasonality` Function

The following code snippet implements a function that queries the `eurex` DB2 database, which contains financial data, and plots the results:

In [None]:
import pandas
from matplotlib import pyplot as plt
from IPython.core.pylabtools import figsize

# use a ggplot-like plot style; the pandas 'display.mpl_style' option
# was removed from later pandas releases
plt.style.use('ggplot')

def analyse_seasonality(secid):
    """Query per-minute trade averages for security `secid` and plot them."""
    # query intraday seasonality stats
    sqlstring = '''
        select minute, 
            avg(vwaprice) as vwaprice, 
            avg(range1) as range, 
            avg(units) as units, 
            avg(buyunits) as buyunits,
            avg(sellunits) as sellunits
        from (select 
             date(data_timestamp),
             minute(data_timestamp) as Minute,
             sum(units*price)/sum(units) as vwaprice,
             max(price)-min(price) as range1,
             sum(units) as units,
             sum(aggressor_side*units) as buyunits,
             sum((1-aggressor_side)*units) as sellunits
            from eurex.f_eurex_trades
            where security_id = {secid}
            and hour(data_timestamp) < 22
            group by date(data_timestamp),minute(data_timestamp)
            order by date(data_timestamp), minute(data_timestamp)) as a
        group by minute
        order by minute'''.format(secid=secid)
    df = pandas.read_sql(sqlstring, conn)
    
    # query product name to be included in visualization
    product_id_sql_string = '''
        select distinct product_id
        from eurex.f_eurex_trades
        where security_id = {secid}'''.format(secid=secid)
    product_id_df = pandas.read_sql(product_id_sql_string, conn)
    product = product_id_df.iloc[0]['PRODUCT_ID']
    
    # define plot layout, graph size and main title
    figsize(10,10) 
    f, (ax1, ax2, ax3, ax4) = plt.subplots(4)
    title = "Intraday seasonalities - averages per minute: {0} - {1}".format(product, secid)
    f.suptitle(title, fontsize=16)
    x = df.index
    
    # plot Prices chart
    ax1.set_xlim(0, max(x))
    ax1.plot(x, df['VWAPRICE'])
    ax1.set_title('Prices')

    # plot Price range chart
    ax2.set_xlim(0, max(x))
    ax2.plot(x, df['RANGE'])
    ax2.set_title('Price range')

    # plot Traded contracts chart
    ax3.set_xlim(0, max(x))
    ax3.plot(x, df['UNITS'])
    ax3.set_title('Traded contracts')
    
    # plot Order imbalance chart
    ax4.set_xlim(0, max(x))
    ax4.plot(x, abs(df['BUYUNITS'] - df['SELLUNITS']) / df['UNITS'])
    ax4.set_title('Order imbalance')
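
The query aggregates in two stages: the inner `select` computes statistics per date and minute, and the outer `select` averages those statistics across dates for each minute. The sketch below reproduces the same two stages in plain Python on a handful of made-up trades, so the arithmetic can be checked by hand. The sample data and the helper names are invented for illustration. As in the SQL, `aggressor_side` is assumed to be 1 for buyer-initiated trades and 0 for seller-initiated trades.

```python
from collections import defaultdict
from statistics import mean

# Made-up trades: (date, minute, price, units, aggressor_side)
trades = [
    ("2013-05-02", 0, 100.0, 2, 1),
    ("2013-05-02", 0, 102.0, 2, 0),
    ("2013-05-02", 1, 101.0, 4, 1),
    ("2013-05-03", 0, 104.0, 1, 1),
    ("2013-05-03", 0, 106.0, 3, 1),
]


def per_day_minute(trades):
    """Stage 1 (inner query): statistics for each (date, minute) bucket."""
    buckets = defaultdict(list)
    for date, minute, price, units, side in trades:
        buckets[(date, minute)].append((price, units, side))
    stats = {}
    for key, rows in buckets.items():
        total_units = sum(u for _, u, _ in rows)
        prices = [p for p, _, _ in rows]
        stats[key] = {
            "vwaprice": sum(p * u for p, u, _ in rows) / total_units,
            "range": max(prices) - min(prices),
            "units": total_units,
            "buyunits": sum(s * u for _, u, s in rows),
            "sellunits": sum((1 - s) * u for _, u, s in rows),
        }
    return stats


def per_minute_average(stats):
    """Stage 2 (outer query): average each statistic across dates."""
    by_minute = defaultdict(list)
    for (_, minute), row in stats.items():
        by_minute[minute].append(row)
    return {
        minute: {col: mean(r[col] for r in rows) for col in rows[0]}
        for minute, rows in sorted(by_minute.items())
    }


def order_imbalance(row):
    """Same formula as the fourth chart: |buy - sell| / total units."""
    return abs(row["buyunits"] - row["sellunits"]) / row["units"]


seasonality = per_minute_average(per_day_minute(trades))
for minute, row in seasonality.items():
    print(minute, row["vwaprice"], row["range"], order_imbalance(row))
```

For minute 0, the two daily volume-weighted prices are 101.0 and 105.5, so the averaged value is 103.25.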

## Plot Intraday Seasonality for Select Securities

Let's test the `analyse_seasonality` function by invoking it for the FDAX security ID **464957**. You should see a set of plots titled "Intraday seasonalities - averages per minute: FDAX - 464957".

In [None]:
# show plots inline in notebook
%matplotlib inline  

<div class="alert alert-block alert-info">**Note:** The <span style="white-space: pre;font-family: monospace;">%%time</span> notebook magic function at the top of a code cell measures and prints the cell's execution time. It provides an easy way to measure the performance of long-running tasks during your analysis.
</div>

In [None]:
%%time
analyse_seasonality(464957)
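
Outside a notebook, where the `%%time` magic is not available, a similar measurement can be approximated with the standard library. This is only a sketch: the `timed` context manager is a made-up helper, and its output format does not match what `%%time` prints.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label):
    """Print CPU and wall-clock time spent inside the `with` block."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    try:
        yield
    finally:
        wall = time.perf_counter() - wall_start
        cpu = time.process_time() - cpu_start
        print("{0}: CPU {1:.3f} s, wall {2:.3f} s".format(label, cpu, wall))


# Stand-in workload; in the notebook this would be analyse_seasonality(...)
with timed("sum of squares"):
    total = sum(i * i for i in range(100000))
```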

Having confirmed that the `analyse_seasonality` function works, let's invoke the function for a set of additional securities.

Here we use [IPython Interactive Widgets](http://nbviewer.ipython.org/github/ipython/ipython/blob/master/examples/Interactive%20Widgets/Index.ipynb) to invoke the `analyse_seasonality` function for the security selected in a dropdown.

In [None]:
# IPython.html was deprecated when the widgets moved to the ipywidgets package
from ipywidgets import interact, widgets

securities = [
    '464977',
    '454608',
    '566988',
    '567008',
    '558044',
    '661030',
    '661055',
    '653147',
    '760534',
    '760554',
    '750818',
    '855848',
    '855868',
    '847251',
]

# Display an interactive widget that plots intraday seasonality
# data for a list of securities.
def show_widget(securities):
    interact(
        analyse_seasonality,
        secid=widgets.Dropdown(description="Security",
                               options=securities,
                               value=securities[0]),
        div=widgets.HTML(value='<div id="intraday" style="width: 800px; height: 600px"></div>')
    )

show_widget(securities)

## Next Steps

Feel free to analyze additional securities. To obtain a list of **all** valid security IDs, run the following query:

In [None]:
%%time
security_ids_df = pandas.read_sql('select distinct security_id, product_id \
                      from eurex.f_eurex_trades \
                      order by security_id', conn)

In [None]:
security_ids_df.head()

Show the intraday seasonality for the first 50 securities.
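
One possible way to do this is sketched below, assuming the query result is available as rows of `(SECURITY_ID, PRODUCT_ID)`. The helper `first_security_ids` and the sample rows are made up for illustration. In the notebook, the rows would come from `security_ids_df`.

```python
def first_security_ids(rows, limit=50):
    """Return up to `limit` distinct security IDs as strings, in sorted order."""
    ids = sorted({int(row[0]) for row in rows})
    return [str(security_id) for security_id in ids[:limit]]


# Stand-in rows; "PRODUCT_X" is a placeholder product name
sample_rows = [(464957, "FDAX"), (454608, "PRODUCT_X"), (464957, "FDAX")]
top_ids = first_security_ids(sample_rows)
print(top_ids)

# In the notebook, assuming the cells above have been run:
# show_widget(first_security_ids(security_ids_df.values.tolist()))
```

The IDs are converted to strings to match the `securities` list passed to `show_widget` earlier.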