# Intraday Seasonality Analysis Using DB2 and R

This notebook shows how to extract intraday seasonality data for financial securities from DB2 and visualize it using R. It covers the following steps:

1. Load the IPython R extension
1. Install required R packages
1. Connect to a DB2 database
1. Implement an R function that:
    * Queries a DB2 database for intraday seasonality data for a security
    * Plots the query results
1. Plot intraday seasonality for select securities

## Load the IPython R Extension

To execute R commands in the IBM Knowledge Anyhow Workbench, you have to load the R extension for IPython. More information is available in the [Rpy2 documentation](http://rpy.sourceforge.net/rpy2/doc-2.4/html/interactive.html). Run the following command to load the rpy2 IPython extension:

In [None]:
%load_ext rpy2.ipython

Now, you can run R commands directly in the notebook. Prefix notebook cells with the `%%R` IPython notebook cell magic. All commands in that notebook cell are interpreted as R commands.

## Install Required R Packages
Download and install the `rjson` package from [CRAN](http://cran.r-project.org/). This package reads JSON files from within the R environment.

<div class="alert alert-block alert-info" style="margin-top: 20px">**Note:** Here we install an archived version of the **rjson** package that is compatible with the version of R we are running in the IBM Knowledge Anyhow Workbench.</div>


In [None]:
!wget http://cran.r-project.org/src/contrib/Archive/rjson/rjson_0.2.14.tar.gz -O /home/notebook/R/rjson_0.2.14.tar.gz

In [None]:
%%R
install.packages("/home/notebook/R/rjson_0.2.14.tar.gz", repos=NULL, type="source")

The above command installs the `rjson` package in your IBM Knowledge Anyhow Workbench. This package is now available to all notebooks. 

<div class="alert alert-block alert-info" style="margin-top: 20px">**Note:** You only need to run this step once. Rerunning the command simply overwrites the existing installation of <span style="white-space: pre;font-family: monospace;">rjson</span> in <span style="white-space: pre;font-family: monospace;">/home/notebook/R/library/rjson</span>.

Also note that connecting to DB2 requires a DB2 driver, which is pre-installed in your IBM Knowledge Anyhow Workbench at <span style="white-space: pre;font-family: monospace;">/opt/db2_v10.5_linuxamd64/clidriver/lib/libdb2.so</span>, and the <span style="white-space: pre;font-family: monospace;">RODBC</span> R package, which is pre-installed in <span style="white-space: pre;font-family: monospace;">/usr/lib/R/site-library/RODBC/</span>.
</div>
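
Because the installation only needs to run once, one option is to guard it with a check. The following is a minimal sketch in base R for a `%%R` cell, assuming the archive path used above; `needs_install` is a hypothetical helper name, not part of any package.

```r
# Hypothetical helper: TRUE when the package cannot be loaded from the library path
needs_install <- function(pkg) {
  !requireNamespace(pkg, quietly = TRUE)
}

# Path of the archive downloaded in the previous step (taken from this notebook)
archive <- "/home/notebook/R/rjson_0.2.14.tar.gz"

if (needs_install("rjson") && file.exists(archive)) {
  # Install from the local source archive, exactly as in the cell above
  install.packages(archive, repos = NULL, type = "source")
} else {
  message("rjson already installed or archive not found; skipping install")
}
```

The check relies on `requireNamespace`, which reports whether a package can be loaded without attaching it to the search path.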

## Connect to a DB2 Database

If you have not previously done so as part of another tutorial, **[click here to import the credentials file](/tutorials/eurex/db2_bludb_credentials.json)** into your workbench.

You should now see the `db2_bludb_credentials.json` file in your "Recent Data" panel.

Create a database connection `channel`:

In [None]:
%%R
library(rjson) # load rjson library, used to read JSON credentials file
library(RODBC) # load RODBC library, used to create ODBC database connection

# load the database credentials JSON file
db_credentials <- fromJSON(file='/resources/db2_bludb_credentials.json', method='C')
# create an odbc.ini configuration file in the user's home directory, using the db credentials
odbc_text <- paste('[ODBC Data Source]\n',
                   '[', db_credentials['dsn_database'] , ']\n',
                   'Driver=', db_credentials['odbc_driver_dir'], '\n',
                   'Authentication=', db_credentials['odbc_authentication'], '\n', sep='')
odbc_file_absolute_path <- file.path(Sys.getenv('HOME'), '.odbc.ini')
cat(odbc_text, file = odbc_file_absolute_path)
# create ODBC DSN connection string, using the db credentials
dsn <- paste('DSN=', db_credentials['dsn_database'] , ';',
             'DATABASE=', db_credentials['dsn_database'] , ';',
             'HOSTNAME=', db_credentials['dsn_hostname'] , ';',
             'PORT=', db_credentials['dsn_port'] , ';',
             'PROTOCOL=', db_credentials['dsn_protocol'] , ';',
             'UID=', db_credentials['dsn_uid'] , ';',
             'PWD=', db_credentials['dsn_pwd'] , ';',
             sep='')
# create DB2 connection channel, using the dsn
channel <- odbcDriverConnect(dsn, believeNRows = FALSE) # create database connection
odbcGetInfo(channel) # show database connection info to verify successful connection

<div class="alert alert-block alert-info" style="margin-top: 20px">**Note:** You can find details about this code and an in-depth description about creating a DB2 database connection in the **Tutorial - Access DB2 Using R** that you can download from our [Welcome](/pages/welcome) page.
</div>
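
To inspect the connection string without a database or a credentials file, the string-building step can be tried on its own. The sketch below uses made-up credential values; `build_dsn` and `mask_pwd` are hypothetical helper names introduced for illustration only. It indexes the list with `[[ ]]`, which returns the bare value rather than a one-element list.

```r
# Made-up credentials for illustration; real values come from
# db2_bludb_credentials.json
creds <- list(
  dsn_database = "BLUDB",
  dsn_hostname = "db.example.com",
  dsn_port     = "50000",
  dsn_protocol = "TCPIP",
  dsn_uid      = "user01",
  dsn_pwd      = "secret"
)

# Hypothetical helper: assemble the key=value pairs used in the cell above
build_dsn <- function(creds) {
  paste0("DSN=",      creds[["dsn_database"]], ";",
         "DATABASE=", creds[["dsn_database"]], ";",
         "HOSTNAME=", creds[["dsn_hostname"]], ";",
         "PORT=",     creds[["dsn_port"]],     ";",
         "PROTOCOL=", creds[["dsn_protocol"]], ";",
         "UID=",      creds[["dsn_uid"]],      ";",
         "PWD=",      creds[["dsn_pwd"]],      ";")
}

# Hypothetical helper: hide the password before printing or logging
mask_pwd <- function(dsn) sub("PWD=[^;]*;", "PWD=****;", dsn)

dsn <- build_dsn(creds)
print(mask_pwd(dsn))
```

Masking the password before printing keeps credentials out of saved notebook output.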

## Implement `analyse_seasonality` R Function
The following code snippet implements a function that queries the `eurex.f_eurex_trades` table in DB2 for a given security ID and plots the results in four panels: prices, price range, traded contracts, and order imbalance:

In [None]:
%%R
analyse_seasonality = function(secid)
{
    # Inner query: statistics per trading day and per minute of the day
    # (hour * 60 + minute, since DB2's minute() alone returns only 0-59).
    # Outer query: average those statistics across days for each minute.
    sqlstring = paste("
        select minute,
            avg(vwaprice) as vwaprice,
            avg(range1) as range,
            avg(units) as units,
            avg(buyunits) as buyunits,
            avg(sellunits) as sellunits
        from (select
             date(data_timestamp) as trade_date,
             hour(data_timestamp)*60 + minute(data_timestamp) as minute,
             sum(units*price)/sum(units) as vwaprice,
             max(price)-min(price) as range1,
             sum(units) as units,
             sum(aggressor_side*units) as buyunits,
             sum((1-aggressor_side)*units) as sellunits
            from eurex.f_eurex_trades
            where security_id = ",secid,"
            and hour(data_timestamp) < 22
            group by date(data_timestamp),
                     hour(data_timestamp)*60 + minute(data_timestamp)) as a
        group by minute
        order by minute",sep="")
    # query intraday seasonality stats
    datatab = sqlQuery(channel, sqlstring)
    
    # query product name to be included in visualization
    product_id_sql_string = paste("select distinct product_id
                                   from eurex.f_eurex_trades
                                   where security_id =", secid, sep="")
    product_id = sqlQuery(channel, product_id_sql_string)
    product = product_id[,'PRODUCT_ID']

    # configure layout: one title panel followed by four stacked plot panels
    # (a single column of five rows), then plot the graphs
    layout(matrix(1:5, ncol = 1))
    plot(0,yaxt='n',col='white',frame.plot=0,xlab="",ylab="",xaxt='n')
    title_string = paste('Intraday seasonalities - averages per minute: ', product, '-', secid, sep=" ")
    title(main=title_string, line = -3)
    par(mar=c(2,3.5,2,0)) # set margins
    plot(datatab[,'VWAPRICE'],type='h',las=1,frame.plot=0,xlab="",ylab="",col='darkblue',xaxt='n')
    axis(1,c(0:14*60+1,840),datatab[c(0:14*60+1,840),'MINUTE'])
    title(main='Prices')
    
    plot(datatab[,'RANGE'],type='h',las=1,frame.plot=0,xlab="",ylab="",col='darkblue',xaxt='n')
    axis(1,c(0:14*60+1,840),datatab[c(0:14*60+1,840),'MINUTE'])
    title(main='Price range')
    
    plot(datatab[,'UNITS'],type='h',las=1,frame.plot=0,xlab="",ylab="",col='darkblue',xaxt='n')
    axis(1,c(0:14*60+1,840),datatab[c(0:14*60+1,840),'MINUTE'])
    title(main='Traded contracts')
    
    plot(abs(datatab[,'BUYUNITS']-datatab[,'SELLUNITS'])/datatab[,'UNITS'],type='h',las=1,frame.plot=0,xlab="",ylab="",col='darkblue',xaxt='n')
    axis(1,c(0:14*60+1,840),datatab[c(0:14*60+1,840),'MINUTE'])
    title(main='Order imbalance')
}
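
The query aggregates in two stages: first per trading day and minute, then across days for each minute. To see what that computes without a database, the sketch below reproduces both stages in base R on a small set of made-up trades. All values are invented for illustration, and `aggressor_side` is assumed to be 1 for buyer-initiated and 0 for seller-initiated trades, as the `buyunits` and `sellunits` aliases in the query suggest.

```r
# Made-up trades: two trading days, two minutes of the day (480 = 08:00)
trades <- data.frame(
  day            = c("2013-05-02", "2013-05-02", "2013-05-02",
                     "2013-05-03", "2013-05-03"),
  minute         = c(480, 480, 481, 480, 481),
  price          = c(100, 102, 101, 104, 103),
  units          = c(1, 3, 2, 2, 4),
  aggressor_side = c(1, 0, 1, 1, 0)
)

# Stage 1 (inner query): one row per day and minute
groups <- split(trades, list(trades$day, trades$minute), drop = TRUE)
per_day_minute <- do.call(rbind, lapply(groups, function(g) {
  data.frame(
    day         = g$day[1],
    minute      = g$minute[1],
    vwaprice    = sum(g$units * g$price) / sum(g$units),  # volume-weighted price
    price_range = max(g$price) - min(g$price),
    units       = sum(g$units),
    buyunits    = sum(g$aggressor_side * g$units),
    sellunits   = sum((1 - g$aggressor_side) * g$units)
  )
}))

# Stage 2 (outer query): average the per-day statistics for each minute
seasonality <- aggregate(
  cbind(vwaprice, price_range, units, buyunits, sellunits) ~ minute,
  data = per_day_minute, FUN = mean
)

# Order imbalance as plotted in the last panel
seasonality$imbalance <-
  abs(seasonality$buyunits - seasonality$sellunits) / seasonality$units

print(seasonality)
```

For minute 480, the first day has a volume-weighted price of (1·100 + 3·102)/4 = 101.5 and the second day 104, so the seasonal average is 102.75.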

## Plot Intraday Seasonality for Select Securities

Let's test the `analyse_seasonality` function by invoking it for the FDAX security ID **464957**. You should see a set of plots titled "Intraday seasonalities - averages per minute: FDAX - 464957".

<div class="alert alert-block alert-info" style="margin-top: 20px">**Note:** The <span style="white-space: pre;font-family: monospace;">%%time</span> notebook magic function at the top of the cell measures and prints the cell execution time. It provides an easy way to measure performance of long running tasks during your analysis.
</div>
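
`%%time` times the whole cell. To time a single R expression inside a `%%R` cell, base R's `system.time` is an alternative. A minimal sketch, with a placeholder computation standing in for the long-running call:

```r
# Placeholder workload (an assumption for illustration); in the notebook this
# would be a call such as analyse_seasonality(464957)
timing <- system.time({
  total <- sum(sqrt(seq_len(1e6)))
})

# "elapsed" is the wall-clock time in seconds
print(timing[["elapsed"]])
```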

In [None]:
%%time
%%R
analyse_seasonality(464957)

Having confirmed that the `analyse_seasonality` function works, let's invoke the function for additional securities.

In [None]:
%%javascript
IPython.OutputArea.auto_scroll_threshold = 500; // prevent cell output from scrolling

In [None]:
%%time
%%R
analyse_seasonality(464977)
analyse_seasonality(454608)
analyse_seasonality(566988)
analyse_seasonality(567008)
analyse_seasonality(558044)
analyse_seasonality(661030)
analyse_seasonality(661055)
analyse_seasonality(653147)
analyse_seasonality(760534)
analyse_seasonality(760554)
analyse_seasonality(750818)
analyse_seasonality(855848)
analyse_seasonality(855868)
analyse_seasonality(847251)
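
The repeated calls above could also be written as a loop. The sketch below is one way to do that, assuming a hypothetical wrapper `run_for_all` (not part of the notebook) that catches errors so that one failing security does not stop the rest. A stand-in function replaces `analyse_seasonality` here so that the example runs without a database connection.

```r
# Hypothetical wrapper: call f for every ID and record which calls succeeded
run_for_all <- function(ids, f) {
  ok <- vapply(ids, function(id) {
    tryCatch({
      f(id)
      TRUE
    }, error = function(e) {
      message("Failed for ", id, ": ", conditionMessage(e))
      FALSE
    })
  }, logical(1))
  names(ok) <- ids
  ok
}

# Stand-in for analyse_seasonality (assumption): fails on a negative ID
stand_in <- function(secid) {
  if (secid < 0) stop("invalid security id")
  invisible(secid)
}

status <- run_for_all(c(464977, 454608, -1), stand_in)
print(status)
```

In the notebook, `run_for_all(c(464977, 454608), analyse_seasonality)` would take the place of the stand-in call.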

## Next Steps

Feel free to analyze additional securities.  To obtain a list of **all** valid security IDs, run the following query:

In [None]:
%%R
security_ids_query = "select distinct security_id, product_id
                      from eurex.f_eurex_trades
                      order by security_id"
security_ids = sqlQuery(channel, security_ids_query)
options(scipen=999) # disable exponential notation for printing numbers
security_ids

## Summary
In this tutorial you established a connection to a DB2 database from the Knowledge Anyhow Workbench. You queried and visualized data using R.

Additional tutorials for the IBM Knowledge Anyhow Workbench are available on our [Welcome](/pages/welcome) page.

## Want to learn more?

<a href="http://bigdatauniversity.com/courses/introduction-to-r/?utm_source=tutorial-intraday-seasonality-r&utm_medium=dswb&utm_campaign=bdu"><img src = "https://ibm.box.com/shared/static/r3jvb2wbr4meivra8swkmuf5uo30hd9g.png"> </a>

Created by: <a href="https://bigdatauniversity.com/?utm_source=bducreatedbylink&utm_medium=dswb&utm_campaign=bdu">The Cognitive Class Team</a>