In [17]:
# !pip install pandas_datareader
# !pip install ta
# !pip install yfinance

# Data handler

The `quantlib` package includes a module for handling data. It pulls data from different sources and lets the user prepare a dataset with different technical analysis indicators. To use it, we first need to import the module:

In [20]:
import sys
sys.path.append('../src')

In [22]:
%load_ext autoreload
from quantlib.data_handler import *

The `Data` class has several methods for handling data. To initialize an instance, pass a list of tickers to pull and the time interval to pull them for. For example, to pull data for the following tickers between September 4, 2018 and September 2, 2021:

In [30]:
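from datetime import date  # explicit import; harmless if the star import above already provides it

tickers = ['AAPL', 'TSLA', 'KO', 'NIO', 'SPY']
data = Data(tickers, start_date=date(2018, 9, 4), end_date=date(2021, 9, 2))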

[*********************100%***********************]  1 of 1 completed
[*********************100%***********************]  1 of 1 completed
[*********************100%***********************]  1 of 1 completed
[*********************100%***********************]  1 of 1 completed
[*********************100%***********************]  1 of 1 completed


This constructs a basic `Data` object with the open, high, low, close and adjusted close prices (OHLC) and the volume of each listed ticker. A `Data` instance holds several dataframes with information about the underlying tickers. The main data source is the `raw_data` attribute, a dictionary that maps each ticker (key) to a dataframe with all its associated data. By default, the `Data` constructor also creates a `log_return` column, as we can see:

In [31]:
data.raw_data['AAPL']

Date,open,high,low,close,adj close,volume,log_return
2018-09-04,57.102501,57.294998,56.657501,57.090000,55.309586,109560400,
2018-09-05,57.247501,57.417500,56.275002,56.717499,54.948704,133332000,-0.006546
2018-09-06,56.557499,56.837502,55.325001,55.775002,54.035595,137160000,-0.016757
2018-09-07,55.462502,56.342499,55.177502,55.325001,53.599628,150479200,-0.008101
2018-09-10,55.237499,55.462502,54.117500,54.582500,52.880280,158066000,-0.013512
...,...,...,...,...,...,...,...
2021-08-26,148.350006,149.119995,147.509995,147.539993,147.539993,48597200,-0.005542
2021-08-27,147.479996,148.750000,146.830002,148.600006,148.600006,55721500,0.007159
2021-08-30,149.000000,153.490005,148.610001,153.119995,153.119995,90956700,0.029964
2021-08-31,152.660004,152.800003,151.289993,151.830002,151.830002,86453100,-0.008460
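
The `log_return` values in the table are consistent with the natural log of the ratio of consecutive closing prices. A minimal sketch of that calculation, assuming pandas and NumPy and using the first three `close` values above (this is not necessarily how `quantlib` computes the column internally):

```python
import numpy as np
import pandas as pd

# First three AAPL closing prices from the table above.
close = pd.Series(
    [57.090000, 56.717499, 55.775002],
    index=pd.to_datetime(["2018-09-04", "2018-09-05", "2018-09-06"]),
    name="close",
)

# Log return: ln(P_t / P_{t-1}). The first row has no previous price, so it is NaN.
log_return = np.log(close / close.shift(1))
print(log_return.round(6))
```

The two non-missing values round to the -0.006546 and -0.016757 shown in the table.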


`Data` instances also expose a `log_returns` dataframe, which collects the log returns of all the listed tickers, one column per ticker. Dates on which a ticker has no data (NIO in the first rows below) appear as missing values.

In [32]:
data.log_returns

Date,AAPL,TSLA,KO,NIO,SPY
2018-09-05,-0.006546,-0.028825,0.013757,,-0.002695
2018-09-06,-0.016757,0.000748,0.008558,,-0.003015
2018-09-07,-0.008101,-0.065111,-0.001093,,-0.001945
2018-09-10,-0.013512,0.081176,0.007409,,0.001737
2018-09-11,0.024969,-0.021454,-0.000869,,0.003292
...,...,...,...,...,...
2021-08-26,-0.005542,-0.014218,-0.009497,-0.017874,-0.005921
2021-08-27,0.007159,0.015229,0.001979,-0.005504,0.008901
2021-08-30,0.029964,0.026325,0.009479,-0.006592,0.004388
2021-08-31,-0.008460,0.006559,0.002311,0.039170,-0.001483
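
One way to picture `log_returns` is as the per-ticker `log_return` columns placed side by side. The sketch below assumes a dictionary shaped like `raw_data`; the toy frames and the variable name `raw` are made up for illustration, and how `Data` builds the frame internally may differ:

```python
import pandas as pd

dates = pd.to_datetime(["2018-09-04", "2018-09-05", "2018-09-06"])
nan = float("nan")

# Toy stand-in for data.raw_data: one dataframe per ticker with a log_return column.
raw = {
    "AAPL": pd.DataFrame({"log_return": [nan, -0.006546, -0.016757]}, index=dates),
    "SPY": pd.DataFrame({"log_return": [nan, -0.002695, -0.003015]}, index=dates),
}

# Align the per-ticker series on the date index, one column per ticker,
# then drop dates where every ticker is missing (the first row here).
log_returns = pd.concat(
    {ticker: frame["log_return"] for ticker, frame in raw.items()}, axis=1
).dropna(how="all")
print(log_returns)
```

Dropping only the rows where every ticker is missing matches the table above, which starts on 2018-09-05 but still keeps the rows where NIO alone has no value.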


You can also print the expected returns:

In [33]:
data.expected_returns

AAPL    0.001345
TSLA    0.003371
KO      0.000442
NIO     0.002378
SPY     0.000661
dtype: float64
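
If `expected_returns` is the sample mean of each column of `log_returns` (an assumption; the estimator is not stated here), it can be sketched as follows, using the first five AAPL and SPY rows shown earlier:

```python
import pandas as pd

# First five rows of the log_returns table, AAPL and SPY columns only.
log_returns = pd.DataFrame(
    {
        "AAPL": [-0.006546, -0.016757, -0.008101, -0.013512, 0.024969],
        "SPY": [-0.002695, -0.003015, -0.001945, 0.001737, 0.003292],
    },
    index=pd.to_datetime(
        ["2018-09-05", "2018-09-06", "2018-09-07", "2018-09-10", "2018-09-11"]
    ),
)

# Column-wise sample mean; pandas skips missing values by default.
expected = log_returns.mean()
print(expected)
```

These five rows are only a slice of the sample, so the result will not match the full-period values printed above.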

Or the variance-covariance matrix:

In [34]:
data.covariance

,AAPL,TSLA,KO,NIO,SPY
AAPL,0.000503,0.00043,0.000147,0.000348,0.000255
TSLA,0.00043,0.001866,0.00014,0.000827,0.000279
KO,0.000147,0.00014,0.000232,5e-05,0.000149
NIO,0.000348,0.000827,5e-05,0.003843,0.000226
SPY,0.000255,0.000279,0.000149,0.000226,0.000205
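
The diagonal of this matrix holds each ticker's variance and the off-diagonal entries hold pairwise covariances. A matrix of this shape can be obtained from a returns frame with pandas' `DataFrame.cov`; whether `Data` uses this exact estimator is an assumption. The sketch reuses a few rows of the `log_returns` table:

```python
import numpy as np
import pandas as pd

# First five rows of the log_returns table for three of the tickers.
log_returns = pd.DataFrame(
    {
        "AAPL": [-0.006546, -0.016757, -0.008101, -0.013512, 0.024969],
        "TSLA": [-0.028825, 0.000748, -0.065111, 0.081176, -0.021454],
        "KO": [0.013757, 0.008558, -0.001093, 0.007409, -0.000869],
    }
)

# Pairwise sample covariance (ddof=1); the result is symmetric.
covariance = log_returns.cov()

# Per-ticker volatility is the square root of the diagonal (the variances).
volatility = pd.Series(np.sqrt(np.diag(covariance)), index=covariance.index)
print(covariance)
print(volatility)
```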


## Indicators

We use the `ta` package as our indicator engine (the interested reader can find its documentation [here](https://github.com/bukosabino/ta)). To add an indicator that `ta` supports, use the `add_ta_indicator` method of a `Data` instance. This method requires the indicator name and a dictionary with the keyword arguments the indicator needs. For example, to attach an RSI indicator to all of our listed tickers, we could do the following:

In [36]:
%autoreload
data.add_ta_indicator(indicator_name='RSIIndicator',
                      indicator_kwargs={'window': 17})

In [38]:
data.raw_data['AAPL']

Date,open,high,low,close,adj close,volume,log_return,rsi
2018-09-04,57.102501,57.294998,56.657501,57.090000,55.309586,109560400,,
2018-09-05,57.247501,57.417500,56.275002,56.717499,54.948704,133332000,-0.006546,
2018-09-06,56.557499,56.837502,55.325001,55.775002,54.035595,137160000,-0.016757,
2018-09-07,55.462502,56.342499,55.177502,55.325001,53.599628,150479200,-0.008101,
2018-09-10,55.237499,55.462502,54.117500,54.582500,52.880280,158066000,-0.013512,
...,...,...,...,...,...,...,...,...
2021-08-26,148.350006,149.119995,147.509995,147.539993,147.539993,48597200,-0.005542,54.466303
2021-08-27,147.479996,148.750000,146.830002,148.600006,148.600006,55721500,0.007159,56.812830
2021-08-30,149.000000,153.490005,148.610001,153.119995,153.119995,90956700,0.029964,64.987510
2021-08-31,152.660004,152.800003,151.289993,151.830002,151.830002,86453100,-0.008460,61.459852
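
For intuition about what the `rsi` column contains, here is a pandas-only sketch of the relative strength index with Wilder-style exponential smoothing. It is an illustration rather than the `ta` implementation; the helper name `rsi_sketch` and the toy price series are made up, and the values may differ slightly from what `ta` produces.

```python
import pandas as pd


def rsi_sketch(close: pd.Series, window: int = 14) -> pd.Series:
    """Relative strength index: 100 - 100 / (1 + avg_gain / avg_loss)."""
    delta = close.diff()
    gain = delta.clip(lower=0)     # upward moves, 0 on down days
    loss = (-delta).clip(lower=0)  # downward moves as positive numbers
    # Wilder smoothing is an exponential average with alpha = 1 / window.
    # min_periods leaves the first `window` rows undefined (NaN).
    avg_gain = gain.ewm(alpha=1 / window, min_periods=window, adjust=False).mean()
    avg_loss = loss.ewm(alpha=1 / window, min_periods=window, adjust=False).mean()
    rs = avg_gain / avg_loss
    return 100 - 100 / (1 + rs)


# Toy price series (not market data): a slow downtrend with a repeating wobble.
prices = pd.Series([100 + (i % 5) - 0.1 * i for i in range(40)])
rsi = rsi_sketch(prices, window=17)
print(rsi.tail())
```

With `window=17` the first 17 rows are undefined, which is why the head of the table above shows empty `rsi` cells.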


In [42]:
import inspect
ind = data.indicators['RSIIndicator']['AAPL']

In [44]:
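# `inspect` is a module and cannot be called directly (`inspect(ind)` raises
# "TypeError: 'module' object is not callable"). Call one of its functions
# instead, for example to list the members of the indicator object:
inspect.getmembers(ind)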