# Advanced Data Science & Python for Finance <br><br> Capital IQ + Backtesting & Intrinio API

-----

FIN580-59305

Jose Luis Rodriguez

Director of the Margolis Market Information Lab at the University of Illinois at Urbana-Champaign.

* linkedin.com/in/jlroo
* github.com/jlroo

-----

* [Intrinio API](#intrinio)
* [Common Financial Analyses](#commonanalyses)
* [Building A Trading Strategy](#tradingstrategy)
* [Backtesting with Pandas and Matplotlib](#backtesting)
* [Backtrader](#backtrader)
    

## Packages and Settings

First, make sure that the API credentials are stored in a separate, secure file to minimize exposure. We will use the ``configparser`` package to read the credentials.
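
As a minimal sketch of that setup, the snippet below writes a throwaway credentials file and reads it back with `configparser`. The `[intrinio]` section and the `app_key` option follow the lookup used in the Intrinio section of this notebook; the temporary path and the placeholder key are assumptions for illustration only.

```python
import configparser as cp
import os
import tempfile

# Hypothetical file contents: one section per service, one option per secret.
example_cfg = "[intrinio]\napp_key = YOUR_SANDBOX_KEY\n"

# Write the example to a temporary file so the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".cfg", delete=False) as fh:
    fh.write(example_cfg)
    cfg_path = fh.name

cfg = cp.ConfigParser()
read_files = cfg.read(cfg_path)  # returns the list of files that were parsed

# cfg.read() silently skips unreadable paths, so check the result explicitly.
if not read_files:
    raise FileNotFoundError(cfg_path)

api_key = cfg["intrinio"]["app_key"]
os.remove(cfg_path)
print(api_key)
```

Checking the return value of `cfg.read` is a design choice here: a wrong path otherwise only surfaces later as a `KeyError` on the section lookup.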

**Configuration**

In [None]:
import intrinio_sdk
import configparser as cp

**Scientific Analysis**

In [None]:
import pandas as pd

## Capital IQ - S&P Composite 1500 (^SP1500) > Constituents


<img src="https://compass2g.illinois.edu/bbcswebdav/courses/fin_580_120201_187292/sp1500.png"></img>

<br>

1. Download the constituents of the S&P 1500 from Capital IQ.
2. After downloading the constituents as an Excel file, read the file using the pandas `read_excel` function.
3. When you read the file, make sure to skip the empty rows at the header and footer of the Excel file.


In [None]:
sp_df = pd.read_excel("../data/SP1500.xls", skiprows = 14, skipfooter = 12)
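
To see what `skiprows` and `skipfooter` do, here is a small sketch on made-up data. It uses `read_csv`, which accepts the same two arguments, so that no Excel file is needed; the report layout and the values are invented for illustration.

```python
import io
import pandas as pd

# Made-up export: two banner lines, a header row, two data rows, one footer line.
lines = [
    "Report title",
    "Generated for illustration",
    "company,ticker",
    "Alpha Corp,AAA",
    "Beta Inc,BBB",
    "Source: example footer",
]

# skiprows drops lines from the top, skipfooter drops lines from the bottom.
# skipfooter is only supported by the Python parsing engine in read_csv.
df = pd.read_csv(io.StringIO("\n".join(lines)),
                 skiprows=2, skipfooter=1, engine="python")
print(df)
```

The right counts depend on the export, so it is worth checking `head()` and `tail()` after reading the real file.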

In [None]:
sp_df.head()

In [None]:
sp_df.tail()

In [None]:
sp_df.columns

In [None]:
sp_df.columns = ['company', 'Exchange:Ticker', 'currency',
                 'marketcap_mm', 'revenue_mm', 'pct_price_change_lastday',
                 'pct_pricechange_30day', 'pct_price_change_ytd', 'pct_price_change_12_month', 
                 'price_close', 'price_earnings_ratio', 'price_bookvalue_ratio', 'industry']

In [None]:
sp_df['exchange'] = sp_df['Exchange:Ticker'].apply(lambda i:i.split(":")[0])
sp_df['ticker']  = sp_df['Exchange:Ticker'].apply(lambda i:i.split(":")[1])

In [None]:
sp_df = sp_df.drop(columns=['Exchange:Ticker','currency'])
sp_df.columns
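
The same split can also be done without `apply`. The sketch below, on made-up identifiers, uses the pandas string accessor with `expand=True`; the toy frame and its values are assumptions for illustration.

```python
import pandas as pd

# Made-up identifiers in the same 'Exchange:Ticker' layout.
toy = pd.DataFrame({"Exchange:Ticker": ["NYSE:AAA", "NasdaqGS:BBB"]})

# expand=True returns one column per piece; n=1 splits on the first colon only.
parts = toy["Exchange:Ticker"].str.split(":", n=1, expand=True)
toy["exchange"] = parts[0]
toy["ticker"] = parts[1]

toy = toy.drop(columns=["Exchange:Ticker"])
print(toy)
```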

**Reorder columns**

In [None]:
sp_df = sp_df[['company','ticker', 'price_close', 'pct_price_change_lastday', 
               'pct_pricechange_30day', 'pct_price_change_ytd', 'pct_price_change_12_month',
               'price_earnings_ratio', 'price_bookvalue_ratio', 'marketcap_mm', 
               'revenue_mm','exchange', 'industry']]

sp_df.shape

**Remove the percentage sign and parentheses from the percentage-change columns**

In [None]:
def clean_pct(values):
    # Drop '-' characters first, then turn '(1.5%)' into '-1.5' and convert to numbers
    values = values.apply(lambda x: x.replace('-', ''))
    values = values.apply(lambda x: x.replace('(', '-'))
    values = values.apply(lambda x: x.replace(')', ''))
    values = values.apply(lambda x: x.replace('%', ''))
    return pd.to_numeric(values)

pct_price_change_lastday = clean_pct(sp_df['pct_price_change_lastday'])
pct_pricechange_30day = clean_pct(sp_df['pct_pricechange_30day'])
pct_price_change_ytd = clean_pct(sp_df['pct_price_change_ytd'])
pct_price_change_12_month = clean_pct(sp_df['pct_price_change_12_month'])

In [None]:
sp_df['pct_price_change_lastday'] = pct_price_change_lastday
sp_df['pct_pricechange_30day'] = pct_pricechange_30day
sp_df['pct_price_change_ytd'] = pct_price_change_ytd
sp_df['pct_price_change_12_month'] = pct_price_change_12_month

**After converting all the values to numeric, drop any NAs from the data frame**

In [None]:
sp_df = sp_df.dropna()
sp_df.shape

**Save final dataframe to csv**

In [None]:
sp_df.to_csv("../data/SP1500.csv")

**Now find the industries with at least 10 companies**

In [None]:
industry_counts = sp_df['industry'].value_counts()
industry_counts[industry_counts >= 10]

**Create a new dataframe with only the industry that you selected**

In [None]:
oil = sp_df[sp_df['industry']=='Oil and Gas Exploration and Production']
oil.head()

In [None]:
tickers = oil['ticker'].to_list()
tickers[:5]

<a id='intrinio'></a>
## Intrinio API 
**Secure method to load API credentials**

In [None]:
cfg = cp.ConfigParser()
cfg.read('../resources/credentials.cfg')

**Connect to Intrinio API using your sandbox API key**

In [None]:
API_KEY = cfg['intrinio']['app_key']

intrinio_sdk.ApiClient().configuration.api_key['api_key'] = API_KEY

security_api = intrinio_sdk.SecurityApi()

**Intrinio API Request**

In [None]:
# ~120 trading days (bdate_range counts weekdays and does not exclude market holidays)
len(pd.bdate_range('2019-11-01','2020-04-28'))
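
`bdate_range` counts Monday-to-Friday dates, so the figure above slightly overstates the number of trading sessions. As a rough sketch, holidays can be excluded with `freq='C'` and a `holidays` list. The closure dates below are typed in by hand as an assumption and should be checked against an exchange calendar.

```python
import pandas as pd

start, end = "2019-11-01", "2020-04-28"

# Default frequency 'B' counts every Monday-Friday date in the window.
weekdays = pd.bdate_range(start, end)

# Assumption: hand-typed list of NYSE full-day closures inside the window.
nyse_closures = [
    "2019-11-28",  # Thanksgiving
    "2019-12-25",  # Christmas
    "2020-01-01",  # New Year's Day
    "2020-01-20",  # Martin Luther King Jr. Day
    "2020-02-17",  # Presidents' Day
    "2020-04-10",  # Good Friday
]

# freq='C' (custom business day) lets bdate_range skip the listed holidays.
trading_days = pd.bdate_range(start, end, freq="C", holidays=nyse_closures)

print(len(weekdays), len(trading_days))
```

With these six closures the window has 128 weekdays and 122 trading days.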

In [None]:
# date | Return prices on or after the date (optional)
start_date = '2019-11-01'

# date | Return prices on or before the date (optional)
end_date = '2020-04-28'

# str | Return stock prices in the given frequency (optional) (default to daily)
frequency = 'daily' 

## Making multiple requests to the Intrinio API

In [None]:
dfs = []

for ticker in tickers:
    prices = []
    next_page = ''
    # Request pages until the API stops returning a next_page token
    while True:
        response = security_api.get_security_stock_prices(ticker,
                                                          start_date = start_date,
                                                          end_date = end_date,
                                                          frequency = frequency,
                                                          next_page = next_page)
        prices.extend(p.to_dict() for p in response.stock_prices)
        next_page = response.next_page
        if not next_page:
            break
    df = pd.DataFrame(prices)
    df['secid'] = ticker
    dfs.append(df)
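
A price history can span several pages, with each response carrying a `next_page` token for the following one. As a minimal sketch of following that token until it runs out, the snippet below uses a hypothetical in-memory `fetch_page` function in place of the Intrinio client; `FAKE_PAGES`, `fetch_page`, and `fetch_all` are illustrative names, not part of the SDK.

```python
from typing import List, Optional, Tuple

# Hypothetical stand-in for a paginated endpoint: three "pages" of records.
FAKE_PAGES = {
    "": (["p1", "p2"], "token-2"),
    "token-2": (["p3", "p4"], "token-3"),
    "token-3": (["p5"], None),
}

def fetch_page(next_page: str = "") -> Tuple[List[str], Optional[str]]:
    """Return (records, token for the following page or None)."""
    return FAKE_PAGES[next_page]

def fetch_all() -> List[str]:
    """Collect records from every page by following the token chain."""
    records: List[str] = []
    next_page = ""
    while True:
        page, token = fetch_page(next_page)
        records.extend(page)
        if not token:  # None or empty string both mean "no more pages"
            break
        next_page = token
    return records

all_records = fetch_all()
print(all_records)
```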

In [None]:
oil_df = pd.concat(dfs)
oil_df.index = pd.DatetimeIndex(oil_df['date'])
oil_df = oil_df.drop('date', axis=1)
oil_df.index.name = None

# Sort by the datetime index
oil_df = oil_df.sort_index()
oil_df.shape
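
Because the combined frame stacks every ticker, each date appears once per ticker in the index. The sketch below repeats the same steps on made-up records for two hypothetical tickers and then filters on `secid` to recover a single price series; all names and values are assumptions for illustration.

```python
import pandas as pd

# Made-up price records for two hypothetical tickers, deliberately out of date order.
a = pd.DataFrame({"date": ["2020-01-03", "2020-01-02"], "close": [10.5, 10.0], "secid": "AAA"})
b = pd.DataFrame({"date": ["2020-01-02", "2020-01-03"], "close": [20.0, 19.5], "secid": "BBB"})

combined = pd.concat([a, b])
combined.index = pd.DatetimeIndex(combined["date"])
combined = combined.drop("date", axis=1)
combined.index.name = None
combined = combined.sort_index()

# Each date appears once per ticker, so filter on secid to get a single series.
aaa_close = combined.loc[combined["secid"] == "AAA", "close"]
print(aaa_close)
```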

In [None]:
oil_df.head()

In [None]:
oil_df.to_csv("../data/oil_df-jul.csv")