# Loading and Adjusting Sharadar US Equity Prices

This notebook demonstrates and tests the get_pricing method for loading and adjusting prices from the Sharadar SEP bundle (ingested with the sep_ingest notebook in this repository). The example consists of four steps: (1) retrieving adjusted data with get_pricing from Zipline, (2) reading split-adjusted data from the Sharadar files, (3) unadjusting the Sharadar data for splits so that unadjusted dividend ratios can be calculated, and (4) applying these dividend ratios to the Sharadar split-adjusted prices and comparing the result with the data returned by get_pricing.

In [1]:
import os
import time
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

import warnings
warnings.filterwarnings('ignore')

Importing the fsharadar.sep module from the Flounder extension

In [2]:
from fsharadar import sep

## 1. Getting Historical Prices Adjusted for Splits and Dividends

In [3]:
start_date = "2011-01-03"
end_date = "2021-02-12"

Selecting one field from the Sharadar SEP bundle

In [4]:
sep.bundle_tags

['close', 'high', 'low', 'open', 'volume']

In [5]:
field = 'close'

Getting all tickers maintained in the Sharadar SEP bundle

In [6]:
sep_bundle_data = sep.load()
sep_asset_finder = sep_bundle_data.asset_finder
sep_assets = sep_asset_finder.retrieve_all(sep_asset_finder.sids)
sep_tickers = [asset.symbol for asset in sep_assets]
len(sep_tickers)

11663

Applying get_pricing for loading adjusted stock prices

In [7]:
%%time
prices = sep.get_pricing(sep_tickers, start_date, end_date, field)

CPU times: user 22.5 s, sys: 1.65 s, total: 24.1 s
Wall time: 23.1 s


In [8]:
prices.head(2)

,Equity(101386 [GBBT]),Equity(101501 [BBUCQ]),Equity(101512 [GOVB]),Equity(101923 [CIBN]),Equity(103609 [AMCRY]),Equity(103618 [DASTF]),Equity(103628 [AXAHY]),Equity(103638 [NHYDY]),Equity(103642 [KKPNY]),Equity(103688 [BASFY]),...,Equity(633630 [CTAQU]),Equity(633631 [DWIN.U]),Equity(633632 [HMCOU]),Equity(633633 [ALTUU]),Equity(633634 [MUDSU]),Equity(633635 [NOACU]),Equity(633636 [SVFAU]),Equity(633638 [IPOC.U]),Equity(633639 [GSAH.U]),Equity(633851 [FSII])
2011-01-03 00:00:00+00:00,,0.01,6.787,4.063,15.488,,10.385,5.366,5.035,13.596,...,,,,,,,,,,4.41
2011-01-04 00:00:00+00:00,,0.01,6.787,4.063,15.342,,10.574,5.266,5.055,13.246,...,,,,,,,,,,4.17


## 2. Loading Sharadar Files for Testing Zipline Adjusted Prices

Tickers (with sids)

In [9]:
sharadar_tickers_file = "./SHARADAR_TICKERS.csv"

In [10]:
tickers_df = pd.read_csv(sharadar_tickers_file)
print(len(tickers_df.ticker.unique()))

23794


SEP (Sharadar Equity Prices) 

In [11]:
sharadar_sep_file = "./SHARADAR_SEP.csv"

In [12]:
%%time
from fsharadar.sep.ingest import read_sep_file

sep_df = read_sep_file(sharadar_sep_file, tickers_df)
print(len(sep_df.symbol.unique()))

11667
CPU times: user 18.3 s, sys: 1.13 s, total: 19.4 s
Wall time: 20.2 s


In [13]:
raw_data = sep_df.set_index(['symbol', 'date'])
raw_data.sort_index(level=0, inplace=True)

In [14]:
raw_data.head(2)

symbol,date,open,high,low,close,volume,dividends
A,2011-01-03,41.56,42.14,41.411,41.88,3572300.0,0.0
A,2011-01-04,41.99,42.1,41.18,41.49,3588900.0,0.0
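
As a small illustration of why the data is indexed by (symbol, date), the sketch below builds a toy frame with the same two index levels and pulls out one ticker with xs, which is how the per-ticker slices are taken later in this notebook. The values and the names toy, toy_indexed and a_close are hypothetical and only meant to show the indexing pattern.

```python
import pandas as pd

# Toy long-format data with hypothetical values, deliberately unsorted
toy = pd.DataFrame({
    "symbol": ["B", "A", "A", "B"],
    "date": pd.to_datetime(
        ["2011-01-03", "2011-01-04", "2011-01-03", "2011-01-04"]
    ),
    "close": [10.0, 41.49, 41.88, 10.5],
})

# Index by (symbol, date) and sort so that each ticker forms a contiguous block
toy_indexed = toy.set_index(["symbol", "date"]).sort_index()

# xs drops the symbol level and returns a date-indexed slice for one ticker
a_close = toy_indexed.xs("A")["close"]
print(a_close)
```

Sorting the index first keeps each ticker's rows together and in date order, so the slice comes back as an ordinary time series.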


Actions

In [15]:
actions_file = "./SHARADAR_ACTIONS.csv"

In [16]:
%%time

from fsharadar.sep.ingest import read_actions_file

# read actions_file (with splits)
actions_df = read_actions_file(actions_file, tickers_df)

CPU times: user 288 ms, sys: 4.13 ms, total: 292 ms
Wall time: 309 ms


## 3. Unadjusting Sharadar Split-Adjusted Prices

In [17]:
%%time

from fsharadar.sep.ingest import unadjust_splits

unadj_sep_df = unadjust_splits(sep_df, actions_df)

CPU times: user 1min 32s, sys: 1.44 s, total: 1min 33s
Wall time: 1min 33s


In [18]:
unadj_raw_data = unadj_sep_df.set_index(['symbol', 'date'])
unadj_raw_data.sort_index(level=0, inplace=True)
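
To make the idea of unadjusting concrete, here is a minimal sketch on toy data. It assumes that a split-adjusted price before a split has been divided by the split ratio, so the as-traded price is recovered by multiplying it by the product of all split ratios that take effect after that date. The values and variable names are hypothetical, and this is not the unadjust_splits implementation from fsharadar, which may handle more cases (for example volume or multiple action types).

```python
import pandas as pd

dates = pd.date_range("2021-01-04", periods=4, freq="B")

# Split-adjusted closes (hypothetical): the series looks continuous
split_adj_close = pd.Series([50.0, 51.0, 52.0, 53.0], index=dates)

# A 2-for-1 split takes effect on the third day; 1.0 means no split
split_ratio = pd.Series([1.0, 1.0, 2.0, 1.0], index=dates)

# Factor for date t = product of the ratios of all splits effective after t.
# A reverse cumulative product includes the ratio at t itself, so shift by one.
factors = split_ratio[::-1].cumprod()[::-1].shift(-1).fillna(1.0)

# As-traded prices: the pre-split days are scaled back up
unadj_close = split_adj_close * factors
print(unadj_close)
```

In this toy case the two pre-split closes become 100.0 and 102.0, while the closes on and after the split date are unchanged.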

## 4. Checking Differences in Adjusted Prices

In [19]:
def apply_dividend_ratios(split_adj_df, dividend_ratios, column):
    
    # Reverse the DataFrame order, sorting by date in descending order
    split_adj_df = split_adj_df.sort_index(ascending=False)
    
    split_adj_prices = split_adj_df[column]
    cum_dividend_ratios = dividend_ratios.cumprod()
    adj_ts = split_adj_prices*cum_dividend_ratios

    # Restore ascending date order for the adjusted series
    adj_ts.sort_index(ascending=True, inplace=True)

    return adj_ts
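
As a rough illustration of the backward dividend adjustment that this section tests, the sketch below works through toy data. It assumes the Zipline-style convention in which the ratio for an ex-date is 1 - dividend / previous close and applies to all earlier dates. The prices, dividends and variable names are hypothetical, and calc_dividend_ratios from fsharadar may differ in detail.

```python
import pandas as pd

dates = pd.date_range("2021-01-04", periods=5, freq="B")

# Toy closes and cash dividends (hypothetical); the ex-date is the third day
close = pd.Series([100.0, 101.0, 99.0, 100.0, 102.0], index=dates)
dividends = pd.Series([0.0, 0.0, 1.0, 0.0, 0.0], index=dates)

# Assumed convention: ratio on an ex-date = 1 - dividend / previous close.
# The first day has no previous close, so its ratio is set to 1.0.
prev_close = close.shift(1)
ratios = (1.0 - dividends / prev_close).fillna(1.0)

# Each ratio applies to the dates before its ex-date: take a reverse
# cumulative product, then shift by one so the ex-date itself is excluded.
factors = ratios[::-1].cumprod()[::-1].shift(-1).fillna(1.0)

adjusted = close * factors
print(adjusted)
```

Here only the two closes before the ex-date are scaled, by 1 - 1/101, and the closes from the ex-date onward are left unchanged.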

The following code runs through each ticker, calculates the unadjusted dividend ratios, applies them to the Sharadar split-adjusted prices, and checks the result for discrepancies with the corresponding adjusted prices from the get_pricing method of the fsharadar.sep module. It prints any ticker for which the standard deviation of the difference exceeds 5e-4.

In [20]:
%%time

from fsharadar.sep.ingest import calc_dividend_ratios

for i, asset in enumerate(prices.columns):
    
    ticker = asset.symbol
    
    unadj_raw_xs = unadj_raw_data.xs(ticker) 
    dividend_ratios = calc_dividend_ratios(unadj_raw_xs)
    
    split_adj_xs = raw_data.xs(ticker) 
    adj_raw_ts = apply_dividend_ratios(split_adj_xs, dividend_ratios, field).dropna()
    
    asset_prices_ts = prices[asset].dropna() 
    
    std_diff = np.std(adj_raw_ts.values - asset_prices_ts.values)
    
    if std_diff > 5.e-4:
        print(i, ticker, std_diff)  

CPU times: user 26.9 s, sys: 0 ns, total: 26.9 s
Wall time: 26.9 s
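
The loop above subtracts the two .values arrays positionally, which presumes that both series cover exactly the same dates. A possible variation, sketched here on toy data, aligns the series on their shared dates before differencing. The helper std_of_difference is hypothetical and not part of fsharadar; with the real data, the two indexes would also need the same timezone handling first, since the get_pricing output shown earlier carries UTC timestamps.

```python
import numpy as np
import pandas as pd

def std_of_difference(lhs, rhs):
    """Hypothetical helper: std of lhs - rhs over the dates both series share."""
    lhs_common, rhs_common = lhs.align(rhs, join="inner")
    return float(np.std(lhs_common.values - rhs_common.values))

dates = pd.date_range("2021-01-04", periods=5, freq="B")

# Two toy series that overlap on three of the five dates
lhs = pd.Series([1.0, 2.0, 3.0, 4.0], index=dates[:4])
rhs = pd.Series([2.0, 3.0, 4.5, 5.0], index=dates[1:])

diff_std = std_of_difference(lhs, rhs)
print(diff_std)
```

Aligning with an inner join trades strictness for robustness: a length mismatch no longer raises an error, so a separate check on the number of shared dates may be worth adding.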
