### Predicting S&P500 stock returns using neural networks: acquiring and preparing market index data

This program starts from a CSV data file downloaded from Yahoo Finance for SPY, the S&P 500 market index ETF. It then applies data transformations similar to those in the main data acquisition program. It also takes the dates selected in the main data acquisition program as input so that the two datasets can be merged correctly.

Previous versions of the program included other market indices as well.

It creates an output file, Indices_Wide_V4.CSV, that is then used for network training and testing.

 
 
March 2018

Murat Aydogdu

In [51]:
import sys
import warnings
if not sys.warnoptions:
    warnings.simplefilter('ignore')

In [52]:
from IPython.display import display
import pandas as pd
import numpy as np
import datetime

In [53]:
pd.set_option('display.max_columns', None)
pd.options.display.max_rows = 100
pd.options.display.float_format = '{:20,.4f}'.format

In [54]:
# If other indices are used, they can be read in and concatenated here
spy = pd.read_csv("SPY.CSV")
spy['Ticker'] = 'SPY'

df = spy
display(df)

Unnamed: 0,Date,Open,High,Low,Close,Adj Close,Volume,Ticker
0,1998-01-20,96.6875,98.0156,96.5000,97.8750,68.2305,5091700,SPY
1,1998-01-21,97.2187,97.6875,96.1562,96.9375,67.5770,4699400,SPY
2,1998-01-22,96.1562,96.8750,95.8750,96.0781,66.9779,4543400,SPY
3,1998-01-23,96.5000,96.7812,95.0000,95.9375,66.8798,6350300,SPY
4,1998-01-26,96.3750,96.7343,95.4062,95.8750,66.8363,4362900,SPY
5,1998-01-27,95.8125,97.5000,95.6562,96.8437,67.5116,7044200,SPY
6,1998-01-28,97.4062,98.1093,97.1875,97.7187,68.1216,4268600,SPY
7,1998-01-29,97.8437,99.5625,97.5625,98.2500,68.4919,8007700,SPY
8,1998-01-30,98.7812,98.9687,98.0000,98.3125,68.5355,3649100,SPY
9,1998-02-02,99.9062,100.5000,99.7500,99.9375,69.6683,5756300,SPY


In [55]:
df['V'] = df['Close']*df['Volume'] / 1000000
df['P'] = df['Adj Close']
# Work on an explicit copy so later column assignments do not touch a slice of spy
df = df[['Ticker','Date','V','P']].copy()
df.sort_values(by = ['Ticker','Date'], ascending=True, inplace=True)
print(df.shape)
display(df)

(5050, 4)


Unnamed: 0,Ticker,Date,V,P
0,SPY,1998-01-20,498.3501,68.2305
1,SPY,1998-01-21,455.5481,67.5770
2,SPY,1998-01-22,436.5212,66.9779
3,SPY,1998-01-23,609.2319,66.8798
4,SPY,1998-01-26,418.2930,66.8363
5,SPY,1998-01-27,682.1864,67.5116
6,SPY,1998-01-28,417.1220,68.1216
7,SPY,1998-01-29,786.7565,68.4919
8,SPY,1998-01-30,358.7521,68.5355
9,SPY,1998-02-02,575.2702,69.6683
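As a quick check on the dollar-volume definition, the short sketch below recomputes V for the first row of the raw data (1998-01-20) from the Close and Volume values shown in the first table. It is only an arithmetic illustration, not part of the original program. Note that V is built from the unadjusted Close, while P is the Adj Close.

```python
# Values copied from the 1998-01-20 row of the raw SPY table
close = 97.8750
volume = 5091700

# Dollar volume in millions: Close * Volume / 1,000,000
V = close * volume / 1000000
print(round(V, 4))
```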


In [56]:
# Summary statistics by ticker
def f(x):
    d = {}
    d['date_count'] = x['Date'].count()
    d['date_min'] = x['Date'].min()
    d['date_max'] = x['Date'].max()
    return pd.Series(d, index=['date_count', 'date_min', 'date_max'])

In [57]:
# Keep all the observations for now
dtdf = df

In [58]:
# The indices were downloaded from Yahoo Finance, and their
# beginning and ending dates differ from
# those of the stocks in the main dataset (which came from Quandl)
df_summary = dtdf.groupby('Ticker').apply(f)
df_summary = df_summary.sort_values(['date_max', 'date_count', 'date_min'], ascending=[True, False, True])
display(df_summary)

Ticker,date_count,date_min,date_max
SPY,5050,1998-01-20,2018-02-12


Construct variables using lags and leads.

Today is day *t*. For each day *t*, we need P~t~, P~t-N~, P~t+N~, and AV~t~
 
AV~t~ is the average dollar volume over the last N days, ending on (i.e., including) today. This parallels how returns are calculated.
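The lag, lead, and rolling-average definitions can be illustrated on a toy series. The sketch below is an assumption-laden example rather than the notebook's own data: the ticker `XYZ`, the prices, the volumes, and N = 2 are all made up to keep the output small.

```python
import pandas as pd

# Toy data: one hypothetical ticker, six trading days
toy = pd.DataFrame({
    'Ticker': ['XYZ'] * 6,
    'P': [10.0, 11.0, 12.0, 13.0, 14.0, 15.0],
    'V': [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
})
N = 2  # illustrative window; the notebook itself uses N = 20

# Price N days ago (lag) and N days ahead (lead), computed within each ticker
toy['P_lag'] = toy.groupby('Ticker')['P'].shift(N)
toy['P_lead'] = toy.groupby('Ticker')['P'].shift(-N)

# Mean of V over the N days ending today (V_t and V_{t-1} when N = 2)
toy['AV'] = (toy.groupby('Ticker')['V']
                .rolling(N).mean()
                .reset_index(0, drop=True))
print(toy)
```

The first N rows have no lagged price, the last N rows have no lead price, and the first N-1 rows have no rolling average, which is why the real data contains missing values at both ends before `dropna()` is applied.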

In [59]:
dtdf.sort_values(by = ['Ticker','Date'], ascending=True, inplace=True)

In [60]:
# m : minus, p: plus
# Negative shift values give leads ("forward lags")
# Return will be measured based on Pt and Pt-N
# Average dollar volume will be measured over N days: Vt through Vt-(N-1)

N = 20
lagN = N*1
leadN = N*-1

# Price: we need the N-lead and N-lag values
# Also, get the volatility of daily returns
cname = 'P'+'m'+str(1).zfill(2)
dtdf[cname] = dtdf.groupby('Ticker')['P'].shift(1)
dtdf['DR'] = (dtdf['P'] / dtdf['Pm01']) - 1
cname = 'P'+'m'+str(N).zfill(2)
dtdf[cname] = dtdf.groupby('Ticker')['P'].shift(lagN)
cname = 'P'+'p'+str(N).zfill(2)
dtdf[cname] = dtdf.groupby('Ticker')['P'].shift(leadN)
cname = 'AV'
# Average dollar volume of last N days, ending (including) today
dtdf[cname] = dtdf.groupby('Ticker')['V'].rolling(lagN).mean().reset_index(0,drop=True)
dtdf['DRSD'] = dtdf.groupby('Ticker')['DR'].rolling(lagN).std().reset_index(0,drop=True)

In [61]:
pd.options.display.max_rows = 50    
display(dtdf)

Unnamed: 0,Ticker,Date,V,P,Pm01,DR,Pm20,Pp20,AV,DRSD
0,SPY,1998-01-20,498.3501,68.2305,,,,72.1082,,
1,SPY,1998-01-21,455.5481,67.5770,68.2305,-0.0096,,71.7270,,
2,SPY,1998-01-22,436.5212,66.9779,67.5770,-0.0089,,72.2607,,
3,SPY,1998-01-23,609.2319,66.8798,66.9779,-0.0015,,72.5439,,
4,SPY,1998-01-26,418.2930,66.8363,66.8798,-0.0007,,71.9775,,
5,SPY,1998-01-27,682.1864,67.5116,66.8363,0.0101,,72.8707,,
6,SPY,1998-01-28,417.1220,68.1216,67.5116,0.0090,,73.2846,,
7,SPY,1998-01-29,786.7565,68.4919,68.1216,0.0054,,73.2846,,
8,SPY,1998-01-30,358.7521,68.5355,68.4919,0.0006,,73.1321,,
9,SPY,1998-02-02,575.2702,69.6683,68.5355,0.0165,,73.5460,,


In [62]:
# Compute the N-day return 
# and next period return that will determine Y

dtdf['R'] = (dtdf['P'] / dtdf['Pm20']) - 1
dtdf['YR'] = (dtdf['Pp20'] / dtdf['P']) - 1

dtdf = dtdf.dropna()
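To make the two return definitions concrete, the sketch below recomputes R and YR by hand for a single date. The three prices are copied from the 2002-01-31 row of the monthly output table in this notebook. The variable names are illustrative only.

```python
# Prices copied from the 2002-01-31 row of the monthly output table
P = 82.6768      # adjusted price today
Pm20 = 84.3935   # adjusted price 20 trading days earlier
Pp20 = 83.0859   # adjusted price 20 trading days later

R = P / Pm20 - 1    # return over the past 20 trading days
YR = Pp20 / P - 1   # return over the next 20 trading days
print(round(R, 4), round(YR, 4))
```

Both values agree with the R and YR columns reported for that date. Since YR is built from a future price, it can only serve as the prediction target and should not be used as an input feature.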

In [63]:
# These are the dates that were selected for the stocks in the
# main data acquisition program at this point of that program.
# This step ensures that the index data dates are the same as the stock data dates.
selected = ['2002-01-31',
 '2002-03-01',
 '2002-04-01',
 '2002-04-29',
 '2002-05-28',
 '2002-06-25',
 '2002-07-24',
 '2002-08-21',
 '2002-09-19',
 '2002-10-17',
 '2002-11-14',
 '2002-12-13',
 '2003-01-14',
 '2003-02-12',
 '2003-03-13',
 '2003-04-10',
 '2003-05-09',
 '2003-06-09',
 '2003-07-08',
 '2003-08-05',
 '2003-09-03',
 '2003-10-01',
 '2003-10-29',
 '2003-11-26',
 '2003-12-26',
 '2004-01-27',
 '2004-02-25',
 '2004-03-24',
 '2004-04-22',
 '2004-05-20',
 '2004-06-21',
 '2004-07-20',
 '2004-08-17',
 '2004-09-15',
 '2004-10-13',
 '2004-11-10',
 '2004-12-09',
 '2005-01-07',
 '2005-02-07',
 '2005-03-08',
 '2005-04-06',
 '2005-05-04',
 '2005-06-02',
 '2005-06-30',
 '2005-07-29',
 '2005-08-26',
 '2005-09-26',
 '2005-10-24',
 '2005-11-21',
 '2005-12-20',
 '2006-01-20',
 '2006-02-17',
 '2006-03-20',
 '2006-04-18',
 '2006-05-16',
 '2006-06-14',
 '2006-07-13',
 '2006-08-10',
 '2006-09-08',
 '2006-10-06',
 '2006-11-03',
 '2006-12-04',
 '2007-01-04',
 '2007-02-02',
 '2007-03-05',
 '2007-04-02',
 '2007-05-01',
 '2007-05-30',
 '2007-06-27',
 '2007-07-26',
 '2007-08-23',
 '2007-09-21',
 '2007-10-19',
 '2007-11-16',
 '2007-12-17',
 '2008-01-16',
 '2008-02-14',
 '2008-03-14',
 '2008-04-14',
 '2008-05-12',
 '2008-06-10',
 '2008-07-09',
 '2008-08-06',
 '2008-09-04',
 '2008-10-02',
 '2008-10-30',
 '2008-11-28',
 '2008-12-29',
 '2009-01-28',
 '2009-02-26',
 '2009-03-26',
 '2009-04-24',
 '2009-05-22',
 '2009-06-22',
 '2009-07-21',
 '2009-08-18',
 '2009-09-16',
 '2009-10-14',
 '2009-11-11',
 '2009-12-10',
 '2010-01-11',
 '2010-02-09',
 '2010-03-10',
 '2010-04-08',
 '2010-05-06',
 '2010-06-04',
 '2010-07-02',
 '2010-08-02',
 '2010-08-30',
 '2010-09-28',
 '2010-10-26',
 '2010-11-23',
 '2010-12-22',
 '2011-01-21',
 '2011-02-18',
 '2011-03-21',
 '2011-04-18',
 '2011-05-17',
 '2011-06-15',
 '2011-07-14',
 '2011-08-11',
 '2011-09-09',
 '2011-10-07',
 '2011-11-04',
 '2011-12-05',
 '2012-01-04',
 '2012-02-02',
 '2012-03-02',
 '2012-03-30',
 '2012-04-30',
 '2012-05-29',
 '2012-06-26',
 '2012-07-25',
 '2012-08-22',
 '2012-09-20',
 '2012-10-18',
 '2012-11-19',
 '2012-12-18',
 '2013-01-17',
 '2013-02-15',
 '2013-03-18',
 '2013-04-16',
 '2013-05-14',
 '2013-06-12',
 '2013-07-11',
 '2013-08-08',
 '2013-09-06',
 '2013-10-04',
 '2013-11-01',
 '2013-12-02',
 '2013-12-31',
 '2014-01-30',
 '2014-02-28',
 '2014-03-28',
 '2014-04-28',
 '2014-05-27',
 '2014-06-24',
 '2014-07-23',
 '2014-08-20',
 '2014-09-18',
 '2014-10-16',
 '2014-11-13',
 '2014-12-12',
 '2015-01-13',
 '2015-02-11',
 '2015-03-12',
 '2015-04-10',
 '2015-05-08',
 '2015-06-08',
 '2015-07-07',
 '2015-08-04',
 '2015-09-01',
 '2015-09-30',
 '2015-10-28',
 '2015-11-25',
 '2015-12-24',
 '2016-01-26',
 '2016-02-24',
 '2016-03-23',
 '2016-04-21',
 '2016-05-19',
 '2016-06-17',
 '2016-07-18',
 '2016-08-15',
 '2016-09-13',
 '2016-10-11',
 '2016-11-08',
 '2016-12-07',
 '2017-01-06',
 '2017-02-06',
 '2017-03-07',
 '2017-04-04',
 '2017-05-02',
 '2017-05-31',
 '2017-06-28',
 '2017-07-27',
 '2017-08-24',
 '2017-09-22',
 '2017-10-20',
 '2017-11-20',
 '2017-12-19']
#print selected

In [64]:
# Keep one observation per selected date (roughly monthly: 20 trading days apart)
dtdf = dtdf[dtdf['Date'].isin(selected)]
display(dtdf)

Unnamed: 0,Ticker,Date,V,P,Pm01,DR,Pm20,Pp20,AV,DRSD,R,YR
1013,SPY,2002-01-31,2253.3233,82.6768,81.7199,0.0117,84.3935,83.0859,1885.3035,0.0113,-0.0203,0.0049
1033,SPY,2002-03-01,2988.3592,83.0859,81.1939,0.0233,82.6768,83.9319,2490.9525,0.0137,0.0049,0.0102
1053,SPY,2002-04-01,2029.1493,83.9319,83.8953,0.0004,83.0859,78.2837,2182.3469,0.0082,0.0102,-0.0673
1073,SPY,2002-04-29,1894.0294,78.2837,78.6720,-0.0049,83.9319,79.1921,2045.6473,0.0107,-0.0673,0.0116
1093,SPY,2002-05-28,2620.0088,79.1921,79.6244,-0.0054,78.2837,71.7209,2276.7021,0.0147,0.0116,-0.0943
1113,SPY,2002-06-25,3254.1137,71.7209,73.3676,-0.0224,79.1921,62.2816,2515.6469,0.0136,-0.0943,-0.1316
1133,SPY,2002-07-24,9066.9717,62.2816,58.7749,0.0597,71.7209,70.3903,4265.8572,0.0252,-0.1316,0.1302
1153,SPY,2002-08-21,3794.4672,70.3903,69.3905,0.0144,62.2816,62.2669,4282.4974,0.0228,0.1302,-0.1154
1173,SPY,2002-09-19,4108.8392,62.2669,63.9210,-0.0259,70.3903,65.1822,3857.2105,0.0150,-0.1154,0.0468
1193,SPY,2002-10-17,6049.5048,65.1822,63.9121,0.0199,62.2669,66.9988,5363.2842,0.0270,0.0468,0.0279
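Filtering with `isin` silently drops any selected date that is absent from the index data, so it can be worth checking for such dates before merging. The following is a minimal sketch on made-up data. The names `index_dates` and `selected_dates` and their contents are hypothetical stand-ins, not the notebook's variables.

```python
import pandas as pd

# Hypothetical stand-ins for the index data and the selected date list
index_dates = pd.DataFrame({'Date': ['2002-01-30', '2002-01-31', '2002-03-01']})
selected_dates = ['2002-01-31', '2002-03-01', '2002-04-01']

# Dates that were requested but do not appear in the index data
missing = sorted(set(selected_dates) - set(index_dates['Date']))

# Rows that survive the filter
kept = index_dates[index_dates['Date'].isin(selected_dates)]
print(missing)
print(len(kept))
```

An empty `missing` list would indicate that every selected date has a matching index observation.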


In [65]:
# Rolling 12-month means and standard deviations. 
# They will be used to standardize returns and volumes with respect to each ticker's own history
frame = 12
dtdf['RTM'] = dtdf.groupby('Ticker')['R'].rolling(frame).mean().reset_index(0,drop=True)
dtdf['RTSD'] = dtdf.groupby('Ticker')['R'].rolling(frame).std().reset_index(0,drop=True)
dtdf['AVTM'] = dtdf.groupby('Ticker')['AV'].rolling(frame).mean().reset_index(0,drop=True)
dtdf['AVTSD'] = dtdf.groupby('Ticker')['AV'].rolling(frame).std().reset_index(0,drop=True)
dtdf['SDTM'] = dtdf.groupby('Ticker')['DRSD'].rolling(frame).mean().reset_index(0,drop=True)
dtdf['SDTSD'] = dtdf.groupby('Ticker')['DRSD'].rolling(frame).std().reset_index(0,drop=True)

dtdf['RT'] = (dtdf['R'] - dtdf['RTM']) / dtdf['RTSD']
dtdf['AVT'] = (dtdf['AV'] - dtdf['AVTM']) / dtdf['AVTSD']
dtdf['SDT'] = (dtdf['DRSD'] - dtdf['SDTM']) / dtdf['SDTSD']
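The time-series standardization above is a rolling z-score: each value is compared with the mean and standard deviation of its own trailing window, which includes the current observation. The sketch below shows the idea on a made-up series with a window of 3 instead of 12. The numbers are illustrative only.

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0, 4.0, 6.0])
frame = 3  # illustrative window; the notebook uses 12

roll_mean = s.rolling(frame).mean()
roll_std = s.rolling(frame).std()   # sample standard deviation (ddof=1)

# Rolling z-score: distance from the trailing mean in trailing std units
z = (s - roll_mean) / roll_std
print(z)
```

The first `frame - 1` values are missing because the window is not yet full, which matches the empty RT, AVT, and SDT entries in the early rows of the output.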

In [66]:
display(dtdf)

Unnamed: 0,Ticker,Date,V,P,Pm01,DR,Pm20,Pp20,AV,DRSD,R,YR,RTM,RTSD,AVTM,AVTSD,SDTM,SDTSD,RT,AVT,SDT
1013,SPY,2002-01-31,2253.3233,82.6768,81.7199,0.0117,84.3935,83.0859,1885.3035,0.0113,-0.0203,0.0049,,,,,,,,,
1033,SPY,2002-03-01,2988.3592,83.0859,81.1939,0.0233,82.6768,83.9319,2490.9525,0.0137,0.0049,0.0102,,,,,,,,,
1053,SPY,2002-04-01,2029.1493,83.9319,83.8953,0.0004,83.0859,78.2837,2182.3469,0.0082,0.0102,-0.0673,,,,,,,,,
1073,SPY,2002-04-29,1894.0294,78.2837,78.6720,-0.0049,83.9319,79.1921,2045.6473,0.0107,-0.0673,0.0116,,,,,,,,,
1093,SPY,2002-05-28,2620.0088,79.1921,79.6244,-0.0054,78.2837,71.7209,2276.7021,0.0147,0.0116,-0.0943,,,,,,,,,
1113,SPY,2002-06-25,3254.1137,71.7209,73.3676,-0.0224,79.1921,62.2816,2515.6469,0.0136,-0.0943,-0.1316,,,,,,,,,
1133,SPY,2002-07-24,9066.9717,62.2816,58.7749,0.0597,71.7209,70.3903,4265.8572,0.0252,-0.1316,0.1302,,,,,,,,,
1153,SPY,2002-08-21,3794.4672,70.3903,69.3905,0.0144,62.2816,62.2669,4282.4974,0.0228,0.1302,-0.1154,,,,,,,,,
1173,SPY,2002-09-19,4108.8392,62.2669,63.9210,-0.0259,70.3903,65.1822,3857.2105,0.0150,-0.1154,0.0468,,,,,,,,,
1193,SPY,2002-10-17,6049.5048,65.1822,63.9121,0.0199,62.2669,66.9988,5363.2842,0.0270,0.0468,0.0279,,,,,,,,,


In [67]:
# After the time series scaling, the mean and st dev variables are no longer needed
# Also past and future prices are no longer needed
dtdf.drop(['Pm01','Pm20', 'Pp20','DR', 'RTM', 'RTSD','SDTM','SDTSD', \
           'AVTM', 'AVTSD'], axis=1, inplace = True)
display(dtdf)

Unnamed: 0,Ticker,Date,V,P,AV,DRSD,R,YR,RT,AVT,SDT
1013,SPY,2002-01-31,2253.3233,82.6768,1885.3035,0.0113,-0.0203,0.0049,,,
1033,SPY,2002-03-01,2988.3592,83.0859,2490.9525,0.0137,0.0049,0.0102,,,
1053,SPY,2002-04-01,2029.1493,83.9319,2182.3469,0.0082,0.0102,-0.0673,,,
1073,SPY,2002-04-29,1894.0294,78.2837,2045.6473,0.0107,-0.0673,0.0116,,,
1093,SPY,2002-05-28,2620.0088,79.1921,2276.7021,0.0147,0.0116,-0.0943,,,
1113,SPY,2002-06-25,3254.1137,71.7209,2515.6469,0.0136,-0.0943,-0.1316,,,
1133,SPY,2002-07-24,9066.9717,62.2816,4265.8572,0.0252,-0.1316,0.1302,,,
1153,SPY,2002-08-21,3794.4672,70.3903,4282.4974,0.0228,0.1302,-0.1154,,,
1173,SPY,2002-09-19,4108.8392,62.2669,3857.2105,0.0150,-0.1154,0.0468,,,
1193,SPY,2002-10-17,6049.5048,65.1822,5363.2842,0.0270,0.0468,0.0279,,,


In [68]:
pd.options.mode.chained_assignment = None  # default='warn'
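# J lags of returns, average volumes, and volatilities (raw and standardized)
J = 12
lag_vars = ['R','AV','DRSD','RT','AVT','SDT']
for i in lag_vars:
    for j in range(1, J+1):
        cname = i+str(j).zfill(2)
        # Shift within each ticker so lags never cross from one ticker to another
        dtdf[cname] = dtdf.groupby('Ticker')[i].shift(j)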
display(dtdf)  

Unnamed: 0,Ticker,Date,V,P,AV,DRSD,R,YR,RT,AVT,SDT,R01,R02,R03,R04,R05,R06,R07,R08,R09,R10,R11,R12,AV01,AV02,AV03,AV04,AV05,AV06,AV07,AV08,AV09,AV10,AV11,AV12,DRSD01,DRSD02,DRSD03,DRSD04,DRSD05,DRSD06,DRSD07,DRSD08,DRSD09,DRSD10,DRSD11,DRSD12,RT01,RT02,RT03,RT04,RT05,RT06,RT07,RT08,RT09,RT10,RT11,RT12,AVT01,AVT02,AVT03,AVT04,AVT05,AVT06,AVT07,AVT08,AVT09,AVT10,AVT11,AVT12,SDT01,SDT02,SDT03,SDT04,SDT05,SDT06,SDT07,SDT08,SDT09,SDT10,SDT11,SDT12
1013,SPY,2002-01-31,2253.3233,82.6768,1885.3035,0.0113,-0.0203,0.0049,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
1033,SPY,2002-03-01,2988.3592,83.0859,2490.9525,0.0137,0.0049,0.0102,,,,-0.0203,,,,,,,,,,,,1885.3035,,,,,,,,,,,,0.0113,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
1053,SPY,2002-04-01,2029.1493,83.9319,2182.3469,0.0082,0.0102,-0.0673,,,,0.0049,-0.0203,,,,,,,,,,,2490.9525,1885.3035,,,,,,,,,,,0.0137,0.0113,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
1073,SPY,2002-04-29,1894.0294,78.2837,2045.6473,0.0107,-0.0673,0.0116,,,,0.0102,0.0049,-0.0203,,,,,,,,,,2182.3469,2490.9525,1885.3035,,,,,,,,,,0.0082,0.0137,0.0113,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
1093,SPY,2002-05-28,2620.0088,79.1921,2276.7021,0.0147,0.0116,-0.0943,,,,-0.0673,0.0102,0.0049,-0.0203,,,,,,,,,2045.6473,2182.3469,2490.9525,1885.3035,,,,,,,,,0.0107,0.0082,0.0137,0.0113,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
1113,SPY,2002-06-25,3254.1137,71.7209,2515.6469,0.0136,-0.0943,-0.1316,,,,0.0116,-0.0673,0.0102,0.0049,-0.0203,,,,,,,,2276.7021,2045.6473,2182.3469,2490.9525,1885.3035,,,,,,,,0.0147,0.0107,0.0082,0.0137,0.0113,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
1133,SPY,2002-07-24,9066.9717,62.2816,4265.8572,0.0252,-0.1316,0.1302,,,,-0.0943,0.0116,-0.0673,0.0102,0.0049,-0.0203,,,,,,,2515.6469,2276.7021,2045.6473,2182.3469,2490.9525,1885.3035,,,,,,,0.0136,0.0147,0.0107,0.0082,0.0137,0.0113,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
1153,SPY,2002-08-21,3794.4672,70.3903,4282.4974,0.0228,0.1302,-0.1154,,,,-0.1316,-0.0943,0.0116,-0.0673,0.0102,0.0049,-0.0203,,,,,,4265.8572,2515.6469,2276.7021,2045.6473,2182.3469,2490.9525,1885.3035,,,,,,0.0252,0.0136,0.0147,0.0107,0.0082,0.0137,0.0113,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
1173,SPY,2002-09-19,4108.8392,62.2669,3857.2105,0.0150,-0.1154,0.0468,,,,0.1302,-0.1316,-0.0943,0.0116,-0.0673,0.0102,0.0049,-0.0203,,,,,4282.4974,4265.8572,2515.6469,2276.7021,2045.6473,2182.3469,2490.9525,1885.3035,,,,,0.0228,0.0252,0.0136,0.0147,0.0107,0.0082,0.0137,0.0113,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
1193,SPY,2002-10-17,6049.5048,65.1822,5363.2842,0.0270,0.0468,0.0279,,,,-0.1154,0.1302,-0.1316,-0.0943,0.0116,-0.0673,0.0102,0.0049,-0.0203,,,,3857.2105,4282.4974,4265.8572,2515.6469,2276.7021,2045.6473,2182.3469,2490.9525,1885.3035,,,,0.0150,0.0228,0.0252,0.0136,0.0147,0.0107,0.0082,0.0137,0.0113,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
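The lagged-feature construction can be seen more easily on a small example. The sketch below uses a made-up ticker `XYZ`, four observations, and J = 2. It mirrors the naming scheme (`R01`, `R02`, ...) but is not the notebook's data.

```python
import pandas as pd

toy = pd.DataFrame({'Ticker': ['XYZ'] * 4,
                    'R': [0.01, 0.02, 0.03, 0.04]})
J = 2  # illustrative; the notebook uses J = 12

for j in range(1, J + 1):
    cname = 'R' + str(j).zfill(2)   # zero-padded names: R01, R02
    # Value of R observed j periods earlier, within the same ticker
    toy[cname] = toy.groupby('Ticker')['R'].shift(j)

print(toy)
print(toy.dropna().shape)
```

Each additional lag costs one more leading row once `dropna()` is applied, so the first J rows of every ticker are lost.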


In [69]:
dtdf = dtdf.dropna()
dtdf.shape
# Prior run: (72802, 59)

(178, 83)

In [70]:
print(list(dtdf))

['Ticker', 'Date', 'V', 'P', 'AV', 'DRSD', 'R', 'YR', 'RT', 'AVT', 'SDT', 'R01', 'R02', 'R03', 'R04', 'R05', 'R06', 'R07', 'R08', 'R09', 'R10', 'R11', 'R12', 'AV01', 'AV02', 'AV03', 'AV04', 'AV05', 'AV06', 'AV07', 'AV08', 'AV09', 'AV10', 'AV11', 'AV12', 'DRSD01', 'DRSD02', 'DRSD03', 'DRSD04', 'DRSD05', 'DRSD06', 'DRSD07', 'DRSD08', 'DRSD09', 'DRSD10', 'DRSD11', 'DRSD12', 'RT01', 'RT02', 'RT03', 'RT04', 'RT05', 'RT06', 'RT07', 'RT08', 'RT09', 'RT10', 'RT11', 'RT12', 'AVT01', 'AVT02', 'AVT03', 'AVT04', 'AVT05', 'AVT06', 'AVT07', 'AVT08', 'AVT09', 'AVT10', 'AVT11', 'AVT12', 'SDT01', 'SDT02', 'SDT03', 'SDT04', 'SDT05', 'SDT06', 'SDT07', 'SDT08', 'SDT09', 'SDT10', 'SDT11', 'SDT12']


In [71]:
# Components of the dataset
# This can be used to select features to include in the final dataset

id_cols = ['Ticker','Date']
other_cols= ['V','P','AV','R','DRSD','YR']
raw_ret = ['R01', 'R02', 'R03', 'R04', 'R05', 'R06', 'R07', 'R08', 'R09', 'R10', 'R11', 'R12']
raw_vol = ['AV01', 'AV02', 'AV03', 'AV04', 'AV05', 'AV06', 'AV07', 'AV08', 'AV09', 'AV10', 'AV11', 'AV12']
raw_drsd = ['DRSD01', 'DRSD02', 'DRSD03', 'DRSD04', 'DRSD05', 'DRSD06', 'DRSD07', 'DRSD08', 'DRSD09', 'DRSD10', 'DRSD11', 'DRSD12']
ts_ret = [col for col in dtdf if col.startswith('RT')]
ts_vol = [col for col in dtdf if col.startswith('AVT')]
ts_drsd = [col for col in dtdf if col.startswith('SDT')]
print(id_cols, other_cols, raw_ret, raw_vol, raw_drsd, ts_vol, ts_drsd)

['Ticker', 'Date'] ['V', 'P', 'AV', 'R', 'DRSD', 'YR'] ['R01', 'R02', 'R03', 'R04', 'R05', 'R06', 'R07', 'R08', 'R09', 'R10', 'R11', 'R12'] ['AV01', 'AV02', 'AV03', 'AV04', 'AV05', 'AV06', 'AV07', 'AV08', 'AV09', 'AV10', 'AV11', 'AV12'] ['DRSD01', 'DRSD02', 'DRSD03', 'DRSD04', 'DRSD05', 'DRSD06', 'DRSD07', 'DRSD08', 'DRSD09', 'DRSD10', 'DRSD11', 'DRSD12'] ['AVT', 'AVT01', 'AVT02', 'AVT03', 'AVT04', 'AVT05', 'AVT06', 'AVT07', 'AVT08', 'AVT09', 'AVT10', 'AVT11', 'AVT12'] ['SDT', 'SDT01', 'SDT02', 'SDT03', 'SDT04', 'SDT05', 'SDT06', 'SDT07', 'SDT08', 'SDT09', 'SDT10', 'SDT11', 'SDT12']


In [72]:
# This will become the wide dataset: all data items for SPY on each selected date
df2 = dtdf[id_cols + other_cols + raw_ret + raw_vol + raw_drsd + ts_ret + ts_vol + ts_drsd]

display(df2)

Unnamed: 0,Ticker,Date,V,P,AV,R,DRSD,YR,R01,R02,R03,R04,R05,R06,R07,R08,R09,R10,R11,R12,AV01,AV02,AV03,AV04,AV05,AV06,AV07,AV08,AV09,AV10,AV11,AV12,DRSD01,DRSD02,DRSD03,DRSD04,DRSD05,DRSD06,DRSD07,DRSD08,DRSD09,DRSD10,DRSD11,DRSD12,RT,RT01,RT02,RT03,RT04,RT05,RT06,RT07,RT08,RT09,RT10,RT11,RT12,AVT,AVT01,AVT02,AVT03,AVT04,AVT05,AVT06,AVT07,AVT08,AVT09,AVT10,AVT11,AVT12,SDT,SDT01,SDT02,SDT03,SDT04,SDT05,SDT06,SDT07,SDT08,SDT09,SDT10,SDT11,SDT12
1473,SPY,2003-11-26,3515.9115,79.8439,3608.0084,0.0113,0.0067,0.0362,0.0304,-0.0086,0.0720,-0.0468,0.0332,0.0482,0.0711,0.0477,0.0214,-0.1203,0.0498,-0.0153,3652.4315,4032.9492,3506.3425,4375.3509,3804.1198,4158.7794,4089.0734,4822.7440,3946.8377,4078.4834,2924.3869,3562.5233,0.0060,0.0098,0.0054,0.0095,0.0108,0.0095,0.0113,0.0150,0.0149,0.0116,0.0135,0.0141,-0.1121,0.2722,-0.4254,0.9396,-0.7796,0.2001,0.5143,0.8923,0.7076,0.5142,-1.1834,0.8006,0.0323,-0.6485,-0.5439,0.1599,-0.8949,0.4593,-0.4725,0.0522,0.1300,1.0893,0.3455,0.5856,-0.3782,0.2799,-1.1382,-1.5697,-0.6454,-1.4654,-0.9363,-0.8126,-1.1769,-0.9833,-0.3436,-0.2868,-0.7270,-0.4366,-0.3039
1493,SPY,2003-12-26,911.4315,82.7326,3320.4381,0.0362,0.0057,0.0454,0.0113,0.0304,-0.0086,0.0720,-0.0468,0.0332,0.0482,0.0711,0.0477,0.0214,-0.1203,0.0498,3608.0084,3652.4315,4032.9492,3506.3425,4375.3509,3804.1198,4158.7794,4089.0734,4822.7440,3946.8377,4078.4834,2924.3869,0.0067,0.0060,0.0098,0.0054,0.0095,0.0108,0.0095,0.0113,0.0150,0.0149,0.0116,0.0135,0.3664,-0.1121,0.2722,-0.4254,0.9396,-0.7796,0.2001,0.5143,0.8923,0.7076,0.5142,-1.1834,0.8006,-1.5347,-0.6485,-0.5439,0.1599,-0.8949,0.4593,-0.4725,0.0522,0.1300,1.0893,0.3455,0.5856,-0.3782,-1.2187,-1.1382,-1.5697,-0.6454,-1.4654,-0.9363,-0.8126,-1.1769,-0.9833,-0.3436,-0.2868,-0.7270,-0.4366
1513,SPY,2004-01-27,4050.8187,86.4884,3780.6978,0.0454,0.0066,0.0017,0.0362,0.0113,0.0304,-0.0086,0.0720,-0.0468,0.0332,0.0482,0.0711,0.0477,0.0214,-0.1203,3320.4381,3608.0084,3652.4315,4032.9492,3506.3425,4375.3509,3804.1198,4158.7794,4089.0734,4822.7440,3946.8377,4078.4834,0.0057,0.0067,0.0060,0.0098,0.0054,0.0095,0.0108,0.0095,0.0113,0.0150,0.0149,0.0116,0.4586,0.3664,-0.1121,0.2722,-0.4254,0.9396,-0.7796,0.2001,0.5143,0.8923,0.7076,0.5142,-1.1834,-0.3511,-1.5347,-0.6485,-0.5439,0.1599,-0.8949,0.4593,-0.4725,0.0522,0.1300,1.0893,0.3455,0.5856,-0.7999,-1.2187,-1.1382,-1.5697,-0.6454,-1.4654,-0.9363,-0.8126,-1.1769,-0.9833,-0.3436,-0.2868,-0.7270
1533,SPY,2004-02-25,3585.5063,86.6317,4301.9243,0.0017,0.0059,-0.0430,0.0454,0.0362,0.0113,0.0304,-0.0086,0.0720,-0.0468,0.0332,0.0482,0.0711,0.0477,0.0214,3780.6978,3320.4381,3608.0084,3652.4315,4032.9492,3506.3425,4375.3509,3804.1198,4158.7794,4089.0734,4822.7440,3946.8377,0.0066,0.0057,0.0067,0.0060,0.0098,0.0054,0.0095,0.0108,0.0095,0.0113,0.0150,0.0149,-0.7837,0.4586,0.3664,-0.1121,0.2722,-0.4254,0.9396,-0.7796,0.2001,0.5143,0.8923,0.7076,0.5142,0.8181,-0.3511,-1.5347,-0.6485,-0.5439,0.1599,-0.8949,0.4593,-0.4725,0.0522,0.1300,1.0893,0.3455,-0.8823,-0.7999,-1.2187,-1.1382,-1.5697,-0.6454,-1.4654,-0.9363,-0.8126,-1.1769,-0.9833,-0.3436,-0.2868
1553,SPY,2004-03-24,5651.0602,82.9091,5496.2474,-0.0430,0.0090,0.0429,0.0017,0.0454,0.0362,0.0113,0.0304,-0.0086,0.0720,-0.0468,0.0332,0.0482,0.0711,0.0477,4301.9243,3780.6978,3320.4381,3608.0084,3652.4315,4032.9492,3506.3425,4375.3509,3804.1198,4158.7794,4089.0734,4822.7440,0.0059,0.0066,0.0057,0.0067,0.0060,0.0098,0.0054,0.0095,0.0108,0.0095,0.0113,0.0150,-1.6283,-0.7837,0.4586,0.3664,-0.1121,0.2722,-0.4254,0.9396,-0.7796,0.2001,0.5143,0.8923,0.7076,2.6079,0.8181,-0.3511,-1.5347,-0.6485,-0.5439,0.1599,-0.8949,0.4593,-0.4725,0.0522,0.1300,1.0893,0.4355,-0.8823,-0.7999,-1.2187,-1.1382,-1.5697,-0.6454,-1.4654,-0.9363,-0.8126,-1.1769,-0.9833,-0.3436
1573,SPY,2004-04-22,7091.6689,86.4661,5161.9697,0.0429,0.0081,-0.0405,-0.0430,0.0017,0.0454,0.0362,0.0113,0.0304,-0.0086,0.0720,-0.0468,0.0332,0.0482,0.0711,5496.2474,4301.9243,3780.6978,3320.4381,3608.0084,3652.4315,4032.9492,3506.3425,4375.3509,3804.1198,4158.7794,4089.0734,0.0090,0.0059,0.0066,0.0057,0.0067,0.0060,0.0098,0.0054,0.0095,0.0108,0.0095,0.0113,0.6623,-1.6283,-0.7837,0.4586,0.3664,-0.1121,0.2722,-0.4254,0.9396,-0.7796,0.2001,0.5143,0.8923,1.6088,2.6079,0.8181,-0.3511,-1.5347,-0.6485,-0.5439,0.1599,-0.8949,0.4593,-0.4725,0.0522,0.1300,0.1903,0.4355,-0.8823,-0.7999,-1.2187,-1.1382,-1.5697,-0.6454,-1.4654,-0.9363,-0.8126,-1.1769,-0.9833
1593,SPY,2004-05-20,4174.6476,82.9621,5638.3260,-0.0405,0.0076,0.0364,0.0429,-0.0430,0.0017,0.0454,0.0362,0.0113,0.0304,-0.0086,0.0720,-0.0468,0.0332,0.0482,5161.9697,5496.2474,4301.9243,3780.6978,3320.4381,3608.0084,3652.4315,4032.9492,3506.3425,4375.3509,3804.1198,4158.7794,0.0081,0.0090,0.0059,0.0066,0.0057,0.0067,0.0060,0.0098,0.0054,0.0095,0.0108,0.0095,-1.3232,0.6623,-1.6283,-0.7837,0.4586,0.3664,-0.1121,0.2722,-0.4254,0.9396,-0.7796,0.2001,0.5143,1.7771,1.6088,2.6079,0.8181,-0.3511,-1.5347,-0.6485,-0.5439,0.1599,-0.8949,0.4593,-0.4725,0.0522,-0.0143,0.1903,0.4355,-0.8823,-0.7999,-1.2187,-1.1382,-1.5697,-0.6454,-1.4654,-0.9363,-0.8126,-1.1769
1613,SPY,2004-06-21,2862.1487,85.9842,3956.2637,0.0364,0.0068,-0.0138,-0.0405,0.0429,-0.0430,0.0017,0.0454,0.0362,0.0113,0.0304,-0.0086,0.0720,-0.0468,0.0332,5638.3260,5161.9697,5496.2474,4301.9243,3780.6978,3320.4381,3608.0084,3652.4315,4032.9492,3506.3425,4375.3509,3804.1198,0.0076,0.0081,0.0090,0.0059,0.0066,0.0057,0.0067,0.0060,0.0098,0.0054,0.0095,0.0108,0.6364,-1.3232,0.6623,-1.6283,-0.7837,0.4586,0.3664,-0.1121,0.2722,-0.4254,0.9396,-0.7796,0.2001,-0.3539,1.7771,1.6088,2.6079,0.8181,-0.3511,-1.5347,-0.6485,-0.5439,0.1599,-0.8949,0.4593,-0.4725,-0.2942,-0.0143,0.1903,0.4355,-0.8823,-0.7999,-1.2187,-1.1382,-1.5697,-0.6454,-1.4654,-0.9363,-0.8126
1633,SPY,2004-07-20,5211.3328,84.7992,4377.0448,-0.0138,0.0063,-0.0245,0.0364,-0.0405,0.0429,-0.0430,0.0017,0.0454,0.0362,0.0113,0.0304,-0.0086,0.0720,-0.0468,3956.2637,5638.3260,5161.9697,5496.2474,4301.9243,3780.6978,3320.4381,3608.0084,3652.4315,4032.9492,3506.3425,4375.3509,0.0068,0.0076,0.0081,0.0090,0.0059,0.0066,0.0057,0.0067,0.0060,0.0098,0.0054,0.0095,-0.7813,0.6364,-1.3232,0.6623,-1.6283,-0.7837,0.4586,0.3664,-0.1121,0.2722,-0.4254,0.9396,-0.7796,0.1784,-0.3539,1.7771,1.6088,2.6079,0.8181,-0.3511,-1.5347,-0.6485,-0.5439,0.1599,-0.8949,0.4593,-0.5235,-0.2942,-0.0143,0.1903,0.4355,-0.8823,-0.7999,-1.2187,-1.1382,-1.5697,-0.6454,-1.4654,-0.9363
1653,SPY,2004-08-17,4432.8114,82.7256,5496.5340,-0.0245,0.0088,0.0357,-0.0138,0.0364,-0.0405,0.0429,-0.0430,0.0017,0.0454,0.0362,0.0113,0.0304,-0.0086,0.0720,4377.0448,3956.2637,5638.3260,5161.9697,5496.2474,4301.9243,3780.6978,3320.4381,3608.0084,3652.4315,4032.9492,3506.3425,0.0063,0.0068,0.0076,0.0081,0.0090,0.0059,0.0066,0.0057,0.0067,0.0060,0.0098,0.0054,-0.9474,-0.7813,0.6364,-1.3232,0.6623,-1.6283,-0.7837,0.4586,0.3664,-0.1121,0.2722,-0.4254,0.9396,1.3173,0.1784,-0.3539,1.7771,1.6088,2.6079,0.8181,-0.3511,-1.5347,-0.6485,-0.5439,0.1599,-0.8949,1.1540,-0.5235,-0.2942,-0.0143,0.1903,0.4355,-0.8823,-0.7999,-1.2187,-1.1382,-1.5697,-0.6454,-1.4654


In [75]:
# Convert df2 to "wide format"
# Each ticker is attached to its respective columns
# This version only has SPY
df_wide = df2.pivot(index='Date', columns='Ticker')
df_wide.columns = [' '.join(col).strip() for col in df_wide.columns.values]
df_wide.reset_index(inplace=True)
print(df_wide.shape)
display(df_wide)

(178, 82)


Unnamed: 0,Date,V SPY,P SPY,AV SPY,R SPY,DRSD SPY,YR SPY,R01 SPY,R02 SPY,R03 SPY,R04 SPY,R05 SPY,R06 SPY,R07 SPY,R08 SPY,R09 SPY,R10 SPY,R11 SPY,R12 SPY,AV01 SPY,AV02 SPY,AV03 SPY,AV04 SPY,AV05 SPY,AV06 SPY,AV07 SPY,AV08 SPY,AV09 SPY,AV10 SPY,AV11 SPY,AV12 SPY,DRSD01 SPY,DRSD02 SPY,DRSD03 SPY,DRSD04 SPY,DRSD05 SPY,DRSD06 SPY,DRSD07 SPY,DRSD08 SPY,DRSD09 SPY,DRSD10 SPY,DRSD11 SPY,DRSD12 SPY,RT SPY,RT01 SPY,RT02 SPY,RT03 SPY,RT04 SPY,RT05 SPY,RT06 SPY,RT07 SPY,RT08 SPY,RT09 SPY,RT10 SPY,RT11 SPY,RT12 SPY,AVT SPY,AVT01 SPY,AVT02 SPY,AVT03 SPY,AVT04 SPY,AVT05 SPY,AVT06 SPY,AVT07 SPY,AVT08 SPY,AVT09 SPY,AVT10 SPY,AVT11 SPY,AVT12 SPY,SDT SPY,SDT01 SPY,SDT02 SPY,SDT03 SPY,SDT04 SPY,SDT05 SPY,SDT06 SPY,SDT07 SPY,SDT08 SPY,SDT09 SPY,SDT10 SPY,SDT11 SPY,SDT12 SPY
0,2003-11-26,3515.9115,79.8439,3608.0084,0.0113,0.0067,0.0362,0.0304,-0.0086,0.0720,-0.0468,0.0332,0.0482,0.0711,0.0477,0.0214,-0.1203,0.0498,-0.0153,3652.4315,4032.9492,3506.3425,4375.3509,3804.1198,4158.7794,4089.0734,4822.7440,3946.8377,4078.4834,2924.3869,3562.5233,0.0060,0.0098,0.0054,0.0095,0.0108,0.0095,0.0113,0.0150,0.0149,0.0116,0.0135,0.0141,-0.1121,0.2722,-0.4254,0.9396,-0.7796,0.2001,0.5143,0.8923,0.7076,0.5142,-1.1834,0.8006,0.0323,-0.6485,-0.5439,0.1599,-0.8949,0.4593,-0.4725,0.0522,0.1300,1.0893,0.3455,0.5856,-0.3782,0.2799,-1.1382,-1.5697,-0.6454,-1.4654,-0.9363,-0.8126,-1.1769,-0.9833,-0.3436,-0.2868,-0.7270,-0.4366,-0.3039
1,2003-12-26,911.4315,82.7326,3320.4381,0.0362,0.0057,0.0454,0.0113,0.0304,-0.0086,0.0720,-0.0468,0.0332,0.0482,0.0711,0.0477,0.0214,-0.1203,0.0498,3608.0084,3652.4315,4032.9492,3506.3425,4375.3509,3804.1198,4158.7794,4089.0734,4822.7440,3946.8377,4078.4834,2924.3869,0.0067,0.0060,0.0098,0.0054,0.0095,0.0108,0.0095,0.0113,0.0150,0.0149,0.0116,0.0135,0.3664,-0.1121,0.2722,-0.4254,0.9396,-0.7796,0.2001,0.5143,0.8923,0.7076,0.5142,-1.1834,0.8006,-1.5347,-0.6485,-0.5439,0.1599,-0.8949,0.4593,-0.4725,0.0522,0.1300,1.0893,0.3455,0.5856,-0.3782,-1.2187,-1.1382,-1.5697,-0.6454,-1.4654,-0.9363,-0.8126,-1.1769,-0.9833,-0.3436,-0.2868,-0.7270,-0.4366
2,2004-01-27,4050.8187,86.4884,3780.6978,0.0454,0.0066,0.0017,0.0362,0.0113,0.0304,-0.0086,0.0720,-0.0468,0.0332,0.0482,0.0711,0.0477,0.0214,-0.1203,3320.4381,3608.0084,3652.4315,4032.9492,3506.3425,4375.3509,3804.1198,4158.7794,4089.0734,4822.7440,3946.8377,4078.4834,0.0057,0.0067,0.0060,0.0098,0.0054,0.0095,0.0108,0.0095,0.0113,0.0150,0.0149,0.0116,0.4586,0.3664,-0.1121,0.2722,-0.4254,0.9396,-0.7796,0.2001,0.5143,0.8923,0.7076,0.5142,-1.1834,-0.3511,-1.5347,-0.6485,-0.5439,0.1599,-0.8949,0.4593,-0.4725,0.0522,0.1300,1.0893,0.3455,0.5856,-0.7999,-1.2187,-1.1382,-1.5697,-0.6454,-1.4654,-0.9363,-0.8126,-1.1769,-0.9833,-0.3436,-0.2868,-0.7270
3,2004-02-25,3585.5063,86.6317,4301.9243,0.0017,0.0059,-0.0430,0.0454,0.0362,0.0113,0.0304,-0.0086,0.0720,-0.0468,0.0332,0.0482,0.0711,0.0477,0.0214,3780.6978,3320.4381,3608.0084,3652.4315,4032.9492,3506.3425,4375.3509,3804.1198,4158.7794,4089.0734,4822.7440,3946.8377,0.0066,0.0057,0.0067,0.0060,0.0098,0.0054,0.0095,0.0108,0.0095,0.0113,0.0150,0.0149,-0.7837,0.4586,0.3664,-0.1121,0.2722,-0.4254,0.9396,-0.7796,0.2001,0.5143,0.8923,0.7076,0.5142,0.8181,-0.3511,-1.5347,-0.6485,-0.5439,0.1599,-0.8949,0.4593,-0.4725,0.0522,0.1300,1.0893,0.3455,-0.8823,-0.7999,-1.2187,-1.1382,-1.5697,-0.6454,-1.4654,-0.9363,-0.8126,-1.1769,-0.9833,-0.3436,-0.2868
4,2004-03-24,5651.0602,82.9091,5496.2474,-0.0430,0.0090,0.0429,0.0017,0.0454,0.0362,0.0113,0.0304,-0.0086,0.0720,-0.0468,0.0332,0.0482,0.0711,0.0477,4301.9243,3780.6978,3320.4381,3608.0084,3652.4315,4032.9492,3506.3425,4375.3509,3804.1198,4158.7794,4089.0734,4822.7440,0.0059,0.0066,0.0057,0.0067,0.0060,0.0098,0.0054,0.0095,0.0108,0.0095,0.0113,0.0150,-1.6283,-0.7837,0.4586,0.3664,-0.1121,0.2722,-0.4254,0.9396,-0.7796,0.2001,0.5143,0.8923,0.7076,2.6079,0.8181,-0.3511,-1.5347,-0.6485,-0.5439,0.1599,-0.8949,0.4593,-0.4725,0.0522,0.1300,1.0893,0.4355,-0.8823,-0.7999,-1.2187,-1.1382,-1.5697,-0.6454,-1.4654,-0.9363,-0.8126,-1.1769,-0.9833,-0.3436
5,2004-04-22,7091.6689,86.4661,5161.9697,0.0429,0.0081,-0.0405,-0.0430,0.0017,0.0454,0.0362,0.0113,0.0304,-0.0086,0.0720,-0.0468,0.0332,0.0482,0.0711,5496.2474,4301.9243,3780.6978,3320.4381,3608.0084,3652.4315,4032.9492,3506.3425,4375.3509,3804.1198,4158.7794,4089.0734,0.0090,0.0059,0.0066,0.0057,0.0067,0.0060,0.0098,0.0054,0.0095,0.0108,0.0095,0.0113,0.6623,-1.6283,-0.7837,0.4586,0.3664,-0.1121,0.2722,-0.4254,0.9396,-0.7796,0.2001,0.5143,0.8923,1.6088,2.6079,0.8181,-0.3511,-1.5347,-0.6485,-0.5439,0.1599,-0.8949,0.4593,-0.4725,0.0522,0.1300,0.1903,0.4355,-0.8823,-0.7999,-1.2187,-1.1382,-1.5697,-0.6454,-1.4654,-0.9363,-0.8126,-1.1769,-0.9833
6,2004-05-20,4174.6476,82.9621,5638.3260,-0.0405,0.0076,0.0364,0.0429,-0.0430,0.0017,0.0454,0.0362,0.0113,0.0304,-0.0086,0.0720,-0.0468,0.0332,0.0482,5161.9697,5496.2474,4301.9243,3780.6978,3320.4381,3608.0084,3652.4315,4032.9492,3506.3425,4375.3509,3804.1198,4158.7794,0.0081,0.0090,0.0059,0.0066,0.0057,0.0067,0.0060,0.0098,0.0054,0.0095,0.0108,0.0095,-1.3232,0.6623,-1.6283,-0.7837,0.4586,0.3664,-0.1121,0.2722,-0.4254,0.9396,-0.7796,0.2001,0.5143,1.7771,1.6088,2.6079,0.8181,-0.3511,-1.5347,-0.6485,-0.5439,0.1599,-0.8949,0.4593,-0.4725,0.0522,-0.0143,0.1903,0.4355,-0.8823,-0.7999,-1.2187,-1.1382,-1.5697,-0.6454,-1.4654,-0.9363,-0.8126,-1.1769
7,2004-06-21,2862.1487,85.9842,3956.2637,0.0364,0.0068,-0.0138,-0.0405,0.0429,-0.0430,0.0017,0.0454,0.0362,0.0113,0.0304,-0.0086,0.0720,-0.0468,0.0332,5638.3260,5161.9697,5496.2474,4301.9243,3780.6978,3320.4381,3608.0084,3652.4315,4032.9492,3506.3425,4375.3509,3804.1198,0.0076,0.0081,0.0090,0.0059,0.0066,0.0057,0.0067,0.0060,0.0098,0.0054,0.0095,0.0108,0.6364,-1.3232,0.6623,-1.6283,-0.7837,0.4586,0.3664,-0.1121,0.2722,-0.4254,0.9396,-0.7796,0.2001,-0.3539,1.7771,1.6088,2.6079,0.8181,-0.3511,-1.5347,-0.6485,-0.5439,0.1599,-0.8949,0.4593,-0.4725,-0.2942,-0.0143,0.1903,0.4355,-0.8823,-0.7999,-1.2187,-1.1382,-1.5697,-0.6454,-1.4654,-0.9363,-0.8126
8,2004-07-20,5211.3328,84.7992,4377.0448,-0.0138,0.0063,-0.0245,0.0364,-0.0405,0.0429,-0.0430,0.0017,0.0454,0.0362,0.0113,0.0304,-0.0086,0.0720,-0.0468,3956.2637,5638.3260,5161.9697,5496.2474,4301.9243,3780.6978,3320.4381,3608.0084,3652.4315,4032.9492,3506.3425,4375.3509,0.0068,0.0076,0.0081,0.0090,0.0059,0.0066,0.0057,0.0067,0.0060,0.0098,0.0054,0.0095,-0.7813,0.6364,-1.3232,0.6623,-1.6283,-0.7837,0.4586,0.3664,-0.1121,0.2722,-0.4254,0.9396,-0.7796,0.1784,-0.3539,1.7771,1.6088,2.6079,0.8181,-0.3511,-1.5347,-0.6485,-0.5439,0.1599,-0.8949,0.4593,-0.5235,-0.2942,-0.0143,0.1903,0.4355,-0.8823,-0.7999,-1.2187,-1.1382,-1.5697,-0.6454,-1.4654,-0.9363
9,2004-08-17,4432.8114,82.7256,5496.5340,-0.0245,0.0088,0.0357,-0.0138,0.0364,-0.0405,0.0429,-0.0430,0.0017,0.0454,0.0362,0.0113,0.0304,-0.0086,0.0720,4377.0448,3956.2637,5638.3260,5161.9697,5496.2474,4301.9243,3780.6978,3320.4381,3608.0084,3652.4315,4032.9492,3506.3425,0.0063,0.0068,0.0076,0.0081,0.0090,0.0059,0.0066,0.0057,0.0067,0.0060,0.0098,0.0054,-0.9474,-0.7813,0.6364,-1.3232,0.6623,-1.6283,-0.7837,0.4586,0.3664,-0.1121,0.2722,-0.4254,0.9396,1.3173,0.1784,-0.3539,1.7771,1.6088,2.6079,0.8181,-0.3511,-1.5347,-0.6485,-0.5439,0.1599,-0.8949,1.1540,-0.5235,-0.2942,-0.0143,0.1903,0.4355,-0.8823,-0.7999,-1.2187,-1.1382,-1.5697,-0.6454,-1.4654
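With a single ticker the pivot only renames columns, so its effect is easier to see with more than one index. The sketch below assumes two hypothetical tickers, `AAA` and `BBB`, with made-up values, and applies the same pivot and column-flattening steps.

```python
import pandas as pd

# Long format: one row per (ticker, date)
long_df = pd.DataFrame({
    'Ticker': ['AAA', 'BBB', 'AAA', 'BBB'],
    'Date': ['2004-01-27', '2004-01-27', '2004-02-25', '2004-02-25'],
    'R': [0.01, 0.02, 0.03, 0.04],
    'AV': [100.0, 200.0, 110.0, 210.0],
})

# Wide format: one row per date, one column per (variable, ticker) pair
wide = long_df.pivot(index='Date', columns='Ticker')

# Flatten the two-level column index into names such as 'R AAA'
wide.columns = [' '.join(col).strip() for col in wide.columns.values]
wide.reset_index(inplace=True)
print(wide.columns.tolist())
print(wide.shape)
```

Each date now occupies a single row, so the wide table can be joined to the stock data on `Date` alone.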


In [76]:
# This is the final index dataset. It has all SPY data for each date
df_wide.to_csv('Indices_Wide_V4.CSV', index = False, header = True, float_format='%.4f')
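Because the index dataset has exactly one row per date, it can be attached to the stock dataset with a many-to-one merge on `Date`. The merge itself happens outside this notebook, so the sketch below is only an assumption about how it might look: the `stocks` frame, its tickers, and its values are hypothetical, while the two `R SPY` values are taken from the wide table above.

```python
import pandas as pd

# Hypothetical stock data in long format (one row per ticker and date)
stocks = pd.DataFrame({
    'Ticker': ['AAA', 'AAA', 'BBB'],
    'Date': ['2004-01-27', '2004-02-25', '2004-01-27'],
    'R': [0.01, 0.03, 0.02],
})

# Tiny stand-in for the wide index dataset (one row per date)
indices = pd.DataFrame({
    'Date': ['2004-01-27', '2004-02-25'],
    'R SPY': [0.0454, 0.0017],
})

# validate raises an error if a date appears more than once in the index data
merged = stocks.merge(indices, on='Date', how='left', validate='many_to_one')
print(merged)
```

A left merge keeps every stock observation. Any missing value in the SPY columns afterwards would point to a date mismatch between the two datasets.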