### Overview
This notebook walks through pulling Treasury yield curve data from FRED (Federal Reserve Economic Data), the data API of the Federal Reserve Bank of St. Louis.
<br><br>
You need to register for an <a target="_blank" href="https://research.stlouisfed.org/docs/api/api_key.html">API key</a>.<br>
For simplicity, this notebook uses an external package (<a target="_blank" href="https://github.com/mortada/fredapi">fredapi</a>),
which must be installed separately. It is a simple urllib wrapper around the Fed's root API <a target="_blank" href="https://api.stlouisfed.org/fred">endpoint</a>.
<br>
You could also <a target="_blank" href="http://docs.python-requests.org/en/master/">query</a> the API directly.
<br><br>
Any other data feed, including <a target="_blank" href="https://www.bloomberg.com/professional/solution/bloomberg-terminal/">Bloomberg</a>, <a target="_blank" href="https://financial.thomsonreuters.com/en/products/infrastructure/trading-infrastructure/fintech-digital-solutions/financial-data-api-delivery.html">Reuters</a>, or <a target="_blank" href="https://www.quandl.com/">Quandl</a>, should work just fine.
<br>
The goal is simply to compile historical observations for various points on the Treasury curve.
<br>
The process below caches the pulled results on disk in the repo's data/ directory.
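
As a rough illustration of querying the API directly (the alternative mentioned above), the sketch below builds a request URL for the FRED `series/observations` endpoint and parses a JSON response body. The helper names `build_observations_url` and `parse_observations` are hypothetical, and the sample payload contains made-up values so that the example runs without a network connection or an API key.

```python
import json
from urllib.parse import urlencode

# Observations endpoint of the FRED API; the helper names below are hypothetical.
FRED_OBSERVATIONS_URL = "https://api.stlouisfed.org/fred/series/observations"


def build_observations_url(series_id, api_key, start_date=None):
    """Return the request URL for one series, asking for a JSON response."""
    params = {"series_id": series_id, "api_key": api_key, "file_type": "json"}
    if start_date is not None:
        params["observation_start"] = start_date  # YYYY-MM-DD
    return FRED_OBSERVATIONS_URL + "?" + urlencode(params)


def parse_observations(payload):
    """Convert a JSON response body into a list of (date, value) tuples.

    FRED marks missing observations with "."; those become None here.
    """
    rows = []
    for obs in json.loads(payload)["observations"]:
        value = None if obs["value"] == "." else float(obs["value"])
        rows.append((obs["date"], value))
    return rows


# Illustrative payload only (made-up values), so the sketch runs offline.
sample_payload = (
    '{"observations": ['
    '{"date": "2006-02-09", "value": "4.55"}, '
    '{"date": "2006-02-10", "value": "."}]}'
)
parsed = parse_observations(sample_payload)
url = build_observations_url("DGS10", "YOUR_KEY", start_date="2006-02-09")
```

To fetch real data, the URL would be passed to an HTTP client such as `requests.get(url).text` and the body handed to `parse_observations`.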

In [1]:
# load packages
print('package versioning')
print('------------')
# standard library
import datetime

#external
import pandas as pd
print('pandas: ', pd.__version__)
import numpy as np
print('numpy: ', np.__version__)
from fredapi import Fred

package versioning
------------
pandas:  0.20.3
numpy:  1.13.3


### pull data from Federal Reserve API (FRED)

In [2]:
# supply authorization creds
key = ''  #  <----- set API key here
fred = Fred(api_key=key)
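
To avoid committing a key to the repo, one option is to read it from an environment variable instead of hard-coding it in the cell above. This is a minimal sketch: the helper `load_fred_key` is hypothetical, and the variable name `FRED_API_KEY` is an assumption made for this example.

```python
import os


def load_fred_key(env_var="FRED_API_KEY"):
    """Return the API key stored in an environment variable, or raise if unset."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            "Set the {0} environment variable to your FRED API key".format(env_var)
        )
    return key


# Usage (commented out so the sketch runs without fredapi installed):
# fred = Fred(api_key=load_fred_key())
```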

In [3]:
# specify target variables | ticker/titles as provided by the Federal Reserve Bank of St. Louis (FRED)
pulldict = {
'FEDFUNDS':'Effective Federal Funds Rate',
'DGS1MO':'1-Month Treasury Constant Maturity Rate',
'DGS3MO':'3-Month Treasury Constant Maturity Rate',
'DGS6MO':'6-Month Treasury Constant Maturity Rate',    
'DGS1':'1-Year Treasury Constant Maturity Rate',
'DGS2':'2-Year Treasury Constant Maturity Rate',
'DGS3':'3-Year Treasury Constant Maturity Rate',
'DGS5':'5-Year Treasury Constant Maturity Rate',
'DGS7':'7-Year Treasury Constant Maturity Rate',
'DGS10':'10-Year Treasury Constant Maturity Rate',
'DGS20':'20-Year Treasury Constant Maturity Rate',
'DGS30':'30-Year Treasury Constant Maturity Rate'  
}

In [4]:
# iterate through dictionary, request each series and outer-join it into one pd dataframe
import time
df = pd.DataFrame()
for pull_ticker, description in pulldict.items():
    print(pull_ticker, description, '| ',end='\n')
    # get_series_latest_release returns a pd.Series indexed by observation date
    tmp_df = fred.get_series_latest_release(pull_ticker).to_frame(name=description)
    df = df.join(tmp_df, how='outer')
    time.sleep(0.5)  # intentionally time-space requests

FEDFUNDS Effective Federal Funds Rate | 
DGS1MO 1-Month Treasury Constant Maturity Rate | 
DGS3MO 3-Month Treasury Constant Maturity Rate | 
DGS6MO 6-Month Treasury Constant Maturity Rate | 
DGS1 1-Year Treasury Constant Maturity Rate | 
DGS2 2-Year Treasury Constant Maturity Rate | 
DGS3 3-Year Treasury Constant Maturity Rate | 
DGS5 5-Year Treasury Constant Maturity Rate | 
DGS7 7-Year Treasury Constant Maturity Rate | 
DGS10 10-Year Treasury Constant Maturity Rate | 
DGS20 20-Year Treasury Constant Maturity Rate | 
DGS30 30-Year Treasury Constant Maturity Rate | 
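
The outer join in the loop above keeps every observation date from every series, so a monthly series such as FEDFUNDS ends up with NaN on most daily rows. The sketch below shows the same alignment on two toy series with made-up values. It uses `pd.concat` in place of the repeated `join`, which is a substitution made for brevity rather than the notebook's method.

```python
import pandas as pd

# Toy stand-ins for two pulled series (made-up values): one monthly, one daily.
monthly = pd.Series([4.49], index=pd.to_datetime(["2006-02-01"]), name="FEDFUNDS")
daily = pd.Series(
    [4.55, 4.59],
    index=pd.to_datetime(["2006-02-09", "2006-02-10"]),
    name="DGS10",
)

frames = [s.to_frame() for s in (monthly, daily)]
# Outer-join on the date index so no observation date is dropped.
combined = pd.concat(frames, axis=1, join="outer").sort_index()

# Each column is NaN on the dates that only the other series reported.
missing_per_column = combined.isnull().sum()
```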


In [5]:
# rename df columns from series titles to tickers & order by maturity
new_col_list = ['FEDFUNDS','DGS1MO','DGS3MO','DGS6MO','DGS1','DGS2','DGS3','DGS5','DGS7','DGS10','DGS20','DGS30']
# pulldict maps ticker -> title, so look up each title and map it back to its ticker
df = df.rename(columns={pulldict[ticker]: ticker for ticker in new_col_list})
df = df[new_col_list]
# rename index column
df.index.name = 'date'

In [6]:
# all columns have daily frequency except FEDFUNDS, which is monthly | ffill (then bfill any leading gap) so the existing/latest month is graphed
df['FEDFUNDS'] = df['FEDFUNDS'].ffill().bfill()

In [7]:
# # optional:  [basic plot of data pulled]
# import matplotlib as mpl
# import matplotlib.pyplot as plt
# %matplotlib inline

# for col_x in df.columns.values:
#     df[col_x].dropna().plot(color='black',alpha=0.6)

In [8]:
# set earliest period to 2006
sdate = '2006-02-09'
sdate = np.datetime64(sdate)
edate = np.datetime64(str(datetime.datetime.now().date()))
df = df.truncate(before=sdate, after=edate).copy()

In [9]:
# fill remaining NaNs w/ ffill only, carrying a value forward at most 3 consecutive rows
df = df.ffill(limit=3)
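
The `limit=3` argument means short gaps (weekends, holidays) are filled while longer outages stay visible as NaN. A small sketch on a made-up series shows the effect:

```python
import numpy as np
import pandas as pd

# Made-up daily yields with a one-day gap and a five-day gap.
s = pd.Series([4.50, np.nan, 4.52, np.nan, np.nan, np.nan, np.nan, np.nan, 4.60])

# Carry the last valid value forward, but at most 3 consecutive rows.
filled = s.ffill(limit=3)

# The one-day gap is fully filled; the last two rows of the five-day gap are not.
remaining_gaps = int(filled.isnull().sum())
```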

In [10]:
# resample to business-daily, weekly, monthly and quarterly observations | median
df_daily_biz = df.resample('B').median().copy()
df_weekly = df.resample('W').median().copy()
df_monthly = df.resample('M').median().copy()
df_quarterly = df.resample('Q').median().copy()
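
To make the resampling step concrete, the sketch below applies the same `resample(...).median()` pattern to ten business days of made-up yields. Only the weekly rule is shown. The second week contains one outlier, which shifts the mean but not the median.

```python
import pandas as pd

# Ten business days of made-up yields, 2006-02-06 (Mon) through 2006-02-17 (Fri).
idx = pd.bdate_range("2006-02-06", periods=10)
toy = pd.DataFrame(
    {"DGS10": [4.50, 4.52, 4.54, 4.56, 4.58, 4.60, 4.70, 4.62, 4.64, 4.66]},
    index=idx,
)

# One row per calendar week, holding the median of that week's observations.
weekly = toy.resample("W").median()

# For comparison: the mean of the second week is pulled up by the 4.70 outlier.
weekly_mean = toy.resample("W").mean()
```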

In [11]:
print(len(df_daily_biz))
print(len(df_weekly))
print(len(df_monthly))
print(len(df_quarterly))

3048
611
141
48


In [12]:
# ensure no missing values exist in the monthly frame
check_null = df_monthly.isnull().sum(axis=0).sort_values(ascending=False)/float(len(df_monthly))
print(np.sum(check_null) == 0)

True


### cache df objects

In [13]:
#### cache data | keep index
df_daily_biz.to_csv('data/historical_data_daily.csv')
df_weekly.to_csv('data/historical_data_weekly.csv')
df_monthly.to_csv('data/historical_data_monthly.csv')
df_quarterly.to_csv('data/historical_data_quarterly.csv')
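
When the cached files are read back later, the date index has to be restored explicitly, and the target directory must exist before `to_csv` is called. The following is a minimal round-trip sketch on a toy frame with made-up values. It writes to a temporary directory so that it does not touch the repo's own data/ folder.

```python
import os
import tempfile

import pandas as pd

# Toy monthly frame (made-up values) with the same 'date' index name as above.
idx = pd.DatetimeIndex(["2006-02-28", "2006-03-31"], name="date")
toy = pd.DataFrame({"DGS10": [4.57, 4.72]}, index=idx)

cache_dir = os.path.join(tempfile.mkdtemp(), "data")
os.makedirs(cache_dir, exist_ok=True)  # to_csv does not create missing directories
path = os.path.join(cache_dir, "historical_data_monthly.csv")
toy.to_csv(path)

# Restore the DatetimeIndex that the CSV stores as plain text.
restored = pd.read_csv(path, index_col="date", parse_dates=True)
```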