# 1.4 - Purging the Series of the Fed Information Effect

This script purges the series created in 1.2 of the Fed Information Effect using the Greenbook forecast data gathered in 1.3. Following Miranda-Agrippino (2016), let $s_t$ be the raw shock for meeting $t$ calculated in 1.2 and $\Gamma_t$ a vector of Greenbook forecast data corresponding to meeting $t$. I take $u_t$ to be the *true* shock: the part of $s_t$ that cannot be explained by the beliefs the Fed currently holds about real GDP growth, inflation and unemployment for the current and next 4 quarters. It is obtained as the residual of an OLS regression as follows...

$$s_t = \alpha + \boldsymbol{\beta}\cdot\boldsymbol{\Gamma}_t + u_t$$
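
As a minimal sketch of what this regression does, the pure-Python snippet below fits the equation with a single regressor on made-up numbers and returns the residuals, which play the role of $u_t$. The function name `ols_residuals` and the data are illustrative assumptions and not part of this script; the actual estimation further down uses StatsModels with many regressors.

```python
def ols_residuals(s, g):
    """Residuals of the simple OLS regression s = alpha + beta * g + u."""
    n = len(s)
    s_bar = sum(s) / n
    g_bar = sum(g) / n
    # Closed-form slope and intercept for a single regressor.
    beta = (sum((gi - g_bar) * (si - s_bar) for gi, si in zip(g, s))
            / sum((gi - g_bar) ** 2 for gi in g))
    alpha = s_bar - beta * g_bar
    return [si - alpha - beta * gi for si, gi in zip(s, g)]

# Made-up values for illustration only.
raw_shocks = [0.1, 0.3, 0.2, 0.5, 0.4]   # stands in for s_t
forecasts = [1.0, 2.0, 3.0, 4.0, 5.0]    # stands in for a one-element Gamma_t

residuals = ols_residuals(raw_shocks, forecasts)  # stands in for u_t
```

Because the fit includes an intercept, the residuals sum to zero and are uncorrelated with the regressor in sample, which is the sense in which the series is "purged" of the forecast information.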

Potential specifications for $\Gamma$ are constructed from...
- all combinations of real GDP growth, inflation and unemployment.
- forecasts, forecast revisions, or both.
- only the current quarter $q_0$, or all data through to $q_1$, $q_2$, $q_3$ or $q_4$.

...providing $7\cdot3\cdot5=105$ possible specifications.
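
The count can be checked with a short enumeration. The sketch below is an alternative to the index arithmetic used later in this script, not a copy of it: it builds the column names directly with `itertools`, assuming the `rgdp`/`infl`/`unmp` naming scheme with `_rev` marking revisions. The names `build_specs`, `variable_sets`, `series_types` and `horizons` are hypothetical.

```python
from itertools import combinations, product

variables = ['rgdp', 'infl', 'unmp']

# All 7 non-empty combinations of the three variables.
variable_sets = [list(c) for r in range(1, 4) for c in combinations(variables, r)]

# Forecasts only, revisions only, or both (3 options).
series_types = [[''], ['_rev'], ['', '_rev']]

# Last quarter included: q0 only, or through to q1, q2, q3 or q4 (5 options).
horizons = range(5)

def build_specs():
    """Return one list of column names per candidate specification."""
    specs = []
    for var_set, types, last_q in product(variable_sets, series_types, horizons):
        specs.append([f'{v}{t}_q{q}'
                      for v in var_set for t in types for q in range(last_q + 1)])
    return specs

specs = build_specs()
print(len(specs))
```

The ordering of names inside each specification differs from the one produced later in this script, but the set of regressors in each specification is the same, and OLS fit statistics do not depend on column order.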

Specifications taken forth are...

- Forecasts and forecast revisions for real GDP growth and unemployment through to $q_2$ (this minimises the Akaike Information Criterion, and also aligns with the specification used by Miranda-Agrippino (2016)).
- Forecasts only for real GDP growth and inflation through to $q_4$ (this has a strong AIC and adjusted $R^2$ relative to other parsimonious specifications, and aligns better with my paper's contention that the Fed optimises over longer horizons than the present literature accounts for).
- The full specification - forecasts and forecast revisions for real GDP growth, inflation and unemployment through to $q_4$.

### Preamble

This script makes use of...

- Pandas
- StatsModels

In [109]:
import pandas as pd
import statsmodels.api as sm

### Import the Raw Shock Dataframe

This block imports the raw shock dataframe created in 1.2 and fixes its date index so that it is comparable with that of the Greenbook forecast dataframe.

In [110]:
shock_df = pd.read_csv('shock.csv')

shock_df = shock_df.rename(columns = {'Unnamed: 0':'Date'})

shock_df['Date'] = pd.to_datetime(shock_df['Date']) # .csv format saves dates as strings; this gets them
                                                    # back into Pandas timestamp format
shock_df = shock_df.set_index('Date')

### Import the Greenbook Dataframe

This block imports the Greenbook forecast dataframe created in 1.3, fixes the date indices, and corrects for rounding errors from storage in Unix timestamp format.

In [111]:
greenbook_df = pd.read_csv('greenbook.csv')

greenbook_df = greenbook_df.rename(columns = {'Unnamed: 0': ''})

greenbook_df = greenbook_df.set_index('')

greenbook_df = greenbook_df.T

greenbook_df.index = [pd.Timestamp(int(ts) - (int(ts) % 86400), unit = 's') for ts in greenbook_df.index]

greenbook_df.index.name = 'Date'
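
The rounding correction above relies on the fact that a day is 86400 seconds long, so subtracting `ts % 86400` strips the time-of-day part of a Unix timestamp. A minimal standard-library sketch of the same arithmetic follows; the function name `floor_to_midnight_utc` and the 37-second drift are made-up illustrations, not values from the data.

```python
from datetime import datetime, timezone

SECONDS_PER_DAY = 86400

def floor_to_midnight_utc(ts):
    """Drop the time-of-day part of a Unix timestamp given in seconds."""
    ts = int(ts)  # headers read from a .csv arrive as strings
    return datetime.fromtimestamp(ts - ts % SECONDS_PER_DAY, tz=timezone.utc)

# 879292800 is midnight UTC on 12 November 1997; the extra 37 seconds are a
# hypothetical stand-in for the kind of drift being corrected.
meeting_date = floor_to_midnight_utc('879292837')
print(meeting_date.date())
```

Note that flooring only repairs drift that overshoots midnight. A stamp that falls a few seconds short of midnight would land on the previous day, so rounding to the nearest day would be the safer choice if drift in both directions were possible.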

### Get Matching Dataframe Indices

This block drops Greenbook forecast data for meeting dates for which I have no shock (currently just 12th November, 1997), then checks that the two date indices match exactly.

In [112]:
## Drop dates for which there are no shocks

greenbook_df = greenbook_df[greenbook_df.index.isin(shock_df.index)]

## Check for any dates in the shock index that are not in the Greenbook index (misalignment likely reflects an error in 1.1).

if list(greenbook_df.index) == list(shock_df.index):
    
    print('Indices align.')
    
else:
    
    print('Indices do not align.')

Indices align.
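
If the check ever reports a misalignment, it helps to see which dates are responsible. The helper below is a hedged sketch rather than part of this script: `index_mismatches` is a hypothetical name, and the dates are made up for illustration.

```python
from datetime import date

def index_mismatches(left, right):
    """Return (labels only in left, labels only in right), each sorted."""
    left_set, right_set = set(left), set(right)
    return sorted(left_set - right_set), sorted(right_set - left_set)

# Made-up meeting dates for illustration only.
shock_dates = [date(1997, 9, 30), date(1997, 12, 16)]
greenbook_dates = [date(1997, 9, 30), date(1997, 11, 12), date(1997, 12, 16)]

only_in_shocks, only_in_greenbook = index_mismatches(shock_dates, greenbook_dates)
print(only_in_shocks, only_in_greenbook)
```

Since the function only needs iterables of hashable, sortable labels, it should also accept the two dataframe indices directly, e.g. `index_mismatches(shock_df.index, greenbook_df.index)`.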


### Initialising the Regressor and Regressand Dataframes

This block gets the raw shock into a single-column regressand dataframe, and adds a constant (i.e. a column of only 1s) to the Greenbook forecast dataframe. The OLS regression method in the StatsModels package is compatible with pandas dataframes, so the data are stored this way as an expedient.

In [113]:
regressand_df = shock_df.drop(columns = shock_df.columns[:-1])

regressors_df = sm.add_constant(greenbook_df)

### Getting String Lists for Each Potential Specification

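This block uses index arithmetic to build each of the 105 specifications mentioned above. `spec_list` documents the column order that the arithmetic assumes in `regressors_df` (after the constant): each variable occupies ten consecutive columns, five forecasts ($q_0$ to $q_4$) followed by five forecast revisions. Variables therefore start at offsets 0, 10 and 20, revisions sit 5 columns after the corresponding forecasts, and horizon $q_k$ sits $k$ columns after $q_0$.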

In [114]:
spec_list = [[['rgdp_q0','rgdp_q1','rgdp_q2','rgdp_q3','rgdp_q4'],
             ['rgdp_rev_q0','rgdp_rev_q1','rgdp_rev_q2','rgdp_rev_q3','rgdp_rev_q4']],
            [['infl_q0', 'infl_q1', 'infl_q2', 'infl_q3', 'infl_q4'],
             ['infl_rev_q0', 'infl_rev_q1', 'infl_rev_q2', 'infl_rev_q3','infl_rev_q4']],
            [['unmp_q0', 'unmp_q1', 'unmp_q2', 'unmp_q3', 'unmp_q4'],
             ['unmp_rev_q0', 'unmp_rev_q1', 'unmp_rev_q2', 'unmp_rev_q3','unmp_rev_q4']]]

def select(List, indices): # This function allows for getting non-consecutive elements from a list.
    return [List[i] for i in indices]

## Get a set of indices for combinations of real GDP growth, inflation and unemployment.

index_list_variables = [select(range(0,30,10),I) for I in [[0],[1],[2],[0,1],[0,2],[1,2],[0,1,2]]]

index_list_var_types = [] # This list stores a set of indices for forecasts, forecast revisions, and both

for item in index_list_variables:
    
    index_list_var_types.append(item) # Just forecasts
    
    revisions = [i + 5 for i in item]
    
    index_list_var_types.append(revisions) # Just revisions
    
    full = item + revisions
    
    index_list_var_types.append(full) # Both

index_list_full = []
    
for item in index_list_var_types: # This loop gets a set of indices for each of 0, 1, 2, 3 and 4 quarters out.
    
    for i in range(0,5):
        
        periods = []
        
        for j in item:
            
            periods = periods + [j + k for k in range(0,i+1)]
            
        index_list_full.append(periods)

spec_list_full = [select(list(regressors_df.columns[1:]),subset) for subset in index_list_full] # Gets each specification
                                                                                                # in terms of strings
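
Because the indices above are positional, they select the intended columns only if the dataframe's columns are in the order laid out in `spec_list`. The guard below is a sketch of one way to check that assumption before running the loop; `expected_column_order` and `column_order_matches` are hypothetical helpers, not part of this script.

```python
VARIABLES = ('rgdp', 'infl', 'unmp')
KINDS = ('', '_rev')   # forecasts first, then forecast revisions
HORIZONS = range(5)    # q0 through q4

def expected_column_order():
    """Column names in the order the positional indices assume."""
    return [f'{var}{kind}_q{q}'
            for var in VARIABLES for kind in KINDS for q in HORIZONS]

def column_order_matches(columns):
    """True if `columns` is exactly the assumed 30-column layout."""
    return list(columns) == expected_column_order()

expected = expected_column_order()
```

In this script the check would be called as `column_order_matches(regressors_df.columns[1:])`, skipping the constant in the first position.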

### Building a Dataframe for Specification Analysis

This block gets the Akaike Information Criterion, Adjusted $R^2$, $R^2$ and number of regressors (`n`) for each specification into a single dataframe, allowing for analysis of each using Pandas' `.sort_values()` method.

In [115]:
## Initialise (rather ugly) dataframe - specifications as index and information for analysis as columns

spec_df = pd.DataFrame(index = [','.join(s) for s in spec_list_full], columns = ['aic','adj_R^2','R^2','n'])

for spec in spec_list_full: # Loops through each specification
    
    columns = ['const'] + spec
    
    model = sm.OLS(regressand_df, regressors_df[columns]).fit() # This is the fitting of the model
    
    spec_df.loc[','.join(spec),'aic'] = model.aic
    
    spec_df.loc[','.join(spec),'adj_R^2'] = model.rsquared_adj
    
    spec_df.loc[','.join(spec),'R^2'] = model.rsquared
    
    spec_df.loc[','.join(spec),'n'] = len(spec)
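
For reference, adjusted $R^2$ follows from $R^2$ through the standard formula $\bar{R}^2 = 1 - (1 - R^2)\frac{N - 1}{N - n - 1}$, where $N$ is the number of observations and $n$ the number of regressors excluding the constant. The sketch below applies it to the figures this script reports for the `rgdp_q0` specification ($R^2 = 0.0612407$, adjusted $R^2 = 0.0552614$). The sample size of 159 is not stated anywhere in this script; it is back-solved from the reported figures and should be treated as an assumption. The function name `adjusted_r2` is hypothetical.

```python
def adjusted_r2(r2, n_obs, n_regressors):
    """Adjusted R^2 for a model with an intercept and n_regressors slopes."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_regressors - 1)

N_OBS = 159  # assumption: inferred from the reported R^2 and adjusted R^2

# rgdp_q0 specification: one regressor, reported R^2 of 0.0612407.
adj = adjusted_r2(0.0612407, N_OBS, 1)
print(round(adj, 7))
```

The formula shows why adjusted $R^2$ can fall when a regressor is added: the penalty factor $\frac{N-1}{N-n-1}$ grows with $n$, so a new regressor must raise $R^2$ by enough to offset it.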

In [118]:
spec_df

| specification | aic | adj_R^2 | R^2 | n |
|---|---|---|---|---|
| rgdp_q0 | -117.384 | 0.0552614 | 0.0612407 | 1 |
| rgdp_q0,rgdp_q1 | -115.752 | 0.0514033 | 0.0634109 | 2 |
| rgdp_q0,rgdp_q1,rgdp_q2 | -122.329 | 0.0954164 | 0.112592 | 3 |
| rgdp_q0,rgdp_q1,rgdp_q2,rgdp_q3 | -120.991 | 0.0933241 | 0.116278 | 4 |
| rgdp_q0,rgdp_q1,rgdp_q2,rgdp_q3,rgdp_q4 | -119.072 | 0.0878651 | 0.11673 | 5 |
| rgdp_rev_q0 | -115.21 | 0.0422521 | 0.0483138 | 1 |
| rgdp_rev_q0,rgdp_rev_q1 | -113.453 | 0.0375881 | 0.0497706 | 2 |
| rgdp_rev_q0,rgdp_rev_q1,rgdp_rev_q2 | -119.25 | 0.0777282 | 0.0952397 | 3 |
| rgdp_rev_q0,rgdp_rev_q1,rgdp_rev_q2,rgdp_rev_q3 | -117.344 | 0.0722891 | 0.0957755 | 4 |
| rgdp_rev_q0,rgdp_rev_q1,rgdp_rev_q2,rgdp_rev_q3,rgdp_rev_q4 | -120.342 | 0.0951239 | 0.123759 | 5 |


### Building the Purged Shock Series Dataframe

This block produces the OLS model for each series carried forth (justifications given above) and gets the residuals into a dataframe.

In [107]:
S_subset = ['rgdp_q0',
  'rgdp_q1',
  'rgdp_q2',
  'rgdp_q3',
  'rgdp_q4',
  'infl_q0',
  'infl_q1',
  'infl_q2',
  'infl_q3',
  'infl_q4'] # The forecast-only real GDP growth and inflation spec through to 4-quarters-out.

MA_subset = ['rgdp_q0',
  'rgdp_q1',
  'rgdp_q2',
  'unmp_q0',
  'unmp_q1',
  'unmp_q2',
  'rgdp_rev_q0',
  'rgdp_rev_q1',
  'rgdp_rev_q2',
  'unmp_rev_q0',
  'unmp_rev_q1',
  'unmp_rev_q2'] # The forecast and forecast revisions for real GDP growth and unemployment spec through to 2-quarters-out.

S_model = sm.OLS(regressand_df, regressors_df[['const'] + S_subset]).fit() # 'const' supplies the intercept, as in the
                                                                           # specification loop above
MA_model = sm.OLS(regressand_df, regressors_df[['const'] + MA_subset]).fit()

full_model = sm.OLS(regressand_df, regressors_df).fit() # The full model

purged_shocks_df = pd.DataFrame(index = greenbook_df.index)

purged_shocks_df['S_shocks'] = S_model.resid # This gets the residuals for each value

purged_shocks_df['MA_shocks'] = MA_model.resid

purged_shocks_df['full_shocks'] = full_model.resid

### Export to *.csv*

In [108]:
purged_shocks_df.to_csv('purged_shocks.csv')