## Advanced Econometrics 2 - Bootstrap Methods

Computer Class 1c (Thursday)

*Aim of this computer class*: compare inference based on first-order asymptotic theory to inference based on the bootstrap using a Monte Carlo (MC) simulation.

The data-generating process (DGP) is the Gaussian first-order autoregressive,
AR(1), model given by
$$y_{t}=\rho y_{t-1}+\varepsilon _{t}, \ \ \ \ \varepsilon _{t}\sim N(0,1), \ \ \ \ \ t=2,...,N$$
with starting value
$y_{1}\sim N(0,1/(1-\rho ^{2}))$. 
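As a minimal sketch of this DGP, the recursion can be written out explicitly and compared with the filter-based construction used in the simulation code of this class. The function name `simulate_ar1` and the use of `np.random.default_rng` are assumptions made for illustration; they are not part of the class code.

```python
import numpy as np
from scipy import signal

def simulate_ar1(rho, N, rng):
    """Draw y_1,...,y_N from the stationary Gaussian AR(1) with N(0,1) innovations.

    Hypothetical helper for illustration; requires |rho| < 1.
    """
    y = np.empty(N)
    y[0] = rng.normal() / np.sqrt(1 - rho**2)   # y_1 ~ N(0, 1/(1-rho^2))
    eps = rng.normal(size=N - 1)                # eps_2,...,eps_N ~ N(0,1)
    for t in range(1, N):
        y[t] = rho * y[t - 1] + eps[t - 1]      # y_t = rho*y_{t-1} + eps_t
    return y, eps

rng = np.random.default_rng(0)
y, eps = simulate_ar1(0.9, 50, rng)

# The same path via a linear filter: pass [y_1, eps_2, ..., eps_N]
# through 1/(1 - rho*L), which reproduces the recursion above.
y_filt = signal.lfilter([1.0], [1.0, -0.9], np.concatenate(([y[0]], eps)))
print(np.allclose(y, y_filt))
```

The explicit loop is easier to read, while the filter version avoids a Python loop inside the Monte Carlo replications.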

When estimating the model, an intercept and a linear trend are included: $$y_{t}=\rho y_{t-1}+\mu +\beta t+\varepsilon _{t},$$ so that $\mu =0$ and $\beta =0$ in the DGP. The parameter of interest is the AR(1) coefficient $\rho$. 

Suppose we want to test $H_{0}:\rho =\rho _{0}$ against $H_{1}^{L}:\rho <\rho _{0}$, $H_{1}^{R}:\rho >\rho _{0}$ or
$H_{1}^{2}:\rho \neq \rho _{0}$. 

We are interested in estimating the size of inference procedures based on critical values from the standard normal distribution and on
critical values obtained by the bootstrap. For the bootstrap to be higher-order correct, it should be applied to an asymptotically pivotal statistic, so the bootstrap critical values should be based on the bootstrap $t$-statistic
$$T^{\ast }=\frac{\hat{\rho}^*-\hat{\rho}}{SE(\hat{\rho}^*)}.$$

## Assignment

### 1.  Study in some detail the simulation code shown in the cell below. Execute the program.

In [2]:
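import numpy as np

def OLS(y,X):
    """OLS of y on X; returns coefficients, standard errors and residuals."""
    N,k = X.shape                   # number of observations and regressors
    XXi = np.linalg.inv(X.T @ X)    # (X'X)^{-1}
    b_ols = XXi @ (X.T @ y)         # coefficient estimates
    res = y-X @ b_ols               # residuals
    s2 = (res @ res)/(N-k)          # error variance with degrees-of-freedom correction
    SE = np.sqrt(s2*np.diag(XXi))   # standard errors
    return b_ols,SE,res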

In [3]:
import numpy as np
from scipy import signal
N=50                                     # number of observations
print('N=%d\n' %N)
MCREP = 4000                             # number of Monte Carlo replications
BOOTREP = 99                             # number of bootstrap replications
const = np.ones(N-1)                     # vector with ones (constant)
trend = np.arange(N-1)                   # trend
rho_hat = np.zeros(MCREP)                # contains estimated rho
tstat = np.zeros(MCREP)                  # contains t-stat for rho
mean_rhoB = np.zeros(MCREP)              # mean of the bootstrapped estimator
tB_low = np.zeros(MCREP)                 # lower critical value
tB_high = np.zeros(MCREP)                # higher critical value
tB_crit2 = np.zeros(MCREP)               # critical value based on absolute t-stat
coefB = np.zeros(BOOTREP)
statB = np.zeros(BOOTREP)
np.random.seed(314159)          # reproducibility
rho=0.9                         # true parameter value
ar1=np.array([1, -rho])         # AR parameters: y(t)=rho*y(t-1)+epsilon(t)
ma0=np.array([1])               # MA parameters: we only have 1*epsilon(t)
for i in range(MCREP):
    y0 = np.random.normal(size=1)/np.sqrt(1-rho**2)            # draw start value ~ N(0,1/(1-rho^2))
    eps = np.random.normal(size=N-1)
    yt = signal.lfilter(ma0, ar1, np.concatenate((y0,eps)) )
    y = yt[1:]
    X = np.vstack((yt[:-1],const,trend)).T
    b_ols,SE,res = OLS(y,X)
    rho_hat[i] = b_ols[0]
    ar1_hat=np.array([1, -b_ols[0]])
    tstat[i]=(b_ols[0]-rho)/SE[0]
    fit_deter=X[:,1:] @ b_ols[1:]
    for b in range(BOOTREP):
        index = np.random.randint(N-1,size=N-1)
        epsB = np.copy(res[index])
        yBt = signal.lfilter(ma0, ar1_hat, np.concatenate((y0,fit_deter+epsB)) )
        yB = yBt[1:]
        XB = np.vstack((yBt[:-1],const,trend)).T
        bB_ols,SEB,resB = OLS(yB,XB)
        coefB[b] = bB_ols[0]
        statB[b]=(bB_ols[0]-rho_hat[i])/SEB[0]
    mean_rhoB[i]=np.mean(coefB)
    tB_low[i]=np.quantile(statB,0.05)
    tB_high[i]=np.quantile(statB,0.95)
    t2Bstat=abs(statB)
    tB_crit2[i]=np.quantile(t2Bstat,0.95)
print("True bias: %7.3f" % (np.mean(rho_hat)-rho))
bias_boot=mean_rhoB-rho_hat; print("MC average of bootstrapped bias: %7.3f" % np.mean(bias_boot))
print("Rejection frequencies")
rej1Lasym=(tstat<=-1.645);  rej1Lboot=(tstat<=tB_low)
print("H1:rho< %4.2f based on Asymp: %7.3f    Boot: %7.3f" % (rho,np.mean(rej1Lasym),np.mean(rej1Lboot)))
rej1Rasym=(tstat>1.645);    rej1Rboot=(tstat>=tB_high)
print("H1:rho> %4.2f based on Asymp: %7.3f    Boot: %7.3f" % (rho,np.mean(rej1Rasym),np.mean(rej1Rboot)))
rej2asym=(abs(tstat)>1.96); rej2boot=(abs(tstat)>tB_crit2)
print("H1:rho<>%4.2f based on Asymp: %7.3f    Boot: %7.3f" % (rho,np.mean(rej2asym),np.mean(rej2boot)))

N=50

True bias:  -0.143
MC average of bootstrapped bias:  -0.108
Rejection frequencies
H1:rho< 0.90 based on Asymp:   0.378    Boot:   0.159
H1:rho> 0.90 based on Asymp:   0.001    Boot:   0.043
H1:rho<>0.90 based on Asymp:   0.247    Boot:   0.156


### 2.  How many bootstrap samples are generated within the whole MC simulation?

In [4]:
4000*99


396000

### 3.  Look at the vectors `b_ols` and `bB_ols`. What does `b_ols` refer to? Same question with respect to `bB_ols`.

In [6]:
bB_ols

array([ 0.76173981, -1.03626702,  0.03817633])

In [7]:
b_ols

array([ 0.81643125, -1.0103615 ,  0.02728157])

### 4.  In the Python program, what does the vector `t2Bstat` refer to?

### 5.  Look at the averages of the series `rej1Lasym`, `rej1Rasym`, `rej2asym`. What do these averages represent? Same question with respect to `rej1Lboot`, `rej1Rboot`, `rej2boot`. What is the Error in Rejection Probability (ERP) for the various testing procedures?
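The ERP is the difference between the actual rejection probability under $H_{0}$ and the nominal level of the test. A minimal sketch of how it could be estimated from a vector of rejection indicators is given below; the helper `erp` and its Monte Carlo standard error are assumptions added for illustration and are not part of the class code. The nominal level of 5% matches the critical values 1.645 and 1.96 and the bootstrap quantiles 0.05 and 0.95 used in the program.

```python
import numpy as np

def erp(rejections, nominal=0.05):
    """Estimated ERP and its Monte Carlo standard error (hypothetical helper).

    rejections: boolean array, one entry per Monte Carlo replication.
    """
    rejections = np.asarray(rejections, dtype=float)
    freq = rejections.mean()                              # estimated rejection probability
    mc_se = np.sqrt(freq * (1 - freq) / rejections.size)  # binomial standard error
    return freq - nominal, mc_se

# Toy input: one rejection in four replications
demo = np.array([True, False, False, False])
err, se = erp(demo, nominal=0.05)
print("ERP: %6.3f   MC s.e.: %6.3f" % (err, se))
```

With the variables from the simulation, a call such as `erp(rej1Lasym)` or `erp(rej1Lboot)` would give the ERP of the corresponding test; the standard error indicates how much of the difference could be due to simulation noise.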

### 6.  Verify the rejection frequencies as reported in the lecture for $N=50$, $N=200$, $N=800$.

### 7.  Consider the use of the restricted parameter values in the bootstrap DGP.

Modify the Python code to get the following bootstrap DGP
$$y_{t}^{\ast }=\rho _{0}y_{t-1}^{\ast }+\hat{\mu}+\hat{\beta}t+\varepsilon_{t}^{\ast }$$
and test statistic
$$T^{\ast }=\frac{\hat{\rho}^{\ast }-\rho _{0}}{SE(\hat{\rho}^{\ast })}.$$
What is your conclusion? How could you obtain estimates of $\mu$ and $\beta$ under $H_{0}:\rho =\rho _{0}$?
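One possible approach to the last question, sketched here under the assumption that the restriction is imposed directly: under $H_{0}$ the model can be rewritten as $y_{t}-\rho _{0}y_{t-1}=\mu +\beta t+\varepsilon _{t}$, so $\mu$ and $\beta$ could be estimated by regressing $y_{t}-\rho _{0}y_{t-1}$ on a constant and a trend. The function name `restricted_deterministics` is hypothetical, and the trend is taken to start at 0 as in `np.arange(N-1)` in the class code.

```python
import numpy as np

def restricted_deterministics(yt, rho0):
    """OLS of y_t - rho0*y_{t-1} on a constant and trend, imposing rho = rho0.

    Hypothetical helper: returns (mu, beta) estimates and restricted residuals.
    """
    z = yt[1:] - rho0 * yt[:-1]                       # quasi-differenced series
    n = z.size
    D = np.column_stack((np.ones(n), np.arange(n)))   # constant and trend 0,1,...,n-1
    coef, *_ = np.linalg.lstsq(D, z, rcond=None)      # least-squares fit
    res = z - D @ coef                                # residuals under H0
    return coef, res

# Noise-free check: data built with rho0=0.9, mu=1.0, beta=0.5
yt = np.zeros(20)
for t in range(1, 20):
    yt[t] = 0.9 * yt[t - 1] + 1.0 + 0.5 * (t - 1)
coef, res = restricted_deterministics(yt, 0.9)
print(coef)
```

The restricted residuals could then be resampled in place of the unrestricted ones; whether this improves the bootstrap test is for the simulation to show.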