## Advanced Econometrics 2 - Bootstrap Methods

Computer Class 1b (Wednesday)

*Aim of this computer class*: to gain practical experience of bootstrapping regression models.

# 11-1

Consider the model $y=\alpha+\beta x+\varepsilon$, where $\alpha$, $\beta$, and $x$ are scalars and $\varepsilon \sim N(0,\sigma^2)$. A sample of size $N=20$ is generated with $\alpha=2, \beta=1, \sigma^2=1$ and $x \sim N(2,2)$. We wish to test $H_0: \beta=1$ against $H_1: \beta\neq 1$  at level 0.05 using the t-statistic $t=(\hat{\beta}-1)/SE[\hat{\beta}]$. Use $B = 999$ bootstrap replications. 

In [None]:
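import numpy as np

def OLS(y,X):
    """Return the OLS coefficients, their standard errors, and the residuals."""
    N,k = X.shape                   # number of observations and regressors
    XXi = np.linalg.inv(X.T @ X)    # (X'X)^{-1}
    b_ols = XXi @ (X.T @ y)         # coefficient estimates
    res = y-X @ b_ols               # residuals
    s2 = (res @ res)/(N-k)          # degrees-of-freedom corrected error variance
    SE = np.sqrt(s2*np.diag(XXi))   # conventional standard errors
    return b_ols,SE,res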

### **(a)** Estimate the model by OLS, giving slope estimate $\hat{\beta}$.

In [None]:
import numpy as np
y  =np.array([2.463460087, 4.082339785,7.14245305 ,6.837688781, 3.188993095,4.838084255,5.354217263,5.024464493,4.278112328, 2.061616983,-0.655026946,6.637085435, 1.822475278, 3.440341802,6.294259862, 4.225766242,4.901194854, 2.293813513,3.278865984,5.515655038])
x  =np.array([0.259705633, 2.481299324,3.960540791,3.49720621,  2.133512947,1.530091473,3.265568683,3.797276605,1.184917425, 0.462349978,-2.149324397,4.470384733, 1.343208036, 1.693754991,3.869958201, 2.789750994,2.867776386, 0.393884163,1.918828592,2.983220267])
eps=np.array([0.203754454,-0.39895954, 1.181912259,1.340482571,-0.944519852,1.307992781,0.08864858,-0.772812112,1.093194903,-0.400732995,-0.505702548,0.166700702,-1.520732758,-0.253413189,0.424301661,-0.563984752,0.033418467,-0.10007065,-0.639962608,0.532434771])
N=len(y)
alpha=2
beta=1
const=np.ones(N)
X=np.vstack( (const,x) ).T
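# continue below
b_ols,SE,res = OLS(y,X)
print('                      C          X')
print('OLS estimates:    %7.4f   %7.4f' % (b_ols[0],b_ols[1]) )
print('Standard errors: (%7.4f) (%7.4f)' % (SE[0],SE[1]) )
print('t-stat for H0: beta=%g:  %7.4f' % (beta,(b_ols[1]-beta)/SE[1]) )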


### **(b)** Use a paired bootstrap to compute the standard error and compare this to the original sample estimate. Use the bootstrap standard error to test $H_0$.

In [None]:
w=np.vstack( (y,x) ).T                 # make pairs
BOOTREP=999                            # number of bootstrap replications (B = 999, as stated above)
betaB=np.zeros(BOOTREP)                # initialise to zero
tB=np.zeros(BOOTREP)
np.random.seed(42)
for b in range(BOOTREP):
    index=np.random.randint(N,size=N)  # select the indices  
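    wB=w[index,:]                      # resample from data
    yB=wB[:,0]
    XB=np.vstack( (const,wB[:,1]) ).T
    bB_ols,SEB,resB=OLS(yB,XB)         # obtain bootstrap estimates using OLS(.)-function
    betaB[b]=bB_ols[1]                 # store bootstrapped regression coefficient
    tB[b]=(bB_ols[1]-b_ols[1])/SEB[1]  # store bootstrapped t-statistic, centred at the sample estimate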
print('Results paired bootstrap (B=%d):' % BOOTREP)
print('  Bootstrapped SE:       %7.4f' % np.std(betaB,ddof=1))
print('  t-stat using SE.boot:  %7.4f' % ((b_ols[1]-beta)/np.std(betaB,ddof=1)))

### **(c)** Use a paired bootstrap based on an asymptotic pivotal test statistic to test $H_0$.
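
One way to set up part (c) is the percentile-t (bootstrap-t) approach: in each paired resample the t-statistic is recentred at the sample estimate $\hat{\beta}$ and divided by that resample's own standard error, and the observed $|t|$ is then compared with the $1-\alpha$ quantile of $|t^*|$. The sketch below is an illustration under assumptions, not the class solution. The helper names `ols_fit` and `paired_bootstrap_t` are hypothetical, the data are simulated from the DGP of 11-1 (reading $N(2,2)$ as variance 2) rather than taken from the class sample, and the seeds are arbitrary.

```python
import numpy as np

def ols_fit(y, X):
    """OLS coefficients and conventional standard errors (hypothetical helper)."""
    n, k = X.shape
    XXi = np.linalg.inv(X.T @ X)
    b = XXi @ (X.T @ y)
    res = y - X @ b
    s2 = (res @ res) / (n - k)
    return b, np.sqrt(s2 * np.diag(XXi))

def paired_bootstrap_t(y, X, beta0, B=999, alpha=0.05, seed=0):
    """Symmetric percentile-t test of H0: slope = beta0 with a paired bootstrap."""
    rng = np.random.default_rng(seed)
    n = len(y)
    b, se = ols_fit(y, X)
    t_obs = (b[1] - beta0) / se[1]
    t_star = np.empty(B)
    for i in range(B):
        idx = rng.integers(0, n, size=n)       # draw (y, x) rows with replacement
        bB, seB = ols_fit(y[idx], X[idx])
        t_star[i] = (bB[1] - b[1]) / seB[1]    # centred at the sample estimate
    crit = np.quantile(np.abs(t_star), 1 - alpha)
    p_value = (1 + np.sum(np.abs(t_star) >= abs(t_obs))) / (B + 1)
    return t_obs, crit, p_value

# Simulated stand-in data (assumption: not the class sample)
rng = np.random.default_rng(1)
x = rng.normal(2.0, np.sqrt(2.0), size=20)
y = 2.0 + 1.0 * x + rng.normal(0.0, 1.0, size=20)
X = np.column_stack((np.ones(20), x))

t_obs, crit, p_value = paired_bootstrap_t(y, X, beta0=1.0)
print('t = %.4f, bootstrap critical value = %.4f, p-value = %.4f' % (t_obs, crit, p_value))
print('Reject H0' if abs(t_obs) > crit else 'Do not reject H0')
```

The bootstrap statistic is centred at $\hat{\beta}$ rather than at the hypothesised value because, in the bootstrap world, $\hat{\beta}$ plays the role of the true parameter.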

### **(d)** Use a residual bootstrap to compute the standard error and compare this to the original sample estimate. Use the bootstrap standard error to test $H_0$.

In [None]:
fit=X@b_ols
np.random.seed(42)
for b in range(BOOTREP):
    index=np.random.randint(N,size=N)  # select the indices  
    resB=res[index]                    # resample from residuals
    yB=fit+resB                        # construct bootstrap observables
    bB_ols,SEB,resB=OLS(yB,X)          # obtain bootstrap estimates using OLS(.)-function
    betaB[b]=bB_ols[1]                 # store bootstrapped regression coefficient
    tB[b]=(bB_ols[1]-b_ols[1])/SEB[1]  # store bootstrapped t-statistic, centred at the sample estimate
print('Results residual bootstrap (B=%d):' % BOOTREP)
print('  Bootstrapped SE:      %8.4f' % np.std(betaB,ddof=1))
print('  t-stat using SE.boot: %8.4f' % ((b_ols[1]-beta)/np.std(betaB,ddof=1)))

### **(e)** Use a residual bootstrap with asymptotic refinement to test $H_0$.
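
For part (e), a possible setup keeps the regressors fixed, rebuilds $y^*$ from the fitted values plus resampled residuals, and compares the observed t-statistic with the $\alpha/2$ and $1-\alpha/2$ quantiles of the bootstrap t-statistics (an equal-tailed percentile-t test). This is a minimal sketch under assumptions. The names `ols_fit_res` and `residual_bootstrap_t` are hypothetical, the residuals are resampled without rescaling, and the data are simulated from the DGP of 11-1 (reading $N(2,2)$ as variance 2) rather than taken from the class sample.

```python
import numpy as np

def ols_fit_res(y, X):
    """OLS coefficients, standard errors, and residuals (hypothetical helper)."""
    n, k = X.shape
    XXi = np.linalg.inv(X.T @ X)
    b = XXi @ (X.T @ y)
    res = y - X @ b
    s2 = (res @ res) / (n - k)
    return b, np.sqrt(s2 * np.diag(XXi)), res

def residual_bootstrap_t(y, X, beta0, B=999, alpha=0.05, seed=0):
    """Equal-tailed percentile-t test of H0: slope = beta0 with a residual bootstrap."""
    rng = np.random.default_rng(seed)
    n = len(y)
    b, se, res = ols_fit_res(y, X)
    fit = X @ b
    t_obs = (b[1] - beta0) / se[1]
    t_star = np.empty(B)
    for i in range(B):
        e_star = res[rng.integers(0, n, size=n)]   # resample residuals, X stays fixed
        y_star = fit + e_star                      # bootstrap dependent variable
        bB, seB, _ = ols_fit_res(y_star, X)
        t_star[i] = (bB[1] - b[1]) / seB[1]        # centred at the sample estimate
    lo, hi = np.quantile(t_star, [alpha / 2, 1 - alpha / 2])
    return t_obs, lo, hi

# Simulated stand-in data (assumption: not the class sample)
rng = np.random.default_rng(2)
x = rng.normal(2.0, np.sqrt(2.0), size=20)
y = 2.0 + 1.0 * x + rng.normal(0.0, 1.0, size=20)
X = np.column_stack((np.ones(20), x))

t_obs, lo, hi = residual_bootstrap_t(y, X, beta0=1.0)
print('t = %.4f, bootstrap critical values = (%.4f, %.4f)' % (t_obs, lo, hi))
print('Reject H0' if (t_obs < lo or t_obs > hi) else 'Do not reject H0')
```

Because the model includes an intercept, the OLS residuals already have mean zero, so no recentring is needed before resampling.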

# 11-2

A sample of size 20 is generated according to the following DGP. Two regressors are generated by $x_1\sim \chi^2(4)-4$ and $x_2 \sim 3.5+\mathcal{U}[1,2]$; the error is from a mixture of normals with $u \sim N(0,25)$ with probability 0.3 and $u \sim N(0,5)$ with probability 0.7; and the dependent variable is $y=1.3\cdot x_1+0.7\cdot x_2+0.5\cdot u$.
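
To make the DGP concrete, the sketch below draws one sample from it. It is only an illustration under assumptions: the function name `simulate_dgp` and the seed are hypothetical, $N(0,25)$ and $N(0,5)$ are read as variances 25 and 5, and the draw will not reproduce the class sample given in part (a).

```python
import numpy as np

def simulate_dgp(n=20, seed=0):
    """Draw one sample of size n from the 11-2 DGP (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    x1 = rng.chisquare(4, size=n) - 4              # centred chi-square(4)
    x2 = 3.5 + rng.uniform(1.0, 2.0, size=n)       # uniform on [4.5, 5.5]
    wide = rng.random(n) < 0.3                     # mixture component indicator
    u = np.where(wide,
                 rng.normal(0.0, 5.0, size=n),            # N(0,25) with prob. 0.3
                 rng.normal(0.0, np.sqrt(5.0), size=n))   # N(0,5) with prob. 0.7
    y = 1.3 * x1 + 0.7 * x2 + 0.5 * u
    return y, x1, x2, u

y_sim, x1_sim, x2_sim, u_sim = simulate_dgp()
print(np.round(y_sim[:5], 3))
```

Under this DGP the true intercept is 0, the slopes are 1.3 and 0.7, and the regression error is $0.5\cdot u$.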

### **(a)** Estimate by OLS the model $y=\beta_0+\beta_1\cdot x_1+\beta_2\cdot x_2+u$. Also obtain the variance-covariance matrix of the OLS estimates.

In [None]:
y =np.array([-1.68394399, 1.89893235  ,5.587108425, 4.040390467,13.20263535 ,12.9103882  ,4.742519161,-0.22837419 , 0.997667496, 3.917611056,5.264028901,5.470083648,-2.736489722,-0.700599201, 2.968735541, 1.731689435,-1.626249678, 0.246495836, 3.040399679,4.966098157])
x1=np.array([-2.466038711,-0.039161303,0.746740798, 0.370547493, 5.807525562, 6.442885266,0.296510257,-3.434583623,-2.372860055,-2.95813237 ,0.499211306,1.106643338,-1.587499657,-2.36526624 ,-0.789818973,-0.553989793,-3.098652021,-1.793674939, 1.674597685,0.167786355])
x2=np.array([ 5.156698473, 4.709976091,4.963574179, 5.441221931, 5.216287314, 4.576500608,5.148811731, 5.168776938, 4.699506648, 5.411501169,4.847301769,5.211460017, 5.048981452, 4.572994183, 5.334505106, 4.877380298, 4.557628807, 5.14582404 , 4.727755237,4.95463344])
u =np.array([-4.175565193,-2.69428244 ,2.283686924,-0.50035325 , 4.002901992, 2.662173856,1.50577523 , 1.236881328, 1.585461827, 7.950264637,2.44388593 ,0.766850592,-8.41405437,-1.653698034 , 0.522693263,-1.924580086,-1.576684432,-2.047607142,-4.892011955,2.559464974])
N=len(y)
beta0=0
beta1=1.3
beta2=0.7
const=np.ones(N)
X=np.vstack( (const,x1,x2) ).T
k=np.shape(X)[1]
# continue below
b_ols,SE,res = OLS(y,X)
print('                      C          X1         X2')
print('OLS estimates:    %7.4f   %7.4f   %7.4f' % (b_ols[0],b_ols[1],b_ols[2]) )
print('Standard errors: (%7.4f) (%7.4f) (%7.4f)' % (SE[0],SE[1],SE[2]) )
s2 = (res @ res)/(N-k)
V = s2*np.linalg.inv(X.T@X)
print('Covariance matrix')
with np.printoptions(precision=4, suppress=True):
    print(V)

### **(b)** Suppose we are interested in estimating the quantity $\gamma=\beta_1+\beta_2^2$ from the data. Use the least-squares estimates to estimate this quantity. Use the delta method to obtain an approximate standard error for this function.

- Delta method: let $\theta=\left (\begin{array}{c}\beta_1 \\ \beta_2 \end{array} \right)$, so that $\gamma=h(\theta)=\beta_1+\beta_2^2$.
- Then we have $R(\theta)=\frac{\partial h(\theta)}{\partial \theta'}=(1,2\beta_2)$, so that $V(\hat{\gamma})\approx R(\hat{\theta})V(\hat{\theta})R'(\hat{\theta})$ (see p. 231 of the book; § 7.2.8).
- This means that $V(\hat{\gamma})\approx V(\hat{\beta}_1)+4\hat{\beta}_2^2 V(\hat{\beta}_2)+4\hat{\beta}_2 Cov(\hat{\beta}_1,\hat{\beta}_2).$
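
Writing out the quadratic form makes the last bullet explicit. The sketch below takes $V(\hat{\theta})$ to be the $2\times 2$ block of the OLS covariance matrix for $(\hat{\beta}_1,\hat{\beta}_2)$, and obtains the standard error as the square root of the result:

$$
\begin{aligned}
V(\hat{\gamma}) &\approx
\begin{pmatrix} 1 & 2\hat{\beta}_2 \end{pmatrix}
\begin{pmatrix} V(\hat{\beta}_1) & Cov(\hat{\beta}_1,\hat{\beta}_2) \\ Cov(\hat{\beta}_1,\hat{\beta}_2) & V(\hat{\beta}_2) \end{pmatrix}
\begin{pmatrix} 1 \\ 2\hat{\beta}_2 \end{pmatrix} \\
&= V(\hat{\beta}_1)+4\hat{\beta}_2^2 V(\hat{\beta}_2)+4\hat{\beta}_2 Cov(\hat{\beta}_1,\hat{\beta}_2), \\
SE[\hat{\gamma}] &= \sqrt{V(\hat{\gamma})}.
\end{aligned}
$$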

Determine the point estimate and its standard error.

In [None]:
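gamma_hat=b_ols[1]+b_ols[2]**2                                      # point estimate of gamma
se_gamma=np.sqrt(V[1,1]+4*b_ols[2]**2*V[2,2]+4*b_ols[2]*V[1,2])     # delta-method standard error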
tstat=(gamma_hat-1)/se_gamma
print('gamma_hat (SE): %7.4f (%7.4f)' % (gamma_hat,se_gamma) )
print('t-stat: %7.4f' % tstat);

### **(c)** Then estimate the standard error of $\hat{\gamma}$ using a paired bootstrap. Compare this to $SE[\hat{\gamma}]$ from part (b) and explain the difference. For the bootstrap use $B=25$ and $B=200$.

In [None]:
w=np.vstack( (y,x1,x2) ).T                # make pairs
for BOOTREP in (25,200,1000):                 # B=1000 is an extra run beyond the two values asked for
    gammaB_hat=np.zeros(BOOTREP)              # initialise to zero
    se_gammaB=np.zeros(BOOTREP)
    tB=np.zeros(BOOTREP)
    np.random.seed(3)
    for b in range(BOOTREP):
        index=np.random.randint(N,size=N)  # select the indices  
        wB=np.copy(w[index,])              # resample from data
        yB=wB[:,0]
        XB=np.vstack( (const,wB[:,1],wB[:,2]) ).T
        bB_ols,SEB,resB=OLS(yB,XB)         # obtain bootstrap estimates using OLS(.)-function    
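        gammaB_hat[b]=bB_ols[1]+bB_ols[2]**2   # bootstrap estimate of gamma
        VB=(resB@resB)/(N-k)*np.linalg.inv(XB.T@XB)   # covariance matrix in the bootstrap sample
        se_gammaB[b]=np.sqrt(VB[1,1]+4*bB_ols[2]**2*VB[2,2]+4*bB_ols[2]*VB[1,2])   # delta-method SE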
        tB[b]=(gammaB_hat[b]-gamma_hat)/se_gammaB[b]
    print('Results paired bootstrap (B=%d):' % BOOTREP)
    print('  Bootstrapped SE:  %7.4f' % np.std(gammaB_hat,ddof=1))
    print('  t-stat (SE.boot): %7.4f\n' % ((gamma_hat-1)/np.std(gammaB_hat,ddof=1)) )

### **(d)** Now test $H_0:\gamma=1$ at level 0.05 using a paired bootstrap with $B=999$. Perform bootstrap tests without asymptotic refinement, i.e. using $SE_{Boot}[\hat{\gamma}]$ of (c), and with asymptotic refinement, i.e. using $T^*=(\hat{\gamma}^*-\hat{\gamma})/SE(\hat{\gamma}^*)$.
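
The two tests in part (d) can be sketched side by side. Without refinement, $(\hat{\gamma}-1)/SE_{Boot}[\hat{\gamma}]$ is compared with a standard normal critical value (1.96 for a two-sided test at level 0.05). With refinement, the delta-method t-statistic is compared with a quantile of the bootstrap distribution of $|T^*|$. The code is a minimal illustration under assumptions: the helper names `ols_cov`, `gamma_and_se`, and `bootstrap_gamma_tests` are hypothetical, a symmetric percentile-t rule is used, and the data are simple simulated placeholders rather than the class sample.

```python
import numpy as np

def ols_cov(y, X):
    """OLS coefficients and their estimated covariance matrix (hypothetical helper)."""
    n, k = X.shape
    XXi = np.linalg.inv(X.T @ X)
    b = XXi @ (X.T @ y)
    res = y - X @ b
    return b, (res @ res) / (n - k) * XXi

def gamma_and_se(b, V):
    """gamma = b1 + b2^2 and its delta-method SE, using R = (0, 1, 2*b2)."""
    R = np.array([0.0, 1.0, 2.0 * b[2]])
    return b[1] + b[2] ** 2, np.sqrt(R @ V @ R)

def bootstrap_gamma_tests(y, X, gamma0=1.0, B=999, alpha=0.05, seed=0):
    """Paired-bootstrap tests of H0: gamma = gamma0, without and with refinement."""
    rng = np.random.default_rng(seed)
    n = len(y)
    g_hat, se_hat = gamma_and_se(*ols_cov(y, X))
    g_star = np.empty(B)
    t_star = np.empty(B)
    for i in range(B):
        idx = rng.integers(0, n, size=n)            # resample rows with replacement
        gB, seB = gamma_and_se(*ols_cov(y[idx], X[idx]))
        g_star[i] = gB
        t_star[i] = (gB - g_hat) / seB              # T* centred at gamma_hat
    t_boot_se = (g_hat - gamma0) / np.std(g_star, ddof=1)   # no refinement
    t_obs = (g_hat - gamma0) / se_hat                       # delta-method t-statistic
    crit = np.quantile(np.abs(t_star), 1 - alpha)           # refined critical value
    return t_boot_se, t_obs, crit

# Simulated placeholder data (assumption: not the class sample)
rng = np.random.default_rng(7)
x1 = rng.normal(0.0, 1.0, size=20)
x2 = rng.normal(5.0, 0.3, size=20)
y = 1.3 * x1 + 0.7 * x2 + rng.normal(0.0, 1.0, size=20)
X = np.column_stack((np.ones(20), x1, x2))

t_boot_se, t_obs, crit = bootstrap_gamma_tests(y, X)
print('Without refinement: t = %.4f (compare |t| with 1.96)' % t_boot_se)
print('With refinement:    t = %.4f, bootstrap critical value = %.4f' % (t_obs, crit))
```

The two rules can disagree in small samples, which is the comparison part (d) asks for.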