# Problem Set 2
## 1. Monte Carlo Simulations for 2SLS and two-step efficient GMM
### (a) Generate observations for ($Y_i,X_i,Z_i$)


In [1]:
using Distributions, PrettyTables, Random

n=500
const β=1.0
const Π=[1.0;1.0;]
const ρ=0.95;
const Σ=[1.0 ρ; ρ 1;];

Random.seed!(2634)

function data(n)
    Z=randn(n,2)
    mvnorm = MvNormal([0.0; 0.0], Σ)
    err = rand(mvnorm,n)'
    ϵ=err[:,1]
    V=err[:,2]

    X=Z*Π+V
    U=exp.(Z*Π) .* ϵ
    Y=β*X+U
    return (Y = Y, X = X, Z = Z, U = U)
end

data (generic function with 1 method)

### (b) Compute $\hat\beta^{2sls}$ and its standard error
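Part (c), the two-step efficient GMM estimator, is computed in the same function because it uses the 2SLS residuals to build its first-step weight matrix.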

In [2]:
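function estimators(Y,X,Z,U)
    # U is accepted for convenience but not used: residuals are re-estimated below
    n = length(Y)
    PZ = Z*( (Z'*Z)\Z' )                  # projection onto the column space of Z
    β2sls = (X'*PZ*X)\(X'*PZ*Y)
    Q = Z'*X/n
    W = inv(Z'*Z/n)                       # 2SLS weight matrix
    ZU = Z.*(Y-β2sls*X)                   # row i is Z_i times the 2SLS residual
    Ω1 = (ZU' * ZU)/n
    # heteroskedasticity-robust sandwich variance of β2sls
    v2sls = ( (Q'*W*Q)\(Q'*W*Ω1*W*Q)/(Q'*W*Q) )/n
    std2sls = sqrt(v2sls)
    
    # part (c): two-step efficient GMM, first-step weight from the 2SLS residuals
    WGMM = inv(Ω1)
    βgmm = (X'*Z*WGMM*Z'*X)\(X'*Z*WGMM*Z'*Y)
    ZU = Z.*(Y-βgmm*X)                    # re-estimate Ω with the GMM residuals
    Ω2 = (ZU' * ZU)/n
    WGMM = inv(Ω2)
    stdgmm = sqrt(inv(Q'*WGMM*Q)/n)
    
    return β2sls, std2sls, βgmm, stdgmm
end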

estimators (generic function with 1 method)
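
For reference, the quantities computed by `estimators` can be written out as follows. This is a transcription of the code above rather than an independent derivation; the hats and the subscripts on $\hat\Omega$ are notation introduced here, not taken from the problem set.

$$
\begin{align}
\hat\beta^{2sls} & =(X^\prime P_ZX)^{-1}X^\prime P_ZY,\quad P_Z=Z(Z^\prime Z)^{-1}Z^\prime\\
\hat Q & =\frac{Z^\prime X}{n},\quad \hat W=\big(\frac{Z^\prime Z}{n}\big)^{-1},\quad \hat\Omega_1=\frac{1}{n}\sum_{i=1}^n\hat U_i^2Z_iZ_i^\prime,\quad \hat U_i=Y_i-X_i\hat\beta^{2sls}\\
\widehat{se}(\hat\beta^{2sls}) & =\sqrt{\frac{1}{n}(\hat Q^\prime\hat W\hat Q)^{-1}\hat Q^\prime\hat W\hat\Omega_1\hat W\hat Q(\hat Q^\prime\hat W\hat Q)^{-1}}\\
\hat\beta^{gmm} & =(X^\prime Z\hat\Omega_1^{-1}Z^\prime X)^{-1}X^\prime Z\hat\Omega_1^{-1}Z^\prime Y\\
\widehat{se}(\hat\beta^{gmm}) & =\sqrt{\frac{1}{n}(\hat Q^\prime\hat\Omega_2^{-1}\hat Q)^{-1}}
\end{align}
$$

Here $\hat\Omega_2$ is formed like $\hat\Omega_1$ but with the GMM residuals $Y_i-X_i\hat\beta^{gmm}$.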

### (d) Generate 10,000 independent samples and calculate bias and confidence intervals

In [3]:
R=10000
Bias2SLS=0.0
BiasGMM=0.0
std2SLS=0.0
stdGMM=0.0
In2SLS=0.0
InGMM=0.0
CritVal = quantile(Normal(0,1), .975);
for r=1:R
    Y, X, Z, U = data(n)
    b2sls, s2sls, bgmm, sgmm = estimators(Y,X,Z,U)
    Bias2SLS += abs(b2sls-β)
    BiasGMM += abs(bgmm-β)
    std2SLS += s2sls
    stdGMM += sgmm
    In2SLS += (β>b2sls - CritVal*s2sls)*(β<b2sls + CritVal*s2sls)
    InGMM += (β>bgmm - CritVal*sgmm)*(β<bgmm + CritVal*sgmm)
end

### (e) Report statistics

In [4]:
table_data = ["Bias" Bias2SLS/R BiasGMM/R
        "Ave. std. err" std2SLS/R stdGMM/R;
        "CI Coverage Prob" In2SLS/R InGMM/R;        
]
header=["Statistic" "2SLS" "GMM";]
pretty_table(table_data,header)

┌──────────────────┬──────────┬──────────┐
│        Statistic │     2SLS │      GMM │
├──────────────────┼──────────┼──────────┤
│             Bias │  0.45382 │  0.37402 │
│    Ave. std. err │ 0.521618 │ 0.440295 │
│ CI Coverage Prob │   0.9564 │   0.9477 │
└──────────────────┴──────────┴──────────┘


### (f) Compare two methods, which is preferred?
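Two-step GMM has a smaller average deviation from $\beta$ (0.374 versus 0.454) and a smaller average standard error (0.440 versus 0.522), while its coverage probability (0.948) stays close to the nominal 0.95, so GMM is preferred at n=500. Note that the "Bias" row averages $|\hat\beta-\beta|$ across replications, so it measures mean absolute error rather than signed bias.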
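
Because the loop in part (d) accumulates `abs(b2sls-β)`, the reported "Bias" is a mean absolute error. The sketch below shows one way to report signed bias and mean absolute error side by side. It is a minimal illustration, not the notebook's code: `draw_sample`, `tsls`, and `mc_summary` are hypothetical helper names, only the 2SLS point estimate is computed, and the errors are built from two independent normals rather than `MvNormal` (which should give the same joint distribution).

```julia
using LinearAlgebra, Random, Statistics

# Hypothetical helper: one sample from the same design as `data` above.
# Assumption: (ϵ, V) is built from two independent normals instead of MvNormal,
# which gives unit variances and corr(ϵ, V) = ρ.
function draw_sample(rng, n; β=1.0, Π=[1.0, 1.0], ρ=0.95)
    Z = randn(rng, n, 2)
    ϵ = randn(rng, n)
    V = ρ .* ϵ .+ sqrt(1 - ρ^2) .* randn(rng, n)
    X = Z * Π .+ V
    U = exp.(Z * Π) .* ϵ
    Y = β .* X .+ U
    return Y, X, Z
end

# 2SLS point estimate for one endogenous regressor: X'P_Z Y / X'P_Z X
function tsls(Y, X, Z)
    Xhat = Z * ((Z' * Z) \ (Z' * X))   # first-stage fitted values P_Z X
    return dot(Xhat, Y) / dot(Xhat, X)
end

# Signed bias and mean absolute error over R replications
function mc_summary(n, R; β=1.0, seed=2634)
    rng = MersenneTwister(seed)
    est = [tsls(draw_sample(rng, n; β=β)...) for _ in 1:R]
    return (bias = mean(est) - β, mae = mean(abs.(est .- β)))
end

summary_500 = mc_summary(500, 1_000)
println(summary_500)
```

The mean absolute error is never smaller than the absolute value of the signed bias, so a large "Bias" entry in the table above can reflect dispersion rather than systematic error.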

### (g) Repeat d-f for n=100

In [5]:
R=10000
Bias2SLS=0.0
BiasGMM=0.0
std2SLS=0.0
stdGMM=0.0
In2SLS=0.0
InGMM=0.0
CritVal = quantile(Normal(0,1), .975);
for r=1:R
    Y, X, Z, U = data(100)
    b2sls, s2sls, bgmm, sgmm = estimators(Y,X,Z,U)
    Bias2SLS += abs(b2sls-β)
    BiasGMM += abs(bgmm-β)
    std2SLS += s2sls
    stdGMM += sgmm
    In2SLS += (β>b2sls - CritVal*s2sls)*(β<b2sls + CritVal*s2sls)
    InGMM += (β>bgmm - CritVal*sgmm)*(β<bgmm + CritVal*sgmm)
end
table_data = ["Bias" Bias2SLS/R BiasGMM/R
        "Ave. std. err" std2SLS/R stdGMM/R;
        "CI Coverage Prob" In2SLS/R InGMM/R;        
]
header=["Statistic" "2SLS" "GMM";]
pretty_table(table_data,header)

┌──────────────────┬──────────┬──────────┐
│        Statistic │     2SLS │      GMM │
├──────────────────┼──────────┼──────────┤
│             Bias │ 0.849006 │ 0.644065 │
│    Ave. std. err │ 0.903082 │ 0.691824 │
│ CI Coverage Prob │   0.9529 │   0.9339 │
└──────────────────┴──────────┴──────────┘


With n=100, GMM still has the smaller average deviation from $\beta$ (0.644 versus 0.849) and the smaller average standard error (0.692 versus 0.903), but its coverage probability falls to 0.934. 2SLS keeps a coverage probability close to 0.95 (0.953), so its confidence intervals may be more reliable in smaller samples.

## 2. Prove (10), (11), and (12) in Lecture 1.
### (10): $P(W_n>\chi_{k,1-\alpha}^2)\rightarrow 1$.
Under the fixed alternative $H_1:\beta=\beta_0+\delta$ with $\delta\neq 0$, we have $\hat\beta_n\rightarrow_p\beta_0+\delta$ and $\hat{V}\rightarrow_p V$ with $V$ positive definite, so by Slutsky's theorem:
$$
\begin{align}
\frac{W_n}{n} & = (\hat\beta_n-\beta_0)^\prime \hat{V}^{-1}(\hat\beta_n-\beta_0)\rightarrow_p \delta^\prime V^{-1} \delta>0\\
P(W_n>\chi_{k,1-\alpha}^2) & =P\big(\frac{W_n}{n}>\frac{\chi_{k,1-\alpha}^2}{n}\big)\rightarrow 1 \text{ as } n\rightarrow\infty
\end{align}
$$
because $\frac{W_n}{n}\rightarrow_p \delta^\prime V^{-1}\delta>0$ and $\frac{\chi_{k,1-\alpha}^2}{n}\rightarrow 0$.
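
The last step can be made explicit. Writing $c=\delta^\prime V^{-1}\delta>0$ (the symbols $c$ and $\varepsilon$ are introduced here only for illustration), a sketch of the argument is:

$$
\begin{align}
& \text{Fix } \varepsilon\in(0,c). \text{ For } n \text{ large enough, } \frac{\chi_{k,1-\alpha}^2}{n}<c-\varepsilon, \text{ so}\\
& P\big(\frac{W_n}{n}>\frac{\chi_{k,1-\alpha}^2}{n}\big)\geq P\big(\frac{W_n}{n}>c-\varepsilon\big)\geq P\big(\big|\frac{W_n}{n}-c\big|<\varepsilon\big)\rightarrow 1.
\end{align}
$$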
### (11): $n^{1/2}\big(\hat\beta_n(A_n)-\beta_0\big)\rightarrow_d N(\delta,V(A))$
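Under the local alternative $H_1:\beta=\beta_0+\frac{\delta}{\sqrt{n}}$:
$$
\sqrt{n}\big(\hat\beta(A_n)-\beta_0\big)=\delta+(D_n(A_n^\prime A_n)D_n^\prime)^{-1}D_n(A_n^\prime A_n)n^{-1/2}\sum{z_i u_i},\quad\text{where } D_n=\frac{1}{n}\sum x_i z_i^\prime
$$
We know that $D_n\rightarrow_p Q^\prime$ with $Q=Ez_ix_i^\prime$ by the WLLN, that $n^{-1/2}\sum z_iu_i\rightarrow_d N(0,\Omega)$ by the CLT, and, assuming $A_n\rightarrow_p A$, that $A_n^\prime A_n\rightarrow_p A^\prime A=W$. By Slutsky's theorem,
$$
(D_n(A_n^\prime A_n)D_n^\prime)^{-1}D_n(A_n^\prime A_n)n^{-1/2}\sum{z_i u_i}\rightarrow_d N(0,V(A)),\quad V(A)=(Q^\prime W Q)^{-1}Q^\prime W\Omega WQ(Q^\prime WQ)^{-1}
$$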
Then
$$
\sqrt{n}(\hat\beta(A_n)-\beta_0)\rightarrow_d N(\delta,V(A))
$$
### (12): $W_n \rightarrow_d \chi_{k}^2\big(\delta^\prime(V(A))^{-1}\delta\big)$
$$
W_n = \sqrt{n}(\hat\beta_n-\beta_0)^\prime \hat{V}_n^{-1}\sqrt{n}(\hat\beta_n-\beta_0)
$$
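From (11) and $\hat{V}_n\rightarrow_p V=V(A)$,
$$
V^{-1/2}\sqrt{n}(\hat\beta_n-\beta_0)\rightarrow_d V^{-1/2}N(\delta,V)=N(V^{-1/2}\delta,I_k)
$$
which is the distribution of $z+V^{-1/2}\delta$, where $z\sim N(0,I_k).$ <br>
Then, since $(V^{-1/2}\delta)^\prime(V^{-1/2}\delta)=\delta^\prime V^{-1}\delta$,
$$
W_n\rightarrow_d (z+V^{-1/2}\delta)^\prime(z+V^{-1/2}\delta)\sim\chi_{k}^2(\delta^\prime V(A)^{-1}\delta)
$$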
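
The limit in (12) can be checked numerically. The sketch below draws $b\sim N(\delta,V)$, forms $b^\prime V^{-1}b$, and compares the simulated mean with $k+\lambda$, the mean of a noncentral $\chi_k^2(\lambda)$ with $\lambda=\delta^\prime V^{-1}\delta$. The values of `V` and `δ` and the name `simulate_wald_mean` are illustrative assumptions, not part of the problem set.

```julia
using LinearAlgebra, Random

# Simulate W = b' V^{-1} b with b ~ N(δ, V) and return the sample mean of W.
function simulate_wald_mean(V, δ; draws=100_000, seed=1)
    rng = MersenneTwister(seed)
    L = cholesky(Symmetric(V)).L       # V = L * L'
    k = length(δ)
    total = 0.0
    for _ in 1:draws
        b = δ .+ L * randn(rng, k)     # one draw from N(δ, V)
        total += dot(b, V \ b)
    end
    return total / draws
end

V = [2.0 0.5; 0.5 1.0]      # illustrative values, not from the problem set
δ = [1.0, -1.0]
λ = dot(δ, V \ δ)           # noncentrality parameter δ'V⁻¹δ
sim_mean = simulate_wald_mean(V, δ)
println("simulated mean: ", sim_mean, "   k + λ: ", length(δ) + λ)
```

Matching means are only a necessary check, not a proof of the distributional result, but a mismatch would flag an error such as using $V^{1/2}\delta$ in place of $V^{-1/2}\delta$.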

## 3. Local power of the overidentifying restrictions test
### (a)
$$
\begin{align}
\tilde\beta_n(W_n) & =(X^\prime ZW_nZ^\prime X)^{-1}X^\prime ZW_nZ^\prime Y\\
& =\beta+(X^\prime ZW_nZ^\prime X)^{-1}X^\prime ZW_nZ^\prime U\\
& =\beta+(X^\prime ZW_nZ^\prime X)^{-1}X^\prime ZW_n\sum_{i=1}^n(U_iZ_i-EU_iZ_i)+(X^\prime ZW_nZ^\prime X)^{-1}X^\prime ZW_nnEU_1Z_1\\
& =\beta+\big(\frac{X^\prime Z}{n}W_n\frac{Z^\prime X}{n}\big)^{-1}\frac{X^\prime Z}{n}W_n\frac{1}{n}\sum_{i=1}^n(U_iZ_i-EU_iZ_i)+\big(\frac{X^\prime Z}{n}W_n\frac{Z^\prime X}{n}\big)^{-1}\frac{X^\prime Z}{n}W_n\frac{\delta}{\sqrt{n}}
\end{align}
$$
By the WLLN,
$$
\frac{Z^\prime X}{n}\rightarrow_p Q=EZ_1X_1^\prime \text{ and } \frac{1}{n}\sum_{i=1}^n(U_iZ_i-EU_iZ_i)\rightarrow_p 0
$$
and since $W_n\rightarrow_p W$, rank$(Q)=k$, and $\delta/\sqrt{n}\rightarrow 0$, it follows that the probability limit of $\tilde\beta_n(W_n)$ is $\beta$.
### (b)
From (a) and the central limit theorem,
$$
\begin{align}
\sqrt{n}(\tilde\beta_n(W_n)-\beta) & =\big(\frac{X^\prime Z}{n}W_n\frac{Z^\prime X}{n}\big)^{-1}\frac{X^\prime Z}{n}W_n\frac{1}{\sqrt{n}}\sum_{i=1}^n(U_iZ_i-EU_iZ_i)+\big(\frac{X^\prime Z}{n}W_n\frac{Z^\prime X}{n}\big)^{-1}\frac{X^\prime Z}{n}W_n\delta\\
& \rightarrow_d (Q^\prime WQ)^{-1}Q^\prime WD+(Q^\prime WQ)^{-1}Q^\prime W\delta\\
& \sim N\big((Q^\prime WQ)^{-1}Q^\prime W\delta,(Q^\prime WQ)^{-1}Q^\prime W\Omega WQ(Q^\prime WQ)^{-1}\big)
\end{align}
$$
where $D\sim N(0,\Omega)$.
### (c)
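The distribution in (b) cannot be used for inference on $\beta$: although $\tilde\beta_n(W_n)$ is consistent, its asymptotic distribution is centered at $(Q^\prime WQ)^{-1}Q^\prime W\delta$ rather than at zero, and this shift depends on the unknown $\delta$, for which we don't have an estimator.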
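
To illustrate the consequence, consider the scalar case $k=1$ (an assumption made here only for illustration), write $b=(Q^\prime WQ)^{-1}Q^\prime W\delta$ and $\sigma^2$ for the asymptotic variance in (b), and suppose the standard error consistently estimates $\sigma$. The usual 95% confidence interval would then have asymptotic coverage

$$
P\big(|N(b/\sigma,1)|\leq 1.96\big)=\Phi(1.96-b/\sigma)-\Phi(-1.96-b/\sigma)
$$

which is strictly below the nominal level $\Phi(1.96)-\Phi(-1.96)\approx 0.95$ whenever $b\neq 0$, and the size of the shortfall cannot be assessed without knowing $\delta$.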
### (d)
Consider $\hat\Omega_n=\frac{1}{n}\sum_{i=1}^n\hat{U}_i^2Z_iZ_i^\prime$ where $\hat{U}_i=Y_i-X_i^\prime\tilde\beta_n(W_n)=U_i+X_i^\prime(\beta-\tilde\beta_n(W_n))$. Expanding the square,
$$
\hat\Omega_n=\frac{1}{n}\sum_{i=1}^nU_i^2Z_iZ_i^\prime + \frac{1}{n}\sum_{i=1}^n\big(X_i^\prime(\beta-\tilde\beta_n(W_n))\big)^2Z_iZ_i^\prime + \frac{2}{n}\sum_{i=1}^n\big(X_i^\prime(\beta-\tilde\beta_n(W_n))\big)U_iZ_iZ_i^\prime
$$
The second and third terms go to zero in probability by the WLLN, the consistency of $\tilde\beta_n(W_n)$, and the assumed finiteness of the fourth moments $EZ_{i,j}^4, EX_{i,j}^4, \text{ and } EU_i^4$. <br>
For the first term,
$$
\begin{align}
\frac{1}{n}\sum_{i=1}^nU_i^2Z_iZ_i^\prime & = \frac{1}{n}\sum_{i=1}^n(U_iZ_i)(U_iZ_i)^\prime\\
& = \frac{1}{n}\sum_{i=1}^n(U_iZ_i-EU_iZ_i)(U_iZ_i)^\prime+EU_1Z_1\frac{1}{n}\sum_{i=1}^nU_iZ_i^\prime\\
& = \frac{1}{n}\sum_{i=1}^n(U_iZ_i-EU_iZ_i)(U_iZ_i-EU_iZ_i)^\prime+EU_1Z_1\times\frac{1}{n}\sum_{i=1}^n(U_iZ_i-EU_iZ_i)^\prime+\frac{1}{n}\sum_{i=1}^n(U_iZ_i-EU_iZ_i)\times EU_1Z_1^\prime+(EU_1Z_1)(EU_1Z_1)^\prime
\end{align}
$$
By the WLLN, this converges to $Var(U_iZ_i)$ (and since $EU_1Z_1\rightarrow0$