<u> Question 2 </u>:  (Monte Carlo (MC) Simulations for 2SLS and two-step efficient GMM) When answering
the questions below, use Markdown to structure your Jupyter notebook.

<u> Part a </u>:

In [1]:
# The necessary libraries 
using Distributions, PrettyTables, Random
using Parameters


###### Initialize all the parameters needed

In [2]:
n = 500
const β = 1.0
const Π = [1.0;1.0;]
const ρ = 0.95;
const Σ =[1.0 ρ; ρ 1;];

###### Let us write a function that generates a sample of size n


In [3]:
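function generate_sample(n)
    # Joint distribution of (ϵ, V): bivariate normal with correlation ρ
    mvnormal = MvNormal([0.0; 0.0], Σ)
    
    # DGP
    Errors=rand(mvnormal,n)'    # n×2 matrix of draws
    ϵ=Errors[:,1]
    V=Errors[:,2]
    Z=randn(n,2)                # two independent standard normal instruments
    X=Z*Π+V                     # endogenous regressor (V is correlated with ϵ)
    U=exp.(Z*Π) .* ϵ            # heteroskedastic structural error
    Y=β*X+U
    return (Y = Y , X = X , Z = Z)
end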

generate_sample (generic function with 1 method)
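
The data-generating process implemented by `generate_sample` can be written out explicitly. The display below is read off the code above, not taken from the assignment text, so it should be treated as a sketch of what the function draws for $i = 1, \dots, n$:

$$
\begin{aligned}
(\epsilon_i, V_i)^\prime &\sim N(0, \Sigma), \qquad \Sigma = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}, \qquad Z_i \sim N(0, I_2), \\
X_i &= Z_i^\prime \Pi + V_i, \\
U_i &= \exp(Z_i^\prime \Pi)\,\epsilon_i, \\
Y_i &= \beta X_i + U_i .
\end{aligned}
$$

In the code, $Z_i$ is drawn independently of $(\epsilon_i, V_i)$, $X_i$ is endogenous because $V_i$ and $\epsilon_i$ are correlated through $\rho$, and $U_i$ is conditionally heteroskedastic in $Z_i$.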

In [4]:

@unpack X, Y , Z = generate_sample(5);



<u> Part b </u>: Function that computes 2SLS estimator of $\beta$ and its standard error.

In [5]:
# Estimate Ω = E[U² Z Z'] from residuals U and instruments Z
function Ω(U,Z)
    n=length(U)
    zr = Z.*U                 # scale row i of Z by U[i]
    omega = (zr' * zr)/n
    
    return omega
end

Ω (generic function with 1 method)

The 2SLS estimator of $\beta$ is given by $\hat{\beta}^{2sls}_n = (X^{\prime}ZW_nZ^\prime X)^{-1}X^{\prime}ZW_nZ^\prime Y$, where $W^{-1}_n = \frac{1}{n} Z^\prime Z$.
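
The standard error computed in the next cell appears to follow the usual heteroskedasticity-robust sandwich form. As a sketch inferred from the code (the symbols $Q_n$, $\hat{U}_i$ and $\hat{\Omega}_n$ are notation introduced here, matching the variables `Q`, the residuals, and the output of `Ω`):

$$
Q_n = \frac{1}{n} Z^\prime X, \qquad
\hat{U}_i = Y_i - X_i \hat{\beta}^{2sls}_n, \qquad
\hat{\Omega}_n = \frac{1}{n} \sum_{i=1}^{n} \hat{U}_i^2 Z_i Z_i^\prime ,
$$

$$
\widehat{\mathrm{Var}}\big(\hat{\beta}^{2sls}_n\big)
= \frac{1}{n} \, (Q_n^\prime W_n Q_n)^{-1} \, Q_n^\prime W_n \hat{\Omega}_n W_n Q_n \, (Q_n^\prime W_n Q_n)^{-1} .
$$

The reported standard error is the square root of this quantity.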

In [6]:
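function β_2sls(Y,X,Z)
    
    n=length(Y)

    # Projection matrix onto the column space of the instruments Z
    PZ = Z*( (Z'*Z)\Z' )
    # 2SLS point estimate
    β2SLS = (X'*PZ*X)\(X'*PZ*Y)
    # Ingredients of the heteroskedasticity-robust (sandwich) variance
    Q = Z'*X/n
    W = inv(Z'*Z/n)
    Ω1 = Ω(Y-β2SLS*X,Z)
    var2sls = ( (Q'*W*Q)\(Q'*W*Ω1*W*Q)/(Q'*W*Q) )/n
    std2SLS = sqrt(var2sls)
    
    return β2SLS, std2SLS
    
end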


β_2sls (generic function with 1 method)

<u> Part c </u>: Function that computes the GMM estimator of $\beta$ and its standard error.
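
The two steps implemented in the function below can be summarized as follows. This is a sketch inferred from the code; $Q_n = \frac{1}{n} Z^\prime X$, $\hat{\Omega}_n$ and $\tilde{\Omega}_n$ are notation introduced here. The first step forms $\hat{\Omega}_n$ from the 2SLS residuals and uses its inverse as the weighting matrix:

$$
\hat{\Omega}_n = \frac{1}{n} \sum_{i=1}^{n} \big(Y_i - X_i \hat{\beta}^{2sls}_n\big)^2 Z_i Z_i^\prime , \qquad
\hat{\beta}^{gmm}_n = \big(X^\prime Z \hat{\Omega}_n^{-1} Z^\prime X\big)^{-1} X^\prime Z \hat{\Omega}_n^{-1} Z^\prime Y .
$$

The second step re-estimates the moment variance at the GMM residuals and plugs it into the efficient-GMM variance formula:

$$
\tilde{\Omega}_n = \frac{1}{n} \sum_{i=1}^{n} \big(Y_i - X_i \hat{\beta}^{gmm}_n\big)^2 Z_i Z_i^\prime , \qquad
\widehat{\mathrm{Var}}\big(\hat{\beta}^{gmm}_n\big) = \frac{1}{n} \big(Q_n^\prime \tilde{\Omega}_n^{-1} Q_n\big)^{-1} .
$$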

In [7]:
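function β_GMM(Y,X,Z)
    
    n=length(Y)
    PZ = Z*( (Z'*Z)\Z' )
    Q = Z'*X/n
    # Step 1: 2SLS estimate and the implied estimate of Ω
    β2SLS = (X'*PZ*X)\(X'*PZ*Y)
    Ω1 = Ω(Y-β2SLS*X,Z)

    # Step 2: GMM with the efficient weighting matrix inv(Ω1)
    W_gmm=inv(Ω1);
    β_gmm=(X'*Z*W_gmm*Z'*X)\(X'*Z*W_gmm*Z'*Y)
    # Re-estimate Ω at the GMM residuals for the standard error
    Ω2=Ω(Y-β_gmm*X,Z)
    W_gmm =inv(Ω2)
    stdGMM=sqrt(inv(Q'*W_gmm*Q)/n)
    
    return β_gmm, stdGMM
    
end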

β_GMM (generic function with 1 method)

<u> Part d </u>: Generate 10,000 independent samples of size n from the model. For each sample,
compute a 95% confidence interval for $\beta$.

In [8]:
R=10^4
Bias2SLS=0.0
BiasGMM=0.0
std2SLS=0.0
stdGMM=0.0
In2SLS=0.0
InGMM=0.0
CritVal = quantile(Normal(0,1), .975);

In [9]:
for r=1:R
    Y, X, Z = generate_sample(n)
    b2SLS, s2SLS = β_2sls(Y,X,Z)
    bGMM, sGMM = β_GMM(Y,X,Z)
    Bias2SLS += abs(b2SLS-β)
    BiasGMM += abs(bGMM-β)
    std2SLS += s2SLS
    stdGMM += sGMM
    In2SLS += (β>b2SLS - CritVal*s2SLS)*(β<b2SLS + CritVal*s2SLS)
    InGMM += (β>bGMM - CritVal*sGMM)*(β<bGMM + CritVal*sGMM)
end

<u> Part e</u>: results

In [10]:
table_data = ["Bias" Bias2SLS/R BiasGMM/R;
              "Ave. std.err" std2SLS/R stdGMM/R;
              "Coverage Prob of CI" In2SLS/R InGMM/R]
header=["Statistic", "2SLS", "Two-step efficient GMM"]
pretty_table(table_data; header)

┌─────────────────────┬──────────┬────────────────────────┐
│           Statistic │     2SLS │ Two-step efficient GMM │
├─────────────────────┼──────────┼────────────────────────┤
│                Bias │  0.46081 │               0.376171 │
│        Ave. std.err │ 0.526561 │               0.442985 │
│ Coverage Prob of CI │   0.9561 │                 0.9468 │
└─────────────────────┴──────────┴────────────────────────┘


<u> Part f</u>: interpretation of the result

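* Both 2SLS and GMM produce confidence intervals whose coverage probability is close to the nominal 0.95 (0.9561 and 0.9468, respectively).
* The row labelled "Bias" is the average of $|\hat{\beta} - \beta|$ over replications, i.e. a mean absolute deviation rather than a signed bias. On this measure and on the average standard error, GMM is smaller than 2SLS (0.376 vs. 0.461 and 0.443 vs. 0.527). For these reasons, the GMM estimator is preferred to the 2SLS estimator.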
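
The simulation loop accumulates `abs(b - β)`, so a signed bias cannot be recovered from the running totals. A minimal sketch of how one might instead store the per-replication estimates and summarize them is given below. The helper `mc_summary` and its field names are hypothetical and not part of the notebook, and the inputs in the example are toy numbers, not simulation output.

```julia
using Statistics

# Hypothetical helper (not in the original notebook): summarize Monte Carlo
# draws of an estimator and its standard error against the true value β0.
function mc_summary(estimates::AbstractVector{<:Real},
                    stderrs::AbstractVector{<:Real},
                    β0::Real; crit::Real = 1.96)
    dev   = estimates .- β0
    lower = estimates .- crit .* stderrs
    upper = estimates .+ crit .* stderrs
    return (bias     = mean(dev),              # signed bias
            mad      = mean(abs.(dev)),        # mean absolute deviation
            rmse     = sqrt(mean(dev .^ 2)),   # root mean squared error
            coverage = mean((lower .< β0) .& (β0 .< upper)))
end

# Toy inputs, purely illustrative
s = mc_summary([0.5, 1.5, 1.0, 2.0], [0.3, 0.3, 0.3, 0.3], 1.0)
```

In the notebook, one would fill two vectors of length `R` inside the loop (one for the estimates, one for the standard errors) and pass them to such a function together with `β`.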

<u> Part g</u>: simulations with n = 100

In [52]:
n=100
Bias2SLS=0.0;
BiasGMM=0.0;
std2SLS=0.0;
stdGMM=0.0;
In2SLS=0.0;
InGMM=0.0;

for r=1:R
    Y, X, Z = generate_sample(n)
    b2SLS, s2SLS = β_2sls(Y,X,Z)
    bGMM, sGMM = β_GMM(Y,X,Z)
    Bias2SLS +=abs(b2SLS-β)
    BiasGMM +=abs(bGMM-β)
    std2SLS +=s2SLS
    stdGMM +=sGMM
    In2SLS += (β>b2SLS - CritVal*s2SLS)*(β<b2SLS + CritVal*s2SLS)
    InGMM += (β>bGMM - CritVal*sGMM)*(β<bGMM + CritVal*sGMM)
end

table_data = ["Bias" Bias2SLS/R BiasGMM/R
        "Ave. std.err" std2SLS/R stdGMM/R;
        "Coverage Prob of CI" In2SLS/R InGMM/R;     
]
header=["Statistic","2SLS","Two-step efficient GMM"]
pretty_table(table_data;header)

┌─────────────────────┬──────────┬────────────────────────┐
│           Statistic │     2SLS │ Two-step efficient GMM │
├─────────────────────┼──────────┼────────────────────────┤
│                Bias │ 0.816472 │               0.618646 │
│        Ave. std.err │ 0.882597 │               0.675747 │
│ Coverage Prob of CI │    0.953 │                 0.9358 │
└─────────────────────┴──────────┴────────────────────────┘


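With n = 100, the 2SLS confidence interval keeps its coverage close to 0.95 (0.953), whereas the GMM interval undercovers (0.9358). GMM still has the smaller mean absolute deviation and average standard error, so 2SLS outperforms GMM only in terms of coverage when n is smaller.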
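
To judge whether the coverage rates in the n = 100 table differ from 0.95 by more than simulation noise, one can compare them with the Monte Carlo standard error of a proportion, $\sqrt{p(1-p)/R}$. The sketch below assumes independent replications and uses the nominal level as $p$; the names `mc_se`, `R_reps`, `z_2sls` and `z_gmm` are introduced here for illustration.

```julia
# Monte Carlo standard error of a coverage rate p estimated from R replications
mc_se(p, R) = sqrt(p * (1 - p) / R)

R_reps = 10^4                      # number of replications used in the notebook
se_nominal = mc_se(0.95, R_reps)

# Distance of the reported n = 100 coverage rates from 0.95, in standard errors
z_2sls = (0.953  - 0.95) / se_nominal
z_gmm  = (0.9358 - 0.95) / se_nominal
```

With these inputs the 2SLS rate sits roughly 1.4 Monte Carlo standard errors above 0.95, while the GMM rate is roughly 6.5 below it, which suggests that the GMM undercoverage at n = 100 is not simulation noise.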