# Suggested Solution to PS3

You can optionally activate any environment. In this particular case, I will go to such location and activate it. If you don't use any particular environment in your computer, you can ignore the cell below.

In [1]:
cd(joinpath(pwd(),".."))

using Pkg
Pkg.activate(".") ;

[32m[1m Activating[22m[39m environment at `C:\Users\paulc\.julia\dev\ECON627_2020\Project.toml`


## Load Main Packages

In [2]:
using Distributions, PrettyTables, Random, Parameters

## Define parameters and other constants

In [3]:
const β=0.15
const ρ=0.9;
const Σ=[1.0 ρ; ρ 1;];

## Set Seed

In [4]:
Random.seed!(1234);

## Define functions that generate data and do estimation

In [5]:
function generate_data(n)
    #Define the Multivariate Normal Distribution instance
    mvnormal = MvNormal([0.0; 0.0], Σ)
    

    #DGP
    W = rand(Uniform(0, 1), n)
    Z = -0.5*(W.<0.2)-0.1*(W.>=0.2).*(W.<0.4)+0.1*(W.>=0.4).*(W.<0.6)+1*(W.>=0.6) 
    
    Errors=rand(mvnormal,n)'
    ϵ=Errors[:,1]
    V=Errors[:,2]  
    U = (1.0.+Z).*ϵ
    
    X = 4*Z.^2+V
    Y = β*X+U
    return (Y = Y , X = X , Z = Z)
end

generate_data (generic function with 1 method)

In [6]:
function efficient_instruments(Y,X,Z)
    #Infeasible
    g_infea = (4*Z.^2)./((1.0.+Z).^2)
    
    #Feasible
    Z_1 = 1*(Z.==-0.5)
    Z_2 = 1*(Z.==-0.1)
    Z_3 = 1*(Z.==0.1)
    Z_4 = 1*(Z.==1)

    D = [Z_1 Z_2 Z_3 Z_4]
    P_D = D/(D'*D)*D'
    
    U_hat = Y-(Z'*X)\(Z'*Y)*X
    g_fea = (P_D*X)./(P_D*(U_hat.^2))
    
    
    return (infeasible = g_infea , feasible = g_fea)
end

efficient_instruments (generic function with 1 method)

In [7]:
function estimators(Y,X,Z)
    @unpack feasible , infeasible = efficient_instruments(Y,X,Z)
    
    n=length(Y)
    
    # 2SLS
    β2sls = (Z'*X)\(Z'*Y)
    U = Y - X*β2sls
    zu = Z.*U
    zx = Z'*X
    
    Ω2sls = n*((zx)\(zu'*zu)/(zx')) 
    se2sls = sqrt.(Ω2sls)
    
    # Efficient Infeasible
    βfea = (feasible'*X)\(feasible'*Y)
    zx = feasible'*X
    
    Ωfea = n*inv(zx)
    sefea = sqrt.(Ωfea)

    
    # Efficient Feasible
    βinf = (infeasible'*X)\(infeasible'*Y)
    zx = infeasible'*X
    
    Ωinf = n*inv(zx)
    seinf = sqrt.(Ωinf)
    
    
    return (β2sls = β2sls, se2sls = se2sls, βfea = βfea, sefea = sefea, βinf = βinf, seinf = seinf)
    
end

estimators (generic function with 1 method)

# Results for n=100

In [8]:
n = 100
R = 10^4

sig_2sls = zeros(1,3)
length_2sls = zeros(1,3)
cov_2sls = zeros(1,3)

sig_fea = zeros(1,3)
length_fea = zeros(1,3)
cov_fea = zeros(1,3)

sig_inf = zeros(1,3)
length_inf = zeros(1,3)
cov_inf = zeros(1,3)

α = [0.1 0.05 0.01]


for r=1:R
    Y, X, Z = generate_data(n)
    @unpack β2sls, se2sls, βfea, sefea, βinf, seinf = estimators(Y,X,Z)
     
    upper_2sls = β2sls .- se2sls*quantile.(Normal(), α./2)/sqrt(n)
    lower_2sls = β2sls .+ se2sls*quantile.(Normal(), α./2)/sqrt(n)
    
    upper_inf = βinf .- seinf*quantile.(Normal(), α./2)/sqrt(n)
    lower_inf = βinf .+ seinf*quantile.(Normal(), α./2)/sqrt(n)
    
    upper_fea = βfea .- sefea*quantile.(Normal(), α./2)/sqrt(n)
    lower_fea = βfea .+ sefea*quantile.(Normal(), α./2)/sqrt(n)
    
    sig_2sls .+=  1*(lower_2sls.>0)
    sig_inf .+=  1*(lower_inf.>0)
    sig_fea .+=  1*(lower_fea.>0)
 
    cov_2sls .+= 1*(upper_2sls.>β).*(lower_2sls.<β)
    cov_inf .+= 1*(upper_inf.>β).*(lower_inf.<β)
    cov_fea .+= 1*(upper_fea.>β).*(lower_fea.<β)
 
    length_2sls .+= 1*(upper_2sls .- lower_2sls)
    length_inf .+= 1*(upper_inf .- lower_inf)
    length_fea .+= 1*(upper_fea .- lower_fea)
    
end

In [9]:
sig_2sls = sig_2sls./R
length_2sls = length_2sls./R
cov_2sls = cov_2sls./R

sig_fea = sig_fea./R
length_fea = length_fea./R
cov_fea = cov_fea./R

sig_inf = sig_inf./R
length_inf = length_inf./R
cov_inf = cov_inf./R ;

### Coverage (n=100)


In [10]:
header = ["α" "2SLS"  "Infeasible Efficient" "Feasible Efficient"]
table = [ 0.90  cov_2sls[1] cov_inf[1] cov_fea[1]  ;
0.95  cov_2sls[2] cov_inf[2] cov_fea[3] ;
0.99  cov_2sls[3] cov_inf[2] cov_fea[3] ]

pretty_table(table, header)


┌──────┬────────┬──────────────────────┬────────────────────┐
│[1m    α [0m│[1m   2SLS [0m│[1m Infeasible Efficient [0m│[1m Feasible Efficient [0m│
├──────┼────────┼──────────────────────┼────────────────────┤
│  0.9 │ 0.8868 │               0.8955 │             0.8734 │
│ 0.95 │ 0.9376 │               0.9458 │             0.9749 │
│ 0.99 │  0.984 │               0.9458 │             0.9749 │
└──────┴────────┴──────────────────────┴────────────────────┘


### Prob. of Statistically Significant Results (n=100)

In [11]:
header = ["α" "2SLS"  "Infeasible Efficient" "Feasible Efficient"]
table = [ 0.90  sig_2sls[1] sig_inf[1] sig_fea[1]  ;
0.95  sig_2sls[2] sig_inf[2] sig_fea[3] ;
0.99  sig_2sls[3] sig_inf[2] sig_fea[3] ]

pretty_table(table, header)

┌──────┬────────┬──────────────────────┬────────────────────┐
│[1m    α [0m│[1m   2SLS [0m│[1m Infeasible Efficient [0m│[1m Feasible Efficient [0m│
├──────┼────────┼──────────────────────┼────────────────────┤
│  0.9 │ 0.5514 │               0.7353 │              0.761 │
│ 0.95 │ 0.4477 │               0.6309 │             0.4847 │
│ 0.99 │ 0.2718 │               0.6309 │             0.4847 │
└──────┴────────┴──────────────────────┴────────────────────┘


### Average Length of CI (n=100)

In [12]:
header = ["α" "2SLS"  "Infeasible Efficient" "Feasible Efficient"]
table = [ 0.90  length_2sls[1] length_inf[1] length_fea[1]  ;
0.95  length_2sls[2] length_inf[2] length_fea[3] ;
0.99  length_2sls[3] length_inf[2] length_fea[3] ]

pretty_table(table, header)

┌──────┬──────────┬──────────────────────┬────────────────────┐
│[1m    α [0m│[1m     2SLS [0m│[1m Infeasible Efficient [0m│[1m Feasible Efficient [0m│
├──────┼──────────┼──────────────────────┼────────────────────┤
│  0.9 │ 0.277178 │             0.213275 │           0.207638 │
│ 0.95 │ 0.330278 │             0.254132 │            0.32516 │
│ 0.99 │ 0.434059 │             0.254132 │            0.32516 │
└──────┴──────────┴──────────────────────┴────────────────────┘


# Results for n=400


In [13]:
n = 400
R = 10^4

sig_2sls = zeros(1,3)
length_2sls = zeros(1,3)
cov_2sls = zeros(1,3)

sig_fea = zeros(1,3)
length_fea = zeros(1,3)
cov_fea = zeros(1,3)

sig_inf = zeros(1,3)
length_inf = zeros(1,3)
cov_inf = zeros(1,3)

α = [0.1 0.05 0.01]


for r=1:R
    Y, X, Z = generate_data(n)
    @unpack β2sls, se2sls, βfea, sefea, βinf, seinf = estimators(Y,X,Z)
     
    upper_2sls = β2sls .- se2sls*quantile.(Normal(), α./2)/sqrt(n)
    lower_2sls = β2sls .+ se2sls*quantile.(Normal(), α./2)/sqrt(n)
    
    upper_inf = βinf .- seinf*quantile.(Normal(), α./2)/sqrt(n)
    lower_inf = βinf .+ seinf*quantile.(Normal(), α./2)/sqrt(n)
    
    upper_fea = βfea .- sefea*quantile.(Normal(), α./2)/sqrt(n)
    lower_fea = βfea .+ sefea*quantile.(Normal(), α./2)/sqrt(n)
    
    sig_2sls .+=  1*(lower_2sls.>0)
    sig_inf .+=  1*(lower_inf.>0)
    sig_fea .+=  1*(lower_fea.>0)
 
    cov_2sls .+= 1*(upper_2sls.>β).*(lower_2sls.<β)
    cov_inf .+= 1*(upper_inf.>β).*(lower_inf.<β)
    cov_fea .+= 1*(upper_fea.>β).*(lower_fea.<β)
 
    length_2sls .+= 1*(upper_2sls .- lower_2sls)
    length_inf .+= 1*(upper_inf .- lower_inf)
    length_fea .+= 1*(upper_fea .- lower_fea)
    
end

In [14]:
sig_2sls = sig_2sls./R
length_2sls = length_2sls./R
cov_2sls = cov_2sls./R

sig_fea = sig_fea./R
length_fea = length_fea./R
cov_fea = cov_fea./R

sig_inf = sig_inf./R
length_inf = length_inf./R
cov_inf = cov_inf./R ;

### Coverage (n=400)

In [15]:
header = ["α" "2SLS"  "Infeasible Efficient" "Feasible Efficient"]
table = [ 0.90  cov_2sls[1] cov_inf[1] cov_fea[1]  ;
0.95  cov_2sls[2] cov_inf[2] cov_fea[3] ;
0.99  cov_2sls[3] cov_inf[2] cov_fea[3] ]

pretty_table(table, header)


┌──────┬────────┬──────────────────────┬────────────────────┐
│[1m    α [0m│[1m   2SLS [0m│[1m Infeasible Efficient [0m│[1m Feasible Efficient [0m│
├──────┼────────┼──────────────────────┼────────────────────┤
│  0.9 │  0.904 │               0.9008 │             0.8956 │
│ 0.95 │ 0.9499 │               0.9517 │             0.9885 │
│ 0.99 │ 0.9899 │               0.9517 │             0.9885 │
└──────┴────────┴──────────────────────┴────────────────────┘


### Prob. of Statistically Significant Results (n=400)

In [16]:
header = ["α" "2SLS"  "Infeasible Efficient" "Feasible Efficient"]
table = [ 0.90  sig_2sls[1] sig_inf[1] sig_fea[1]  ;
0.95  sig_2sls[2] sig_inf[2] sig_fea[3] ;
0.99  sig_2sls[3] sig_inf[2] sig_fea[3] ]

pretty_table(table, header)

┌──────┬────────┬──────────────────────┬────────────────────┐
│[1m    α [0m│[1m   2SLS [0m│[1m Infeasible Efficient [0m│[1m Feasible Efficient [0m│
├──────┼────────┼──────────────────────┼────────────────────┤
│  0.9 │ 0.9603 │                0.997 │             0.9966 │
│ 0.95 │ 0.9228 │               0.9933 │             0.9627 │
│ 0.99 │  0.803 │               0.9933 │             0.9627 │
└──────┴────────┴──────────────────────┴────────────────────┘


### Average Length of CI (n=400)

In [17]:
header = ["α" "2SLS"  "Infeasible Efficient" "Feasible Efficient"]
table = [ 0.90  length_2sls[1] length_inf[1] length_fea[1]  ;
0.95  length_2sls[2] length_inf[2] length_fea[3] ;
0.99  length_2sls[3] length_inf[2] length_fea[3] ]

pretty_table(table, header)

┌──────┬──────────┬──────────────────────┬────────────────────┐
│[1m    α [0m│[1m     2SLS [0m│[1m Infeasible Efficient [0m│[1m Feasible Efficient [0m│
├──────┼──────────┼──────────────────────┼────────────────────┤
│  0.9 │ 0.139451 │             0.106327 │           0.105787 │
│ 0.95 │ 0.166166 │             0.126696 │           0.165661 │
│ 0.99 │ 0.218379 │             0.126696 │           0.165661 │
└──────┴──────────┴──────────────────────┴────────────────────┘


While  the  simulated  coverage  probabilities  for  each  of  the  three  methods  are  close  to  the nominal levels of $1−α$,  the two efficient IV methods produce smaller CIs and statistically significant results with higher probability than 2SLS does.The differences between infeasible and feasible efficient IV methods are not substantial.  

The CIs constructed with feasible efficient IVs have coverage probabilities very close to nominal,but there is slight under-coverage.  Also,  they are a bit shorter and therefore appear a bit more powerful than the infeasible CIs constructed with the true efficient IVs.  This is because the feasible efficient IVs rely on an asymptotic result (established in Q1) that estimation of efficient IVs does not change the asymptotic distribution.  Asymptotically this is true,  but in  finite  samples  there  will  be  small  discrepancies,  since  we  ignore  the  contribution  of  the estimation of efficient IVs to the variance.

Compare the results for $n= 100$ with the ones for $n= 400$:  when $n= 100$, there are some distortions to the coverage probabilities when one uses estimated efficient IVs.  Withn= 400,the distortions are practically gone.