# PSET 4

### Juan M Jimenez R.

## Question 1.
### Part A

In [1]:
using Distributions, Random, PrettyTables

In [2]:
#Data generating function
function genData(R,n,k2)
    
    #Parameters 
    λ=.5
    β1=1
    β2=ones(k2,1)
    π1=1
    π2=ones(k2,1)
    
    #Creating random vars 
    Random.seed!(123456789)
    
    Output = zeros(n,k2+3,R)
    
    for h=1:R
        
        data=randn(n,k2+3)

        X2=data[:,1:k2]
        ε=data[:,k2+1]
        v=data[:,k2+2]
        Z=data[:,k2+3]
        
        #True model
        u = λ*v + ε
        X1 = π1*Z + X2*π2 + u
        Y = β1*X1 + X2*β2 + v
        
        Output[:,:,h]=[Y X1 X2 Z]
        
    end
       
    #Output
    #Y=Output[:,1,:]
    #X1=Output[:,2,:]
    #X2=Output[:,3:k2+2,:]
    #Z=Output[:,k2+3,:]
    
    #return Y, X1, X2, Z
    return Output
    
end

genData (generic function with 1 method)
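
For reference, the data generating process coded in `genData` can be written out as below. This is a transcription of the code above; the observation index $i$ is added here only for readability.

$$
\begin{aligned}
u_i &= \lambda v_i + \varepsilon_i, \\
X_{1i} &= \pi_1 Z_i + X_{2i}'\pi_2 + u_i, \\
Y_i &= \beta_1 X_{1i} + X_{2i}'\beta_2 + v_i,
\end{aligned}
$$

with $\lambda=0.5$, $\beta_1=\pi_1=1$, $\beta_2$ and $\pi_2$ vectors of ones of length $k_2$, and $X_{2i}$, $\varepsilon_i$, $v_i$, $Z_i$ drawn as independent standard normals. $X_1$ is endogenous because $\mathrm{Cov}(u_i, v_i)=\lambda\,\mathrm{Var}(v_i)=0.5\neq 0$, while $Z$ enters only the first stage.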

### Part B

In [3]:
#2SLS function 
function est2SLS(Y, X, Z)
    
    PZ=Z*inv(Z'*Z)*Z'
    β_2sls=(X'*PZ*X) \ (X'*PZ*Y)
    
    return β_2sls

end

est2SLS (generic function with 1 method)
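
The projection matrix `PZ` in `est2SLS` is n×n, which becomes costly as n grows. A minimal sketch of an algebraically equivalent form, assuming `Z` has full column rank, works with the first-stage fitted values instead. The name `est2SLS_fitted` and the simulated check data are hypothetical and not part of the original notebook.

```julia
using LinearAlgebra, Random

# Hypothetical alternative to est2SLS: same estimator, no n×n projection matrix.
function est2SLS_fitted(Y, X, Z)
    Xhat = Z * ((Z' * Z) \ (Z' * X))   # first stage: fitted values P_Z * X
    return (Xhat' * X) \ (Xhat' * Y)   # second stage: (X'P_Z X)^{-1} X'P_Z Y
end

# Quick check against the projection-matrix formula on made-up data
Random.seed!(1)
n = 50
Z = randn(n, 3)
X = Z * [1.0 0.0; 0.0 1.0; 1.0 1.0] + randn(n, 2)
Y = X * [1.0, 2.0] + randn(n)

PZ = Z * inv(Z' * Z) * Z'
β_proj = (X' * PZ * X) \ (X' * PZ * Y)
β_fit = est2SLS_fitted(Y, X, Z)

# The two versions should agree up to rounding error
println(maximum(abs.(β_proj .- β_fit)))
```

Both versions compute $(X'P_ZX)^{-1}X'P_ZY$; the second only avoids storing $P_Z$ explicitly.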

In [4]:
#Simulation parameters
R=10^5
n=30
k2=2

β1=1

#Creating data 
data = genData(R,n,k2);

#Calculation of 2SLS estimates
Beta=zeros(R,1)
Bias=zeros(R,1)

for h=1:R
    
    Y=data[:,1,h]
    X1=data[:,2,h]
    X2=data[:,3:k2+2,h]
    Z1=data[:,k2+3,h]
    
    X=[X1 X2]
    Z=[Z1 X2]
    
    Beta[h]=est2SLS(Y, X, Z)[1]
    Bias[h]=abs(Beta[h].-β1)
    
end

#Mean 2sls estimate 
mean(Beta)

0.9825340555123412

In [5]:
#Average absolute bias value
mbias=mean(Bias)

0.17502699055057958

In [6]:
#Simulated bias value
mbias=abs(mean(Beta)-β1)

0.017465944487658813

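The simulated bias, $|\mathrm{mean}(\hat\beta_1)-\beta_1|\approx 0.0175$, is around 1.7% of the true $\beta_1=1$; the mean estimate (0.9825) lies slightly below the true value.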
(Ask Paul: Difference with Mean bias from pset-2?)
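
The two summary numbers above measure different things. `mean(Bias)` averages the absolute deviations $|\hat\beta_{1,h}-\beta_1|$ across replications, so it reflects the spread of the estimator as well as its bias, while `abs(mean(Beta)-β1)` is the absolute value of the bias itself. A toy illustration with made-up estimates (not simulation output) shows how far apart they can be:

```julia
using Statistics

β1 = 1.0
# Hypothetical estimates, placed symmetrically around the true value
toy = [0.8, 1.2, 0.9, 1.1]

mean_abs_dev = mean(abs.(toy .- β1))   # average distance of each estimate from β1
abs_bias     = abs(mean(toy) - β1)     # distance of the average estimate from β1

println("mean absolute deviation: ", mean_abs_dev)
println("absolute bias:           ", abs_bias)
```

Here the deviations cancel in the mean, so the absolute bias is essentially zero while the mean absolute deviation is about 0.15. In general the absolute bias can never exceed the mean absolute deviation.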

### Part C

In [7]:
#Simulation parameters
R=10^5
n=30
k2=2

β1=1

#Creating data 
data = genData(R,n,k2);

#Calculation of 2SLS estimates
Beta=zeros(R,1)
Bias=zeros(R,1)

for h=1:R
    
    Y=data[:,1,h]
    X1=data[:,2,h]
    X2=data[:,3:k2+2,h]
    Z1=data[:,k2+3,h]
    
    X=[X1 X2]
    Z=[ Z1 Z1.^2 Z1.^3 X2 X2.^2 X2.^3 (Z1 .* X2) (Z1.^2 .* X2) (Z1 .* X2.^2) Z1.*X2[:,1].*X2[:,2] ]
    
    Beta[h]=est2SLS(Y, X, Z)[1]
    Bias[h]=abs(Beta[h].-β1)
    
end

#Mean 2sls estimate 
mean(Beta)

1.149672389655874

In [8]:
#Average absolute bias value
mbias=mean(Bias)

0.17589901412308387

In [9]:
#Simulated bias value
mbias=abs(mean(Beta)-β1)

0.14967238965587404

With the larger instrument set, the simulated bias increased to about 15% of the true $\beta_1=1$ (mean estimate 1.1497).

### Part D 

With n=30, the simulated bias increased from roughly 1.7% with l=3 (l being the number of instrumental variables) to approximately 15% with l=16. This happens because, as seen in class, the bias of the 2SLS estimator is directly proportional to $c=\frac{l}{n}$ for all c<1. In this case c was initially 3/30 = 0.1 and then increased to 16/30 ≈ 0.53, which translated into a direct increase in the bias as well.
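
As a quick sanity check on the instrument counts used in this argument, the sketch below rebuilds the two instrument matrices from Parts B and C on an arbitrary random draw (the seed and draw are assumptions for illustration) and reports l and c = l/n.

```julia
using Random

Random.seed!(1)
n, k2 = 30, 2
Z1 = randn(n)        # excluded instrument
X2 = randn(n, k2)    # included exogenous regressors

# Same instrument sets as in Parts B and C
Zb = [Z1 X2]
Zc = [Z1 Z1.^2 Z1.^3 X2 X2.^2 X2.^3 (Z1 .* X2) (Z1.^2 .* X2) (Z1 .* X2.^2) (Z1 .* X2[:,1] .* X2[:,2])]

for (name, Zmat) in (("few IVs", Zb), ("many IVs", Zc))
    l = size(Zmat, 2)   # number of instruments = number of columns
    println(name, ": l = ", l, ", c = l/n = ", round(l / n, digits=3))
end
```

The column counts come out as l = 3 (c = 0.1) and l = 16 (c ≈ 0.533), matching the values used above.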

### Part E

In [10]:
#Simulation parameters
R=10^5
n=100
k2=2

β1=1

i=1
mbiasb=zeros(3,1)
mbiasc=zeros(3,1)

for n = [30, 100, 1000]
        
    #Creating data 
    data = genData(R,n,k2);

    #Calculation of 2SLS estimates
    Betab=zeros(R,1)
    Betac=zeros(R,1)
    Biasb=zeros(R,1)
    Biasc=zeros(R,1)

    for h=1:R

        #Defining the variables
        Y=data[:,1,h]
        X1=data[:,2,h]
        X2=data[:,3:k2+2,h]
        Z1=data[:,k2+3,h]

        X=[X1 X2]

        #Choosing the instruments
        Zb=[Z1 X2]
        Zc=[ Z1 Z1.^2 Z1.^3 X2 X2.^2 X2.^3 (Z1 .* X2) (Z1.^2 .* X2) (Z1 .* X2.^2) Z1.*X2[:,1].*X2[:,2] ]

        #Estimating the 2SLS coefficients and corresponding biases with few IVs
        Betab[h]=est2SLS(Y, X, Zb)[1]
        Biasb[h]=abs(Betab[h].-β1)

        #Estimating the 2SLS coefficients and corresponding biases with many IVs
        Betac[h]=est2SLS(Y, X, Zc)[1]
        Biasc[h]=abs(Betac[h].-β1)

    end
    
    #Simulated bias value for the few IVs estimation
    mbiasb[i]=abs(mean(Betab)-β1)
    
    #Simulated bias value for the many IVs estimation
    mbiasc[i]=abs(mean(Betac)-β1)
    
    i=i+1
end

In [11]:
#Table report
table_data = ["30" mbiasb[1] mbiasc[1];
        "100" mbiasb[2] mbiasc[2];
        "1000" mbiasb[3] mbiasc[3];]
header=["N", "Simulated bias (L=3)", "Simulated bias (L=16)"]
pretty_table(table_data; header)

┌──────┬──────────────────────┬───────────────────────┐
│    N │ Simulated bias (L=3) │ Simulated bias (L=16) │
├──────┼──────────────────────┼───────────────────────┤
│   30 │            0.0174659 │              0.149672 │
│  100 │           0.00528789 │             0.0553607 │
│ 1000 │          0.000456245 │            0.00600002 │
└──────┴──────────────────────┴───────────────────────┘


As can be seen, the bias decreases as n increases, since the ratio $c=\frac{l}{n}$ decreases. However, for a given n, the bias with many IVs (l=16) is higher than with few IVs (l=3) in every case reported.
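
One rough way to look at the relation between bias and c is to divide each simulated bias in the table by its own c = l/n. The sketch below does this with the values copied from the table; the variable names are hypothetical. If the bias were exactly proportional to c with a single constant, the ratios would all coincide, though Monte Carlo noise and finite-sample effects mean they need not match exactly.

```julia
# Simulated biases copied from the table above
ns       = [30, 100, 1000]
bias_l3  = [0.0174659, 0.00528789, 0.000456245]
bias_l16 = [0.149672, 0.0553607, 0.00600002]

for (i, n) in enumerate(ns)
    c3, c16 = 3 / n, 16 / n   # c = l/n for each instrument set
    println("n = ", n,
            ": bias/c (l=3) = ", round(bias_l3[i] / c3, digits=3),
            ", bias/c (l=16) = ", round(bias_l16[i] / c16, digits=3))
end
```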