In [1]:
using Distributions
using LinearAlgebra, Statistics  # norm, mean and std live here on Julia ≥ 0.7

assumptions

In [2]:
β₀ = 10
β₁ = 1
β = [β₀, β₁]
σ² = 1

1

Strict set of Gauss-Markov (GM) assumptions:
* X is deterministic, so x is fixed over repeated samples
* the errors $\mu$ are normally distributed and homoscedastic

Give the small sample and asymptotic properties of the OLS estimator for $\beta$ and for the estimator of the standard errors.

Small sample properties:
* best unbiased estimator
* estimator is normally distributed (this follows from the fact that $\hat{\beta}$ is a linear function of the disturbance vector $\mu$)
* covariance matrix $\sigma^2(X'X)^{-1}$ with an unbiased estimator of $\sigma^2$ given by

$$\hat{\sigma}^2 = \frac{\hat{\mu}'\hat{\mu}}{N-K} = \frac{y'My}{N-K}$$
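
As a rough illustration of the formulas above, the following sketch computes the OLS coefficients, $\hat{\sigma}^2$ and the implied standard errors for one simulated sample. The function name `ols_with_se`, the seed and the sample values are assumptions made for this example and are not part of the original notebook; the syntax targets Julia 1.x.

```julia
using LinearAlgebra, Random

# Hypothetical helper (not from the notebook): OLS estimates, the unbiased
# variance estimator resid'resid/(N-K), and standard errors from σ̂²(X'X)⁻¹.
function ols_with_se(y, X)
    N, K = size(X)
    XtX_inv = inv(X'X)
    β_hat = XtX_inv * (X'y)
    resid = y - X * β_hat                     # residuals
    σ²_hat = dot(resid, resid) / (N - K)      # unbiased estimator of σ²
    se = sqrt.(diag(σ²_hat * XtX_inv))        # estimated standard errors
    return β_hat, σ²_hat, se
end

Random.seed!(1)                               # arbitrary seed, for reproducibility only
N = 25
X_demo = hcat(ones(N), 5 .+ 2 .* randn(N))    # constant plus a regressor with mean 5, sd 2
y_demo = X_demo * [10.0, 1.0] + randn(N)      # β = [10, 1], σ² = 1
β_hat, σ²_hat, se = ols_with_se(y_demo, X_demo)
```

With the true values $\beta = (10, 1)$ and $\sigma^2 = 1$ used here, the estimates should land in the neighbourhood of those values, though a single sample of 25 observations can deviate noticeably.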


Asymptotic properties:
* same under the GM conditions
* $\bar{x}_N$ is asymptotically distributed as $N(\mu,\frac{\sigma^2}{N})$ (here $\mu$ denotes the population mean of $x$, not the error term)

# Setting up a Monte Carlo simulation

Step 1: specify a population, here `Normal(5, 2)` (mean 5, standard deviation 2), and draw the sample once so that X stays deterministic across replications

In [3]:
gen_X(sample_size) = hcat(ones(sample_size), rand(Normal(5, 2), sample_size))

gen_X (generic function with 1 method)

In [4]:
function gen_μ(sample_size)
    return randn(sample_size,1)*sqrt(σ²)
end

gen_μ (generic function with 1 method)

In [5]:
X = gen_X(25)
μ = gen_μ(25)

25×1 Array{Float64,2}:
  0.678191 
  0.221607 
  0.402541 
 -0.0451813
 -0.205249 
 -0.713423 
 -0.707502 
 -0.0461326
 -1.51608  
  0.0589416
 -0.206912 
 -1.01226  
  0.702741 
 -0.715642 
  1.53709  
 -0.5118   
 -1.03741  
 -0.226173 
  0.200182 
 -0.799575 
 -1.37976  
 -0.11869  
 -1.07584  
  1.5083   
 -0.192665 

In [6]:
y = X * β + μ;

Step 2: calculate the statistics and save them (OLS estimator, estimated OLS standard error, t-statistic)

In [7]:
OLS_normal(y, X) = inv(X'X) * X'y

OLS_normal (generic function with 1 method)

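Note that `std_of_x` below does not use the usual sample formula for the standard deviation: it divides by $N$ rather than $N-1$.

Applied to the estimates collected over all replications, this gives the true standard error of the total simulation.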

In [8]:
std_of_x(x) = norm(x .- mean(x))/sqrt(length(x))  # broadcast the subtraction of the scalar mean

std_of_x (generic function with 1 method)
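
The difference between this formula and Julia's built-in `std` can be checked directly. The short sketch below compares the norm-based formula with `std`, with and without the $N-1$ correction. The name `pop_std` and the sample vector are assumptions for this illustration (Julia 1.x syntax).

```julia
using LinearAlgebra, Statistics

# Same formula as std_of_x, renamed here so the block is self-contained.
pop_std(x) = norm(x .- mean(x)) / sqrt(length(x))

x = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

a = pop_std(x)                  # divides by N
b = std(x, corrected=false)     # built-in, also divides by N
c = std(x)                      # built-in default, divides by N - 1
```

Here `a` and `b` agree, while `c` is slightly larger because of the smaller denominator.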

In [9]:
t_test(vec,H₀) = (mean(vec) - H₀)/std(vec)

t_test (generic function with 1 method)
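
Putting steps 1 and 2 together, a Monte Carlo loop could look roughly like the sketch below: X is drawn once and held fixed, while a new disturbance vector is drawn in every replication. The function `run_mc`, the number of replications and the seed are assumptions for this illustration, and the OLS and standard-error formulas are repeated inline so that the block runs on its own under Julia 1.x.

```julia
using LinearAlgebra, Statistics, Random

# Hypothetical Monte Carlo driver (illustrative only): X stays fixed,
# only the disturbances are redrawn in each replication.
function run_mc(X, β, σ², reps)
    N, K = size(X)
    XtX_inv = inv(X'X)
    estimates = zeros(reps, K)       # one row of OLS estimates per replication
    t_stats = zeros(reps, K)         # t-statistics for H₀: coefficient = true value
    for r in 1:reps
        y = X * β + sqrt(σ²) * randn(N)
        b = XtX_inv * (X'y)
        resid = y - X * b
        s² = dot(resid, resid) / (N - K)
        se = sqrt.(diag(s² * XtX_inv))
        estimates[r, :] = b
        t_stats[r, :] = (b - β) ./ se
    end
    return estimates, t_stats
end

Random.seed!(2)                                   # arbitrary seed
X_fixed = hcat(ones(25), 5 .+ 2 .* randn(25))     # drawn once, then held fixed
β_true = [10.0, 1.0]
estimates, t_stats = run_mc(X_fixed, β_true, 1.0, 5_000)

mc_mean = vec(mean(estimates, dims=1))                  # compare with β_true
mc_sd = vec(std(estimates, dims=1, corrected=false))    # simulated standard error
true_se = sqrt.(diag(1.0 * inv(X_fixed'X_fixed)))       # from σ²(X'X)⁻¹
```

If the small-sample properties listed earlier hold, `mc_mean` should be close to `β_true` and `mc_sd` close to `true_se`, up to simulation noise.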

Part 2: lagged dependent variable

With a lagged dependent variable among the regressors, the assumption "X and $\mu$ are independent" has to be relaxed to $E[\mu_t|x_t] = 0$, i.e. the errors are only contemporaneously independent of the explanatory variables.

The OLS estimator becomes:
* Biased: $E[\hat{\beta}|X] = \beta + (X'X)^{-1}X'E[\mu|X]$ with $E[\mu|X] \neq 0$, so $E[\hat{\beta}] = E_X(E[\hat{\beta}|X]) \neq \beta$
* Consistent and asymptotically normally distributed: $\operatorname{plim}\hat{\beta} = \beta + \left(\operatorname{plim}\frac{X'X}{T}\right)^{-1} \operatorname{plim}\frac{X'\mu}{T} = \beta$ because $\operatorname{plim}\frac{X'\mu}{T} = E(x_t\mu_t) = 0$
* $\hat{\sigma}^2 = \frac{\hat{\mu}'\hat{\mu}}{T-k}$ is still a consistent estimator for $\sigma^2$
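
The bias and consistency claims can be checked numerically. The sketch below simulates $y_t = \beta_0 + \beta_1 y_{t-1} + \mu_t$ for a short and a long sample and averages the OLS estimate of $\beta_1$ over many replications. The helper names (`simulate_ar1`, `mean_β₁_hat`), the value $\beta_1 = 0.5$, the sample sizes, the fixed starting value and the seed are all assumptions chosen for illustration; they are not taken from the notebook.

```julia
using Statistics, Random

# Hypothetical helper: simulate T observations of an AR(1) process with
# unit-variance normal errors, starting from the seed value y₀.
function simulate_ar1(β₀, β₁, T, y₀)
    y = zeros(T)
    prev = y₀
    for t in 1:T
        y[t] = β₀ + β₁ * prev + randn()
        prev = y[t]
    end
    return y
end

# Average OLS estimate of β₁ over `reps` simulated samples of length T.
function mean_β₁_hat(β₀, β₁, T, reps)
    y₀ = β₀ / (1 - β₁)                 # start at the stationary mean (a simplification)
    est = zeros(reps)
    for r in 1:reps
        y = simulate_ar1(β₀, β₁, T, y₀)
        X = hcat(ones(T), vcat(y₀, y[1:end-1]))   # constant and lagged y
        est[r] = ((X'X) \ (X'y))[2]
    end
    return mean(est)
end

Random.seed!(3)                        # arbitrary seed
small_T = mean_β₁_hat(10.0, 0.5, 10, 5_000)
large_T = mean_β₁_hat(10.0, 0.5, 500, 1_000)
```

Under these assumptions the average for the short sample should fall visibly below 0.5, while the long-sample average should be close to it, which is what a biased but consistent estimator would produce.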

In [11]:
β₀ = 10.
β₁ = 0.
β = [β₀, β₁]
σ² = 1
y₀ = rand(Normal(β₀/(1-β₁), sqrt(σ²/(1-β₁^2))))
sample_size = 5
μ = randn(sample_size,1)*sqrt(σ²)

5×1 Array{Float64,2}:
 -1.82753  
 -0.693731 
 -0.60137  
  0.0725159
  0.459709 

We need a starting value (seed) $y_0$ to kick-start the recursion $y_t = \beta_0 + \beta_1 y_{t-1} + \mu_t$

In [14]:
function gen_y(β, T, y₀)
    β₀, β₁ = β
    y = zeros(T)
    
    μ = randn()
    y[1] = β₀ + β₁*y₀ + μ
    
    for t = 2:T
        μ = randn()
        y[t] = β₀ + β₁*y[t-1] + μ
    end
    y
end

gen_y (generic function with 1 method)

In [19]:
@time gen_y(β, 10, y₀)

  0.000008 seconds (131 allocations: 7.875 KB)


10-element Array{Float64,1}:
  9.04671
 10.1256 
  9.00689
 10.3392 
  9.41033
  9.91178
  9.76645
  9.60674
 10.1689 
  9.62597

In [23]:
function gen_y!(y, β, y₀)
    β₀, β₁ = β
    
    μ = randn()
    y[1] = β₀ + β₁*y₀ + μ
    
    for t = 2:length(y)
        μ = randn()
        y[t] = β₀ + β₁*y[t-1] + μ
    end
end

gen_y! (generic function with 2 methods)

In [25]:
y = zeros(10)
@time gen_y!(y, β, y₀)
y

  0.000005 seconds (4 allocations: 160 bytes)


10-element Array{Float64,1}:
  8.91937
 10.3463 
  9.12763
  9.96545
  9.13978
  9.74982
 11.7051 
  9.2925 
  9.61794
 10.7179 

In [17]:
# y_t = β₀ + β₁*y₀ + μ[1]

1×1 Array{Float64,2}:
 10.6383

In [18]:
#println(μ)
function one_step_y(y_t_min_1, mu)
    y_t = β₀ + β₁*y_t_min_1 + mu
    return y_t
end



one_step_y (generic function with 1 method)

In [20]:
# one_step_y(y₀, μ[1])

1×1 Array{Float64,2}:
 10.6383

In [13]:
sample_size = 5

5

Eventually we drop the seed from the sample because it carries no information; we also keep the seed fixed across all simulations.

In [14]:
# seed = y₀
# Y = [seed]
# Y = vcat(Y,one_step_y(Y[end]))

LoadError: MethodError: no method matching one_step_y(::Array{Float64,2})
Closest candidates are:
  one_step_y(::Any, ::Any) at In[11]:3

In [15]:
μ[1]

-0.946437284256469

In [21]:
seed = y₀
Y = [seed]

1-element Array{Array{Float64,2},1}:
 [10.6981]

In [22]:
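function gen_Y_lag(Y, μ)
    # Y and μ are passed in as arguments: assigning to Y inside the function
    # made it a local variable, which caused the UndefVarError.
    for i in 1:length(μ)
        # one_step_y takes the previous value and the current disturbance
        Y = vcat(Y, one_step_y(Y[end], μ[i]))
    end
    return Y
end

gen_Y_lag (generic function with 1 method)

In [23]:
gen_Y_lag([seed], μ);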