# Probit Maximum Likelihood

In this homework you will implement the maximum likelihood estimator for the probit model. As a reminder, the model is defined as follows:
    $$
    \begin{align}  
    y_i  &\in \{0,1\} \\
    \Pr\{y_i=1\} &= \Phi(x_i \beta) \\
    L(\beta)   & = \prod_{i=1}^N  \Phi(x_i \beta)^{y_i} (1-\Phi(x_i \beta))^{1-y_i} \\
    \beta  & \in \mathbb{R}^k \\
    x_i  & \sim N\left([0,0,0],\left[ \begin{array}{ccc} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1\end{array} \right] \right) \\
    k & = 3 
    \end{align}
    $$
    
where $\Phi$ is the standard Normal cdf. Think of $x_i$ as a row vector. Proceed as follows:

1. Define a data generating function with default argument `N=10000` that generates `N` simulated data points from this model. Generate the data using $\beta=[1,1.5,-0.5]$. The function should return a `Dict` as outlined in the code.
1. Define the log likelihood function, $l(\beta) = \log L(\beta)$.
1. Write a function `plotLike` to plot the log likelihood function for different parameter values. Follow the outline of that function.
1. Define the function `maximize_like`. This should optimize your log likelihood function.
1. (Optional) Define the gradient of the log likelihood function and use it in another optimization `maximize_like_grad`.
1. (Optional) Define the Hessian of the log likelihood function and use it in another optimization `maximize_like_grad_hess`.
1. (Optional) Use the Hessian of the log likelihood function to compute the standard errors of your estimates, and use it in `maximize_like_grad_se`.
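
For reference, taking logs of $L(\beta)$ and differentiating gives the expressions that steps 2, 5 and 7 rely on. This is a worked sketch, not part of the original assignment text: $\phi$ (the standard Normal pdf) and $H$ (the Hessian of $l$) are symbols introduced here, and the standard error formula is the usual inverse-Hessian estimate evaluated at the maximizer $\hat\beta$.

$$
\begin{align}
l(\beta) &= \sum_{i=1}^N \left[ y_i \log \Phi(x_i \beta) + (1-y_i) \log\left(1-\Phi(x_i \beta)\right) \right] \\
\frac{\partial l(\beta)}{\partial \beta} &= \sum_{i=1}^N \frac{\phi(x_i \beta)\left(y_i - \Phi(x_i \beta)\right)}{\Phi(x_i \beta)\left(1-\Phi(x_i \beta)\right)} \, x_i' \\
\widehat{\mathrm{se}}(\hat\beta_j) &= \sqrt{\left[ \left(-H(\hat\beta)\right)^{-1} \right]_{jj}}
\end{align}
$$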

## Tests

* The code comes with a test suite that you should fill out. 
* There are some example tests; make those work and consider adding others.
* Please do not change anything in the file structure.
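
As an illustration of the kind of test one might add, here is a minimal sketch using Julia's `Test` standard library. It does not reproduce the homework's actual test suite: `loglik_sketch` is a hypothetical stand-in for your own log likelihood, and the four-observation data set is made up so that the expected value can be checked by hand.

```julia
using Test, Distributions, LinearAlgebra

# Stand-in for the homework's log likelihood so the sketch runs on its own
# (hypothetical name; columns of X are observations).
function loglik_sketch(beta, X, y)
    dist = Normal()
    return sum((y[i] == 1 ? logcdf(dist, dot(X[:, i], beta)) :
                            logccdf(dist, dot(X[:, i], beta))) for i in eachindex(y))
end

# Small hand-written data set (illustrative values only).
X = [0.5 -1.0 0.3 1.2; 0.1 0.4 -0.7 0.0; -0.3 0.8 0.2 -1.1]
y = [1, 0, 0, 1]

@testset "log likelihood sanity checks" begin
    # At beta = 0 every observation has probability 1/2, so l(0) = N * log(1/2).
    @test loglik_sketch(zeros(3), X, y) ≈ length(y) * log(0.5)
    # A log likelihood of binary outcomes is never positive.
    @test loglik_sketch([1.0, 1.5, -0.5], X, y) <= 0
end
```

The first check works because $\Phi(0) = 1/2$ regardless of the data, which makes it a convenient test that needs no simulation.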

In [76]:
using Distributions, Optim, PyPlot, DataFrames
using LinearAlgebra   # provides dot and I

# data generating function
function makeData(n::Int=10000, beta::Vector=[1.0, 1.5, -0.5], k::Int=3)
    X = rand(MvNormal(zeros(k), I), n)   # k x n matrix; each column is one draw of x_i
    y = zeros(Int, n)                    # preallocate y
    for i in 1:n                         # draw y_i ~ Bernoulli(Phi(x_i * beta))
        y[i] = rand(Bernoulli(cdf(Normal(), dot(X[:, i], beta))))
    end
    # return a Dict with beta, numobs, X, y and the distribution of the error term
    return Dict("beta" => beta, "numobs" => n, "X" => X, "y" => y, "dist" => Normal())
end



makeData (generic function with 4 methods)

In [77]:
makeData()

Dict{String,Any} with 5 entries:
  "y"      => [1,0,0,1,0,1,0,0,1,1  …  1,0,0,0,0,1,1,1,1,0]
  "X"      => [0.506386 0.179857 … 0.107167 -0.330175; 0.431269 -0.273457 … 1.0…
  "numobs" => 10000
  "dist"   => Distributions.Normal{Float64}(μ=0.0, σ=1.0)
  "beta"   => [1.0,1.5,-0.5]

In [82]:
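function loglik(beta::Vector, d::Dict)
    l = 0.0   # Float64 accumulator keeps the loop type-stable
    for i in 1:d["numobs"]
        xb = dot(d["X"][:, i], beta)   # linear index x_i * beta
        if d["y"][i] == 1
            l += logcdf(d["dist"], xb)    # log Phi(x_i * beta)
        else
            l += logccdf(d["dist"], xb)   # log(1 - Phi(x_i * beta)), stable in the tail
        end
    end
    return l
end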

loglik (generic function with 2 methods)
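
For the optional gradient step, one possible sketch follows. It assumes the same `Dict` layout that `makeData` returns (columns of `d["X"]` are observations) and implements the score $\sum_i \frac{\phi(x_i\beta)(y_i - \Phi(x_i\beta))}{\Phi(x_i\beta)(1-\Phi(x_i\beta))} x_i'$. The names `loglik_sketch`, `grad_loglik` and `fd_grad`, and the tiny data set, are illustrative and not part of the homework code; the finite-difference comparison is only there to check the analytic formula.

```julia
using Distributions, LinearAlgebra

# Log likelihood with the same data layout as above (hypothetical stand-in).
function loglik_sketch(beta::Vector, d::Dict)
    l = 0.0
    for i in 1:d["numobs"]
        xb = dot(d["X"][:, i], beta)
        l += d["y"][i] == 1 ? logcdf(d["dist"], xb) : logccdf(d["dist"], xb)
    end
    return l
end

# Analytic gradient (score) of the probit log likelihood.
function grad_loglik(beta::Vector, d::Dict)
    g = zeros(length(beta))
    for i in 1:d["numobs"]
        x = d["X"][:, i]
        xb = dot(x, beta)
        P = cdf(d["dist"], xb)
        w = (d["y"][i] - P) * pdf(d["dist"], xb) / (P * (1 - P))
        g .+= w .* x
    end
    return g
end

# Central finite differences, used only to check the analytic gradient.
function fd_grad(f, beta::Vector; h=1e-6)
    g = zeros(length(beta))
    for j in eachindex(beta)
        e = zeros(length(beta))
        e[j] = h
        g[j] = (f(beta + e) - f(beta - e)) / (2h)
    end
    return g
end

# Small hand-written data set (illustrative values, not from the homework).
X = [0.5 -1.0 0.3 1.2; 0.1 0.4 -0.7 0.0; -0.3 0.8 0.2 -1.1]
d_small = Dict("numobs" => 4, "X" => X, "y" => [1, 0, 0, 1], "dist" => Normal())
b0 = [0.2, -0.1, 0.3]

g_analytic = grad_loglik(b0, d_small)
g_numeric = fd_grad(b -> loglik_sketch(b, d_small), b0)
println(maximum(abs.(g_analytic - g_numeric)))   # should be close to zero
```

Note that the weight `w` divides by $\Phi(1-\Phi)$, which can underflow for very large $|x_i\beta|$; a production version would want to guard against that.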

In [81]:
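function plotLike()
    d = makeData()
    # log likelihood as a function of beta[1], holding beta[2] and beta[3] at their true values
    l1(x) = loglik([x, d["beta"][2], d["beta"][3]], d)
    grid = collect(0.0:0.05:2.0)   # grid around the true value beta[1] = 1
    plot(grid, [l1(x) for x in grid])
    xlabel("beta[1]")
    ylabel("log likelihood")
end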

-3406.366194758875

In [83]:
d = makeData()
loglik(d["beta"],d)

-3384.672588244844
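
The next step, `maximize_like`, could look roughly like the sketch below. It is self-contained and makes several assumptions that are not in the homework outline: data are simulated inline rather than through `makeData`, the names `loglik_sketch`, `beta_true` and `beta_hat` are hypothetical, the seed is arbitrary, and Nelder-Mead is just one derivative-free choice among the methods Optim offers. Because Optim minimizes, the objective is the negative log likelihood.

```julia
using Distributions, Optim, LinearAlgebra, Random

Random.seed!(1234)   # arbitrary seed, for reproducibility only

# Simulate probit data; columns of X are observations, as in makeData.
beta_true = [1.0, 1.5, -0.5]
n = 10_000
X = randn(3, n)
y = [rand() < cdf(Normal(), dot(X[:, i], beta_true)) ? 1 : 0 for i in 1:n]

# Log likelihood summed over observations (hypothetical stand-in for loglik).
function loglik_sketch(beta, X, y)
    dist = Normal()
    xb = X' * beta   # vector of linear indices x_i * beta
    return sum((y[i] == 1 ? logcdf(dist, xb[i]) : logccdf(dist, xb[i])) for i in eachindex(y))
end

# Optim minimizes, so pass the negative log likelihood.
res = optimize(b -> -loglik_sketch(b, X, y), zeros(3), NelderMead())
beta_hat = Optim.minimizer(res)
println(beta_hat)   # should land near beta_true
```

With the gradient available, a gradient-based method such as `BFGS()` would typically need fewer function evaluations than Nelder-Mead.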