# Binary Choice and Censored Models

This notebook uses MLE to estimate Probit, truncated and censored models. 

## Load Packages and Extra Functions

The key functions are from the (local) `FinEcmt_OLS`, `FinEcmt_ProbitTobit` and `FinEcmt_MLEGMM` modules.

In [1]:
MyModulePath = joinpath(pwd(),"src")
!in(MyModulePath,LOAD_PATH) && push!(LOAD_PATH,MyModulePath)
using FinEcmt_OLS, FinEcmt_ProbitTobit
using FinEcmt_MLEGMM: MLE

In [2]:
#=
include(joinpath(pwd(),"src","FinEcmt_OLS.jl"))
using .FinEcmt_OLS, .FinEcmt_ProbitTobit
using .FinEcmt_MLEGMM: MLE
=#

In [3]:
using DelimitedFiles, Statistics

# Probit

This section estimates probit and linear probability models for binary data (typically, 0 or 1).
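
For reference, a sketch of the two models in their textbook form (assuming that `ProbitLL` follows the standard definition, which is not verified here): the linear probability model sets $\Pr(y_t=1|x_t) = x_t'b$ and is estimated by OLS, while the probit model uses the standard normal cdf $\Phi$ and is estimated by maximising the log-likelihood function

$$
\Pr(y_t=1|x_t) = \Phi(x_t'b), \qquad
\ln L = \sum_{t=1}^{T} \left[ y_t \ln \Phi(x_t'b) + (1-y_t) \ln\left(1-\Phi(x_t'b)\right) \right].
$$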

## Loading Data

In [4]:
x = readdlm("Data/transportEd.txt",skipstart=1)
T = size(x,1)
(auto,dtime,constant) = (x[:,4],x[:,3],ones(T))
x      = [constant dtime]
xNames = ["c","dtime"]

2-element Vector{String}:
 "c"
 "dtime"

## Estimate

In [5]:
b_ols, = OlsGM(auto,x)
printmat("OLS estimates")
printmat(b_ols;rowNames=xNames)

LLtFun_p(par,y,x) = ProbitLL(par,y,x)[1]
(b_prob,_,_,std_sandw,LL1) = MLE(LLtFun_p,b_ols,auto,x)
t_prob = b_prob./std_sandw

(LL0,pHat) = BinLLConst(auto.>0.5)
R2  = 1 - sum(LL1)/LL0
predHat = ProbitLL(b_prob,auto,x)[2] .> 0.5

(R2_pred,cTab) = BinaryChoiceR2pred(auto.>0.5,predHat)

printlnPs("\nProbit parameter estimates, t-stat: ")
printmat([b_prob t_prob])
printlnPs("\nMcFadden's R2 ",R2,"\nR2_pred ",R2_pred)

OLS estimates

c         0.485
dtime     0.007


Probit parameter estimates, t-stat: 
    -0.064    -0.162
     0.030     3.109


McFadden's R2      0.576  
R2_pred      0.800
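
McFadden's R2 compares the log-likelihood of the fitted model with that of a model containing only a constant. The following is a minimal sketch of that calculation, not the implementation of `BinLLConst`: the function name `binLLConstSketch` and all `_demo` values are made up for illustration.

```julia
# Hypothetical helper (an assumption, not the module's code):
# log-likelihood of a constant-only model for binary data y
function binLLConstSketch(y)
    T    = length(y)
    pHat = sum(y)/T                                   #sample fraction of y = 1
    LL0  = T*(pHat*log(pHat) + (1-pHat)*log(1-pHat))  #sum of Bernoulli log-likelihoods
    return LL0, pHat                                  #NaN if pHat is exactly 0 or 1
end

y_demo   = [true,true,true,false,false,true,false,true]   #made-up binary outcomes
LL1_demo = -3.2                                            #made-up log-likelihood of a fitted model

(LL0_demo,pHat_demo) = binLLConstSketch(y_demo)
R2_demo = 1 - LL1_demo/LL0_demo                            #McFadden's R2
println("pHat: ",pHat_demo,"  LL0: ",LL0_demo,"  McFadden's R2: ",R2_demo)
```

A fitted model that is no better than the constant-only model gives a value of 0, and values closer to 1 indicate a larger improvement in the log-likelihood.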


# Truncated Model

The truncated model applies when the data $(y,x)$ is unavailable whenever $y < c$, where $c$ is some (known) threshold.
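
As a sketch of what the estimation maximises, assume the textbook setup $y_t = x_t'b + u_t$ with $u_t \sim N(0,\sigma^2)$ (that `TruncRegrLL` uses exactly this form is an assumption). The density of an observed $y_t$ is then the normal density rescaled by the probability of being above the threshold, so the log-likelihood contribution of observation $t$ is

$$
\ln L_t = -\ln\sigma + \ln\phi\left(\frac{y_t - x_t'b}{\sigma}\right) - \ln\left[1-\Phi\left(\frac{c - x_t'b}{\sigma}\right)\right],
$$

where $\phi$ and $\Phi$ are the standard normal pdf and cdf. The parameters are $(b,\sigma)$.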

## Loading the Data

The next cells replicate an old example from Hill et al. (2008), Table 16.8. See the lecture notes for more details.

### A remark on the code
The data set contains many different variables. To import them with their correct names, we create a named tuple of them by using the function `PutDataInNT()` from the `FinEcmt_OLS` module. (This is convenient, but not important for the focus of this notebook. An alternative is to use the `DataFrames.jl` package.)
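
To illustrate the idea, here is a minimal sketch of how a data matrix and its header row could be combined into a named tuple. It is not the code of `PutDataInNT()`: the function name `putDataInNTSketch` and the `_nt_demo` data are made up for illustration.

```julia
# Hypothetical helper (an assumption, not the module's code):
# combine a TxK data matrix and a 1xK header into a NamedTuple of column vectors
function putDataInNTSketch(x,header)
    names = Tuple(Symbol.(vec(header)))           #column names as symbols
    cols  = Tuple(x[:,i] for i in axes(x,2))      #one vector per column
    return NamedTuple{names}(cols)
end

x_nt_demo      = [1.0 10.0; 2.0 20.0; 3.0 30.0]  #made-up data, 3 observations
header_nt_demo = ["hours" "educ"]                 #1x2 matrix, the shape readdlm(...,header=true) returns
X_nt_demo      = putDataInNTSketch(x_nt_demo,header_nt_demo)

println(keys(X_nt_demo))
println(X_nt_demo.hours)
```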

In [6]:
(x,header) = readdlm("Data/mrozEd.txt",header=true)
X          = PutDataInNT(x,header)                         #NamedTuple with X.wage, X.exper, etc

c = ones(size(x,1))                                       #constant, used in the regressions

println("The variables in X (use as, for instance, X.wage): ")
printmat(keys(X))

y = X.hours
T = length(y)
x = [ones(T) X.educ X.exper X.age X.kidsl6];

The variables in X (use as, for instance, X.wage): 
(:taxableinc, :federaltax, :hsiblings, :hfathereduc, :hmothereduc, :siblings, :lfp, :hours, :kidsl6, :kids618, :age, :educ, :wage, :wage76, :hhours, :hage, :heduc, :hwage, :faminc, :mtr, :mothereduc, :fathereduc, :unemployment, :bigcity, :exper)



In [7]:
vv = y .> 0
b_ols,res, = OlsGM(y[vv],x[vv,:])

par0 = [b_ols;std(res)]                         #starting values: OLS coefs and residual std
LLtFun_t(par,y,x) = TruncRegrLL(par,y,x)[1]     #truncated
(b_trunc,_,_,std_sandw,_) = MLE(LLtFun_t,par0,y[vv],x[vv,:])
t_trunc = b_trunc./std_sandw
printlnPs("Truncated MLE: ")
printmat([b_ols b_trunc[1:end-1] t_trunc[1:end-1]];colNames=["OLS","MLE","t-stat (MLE)"])


Truncated MLE: 
       OLS       MLE  t-stat (MLE)
  1829.746  1920.045     4.534
   -16.462   -22.716    -1.062
    33.936    46.748     6.050
   -17.108   -25.245    -3.015
  -305.309  -491.158    -2.225



# Censored Model

Similar to the truncated case, except that the data is recorded as $(y,x) = (c,x)$ when $y < c$: the outcome is censored at $c$, but $x$ is still observed. This means that we have more information than in the truncated case.
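
The difference between the two sampling schemes can be illustrated with a small made-up data set. This is only a sketch: all `_demo` names and values are hypothetical and unrelated to the data used in this notebook.

```julia
c_demo     = 0.0                              #known threshold (made up)
ystar_demo = [-2.0, 1.5, -0.5, 3.0, 0.7]      #made-up outcomes before truncation/censoring
xs_demo    = [1.0, 2.0, 3.0, 4.0, 5.0]        #made-up regressor

keep_demo = ystar_demo .>= c_demo
y_trunc   = ystar_demo[keep_demo]             #truncated: observations with y < c are dropped
x_trunc   = xs_demo[keep_demo]                #...and so is their x

y_cens = max.(ystar_demo,c_demo)              #censored: y < c is recorded as c
x_cens = xs_demo                              #...but x is still observed

println("truncated sample: ",length(y_trunc)," obs, censored sample: ",length(y_cens)," obs")
```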

We apply this to the same model as in the truncated case. Again, see Hill et al. (2008), Table 16.8 and the lecture notes for more details.
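
As a sketch, assume the textbook (Tobit) setup $y_t^* = x_t'b + u_t$ with $u_t \sim N(0,\sigma^2)$, where we observe $y_t = \max(y_t^*,c)$ (that `CensRegrLL` uses exactly this form is an assumption). Uncensored observations contribute the normal density, while censored observations contribute the probability of being at or below the threshold:

$$
\ln L_t = \mathbb{1}(y_t > c)\left[-\ln\sigma + \ln\phi\left(\frac{y_t - x_t'b}{\sigma}\right)\right]
        + \mathbb{1}(y_t \le c)\,\ln\Phi\left(\frac{c - x_t'b}{\sigma}\right),
$$

where $\phi$ and $\Phi$ are the standard normal pdf and cdf. The parameters are $(b,\sigma)$.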

In [8]:
LLtFun_c(par,y,x) = CensRegrLL(par,y,x)[1]        #censored
(b_cens,_,_,std_sandw,_) = MLE(LLtFun_c,par0,y,x)
t_cens = b_cens./std_sandw
printlnPs("Censored MLE: ")
printmat([b_cens t_cens];colNames=["coef","t-stat"])

Censored MLE: 
      coef    t-stat
  1349.900     3.443
    73.290     3.595
    80.536    13.066
   -60.768    -9.123
  -918.911    -8.005
  1133.693    26.187

