# Maximum Likelihood Estimator
------
#### **Written by Jinkyu Kim (Dept. of Finance, Business School, Hanyang Univ.) Contact: jkyu126@gmail.com** 

## Objective

- Explain the concepts of MLE

- Using a general-purpose optimization function in R, I perform MLE.

- Repeat the former task with the built-in MLE function in R.

## MLE Concepts 
----------

### Preliminary

- Consider the basic economic model $y_i = \alpha+\beta x_i + \epsilon_i$.
- Suppose the population disturbance $\epsilon_i$ follows a normal distribution.
- Then its sample analog $e_i$ is also treated as normally distributed, $N(0, {\sigma_e}^2)$.
- Based on the estimated model $y_i = a + bx_i + e_i$, I calculate the residual $e_i = y_i - a - bx_i$ from the data.
- The probability density of each observed $e_i$ is calculated by applying $\mu = 0$ and $\sigma = \sigma_e$ to the normal probability density function.

$$  
  f(e_i) = \frac{1}{\sqrt{2\pi{\sigma_e}^2}} 
  \exp\left(-\frac{e_i ^2}{2{\sigma_e}^2}\right)
$$
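
To make the density formula concrete, here is a minimal sketch that codes it by hand and compares it with R's built-in `dnorm`. The helper name `normal_pdf`, the residual values, and the value of `sigma_e` are made up for illustration.

```r
# Hand-coded normal density with mean 0, following the formula above
normal_pdf <- function(e, sigma) {
  1 / sqrt(2 * pi * sigma^2) * exp(-e^2 / (2 * sigma^2))
}

e <- c(-1.5, 0, 0.7)  # hypothetical residuals
sigma_e <- 2          # hypothetical standard deviation

manual  <- normal_pdf(e, sigma_e)
builtin <- dnorm(e, mean = 0, sd = sigma_e)

# Both vectors should agree up to floating-point error
all.equal(manual, builtin)
```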

### Likelihood Function

- The likelihood function $L$ is the joint probability density of the observed data $e_1, e_2, \cdots, e_n$, viewed as a function of the parameters $\theta$ of the specified distribution. Here the residuals depend on $a$ and $b$, so choosing $a$ and $b$ fixes the likelihood.

- Assuming the observations are independent, the joint density needs no conditional terms: it is simply the product of the individual densities. The final equation is as follows.
$$
\begin{align*}
  L(\theta | e_1,e_2,\cdots,e_n) 
  &= f(e_1,e_2,\cdots,e_n | \theta) \\
  &= f(e_1 | \theta) f(e_2 | \theta) \cdots f(e_n | \theta) \\
  &= \prod_{i=1}^{n} f(e_i | \theta)
\end{align*}
$$
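
As an illustration of the product form, the sketch below evaluates the likelihood at two candidate parameter pairs on simulated data. The seed, the sample, the true values $a = 0.5$, $b = 2$, and the fixed `sigma = 1` are all assumptions made for this example, not part of the data used later.

```r
set.seed(1)
x <- rnorm(20)
y <- 0.5 + 2 * x + rnorm(20)  # simulated data with true a = 0.5, b = 2

# Likelihood = product of the individual normal densities of the residuals
likelihood <- function(a, b, sigma = 1) {
  e <- y - a - b * x
  prod(dnorm(e, mean = 0, sd = sigma))
}

L_near <- likelihood(0.5, 2)  # candidate equal to the true parameters
L_far  <- likelihood(0, 0)    # candidate far from the true parameters

# The candidate near the truth should be the more likely one
L_near > L_far
```

MLE simply searches for the pair $(a, b)$ that makes this product as large as possible.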

### Wrap-Up

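- Taking the log of the likelihood function turns the product into a sum, which is a lot easier to work with, not only for me but also for my computer: a product of many small densities quickly underflows to zero in floating point, while a sum of logs does not.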
$$
\begin{align*}
  \log L=\sum_{i=1}^n \log f(e_i | \theta) \\
\end{align*}
$$
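
The numerical side of this point can be seen in a short sketch. The 5000 standard-normal residuals and the seed are assumptions for illustration only.

```r
set.seed(42)
e <- rnorm(5000)  # hypothetical residuals

# Product of 5000 densities, each below 0.4: underflows in double precision
raw_likelihood <- prod(dnorm(e))

# Same information on the log scale: a sum of moderate negative numbers
log_likelihood <- sum(dnorm(e, log = TRUE))

raw_likelihood              # 0, so it cannot be compared across parameters
is.finite(log_likelihood)   # TRUE
```

Passing `log = TRUE` to `dnorm` returns the log-density directly, which avoids taking the log of a value that has already underflowed.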

- In R, I use the **dnorm** function to evaluate the normal density at each residual.
- The R code is as follows. The general-purpose optimization function **optim** minimizes its objective by default, so I define the objective as the negative log-likelihood (i.e., $-\log L$). Note that the code does not estimate $\sigma_e$ as a separate parameter; it plugs in the sample standard deviation of the residuals. The resulting estimates are $a \approx 0.0296$, $b \approx 1.0350$.



```r
# Reading Data
mydat <- read.table(
  "http://www.kellogg.northwestern.edu/faculty/petersen/htm/papers/se/test_data.txt",
  col.names = c("firm", "year", "x", "y"))
```

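```r
# MLE function setting: negative log-likelihood of the normal model
normal.likelihood <- function(beta, data) {
  a <- beta[1]  # intercept
  b <- beta[2]  # slope
  residuals <- data$y - a - b * data$x
  # sigma is not a free parameter here: use the residual standard deviation
  log.likelihood <- sum(dnorm(residuals, mean = 0, sd = sd(residuals), log = TRUE))
  # optim() minimizes, so return the negative log-likelihood
  return(-log.likelihood)
}

# MLE using General Optimization Function
optim(par = c(0, 1), fn = normal.likelihood, data = mydat)
```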

```
$par
[1] 0.0295916 1.0349579

$value
[1] 10572.6

$counts
function gradient 
      35       NA 

$convergence
[1] 0

$message
NULL
```
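
One way to sanity-check a fit like this is to compare it with ordinary least squares: under the normal model, the $a$ and $b$ that maximize the likelihood are the least-squares estimates. The sketch below does this on simulated data. The data frame `sim`, the seed, the true coefficients, and the name `neg_loglik` are assumptions for illustration, not the data or code used above.

```r
set.seed(123)
n <- 500
sim <- data.frame(x = rnorm(n))
sim$y <- 0.03 + 1.03 * sim$x + rnorm(n, sd = 2)  # simulated data

# Negative log-likelihood with sigma set to the residual standard deviation
neg_loglik <- function(beta, data) {
  e <- data$y - beta[1] - beta[2] * data$x
  -sum(dnorm(e, mean = 0, sd = sd(e), log = TRUE))
}

fit_mle <- optim(par = c(0, 1), fn = neg_loglik, data = sim)
fit_ols <- coef(lm(y ~ x, data = sim))

# The two rows should agree up to the optimizer's tolerance
rbind(mle = fit_mle$par, ols = unname(fit_ols))
```

A `$convergence` value of 0 in the `optim` output indicates that the optimizer reports successful convergence.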


## MLE estimation using built-in MLE function
----------
- Now I check whether my results are correct using the built-in *mle* function.
- Since the MLE process is explained in the section above, this section only shows the code and checks the answer.
- The function *mle* is contained in the stats4 package. Be careful with the initial values: unlike the **optim** version, the likelihood function here takes each parameter as a separate scalar argument (a and b), and the starting values are passed as a named list of scalars rather than as a single vector.

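```r
library(stats4)

# Negative log-likelihood with one scalar argument per parameter
normal.likelihood <- function(a, b) {
  data <- mydat  # data is taken from the global environment
  residuals <- data$y - a - b * data$x
  log.likelihood <- sum(dnorm(residuals, mean = 0, sd = sd(residuals), log = TRUE))
  return(-log.likelihood)
}

# Starting values are given as a named list matching the arguments
mle(normal.likelihood, start = list(a = 0, b = 1))
```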


```
Call:
mle(minuslogl = normal.likelihood, start = list(a = 0, b = 1))

Coefficients:
         a          b 
0.02967972 1.03483345 
```
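
Beyond the point estimates, a fitted *mle* object can also be queried with `coef()` and `vcov()`. The sketch below shows this on simulated data. The data frame `sim`, the seed, the true coefficients, and the function name `neg_loglik` are assumptions for illustration, and the standard errors are only approximate because $\sigma_e$ is plugged in rather than estimated as its own parameter.

```r
library(stats4)

set.seed(123)
n <- 500
sim <- data.frame(x = rnorm(n))
sim$y <- 0.03 + 1.03 * sim$x + rnorm(n, sd = 2)  # simulated data

# Same form as the function above, but reading the simulated data
neg_loglik <- function(a, b) {
  e <- sim$y - a - b * sim$x
  -sum(dnorm(e, mean = 0, sd = sd(e), log = TRUE))
}

fit <- mle(neg_loglik, start = list(a = 0, b = 1))

est <- coef(fit)              # point estimates of a and b
se  <- sqrt(diag(vcov(fit)))  # approximate standard errors (inverse Hessian)
cbind(estimate = est, std.error = se)
```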

### Results
--------
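The two results are very close: **optim** gives $a = 0.0296$, $b = 1.0350$, and *mle* gives $a = 0.0297$, $b = 1.0348$, a difference of roughly 0.0001 in each coefficient. The small gap most likely reflects the different default optimization methods (Nelder-Mead in **optim**, BFGS in *mle*).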


References
-----------
https://en.wikipedia.org/wiki/Maximum_likelihood_estimation