# Maximum Likelihood Estimate (MLE)

The goal is to obtain an MLE for a simple problem.

Estimate $\mu$ and $\sigma^2$ from a sample $X_{n \times 1} = (x_1, \dots, x_n)^T$.

Likelihood $\mathcal{L}_{n}(\mu,\sigma^2) = \mathcal{L}_{n}(\mu,\sigma^2|X) = f(X|\mu,\sigma^2)$

Assuming the $x_i$ are *independent random variables*,

$ \implies \mathcal{L}_{n}(\mu,\sigma^2|X) = \prod_{i=1}^{n} p(x_i|\mu,\sigma^2)$

Assuming the $x_i$ are also *identically* distributed, each following a normal distribution,

$ \implies \mathcal{L}_{n}(\mu,\sigma^2|X) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}}e^{-\frac{(x_i-\mu)^2}{2 \sigma^2}}$
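The product form above can be sketched directly in NumPy. This is only an illustration: the helper names `normal_pdf` and `likelihood` and the toy sample `x_toy` are assumptions made for this example and are not part of the notebook.

```python
import numpy as np

def normal_pdf(x, mu, sigma2):
    """Density of N(mu, sigma2) evaluated elementwise at x."""
    return np.exp(-(x - mu) ** 2 / (2.0 * sigma2)) / np.sqrt(2.0 * np.pi * sigma2)

def likelihood(x, mu, sigma2):
    """Likelihood of an i.i.d. normal sample: the product of the densities."""
    return np.prod(normal_pdf(x, mu, sigma2))

# Hypothetical toy sample, chosen only for illustration.
x_toy = np.array([3.1, 4.0, 5.2, 4.4])

# Parameters close to the data should give a larger likelihood
# than parameters far away from it.
lik_near = likelihood(x_toy, mu=4.0, sigma2=1.0)
lik_far = likelihood(x_toy, mu=0.0, sigma2=1.0)
print(lik_near, lik_far)
```

Comparing `lik_near` with `lik_far` shows the idea behind the MLE: among candidate parameters, prefer the ones under which the observed sample is most probable.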



### Our MLE(stimates) $\mu_*$, $\sigma_*^2$
$$\{\mu_*, \sigma_*^2\} = \argmax_{\mu,\sigma^2} \mathcal{L}_{n}(\mu,\sigma^2|X) $$



\begin{aligned}
\implies \{\mu_*, \sigma_*^2\}  &= \argmax_{\mu,\sigma^2} \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}}e^{-\frac{(x_i-\mu)^2}{2 \sigma^2}} \\
                                &= \argmax_{\mu,\sigma^2} \left(\frac{1}{\sqrt{2\pi\sigma^2}}\right)^n e^{-\sum_{i=1}^n \frac{(x_i-\mu)^2}{2 \sigma^2}}
\end{aligned}

We need the harmless but necessary assumption that $\sigma^2 > 0$, since the normal density is undefined at $\sigma^2 = 0$.

 
### Introducing Log-likelihood (a very common "trick")
*Why introduce* $\log$?
* A concave function is easy to maximize: any stationary point is a global maximum, so it is enough to find where the derivatives vanish.
* It is difficult to analyze a function for concavity (think about computing $\frac{\partial f}{\partial \mu}$) when it is a product of many factors, $a_1(\mu,\sigma^2) \times a_2(\mu,\sigma^2) \times ... \times a_n(\mu,\sigma^2)$.
* Notice that all the functions $a_i$ are positive (they are normal pdfs), so their logarithms are defined.
* Also, $\log$ is a strictly increasing function, so maximizing the $\log$-likelihood is the same as maximizing the likelihood.
* $\log$ converts the product into a sum, $\log a_1(\mu,\sigma^2) + \log a_2(\mu,\sigma^2) + ... + \log a_n(\mu,\sigma^2)$, which makes the derivatives (such as $\frac{\partial f}{\partial \mu}$) convenient to calculate.
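The claim that maximizing the log-likelihood is the same as maximizing the likelihood can be checked numerically. The sketch below assumes a small simulated sample, a fixed $\sigma^2$, and a grid search over $\mu$. The names `x_demo`, `mu_grid`, and `sigma2_fixed`, the seed, and the grid bounds are arbitrary choices for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)            # fixed seed, an arbitrary choice
x_demo = rng.normal(4.3, 2.1, size=20)    # hypothetical small sample

sigma2_fixed = 2.1 ** 2                   # hold the variance fixed, vary only mu
mu_grid = np.linspace(0.0, 8.0, 801)      # candidate values of mu, spacing 0.01

# Per-point densities for every candidate mu: array of shape (801, 20).
diff = x_demo[None, :] - mu_grid[:, None]
dens = np.exp(-diff ** 2 / (2.0 * sigma2_fixed)) / np.sqrt(2.0 * np.pi * sigma2_fixed)

lik_grid = dens.prod(axis=1)              # likelihood: product of densities
loglik_grid = np.log(dens).sum(axis=1)    # log-likelihood: sum of log densities

best_lik = mu_grid[np.argmax(lik_grid)]
best_loglik = mu_grid[np.argmax(loglik_grid)]
print(best_lik, best_loglik, x_demo.mean())
```

Both searches should select the same grid point (up to floating-point ties), and that point should be the grid value closest to the sample mean.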


\begin{aligned}
\implies \{\mu_*, \sigma_*^2\} &= \argmax_{\mu,\sigma^2} \log \left[ \left(\frac{1}{\sqrt{2\pi\sigma^2}}\right)^n e^{-\sum_{i=1}^n \frac{(x_i-\mu)^2}{2 \sigma^2}} \right] \\
                               &= \argmax_{\mu,\sigma^2} \frac{n}{2}\log\left(\frac{1}{2\pi\sigma^2}\right) -   \sum_{i=1}^n \frac{(x_i-\mu)^2}{2 \sigma^2} \\ 
                               &= \argmax_{\mu,\sigma^2} -\frac{n}{2}\log(2\pi\sigma^2) -   \sum_{i=1}^n \frac{(x_i-\mu)^2}{2 \sigma^2}
\end{aligned}
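As a sanity check on the algebra, the closed-form log-likelihood can be compared with the logarithm of the product of densities. The function name `log_likelihood`, the sample `x_check`, and the parameter values are assumptions made only for this sketch.

```python
import numpy as np

def log_likelihood(x, mu, sigma2):
    """Closed form: -(n/2) log(2 pi sigma2) - sum((x - mu)^2) / (2 sigma2)."""
    n = x.size
    return -0.5 * n * np.log(2.0 * np.pi * sigma2) - np.sum((x - mu) ** 2) / (2.0 * sigma2)

# Hypothetical sample and parameter values, for illustration only.
x_check = np.array([3.1, 4.0, 5.2, 4.4])
mu_0, sigma2_0 = 4.0, 1.5

# Direct route: take the log of the product of normal densities.
dens = np.exp(-(x_check - mu_0) ** 2 / (2.0 * sigma2_0)) / np.sqrt(2.0 * np.pi * sigma2_0)
direct = np.log(np.prod(dens))

closed = log_likelihood(x_check, mu_0, sigma2_0)
print(direct, closed)
```

The two values should agree up to floating-point error. For large $n$ the direct product can underflow to zero, which is another practical reason to work with the sum of logs.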


### Estimating $\mu_*$ 
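The above function is concave in $\mu$: for fixed $\sigma^2$, it is a constant minus a sum of squares in $\mu$ scaled by the positive factor $\frac{1}{2\sigma^2}$. (A second-derivative test confirms this: the second derivative with respect to $\mu$ is $-\frac{n}{\sigma^2} < 0$.) So, setting the first derivative to zero gives the maximum.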


\begin{aligned}
 \frac{\partial \log \mathcal{L}_n(\mu,\sigma^2|X)}{\partial \mu} = &-\sum_{i=1}^n (2)(-1)\frac{x_i - \mu}{2 \sigma^2} = \frac{1}{\sigma^2}\sum_{i=1}^n (x_i - \mu) = 0 \\
 \implies &\sum_{i=1}^n (x_i - \mu_*) = 0 \\
 \implies &\sum_{i=1}^n x_i = n\mu_* \\
 \implies  &\mu_* = \frac{1}{n}\sum_{i=1}^n x_i
\end{aligned}
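The result $\mu_* = \frac{1}{n}\sum_{i=1}^n x_i$ can be checked numerically: the derivative of the log-likelihood with respect to $\mu$ should vanish at the sample mean, and moving $\mu$ away from it should increase the sum of squares. The helper `score_mu`, the seed, and the variable names are assumptions for this sketch. The simulation parameters mirror the values used in the notebook's code cells.

```python
import numpy as np

rng = np.random.default_rng(42)            # fixed seed, an arbitrary choice
x_sim = rng.normal(4.3, 2.1, size=100)     # simulated sample

def score_mu(x, mu, sigma2):
    """Derivative of the log-likelihood with respect to mu."""
    return np.sum(x - mu) / sigma2

def sum_sq(x, mu):
    """Sum of squared deviations, the only mu-dependent part of the log-likelihood."""
    return np.sum((x - mu) ** 2)

mu_hat = x_sim.mean()                      # the MLE derived above

# The stationarity condition holds for any positive sigma2; 1.0 is arbitrary.
score_at_hat = score_mu(x_sim, mu_hat, sigma2=1.0)

# Perturbing mu in either direction should not decrease the sum of squares.
ss_hat = sum_sq(x_sim, mu_hat)
ss_left = sum_sq(x_sim, mu_hat - 0.1)
ss_right = sum_sq(x_sim, mu_hat + 0.1)
print(mu_hat, score_at_hat, ss_hat, ss_left, ss_right)
```

Note that the maximizer in $\mu$ does not depend on $\sigma^2$, which is why any positive value can be used in the check.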

In [None]:
import numpy as np 

In [None]:
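# True parameters of the normal distribution used to simulate the data
mu_true = 4.3
sigma_true = 2.1  # standard deviation (not the variance)

# Draw n = 100 i.i.d. samples; np.random.normal takes (mean, std, size)
X = np.random.normal(mu_true, sigma_true, 100)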