
## **Maximum Likelihood Estimation**

### **MLE for Random Samples**
Goal: Estimate the parameters of a probability distribution by maximizing a likelihood function.

**Definition 2.4.1**:
Let $X_1, X_2, ..., X_n$ be a random sample with joint pdf $f(x_1, x_2, ..., x_n; θ_1,...,θ_m)$, where $x_1,...,x_n$ are the observed sample values and $θ_1,...,θ_m$ are unknown parameters. When $x_1,...,x_n$ are held fixed and this joint pdf is regarded as a function of $θ_1,...,θ_m$, it is called the likelihood function. The maximum likelihood estimates (MLEs) $\hat{θ}_1, ..., \hat{θ}_m$ are the parameter values that maximize the likelihood function, i.e., $f(x_1, ..., x_n; \hat{θ}_1, ..., \hat{θ}_m) ≥ f(x_1, ..., x_n; θ_1,...,θ_m)$ for all $θ_1, ..., θ_m$.

  Example (from 2.3): Let $X_1, X_2, ..., X_n$ be a random sample from a normal distribution with mean $μ$ and variance $σ^2$. Then

  \begin{align}
        f(x_1, x_2, ..., x_n; μ, σ^2) = \left(\frac{1}{2πσ^2}\right)^\frac{n}{2}  e^{\frac{-Σ(x_i - μ)^2}{2σ^2}}
  \end{align}
is the likelihood function.
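
As a worked step, the log-likelihood and its partial derivatives can be written out explicitly. This is a sketch that follows directly from the likelihood above; the sums run over $i = 1, ..., n$.

  \begin{align}
  \ln f &= -\frac{n}{2}\ln(2πσ^2) - \frac{Σ(x_i - μ)^2}{2σ^2} \\[1em]
  \frac{∂ \ln f}{∂μ} &= \frac{Σ(x_i - μ)}{σ^2} \\[1em]
  \frac{∂ \ln f}{∂σ^2} &= -\frac{n}{2σ^2} + \frac{Σ(x_i - μ)^2}{2σ^4}
  \end{align}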

Taking the partial derivatives of $\ln f$ with respect to $μ$ and $σ^2$, setting them equal to zero, and solving for $μ$ and $σ^2$ gives the maximizing values:
  \begin{align}
  \hat{μ} &= \bar{X} \\[1em]
  \hat{σ}^2 &= \frac{Σ(X_i - \bar{X})^2}{n}
  \end{align}
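
The closed-form estimates above can be checked numerically. The following is a minimal sketch, assuming a synthetic sample drawn with NumPy; the function names `normal_log_likelihood` and `normal_mle`, the seed, and the true parameter values are illustrative choices, not part of the original notes. It evaluates the log-likelihood at the MLE and at a few nearby parameter values, which should never give a larger value.

```python
import numpy as np


def normal_log_likelihood(x, mu, sigma2):
    """Log of the joint normal pdf, viewed as a function of (mu, sigma2)."""
    n = x.size
    return -0.5 * n * np.log(2 * np.pi * sigma2) - np.sum((x - mu) ** 2) / (2 * sigma2)


def normal_mle(x):
    """Closed-form MLEs: the sample mean and the variance with divisor n."""
    mu_hat = x.mean()
    sigma2_hat = np.mean((x - mu_hat) ** 2)
    return mu_hat, sigma2_hat


# Synthetic sample (assumed values: mean 5, standard deviation 2).
rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=1000)

mu_hat, sigma2_hat = normal_mle(x)
best = normal_log_likelihood(x, mu_hat, sigma2_hat)

# Perturbing either parameter should not increase the log-likelihood.
for d_mu, d_s2 in [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.2), (0.0, -0.2)]:
    assert normal_log_likelihood(x, mu_hat + d_mu, sigma2_hat + d_s2) <= best

print("mu_hat:", mu_hat)
print("sigma2_hat:", sigma2_hat)
```

Note that $\hat{σ}^2$ divides by $n$, so it matches `np.var(x)` with its default `ddof=0` rather than the unbiased sample variance.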

### **Linear Regression**
In a dataset, we are usually interested in the relationship between a set of independent variables ($𝐱$) and a dependent variable ($y$).

Goal: Given data points $\{(𝐱_i, y_i)\}_{i=1}^n$, where each $𝐱_i = (x_{i1},...,x_{ip})$, and the estimate of $y_i$
  \begin{align}
  \hat{y}_i = β_0 + β_1x_{i1} + ... + β_px_{ip}
  \end{align}
find coefficients $β_0, β_1,...,β_p$ such that the sum of squared residuals $Σ(y_i - \hat{y}_i)^2$ is minimized.

We model $y_i = \hat{y}_i + ε_i$, where $ε_i ∼ N(0, σ^2)$, so $y_i ∼ N(\hat{y}_i, σ^2)$.

Based on the derivation from the notes, the MLE of $β$ is exactly the solution of the least squares problem.
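
A brief sketch of why this holds, assuming the errors are independent so that the joint pdf of $y_1, ..., y_n$ factors into a product. The symbol $L$ for the likelihood is introduced here for illustration only.

  \begin{align}
  L(β, σ^2) &= \prod_{i=1}^n \frac{1}{\sqrt{2πσ^2}} e^{-\frac{(y_i - \hat{y}_i)^2}{2σ^2}} \\[1em]
  \ln L(β, σ^2) &= -\frac{n}{2}\ln(2πσ^2) - \frac{1}{2σ^2} \sum_{i=1}^n (y_i - \hat{y}_i)^2
  \end{align}

Only the second term depends on $β$, and it enters with a negative sign, so for any fixed $σ^2$, maximizing $\ln L$ over $β$ is the same as minimizing $\sum_{i=1}^n (y_i - \hat{y}_i)^2$.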



In [13]:
import matplotlib.pyplot as plt
import pandas as pd
from sklearn import datasets
from scipy import stats

# load diabetes dataset
df, target = datasets.load_diabetes(return_X_y=True, as_frame=True)

print("df shape: ", df.shape)
print("target shape: ", target.shape)

print("df datatypes: \n", df.dtypes)
print("target datatype: \n", target.dtypes)

# linear regression setup: feature matrix A (442 x 10) and response vector b (442,)
A = df.to_numpy()
b = target.to_numpy()



df shape:  (442, 10)
target shape:  (442,)
df datatypes: 
 age    float64
sex    float64
bmi    float64
bp     float64
s1     float64
s2     float64
s3     float64
s4     float64
s5     float64
s6     float64
dtype: object
target datatype: 
 float64
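
The cell above builds `A` and `b` but stops before fitting. One way the least squares fit might look is sketched below. It uses `np.linalg.lstsq` and synthetic data so that it runs on its own; the function `fit_least_squares`, the arrays `A_demo` and `b_demo`, and the chosen true coefficients are hypothetical names and values, not part of the original notebook.

```python
import numpy as np


def fit_least_squares(A, b):
    """Return (beta_0, beta) minimizing sum((b - beta_0 - A @ beta) ** 2)."""
    # Prepend a column of ones so the first coefficient is the intercept beta_0.
    X = np.column_stack([np.ones(A.shape[0]), A])
    coef, *_ = np.linalg.lstsq(X, b, rcond=None)
    return coef[0], coef[1:]


# Synthetic stand-in for (A, b): 200 rows, 3 features, small Gaussian noise.
rng = np.random.default_rng(0)
A_demo = rng.normal(size=(200, 3))
true_beta0, true_beta = 1.5, np.array([2.0, -1.0, 0.5])
b_demo = true_beta0 + A_demo @ true_beta + rng.normal(scale=0.1, size=200)

beta0_hat, beta_hat = fit_least_squares(A_demo, b_demo)

# Under the normal error model, the MLE of the noise variance is the
# mean squared residual (divisor n, as in the normal example earlier).
residuals = b_demo - (beta0_hat + A_demo @ beta_hat)
sigma2_hat = np.mean(residuals ** 2)

print("intercept:", beta0_hat)
print("coefficients:", beta_hat)
print("sigma2_hat:", sigma2_hat)
```

The same function could be called as `fit_least_squares(A, b)` with the diabetes arrays from the cell above; the intercept column is added explicitly because `A` contains only the ten feature columns.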
