# Likelihood — What Is It?

Likelihood is a fundamental concept in statistics. It is closely related to probability, but **not the same**.

- **Probability:** “Given parameters, what’s the chance of seeing this data?”  
- **Likelihood:** “Given the data, how plausible are different parameters?”

---

## Formal Definition

Suppose you have data:

$$
D = \{x_1, x_2, \dots, x_n\}
$$

and a model with parameters $\theta$.

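For a fixed $\theta$, the model assigns a probability (or a probability density, for continuous data) to the data:

$$
P(D \mid \theta)
$$

Viewed as a function of $\theta$ with $D$ held fixed, this same quantity is called the **likelihood function**:

$$
L(\theta \mid D) = P(D \mid \theta)
$$

Unlike a probability distribution, $L(\theta \mid D)$ need not sum or integrate to 1 over $\theta$.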

---

### Intuition

- In probability, we **fix the parameters** and ask how probable different data are.  
- In likelihood, we **fix the data** and ask how plausible different parameter values are.  

This distinction is key in **Maximum Likelihood Estimation (MLE)**, which picks the parameter value that maximizes the likelihood, and in **Bayesian inference**, where the likelihood updates a prior belief about the model parameters into a posterior given the observed data.
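
The two views can be contrasted numerically. The sketch below is illustrative only: it assumes a simple model of three independent coin tosses, and the names `sequence_probability`, `theta`, and `observed` are hypothetical. With the parameter fixed, the probabilities of all possible datasets sum to 1. With the data fixed, the same expression taken as a function of the parameter has an area that is generally not 1.

```python
from itertools import product

def sequence_probability(sequence, theta):
    """P(sequence | theta) for independent tosses; sequence is a string of 'H'/'T'."""
    heads = sequence.count("H")
    tails = len(sequence) - heads
    return theta**heads * (1 - theta)**tails

# Probability view: fix theta and vary the data (all 2^3 possible sequences).
theta = 0.6  # arbitrary value, chosen only for illustration
all_sequences = ["".join(s) for s in product("HT", repeat=3)]
total_probability = sum(sequence_probability(s, theta) for s in all_sequences)

# Likelihood view: fix the data and vary theta over (0, 1).
observed = "HHT"
steps = 10_000
# Midpoint-rule approximation of the area under the likelihood curve.
area = sum(
    sequence_probability(observed, (i + 0.5) / steps) for i in range(steps)
) / steps

print(total_probability)  # sums to 1, up to floating-point rounding
print(area)               # close to 1/12, not 1
```

The area of 1/12 is specific to this toy dataset; the point is only that a likelihood is not a probability distribution over the parameter.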


# Likelihood Example: Coin Toss

Suppose we have a biased coin, and let:

$$
\theta = P(\text{Heads})
$$

We observe the following data:

```
H H T H T H H T H H
```

- 7 heads, 3 tails

---

### Likelihood Function

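Assuming the tosses are independent, each head contributes a factor of $\theta$ and each tail a factor of $1-\theta$, so the likelihood of $\theta$ given this data is: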

$$
L(\theta) = \theta^7 (1-\theta)^3
$$
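
As a rough check of this formula, the following sketch multiplies the per-toss probabilities for the observed sequence and compares the result with $\theta^7 (1-\theta)^3$. It assumes independent tosses; the function names and the candidate value `theta = 0.3` are illustrative choices, not part of the original example.

```python
# The observed sequence from the example above.
tosses = "H H T H T H H T H H".split()

def likelihood_from_tosses(tosses, theta):
    """Multiply the per-toss probabilities, assuming independent tosses."""
    value = 1.0
    for toss in tosses:
        value *= theta if toss == "H" else 1 - theta
    return value

def likelihood_closed_form(theta, heads=7, tails=3):
    """The same quantity written as theta^heads * (1 - theta)^tails."""
    return theta**heads * (1 - theta)**tails

theta = 0.3  # an arbitrary candidate value, chosen only for illustration
product_form = likelihood_from_tosses(tosses, theta)
closed_form = likelihood_closed_form(theta)

print(tosses.count("H"), tosses.count("T"))
print(product_form, closed_form)  # the two agree up to rounding
```

Only the counts of heads and tails enter the closed form, so any ordering of the same 7 heads and 3 tails gives the same likelihood under this model.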

---

### Key Points

- **Data is fixed**: we observed 7 heads and 3 tails.  
- **Parameter is variable**: $\theta$ can take any value between 0 and 1.  
- **Likelihood function** tells us how plausible each $\theta$ is given the observed data.  

For example, $\theta = 0.5$ has a likelihood of:

$$
L(0.5) = 0.5^7 \cdot 0.5^3 = 0.5^{10} \approx 0.00098
$$
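
A single likelihood value is hard to interpret on its own; it becomes informative when compared across candidate values of $\theta$. The sketch below evaluates the same likelihood at a few hand-picked candidates and reports each one relative to $L(0.5)$. The candidate list and the helper name `likelihood` are assumptions made for illustration.

```python
def likelihood(theta, heads=7, tails=3):
    """Coin-toss likelihood theta^heads * (1 - theta)^tails."""
    return theta**heads * (1 - theta)**tails

candidates = [0.3, 0.5, 0.7, 0.9]  # hand-picked values, for illustration only
baseline = likelihood(0.5)

for theta in candidates:
    value = likelihood(theta)
    # Print the raw likelihood and its ratio to the theta = 0.5 baseline.
    print(f"theta={theta:.1f}  L={value:.6f}  L/L(0.5)={value / baseline:.2f}")
```

Among these candidates, $\theta = 0.7$ gives the largest value, roughly 2.3 times $L(0.5)$, so the observed data are better explained by $\theta = 0.7$ than by a fair coin.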

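By maximizing $L(\theta)$, we find the value of $\theta$ under which the observed data are most probable. This is the **Maximum Likelihood Estimate (MLE)**. Here, setting the derivative of $\log L(\theta)$ to zero gives $\hat{\theta} = 7/10 = 0.7$, the observed fraction of heads.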
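
The maximization can also be sketched numerically. The example below runs a simple grid search over the log-likelihood, which has the same maximizer as $L(\theta)$ because the logarithm is increasing, and compares the result with the fraction of heads, $7/10$. The grid resolution and the names `log_likelihood` and `theta_hat` are illustrative assumptions; a grid search is used here for transparency, not because it is the standard way to compute an MLE.

```python
import math

def log_likelihood(theta, heads=7, tails=3):
    """log of theta^heads * (1 - theta)^tails, defined for 0 < theta < 1."""
    return heads * math.log(theta) + tails * math.log(1 - theta)

# Candidate values strictly inside (0, 1), in steps of 0.001.
grid = [i / 1000 for i in range(1, 1000)]

# The grid point with the highest log-likelihood approximates the MLE.
theta_hat = max(grid, key=log_likelihood)

print(theta_hat)     # grid-search estimate
print(7 / (7 + 3))   # observed fraction of heads
```

Working with the log-likelihood is a common practical choice because products of many small probabilities can underflow, while sums of logs remain numerically stable.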
