# Maximum Likelihood for the Binomial Distribution

In this notebook, we look at maximum likelihood estimation and how it applies to the binomial distribution. The binomial distribution describes the number of successes in a fixed number of independent trials, each with the same probability of success. Maximum likelihood is a method for estimating the parameters of a statistical model from observed data.

Let's start by defining our problem. We asked a group of people whether they preferred orange Fanta or grape Fanta. We denote the number of people who preferred orange Fanta as $x$, the total number of people we asked as $n$, and the probability that a randomly chosen person prefers orange Fanta as $p$.

The binomial distribution can be represented as:

$$P(X=x) = \binom{n}{x} p^x (1-p)^{n-x}$$

In this formula, $p$ is the probability of success (someone choosing orange Fanta), $n$ is the total number of trials (total people asked), and $x$ is the number of successes (number of people who chose orange Fanta).
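The formula above can be evaluated directly. The following is a minimal sketch using only the Python standard library; the function name `binomial_pmf` and the numbers (7 people asked, $p = 0.5$) are assumptions chosen for illustration, not data from the survey.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Hypothetical setup: 7 people asked, each prefers orange Fanta with probability 0.5.
n, p = 7, 0.5
for x in range(n + 1):
    print(f"P(X={x}) = {binomial_pmf(x, n, p):.4f}")

# The probabilities over all possible outcomes x = 0..n sum to 1.
total = sum(binomial_pmf(x, n, p) for x in range(n + 1))
print(f"sum over all x = {total:.4f}")
```

Here $p$ is held fixed and $x$ varies, which is the usual way of reading the formula as a probability distribution.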

The goal of maximum likelihood is to find the value of $p$ that maximizes the likelihood of the data we observed. In other words, we want the value of $p$ under which our data is most probable.

## Calculating the Likelihood

To obtain the likelihood, we keep the right side of the equation and relabel the left side as a function of $p$:

$$L(p) = \binom{n}{x} p^x (1-p)^{n-x}$$

The likelihood function $L(p)$ has the same formula as the probability function $P(X=x)$, but we treat the data $(n, x)$ as fixed and the parameter $p$ as the variable.

We can calculate the likelihood for different values of $p$ given that a certain number of people out of a group said they preferred orange Fanta. For example, we can calculate $L(0.25)$, $L(0.5)$, $L(0.57)$ and so on.
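As a sketch of this calculation, the snippet below evaluates the likelihood at the three values mentioned above. The data ($x = 4$ people out of $n = 7$ preferring orange Fanta) is a hypothetical example, and the function name `likelihood` is chosen for illustration.

```python
from math import comb


def likelihood(p: float, x: int, n: int) -> float:
    """Binomial likelihood of p, with the data (x, n) held fixed."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Hypothetical data: 4 of 7 people preferred orange Fanta.
x, n = 4, 7

candidates = [0.25, 0.5, 0.57]
results = {p: likelihood(p, x, n) for p in candidates}
for p, value in results.items():
    print(f"L({p}) = {value:.4f}")
```

With this data, the likelihood grows as $p$ moves from 0.25 toward 0.57, which is close to the sample proportion $4/7$.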

## The Maximum Likelihood

We can plot these likelihoods for different values of $p$ to visualize the likelihood function. The peak of this plot marks the maximum likelihood estimate, that is, the value of $p$ that maximizes the likelihood function.
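One way to locate the peak numerically, without drawing the plot, is to evaluate the likelihood on a fine grid of candidate values and pick the largest. This is only a sketch: the data ($x = 4$, $n = 7$) is hypothetical and the grid spacing of 0.001 is an arbitrary choice. The same `grid` and `values` lists could be passed to any plotting library to draw the curve.

```python
from math import comb


def likelihood(p: float, x: int, n: int) -> float:
    """Binomial likelihood of p, with the data (x, n) held fixed."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Hypothetical data: 4 of 7 people preferred orange Fanta.
x, n = 4, 7

# Candidate values of p from 0 to 1 in steps of 0.001.
grid = [i / 1000 for i in range(1001)]
values = [likelihood(p, x, n) for p in grid]

# Index of the largest likelihood value on the grid.
best_index = max(range(len(grid)), key=values.__getitem__)
best_p = grid[best_index]

print(f"grid maximum at p = {best_p}")
print(f"sample proportion x/n = {x / n:.4f}")
```

The grid maximum can only be as precise as the grid spacing, so it approximates the true peak rather than pinning it down exactly.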

To find this value analytically, we take the derivative of the likelihood function (or, more conveniently, of its logarithm, which peaks at the same place) with respect to $p$, set it to zero, and solve for $p$. This gives us the maximum likelihood estimate for $p$.

If we perform these calculations, we find that the maximum likelihood estimate for $p$ is $\hat{p} = x/n$, that is, the number of people who preferred orange Fanta divided by the total number of people we asked. This result is intuitively satisfying, as it is exactly the proportion of people who preferred orange Fanta in our sample.
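The calculation can be written out in a few lines. The sketch below works with the logarithm of the likelihood rather than the likelihood itself; because the logarithm is strictly increasing, both are maximized at the same $p$, and the products turn into sums that are easier to differentiate. It assumes $0 < x < n$, so that the maximum lies strictly between 0 and 1.

$$\log L(p) = \log \binom{n}{x} + x \log p + (n-x) \log(1-p)$$

$$\frac{d}{dp} \log L(p) = \frac{x}{p} - \frac{n-x}{1-p} = 0$$

$$x(1-p) = (n-x)\,p \quad\Longrightarrow\quad x = np \quad\Longrightarrow\quad \hat{p} = \frac{x}{n}$$

The second derivative, $-\frac{x}{p^2} - \frac{n-x}{(1-p)^2}$, is negative for every $p$ in $(0, 1)$, which confirms that this stationary point is a maximum.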

We can generalize this result: for a binomial distribution, the maximum likelihood estimate of the probability of success is the number of successes divided by the total number of trials. This gives a simple closed-form estimator for probabilities from binomially distributed data.
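To see the estimator at work, the following sketch simulates binomial data with a known success probability and checks that $x/n$ lands close to it. Everything here is an assumption made for illustration: the true probability of 0.3, the 10,000 trials, and the fixed seed used to make the run repeatable.

```python
import random

# Assumed values for the simulation (not taken from any real survey).
true_p = 0.3
n = 10_000
rng = random.Random(0)  # fixed seed so the run is repeatable

# Each trial is a success with probability true_p; x counts the successes.
x = sum(rng.random() < true_p for _ in range(n))

# Maximum likelihood estimate for the binomial success probability.
p_hat = x / n

print(f"successes: {x} out of {n}")
print(f"estimate p_hat = {p_hat:.4f} (true p = {true_p})")
```

With a sample this large, the estimate should sit within a few hundredths of the true value; smaller samples give noisier estimates.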

## References

StatQuest with Josh Starmer. (2017, August 7). StatQuest: Maximum Likelihood, Clearly Explained!!! [Video]. YouTube. https://www.youtube.com/watch?v=XepXtl9YKwc

StatQuest with Josh Starmer. (2018, August 6). StatQuest: Maximum Likelihood For the Binomial Distribution, Clearly Explained!!! [Video]. YouTube. https://www.youtube.com/watch?v=pYxNSUDSFH4

Wikipedia contributors. (2021, May 27). Binomial distribution. In Wikipedia, The Free Encyclopedia. Retrieved 00:16, June 2, 2021, from https://en.wikipedia.org/w/index.php?title=Binomial_distribution&oldid=1025443958

Wikipedia contributors. (2021, June 1). Maximum likelihood estimation. In Wikipedia, The Free Encyclopedia. Retrieved 00:16, June 2, 2021, from https://en.wikipedia.org/w/index.php?title=Maximum_likelihood_estimation&oldid=1026486261