# Module 9: Binomial Probabilities

In this module, we look at how to calculate single and cumulative probabilities for the binomial distribution. The binomial distribution gives the probability of getting a given number of successes in a fixed number of independent trials, each with the same probability of success.

## Single Probabilities

In this section, we discuss how to find the probability that a binomial random variable takes a specific value. In R, we find this probability using the function "dbinom()".

The "dbinom()" function has three inputs: "x", "size" and "prob". The "x" input is the desired number of successes. The "size" and "prob" inputs are the number of trials and the probability of success on each trial, respectively.

Suppose a gambler has a coin weighted so that the probability of getting heads on one flip is 0.4 rather than the usual 0.5. Let's find the probability of getting six heads when we flip this coin ten times.

In [None]:
dbinom(x=6, size=10, prob=0.4)
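
As a quick check, the same probability can be computed by hand from the binomial formula $P(X=x) = \binom{n}{x} p^x (1-p)^{n-x}$. The sketch below assumes the same coin as above (10 flips, probability of heads 0.4); the variable names "by_formula" and "by_dbinom" are only illustrative.

```r
# Binomial formula: choose(n, x) * p^x * (1-p)^(n-x), with n = 10, x = 6, p = 0.4
by_formula = choose(10, 6) * 0.4^6 * (1 - 0.4)^(10 - 6)

# The same probability from the built-in function
by_dbinom = dbinom(x=6, size=10, prob=0.4)

print(by_formula)
print(by_dbinom)
all.equal(by_formula, by_dbinom) # TRUE if the two values agree up to rounding error
```

Both lines should print the same value, since "dbinom()" evaluates this formula for us.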

## Cumulative Probabilities

The answer we got in the last section wouldn't be too hard to get using the formula for binomial probabilities and a calculator, but what if we wanted the probability of getting at most five heads? Or if we instead flipped the coin 100 times and wanted the probability of getting at most 50 heads? These 'cumulative' binomial probabilities are much harder to calculate by hand. Fortunately, the "pbinom()" function in R calculates cumulative binomial probabilities for us.

The "pbinom()" function has three main inputs: "q", "size" and "prob". The "q" input is the maximum number of successes. The "size" and "prob" inputs are, as above, the number of trials and probability of success respectively.

Let's use the same weighted coin as in our last example. Remember that it shows heads with probability 0.4. This time, however, we flip the coin 100 times and we want to find the probability of getting at most 50 heads.

In [None]:
pbinom(q=50, size=100, prob=0.4)
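
To see what "cumulative" means here, the sketch below adds up the single probabilities of getting 0, 1, 2, ..., 50 heads with "dbinom()" and compares the total with "pbinom()". It assumes the same coin and 100 flips; the names "by_sum" and "by_pbinom" are only illustrative.

```r
# P(X <= 50) is the sum of P(X = k) for k = 0, 1, ..., 50
by_sum = sum(dbinom(x=0:50, size=100, prob=0.4))

# The same cumulative probability from the built-in function
by_pbinom = pbinom(q=50, size=100, prob=0.4)

print(by_sum)
print(by_pbinom)
all.equal(by_sum, by_pbinom) # TRUE if the two values agree up to rounding error
```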

The "pbinom()" function also has an optional input: "lower.tail". If "lower.tail" is set to "TRUE", then "pbinom()" gives the probability that the number of successes is at most "q". If "lower.tail" is set to "FALSE", then "pbinom()" gives the probability that the number of successes is greater than "q". The default value of "lower.tail" is "TRUE", so if we do not specify a value, "pbinom()" automatically uses "lower.tail=TRUE".

Let's find the probability of getting more than 50 heads when we flip our weighted coin 100 times.

In [None]:
pbinom(q=50, size=100, prob=0.4, lower.tail=FALSE)

Note that for the same "q", "size" and "prob", the answers from "pbinom()" with "lower.tail" set to "TRUE" and with "lower.tail" set to "FALSE" always add to 1, because every outcome is either at most "q" or greater than "q".
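
A short check of this relationship, again assuming our weighted coin flipped 100 times (the names "lower" and "upper" are only illustrative):

```r
# Probability of at most 50 heads and of more than 50 heads
lower = pbinom(q=50, size=100, prob=0.4, lower.tail=TRUE)
upper = pbinom(q=50, size=100, prob=0.4, lower.tail=FALSE)

print(lower + upper)        # the two tails together cover every outcome
all.equal(upper, 1 - lower) # TRUE if the upper tail equals 1 minus the lower tail
```

This also means that "1 - pbinom(q=50, size=100, prob=0.4)" is another way to get the probability of more than 50 heads.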

## Normal Approximation

If we have enough trials in a binomial experiment and the probability of success is not too close to one or zero, then we can approximate binomial probabilities with the normal distribution. Remember that we calculate normal probabilities in R using the "pnorm()" function. See Module 5 for a more thorough explanation of normal probabilities and the "pnorm()" function.

In order to approximate the binomial distribution with a normal distribution, we need the mean and standard deviation. Fortunately, these are known. The mean of a binomial distribution is $np$, where $n$ is the number of trials and $p$ is the probability of success, and the standard deviation is $\sqrt{np(1-p)}$.
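
For example, for our weighted coin flipped 100 times ($n = 100$, $p = 0.4$), these formulas work out as follows, writing $\mu$ for the mean and $\sigma$ for the standard deviation:

$$\mu = np = 100 \times 0.4 = 40, \qquad \sigma = \sqrt{np(1-p)} = \sqrt{100 \times 0.4 \times 0.6} = \sqrt{24} \approx 4.90$$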

As a rule of thumb, as long as $np>10$ and $n(1-p)>10$, the normal distribution is a good approximation to the binomial.
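
Before using the approximation, we can check these two conditions directly. The sketch below does so for our weighted coin flipped 100 times; the name "rule_ok" is only illustrative.

```r
n = 100 # number of trials
p = 0.4 # probability of success on each trial

print(n*p)     # expected number of successes
print(n*(1-p)) # expected number of failures

# TRUE only when both conditions hold
rule_ok = (n*p > 10) && (n*(1-p) > 10)
print(rule_ok)
```

Here both quantities are well above 10, so the conditions are met.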

Let's approximate the probability of getting at most 50 heads out of 100 flips of our weighted coin. First we calculate the mean and standard deviation of the corresponding normal distribution. Then we use the "pnorm()" function to compute the normal approximation. For reference, we will also calculate the exact probability using "pbinom()".

In [None]:
n = 100 # Number of trials
p = 0.4 # Probability of heads on each flip
mu = n*p # Mean of the binomial distribution
sigma = sqrt(n*p*(1-p)) # Standard deviation of the binomial distribution
print("Normal Approximation:")
pnorm(q=50, mean=mu, sd=sigma)
print("Exact Probability:")
pbinom(q=50, size=n, prob=p)