# Binomial Probability Distribution using the SciPy Library

In previous notebooks, I have demonstrated how to create functions to obtain the Probability Mass Function (pmf) and Cumulative Distribution Function (cdf) for the Binomial distribution. These allow us to calculate probabilities for random variables that follow the binomial distribution and model processes where we are interested in the probability of a given number of successes from a fixed number of trials, where each trial is independent and has the same success probability. 

In this notebook, I will again obtain the pmf and cdf for the binomial distribution in Python, but this time I will use the SciPy library.

SciPy makes it very easy to work with probability distributions. To work with the binomial distribution we use the binom object from the scipy.stats module, which exposes the pmf, the cdf and related functions as methods.

In [1]:
# Importing the binom distribution object from the scipy.stats module.

from scipy.stats import binom

Let us use an example where we want to work out the probability of getting 4 successes from 10 trials, with a probability of success on each trial set to 0.3 (30%). Calling the binom pmf method will tell us the probability of getting exactly 4 successes from 10 trials. 

The number of trials is 10 (n = 10), number of successes is 4 (k = 4), the success probability per trial is 0.3 (p = 0.3).

In [2]:
# Using the binom.pmf method. Note the order of the arguments (successes, trials, probability).

binom.pmf(4, 10, 0.3)

0.2001209489999999

Here we can see that the probability is about 0.2 or 20%. 
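As a sanity check, the same value can be reproduced by hand from the binomial formula, P(X = k) = C(n, k) * p^k * (1 - p)^(n - k), using only the Python standard library. This is a minimal sketch; the helper name binomial_pmf is my own and is not part of scipy.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials.

    Hypothetical helper for illustration only; not a scipy function.
    """
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Same example as above: 4 successes, 10 trials, success probability 0.3.
manual_pmf = binomial_pmf(4, 10, 0.3)
print(round(manual_pmf, 6))  # 0.200121
```

The hand-computed value agrees with the binom.pmf output above up to floating-point rounding.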

The cdf is also available as a method, binom.cdf. This will tell us the cumulative probability of getting at most 4 successes (that is, 0, 1, 2, 3 or 4) from 10 trials with a probability of success of 0.3 on each trial.

In [3]:
# Using binom.cdf method. Again, note the order of arguments (successes, trials, probability).

binom.cdf(4, 10, 0.3)

0.8497316674000001

We get a result that tells us the cumulative probability of getting 4 or fewer successes is about 0.850 or 85%.
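Because the cdf is just the pmf summed over every outcome from 0 up to k, the figure above can be checked by adding up the five individual probabilities. The sketch below does this with the standard library only; binomial_cdf is a name I have made up for illustration, not a scipy function.

```python
from math import comb

def binomial_cdf(k, n, p):
    """Probability of at most k successes in n independent trials.

    Hypothetical helper for illustration only; sums the pmf for 0..k.
    """
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

# P(X <= 4) for 10 trials with success probability 0.3.
manual_cdf = binomial_cdf(4, 10, 0.3)
print(round(manual_cdf, 6))  # 0.849732
```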

A further useful method available on the binom object is the survival function, binom.sf. This is the complement of the cdf (1 - cdf). It tells us the probability of getting more than x successes from n trials, with a probability of success on each trial of p.

Again, we use the example values from above: 4 successes in 10 trials, with a success probability of 0.3.

In [4]:
# Calling the binom.sf survival function. 

binom.sf(4, 10, 0.3)

0.15026833259999992

The output shows the probability of not getting 4 or fewer successes is about 15%. Note, this is the complement of the 85% probability given by the cdf. 

If we add the survival function probability and the cdf probability, they will sum to 1. 
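The complement relationship can be illustrated without scipy by splitting the eleven possible outcomes (0 to 10 successes) into a lower tail and an upper tail. This is a small sketch under the same example values; the helper binomial_pmf and the variable names are my own. In the notebook itself, the equivalent check would be binom.cdf(4, 10, 0.3) + binom.sf(4, 10, 0.3).

```python
from math import comb, isclose

def binomial_pmf(k, n, p):
    """Hypothetical helper: probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p, k = 10, 0.3, 4

# Lower tail: P(X <= 4), which is what the cdf returns.
lower_tail = sum(binomial_pmf(i, n, p) for i in range(0, k + 1))
# Upper tail: P(X > 4), which is what the survival function returns.
upper_tail = sum(binomial_pmf(i, n, p) for i in range(k + 1, n + 1))

print(round(lower_tail, 4), round(upper_tail, 4))  # 0.8497 0.1503
print(isclose(lower_tail + upper_tail, 1.0))       # True
```

The comparison uses isclose rather than == because sums of floating-point numbers can differ from 1 by a tiny rounding error.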