## Discrete Probability Distributions
### Binomial Distribution

Definition: The binomial distribution is the discrete probability distribution of the number of successes in a sequence of n independent experiments, each asking a yes–no question with the same success probability p (each experiment is called a Bernoulli trial).

Notation: B(n, p)

Example: If I roll a fair six-sided die 10 times, how many times will I roll a 1?
The number of 1s follows B(10, 1/6).
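
To make the die example concrete, the sketch below tabulates the binomial PMF, $P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}$, for $n = 10$ and $p = 1/6$. The variable names are illustrative only, and a fair six-sided die is assumed.

```python
import numpy as np
from scipy.stats import binom

n, p = 10, 1 / 6          # 10 rolls, probability 1/6 of rolling a 1 (fair die assumed)
k = np.arange(0, n + 1)   # possible counts of 1s: 0, 1, ..., 10
pmf = binom.pmf(k, n, p)  # P(X = k) for each k

for count, prob in zip(k, pmf):
    print(f"P(X = {count:2d}) = {prob:.4f}")
```

The probabilities sum to 1, and the largest one marks the most likely count.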

In [None]:
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import binom, poisson
import ipywidgets as widgets

%matplotlib inline
%config InlineBackend.figure_format = 'retina'

Next define a function <i>plot_pmf</i> that takes arguments $n$ and $p$ and plots the PMF.

In [None]:
def plot_pmf(n, p):
    '''
    Plot the probability mass function of B(n, p).
    '''
    k = np.arange(0, n + 1)
    P_binom = binom.pmf(k, n, p)
    plt.plot(k, P_binom, '-o')

    # Set the axes limits and associated properties of the plot.
    axes = plt.gca()
    # Use max(n, 1) so the x-range is not empty when n = 0.
    axes.set_xlim([0, max(n, 1)])
    axes.set_ylim([0, 1.1 * max(P_binom)])
    plt.title('PMF of B(%i, %.2f)' % (n, p))
    plt.xlabel('Number k of successes')
    plt.ylabel('Probability of k successes')
    plt.show()

Finally, define interactive sliders that let you vary $n$ over $[0,30]$ and $p$ over $[0,1]$; the PMF is redrawn whenever either parameter changes.

In [None]:
widgets.interact(
    plot_pmf,
    n=widgets.IntSlider(min=0, max=30, step=1, value=15),
    p=widgets.FloatSlider(min=0, max=1, step=0.01, value=0.5))

Explore how adjusting the sliders changes the distribution (particularly its center, spread, and shape). Based on the plot, if I roll a die 10 times, how many 1s would you expect me to roll?
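
One way to check your reading of the plot is to compute the center and spread directly. For a binomial distribution the mean is $np$ and the standard deviation is $\sqrt{np(1-p)}$, so for 10 rolls and a given face (probability 1/6 per roll) the mean is $10/6 \approx 1.67$. A minimal sketch, assuming a fair die:

```python
from scipy.stats import binom

n, p = 10, 1 / 6   # 10 rolls; a given face has probability 1/6 per roll

mean = binom.mean(n, p)   # center: n * p
std = binom.std(n, p)     # spread: sqrt(n * p * (1 - p))

print(f"mean = {mean:.3f}, standard deviation = {std:.3f}")
```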

### Poisson Distribution

Definition: The Poisson distribution is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time or space, if these events occur with a known constant rate and independently of the time since the last event.

Notation: Poisson(λ), where λ is the expected number of events in the interval

Example: I get 5 pieces of mail a day on average. On any given day, what are the chances I get 2 pieces of mail? Or 3? Or 8? Or 10?

In [None]:
def plot_poisson_pmf(n, λ):
    '''Plot the PMF of Poisson(λ) for k = 0, ..., n.'''
    k = np.arange(0, n + 1)
    P_poisson = poisson.pmf(k, λ)
    plt.plot(k, P_poisson, '-o')
    # λ is a float, so show one decimal place instead of truncating with %i.
    plt.title('PMF of Poisson(%.1f)' % λ)
    plt.xlabel('Number of events k')
    plt.ylabel('Probability of k events')
    plt.show()


widgets.interact(
    plot_poisson_pmf,
    n=widgets.IntSlider(min=0, max=50, step=1, value=25),
    λ=widgets.FloatSlider(min=0, max=30, step=0.1, value=5))

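Again, explore how changing the values of n and λ affects the plot. Here n only sets the largest number of events shown on the x-axis; λ is the distribution's single parameter. Make a note of the values of n and λ if you find a combination that looks similar to a binomial distribution.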
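
The resemblance to a binomial distribution can also be checked numerically. A binomial distribution with many trials and a small success probability is close to a Poisson distribution with the same mean. The sketch below holds the mean at 5 and measures the largest gap between the two PMFs as the number of trials grows. The name `n_trials` is illustrative and refers to the binomial trial count, not to the plot-range slider above.

```python
import numpy as np
from scipy.stats import binom, poisson

lam = 5.0                 # Poisson rate; also the binomial mean n_trials * p
k = np.arange(0, 21)      # counts at which the two PMFs are compared

max_gap = {}
for n_trials in (10, 100, 1000):
    p = lam / n_trials    # shrink p as n_trials grows so the mean stays at lam
    gap = np.abs(binom.pmf(k, n_trials, p) - poisson.pmf(k, lam))
    max_gap[n_trials] = gap.max()
    print(f"n_trials = {n_trials:4d}, p = {p:.3f}, largest PMF gap = {max_gap[n_trials]:.5f}")
```

The gap shrinks as the number of trials increases.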

Based on the above plot, if I normally receive 5 pieces of mail a day, what are the chances I only get 1 piece of mail? 3 pieces? 10 pieces? Is it possible for me to get 20 pieces of mail one day?
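
Values read off the plot can be compared with exact ones. The sketch below evaluates the Poisson PMF at the counts mentioned above, assuming daily mail follows a Poisson distribution with λ = 5; the variable names are illustrative.

```python
from scipy.stats import poisson

lam = 5   # average pieces of mail per day (from the example)

# Exact probability of receiving exactly k pieces of mail in one day
probs = {k: poisson.pmf(k, lam) for k in (1, 3, 10, 20)}
for k, prob in probs.items():
    print(f"P(X = {k:2d}) = {prob:.2e}")
```

Every count has a strictly positive probability under this model, although it becomes very small far from the mean.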

### Geometric Distribution

Definition: The geometric distribution gives the probability that the first success requires k independent Bernoulli trials, each with success probability p.

Notation: G(p)

Example: If I roll two dice, what is the probability that it takes me exactly 2 rolls to get doubles? Exactly 4 rolls? At most 6 rolls?
(The probability of doubles on a single roll = 6/36 = 1/6 ≈ 0.167)
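
These three questions can be answered with `scipy.stats.geom`, which counts the number of trials up to and including the first success (so its support starts at 1). A minimal sketch, assuming two fair dice; the variable names are illustrative.

```python
from scipy.stats import geom

p = 6 / 36   # probability of doubles on one roll of two fair dice

p_exactly_2 = geom.pmf(2, p)   # first doubles on roll 2: (1 - p) * p
p_exactly_4 = geom.pmf(4, p)   # first doubles on roll 4: (1 - p)**3 * p
p_at_most_6 = geom.cdf(6, p)   # first doubles within 6 rolls: 1 - (1 - p)**6

print(f"exactly 2 rolls: {p_exactly_2:.4f}")
print(f"exactly 4 rolls: {p_exactly_4:.4f}")
print(f"at most 6 rolls: {p_at_most_6:.4f}")
```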

In [None]:
def plot_geometric(p, n_max, CDF):
    '''
    Plot the PMF (and optionally the CDF) of Geometric(p) for n = 1, ..., n_max.
    '''
    n = np.arange(1, n_max + 1)
    pmf = (1 - p)**(n - 1) * p   # P(first success on trial n)
    cdf = 1 - (1 - p)**n         # P(first success within the first n trials)
    plt.plot(n, pmf, 'o-', label='PMF')
    if CDF:
        plt.plot(n, cdf, 'ro-', label='CDF')
    plt.gcf().set_size_inches(20, 10)
    if n_max == 1:
        # Only one point: widen the x-range and draw a guide line at height p.
        plt.gca().set_xlim([0, 1])
        plt.plot([0, 1], [p, p], 'b')
        plt.xticks([1])
    else:
        plt.gca().set_xlim([1, n_max])
    plt.xlabel('n', fontsize=20)
    plt.ylabel('Probability', fontsize=20)
    plt.title('Geometric(%0.2f)' % p, fontsize=20)
    plt.xticks(fontsize=16)
    plt.yticks(fontsize=16)
    plt.legend(fontsize=18)
    plt.show()


widgets.interact(
    plot_geometric,
    p=widgets.FloatSlider(min=0, max=1, step=0.01, value=0.5),
    n_max=widgets.IntSlider(min=1, max=1000, step=1, value=10),
    CDF=widgets.ToggleButton(value=False, description='CDF')
)

Explore changing the values of p and n_max. Also note that this plot has a button labelled "CDF". When you click it, a curve representing the Cumulative Distribution Function (CDF) appears on the plot. The CDF at n is the sum of the PMF values for all trial counts up to and including n, that is, the probability that the first success occurs within the first n trials.
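
The relationship between the two curves can be verified directly: a running sum of the geometric PMF should match the closed form $1 - (1 - p)^n$. A short sketch with an arbitrary choice of $p = 0.5$ and the first 10 trials; the variable names are illustrative.

```python
import numpy as np

p = 0.5                  # arbitrary success probability for the check
n = np.arange(1, 11)     # trial counts 1, ..., 10

pmf = (1 - p) ** (n - 1) * p          # P(first success on trial n)
cdf_from_sum = np.cumsum(pmf)         # running sum of the PMF
cdf_closed_form = 1 - (1 - p) ** n    # P(first success within n trials)

print(np.allclose(cdf_from_sum, cdf_closed_form))
```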

Based on the CDF, what appears to be the chance it takes me 6 or fewer rolls of two dice to get doubles?