# "I Am Exponential Distributions (And So Can You!)"
> "A beginner's guide to exponential distributions and coding them in Python."

- toc: true
- branch: master
- badges: true
- comments: true
- categories: [exponential distributions, statistics]
- hide: false
- search_exclude: true

# I Am Exponential Distributions (And So Can You!)\*
\*Title generously adapted from [Stephen Colbert](https://en.wikipedia.org/wiki/I_Am_America_(And_So_Can_You!))

## Theory
### Theory and background on exponential distributions.
Adapted from [General Assembly's *Data Science Immersive*](https://generalassemb.ly/education/data-science-immersive/) and [Lumen Learning's *Introduction to Statistics*](https://courses.lumenlearning.com/introstats1/chapter/the-exponential-distribution/).
- We use the exponential distribution when we are interested in modeling the **amount of time until a success occurs**. Since time is continuous, the exponential distribution is a continuous distribution. (The exponential distribution can also be used to model situations where certain events occur with a constant probability per unit value, such as length or money.)
- There are fewer large values and more small values for an exponential distribution.
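- The exponential distribution has the **memoryless property**, which says that future probabilities do not depend on any past information: the chance of waiting an additional amount of time is the same no matter how long you have already waited.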
- The parameter of the distribution is β, the average time to an event (or the average distance, average amount of money, etc.).
- If we are modeling the amount of time until a success occurs, then x is the time elapsed (otherwise it's the distance, amount of money, etc.).
- The probability density function (PDF, the relative likelihood of success at any given x) is:
<p style="text-align: center;">$f(x|\beta) = \frac{1}{\beta} e^{-x/\beta}$</p>
- The cumulative distribution function (CDF, the probability of success at a value less than or equal to any given x) is:
<p style="text-align: center;">$F(x|\beta) = 1 - e^{-x/\beta}$</p>
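To make the two formulas concrete, here is a minimal sketch that codes them by hand and compares the results with `scipy.stats.expon`. The helper names `expon_pdf` and `expon_cdf` and the value β = 6 are illustrative assumptions, not part of the original material. The last few lines also check the memoryless property numerically using the survival function `sf`, which is 1 minus the CDF.

```python
import numpy as np
import scipy.stats as stats

def expon_pdf(x, beta):
    """PDF from the formula above: (1/beta) * exp(-x/beta) for x >= 0, else 0."""
    x = np.asarray(x, dtype=float)
    return np.where(x >= 0, np.exp(-x / beta) / beta, 0.0)

def expon_cdf(x, beta):
    """CDF from the formula above: 1 - exp(-x/beta) for x >= 0, else 0."""
    x = np.asarray(x, dtype=float)
    return np.where(x >= 0, 1.0 - np.exp(-x / beta), 0.0)

beta = 6.0                      # assumed example value for the average time to an event
x = np.linspace(0, 30, 7)       # a handful of x values to compare on
dist = stats.expon(scale=beta)  # scipy's scale parameter plays the role of beta

pdf_matches = np.allclose(expon_pdf(x, beta), dist.pdf(x))
cdf_matches = np.allclose(expon_cdf(x, beta), dist.cdf(x))

# Memoryless property: P(X > s + t | X > s) should equal P(X > t)
s, t = 4.0, 3.0
conditional = dist.sf(s + t) / dist.sf(s)
unconditional = dist.sf(t)

print(pdf_matches, cdf_matches, np.isclose(conditional, unconditional))
```

If the hand-written formulas agree with scipy and the two memoryless probabilities match, all three printed values are `True`.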

## Examples
### Real-world examples of the exponential distribution, along with possible values of β.
Adapted from [Lumen Learning's *Introduction to Statistics*](https://courses.lumenlearning.com/introstats1/chapter/the-exponential-distribution/).
- The amount of time, in days, until an earthquake of magnitude 7 or greater on the Richter scale occurs on Earth.
    - $\beta$ = 20
- The length, in minutes, of long distance business telephone calls.
    - $\beta$ = 112
- The value, in cents, of the change that you have in your pocket or purse.
    - $\beta$ = 50
- The distance, in kilometers, between roadkill on a given road.
    - $\beta$ = 100
- The amount, in dollars, that customers spend in one trip to the supermarket.
    - $\beta$ = 100
- The amount of time, in minutes, a postal clerk spends with their customer. 
    - $\beta$ = 4
- The amount of time, in minutes, spouses shop for anniversary cards.
    - $\beta$ = 8
- The number of days ahead that travelers purchase their airline tickets.
    - $\beta$ = 15
- How long a computer part lasts, in years.
    - $\beta$ = 10
- Time for the next customer to arrive at a store, in minutes.
    - $\beta$ = 2
- Time until the next call at a police station in a large city, in minutes.
    - $\beta$ = 0.25
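As a quick illustration of how one of these examples turns into numbers, the sketch below takes the postal clerk example (β = 4 minutes) and asks a few questions of it with `scipy.stats.expon`. The specific questions (under 5 minutes, over 10 minutes) are assumptions chosen for illustration.

```python
import math
import scipy.stats as stats

beta = 4                          # average minutes per customer, from the list above
clerk = stats.expon(scale=beta)   # scale is the average time to an event

p_under_5 = clerk.cdf(5)          # P(X <= 5) = 1 - e^(-5/4)
p_over_10 = clerk.sf(10)          # P(X > 10) = e^(-10/4)
mean_minutes = clerk.mean()       # equals beta
median_minutes = clerk.median()   # equals beta * ln(2)

print(f'P(5 minutes or less)   = {p_under_5:.4f}')
print(f'P(more than 10 minutes) = {p_over_10:.4f}')
print(f'Mean = {mean_minutes:.2f}, median = {median_minutes:.2f}')
```

The median comes out below the mean, which is the "more small values, fewer large values" shape described in the theory section.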

## Code
### How to plot the PDF and the CDF of an exponential distribution in Python.
Adapted from [General Assembly's *Data Science Immersive*](https://generalassemb.ly/education/data-science-immersive/).
- Begin by importing the following:
    - numpy, in order to create our array of x values,
    - matplotlib, in order to create the plot, and
    - scipy.stats, in order to calculate the PDF and the CDF for each x value.
- Add the lines below the <mark>import</mark> statements to configure how the plots will display.

In [None]:
#collapse_show
import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt

%config InlineBackend.figure_format = 'retina'
%matplotlib inline

Write a function that will plot a PDF given a choice of input values:
- Low and high x values (<mark>low</mark>, <mark>high</mark>)
- The name of your plot (<mark>dist_name</mark>)
- The label for your x-axis (<mark>xlabel</mark>)
- The statistical distribution (<mark>stats_dist</mark>)
- The width of the plot lines (<mark>lw</mark>)

You can use this function again later to plot other statistical distributions.

In [None]:
#collapse_show
def plot_continuous_pdf(low, high, dist_name = 'Continuous', xlabel = 'Time', stats_dist = None, lw = 5):
        
    # Evaluate the density on a fine grid so the continuous curve plots smoothly
    x = np.linspace(low, high, 300)
    pdf = stats_dist.pdf(x)
    
    fig, ax = plt.subplots(1, 1, figsize = (10, 5))
    
    ax.set_xlim(low - 1, high + 1)
    ax.set_xlabel(xlabel, fontsize = 16)
    ax.set_ylabel('Probability Density Function (pdf)', fontsize = 16)
    ax.plot(x, pdf, color = 'darkred', lw = lw)
    # Leave a little headroom above the tallest point of the curve
    ax.set_ylim(0, np.max(pdf) + 0.03)
    
    plt.title(f'{dist_name} \n', fontsize = 20)

    plt.show()

Call the function with reasonable input values for the model, including:
- x-values from a low of 0 to a high of wherever the density approaches 0
- A statistical distribution of <mark>stats.expon(scale = $\beta$)</mark> for the exponential distribution

In [None]:
#collapse_show
plot_continuous_pdf(0, 60, dist_name = 'Probability/time of exactly X minutes until next bus', 
                    stats_dist = stats.expon(scale = 6), xlabel = 'Minutes')

The PDF shows the probability density (probability per minute) of the next bus arriving at any given moment, given that buses arrive on average once every six minutes.

![2020-03-12-I-Am-Exponential-Distributions-And-So-Can-You-Image-1.png](attachment:2020-03-12-I-Am-Exponential-Distributions-And-So-Can-You-Image-1.png)
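Because the PDF is a density, its height is not itself a probability; probabilities come from the area under the curve. The sketch below, using the same bus distribution, integrates the PDF between two assumed times (2 and 5 minutes, picked only for illustration) with `scipy.integrate.quad` and compares the area with the difference of two CDF values.

```python
import scipy.stats as stats
from scipy.integrate import quad

bus = stats.expon(scale=6)       # a bus arrives on average once every six minutes

density_at_zero = bus.pdf(0)     # height of the curve at x = 0, equal to 1/6

# Probability that the next bus arrives between 2 and 5 minutes from now,
# computed two ways: as the area under the PDF and as a difference of CDF values
area, _ = quad(bus.pdf, 2, 5)    # quad returns (integral estimate, error estimate)
cdf_difference = bus.cdf(5) - bus.cdf(2)

print(f'Density at 0 minutes      = {density_at_zero:.4f}')
print(f'Area under PDF from 2 to 5 = {area:.4f}')
print(f'CDF(5) - CDF(2)            = {cdf_difference:.4f}')
```

The two probabilities should agree to within numerical integration error.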

Write a function that will plot a CDF given a choice of input values:
- Low and high x values (<mark>low</mark>, <mark>high</mark>)
- The name of your plot (<mark>dist_name</mark>)
- The label for your x-axis (<mark>xlabel</mark>)
- The statistical distribution (<mark>stats_dist</mark>)
- The width of the plot lines (<mark>lw</mark>)

You can use this function again later to plot other statistical distributions.

In [None]:
#collapse_show
def plot_continuous_cdf(low, high, dist_name = 'Continuous', xlabel = 'Time', stats_dist = None, lw = 4):
        
    # Evaluate the CDF on a fine grid so the continuous curve plots smoothly
    x = np.linspace(low, high + 1, 300)
    
    fig, ax = plt.subplots(1, 1, figsize = (10, 5))

    # A CDF never exceeds 1, so 1.1 leaves a little headroom
    ax.set_ylim(0, 1.1)
    ax.set_xlim(low - 1, high + 1)
    ax.set_xlabel(xlabel, fontsize = 16)
    ax.set_ylabel('Cumulative Distribution Function (cdf)', fontsize = 16)
    
    ax.plot(x, stats_dist.cdf(x), lw = lw, color = 'darkblue')

    plt.title(f'{dist_name} \n', fontsize = 20)

    plt.show()

Call the function with reasonable input values for the model, including:
- x-values from a low of 0 to a high of wherever the cumulative probability approaches 1
- A statistical distribution of <mark>stats.expon(scale = $\beta$)</mark> for the exponential distribution

In [None]:
#collapse_show
plot_continuous_cdf(0, 60, dist_name = 'Probability of X or fewer minutes until next bus', 
                    stats_dist = stats.expon(scale = 6), xlabel = 'Minutes')

The CDF shows the probability that the next bus arrives at or before any given moment, given that buses arrive on average once every six minutes.

![2020-03-12-I-Am-Exponential-Distributions-And-So-Can-You-Image-2.png](attachment:2020-03-12-I-Am-Exponential-Distributions-And-So-Can-You-Image-2.png)
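The CDF can also be read in reverse: instead of asking "what is the probability by minute x?", you can ask "by what minute is the probability p reached?". Here is a minimal sketch using the `ppf` method (the inverse of the CDF) on the same bus distribution. The 90% target is an assumption chosen for illustration.

```python
import math
import scipy.stats as stats

bus = stats.expon(scale=6)    # a bus arrives on average once every six minutes

p_within_6 = bus.cdf(6)       # P(X <= 6) = 1 - e^(-1), about 0.632
wait_90 = bus.ppf(0.90)       # minutes by which the bus has arrived with probability 0.90

print(f'P(bus within 6 minutes)  = {p_within_6:.4f}')
print(f'Wait for a 90% chance    = {wait_90:.2f} minutes')
```

Note that the chance of a bus arriving within the average wait of six minutes is about 63%, not 50%, because the distribution is skewed toward short waits.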