# Probability distributions

A probability distribution describes how the values of a random variable are spread out and lets you calculate the probability of specific outcomes.

## Table of Contents

- [Random Variables and Probability Distribution](#rvars)
    - [Discrete Probability Distributions](#dpd)
- [Resources](#res)

<img src="images/stat-dpd.png" alt="" style="width: 400px;"/>

```python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
```

---
<a id='rvars'></a>

## Random Variables

In statistics, **random variables** are characteristics that you can observe but don’t control. A random variable can be a `characteristic, measurement, or count that varies randomly according to a function`. **Random** in this context means that you don’t know the value of the next observation, but you do know the probabilities associated with values and ranges of values.

## Probability Distribution

A **probability distribution** is a mathematical function that describes the probabilities for all possible outcomes of a **random variable**. In other words, the frequency of the observed values varies based on the underlying **probability distribution**.

The properties of **distributions in histograms** and of **probability distributions** are similar: both have a `shape`, `center`, and `spread`. However, the focus for probability distributions is on the `probabilities of the outcomes`. Importantly, **probability distributions** describe populations while **histograms** represent samples.

**Probability distributions** indicate the likelihood of an event or outcome. Statisticians use the following notation to describe probabilities: `p(x) =` the likelihood that the random variable takes the specific value x. The sum of the probabilities for all possible values must equal `1`. Furthermore, the probability for a particular value or range of values must be between `0` and `1`, inclusive.
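The two rules above (each probability lies in `[0, 1]`, and all probabilities sum to `1`) can be checked directly. The following is a minimal sketch that assumes a fair six-sided die as the random variable; the helper name `is_valid_pmf` is hypothetical and not part of any library.

```python
from fractions import Fraction

# Assumed example: a fair six-sided die, so p(x) = 1/6 for x = 1..6.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def is_valid_pmf(pmf):
    """Return True if every probability is in [0, 1] and they sum to 1."""
    in_range = all(0 <= p <= 1 for p in pmf.values())
    return in_range and sum(pmf.values()) == 1

print(is_valid_pmf(pmf))   # True

# Probability of a range of values: P(X >= 5) = p(5) + p(6)
print(pmf[5] + pmf[6])     # 1/3
```

Exact fractions are used here so that the sum-to-one check is not affected by floating-point rounding.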

**Probability distributions** describe the dispersion of the values for a random variable. Consequently, `the kind of variable determines the type of probability distribution`. For a single random variable, statisticians divide distributions into the following two types:

- **Discrete probability distributions** for discrete variables
- **Probability density functions** for continuous variables

---
<a id='dpd'></a>

## Discrete Probability Distributions

**Discrete probability distributions** are also known as **probability mass functions**. They apply to random variables that can assume only a set of distinct values.

For **discrete probability distribution functions**, each possible value has a non-zero likelihood. Furthermore, the probabilities for all possible values must sum to one. Because the total probability is 1, one of the values must occur for each opportunity.

If the **discrete distribution** has a finite number of values, you can display all the values with their corresponding probabilities in a table.
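As an illustration of such a table, the sketch below enumerates an assumed example (the sum of two fair dice, which is not taken from the figure) and prints each value with its probability.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Assumed example: enumerate the 36 equally likely outcomes of two fair dice
# and count how often each sum occurs.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
table = {total: Fraction(n, 36) for total, n in sorted(counts.items())}

print("sum  probability")
for total, prob in table.items():
    print(f"{total:>3}  {float(prob):.4f}")
```

Because the distribution has only 11 possible values, the whole distribution fits in a short table whose probabilities sum to one.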

<img src="images/stat-dpd.png" alt="" style="width: 300px;"/>

### Types of Discrete Distribution

There are a variety of **discrete probability distributions** that you can use to model different types of data. The correct discrete distribution depends on the properties of your data. For example, use the:

- **Binomial distribution** to model binary data, such as coin tosses.
- **Poisson distribution** to model count data, such as the count of library book checkouts per hour.
- **Uniform distribution** to model multiple events with the same probability, such as rolling a die.
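The Poisson and uniform cases in the list above can be sketched with the standard library alone. The rate of 3 checkouts per hour is an assumed value for illustration, and the function names are hypothetical.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variable with mean rate lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def uniform_pmf(x, values):
    """P(X = x) for a discrete uniform variable over `values`."""
    return 1 / len(values) if x in values else 0.0

# Assumed rate: 3 book checkouts per hour on average.
print(round(poisson_pmf(0, 3), 4))   # probability of no checkouts in an hour
print(uniform_pmf(4, range(1, 7)))   # probability of rolling a 4 with a fair die
```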

#### Binomial and Other Distributions for Binary Data

Binary data occur when you can place an observation into only two categories. They tell you that an event occurred or that an item has a particular characteristic. Examples include sale or no sale, and a pass or fail result.

Binary data allow you to `calculate proportions and percentages` easily. What is the proportion of items that pass the inspection? What percentage of customers make a purchase?
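With binary data coded as 1 and 0, a proportion is simply the mean of the values. The inspection results below are made-up data for illustration only.

```python
# Hypothetical inspection results: 1 = pass, 0 = fail.
inspections = [1, 1, 0, 1, 1, 1, 0, 1, 1, 1]

proportion = sum(inspections) / len(inspections)
print(f"Proportion passing: {proportion:.2f}")   # Proportion passing: 0.80
print(f"Percentage passing: {proportion:.0%}")   # Percentage passing: 80%
```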

To use the `binomial`, `geometric`, `negative binomial`, and the `hypergeometric` distributions, you need to satisfy the following assumptions:

1. **There are only two possible outcomes per trial**. For example, accept or reject, sale or no sale, etc.
2. **Each trial is independent** (except for hypergeometric). The result of one trial does not affect the results of another trial. For instance, when flipping a coin, the outcome of a coin toss doesn’t influence the next coin toss.
3. **The probability remains constant over time** (except for hypergeometric). In some cases, this assumption is valid based on the physical properties, such as flipping a coin. However, if there is a chance the probability can change over time, you can use the **P chart** (a control chart) to confirm this assumption. For example, the likelihood that a process produces defective products might change over time.

The **binomial**, **geometric**, **negative binomial**, and **hypergeometric** distributions describe the probabilities associated with the number of events and when they occur.

#### Binomial Distribution

Use the **binomial distribution** to calculate probabilities that an event occurs a certain number of times in a set number of trials. Specifically, it calculates the probability of X events happening within N trials.

<img src="images/stat-dpd2.png" alt="" style="width: 400px;"/>

The graph displays the probability of each possible number of 6s when you roll a die ten times. The shaded area sums the probabilities for four or more 6s to give the **cumulative probability** of rolling at least four 6s, which the graph reports as 0.06977.
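The cumulative probability for the die example can be reproduced from the binomial formula, P(X = k) = C(n, k) p^k (1 - p)^(n - k). This is a sketch using only the standard library; `binomial_pmf` is a hypothetical helper name. With p set to exactly 1/6, the result agrees with the graph to three significant digits; the small difference in the last digit is presumably due to rounding of p in the original calculation.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k): exactly k events in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 1 / 6   # ten rolls, probability of a 6 on each roll

# Cumulative probability of at least four 6s: P(X >= 4)
at_least_four = sum(binomial_pmf(k, n, p) for k in range(4, n + 1))
print(round(at_least_four, 4))   # 0.0697
```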

#### Geometric Distribution

Use the **geometric distribution** when you know the probability of an event occurring and want to calculate the likelihood of the event first occurring during a specific trial. In other words, if you keep drawing random samples, what is the probability of the event/characteristic first appearing on a given draw?

<img src="images/stat-dpd3.png" alt="" style="width: 400px;"/>

Each bar in the graph represents the probability of rolling the first 6 on a specific trial. For instance, the likelihood of rolling the first 6 on the third roll specifically is about 0.12. The red shaded region indicates that you have a 33% cumulative chance of rolling the first 6 on the 7th roll or later.
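Both figures quoted for the die example follow from the geometric formula, P(first event on trial k) = (1 - p)^(k - 1) p. The helper names in this sketch are hypothetical.

```python
def geometric_pmf(k, p):
    """P(first event occurs on trial k), for k = 1, 2, 3, ..."""
    return (1 - p)**(k - 1) * p

def geometric_tail(k, p):
    """P(first event occurs on trial k or later)."""
    return (1 - p)**(k - 1)

p = 1 / 6   # probability of rolling a 6
print(round(geometric_pmf(3, p), 4))    # 0.1157  (first 6 on the third roll)
print(round(geometric_tail(7, p), 4))   # 0.3349  (first 6 on the 7th roll or later)
```

The tail probability is just the chance that the first six rolls all fail to produce a 6.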

#### Negative Binomial Distribution

Use the **negative binomial distribution** to calculate the number of trials that are required to observe the event a specific number of times. In other words, given a known probability of an event occurring and the number of events that you specify, this distribution calculates the probability that the last of those events occurs on the Nth trial.

<img src="images/stat-dpd4.png" alt="" style="width: 400px;"/>

In the plot, each bar represents the probability that the fifth 6 occurs on the specified roll. For example, the maximum likelihood (0.04) of rolling the fifth 6 occurs at 24 rolls, which is the peak of the distribution. Additionally, the shaded area indicates that the cumulative probability of obtaining five 6s within the first 27 rolls is nearly 0.5.
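The values read from the plot can be checked with the negative binomial formula, P(r-th event on trial n) = C(n - 1, r - 1) p^r (1 - p)^(n - r). This is a sketch parameterized by the trial number; `negbinom_pmf` is a hypothetical helper, and other libraries may parameterize the distribution by the number of failures instead.

```python
import math

def negbinom_pmf(n, r, p):
    """P(the r-th event occurs exactly on trial n), for n >= r."""
    return math.comb(n - 1, r - 1) * p**r * (1 - p)**(n - r)

r, p = 5, 1 / 6   # waiting for the fifth 6

print(round(negbinom_pmf(24, r, p), 3))   # 0.036

# Cumulative probability that the fifth 6 arrives within the first 27 rolls
within_27 = sum(negbinom_pmf(n, r, p) for n in range(r, 28))
print(round(within_27, 2))                # 0.48
```

With these parameters the probability at 25 rolls equals the probability at 24 rolls, so the peak is shared by the two adjacent bars.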

#### Hypergeometric Distribution

Use the **hypergeometric distribution** when you are drawing from a small population without replacement, and you want to calculate probabilities that an event occurs a certain number of times in a set number of trials. Like the binomial distribution, the hypergeometric distribution calculates the probability of X events in N trials. However, unlike the binomial distribution, it does not assume that the likelihood of an event’s occurrence is constant. Instead, the hypergeometric distribution assumes that the probability changes because you are drawing from a small population without replacement.

We’ll draw candy blindly from a jar. Suppose there are 15 candies of various colors in the jar and our favorite candies are red. For this scenario, the binary data values are “red” and “not red.” At the start, 5 out of the 15 (33%) candies are red. We’ll use the **hypergeometric distribution** to calculate the probabilities of drawing red candies when we draw five candies from the jar. The `probabilities in this scenario are not constant` because each draw from the jar affects the probabilities for the next draw.

<img src="images/stat-dpd5.png" alt="" style="width: 400px;"/>

The graph displays the probability of drawing each possible number of red candies when you draw 5 candies altogether.
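The probabilities for the candy jar can be computed by counting combinations: choose k of the 5 red candies and 5 - k of the 10 non-red candies, out of all ways to choose 5 of 15. The helper name `hypergeom_pmf` and its parameter names are hypothetical.

```python
import math

def hypergeom_pmf(k, population, successes, draws):
    """P(exactly k successes in `draws` draws without replacement)."""
    return (math.comb(successes, k)
            * math.comb(population - successes, draws - k)
            / math.comb(population, draws))

# 15 candies in the jar, 5 of them red, 5 drawn without replacement.
for k in range(6):
    print(k, round(hypergeom_pmf(k, 15, 5, 5), 4))
```

Each printed row gives the number of red candies drawn and its probability; the six probabilities sum to one.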

---
<a id='res'></a>

## Resources

- [Statistics by Jim](https://statisticsbyjim.com/)
- [onlinemathlearning.com](https://www.onlinemathlearning.com)