# Probability Distributions

**probability distribution**: a mathematical function we use to represent a real-world process where the outcome is a **random variable**: a variable whose value is unknown

Most distributions have **parameters** that define their shape.

- uniform (randint): all outcomes are equally likely
    - parameters: low + high cutoffs
- binomial: number of successes after n trials
    - assumes independence of trials
    - n: number of trials
    - p: probability of success
- normal: the "bell curve"; values closer to the middle are more likely than values further away
    - mean, $\mu$: center point of the distribution
    - standard deviation, $\sigma$: the spread of the distribution, how wide or narrow it is
- Poisson: number of events that occur in a given time interval
    - $\lambda$ (`mu` in scipy): average rate over the time interval
    - upper bound is infinite
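
As a rough sketch, each of the distributions above can be built as a scipy distribution object from its parameters. The parameter values below are made up for illustration and do not come from any dataset.

```python
from scipy import stats

# Uniform over the integers 1..6 (the high cutoff is exclusive in stats.randint)
die = stats.randint(1, 7)
# Binomial: n trials, each with success probability p (illustrative values)
flips = stats.binom(n=10, p=0.5)
# Normal: loc is the mean, scale is the standard deviation (illustrative values)
bell = stats.norm(loc=0, scale=1)
# Poisson: scipy names the average-rate parameter mu (illustrative value)
events = stats.poisson(mu=3)

# Every distribution object can report its own mean and standard deviation
for name, dist in [("die", die), ("flips", flips), ("bell", bell), ("events", events)]:
    print(name, dist.mean(), dist.std())
```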

What we can get out of a scipy distribution object:

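- value -> probability
    - **pmf**: probability of being equal to a point (only for discrete distributions!)
    - **cdf**: probability of being less than or equal to a point
    - **sf**: probability of being greater than a point
- probability -> value
    - **ppf**: the point that outcomes fall at or below with the given probability (inverse of cdf)
    - **isf**: the point that outcomes fall above with the given probability (inverse of sf)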
- **rvs** for random values (could also use numpy)

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

The `stats` module within [scipy](https://www.scipy.org/) provides many functions and objects for statistical computation, including the functionality we'll need for working with probability distributions.

In [None]:
from scipy import stats

## Working with Probability Distributions

`rvs` draws random values from a distribution, which we can plot to visualize it

In [None]:
# simulate 1,000 rolls of a six-sided die (the upper bound, 7, is exclusive)
x = stats.randint(1, 7).rvs(1000)
pd.Series(x).value_counts().sort_index().plot.bar(
    width=.9,
    ec='black',
    title='Outcome of 1,000 Dice Rolls',
)

Demo: using distribution methods

- What is the probability we roll a 3?
- What is the probability we roll a 3 or less?
- What is the probability we roll greater than a 3?
- There's a 50% chance we roll less than or equal to what number?
- There's a 50% chance we roll greater than what number?
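
One way to answer the questions above, assuming a fair six-sided die modeled with `stats.randint(1, 7)`. The variable names are only for illustration.

```python
from scipy import stats

# A fair six-sided die: integers 1 through 6 (the upper bound 7 is exclusive)
die = stats.randint(1, 7)

p_exactly_3 = die.pmf(3)    # P(X == 3)
p_at_most_3 = die.cdf(3)    # P(X <= 3)
p_above_3 = die.sf(3)       # P(X > 3)
at_or_below = die.ppf(0.5)  # value we roll at or below 50% of the time
above = die.isf(0.5)        # value we roll above 50% of the time

print(p_exactly_3, p_at_most_3, p_above_3, at_or_below, above)
```

Note that `cdf` and `sf` always sum to 1 for the same point, since every roll is either at most 3 or greater than 3.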

## Examples

Suppose the average high temperature in August is 98.2 ± 2.7 degrees Fahrenheit. How hot would it have to be for a day to be in the hottest 10% of all days? The coldest 25%? How likely is it that the temperature breaks 100 degrees?
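
A possible approach, assuming daily high temperatures are normally distributed and that the 2.7 is the standard deviation. Neither assumption is stated in the problem.

```python
from scipy import stats

# Assumption: highs follow a normal distribution with mean 98.2 and std dev 2.7
temps = stats.norm(loc=98.2, scale=2.7)

hottest_10_cutoff = temps.isf(0.10)  # 10% of days are hotter than this value
coldest_25_cutoff = temps.ppf(0.25)  # 25% of days are at or below this value
p_above_100 = temps.sf(100)          # P(temperature > 100)

print(round(hottest_10_cutoff, 2), round(coldest_25_cutoff, 2), round(p_above_100, 4))
```

Under these assumptions, the hottest-10% cutoff comes out a little under 102 degrees, and roughly a quarter of days break 100.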

We know that the average number of messages in the Zoom chat during a lecture is 25. During the probability distributions lecture, we observe that 29 chat messages were sent. How likely is it that we observed exactly 29 chat messages? 29 or fewer? How likely is it we observed 29 or more chat messages?
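
A sketch of one way to model this, assuming the number of chat messages per lecture follows a Poisson distribution with an average rate of 25.

```python
from scipy import stats

# Assumption: messages per lecture are Poisson distributed with rate 25
chats = stats.poisson(mu=25)

p_exactly_29 = chats.pmf(29)   # P(X == 29)
p_29_or_fewer = chats.cdf(29)  # P(X <= 29)
# sf is strictly "greater than", so P(X >= 29) is P(X > 28)
p_29_or_more = chats.sf(28)

print(p_exactly_29, p_29_or_fewer, p_29_or_more)
```

The off-by-one in the last line matters for discrete distributions: `chats.sf(29)` would leave out the probability of exactly 29 messages.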

Suppose our company's weekly newsletter has been sent out, in its entire lifetime, 10,412 times. Of the times it has been sent out, it has been opened 2,598 times. This week we sent out 688 emails and found that 160 of them were opened. How likely is it that this many emails or fewer were opened?
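
One way to set this up, assuming each email is opened independently and that the historical open rate (2,598 out of 10,412) is the probability of success for a binomial distribution.

```python
from scipy import stats

# Assumption: the lifetime open rate is the per-email probability of an open
p_open = 2598 / 10412

# This week's send: 688 independent trials, each opened with probability p_open
opens = stats.binom(n=688, p=p_open)

# P(X <= 160): probability of seeing 160 or fewer opens
p_160_or_fewer = opens.cdf(160)

print(round(p_open, 4), opens.mean(), p_160_or_fewer)
```

The expected number of opens under this model is `n * p`, about 172, so 160 opens is somewhat below average.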