# 06: Probability 

- Valid generalizations in inferential statistics require either random sampling in the case of surveys or random assignment in the case of experiments. 

## Population & Sample

- **Population**: Any complete set of observations (or potential observations). 

- **Sample**: Any subset of observations from a population. 

- **Random Sampling**: A selection process that guarantees all potential observations in the population have an equal chance of being selected. 

- **Random Assignment**: A procedure designed to ensure that each subject has an equal chance of being assigned to any group in an experiment. 

- Random sampling occurs in well-designed surveys, and random assignment occurs in well-designed experiments. 
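The two procedures defined above can be sketched with Python's standard `random` module. This is only an illustration: the population of 100 ID numbers, the sample size of 10, the two group names, and the seed are all assumptions made for the example, not part of the original notes.

```python
import random

# Hypothetical population: ID numbers for 100 potential observations.
population = list(range(1, 101))

# Fixed seed (an arbitrary choice) so the sketch is repeatable.
rng = random.Random(42)

# Random sampling: draw 10 observations without replacement, so every ID
# has the same chance of ending up in the sample.
sample = rng.sample(population, k=10)

# Random assignment: shuffle the sampled subjects, then split them in half,
# so each subject is equally likely to land in either group.
subjects = sample[:]
rng.shuffle(subjects)
treatment = subjects[:5]
control = subjects[5:]

print("sample:   ", sorted(sample))
print("treatment:", sorted(treatment))
print("control:  ", sorted(control))
```

Sampling decides who is observed; assignment decides which condition each observed subject receives. The two steps are independent of each other, which is why a survey can use only the first and an experiment only the second.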

## Probability 

- **Probability**: The proportion or fraction of times that a particular event is likely to occur. 

- **Mutually Exclusive Events**: Events that cannot occur together. 

- **Addition Rule**: Add together the separate probabilities of several mutually exclusive events to find the probability that any of these events will occur. 

$$
Pr(A \, or \, B) = Pr(A) + Pr(B)
$$

Where $Pr()$ is the probability of the event and $A$ and $B$ are mutually exclusive events. 
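As a quick check of the addition rule, the sketch below uses one roll of a fair six-sided die (an assumed example, not taken from the notes). It counts the probability of "1 or 2" directly from the sample space and compares it with the sum of the separate probabilities. The helper `pr` is a hypothetical name introduced for this illustration.

```python
from fractions import Fraction

# Sample space for one roll of a fair six-sided die (illustrative assumption).
outcomes = [1, 2, 3, 4, 5, 6]

def pr(event):
    """Probability of an event as the fraction of equally likely outcomes in it."""
    return Fraction(len(event), len(outcomes))

a = {1}  # event A: roll a 1
b = {2}  # event B: roll a 2

# A and B are mutually exclusive: they share no outcomes.
assert a & b == set()

p_a_or_b = pr(a | b)   # counted directly from the sample space
p_sum = pr(a) + pr(b)  # addition rule

print(p_a_or_b, p_sum)  # both are 1/3
```

If the two events overlapped, the simple sum would count the shared outcomes twice, which is why the rule is stated only for mutually exclusive events.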

- **Independent Events**: The occurrence of one event has no effect on the probability that the other event will occur. 

- **Multiplication Rule**: Multiply together the separate probabilities of several independent events to find the probability that these events will occur together. 

$$
Pr(A \, and \, B) = Pr(A) \times Pr(B)
$$
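The multiplication rule can be checked the same way by listing every outcome of two fair coin flips (an assumed example). The helper `pr` is a hypothetical name used only in this sketch.

```python
from fractions import Fraction
from itertools import product

# All equally likely outcomes of two fair coin flips (illustrative assumption):
# ('H', 'H'), ('H', 'T'), ('T', 'H'), ('T', 'T')
flips = list(product("HT", repeat=2))

def pr(predicate):
    """Fraction of outcomes for which the predicate is true."""
    favorable = [f for f in flips if predicate(f)]
    return Fraction(len(favorable), len(flips))

p_a = pr(lambda f: f[0] == "H")  # A: first flip is heads
p_b = pr(lambda f: f[1] == "H")  # B: second flip is heads
p_a_and_b = pr(lambda f: f[0] == "H" and f[1] == "H")

print(p_a_and_b, p_a * p_b)  # both are 1/4
```

The first flip has no effect on the second, so the events are independent and the product of the separate probabilities matches the directly counted joint probability.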

- **Dependent Events**: Events for which the occurrence of one event affects the probability that the other event will occur. 

- **Conditional Probability**: The probability of one event, given the occurrence of another event.

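- Use **Frequency Analysis** to calculate conditional probability: restrict attention to the observations in which the given event occurred, then find the proportion of those in which the event of interest also occurred. 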

- Probability plays an important role in inferential statistics, including in the important area of *hypothesis testing.*
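The frequency-analysis approach to conditional probability mentioned above can be sketched with a small table of counts. The groups, outcomes, and counts below are hypothetical, invented purely for illustration.

```python
from fractions import Fraction

# Hypothetical frequency table: (group, outcome) -> number of observations.
counts = {
    ("A", "pass"): 40,
    ("A", "fail"): 10,
    ("B", "pass"): 30,
    ("B", "fail"): 20,
}

total = sum(counts.values())

# Unconditional probability of passing: all passes over all observations.
n_pass = sum(n for (group, outcome), n in counts.items() if outcome == "pass")
p_pass = Fraction(n_pass, total)

# Conditional probability of passing given group A: keep only the
# observations in group A, then take the proportion that passed.
n_a = sum(n for (group, outcome), n in counts.items() if group == "A")
p_pass_given_a = Fraction(counts[("A", "pass")], n_a)

print(p_pass)          # 7/10
print(p_pass_given_a)  # 4/5
```

Because the probability of passing changes once group membership is known (4/5 rather than 7/10), the two events in this made-up table are dependent.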

In [6]:
from scipy.stats import norm

Probability that a randomly selected z-score will be above 1.96

In [9]:
round(1 - norm.cdf(1.96), 5)

0.025

Probability that a randomly selected z-score will be either above 1.96 or below -1.96

In [13]:
round((1 - norm.cdf(1.96)) + norm.cdf(-1.96), 5)

0.05

Probability that a randomly selected z-score will be between -1.96 and 1.96

In [14]:
round(norm.cdf(1.96) - norm.cdf(-1.96), 5)

0.95

Probability, expressed as a percentage, that a randomly selected z-score will be either above 2.58 or below -2.58

In [19]:
round(norm.cdf(-2.58), 7)*2*100

0.988
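As a cross-check that does not depend on SciPy, the same areas under the standard normal curve can be computed with `statistics.NormalDist` from the Python standard library (available in Python 3.8 and later). This is a sketch of an alternative route, not the approach used in the cells above; the variable names are chosen for this example.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

upper_tail = 1 - z.cdf(1.96)  # area above 1.96
lower_tail = z.cdf(-1.96)     # area below -1.96

# The two tails are mutually exclusive events, so the addition rule applies.
two_tailed = upper_tail + lower_tail

# Area between -1.96 and 1.96.
central = z.cdf(1.96) - z.cdf(-1.96)

# By symmetry, the two tails beyond +/-2.58 have equal area.
two_tailed_258 = 2 * z.cdf(-2.58)

print(round(upper_tail, 3))      # 0.025
print(round(two_tailed, 3))      # 0.05
print(round(central, 3))         # 0.95
print(round(two_tailed_258, 4))  # 0.0099
```

The last value is a proportion; multiplying it by 100 gives roughly 0.99 percent.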