Lecture based on the work of Ryan Henning, Adam Richards, Tammy Lee, Lee Murray, Scott Schwartz, Matthew Drury, and other Galvanize folks.

## Continuous vs Discrete (Random) Variables

So far, we have only talked about _discrete_ random variables (although we didn't use that term until now). However, a random variable (or a variable in general) need not be discrete. Here's the difference between _discrete_ and _continuous_: [ref](https://en.wikipedia.org/wiki/Continuous_and_discrete_variables)

**Discrete**: there is a positive, minimum difference between two values the variable can take

**Continuous**: between two values the variable can take, there are uncountably infinite other values the variable can take

Another way to put it: There are measurable "gaps" between the values of a discrete variable, whereas the gaps between values of a continuous variable can be made arbitrarily small.

### Probability Mass Function (PMF)

The PMF of a r.v. $X$ gives the probability of each value in the support $S$ of $X$. For example:

<img src="images/pmf.png" width=400px>

<br><font color='red'><center>Draw a PMF for a single random variable $X$ that is the sum of two 6-sided dice?</center></font>

### Probability Density Function (PDF)

The PDF of a r.v. $X$ gives the relative likelihood of each value in the support of $X$. A PDF should not be interpreted the same way as a PMF: the height $f(x)$ is a density, not a probability, and only the area under the curve over an interval is a probability.

<img src="images/pdf.png" width=400px>

<br><font color='red'><center>What is the probability that I sample the r.v. and get exactly 0.0?<br>I.e. $P(X=0.0)=$ ???</center></font>

### Recall: Expectation and Variance

For **discrete** random variables (let $P$ be the PMF of the r.v. $X$):

$$E(X) = \sum_{s \in S} s * P(X=s)$$

$$Var(X) = \sum_{s \in S} (s-E(X))^2 * P(X=s)$$

For **continuous** random variables (let $f$ be the PDF of the r.v. $X$):

$$E(X) = \int_{x=-\infty}^{\infty} x * f(x) dx$$

$$Var(X) = \int_{x=-\infty}^{\infty} (x-E(X))^2 * f(x) dx$$
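
As a minimal sketch of the discrete formulas above, the snippet below computes $E(X)$ and $Var(X)$ directly from a PMF stored as a dictionary. The fair six-sided die and the helper names `expectation` and `variance` are illustrative assumptions, not part of the lecture material.

```python
from fractions import Fraction

# Assumed example: the PMF of one fair six-sided die, as {value: probability}.
pmf = {s: Fraction(1, 6) for s in range(1, 7)}

def expectation(pmf):
    """E(X): sum over the support of s * P(X=s)."""
    return sum(s * p for s, p in pmf.items())

def variance(pmf):
    """Var(X): sum over the support of (s - E(X))^2 * P(X=s)."""
    mu = expectation(pmf)
    return sum((s - mu) ** 2 * p for s, p in pmf.items())

mean_die = expectation(pmf)
var_die = variance(pmf)
print(mean_die, var_die)  # prints: 7/2 35/12
```

Using `Fraction` keeps the arithmetic exact, so the results can be compared with a hand calculation without worrying about floating-point rounding.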

___

<font color='red'><center>What is the difference between $E(X)$ and the mean of $X$?</center></font>

## Major Probability Distributions

All you need to build a distribution is a PMF (if discrete) or a PDF (if continuous), together with its support. For it to be legit, a PMF must be non-negative and sum to 1 over the support; likewise, a PDF must be non-negative and integrate to 1 over the support.

Next, you derive the mean and variance using the PMF (or PDF), the support, and the definition of mean and variance. See [this](http://filestore.aqa.org.uk/subjects/AQA-MS03-W-2-SM.PDF) (or this [local copy](misc/AQA-MS03-W-2-SM.PDF)) for a derivation of the mean and variance for all the distributions below.

### Discrete Distributions:

#### Bernoulli

$X \sim \text{Bernoulli}(p)$:  
A single coin flip turns up heads with probability $p$.

PMF: $P[success] = p$ , $P[failure] = 1-p$

Support: $\{\text{success}, \text{failure}\}$

Mean: $p$

Variance: $p (1-p)$

<img src="images/bernoulli.png" width=400px>

#### Binomial

$X \sim \text{Binomial}(n, p)$:  
The number of coin flips out of $n$ which turn up heads, where each flip independently turns up heads with probability $p$.

PMF: $P[X=k] = {n \choose k} p^k (1-p)^{n-k}$

Support: $k \in \{0,1,...,n\}$

Mean: $np$

Variance: $np(1-p)$

<img src="https://upload.wikimedia.org/wikipedia/commons/7/75/Binomial_distribution_pmf.svg" width=400px>
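
A quick way to sanity-check the binomial formulas is to evaluate the PMF over its whole support and recompute the mean and variance from the definitions. This is only a sketch; the values `n = 10` and `p = 0.3` and the function name `binomial_pmf` are arbitrary choices for illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P[X=k] = C(n, k) * p^k * (1-p)^(n-k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.3  # assumed example parameters
probs = [binomial_pmf(k, n, p) for k in range(n + 1)]

total = sum(probs)  # should be very close to 1
mean = sum(k * pk for k, pk in enumerate(probs))
var = sum((k - mean) ** 2 * pk for k, pk in enumerate(probs))

print(total, mean, var)
print(n * p, n * p * (1 - p))  # closed-form mean and variance for comparison
```

Up to floating-point error, the two printed lines should agree on the mean and variance.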

#### Geometric

$X \sim \text{Geometric}(p)$:  
The number of coin flips up to and including the first one that turns up heads.

PMF: $P[X=k] = p(1-p)^{k-1}$

Support: $k \in \{1,2,...\}$

Mean: $\frac{1}{p}$

Variance: $\frac{1-p}{p^2}$

<img src="images/geometric.png" width=400px>
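
The geometric distribution is easy to simulate: keep flipping a biased coin and count the flips up to and including the first heads. The sketch below does this with the standard library only; `p = 0.25`, the seed, and the name `sample_geometric` are assumptions made for the example.

```python
import random

def sample_geometric(p, rng):
    """Count flips up to and including the first heads (success probability p)."""
    trials = 1
    while rng.random() >= p:  # this flip was tails, so flip again
        trials += 1
    return trials

rng = random.Random(0)  # fixed seed so the run is repeatable
p = 0.25                # assumed example parameter
samples = [sample_geometric(p, rng) for _ in range(100_000)]

sample_mean = sum(samples) / len(samples)
sample_var = sum((x - sample_mean) ** 2 for x in samples) / len(samples)

print(sample_mean, 1 / p)            # sample mean vs. 1/p
print(sample_var, (1 - p) / p**2)    # sample variance vs. (1-p)/p^2
```

With this many samples the simulated values should land near the closed forms, though they will not match exactly.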

#### Poisson

$X \sim \text{Poisson}(\lambda)$:  
The number of taxis passing a street corner in a given hour (on avg, 10/hr, so $\lambda=10$).

PMF: $P[X=k] = \frac{ \lambda^k e^{-\lambda} }{ k! }$

Support: $k \in \{0,1,2,...\}$

Mean: $\lambda$

Variance: $\lambda$

<img src="https://upload.wikimedia.org/wikipedia/commons/1/16/Poisson_pmf.svg" width=400px>
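
For the taxi example with $\lambda = 10$, the PMF can be evaluated directly. The sketch below truncates the infinite support at $k = 100$ (an assumption that is harmless here, because the tail mass beyond that point is negligible) and recomputes the mean and variance from the definitions. The name `poisson_pmf` is illustrative.

```python
import math

def poisson_pmf(k, lam):
    """P[X=k] = lam^k * exp(-lam) / k!"""
    return lam**k * math.exp(-lam) / math.factorial(k)

lam = 10  # average taxis per hour, from the example above
probs = [poisson_pmf(k, lam) for k in range(101)]  # truncated support 0..100

total = sum(probs)
mean = sum(k * pk for k, pk in enumerate(probs))
var = sum((k - mean) ** 2 * pk for k, pk in enumerate(probs))

print(total, mean, var)          # each should be close to 1, lam, lam
print(poisson_pmf(10, lam))      # probability of exactly 10 taxis in the hour
```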

This is a good time to mention the [Gambler's fallacy](https://en.wikipedia.org/wiki/Gambler%27s_fallacy). Does the Poisson distribution disagree with the fallacy?

### Continuous Distributions:

#### Uniform

$X \sim \text{Uniform}(a, b)$:  
Degrees between hour hand and minute hand ($a=0, b=360$).

PDF: $f(x) = \frac{1}{b-a}$

Support: $x \in [a, b]$

Mean: $\frac{a+b}{2}$

Variance: $\frac{(b-a)^2}{12}$

<img src="https://upload.wikimedia.org/wikipedia/commons/9/96/Uniform_Distribution_PDF_SVG.svg" width=400px>
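
For a continuous distribution, the integrals defining the mean and variance can be approximated numerically. The sketch below applies a simple midpoint rule to the uniform PDF with $a=0$ and $b=360$; the helper `integrate` and the number of slices are assumptions made for this example, not a recommended integration routine.

```python
def integrate(g, a, b, n=100_000):
    """Approximate the integral of g over [a, b] with the midpoint rule."""
    width = (b - a) / n
    return sum(g(a + (i + 0.5) * width) for i in range(n)) * width

a, b = 0.0, 360.0  # parameters from the clock-hands example

def f(x):
    """Uniform PDF on [a, b]."""
    return 1.0 / (b - a)

total_area = integrate(f, a, b)                           # should be close to 1
mean = integrate(lambda x: x * f(x), a, b)                # E(X)
var = integrate(lambda x: (x - mean) ** 2 * f(x), a, b)   # Var(X)

print(total_area, mean, var)
print((a + b) / 2, (b - a) ** 2 / 12)  # closed forms for comparison
```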

#### Normal (a.k.a., Gaussian)

$X \sim \text{Gaussian}(\mu, \sigma)$:  
IQ Scores (if $\mu = 100, \sigma = 10$)

PDF: $f(x) = \frac{ 1 }{ \sigma \sqrt{2 \pi} } \exp(- \frac{ (x-\mu)^2 }{ 2 \sigma^2 })$

Support: $x \in (-\infty, \infty)$

Mean: $\mu$

Variance: $\sigma^2$

<img src="https://upload.wikimedia.org/wikipedia/commons/7/74/Normal_Distribution_PDF.svg" width=400px>
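
Since only areas under a PDF are probabilities, a useful exercise is to compute the probability of an interval. The sketch below does this for the $\mu = 100, \sigma = 10$ example in two ways: with the normal CDF written in terms of `math.erf`, and with a midpoint-rule integral of the PDF. The interval $[90, 110]$ and the function names are assumptions chosen for illustration.

```python
import math

def normal_pdf(x, mu, sigma):
    """Gaussian density f(x) with mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * math.sqrt(2 * math.pi))

def normal_cdf(x, mu, sigma):
    """P(X <= x), expressed through the error function."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

mu, sigma = 100, 10
lo, hi = 90, 110  # one standard deviation either side of the mean

# Probability of the interval via the CDF.
prob_exact = normal_cdf(hi, mu, sigma) - normal_cdf(lo, mu, sigma)

# Same probability as an area under the PDF (midpoint rule).
n = 10_000
width = (hi - lo) / n
prob_numeric = sum(normal_pdf(lo + (i + 0.5) * width, mu, sigma) for i in range(n)) * width

print(prob_exact, prob_numeric)   # both roughly 0.68
print(normal_pdf(mu, mu, sigma))  # a density value, not a probability
```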

#### Exponential

$X \sim \text{Exponential}(\lambda)$:  
The number of minutes until a taxi passes a street corner (if on average 10 taxis pass per hour, then $\lambda=10/60$ is the average number of taxis per minute).

PDF: $f(x) = \lambda \exp(-\lambda x)$

Support: $x \in [0, \infty)$

Mean: $\frac{1}{\lambda}$

Variance: $\frac{1}{\lambda^2}$

<img src="https://upload.wikimedia.org/wikipedia/commons/e/ec/Exponential_pdf.svg" width=400px>
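
The waiting-time example can be simulated with the standard library's `random.expovariate`, which takes the rate $\lambda$ as its argument. The sketch below assumes $\lambda = 10/60$ taxis per minute, as in the example, and compares the simulated mean wait with $1/\lambda$. It also compares the fraction of waits longer than 6 minutes with the tail probability $P(X > x) = e^{-\lambda x}$, a standard property of the exponential distribution that is not derived in this section.

```python
import math
import random

lam = 10 / 60           # taxis per minute, from the example above
rng = random.Random(0)  # fixed seed so the run is repeatable

# Simulated waiting times in minutes.
waits = [rng.expovariate(lam) for _ in range(100_000)]

sample_mean = sum(waits) / len(waits)
frac_over_6 = sum(w > 6 for w in waits) / len(waits)
tail_prob = math.exp(-lam * 6)  # P(X > 6) = exp(-lam * 6)

print(sample_mean, 1 / lam)    # simulated vs. theoretical mean wait
print(frac_over_6, tail_prob)  # simulated vs. theoretical tail probability
```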

## Joint Probability Distribution

The probability of pairs of events from two (or more) random variables:

$$P(A=a, B=b)$$

With two random variables, this is also called a __bivariate distribution__; with more than two, it is called a __multivariate distribution__.

Always true:

$$P(A=a, B=b) = P(A=a | B=b) * P(B=b)$$

If independent:

$$P(A=a, B=b) = P(A=a) * P(B=b)$$

Always (if discrete):

$$1 = \sum_{a \in S_A} \sum_{b \in S_B} P(A=a, B=b)$$

Always (if continuous):

$$1 = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(a, b) \, da \, db$$
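
To make the discrete identities concrete, here is a small sketch that builds a joint PMF by enumerating outcomes. The setup is an assumed toy example: flip two fair coins, let $A$ be the first flip (1 for heads, 0 for tails) and let $B$ be the total number of heads. These two variables are dependent, so the product rule for independent variables should fail while the other identities still hold.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Joint PMF P(A=a, B=b), built from the four equally likely outcomes.
joint = defaultdict(Fraction)
for first, second in product([0, 1], repeat=2):
    joint[(first, first + second)] += Fraction(1, 4)

# Marginals: sum the joint PMF over the other variable.
p_a = defaultdict(Fraction)
p_b = defaultdict(Fraction)
for (a, b), p in joint.items():
    p_a[a] += p
    p_b[b] += p

total = sum(joint.values())  # the joint PMF sums to 1

# Conditional probability P(A=1 | B=2) = P(A=1, B=2) / P(B=2).
cond_a1_given_b2 = joint[(1, 2)] / p_b[2]

# Independence would require P(A=a, B=b) = P(A=a) * P(B=b) for every pair.
independent = all(
    joint.get((a, b), Fraction(0)) == p_a[a] * p_b[b]
    for a in p_a
    for b in p_b
)

print(total, cond_a1_given_b2, independent)  # prints: 1 1 False
```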