# The Bernoulli family of distributions

## Binomial
* A discrete distribution on the number of successes $y \in \{0, 1, \ldots, n\}$ in $n$ independent Bernoulli trials, where each trial has probability $p$ of success.
* Parameterized by $n$ (the number of trials), and $p$ (the probability of success of each trial).
$$
p(y) = \binom{n}{y} p^y (1 - p)^{n - y}
$$
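
As a quick check of the formula above, here is a minimal sketch in plain Python. The function name `binomial_pmf` is illustrative, not from any library; it only assumes `math.comb` from the standard library.

```python
from math import comb


def binomial_pmf(y: int, n: int, p: float) -> float:
    """Probability of exactly y successes in n independent trials."""
    if not 0 <= y <= n:
        return 0.0  # outside the support {0, ..., n}
    return comb(n, y) * p**y * (1 - p) ** (n - y)


# Summing over the whole support should give (approximately) 1.
total = sum(binomial_pmf(y, 10, 0.3) for y in range(11))
print(total)
```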

## Multinomial
* A discrete-vector distribution on the counts $(y_1, \ldots, y_k)$ of trials that fall in each of $k$ bins, with $\sum_{i=1}^{k} y_i = n$.
* Parameterized by $n$ (the number of trials) and $p_1, \dots, p_k$, the probability of each bin ($k$ is implicit).
* Specializes to the binomial distribution in the case of exactly two bins.
$$
p((y_1, \ldots, y_k)) = n! \prod_{i=1}^{k} \frac{p_i^{y_i}}{y_i!}
$$
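
The two-bin specialization can be checked numerically. The sketch below uses a hypothetical helper `multinomial_pmf` (not a library function) that follows the formula above term by term, and compares it with the binomial expression for $n = 10$, $y = 3$, $p = 0.3$.

```python
from math import comb, factorial, prod


def multinomial_pmf(counts, probs):
    """Probability of observing the given per-bin counts.

    The number of trials n is taken to be sum(counts).
    """
    if len(counts) != len(probs):
        raise ValueError("counts and probs must have the same length")
    coeff = factorial(sum(counts))  # n!
    for y in counts:
        coeff //= factorial(y)  # divide by each y_i! (exact integer division)
    return coeff * prod(p**y for p, y in zip(probs, counts))


# With two bins, the result should match the binomial formula.
two_bins = multinomial_pmf((3, 7), (0.3, 0.7))
binomial = comb(10, 3) * 0.3**3 * 0.7**7
print(two_bins, binomial)
```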

## Geometric
* A discrete distribution on the number of trials $y \in \mathbb{Z}_{\ge 1}$ required for the first success.
* Parameterized by $p$ (the probability of success of each trial).
$$
p(y) = p(1 - p)^{y - 1}
$$
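
A minimal sketch of this mass function, assuming $y$ counts trials starting from 1 (which is what the exponent $y - 1$ implies). The name `geometric_pmf` is illustrative.

```python
def geometric_pmf(y: int, p: float) -> float:
    """Probability that the first success happens on trial y (y = 1, 2, ...)."""
    if y < 1:
        return 0.0  # at least one trial is always needed
    return p * (1 - p) ** (y - 1)


# The tail decays geometrically, so a long partial sum is very close to 1.
partial = sum(geometric_pmf(y, 0.2) for y in range(1, 201))
print(partial)
```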

## Negative Binomial
* A discrete distribution on the number of trials $y \in \{r, r + 1, \ldots\}$ required for $r$ successes.
* Parameterized by $r$ (the number of required successes) and $p$.
* The $r=1$ case is the geometric distribution.
$$
p(y)= \binom{y - 1}{r - 1} p^{r} (1 - p)^{y - r}
$$
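
The $r = 1$ reduction can be verified directly. This sketch assumes $y \ge r$, since $r$ successes need at least $r$ trials; `negative_binomial_pmf` is a hypothetical name.

```python
from math import comb


def negative_binomial_pmf(y: int, r: int, p: float) -> float:
    """Probability that the r-th success happens on trial y."""
    if y < r:
        return 0.0  # r successes need at least r trials
    return comb(y - 1, r - 1) * p**r * (1 - p) ** (y - r)


# With r = 1 the binomial coefficient is 1 and the geometric formula remains.
nb = negative_binomial_pmf(5, 1, 0.3)
geometric = 0.3 * (1 - 0.3) ** 4
print(nb, geometric)
```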

## Beta 
* A continuous distribution on the parameter $\theta \in [0, 1]$ of Bernoulli trials.
* Parameterized by $\alpha, \beta > 0$. 
* The conjugate prior to the binomial distribution. After observing $y$ successes in $n$ trials, the parameters update by
$$\begin{align}
\alpha &\mapsto \alpha + y \\
\beta &\mapsto \beta + n - y
\end{align}$$
The distribution is given by
$$
p(\theta) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha) \Gamma(\beta)} \theta^{\alpha - 1} (1 - \theta)^{\beta - 1}
$$
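
The conjugate update above amounts to two additions. The sketch below uses illustrative helper names (`beta_update`, `beta_mean`) and relies on the standard fact that a beta distribution has mean $\alpha / (\alpha + \beta)$.

```python
def beta_update(alpha: float, beta: float, y: int, n: int):
    """Posterior beta parameters after y successes in n trials."""
    return alpha + y, beta + n - y


def beta_mean(alpha: float, beta: float) -> float:
    """Mean of a beta distribution, alpha / (alpha + beta)."""
    return alpha / (alpha + beta)


# Start from a uniform prior, Beta(1, 1), and observe 7 successes in 10 trials.
post_alpha, post_beta = beta_update(1, 1, y=7, n=10)
print(post_alpha, post_beta)            # 8 4
print(beta_mean(post_alpha, post_beta))
```

The posterior mean lies between the prior mean (0.5) and the observed success fraction (0.7), and moves toward the latter as more data arrive.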

## Dirichlet
* A continuous vector distribution on parameters $\theta_1, \dots, \theta_k$ of multinomial trials.
* Parameterized by $\alpha_1, \dots, \alpha_k > 0$.
* The conjugate prior to the multinomial distribution. The update function is analogous to the beta distribution.
* Reduces to the beta distribution when $k = 2$.
$$
p(\theta_1, \dots, \theta_k) = \Gamma\left(\sum_{i=1}^k \alpha_i \right) \prod_{i=1}^k 
                                    \frac{\theta_i^{\alpha_i - 1}}{\Gamma(\alpha_i)}
$$
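
By analogy with the beta case, the Dirichlet update adds the observed count of each bin to the matching parameter, $\alpha_i \mapsto \alpha_i + y_i$. A minimal sketch, with `dirichlet_update` as a hypothetical helper name:

```python
def dirichlet_update(alphas, counts):
    """Posterior Dirichlet parameters after observing per-bin counts."""
    if len(alphas) != len(counts):
        raise ValueError("alphas and counts must have the same length")
    return [a + y for a, y in zip(alphas, counts)]


# Three bins, flat prior, 10 trials split as 5 / 3 / 2.
posterior = dirichlet_update([1, 1, 1], [5, 3, 2])
print(posterior)

# With k = 2 this is the beta update for y = 7 successes in n = 10 trials.
two_bin_posterior = dirichlet_update([1, 1], [7, 3])
print(two_bin_posterior)
```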

# The Poisson family of distributions
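A **Poisson process** is one in which events occur independently of one another at a constant average rate, so that the distribution of the number of events in a time interval depends only on the interval's length.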

## Poisson

* A discrete distribution on the count $y$ of events occurring in a unit interval. 
* Parameterized by $\lambda > 0$, the rate at which events occur (in expectation).
$$
p(y) = \frac{\lambda^y e^{-\lambda}}{y!}
$$
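
A short sketch of the mass function, which also checks numerically that the probabilities sum to 1 and that the expected count equals $\lambda$. The name `poisson_pmf` is illustrative, and the sums are truncated at 60 terms, which is an assumption that holds comfortably for small $\lambda$.

```python
from math import exp, factorial


def poisson_pmf(y: int, lam: float) -> float:
    """Probability of exactly y events in a unit interval at rate lam."""
    if y < 0:
        return 0.0
    return lam**y * exp(-lam) / factorial(y)


lam = 3.0
# Truncated sums; the neglected tail is negligible for lam = 3.
total = sum(poisson_pmf(y, lam) for y in range(60))
mean = sum(y * poisson_pmf(y, lam) for y in range(60))
print(total, mean)
```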

## Exponential
* A continuous distribution on the time $t \ge  0$ between Poisson events.
* Parameterized by $\lambda > 0$, the rate at which events occur (in expectation).
$$
p(t) = \lambda e^{-\lambda t}
$$
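
The link to the Poisson distribution can be checked numerically: waiting at most $t$ for the next event is the same as seeing at least one event in an interval of length $t$, whose count is Poisson with mean $\lambda t$. The sketch below integrates the density with a simple midpoint rule; all function names are illustrative.

```python
from math import exp, factorial


def exponential_pdf(t: float, lam: float) -> float:
    """Density of the waiting time between events at rate lam."""
    return lam * exp(-lam * t) if t >= 0 else 0.0


def poisson_pmf(y: int, mean: float) -> float:
    """Probability of y events when the expected count is `mean`."""
    return mean**y * exp(-mean) / factorial(y)


def prob_wait_at_most(t: float, lam: float, steps: int = 10_000) -> float:
    """P(T <= t), by midpoint-rule integration of the exponential density."""
    h = t / steps
    return sum(exponential_pdf((i + 0.5) * h, lam) for i in range(steps)) * h


lam, t = 1.5, 2.0
from_density = prob_wait_at_most(t, lam)
from_counts = 1.0 - poisson_pmf(0, lam * t)  # at least one event in [0, t]
print(from_density, from_counts)
```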

## Gamma
* A continuous distribution on the time $t \ge 0$ it takes for $\alpha$ events to occur.
* Parameterized by $\lambda > 0$ (the rate of events) and $\alpha \in \mathbb{Z}_{\ge 1}$ (the number of events). The density below remains valid for any real $\alpha > 0$.
* The conjugate prior to the Poisson distribution.
$$
p(t) = \frac{\lambda^\alpha}{\Gamma(\alpha)}t^{\alpha - 1} e^{-\lambda t}
$$
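
When the gamma distribution is used as a conjugate prior, its variable $t$ plays the role of the unknown Poisson rate, and $\alpha$ and $\lambda$ act as hyperparameters. The standard update after observing counts $y_1, \dots, y_m$ over $m$ unit intervals is $\alpha \mapsto \alpha + \sum_j y_j$ and $\lambda \mapsto \lambda + m$. A minimal sketch, with hypothetical helper names and the standard gamma mean $\alpha / \lambda$:

```python
def gamma_poisson_update(alpha: float, lam: float, counts):
    """Posterior gamma parameters after Poisson counts over unit intervals."""
    return alpha + sum(counts), lam + len(counts)


def gamma_mean(alpha: float, lam: float) -> float:
    """Mean of a gamma distribution with shape alpha and rate lam."""
    return alpha / lam


# Prior Gamma(alpha=2, lam=1); observe 3, 5 and 4 events in three unit intervals.
post_alpha, post_lam = gamma_poisson_update(2.0, 1.0, [3, 5, 4])
print(post_alpha, post_lam)
print(gamma_mean(post_alpha, post_lam))
```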