# Uniform Distributions

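A uniform distribution is a probability distribution in which every value in its range is equally likely: each value has the same probability in the discrete case, and the density is constant in the continuous case.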

## Discrete Random Variables

The probability mass function (PMF) of a discrete random variable $X$ that is uniformly distributed on the integers $a, a+1, \ldots, b$ is:

\begin{equation*}
P(X = x) = \frac{1}{b-a+1}, \quad x \in \{a, a+1, \ldots, b\}
\label{eq:1} \tag{1}
\end{equation*}

where $E[X] = \frac{b+a}{2}$ and $\text{var}(X) = \frac{(b-a+1)^2 - 1}{12}$.
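As a quick sanity check, the mean and variance above can be recomputed by direct enumeration. The sketch below is only an illustration: the function name `discrete_uniform_moments` and the choice of a six-sided die ($a=1$, $b=6$) are assumptions, not part of the original text.

```python
from fractions import Fraction

def discrete_uniform_moments(a, b):
    """Return (mean, variance) of a discrete uniform on the integers a..b by enumeration."""
    values = range(a, b + 1)
    p = Fraction(1, b - a + 1)  # PMF from equation (1): same probability for every value
    mean = sum(p * x for x in values)
    variance = sum(p * (x - mean) ** 2 for x in values)
    return mean, variance

# Hypothetical example: a fair six-sided die, a = 1 and b = 6.
mean, variance = discrete_uniform_moments(1, 6)
closed_form_variance = Fraction((6 - 1 + 1) ** 2 - 1, 12)
print(mean, variance, closed_form_variance)
```

Exact fractions are used so the enumerated variance can be compared with the closed form without floating-point noise.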

## Continuous Random Variables

The probability density function (PDF) of a uniformly distributed continuous random variable $X$ is:

\begin{equation*}
f(x) = \frac{1}{b-a}, a \leq x \leq b
\label{eq:2} \tag{2}
\end{equation*}

where $E[X]=\frac{b+a}{2}$ and $\text{var}(X) = \frac{(b-a)^2}{12}$. Its cumulative distribution function (CDF) is:

\begin{equation*}
\begin{split}
F(x) &= \int_{a}^{x} f(y) \, dy = \int_{a}^{x} \frac{1}{b-a} \, dy \\
&= \frac{x-a}{b-a}, \quad a \leq x \leq b
\end{split}
\label{eq:3} \tag{3}
\end{equation*}

$F(x) = 0$ if $x < a$ and $F(x) = 1$ if $x > b$.
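Equation (3) and the two boundary cases can be written as one piecewise function. This is a minimal sketch; the name `uniform_cdf` and the example values are assumptions chosen for illustration.

```python
def uniform_cdf(x, a, b):
    """CDF of Uniform(a, b): equation (3) inside [a, b], 0 below a, 1 above b."""
    if x < a:
        return 0.0
    if x > b:
        return 1.0
    return (x - a) / (b - a)

# x = 2.5 lies a quarter of the way from a = 2 to b = 4.
print(uniform_cdf(2.5, 2, 4))
```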

### Mean and Variance

\begin{equation*}
\begin{split}
E[X] &= \frac{1}{b-a} \int_{a}^{b} xdx = \frac{b+a}{2} \\
E[X^2] &= \frac{1}{b-a} \int_{a}^{b} x^2dx = \frac{a^2 + ab + b^2}{3} \\
\text{var}(X) &= E[X^2] - E[X]^2 = \frac{(b - a)^2}{12}
\end{split}
\label{eq:4} \tag{4}
\end{equation*}
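The three lines of equation (4) can be checked with exact arithmetic by evaluating the antiderivatives of $x$ and $x^2$ directly. The sketch below assumes an illustrative interval ($a=2$, $b=5$) and a hypothetical helper name, `uniform_moments`.

```python
from fractions import Fraction

def uniform_moments(a, b):
    """Return (E[X], E[X^2], var(X)) for Uniform(a, b) using exact fractions."""
    a, b = Fraction(a), Fraction(b)
    # Integrals of x and x^2 over [a, b], each divided by (b - a).
    first = (b**2 - a**2) / (2 * (b - a))
    second = (b**3 - a**3) / (3 * (b - a))
    return first, second, second - first**2

a, b = 2, 5  # illustrative values
first, second, variance = uniform_moments(a, b)
print(first, Fraction(a + b, 2))
print(second, Fraction(a * a + a * b + b * b, 3))
print(variance, Fraction((b - a) ** 2, 12))
```

Each printed pair shows the integral next to the closed form from equation (4).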

### Examples

#### Sum Over 1

*How many IID uniform random variables on [0,1] would you expect to draw until their sum is greater than 1?*

Let $N$ be the number of draws needed for the sum to exceed 1; we're looking for its expected value, $E[N]$. $N$ is a discrete random variable taking values in $\{2, 3, 4, \ldots\}$. The minimum is $N=2$ because a single draw from $[0,1]$ can never exceed 1. Therefore,

\begin{equation*}
E[N] = \sum_{n=2}^{\infty} nP(N=n)
\end{equation*}

We can figure out what $P(N=n)$ is by first finding $P(N=2)$ and $P(N=3)$. Let $X_i \sim \text{Uniform}(0,1)$ for $i = 1, 2, \ldots, N$. Then,

\begin{equation*}
\begin{split}
P(N=2) &= P(N \leq 2)\\
&= P(X_1 + X_2 > 1) \\
&= 1 - P(X_1 + X_2 \leq 1) \\
&= 1 - \int_{0}^{1} \int_{0}^{1-x_2} dx_1 dx_2 \\
&= \frac{1}{2}
\end{split}
\end{equation*}

Now let's find $P(N=3) = P(X_1 + X_2 + X_3 > 1, X_1 + X_2 \leq 1)$. Solving both inequalities simultaneously is a bit complicated, so we can use $P(N=3) = P(N \leq 3) - P(N \leq 2)$ instead. We already have $P(N \leq 2)$ and can easily solve for $P(N \leq 3)$:

\begin{equation*}
\begin{split}
P(N \leq 3) &= P(X_1 + X_2 + X_3 > 1) \\
&= 1 - P(X_1 + X_2 + X_3 \leq 1) \\
&= 1 - \int_{0}^{1} \int_{0}^{1-x_3} \int_{0}^{1-x_3-x_2} dx_1 dx_2 dx_3 \\
&= \frac{5}{6}
\end{split}
\end{equation*}

Therefore,

\begin{equation*}
\begin{split}
P(N=3) &= P(N \leq 3) - P(N \leq 2) \\
&= \frac{5}{6} - \frac{1}{2} = \frac{1}{3}
\end{split}
\end{equation*}

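Both results fit the same pattern: $P(N \leq 2) = \frac{1}{2} = 1 - \frac{1}{2!}$ and $P(N \leq 3) = \frac{5}{6} = 1 - \frac{1}{3!}$. In general, the region $x_1 + \cdots + x_n \leq 1$ inside the unit cube has volume $\frac{1}{n!}$, which gives us: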

\begin{equation*}
P(N \leq n) = 1 - \frac{1}{n!}
\end{equation*}
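The pattern is equivalent to $P(X_1 + \cdots + X_n \leq 1) = \frac{1}{n!}$, which is easy to spot-check by simulation. The following is a rough Monte Carlo sketch, not a proof; the function name, seed, and trial count are arbitrary choices made for this illustration.

```python
import math
import random

def prob_sum_at_most_one(n, n_trials, rng):
    """Estimate P(X_1 + ... + X_n <= 1) for IID Uniform(0, 1) draws."""
    hits = sum(
        1
        for _ in range(n_trials)
        if sum(rng.random() for _ in range(n)) <= 1
    )
    return hits / n_trials

rng = random.Random(0)  # fixed seed so the run is repeatable
estimates = {n: prob_sum_at_most_one(n, 100_000, rng) for n in (2, 3, 4)}
for n, estimate in estimates.items():
    # Compare the simulated frequency with 1 / n!.
    print(n, estimate, 1 / math.factorial(n))
```

The estimates should land close to $\frac{1}{2}$, $\frac{1}{6}$, and $\frac{1}{24}$, up to sampling error.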

Taking the difference of consecutive values, as we did for $P(N=3)$, gives us for any $n \geq 2$:

\begin{equation*}
\begin{split}
P(N = n) &= P(N \leq n) - P(N \leq n-1) \\
&= \left(1 - \frac{1}{n!} \right) - \left( 1 - \frac{1}{(n-1)!} \right) \\
&= \frac{n-1}{n!}
\end{split}
\end{equation*}

Plugging this back into $E[N]$, we have:

\begin{equation*}
\begin{split}
E[N] &= \sum_{n=2}^{\infty} n P(N=n) \\
&= \sum_{n=2}^{\infty} n \left( \frac{n-1}{n!} \right) \\
&= \sum_{n=2}^{\infty} \frac{1}{(n-2)!} \\
&= \sum_{n=0}^{\infty} \frac{1}{n!}  = e
\end{split}
\end{equation*}
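Two properties of this result can be verified numerically: the probabilities $P(N=n) = \frac{n-1}{n!}$ sum to 1, and the series for $E[N]$ converges to $e$. The sketch below truncates both sums at $n = 29$, an arbitrary cutoff chosen because the remaining terms are far below floating-point precision.

```python
import math

def pmf_n(n):
    """P(N = n) = (n - 1) / n! for n >= 2."""
    return (n - 1) / math.factorial(n)

terms = range(2, 30)  # truncation point is an assumption; later terms are negligible
total_probability = sum(pmf_n(n) for n in terms)
expected_n = sum(n * pmf_n(n) for n in terms)
print(total_probability, expected_n, math.e)
```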

We can also simulate this result:

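```python
import random

def simulate_n(n_simulations):
    """Average number of Uniform(0, 1) draws needed for the running sum to exceed 1."""
    total_n = 0
    for _ in range(n_simulations):
        n = s = 0
        # Keep drawing until the running sum exceeds 1.
        while s <= 1:
            s += random.uniform(0, 1)
            n += 1
        total_n += n
    return total_n / n_simulations

print(simulate_n(1000))
```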

2.724

#### Expected Max and Min

*Suppose $x_1, x_2, \cdots, x_n$ are IID and uniform on $[0, 1]$. What's the expected value of the maximum? Expected value of the difference between the maximum and minimum?*

Remember that the expected value of a continuous random variable $X$ on $[0, 1]$ is $E[X] = \int_{0}^{1}xf(x)dx$, where $f(x) = \frac{d}{dx}F(x)$ is the PDF of $X$ and $F(x) = P(X \leq x)$ is its cumulative distribution function (CDF). Let's first find the CDF of the maximum of the $x_i$:

\begin{equation*}
\begin{split}
P(\text{max}(x_i) \leq x) &= P(x_i \leq x \forall i) \\
&= \prod_{i=1}^{n} P(x_i \leq x) \\
&= x^n
\end{split}
\end{equation*}

Why are we looking for $\text{max}(x_i) \leq x$? It's because the maximum is at most $x$ exactly when every $x_i$ is at most $x$, and by independence that joint probability factors into a product. Continuing on, the derivative of the CDF gives the PDF $nx^{n-1}$, so the expected value is:

\begin{equation*}
\begin{split}
E[\text{max}(x_i)] &= \int_{0}^{1} xnx^{n-1} dx \\
&= \int_{0}^{1} nx^n dx \\
&= \frac{n}{n+1}
\end{split}
\end{equation*}

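By symmetry (each $1 - x_i$ is also uniform on $[0, 1]$, so $\text{min}(x_i)$ has the same distribution as $1 - \text{max}(x_i)$), the expected value of the minimum is $1 - \frac{n}{n+1} = \frac{1}{n+1}$. Therefore, by linearity of expectation, the expected value of their difference is: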

\begin{equation*}
\begin{split}
E[\text{max}(x_i) - \text{min}(x_i)] &= E[\text{max}(x_i)] - E[\text{min}(x_i)] \\
&= \frac{n-1}{n+1}
\end{split}
\end{equation*}
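These three expectations can be spot-checked by simulation. The sketch below is an illustration under assumed settings ($n = 4$, 100,000 trials, a fixed seed); the function name `simulate_max_min` is hypothetical.

```python
import random

def simulate_max_min(n, n_trials, rng):
    """Estimate E[max] and E[min] of n IID Uniform(0, 1) draws."""
    total_max = total_min = 0.0
    for _ in range(n_trials):
        sample = [rng.random() for _ in range(n)]
        total_max += max(sample)
        total_min += min(sample)
    return total_max / n_trials, total_min / n_trials

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 4
mean_max, mean_min = simulate_max_min(n, 100_000, rng)
print(mean_max, n / (n + 1))                   # estimate vs. n / (n + 1)
print(mean_min, 1 / (n + 1))                   # estimate vs. 1 / (n + 1)
print(mean_max - mean_min, (n - 1) / (n + 1))  # estimate vs. (n - 1) / (n + 1)
```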

## Maximum Likelihood Estimate

The maximum likelihood estimates (MLE) of $a$ and $b$ for a uniform distribution are the values of $a$ and $b$ that maximize the log-likelihood function. For observations $x_1, x_2, \cdots, x_n$ that all lie in $[a, b]$ (otherwise the likelihood is zero), the likelihood function of a uniform distribution is:

\begin{equation*}
\begin{split}
\prod_{i=1}^{n} f(x_i;a,b) &= \prod_{i=1}^{n} \frac{1}{b-a} \\
&= \frac{1}{(b-a)^n}
\end{split}
\label{eq:5} \tag{5}
\end{equation*}

Therefore, the log-likelihood function of a uniform distribution is:

\begin{equation*}
\begin{split}
\log \left( \prod_{i=1}^{n} f(x_i;a,b) \right) &= \log \left( \frac{1}{(b-a)^n} \right) \\
&= -n \log(b-a)
\end{split}
\label{eq:6} \tag{6}
\end{equation*}

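To find the values of $a$ and $b$ that maximize the log-likelihood function (written $\log f$ below), take its partial derivatives with respect to $a$ and $b$.

\begin{equation*}
\begin{split}
\frac{\partial (\log f)}{\partial a} &= \frac{n}{b-a} \\
\frac{\partial (\log f)}{\partial b} &= -\frac{n}{b-a}
\end{split}
\label{eq:7} \tag{7}
\end{equation*}

Neither derivative is ever zero, so the maximum lies on the boundary of the admissible region. Since $\frac{\partial (\log f)}{\partial a}$ is positive, the log-likelihood is increasing in $a$, so we want $a$ to be as large as possible. The likelihood is zero unless $a \leq X_i$ for every $i$, so the MLE for $a$ is $\text{min}(X_1, X_2, \cdots, X_n)$. Similarly, $\frac{\partial (\log f)}{\partial b}$ is negative, so the log-likelihood is decreasing in $b$ and we want $b$ to be as small as possible while still satisfying $b \geq X_i$ for every $i$. Therefore, the MLE for $b$ is $\text{max}(X_1, X_2, \cdots, X_n)$.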
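To make the boundary argument concrete, the sketch below evaluates the log-likelihood on a small made-up sample. The data values and the function names `uniform_log_likelihood` and `uniform_mle` are assumptions for illustration only.

```python
import math

def uniform_log_likelihood(sample, a, b):
    """Log-likelihood of Uniform(a, b); -inf if any observation falls outside [a, b]."""
    if a >= b or min(sample) < a or max(sample) > b:
        return -math.inf
    return -len(sample) * math.log(b - a)

def uniform_mle(sample):
    """MLE of (a, b): the sample minimum and the sample maximum."""
    return min(sample), max(sample)

sample = [2.3, 4.1, 3.7, 2.9, 4.8]  # made-up data for illustration
a_hat, b_hat = uniform_mle(sample)
print(a_hat, b_hat)
print(uniform_log_likelihood(sample, a_hat, b_hat))        # tightest admissible interval
print(uniform_log_likelihood(sample, a_hat - 0.5, b_hat))  # wider interval: lower value
print(uniform_log_likelihood(sample, a_hat + 0.1, b_hat))  # excludes the minimum: -inf
```

Widening the interval lowers the log-likelihood, while shrinking it past an observation drops the likelihood to zero, which is why the estimates sit exactly at the sample extremes.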

## Box-Muller Transform

The Box–Muller transform generates pairs of independent standard normal random variables $X$ and $Y$ from independent random variables $U$ and $V$ that are uniformly distributed on $(0, 1)$:

\begin{equation*}
\begin{split}
X &= \sqrt{-2 \ln(U)} \cos(2 \pi V) \\
Y &= \sqrt{-2 \ln(U)} \sin(2 \pi V)
\end{split}
\label{eq:8} \tag{8}
\end{equation*}

If you're looking for a normal distribution with a non-zero mean $\mu$ and unit variance (e.g. $X_1, Y_1 \sim N(0.5, 1)$, where $\mu = 0.5$), shift both variables by $\mu$:

\begin{equation*}
\begin{split}
X_1 &= X + \mu \\
Y_1 &= Y + \mu
\end{split}
\label{eq:9} \tag{9}
\end{equation*}
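Equations (8) and (9) can be sketched in a few lines of Python. This is a minimal illustration rather than a production sampler: the function name `box_muller`, the seed, and the sample size are assumptions, and $U$ is drawn as `1 - rng.random()` so that it lies in $(0, 1]$ and $\ln(U)$ is always defined.

```python
import math
import random

def box_muller(rng, mu=0.0):
    """Return one pair (X + mu, Y + mu) built from two independent uniform draws."""
    u = 1.0 - rng.random()  # in (0, 1], avoids log(0)
    v = rng.random()
    radius = math.sqrt(-2.0 * math.log(u))
    x = radius * math.cos(2.0 * math.pi * v)  # equation (8)
    y = radius * math.sin(2.0 * math.pi * v)
    return x + mu, y + mu                     # equation (9)

rng = random.Random(0)  # fixed seed so the run is repeatable
draws = [value for _ in range(50_000) for value in box_muller(rng, mu=0.5)]
mean = sum(draws) / len(draws)
variance = sum((d - mean) ** 2 for d in draws) / len(draws)
print(mean, variance)
```

With `mu=0.5`, the sample mean should be close to 0.5 and the sample variance close to 1, up to sampling error.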