# Exercises

## Bernoulli Distribution
1. What is the sample space of the Bernoulli distribution?

$\Omega = \{ 0, 1 \}$

2. What is its corresponding probability measure?

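$\mu(\{1\}) = p$ and $\mu(\{0\}) = 1 - p$ for a parameter $p \in [0, 1]$. For a fair coin, $p = \frac{1}{2}$, so $\mu(x) = \frac{1}{2}$ for both outcomes.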

## Transformations on Random Variables

3. Suppose $X$ is uniform on $[0, 2\pi]$. Find the density of $Y = \sin X$.

4. Let $X_1$ and $X_2$ be two independent random variables, each uniformly distributed on $[0, 1]$.
    - Find the density of $Y_1 = X_1 + X_2$

    - Find the density of $Y_2 = X_1 - X_2$

    - Find the density of $Y_3 = X_1 / X_2$

    - Find the density of $Y_4 = \max(X_1, X_2)$
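
A quick Monte Carlo check can help validate whichever densities you derive for exercise 4. The sketch below assumes NumPy, an arbitrary seed, and an arbitrary sample size; the helper `empirical_density` is a name introduced here for illustration. It estimates each density with a normalized histogram. $Y_3 = X_1 / X_2$ is left out because it is unbounded, so a fixed histogram range would distort the normalization.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily for reproducibility
N = 100_000                     # sample size is an assumption

x1 = rng.uniform(0.0, 1.0, size=N)
x2 = rng.uniform(0.0, 1.0, size=N)

def empirical_density(samples, bins=50, value_range=None):
    """Return bin centers and a histogram normalized to integrate to 1."""
    heights, edges = np.histogram(samples, bins=bins, range=value_range, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, heights

# Each entry: transformed samples and the interval that contains all of them.
transformed = {
    "Y1 = X1 + X2": (x1 + x2, (0.0, 2.0)),
    "Y2 = X1 - X2": (x1 - x2, (-1.0, 1.0)),
    "Y4 = max(X1, X2)": (np.maximum(x1, x2), (0.0, 1.0)),
}

densities = {}
for name, (samples, value_range) in transformed.items():
    centers, heights = empirical_density(samples, value_range=value_range)
    densities[name] = (centers, heights)
    print(name, "histogram peak near", round(float(centers[np.argmax(heights)]), 2))
```

Comparing `heights` against your closed-form density evaluated at `centers` gives a simple sanity check of the derivation.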


## Expected Values and Moments

5. Find the mean and variance of a Gaussian random variable $X$ with a density:
$$ p(x) = \frac{1}{\sqrt{2 \pi \sigma^2}} \exp{ \left(- \frac{(x - m)^2}{2 \sigma^2 } \right)} $$

6. Find the mean of the Cauchy distribution:
$$ p(x) = \frac{1}{\pi (1 + x^2)} $$
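
Before attempting the integral in exercise 6, it can be instructive to look at how sample averages behave. The integral $\int x\, p(x)\, dx$ is not absolutely convergent for this density, so the mean is not defined. The sketch below, which assumes NumPy and an arbitrary seed, prints the running mean of Cauchy samples at a few checkpoints.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed
N = 1_000_000                   # number of samples (assumed)

samples = rng.standard_cauchy(size=N)
running_mean = np.cumsum(samples) / np.arange(1, N + 1)

# For a distribution with a finite mean these values would settle down;
# for the Cauchy distribution they typically keep jumping around.
for k in (10**2, 10**3, 10**4, 10**5, 10**6):
    print(k, running_mean[k - 1])
```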

7. Find the mean and variance of the Binomial distribution:
$$ b(x; n, p) = \begin{pmatrix} n \\ x \end{pmatrix} p^x (1-p)^{n - x} $$

For $X \sim b(x; n, p)$:
$$ \mathbb{E}[X] = \sum_{x=0}^n x \begin{pmatrix} n \\ x \end{pmatrix} p^x (1-p)^{n - x} $$
$$ = \sum_{x=0}^n x \frac{n!}{x! (n-x)!} p^x (1-p)^{n - x} $$
$$ = 0 + \sum_{x=1}^n \frac{n!}{(x-1)! (n-x)!} p^x (1-p)^{n - x} $$

Let $y = x - 1$ and $m = n - 1$, so that $x = y + 1$ and $n = m + 1$:
    
$$ = \sum_{y=0}^m \frac{(m+1)!}{y! (m - y)!} p^{y+1} (1 - p)^{m - y} $$
$$ = (m + 1) p \sum_{y=0}^m \frac{m!}{y! (m - y)!} p^y (1 - p)^{m - y} $$
$$ = n p \sum_{y=0}^m \frac{m!}{y! (m - y)!} p^y (1 - p)^{m - y} $$
$$ = np \sum_{y=0}^m \begin{pmatrix} m \\ y \end{pmatrix} p^y (1 - p)^{m - y} $$

But by the binomial theorem, we have:
$$ = np (p + (1 - p))^m $$
$$ \mathbb{E}[X] = np $$

We compute variance as:
$$ Var(X) = \mathbb{E}[X^2] - \mathbb{E}[X]^2 $$
$$ \mathbb{E}[X^2] = \sum_{x = 0}^n x^2 \begin{pmatrix} n \\ x \end{pmatrix} p^x (1-p)^{n - x} $$
$$ = \sum_{x = 0}^n x^2 \frac{n!}{x! (n - x)!} p^x (1-p)^{n - x} $$
$$ = 0 + \sum_{x = 1}^n nx \frac{(n-1)!}{(x-1)! (n - x)!} p^x (1-p)^{n - x} $$
$$ = \sum_{x = 1}^n nx \begin{pmatrix} n - 1 \\ x - 1 \end{pmatrix} p^x (1-p)^{n - x} $$
$$ = np \sum_{x = 1}^n x \begin{pmatrix} n - 1 \\ x - 1 \end{pmatrix} p^{x-1} (1-p)^{n - x} $$

Let $y = x - 1$ and $m = n - 1$, so that $x = y + 1$ and $n = m + 1$:
$$ = np \sum_{y=0}^m (y+1) \begin{pmatrix} m \\ y \end{pmatrix} p^{y} (1 - p)^{m - y} $$
$$ = np \left( \sum_{y=0}^m y \begin{pmatrix} m \\ y \end{pmatrix} p^{y} (1 - p)^{m - y} + \sum_{y=0}^m \begin{pmatrix} m \\ y \end{pmatrix} p^{y} (1 - p)^{m - y} \right) $$
$$ = np \left( \sum_{y=0}^m y \begin{pmatrix} m \\ y \end{pmatrix} p^{y} (1 - p)^{m - y} + 1 \right)$$

The first term is the expected value of a binomial distribution with parameters $m$ and $p$, so we can reuse the result $\mathbb{E}[X] = np$ derived above, with $m$ in place of $n$:
$$ = np ( mp + 1 ) $$
$$ = np ((n - 1)p + 1)$$
$$ = np (np - p + 1) $$
$$ \mathbb{E}[X^2] = (np)^2 + np(1 - p) $$

Therefore:
$$ Var(X) = \mathbb{E}[X^2] - \mathbb{E}[X]^2 $$
$$ = (np)^2 + np(1 - p) - (np)^2 $$
$$ = np(1 - p) $$
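
The closed forms above can be checked numerically by summing over the probability mass function directly. This is a minimal sketch using only the standard library; the values of `n` and `p` are illustrative, and `binomial_pmf` is a helper name introduced here.

```python
from math import comb

def binomial_pmf(x, n, p):
    """b(x; n, p) as defined in exercise 7."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 20, 0.3  # illustrative values

mean = sum(x * binomial_pmf(x, n, p) for x in range(n + 1))
second_moment = sum(x**2 * binomial_pmf(x, n, p) for x in range(n + 1))
variance = second_moment - mean**2

print("mean:    ", mean, "vs np =", n * p)
print("variance:", variance, "vs np(1 - p) =", n * p * (1 - p))
```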

8. Let $X$ be a random variable such that $\mathbb{E}[|X|^m] \leq AC^m$ for some positive constants $A$ and $C$, and all integers $m \geq 0$. Show that $\mu (|X| > C) = 0$.
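
One possible route, sketched here as a hint rather than a complete solution, applies Markov's inequality to $|X|^m$. The auxiliary constant $\varepsilon > 0$ is introduced only for this sketch:

$$ \mu(|X| \geq C + \varepsilon) = \mu(|X|^m \geq (C + \varepsilon)^m) \leq \frac{\mathbb{E}[|X|^m]}{(C + \varepsilon)^m} \leq A \left( \frac{C}{C + \varepsilon} \right)^m $$

The right-hand side tends to $0$ as $m \to \infty$, so $\mu(|X| \geq C + \varepsilon) = 0$ for every $\varepsilon > 0$. Since $\{ |X| > C \} = \bigcup_{k \geq 1} \{ |X| \geq C + \frac{1}{k} \}$, countable subadditivity then gives $\mu(|X| > C) = 0$.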

## Joint Probability and Independence
9. Generate a sample of two random variables $X$ and $Y$ where $X$ and $Y$ are normal with a correlation $\rho$.

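```python
from numpy import array, corrcoef
from numpy.linalg import cholesky
from numpy.random import normal

N = 1000
COR = 0.2
# Generate independent standard normal samples, one pair per row
samples = normal(loc=0., scale=1., size=(N, 2))
# Target covariance (equal to the correlation matrix, since both variances are 1)
covariance = array([
    [1., COR],
    [COR, 1.]
])
# cholesky returns the lower-triangular L with L @ L.T == covariance.
# Samples are stored as rows, so multiply by L.T to give them that covariance.
samples = samples @ cholesky(covariance).T
# Compute the sample correlation matrix
rho = corrcoef(samples.T)
print(f"Correlation: {rho}")
```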

```
Correlation: [[1.         0.21478601]
 [0.21478601 1.        ]]
```


## Conditional Probability and Conditional Expectation
10. Let $X$ and $Y$ be two random variables with $\mathbb{E}[Y] = m$ and $\mathbb{E}[Y^2] < \infty$.
- Show that the constant $c$ that minimizes $\mathbb{E}[(Y - c)^2]$ is $c  = m$.

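$$ \mathbb{E}[(Y - c)^2] = \mathbb{E}[Y^2 - 2cY + c^2] $$
$$ = \mathbb{E}[Y^2] - 2c\mathbb{E}[Y] + c^2 $$
$$ = c^2 - 2cm + \mathbb{E}[Y^2] $$
$$ \frac{\partial}{\partial c} \mathbb{E}[(Y - c)^2] = 2c - 2m = 0 $$
$$ \therefore c^* = m $$

The second derivative with respect to $c$ is $2 > 0$, so this stationary point is a minimum.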

- Show that the random variable $f(X)$ that minimizes $\mathbb{E}[(Y - f(X))^2\ \vert\ X]$ is $f(X) = E[Y\ \vert\ X]$.

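$$ \mathbb{E}[(Y - f(X))^2\ \vert\ X] = \mathbb{E}[Y^2 - 2f(X)Y + f^2(X)\ \vert\ X] $$
$$ = \mathbb{E}[Y^2\ \vert\ X] - 2f(X) \mathbb{E}[Y\ \vert\ X] + f^2(X) $$

Here $f(X)$ is determined by $X$, so it can be pulled out of the conditional expectation. Differentiating with respect to the value $f(X)$ and setting the result to zero:
$$ \frac{\partial}{\partial f(X)} \mathbb{E}[(Y - f(X))^2\ \vert\ X] = -2\mathbb{E}[Y\ \vert\ X] + 2f(X) = 0 $$
$$ \therefore f^*(X) = \mathbb{E}[Y\ \vert\ X] $$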

- Show that the random variable $f(X)$ that minimizes $\mathbb{E}[(Y - f(X))^2]$ is also $f(X) = E[Y\ \vert\ X]$.
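
One way to approach this last part, sketched here under the assumption that the law of total expectation (tower property) may be used, is to rewrite the unconditional error as an average of conditional errors:

$$ \mathbb{E}[(Y - f(X))^2] = \mathbb{E}\left[ \mathbb{E}[(Y - f(X))^2\ \vert\ X] \right] $$

The inner conditional expectation is the quantity from the previous part, which is minimized for every value of $X$ by $f(X) = \mathbb{E}[Y\ \vert\ X]$. A choice that minimizes the integrand pointwise also minimizes its expectation.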

11. Would you consider a coin asymmetric (biased) if, after 1000 tosses, the number of heads equals 600?
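
One way to make exercise 11 quantitative, assuming the null hypothesis of a symmetric coin ($p = \frac{1}{2}$), is to measure how far 600 heads lies from the expected count in standard deviations and to compute the exact binomial tail probability. The variable names below are illustrative.

```python
from math import comb, sqrt

n, heads = 1000, 600
p0 = 0.5  # null hypothesis: the coin is symmetric

mean = n * p0
std = sqrt(n * p0 * (1 - p0))
z = (heads - mean) / std  # distance from the mean in standard deviations

# Exact one-sided tail P(X >= 600) under the null. With p0 = 1/2 every
# outcome has probability 2**-n, so integer arithmetic suffices.
tail = sum(comb(n, k) for k in range(heads, n + 1)) / 2**n

print(f"z-score: {z:.2f}")
print(f"P(X >= {heads} | symmetric coin): {tail:.3e}")
```

A deviation of more than six standard deviations is very unlikely under the null hypothesis, which is the basis for judging the coin asymmetric.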

## Maximum Likelihood Estimation
12. Let $\{ X_1, \ldots, X_n \}$ be i.i.d. samples from $\mathcal{U}(0, \theta)$. Find the maximum likelihood estimator $\hat{\theta}$.
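
To build intuition for exercise 12, the likelihood can be evaluated on a grid of candidate values. For $\mathcal{U}(0, \theta)$ it equals $\theta^{-n}$ when every sample is at most $\theta$ and $0$ otherwise. The sketch below uses only the standard library; `theta_true`, the sample size, and the grid are illustrative assumptions.

```python
import random

random.seed(0)    # arbitrary seed
theta_true = 3.0  # illustrative value, not part of the exercise
n = 50
samples = [random.uniform(0.0, theta_true) for _ in range(n)]

def likelihood(theta, xs):
    """Likelihood of U(0, theta): theta**(-n) if all samples fit, else 0."""
    if theta <= 0 or max(xs) > theta:
        return 0.0
    return theta ** (-len(xs))

grid = [0.001 * k for k in range(1, 5001)]  # candidate thetas in (0, 5]
theta_hat = max(grid, key=lambda t: likelihood(t, samples))

print("grid maximizer:", theta_hat)
print("largest sample:", max(samples))
```

The grid maximizer sits at the first grid point at or above the largest sample, which suggests where to look for the closed-form answer.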

13. Show that the mean of the score $\frac{\partial}{\partial \theta} \ln p(x; \theta)$ is $0$.

14. Show that $-\mathbb{E}\left[ \frac{\partial^2}{\partial \theta^2} \ln p(x; \theta) \right] = Var\left( \frac{\partial}{\partial \theta} \ln p(x; \theta) \right) = I(\theta)$, the Fisher information.
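
Both identities can be checked exactly on a small example before proving them in general. The sketch below assumes that the derivatives are taken with respect to the parameter $\theta$ and uses a Bernoulli model $p(x; \theta) = \theta^x (1 - \theta)^{1 - x}$, chosen only for illustration; the function names are introduced here.

```python
theta = 0.3  # illustrative parameter value

def score(x, t):
    """d/dtheta of ln p(x; theta) for the Bernoulli model."""
    return x / t - (1 - x) / (1 - t)

def second_derivative(x, t):
    """d^2/dtheta^2 of ln p(x; theta) for the Bernoulli model."""
    return -x / t**2 - (1 - x) / (1 - t) ** 2

pmf = {0: 1 - theta, 1: theta}  # probabilities of the two outcomes

mean_score = sum(pmf[x] * score(x, theta) for x in pmf)
var_score = sum(pmf[x] * score(x, theta) ** 2 for x in pmf) - mean_score**2
neg_mean_second = -sum(pmf[x] * second_derivative(x, theta) for x in pmf)

print("mean of score:              ", mean_score)
print("variance of score:          ", var_score)
print("minus mean of 2nd derivative:", neg_mean_second)
```

For this model the last two quantities both equal $\frac{1}{\theta (1 - \theta)}$.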

## Non-Parametric Inference
15. Show that for the empirical distribution function $\hat{P}_n$ built from $n$ i.i.d. samples, $\mathbb{E}[\hat{P}_n(x)] = P(x)$ and $Var(\hat{P}_n(x)) = \frac{P(x)(1-P(x))}{n}$.
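
The two claims in exercise 15 can be checked by simulation. This sketch assumes NumPy and takes the samples from $\mathcal{U}(0, 1)$, for which $P(x) = x$ on $[0, 1]$; the seed, sample size, number of trials, and evaluation point are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(2)  # arbitrary seed
n = 25            # samples per empirical distribution (assumed)
trials = 100_000  # number of repeated experiments (assumed)
x = 0.3           # evaluation point; for U(0, 1), P(x) = x

samples = rng.uniform(0.0, 1.0, size=(trials, n))
# Empirical distribution function at x: fraction of samples <= x, per trial.
p_hat = (samples <= x).mean(axis=1)

print("mean of P_hat:", p_hat.mean(), "vs P(x) =", x)
print("var of P_hat: ", p_hat.var(), "vs P(x)(1 - P(x))/n =", x * (1 - x) / n)
```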

16. Derive the Nadaraya-Watson non-parametric regression technique.
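
For reference while working on exercise 16, the estimator that the derivation should arrive at is the kernel-weighted average $\hat{m}(x) = \frac{\sum_i K\left(\frac{x - x_i}{h}\right) y_i}{\sum_i K\left(\frac{x - x_i}{h}\right)}$. The sketch below is a minimal implementation assuming NumPy and a Gaussian kernel; the function name, bandwidth, and synthetic data are illustrative choices, not part of the exercise.

```python
import numpy as np

def nadaraya_watson(x_query, x_train, y_train, bandwidth):
    """Kernel-weighted average of y_train using a Gaussian kernel."""
    # Pairwise scaled differences, shape (len(x_query), len(x_train)).
    u = (x_query[:, None] - x_train[None, :]) / bandwidth
    weights = np.exp(-0.5 * u**2)  # kernel normalization cancels in the ratio
    return (weights @ y_train) / weights.sum(axis=1)

rng = np.random.default_rng(3)  # arbitrary seed
x_train = np.sort(rng.uniform(0.0, 2.0 * np.pi, size=200))
y_train = np.sin(x_train) + rng.normal(scale=0.1, size=200)  # noisy sine

x_query = np.linspace(0.5, 5.5, 6)
y_pred = nadaraya_watson(x_query, x_train, y_train, bandwidth=0.3)

for xq, yp in zip(x_query, y_pred):
    print(f"x = {xq:.1f}  prediction = {yp:.3f}  sin(x) = {np.sin(xq):.3f}")
```

The bandwidth controls the bias-variance trade-off: larger values smooth more aggressively, smaller values follow the noise more closely.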