# 4. Statistics and Probability

In [1]:
import numpy as np
import pandas as pd

import torch
from torch.distributions import multinomial

## Multinomial Distribution Sampling

For example, to simulate a single roll of a fair die, we can draw one sample from a multinomial distribution with equal probabilities for all six faces:

In [11]:
fair_p = torch.ones(6) / 6
multinomial.Multinomial(1, fair_p).sample()

tensor([0., 0., 0., 0., 1., 0.])

To simulate 10 rolls, we do not need a for loop; we can set the number of trials to 10 instead:

In [12]:
multinomial.Multinomial(10, fair_p).sample()

tensor([3., 1., 1., 2., 1., 2.])

The sampled counts above tell us that, in 10 rolls, we got the number 1 three times, the numbers 4 and 6 twice each, and every other number once.

We can also repeat the experiment 500 times, rolling the die 10 times in each experiment. Each row of the result holds the counts for one experiment:

In [21]:
multinomial.Multinomial(10, fair_p).sample((500,))

tensor([[2., 2., 0., 2., 2., 2.],
        [2., 1., 0., 4., 0., 3.],
        [0., 2., 4., 2., 0., 2.],
        ...,
        [2., 1., 3., 2., 2., 0.],
        [2., 1., 0., 1., 3., 3.],
        [0., 3., 1., 4., 1., 1.]])
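
As a rough sanity check on the 500 repeated experiments above, the sketch below pools all the rolls and compares the relative frequency of each face with the true probability of 1/6. The fixed seed and the variable names `counts`, `rolls_per_experiment`, and `empirical_freq` are assumptions added for illustration; the original cells are unseeded.

```python
import torch
from torch.distributions import multinomial

# Fixed seed so the run is repeatable (an assumption, not part of the original cells).
torch.manual_seed(0)

fair_p = torch.ones(6) / 6
counts = multinomial.Multinomial(10, fair_p).sample((500,))  # shape (500, 6)

# Each experiment consists of 10 rolls, so every row sums to 10.
rolls_per_experiment = counts.sum(dim=1)

# Pool all 500 x 10 = 5000 rolls and turn the counts into relative frequencies.
empirical_freq = counts.sum(dim=0) / counts.sum()

print(counts.shape)
print(empirical_freq)
```

With 5000 rolls in total, each relative frequency should land close to 1/6, although the exact values depend on the seed.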

## Bayes' Theorem

$$P(A \mid B) = \frac{P(B \mid A) P(A)}{P(B)}$$
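
To make the formula concrete, here is a minimal sketch that plugs numbers into it. The scenario (a diagnostic test) and all three input values are hypothetical and chosen only for illustration, as is the helper name `bayes_posterior`.

```python
def bayes_posterior(likelihood, prior, evidence):
    """Return P(A | B) given P(B | A), P(A), and P(B)."""
    return likelihood * prior / evidence

# Hypothetical numbers: A = "has the condition", B = "test is positive".
p_b_given_a = 0.95  # P(B | A): the test is positive when the condition is present
p_a = 0.01          # P(A): prior probability of the condition
p_b = 0.059         # P(B): overall probability of a positive test

posterior = bayes_posterior(p_b_given_a, p_a, p_b)
print(posterior)
```

Even with a fairly accurate test, the posterior stays well below 0.5 here because the prior $P(A)$ is small.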

## Marginalization (Sum Rule)

The probability of $B$ is the sum of the joint probabilities of $A_i$ and $B$ over all possible outcomes $A_1, \dots, A_n$ of $A$:

\begin{equation}
\begin{split}
P(B) &= \sum_{i=1}^{n} P(A_i,B) \\
&= P(A_1,B) + P(A_2,B) + \cdots + P(A_n,B) \\
&= P(B \mid A_1) P(A_1) + P(B \mid A_2) P(A_2) + \cdots + P(B \mid A_n) P(A_n)
\end{split}
\end{equation}

where the events $A_1, \dots, A_n$ are mutually exclusive and $P(A_1) + P(A_2) + \cdots + P(A_n) = 1$.
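
The sum rule can be checked numerically with a small sketch. The three priors and likelihoods below are made-up values for illustration; the only requirement is that the priors sum to 1.

```python
# Hypothetical values for three mutually exclusive events A_1, A_2, A_3.
priors = [0.5, 0.3, 0.2]       # P(A_i), must sum to 1
likelihoods = [0.1, 0.4, 0.8]  # P(B | A_i)

# Joint probabilities P(A_i, B) = P(B | A_i) * P(A_i).
joints = [lik * prior for lik, prior in zip(likelihoods, priors)]

# Marginalize over A: P(B) = sum_i P(A_i, B).
p_b = sum(joints)

print(joints)
print(p_b)
```

The resulting `p_b` is exactly the denominator $P(B)$ that Bayes' theorem needs when it is not given directly.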

## Mean and Variance

The mean (expectation) of a discrete random variable $X$ is given as:

$$E[X] = \sum_{x} x P(X = x)$$

If $x$ is drawn from the distribution $P$, the expectation of a function $f(x)$ is given as:

$$E_{x \sim P}[f(x)] = \sum_x f(x) P(x)$$

The variance of a random variable $X$ is given as:

$$\mathrm{Var}[X] = E\left[(X - E[X])^2\right] = E[X^2] - E[X]^2$$
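
As an illustration, the sketch below evaluates the mean and both forms of the variance for a fair six-sided die. It uses exact fractions from the standard library so that the two variance expressions can be compared without rounding error; the variable names are illustrative.

```python
from fractions import Fraction

faces = range(1, 7)
p = Fraction(1, 6)  # P(X = x) for every face of a fair die

# E[X] = sum_x x * P(X = x)
mean = sum(x * p for x in faces)

# Var[X] from the definition: E[(X - E[X])^2]
var_definition = sum((x - mean) ** 2 * p for x in faces)

# Var[X] from the shortcut: E[X^2] - E[X]^2
second_moment = sum(x ** 2 * p for x in faces)
var_shortcut = second_moment - mean ** 2

print(mean, var_definition, var_shortcut)  # 7/2 35/12 35/12
```

Both expressions give the same value, which is a quick check that the identity holds for this distribution.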

Similarly, the variance for a function $f(x)$ is given as:

$$\mathrm{Var}[f(x)] = E\left[\left(f(x) - E[f(x)]\right)^2\right]$$
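
The same computation carries over to a function of a random variable. The sketch below assumes $f(x) = x^2$ and a fair six-sided die as the distribution $P$; both choices are illustrative and not prescribed by the text.

```python
from fractions import Fraction

def f(x):
    # Illustrative choice of function; any f would work the same way.
    return x ** 2

faces = range(1, 7)
p = Fraction(1, 6)  # P(x) for every face of a fair die

# E[f(x)] = sum_x f(x) * P(x)
mean_f = sum(f(x) * p for x in faces)

# Var[f(x)] = E[(f(x) - E[f(x)])^2]
var_f = sum((f(x) - mean_f) ** 2 * p for x in faces)

print(mean_f, var_f)  # 91/6 5369/36
```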