# 1. Random Variable
#### Example Experiment
- Toss a fair coin 2 times. Denote
    + Sample space $\Omega$: $size(\Omega) = 4$
    + Random variable `X`:  number of heads
        + $x_i$: The possible values of X

| **Outcome**           | **Value of X** | **Probability** |
|-----------------------|-----------|-----------------|
| HH                    | 2         | 1/4             |
| HT                    | 1         | 1/4             |
| TH                    | 1         | 1/4             |
| TT                    | 0         | 1/4             |


## 1.1 Probability mass function (pmf)

| **x**        | $x_1$   | $x_2$   | ...   | $x_n$ |
|--------------|-----|-----|-----|-----|
| **P(X = x)** | $p_1$ | $p_2$ | ... | $p_n$ |


- Definition: $pmf_X(x) = P(X = x)$, the function that assigns a probability to each possible value $x$ of X
- Properties:
    + $p_i \geq 0$, $\forall i$
    + $\sum\limits_{i=1}^n P(X = x_i) = \sum\limits_{i=1}^n p_i= 1$



#### pmf Example Experiment
- In Example Experiment, $x_i \in \{0, 1, 2\}$
    - $P(X = 0) = P(\{TT\}) = \frac{1}{4}$
    - $P(X = 1) = P(\{HT, TH\}) = \frac{2}{4} = \frac{1}{2}$
    - $P(X = 2) = P(\{HH\}) = \frac{1}{4}$

- pmf by table

| **x**        | 0   | 1   | 2   |
|--------------|-----|-----|-----|
| **P(X = x)** | 1/4 | 1/2 | 1/4 |

- pmf by graph

<img src="assets/5.png" width="350"/>
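
The table above can be reproduced by brute-force enumeration of the sample space. The following is a minimal sketch, assuming every sequence of tosses is equally likely (fair coin); the function name `pmf_number_of_heads` is only an illustrative choice.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def pmf_number_of_heads(n_tosses):
    """Return {k: P(X = k)} where X = number of heads in n fair tosses."""
    # All 2^n outcomes, e.g. ('H', 'T'); each one is equally likely.
    outcomes = list(product("HT", repeat=n_tosses))
    # Count how many outcomes give each value of X.
    counts = Counter(outcome.count("H") for outcome in outcomes)
    return {k: Fraction(c, len(outcomes)) for k, c in sorted(counts.items())}


pmf = pmf_number_of_heads(2)
for value, probability in pmf.items():
    print(value, probability)
```

Using `Fraction` keeps the probabilities exact, so the sum of the pmf is exactly 1 rather than a rounded float.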


## 1.2 Probability Distribution

#### Example Experiment
- $P(X \geq 1) = P(\{X=1\} \cup \{X=2\}) = P(X=1) + P(X=2) = \frac{1}{2} + \frac{1}{4} =\frac{3}{4}$ (the two events are disjoint, so their probabilities add)
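
As a small sketch of the same calculation, the probability of an event can be computed by summing the pmf over the values that satisfy it. The helper name `prob` and the predicate style are assumptions made for illustration.

```python
from fractions import Fraction

# pmf of X = number of heads in 2 fair tosses (from the table in 1.1)
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}


def prob(pmf, event):
    """P(X satisfies event): sum p_i over all values x_i where event(x_i) is true."""
    return sum(p for x, p in pmf.items() if event(x))


p_at_least_one = prob(pmf, lambda x: x >= 1)
print(p_at_least_one)
```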

# 2. Binomial distribution
#### Example Experiment
- Toss an unfair coin n times
    + $P(H) = p$
    + $P(T) = 1 - p$
- Random variable
    + $X$: Number of heads

#### pmf, example `n = 5`
- $P(X=0) = P(\{TTTTT\}) = (1-p)^5$
- $P(X=1) = P(\{HTTTT, THTTT, TTHTT, TTTHT, TTTTH\}) = 5p^1(1-p)^4$
- $P(X=2) =  C_5^2 p^2(1-p)^3$
- $P(X=3) =  C_5^3 p^3(1-p)^2$
- $P(X=4) =  C_5^4 p^4(1-p)^1$
- $P(X=5) =  C_5^5 p^5(1-p)^0$

#### pmf, General `n`
- X ~ Bin(n,p) $\to$ $P(X=k) =  C_n^k p^k(1-p)^{n-k}$, $\forall k \in \{0, 1, \dots, n\}$
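
The general formula can be sketched directly with `math.comb` from the Python standard library, which computes the binomial coefficient $C_n^k$. The value `p = 0.3` is a hypothetical choice for illustration, not taken from the notes.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p): C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 5, 0.3  # p = 0.3 is an assumed value for the unfair coin
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]
for k, probability in enumerate(pmf):
    print(k, round(probability, 5))
```

For `n = 5` this matches the six cases listed above, for example `pmf[1]` equals $5p(1-p)^4$.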

# 3. Expected value of random variable
- Mean value of the random variable

## 3.1 Definition

- $E(X) = \sum\limits_{i=1}^n x_iP(X=x_i)$

## 3.2 Example
- A lottery prize has the following pmf

| **x**        | 5   | -1  |
|--------------|-----|-----|
| **P(X = x)** | 0.1 | 0.9 |

- Expected value: $E(X) = x_1P(X=x_1) + x_2P(X=x_2) = 5*0.1 + (-1)*0.9 = -0.4$
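
The definition translates almost literally into code. A minimal sketch, with the pmf stored as a dictionary mapping each value to its probability (a representation chosen here for convenience):

```python
def expected_value(pmf):
    """E(X) = sum of x_i * P(X = x_i) over all values x_i."""
    return sum(x * p for x, p in pmf.items())


lottery = {5: 0.1, -1: 0.9}  # pmf of the lottery prize from the table
mean = expected_value(lottery)
print(round(mean, 10))
```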

# 4. Expected value and prediction
#### Example problem
- A bakery sells pancakes.
    + $X$: A random variable, the number of pancakes sold every day
    + Data on daily pancake sales (pmf)

| **x**        | $x_1$   | $x_2$   | ...   | $x_n$ |
|--------------|-----|-----|-----|-----|
| **P(X = x)** | $p_1$ | $p_2$ | ... | $p_n$ |

- We want to predict the number of pancakes sold
    + $\hat{x}$: The predicted number

#### Solve
- Choose a loss function: squared error
    + $(X - \hat{x})^2$: The squared error between the real data and the prediction
        + We want this value to be small for every $X = x_i$
    + $l(\hat{x}) = E[(X - \hat{x})^2]$: Loss function = Expected value of the squared error
        + Try to minimize $l(\hat{x})$: $l(\hat{x}) \to \min\limits_{\hat{x}}$
        + $l$ is a convex quadratic in $\hat{x}$, so its minimum is where $l'(\hat{x}) = 0$
- Do calculation on data


| **X**           | $x_1$               | $x_2$               | $\dots$ | $x_n$               |
|-----------------|---------------------|---------------------|---------|---------------------|
| **P(X = x)**    | $p_1$               | $p_2$               | $\dots$ | $p_n$               |
| $(X-\hat{x})^2$ | $(x_1 - \hat{x})^2$ | $(x_2 - \hat{x})^2$ | $\dots$ | $(x_n - \hat{x})^2$ |

- => $l(\hat{x}) = E[(X-\hat{x})^2] = \sum\limits_{i=1}^n[p_i(x_i - \hat{x})^2]$
- Minimize $l(\hat{x})$ by solving $l'(\hat{x}) = 0$

$$\begin{split}
    & l'(\hat{x}) = \frac{dl(\hat{x})}{d\hat{x}} = -\sum\limits_{i=1}^n[2p_i(x_i - \hat{x})]  = 0 \\
    \to &  \sum\limits_{i=1}^n x_ip_i - \sum\limits_{i=1}^n \hat{x}p_i = 0 \\
    \to & \sum\limits_{i=1}^n x_ip_i = \hat{x} \sum\limits_{i=1}^n p_i \\
    \to & \hat{x} = \sum\limits_{i=1}^n x_ip_i\ (\text{since } \sum\limits_{i=1}^n p_i = 1) \\
    \to & \hat{x} = E(X)
\end{split}$$

#### Conclusion
- If we choose the `squared error` as the loss function, then the best prediction for a dataset is the expected value `E(X)`
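
The conclusion can be checked numerically. The sketch below uses a made-up pancake pmf (the values 10, 20, 30 and their probabilities are hypothetical, since the notes keep the data symbolic) and a simple grid search over candidate predictions; the grid step of 0.1 is also an arbitrary choice.

```python
# Hypothetical pmf of daily pancake sales: value -> probability
pmf = {10: 0.2, 20: 0.5, 30: 0.3}


def expected_squared_error(pmf, x_hat):
    """l(x_hat) = E[(X - x_hat)^2] = sum of p_i * (x_i - x_hat)^2."""
    return sum(p * (x - x_hat) ** 2 for x, p in pmf.items())


# E(X), the prediction the derivation says is optimal
mean = sum(x * p for x, p in pmf.items())

# Brute-force search over candidates 10.0, 10.1, ..., 30.0
candidates = [c / 10 for c in range(100, 301)]
best = min(candidates, key=lambda c: expected_squared_error(pmf, c))

print(round(mean, 6), best)
```

The grid search only confirms the result on this one example; the derivation above is what shows it holds for every pmf.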

# 5. Variance of random variable
- Measures how far the random variable X tends to be from E(X)

## 5.1 Formula
- 1. $\text{Var}(X) = E[ (X - E(X))^2 ]$
- 2. $\text{Var}(X) = E(X^2) -E(X)^2$

$$\begin{split}
\text{Var}(X) &= E[ (X - E(X))^2 ]\\
              &= E[ X^2 -2XE(X) + E(X)^2 ] \\
              &= E[X^2] -2E[XE(X)] + E[E(X)^2] \\
              &= E[X^2] -2E[X]E[X] + E[X]^2 \\
              &= E[X^2] -2E[X]^2 + E[X]^2 \\
              &= E[X^2] -E[X]^2
\end{split}$$


## 5.2 Example
- A lottery prize has the following pmf

| **x**        | 5   | -1  |
|--------------|-----|-----|
| **P(X = x)** | 0.1 | 0.9 |

- Calculate Var(X)

#### Solve
- Calc E(X)
    + $E(X) = x_1P(X=x_1) + x_2P(X=x_2) = 5*0.1 + (-1)*0.9 = -0.4$

- Calc $(X - E(X))^2$

| **X**        | 5     | -1   |
|--------------|-------|------|
| **P(X = x)** | 0.1   | 0.9  |
| **X-E(X)**   | 5.4   | -0.6 |
| $(X-E(X))^2$ | 29.16 | 0.36 |

- Calc Var(X)
    + $\text{Var}(X) = E[ (X - E(X))^2] = p_1 (x_1 - E(X))^2 + p_2 (x_2 - E(X))^2 = 0.1 \cdot 29.16 + 0.9 \cdot 0.36 = 3.24$
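
Both formulas from 5.1 can be evaluated on the lottery pmf to confirm that they agree. This is a small sketch; the optional argument `g` (a function applied to each value before averaging) is a convenience introduced here, not notation from the notes.

```python
lottery = {5: 0.1, -1: 0.9}  # pmf of the lottery prize


def expected_value(pmf, g=lambda x: x):
    """E[g(X)] = sum of g(x_i) * P(X = x_i); g defaults to the identity."""
    return sum(g(x) * p for x, p in pmf.items())


mean = expected_value(lottery)

# Formula 1: Var(X) = E[(X - E(X))^2]
var_definition = expected_value(lottery, lambda x: (x - mean) ** 2)

# Formula 2: Var(X) = E(X^2) - E(X)^2
var_shortcut = expected_value(lottery, lambda x: x**2) - mean**2

print(round(var_definition, 10), round(var_shortcut, 10))
```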
    

# 6. Discrete random variables with infinite number of values
## 6.1 Example Experiment: E(X) exists
- Toss a fair coin until the 1st head occurs. There are infinitely many possible outcomes
    + H
    + TH
    + TTH
    + TTTH
    + ...
    + TTTT...TH
- Let X = number of tosses until the 1st head occurs
- Calculate the expected value E(X)

#### Solve
- Construct PMF table


<img src="assets/6.png" width="350"/>


| **X**        | 1   | 2       | 3       | ... | k       | ... |
|--------------|-----|---------|---------|-----|---------|-----|
| **P(X = x)** | $0.5$ | $0.5^2$ | $0.5^3$ | ... | $0.5^k$ | ... |

 - Check: $\sum\limits_{k=1}^{\infty} 0.5^k = 1$

- Calculate Expected value
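    + $E(X) = \sum\limits_{k=1}^{\infty}x_kP(X=x_k) = \sum\limits_{k=1}^{\infty} k \cdot \frac{1}{2^k} = \sum\limits_{k=1}^{\infty} \frac{1}{2} k \left( \frac{1}{2} \right)^{k-1} = \frac{1}{2}\frac{1}{(1-0.5)^2} = 2$
        + This uses $\sum\limits_{k=1}^{\infty} k r^{k-1} = \frac{1}{(1-r)^2}$ for $|r| < 1$, with $r = \frac{1}{2}$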
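
Since the sum is infinite, it cannot be evaluated term by term in code, but truncated sums give a quick sanity check. A minimal sketch, where the cut-off points are arbitrary choices:

```python
def partial_mass(n_terms):
    """Sum of P(X = k) = 0.5^k for k = 1..n_terms; should approach 1."""
    return sum(0.5**k for k in range(1, n_terms + 1))


def partial_expectation(n_terms):
    """Sum of k * 0.5^k for k = 1..n_terms; should approach E(X) = 2."""
    return sum(k * 0.5**k for k in range(1, n_terms + 1))


for n_terms in (5, 10, 20, 60):
    print(n_terms, partial_mass(n_terms), partial_expectation(n_terms))
```

The partial sums increase toward 2 and the remaining tail shrinks geometrically, which is consistent with the closed-form result.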


## 6.2 Example Experiment 2: E(X) does not exist
- Toss a fair coin until the 1st head occurs. If H appears at the k-th toss, the player gets $2^{k-1}$ dollars. E.g.
    + H: 1 dollar
    + TH: 2 dollars
    + TTH: 4 dollars
    + ...

- Let $X = 2^{k-1}$ be the random variable that models the reward when H occurs at the k-th toss
- Calculate the expected value E(X)

#### Solve

- Construct PMF table

| **k**        | 1   | 2       | 3       | ... | k       | ... |
|--------------|-----|---------|---------|-----|---------|-----|
| $X=2^{k-1}$      | 1   | 2       | 4       | ... | $2^{k-1}$   | ... |
| $P(X = x = 2^{k-1})$ | 0.5 | $0.5^2$ | $0.5^3$ | ... | $0.5^k$ | ... |

- Calculate Expected value
    + $E(X) = \sum\limits_{k=1}^{\infty}x_kP(X=x_k) = \sum\limits_{k=1}^{\infty} 2^{k-1} * \frac{1}{2^k} =  \sum\limits_{k=1}^{\infty} \frac{1}{2} = \infty$
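
The divergence is easy to see from truncated sums: every term contributes exactly $\frac{1}{2}$, so the partial sum after $n$ terms is $\frac{n}{2}$ and grows without bound. A short sketch using exact fractions:

```python
from fractions import Fraction


def partial_expectation(n_terms):
    """Sum of x_k * P(X = x_k) = 2^(k-1) * (1 / 2^k) for k = 1..n_terms."""
    return sum(Fraction(2 ** (k - 1), 2**k) for k in range(1, n_terms + 1))


for n_terms in (10, 100, 1000):
    print(n_terms, partial_expectation(n_terms))
```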


## 6.3 Exercise 1
- Let X be the random variable with the following PMF

$$\begin{cases}
 P(X = 0) =\ ?\text{ , k=0} \\
 P(X = k) = \frac{1}{2^{k+2}} \text{ , k>0} \\
\end{cases}$$

- Find P(X = 0)

#### Solve
- PMF Table

| **k**      | 0 | 1               | 2               | ... | k                   | ... |
|------------|---|-----------------|-----------------|-----|---------------------|-----|
| $X=k$      | 0 | 1               | 2               | ... | k                   | ... |
| $P(X = k)$ | ? | $\frac{1}{2^3}$ | $\frac{1}{2^4}$ | ... | $\frac{1}{2^{k+2}}$ | ... |


- We have $\sum\limits_{k=0}^\infty P(X=k) = 1$

 $$\begin{split}
     \sum\limits_{k=0}^\infty P(X=k) &= 1 \\
     \to P(X=0) + \sum\limits_{k=1}^{\infty} P(X=k) &= 1 \\
     \to P(X=0) + \sum\limits_{k=1}^{\infty} \frac{1}{2^{k+2}} &= 1 \\
     \to P(X=0) &= 1 - \sum\limits_{k=1}^{\infty} \frac{1}{2^{k+2}} \\
                &= 1 - \sum\limits_{k=1}^{\infty} \frac{1}{2^3} \left( \frac{1}{2} \right)^{k-1} \\
                &= 1 - \frac{1}{2^3}  \frac{1}{1 - \frac{1}{2}} \\
                &= \frac{3}{4}
 \end{split}$$
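
As a hedged check of this result, the tail $\sum_{k \geq 1} \frac{1}{2^{k+2}}$ can be truncated and subtracted from 1. The sketch uses exact fractions, and the number of terms is an arbitrary choice.

```python
from fractions import Fraction


def p_zero_estimate(n_terms):
    """1 minus the sum of P(X = k) = 1 / 2^(k+2) for k = 1..n_terms."""
    tail = sum(Fraction(1, 2 ** (k + 2)) for k in range(1, n_terms + 1))
    return 1 - tail


estimate = p_zero_estimate(50)
print(float(estimate))
```

The estimate sits slightly above $\frac{3}{4}$ and the gap halves with each extra term, so it converges to the value derived above.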
 
## 6.4 Exercise 2
 
- Let X be the random variable with the following PMF: $P(X = \frac{1}{2^k}) = \frac{1}{2^k},\ \forall k \geq 1$

- Calculate E(X)

#### Solve
- PMF table

| **k**                   | 1             | 2             | ... | k               | ... |
|-------------------------|---------------|---------------|-----|-----------------|-----|
| $X=\frac{1}{2^k}$       | $\frac{1}{2}$ | $\frac{1}{4}$ | ... | $\frac{1}{2^k}$ | ... |
| $P(X = \frac{1}{2^k})$  | $\frac{1}{2}$ | $\frac{1}{4}$ | ... | $\frac{1}{2^k}$ | ... |

- Expected Value

$$\begin{split}
    E(X) &= \sum\limits_{k=1}^{\infty} x_k P(X = x_k), \quad x_k = \frac{1}{2^k} \\
         &= \sum\limits_{k=1}^{\infty} \frac{1}{2^k} \frac{1}{2^k} \\
         &= \sum\limits_{k=1}^{\infty} \frac{1}{2^{2k}} = \sum\limits_{k=1}^{\infty} \frac{1}{4} \left( \frac{1}{4} \right)^{k-1} \\
         &= \frac{1}{4} \frac{1}{1-\frac{1}{4}} = \frac{1}{3}
\end{split}$$
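
The same truncation idea gives a quick check of $E(X) = \frac{1}{3}$. A minimal sketch with exact fractions, where the cut-off of 30 terms is arbitrary:

```python
from fractions import Fraction


def partial_expectation(n_terms):
    """Sum of x_k * P(X = x_k) = (1 / 2^k) * (1 / 2^k) for k = 1..n_terms."""
    return sum(Fraction(1, 2**k) * Fraction(1, 2**k) for k in range(1, n_terms + 1))


approx = partial_expectation(30)
print(float(approx))
```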


# 7. Geometric distributions
#### Example Experiment
- Toss an `unfair` coin until the 1st head occurs, with $P(H) = p$
- Let the random variable `X` = number of tosses until the 1st head occurs

#### PMF

<img src="assets/7.png" width="350"/>

| **k**        | 1   | 2        | 3          | ... | k              | ... |
|--------------|-----|----------|------------|-----|----------------|-----|
| **X=k**      | 1   | 2        | 3          | ... | k              | ... |
| **P(X = k)** | $p$ | $(1-p)p$ | $(1-p)^2p$ | ... | $(1-p)^{k-1}p$ | ... |


- PMF: $P(X=k) = (1-p)^{k-1}p$, $\forall k \geq 1$, $k \in N$

#### Expected Value
- $E(X) = \frac{1}{p}$
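
The formula $E(X) = \frac{1}{p}$ follows from the same identity used in 6.1, $\sum_{k \geq 1} k r^{k-1} = \frac{1}{(1-r)^2}$ with $r = 1 - p$. It can also be checked numerically with a truncated sum. In this sketch `p = 0.25` and the cut-off of 500 terms are assumed values for illustration.

```python
def geometric_pmf(k, p):
    """P(X = k) = (1 - p)^(k - 1) * p for k = 1, 2, 3, ..."""
    return (1 - p) ** (k - 1) * p


p = 0.25  # assumed probability of heads
n_terms = 500  # truncation point; the neglected tail is negligible here

total_mass = sum(geometric_pmf(k, p) for k in range(1, n_terms + 1))
mean = sum(k * geometric_pmf(k, p) for k in range(1, n_terms + 1))

print(total_mass, mean, 1 / p)
```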

# 8. Poisson distributions
- Example use case: Model the number of visitors to a shop `during a specific period of time` in a day
    + The number of visitors can be any non-negative integer: $k \in \{0, 1, 2, \dots\}$


#### PMF

$$\begin{cases}
    \lambda > 0 \\
    P(X = k) = \frac{\lambda ^k}{k!}  e^{-\lambda},\ \forall k \geq 0, k \in N
\end{cases}$$
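
A minimal sketch of this PMF using only the standard library. The rate `lam = 3.0` (say, three visitors on average in the chosen period) is a hypothetical value, and the sums are truncated at 50 terms because the remaining tail is negligible for this rate.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """P(X = k) = lam^k / k! * e^(-lam) for k = 0, 1, 2, ..."""
    return lam**k / factorial(k) * exp(-lam)


lam = 3.0  # assumed average number of visitors in the period
probs = [poisson_pmf(k, lam) for k in range(50)]

total_mass = sum(probs)
mean = sum(k * p_k for k, p_k in enumerate(probs))

print(round(total_mass, 10), round(mean, 10))
```

The truncated mean comes out equal to `lam`, which matches the known fact that a Poisson random variable has expected value $\lambda$.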
