# Random Variables

Khan Academy:
https://www.khanacademy.org/math/statistics-probability/random-variables-stats-library

Random variables map the outcomes of random processes to numbers. They are usually denoted by capital letters, e.g. **X**.  

* **Discrete Random Variables** - Take distinct, separate values
* **Continuous Random Variables** - Take any value in an interval

**Example**. Let's say we want to estimate the number of people we will see in line at a store. We conduct an experiment by visiting the store 50 times. Across the 50 visits we observe 0 people 24 times, 1 person 18 times, and 2 people 8 times. We estimate each probability as the relative frequency of that outcome, as shown below.

| People in the line | Times Observed | Probability Estimate |
| :------------------ | :-------------- | :-----------------: |
| 0 | 24 | $\frac{24}{24+18+8}=\frac{24}{50}$ = 0.48 = 48% |
| 1 | 18 | $\frac{18}{50}$ = 0.36 = 36% |
| 2 | 8 | $\frac{8}{50}$ = 0.16 = 16% |

Now, let's say we plan to visit the store 500 times in the coming two years. How many times do we expect to see a line of 2 people? A reasonable expectation would be

\begin{equation*}
500 \cdot \frac{8}{50} = 80
\end{equation*}
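
The same arithmetic can be sketched in Python. This is only an illustration of the relative-frequency estimate; the variable names (`observed_counts`, `planned_visits`, and so on) are assumptions introduced here and are not part of the original notes.

```python
# Observed line lengths from the 50 store visits in the table above.
observed_counts = {0: 24, 1: 18, 2: 8}

total_visits = sum(observed_counts.values())

# The relative frequency of each outcome serves as its probability estimate.
probability_estimates = {
    people: count / total_visits for people, count in observed_counts.items()
}

# Expected number of visits, out of 500 planned visits, with 2 people in line.
planned_visits = 500
expected_two_person_lines = planned_visits * probability_estimates[2]

print(probability_estimates)
print(expected_two_person_lines)
```

Note that the estimates sum to 1 by construction, since every visit falls into exactly one row of the table.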


### Mean and Variance

Let's say we have a discrete random variable X which is equal to the number of workouts in a week.

| X | P(X) |
| - | - |
| 0 | 0.1 |
| 1 | 0.15 |
| 2 | 0.4 |
| 3 | 0.25 |
| 4 | 0.1 |
 
**Expected value/Mean**. The expected value of $X$ is  

 \begin{equation*}
E(X) = \mu_x = 0\cdot0.1 + 1\cdot0.15 + 2\cdot0.4 + 3\cdot0.25 + 4\cdot0.1 = 2.1 
\end{equation*}

So the expected number of workouts in a week is 2.1.
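
As a quick check, the expected value can be computed as a probability-weighted sum. The sketch below assumes the distribution is stored as a dictionary; the names `pmf` and `expected_value` are hypothetical.

```python
# Distribution of X (workouts per week) from the table above.
pmf = {0: 0.1, 1: 0.15, 2: 0.4, 3: 0.25, 4: 0.1}


def expected_value(pmf):
    """Probability-weighted sum of the values of a discrete random variable."""
    return sum(value * prob for value, prob in pmf.items())


mean_workouts = expected_value(pmf)
print(round(mean_workouts, 2))
```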
 
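**Variance and Standard Deviation**. Variance measures spread: it is the expected squared deviation from the mean, $Var(X) = E[(X-\mu_x)^2]$. The standard deviation is its square root.  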

*Variance*  

\begin{equation*}
Var(X) = (0-2.1)^2\cdot0.1 + (1-2.1)^2\cdot0.15 + (2-2.1)^2\cdot0.4 + (3-2.1)^2\cdot0.25 + (4-2.1)^2\cdot0.1 = 1.19
\end{equation*}  

*Standard Deviation*  

\begin{equation*}
\sigma_x = \sqrt{Var(X)} = \sqrt{1.19} \approx 1.09
\end{equation*}
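
The variance and standard deviation above can be reproduced with a few lines of Python. This is a minimal sketch for the workout distribution only; the helper names `mean` and `variance` are assumptions made for illustration.

```python
import math

# Distribution of X (workouts per week) from the table above.
pmf = {0: 0.1, 1: 0.15, 2: 0.4, 3: 0.25, 4: 0.1}


def mean(pmf):
    """Probability-weighted sum of the values."""
    return sum(x * p for x, p in pmf.items())


def variance(pmf):
    """Probability-weighted average of the squared distance from the mean."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())


var_x = variance(pmf)
std_x = math.sqrt(var_x)

print(round(var_x, 2), round(std_x, 2))
```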

### The sum and difference of two random variables

If $X$ and $Y$ are independent random variables, then 

\begin{align*}
E(X + Y) &= E(X) + E(Y) \\
E(X - Y) &= E(X) - E(Y) \\
Var(X \pm Y) &= Var(X) + Var(Y)
\end{align*}
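
These identities can be checked exactly for small discrete distributions by enumerating every pair of outcomes. The sketch below reuses the workout distribution for $X$ and pairs it with a made-up distribution for $Y$; `pmf_y` and the helper `combine` are hypothetical and exist only for this illustration. Independence enters through the product `px * py`.

```python
from itertools import product


def mean(pmf):
    return sum(x * p for x, p in pmf.items())


def variance(pmf):
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())


def combine(pmf_x, pmf_y, op):
    """Distribution of op(X, Y) for independent X and Y.

    Independence means P(X=x, Y=y) = P(X=x) * P(Y=y).
    """
    result = {}
    for (x, px), (y, py) in product(pmf_x.items(), pmf_y.items()):
        value = op(x, y)
        result[value] = result.get(value, 0.0) + px * py
    return result


pmf_x = {0: 0.1, 1: 0.15, 2: 0.4, 3: 0.25, 4: 0.1}  # workouts per week
pmf_y = {1: 0.5, 2: 0.3, 3: 0.2}                    # hypothetical second variable

pmf_sum = combine(pmf_x, pmf_y, lambda x, y: x + y)
pmf_diff = combine(pmf_x, pmf_y, lambda x, y: x - y)

print(mean(pmf_sum), mean(pmf_x) + mean(pmf_y))
print(mean(pmf_diff), mean(pmf_x) - mean(pmf_y))
print(variance(pmf_sum), variance(pmf_diff), variance(pmf_x) + variance(pmf_y))
```

Each printed line should show matching values up to floating-point rounding. In particular, the variance of the difference equals the variance of the sum.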

**Deriving variance of the difference of random variables**  

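\begin{align*}
Var(X-Y)&=E[(X-Y-E(X-Y))^2]\\
        &=E[((X-E(X))-(Y-E(Y)))^2]\\
        &=E[(X-E(X))^2-2\cdot(X-E(X))\cdot(Y-E(Y))+(Y-E(Y))^2]\\
        &=E[(X-E(X))^2]-E[2\cdot(X-E(X))\cdot(Y-E(Y))]+E[(Y-E(Y))^2]\\
        &=Var(X)-0+Var(Y)\\
        &=Var(X)+Var(Y)
\end{align*}

The middle term is zero because, for independent $X$ and $Y$, $E[(X-E(X))\cdot(Y-E(Y))]=E[X-E(X)]\cdot E[Y-E(Y)]=0\cdot0=0$.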
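
For comparison, the same steps applied to the sum differ only in the sign of the cross term, which is zero for independent $X$ and $Y$, so the result is unchanged:

\begin{align*}
Var(X+Y)&=E[(X+Y-E(X+Y))^2]\\
        &=E[((X-E(X))+(Y-E(Y)))^2]\\
        &=E[(X-E(X))^2]+E[2\cdot(X-E(X))\cdot(Y-E(Y))]+E[(Y-E(Y))^2]\\
        &=Var(X)+0+Var(Y)\\
        &=Var(X)+Var(Y)
\end{align*}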

### Probability Distribution and Probability Density Functions

* Probability Distribution Functions (also called probability mass functions) - Discrete random variables
* Probability Density Functions - Continuous random variables
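
The difference between the two can be illustrated with a short sketch. The discrete part reuses the workout table from this document. The continuous part assumes a uniform random variable on $[0, 0.5]$, a hypothetical example chosen because its density is 2, which shows that a density value is not itself a probability. Probabilities for a continuous variable come from the area under the density, approximated here with a simple midpoint rule.

```python
# Discrete case: each separate value gets its own probability.
pmf = {0: 0.1, 1: 0.15, 2: 0.4, 3: 0.25, 4: 0.1}
prob_at_least_3 = pmf[3] + pmf[4]


# Continuous case: hypothetical uniform density on [low, high].
def uniform_pdf(x, low=0.0, high=0.5):
    """Density of a uniform random variable on [low, high]."""
    return 1.0 / (high - low) if low <= x <= high else 0.0


def prob_between(a, b, pdf, steps=10_000):
    """Approximate P(a <= X <= b) as the area under pdf (midpoint rule)."""
    width = (b - a) / steps
    return sum(pdf(a + (i + 0.5) * width) for i in range(steps)) * width


density_at_point = uniform_pdf(0.25)                 # a density, may exceed 1
prob_interval = prob_between(0.1, 0.3, uniform_pdf)  # an actual probability

print(prob_at_least_3, density_at_point, prob_interval)
```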


**Note**. The code below estimates the probability of each possible number of heads by repeatedly flipping `n_coins` fair coins.

In [13]:
import random

def flip_fair_coins(n_coins):
    """ int (number of coins to flip) -> list of 1s (heads) and 0s (tails)
    """
    outcomes = []
    for _ in range(n_coins):
        outcomes.append(random.randint(0, 1))
    return outcomes

def trials(n_trials, n_coins):
    ## initialize a dictionary that tracks how often each number of heads is seen;
    ## each value is a list [number of heads, count] so the count can be updated in place
    trial_dict = {}
    for i in range(n_coins + 1):
        trial_dict['{}_heads'.format(i)] = [i, 0]

    for trial in range(n_trials):
        ## flip the coins and count the heads in this trial
        outcome = flip_fair_coins(n_coins)
        heads = sum(outcome)

        ## update the heads dictionary based on how many heads we've seen in the current trial
        trial_dict['{}_heads'.format(heads)][1] += 1

    ## divide the number of occurrences of each head count by the total number of trials to get the probabilities
    for key in trial_dict:
        trial_dict[key][1] /= n_trials

    return trial_dict
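
A more compact variant of the simulation above, sketched with `collections.Counter` and compared against the exact binomial probabilities. This is an illustrative alternative, not the original author's code; the function names, the `seed` parameter, and the choice of 3 coins and 20,000 trials are all assumptions.

```python
import math
import random
from collections import Counter


def simulate_head_counts(n_trials, n_coins, seed=None):
    """Estimate P(number of heads = k) for k = 0..n_coins by simulation."""
    rng = random.Random(seed)  # local generator so runs can be reproduced
    counts = Counter(
        sum(rng.randint(0, 1) for _ in range(n_coins)) for _ in range(n_trials)
    )
    return {k: counts[k] / n_trials for k in range(n_coins + 1)}


def exact_head_probability(k, n_coins):
    """Exact probability of k heads in n_coins fair flips (binomial)."""
    return math.comb(n_coins, k) / 2 ** n_coins


estimates = simulate_head_counts(n_trials=20_000, n_coins=3, seed=0)
for k, estimate in estimates.items():
    print(k, round(estimate, 3), exact_head_probability(k, 3))
```

With enough trials the estimated relative frequencies should settle close to the exact values 1/8, 3/8, 3/8, 1/8.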