In [1]:
from random import choice
from math import floor
from collections import Counter
import matplotlib.pyplot as plt

# Random Variables & Probability Distributions

Say $X$ is our random variable. What does $P(X)$ mean then?

We're actually more interested in $P(X = value)$ : the `probability that our random variable is a certain value`.

We do not calculate the value of $X$ itself: we already know the values, or the range of values, that $X$ can take. What we want to know is the probability that $X$ takes one particular value rather than another.

## Expectation & Variance

`expectation` : the mean of the random variable, i.e. the sum of its outcomes weighted by their probabilities

$E(X) = \mu = \sum_x{[x \, P(X=x)]}$

`variance` : measure of how far the outcomes are dispersed from the mean, defined as the expected squared deviation from $\mu$; its square root is the `standard deviation` $\sigma$

$Var(X) = \sum_x{[P(X=x)(x-\mu)^2]}$

$\sigma = \sqrt{Var(X)}$
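
To make the three formulas above concrete, here is a minimal sketch that applies them to a fair six-sided die. The die and the helper names `expectation` and `variance` are assumptions chosen for illustration; they are not part of the notebook.

```python
from math import sqrt

# Assumed example distribution: a fair six-sided die, P(X = x) = 1/6.
die_pmf = {x: 1 / 6 for x in range(1, 7)}

def expectation(pmf):
    """E(X) = sum over x of x * P(X = x)."""
    return sum(x * prob for x, prob in pmf.items())

def variance(pmf):
    """Var(X) = sum over x of P(X = x) * (x - mu)^2."""
    mu = expectation(pmf)
    return sum(prob * (x - mu) ** 2 for x, prob in pmf.items())

die_mu = expectation(die_pmf)
die_var = variance(die_pmf)
die_sigma = sqrt(die_var)  # standard deviation is the square root of the variance
print(f"E(X) = {die_mu:.3f}, Var(X) = {die_var:.3f}, sigma = {die_sigma:.3f}")
```

For a fair die the exact values are $E(X) = 3.5$ and $Var(X) = 35/12$, so the printed numbers should agree with those up to rounding.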

Before we delve further into the expectation and variances of random variables, let's learn about some common probability distributions.

## Probability Distributions

### Bernoulli

a `single trial` with 2 outcomes (success/failure) and 1 parameter, the probability of success $p$
 
$P(X = 1) = p$

$P(X = 0) = 1 - p$
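
One way to see the Bernoulli definition in action is to simulate many single trials and check that the fraction of successes approaches $p$. This is only a sketch: the function name `bernoulli_trial`, the value `p_success = 0.3`, and the fixed seed are assumptions made for illustration.

```python
from random import random, seed

def bernoulli_trial(p):
    """Return 1 (success) with probability p, otherwise 0 (failure)."""
    return 1 if random() < p else 0

seed(0)            # fixed seed only so that reruns give the same draws
p_success = 0.3    # assumed success probability, for illustration
trials = [bernoulli_trial(p_success) for _ in range(10_000)]

# The fraction of successes is an empirical estimate of P(X = 1).
estimate = sum(trials) / len(trials)
print(f"estimated P(X = 1) = {estimate:.3f} (true p = {p_success})")
```

The estimate will not equal $p$ exactly; it fluctuates around it, and the fluctuation shrinks as the number of trials grows.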

### Binomial

based on a `series of independent Bernoulli trials` and counts the number of successes ($k$) in $n$ trials

The probability of success $p$ must be the same for every trial, and the number of independent trials $n$ must be fixed in advance.

$P(X = k) = \binom{n}{k} p^k(1-p)^{n-k}$


$E(X_{binomial}) = np$

$Var(X_{binomial}) = np(1-p)$

In [2]:
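from math import comb  # binomial coefficient "n choose k"
def binomial(p, n, k):
    # P(X = k): probability of exactly k successes in n independent trials
    return comb(n, k) * (p**k) * ((1 - p)**(n - k))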
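
As a sanity check on the binomial formulas, the sketch below sums the probability mass function over every possible $k$ and compares the resulting mean and variance with $np$ and $np(1-p)$. The helper `binomial_pmf` is a hypothetical, self-contained restatement of the formula so that the block runs on its own, and the values `n_trials = 10` and `p_success = 0.3` are assumed for illustration.

```python
from math import comb, isclose

def binomial_pmf(p, n, k):
    """P(X = k) for n independent trials, each succeeding with probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n_trials, p_success = 10, 0.3   # assumed values, for illustration only
pmf = [binomial_pmf(p_success, n_trials, k) for k in range(n_trials + 1)]

total = sum(pmf)                                             # should be 1
mean = sum(k * pk for k, pk in enumerate(pmf))               # should be n*p
var = sum(pk * (k - mean) ** 2 for k, pk in enumerate(pmf))  # should be n*p*(1-p)

print("sums to 1:     ", isclose(total, 1.0))
print("mean == np:    ", isclose(mean, n_trials * p_success))
print("var == np(1-p):", isclose(var, n_trials * p_success * (1 - p_success)))
```

All three comparisons are expected to hold up to floating-point error. Without the $\binom{n}{k}$ factor the probabilities would not sum to 1, which is a quick way to catch a missing coefficient.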

_example : picking cookies out of a bag_

In [21]:
n = 25
k = floor(0.1 * n)
cookie_types = ['♡','☆','☐']
bag = [choice(cookie_types) for i in range(n)]
counts = Counter(bag)
for x, freq in counts.items():
    print(f"{x} : {freq}")

# empirical probability of drawing a ♡-shaped cookie from this bag
p = counts["♡"] / n
print(f"The probability of getting a ♡-shaped cookie is {p:.1%}")

☐ : 12
♡ : 7
☆ : 6
The probability of getting a ♡-shaped cookie is 28.0%
