### Day 7
#### 8/8 CS 109 Binomial, Bernoulli and Variance

https://web.stanford.edu/class/archive/cs/cs109/cs109.1238/lectures/7-BernoulliBinomial/7-BernoulliBinomial.pdf

Code below: the `Bernoulli` class models a Bernoulli random variable with parameter $p = 0.25$, and the helper functions compute its probabilities and expected value.

* The expected value sums over all possible values $x_i$, weighting each value by its probability:
$$ E[X] = \sum_{x_i} x_i \cdot P(X=x_i) $$
* Properties of Expectation
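    * Linearity: $E[aX + b] = aE[X] + b$
    * Expectation of a sum: $E[X + Y] = E[X] + E[Y]$
    * Law of the Unconscious Statistician: $E[g(X)] = \sum_{x} g(x) \cdot P(X=x)$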
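
The expectation properties listed above can be checked numerically. The sketch below is not from the lecture. It assumes a fair six-sided die as the example PMF, and the helper name `expectation` is made up for illustration. It computes $E[g(X)]$ as $\sum_x g(x) \cdot P(X=x)$ (the Law of the Unconscious Statistician) and compares $E[2X+3]$ with $2E[X]+3$ (linearity).

```python
# PMF of a fair six-sided die (an assumed example, not from the lecture notes)
pmf = {x: 1 / 6 for x in range(1, 7)}

def expectation(pmf, g=lambda x: x):
    """E[g(X)] = sum over x of g(x) * P(X = x)."""
    return sum(g(x) * px for x, px in pmf.items())

e_x = expectation(pmf)                             # E[X]
e_linear = expectation(pmf, lambda x: 2 * x + 3)   # E[2X + 3]
e_square = expectation(pmf, lambda x: x ** 2)      # E[X^2]

# Expectation of a sum: two dice, enumerating the joint PMF.
# The dice are taken to be independent here only to build the joint PMF.
e_sum = sum((x + y) * px * py
            for x, px in pmf.items()
            for y, py in pmf.items())

print("E[X] =", e_x)
print("E[2X + 3] =", e_linear, "vs 2*E[X] + 3 =", 2 * e_x + 3)
print("E[X + Y] =", e_sum, "vs E[X] + E[Y] =", 2 * e_x)
print("E[X^2] =", e_square)
```

Note that $E[X^2]$ is not the same as $(E[X])^2$ in general; that gap is what variance measures.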

1. Bernoulli Random Variables
    * Models a single trial with a binary outcome: 1 (success) with probability $p$, 0 (failure) with probability $1-p$.
    * Also called an "indicator random variable"
2. Binomial
    * The sum of $n$ independent Bernoulli random variables with the same $p$
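
To make the "sum of $n$ Bernoullis" idea concrete, here is a minimal simulation sketch. The function names, the seed, and the choice of $n = 10$ and $p = 0.25$ are assumptions for illustration. By linearity of expectation, the empirical mean should land near $n \cdot p$.

```python
import random

def sample_bernoulli(p, rng):
    """Return 1 with probability p, else 0."""
    return 1 if rng.random() < p else 0

def sample_binomial(n, p, rng):
    """Draw one Binomial(n, p) sample by summing n independent Bernoulli(p) samples."""
    return sum(sample_bernoulli(p, rng) for _ in range(n))

rng = random.Random(109)  # fixed seed so the run is repeatable
n, p, trials = 10, 0.25, 20_000
samples = [sample_binomial(n, p, rng) for _ in range(trials)]
mean = sum(samples) / trials

print("empirical mean:", mean, "vs n * p =", n * p)
```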

Wide applicability: useful for any problem with $n$ independent trials, each succeeding with probability $p$, where we need the probability of exactly $k$ successes.

$X \sim \text{Bin}(n, p)$ means that the random variable $X$ is distributed as a Binomial with $n$ trials and probability of success $p$ on each trial.
Then the PMF is
$$ P(X=k) = { n \choose k } \cdot p^k \cdot (1-p)^{n-k} $$
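
The binomial PMF, $P(X=k) = {n \choose k} p^k (1-p)^{n-k}$, can be written directly with the standard library. This is a sketch, and the function name `binomial_pmf` is an assumption. The example uses $n = 7$ and $p = 0.55$ from the series question in these notes.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p): C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 7, 0.55
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

print("P(X = 4) =", pmf[4])
print("sum of PMF =", sum(pmf))  # should be 1 up to floating-point error
```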

In [1]:
import random

In [1]:
class Bernoulli(object):
    def __init__(self, parameter):
        self.p = parameter
        self.pmf = {1: parameter, 0: 1 - parameter}
    
    def sample(self):
        # Return 1 (success) with probability p, otherwise 0
        rand = random.uniform(0, 1)
        if rand < self.p:
            return 1
        else:
            return 0
        
# P(X = x)
def Probability(X, x):
    return X.pmf[x]

#  E[X]
def Expectation(X):
    ev = 0
    for x, px in X.pmf.items():
        ev += x * px
    return ev

def main():
    # X ~ Bern(0.25)
    X = Bernoulli(0.25)

    # P(X = 1)
    probability = Probability(X, 1)
    print("P(X = 1) = ", probability)

    # E[X]
    expected_value = Expectation(X)
    print("E[X] = ", expected_value)

In [None]:
main()

We can consider binomial distribution questions like these: 

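* If we serve 1000 ads, each with click probability $p = 0.01$, what is the probability of exactly 5 clicks?
* The Warriors play the Bucks. The Warriors win each game with probability 0.55, and each game is independent. A team must win at least 4 of 7 games to win the series. What is the probability that the Warriors win the series?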

In [4]:
from scipy import stats

print("prob of 5 clicks: ", stats.binom.pmf(5, 1000, 0.01))
print("prob of winning series: ", stats.binom.pmf(4, 7, 0.55) + stats.binom.pmf(5, 7, 0.55) + stats.binom.pmf(6, 7, 0.55) + stats.binom.pmf(7, 7, 0.55))

prob of 5 clicks:  0.03745311160824718
prob of winning series:  0.6082877968750001
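
As a sanity check on the series answer, a Monte Carlo sketch using only the standard library should land near 0.608. It makes the same modeling assumption as the `binom` call above: all 7 games are played and each is an independent trial with win probability 0.55. The seed and trial count are arbitrary choices.

```python
import random

rng = random.Random(7)  # fixed seed so the run is repeatable
trials = 100_000
p_win_game, games, wins_needed = 0.55, 7, 4

series_wins = 0
for _ in range(trials):
    # Play all 7 games; each game is an independent Bernoulli(0.55) trial
    wins = sum(1 for _game in range(games) if rng.random() < p_win_game)
    if wins >= wins_needed:
        series_wins += 1

estimate = series_wins / trials
print("simulated prob of winning series:", estimate)
```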


Spread measures how widely the values of a distribution are dispersed: two distributions can have the same expectation but very different spread.
Variance is the expected squared distance of a value from the mean. The standard deviation (STD) is the square root of the variance.
$$ Var(X) = E[(X-\mu)^2]$$
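
A small sketch of this definition, computing $E[(X-\mu)^2]$ from a PMF stored as a `{value: probability}` dict (the same shape as `Bernoulli.pmf` above). The two example PMFs are made up for illustration: both have mean 0 but very different spread.

```python
from math import sqrt

def variance(pmf):
    """Var(X) = E[(X - mu)^2] for a PMF given as {value: probability}."""
    mu = sum(x * px for x, px in pmf.items())
    return sum((x - mu) ** 2 * px for x, px in pmf.items())

# Two assumed example PMFs with the same mean (0) but different spread
narrow = {-1: 0.5, 1: 0.5}
wide = {-10: 0.5, 10: 0.5}

print("narrow: Var =", variance(narrow), "STD =", sqrt(variance(narrow)))
print("wide:   Var =", variance(wide), "STD =", sqrt(variance(wide)))
```
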
Normalized histograms are approximations of the probability density function (PDF) of the data.
Through a pretty cool derivation, we get an equivalent form that is often easier to compute:
$$ Var(X) = E[(X-\mu)^2] = \sum_{x} (x-\mu)^2 * p(x) = E[X^2] - (E[X])^2 $$
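
For reference, one way to fill in the derivation is to expand the square, apply linearity of expectation, and treat $\mu = E[X]$ as a constant:

$$
\begin{aligned}
Var(X) &= E[(X-\mu)^2] \\
       &= E[X^2 - 2\mu X + \mu^2] \\
       &= E[X^2] - 2\mu E[X] + \mu^2 \\
       &= E[X^2] - 2\mu^2 + \mu^2 \\
       &= E[X^2] - (E[X])^2
\end{aligned}
$$

As a quick check with $X \sim \text{Bern}(p)$: since $X^2 = X$ when $X \in \{0, 1\}$, we have $E[X^2] = p$, so $Var(X) = p - p^2 = p(1-p)$. For $p = 0.25$ this gives $0.1875$.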