# Bernoulli and Binomial Distribution - Lab

## Introduction
In this lab, you'll practice what you've learned about the Bernoulli and Binomial distributions.

## Objectives
You will be able to:

* Apply the formulas for the Binomial and Bernoulli distributions to calculate the probability of a specific event
* Use `numpy` to randomly generate Binomial and Bernoulli trials
* Use `matplotlib` to show the output of generated Binomial and Bernoulli trials

## Apply the formulas for the Binomial and Bernoulli distributions

When playing a game of bowling, what is the probability of throwing exactly 3 strikes in a game with 10 rounds? Assume that the probability of throwing a strike is 25% for each round. Use the formula for the Binomial distribution to get to the answer. You've created this before, so we provide you with the function for factorials again:

In [None]:
def factorial(n):
    prod = 1
    while n >= 1:
        prod = prod * n
        n = n - 1
    return prod

In [None]:
# Binomial formula with n=10, k=3, p=0.25; answer = 0.2502822
p_3_strikes = factorial(10) / (factorial(3) * factorial(7)) * 0.25**3 * 0.75**7
p_3_strikes

Now, create a function for the Binomial distribution with three arguments $n$, $p$ and $k$ just like in the formula:

$$ \large P(Y=k)= \binom{n}{k} p^k(1-p)^{(n-k)}$$ 
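
As a worked instance of this formula, plugging in the bowling numbers from the first exercise ($n=10$, $k=3$, $p=0.25$) gives the following (intermediate values are rounded):

$$ \large P(Y=3)= \binom{10}{3} (0.25)^3 (0.75)^{7} = 120 \times 0.015625 \times 0.1334839 \approx 0.2503$$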


In [None]:
def n_choose_k(n, k):
    return factorial(n) / (factorial(k) * factorial(n - k))

def binom_distr(n,p,k):
    return n_choose_k(n, k) * p**k * (1 - p)**(n - k)

Validate your previous result by applying your new function.

In [None]:
# Your code here
binom_distr(10, .25, 3)
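
As an independent cross-check, the same probability can be computed with `math.comb` from the standard library (Python 3.8+), which returns the binomial coefficient as an exact integer. This is a minimal sketch, not part of the lab solution. The name `binom_pmf` is made up here to avoid clashing with `binom_distr`.

```python
from math import comb

def binom_pmf(n, p, k):
    """P(Y = k) for Y ~ Binomial(n, p), using an exact integer coefficient."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# same bowling question: exactly 3 strikes in 10 rounds with p = 0.25
p_3_check = binom_pmf(10, 0.25, 3)
print(p_3_check)  # about 0.25028
```

If this value and the output of `binom_distr(10, .25, 3)` agree to several decimal places, the hand-written factorial version is very likely correct.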

Now write a `for` loop along with your function to compute the probability that you have five strikes or more in one game. You'll want to use `numpy` here!

In [None]:
import numpy as np
# Your code here
p_five_or_more = 0
for k in range(5,11):
    p_five_or_more += binom_distr(10, .25, k)
p_five_or_more
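
The same tail probability can also be reached through the complement rule, $P(Y \geq 5) = 1 - P(Y \leq 4)$, which is a handy sanity check on the loop. The sketch below is self-contained and uses a hypothetical helper `binom_pmf` built on `math.comb` rather than the lab's `binom_distr`.

```python
from math import comb

def binom_pmf(n, p, k):
    """P(Y = k) for Y ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.25

# direct sum over k = 5, ..., 10
p_direct = sum(binom_pmf(n, p, k) for k in range(5, n + 1))

# complement: one minus the probability of 0 to 4 strikes
p_complement = 1 - sum(binom_pmf(n, p, k) for k in range(0, 5))

print(p_direct, p_complement)  # both about 0.078
```

Both routes should agree up to floating-point rounding. Here the direct sum has six terms and the complement has five, so neither is much cheaper, but for large $n$ the shorter side can save a lot of work.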


## Use a simulation to get the probabilities for all the potential outcomes

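Simulate 5000 games of 10 rounds each, again with a 25% strike probability per round, and count how often each number of strikes occurs.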

In [None]:
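# leave the random seed here for reproducibility of results
np.random.seed(123)

n, p = 10, .25
# each entry of s is the number of strikes in one simulated game
s = np.random.binomial(n, p, 5000)
# distinct strike counts that occurred, and how often each one occurred
values, counts = np.unique(s, return_counts=True)
print(values)
print(counts)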

In [None]:
# the results should look like this:
# [0 1 2 3 4 5 6 7 8]
# [ 310  941 1368 1286  707  297   78   11    2]
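
To judge whether simulated counts like these are plausible, they can be placed next to the counts the Binomial formula predicts, namely $5000 \cdot P(Y=k)$. The sketch below assumes the same seed and parameters as the cell above. The names `observed` and `expected` are introduced here for illustration only.

```python
from math import comb
import numpy as np

n, p, trials = 10, 0.25, 5000

np.random.seed(123)  # same seed as in the lab cell
s = np.random.binomial(n, p, trials)

# minlength pads the result so outcomes that never occurred show up as 0
observed = np.bincount(s, minlength=n + 1)

# theoretical count for each k: number of games times P(Y = k)
expected = np.array(
    [trials * comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
)

for k in range(n + 1):
    print(k, observed[k], round(expected[k], 1))
```

The observed and expected columns should be close but not identical, since the simulation is subject to sampling noise.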

## Visualize these results

Create the PMF using these empirical results (that is, the proportions based on the values we obtained running the experiment 5000 times).

In [None]:
import matplotlib.pyplot as plt
%matplotlib inline

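def pmf(s):
    # proportion of games per strike count: occurrences / number of simulated games
    return np.bincount(s) / len(s)

probs = pmf(s)
# index k of the bincount result corresponds to k strikes
plt.bar(x=np.arange(len(probs)), height=probs)
plt.title('PMF of strike count in 10 frames (p=0.25)')
plt.xlabel('strike count')
plt.ylabel('proportion of games')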

You should see that, with a 25% strike rate, a near-perfect or perfect game (9 or 10 strikes) didn't occur even once in 5000 simulated games! If you change the random seed, however, you'll see that such games do show up occasionally, although they remain rare.
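
How rare such games are can be estimated from the Binomial formula itself. The sketch below assumes the 5000 games are independent and computes the chance that at least one of them reaches 9 or more strikes, and the chance that at least one is a perfect 10. The helper `binom_pmf` and the variable names are illustrative, not part of the lab.

```python
from math import comb

def binom_pmf(n, p, k):
    """P(Y = k) for Y ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p, trials = 10, 0.25, 5000

p_nine_plus = binom_pmf(n, p, 9) + binom_pmf(n, p, 10)  # one game, 9 or 10 strikes
p_perfect = binom_pmf(n, p, 10)                         # one game, all 10 strikes

# P(at least one success in `trials` games) = 1 - P(no success in any game)
p_any_nine_plus = 1 - (1 - p_nine_plus) ** trials
p_any_perfect = 1 - (1 - p_perfect) ** trials

print(p_any_nine_plus)  # roughly 0.14
print(p_any_perfect)    # roughly 0.005
```

Under these assumptions, roughly one seed in seven produces at least one game with 9 or more strikes, while a perfect game appears in well under 1% of runs.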

Next, let's create the CDF based on these results. You can use `np.cumsum` to obtain cumulative probabilities.

In [None]:
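# Your code here
def cdf(s):
    # running total of the counts, divided by the number of simulated games
    return np.cumsum(np.bincount(s)) / len(s)

cum_probs = cdf(s)
# index k of the result is the proportion of games with at most k strikes
plt.bar(x=np.arange(len(cum_probs)), height=cum_probs)
plt.title('CDF of strike count in 10 frames (p=0.25)')
plt.xlabel('strike count')
plt.ylabel('cumulative proportion of games')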
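
Two quick checks can confirm that an empirical CDF is well formed: it should never decrease, and its last value should be 1. The sketch below, which assumes the same seed and parameters as earlier in the lab, also reads the simulated probability of five or more strikes from the CDF as $1 - P(Y \leq 4)$. The names `emp_cdf` and `p_five_or_more_sim` are illustrative.

```python
import numpy as np

n, p, trials = 10, 0.25, 5000

np.random.seed(123)  # same seed as in the lab cell
s = np.random.binomial(n, p, trials)

# cumulative counts divided by the number of games; padded to cover k = 0..n
emp_cdf = np.cumsum(np.bincount(s, minlength=n + 1)) / len(s)

print(np.all(np.diff(emp_cdf) >= 0))  # True when the CDF never decreases
print(emp_cdf[-1])                    # should be 1.0

# simulated P(Y >= 5) = 1 - P(Y <= 4)
p_five_or_more_sim = 1 - emp_cdf[4]
print(p_five_or_more_sim)
```

The simulated tail probability should land near the exact value of about 0.078 computed with the Binomial formula.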

## Summary

Congratulations! In this lab, you practiced your newly gained knowledge of the Bernoulli and Binomial distributions.