## Binomial distribution
As we touched on in the slides, the binomial distribution models the number of successes in a fixed number of independent trials, each with the same probability of success.

For this exercise, consider a game in which you try to shoot a ball into a basket. You get 10 shots, and you know you have an 80% chance of making any given shot. To keep things simple, assume each shot is an independent event.

In [1]:
from scipy.stats import binom

In [2]:
shots = 10
probability_success = 0.8

In [9]:
binomial_data = binom.rvs(n=shots, p=probability_success, size=1000)
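
As a quick sanity check on the simulation, the sample mean and variance of the draws should land close to the theoretical values n*p = 8 and n*p*(1-p) = 1.6. The sketch below is only illustrative: the `random_state=42` seed and the `check_sample` name are assumptions added here for reproducibility and are not part of the exercise.

```python
from scipy.stats import binom

shots = 10
probability_success = 0.8

# random_state is an assumption added here so the draws are reproducible
check_sample = binom.rvs(n=shots, p=probability_success, size=1000, random_state=42)

# Theoretical moments of Binomial(n=10, p=0.8)
theoretical_mean = binom.mean(shots, probability_success)  # n * p
theoretical_var = binom.var(shots, probability_success)    # n * p * (1 - p)

print("sample mean:", check_sample.mean(), "theoretical mean:", theoretical_mean)
print("sample variance:", check_sample.var(), "theoretical variance:", theoretical_var)
```

With 1000 draws the sample values will not match exactly, but they should be near the theoretical ones.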

To calculate a cumulative probability for this distribution, use binom.cdf.

Assign the probability of making 8 or fewer shots to prob_8_or_less_success.

In [10]:
# Assign the probability of 8 or fewer successes
# CDF -> cumulative probability up to and including k
# k -> number of successes
prob_8_or_less_success = binom.cdf(k=8, n=shots, p=probability_success)
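
To see what the CDF is doing, it can help to rebuild the same number from individual point probabilities. The sketch below (variable names are illustrative, not from the exercise) sums the PMF for 0 through 8 successes and also takes the complement of 9 and 10 successes; both should agree with binom.cdf, at about 0.624.

```python
from scipy.stats import binom

shots = 10
probability_success = 0.8

# P(X <= 8) straight from the CDF
cdf_value = binom.cdf(k=8, n=shots, p=probability_success)

# Same probability as a sum of point probabilities P(X = 0) + ... + P(X = 8)
pmf_sum = sum(binom.pmf(k=k, n=shots, p=probability_success) for k in range(9))

# Same probability via the complement: 1 - P(X = 9) - P(X = 10)
complement = 1 - sum(binom.pmf(k=k, n=shots, p=probability_success) for k in (9, 10))

print(cdf_value, pmf_sum, complement)
```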

In [11]:
# Assign the probability of making all 10 shots to prob2
# PMF -> probability of exactly k successes (the binomial is discrete, so it has a PMF rather than a PDF)
prob2 = binom.pmf(k=10, n=shots, p=probability_success)
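
For reference, the point probability can also be computed by hand from the binomial formula C(n, k) * p^k * (1 - p)^(n - k). The helper `binomial_pmf` below is a hypothetical name introduced only for this check; for all 10 shots the formula collapses to 0.8 ** 10.

```python
from math import comb

from scipy.stats import binom


def binomial_pmf(k, n, p):
    """Hypothetical helper: probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


shots = 10
probability_success = 0.8

manual = binomial_pmf(10, shots, probability_success)
from_scipy = binom.pmf(k=10, n=shots, p=probability_success)

# Both should match 0.8 ** 10, since all 10 shots must succeed
print(manual, from_scipy, probability_success**shots)
```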

## Normal distribution
On to the most recognizable and useful distribution of the bunch: the normal, or Gaussian, distribution. In the slides, we briefly touched on the bell-curve shape and on how the normal distribution, together with the central limit theorem, enables us to perform hypothesis tests.

Similar to the previous exercises, here you'll start by simulating some data and examining the distribution, then dive a little deeper and examine the probability of certain observations taking place.

Generate the data for the distribution by using the rvs() function with size set to 1000; assign it to the data variable.

In [18]:
from scipy.stats import norm

In [19]:
data = norm.rvs(size=1000)

Given a standard normal distribution, what is the probability of an observation greater than 2?

Standard normal: mean = 0, variance = 1

In [20]:
# Compute the true probability of an observation greater than 2: P(X > 2) = 1 - P(X <= 2)
true_prob = 1 - norm.cdf(2)

In [21]:
true_prob

0.02275013194817921
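
The same upper-tail probability can be obtained in a couple of equivalent ways. As a small sketch (variable names are illustrative), scipy's survival function norm.sf returns P(X > x) directly, and by the symmetry of the standard normal, P(X > 2) equals P(X < -2).

```python
from scipy.stats import norm

# Upper tail via the complement of the CDF, as in the cell above
tail_from_cdf = 1 - norm.cdf(2)

# Upper tail via the survival function, sf(x) = 1 - cdf(x)
tail_from_sf = norm.sf(2)

# Upper tail via symmetry of the standard normal: P(X > 2) = P(X < -2)
tail_from_symmetry = norm.cdf(-2)

print(tail_from_cdf, tail_from_sf, tail_from_symmetry)
```

For thresholds far out in the tail, norm.sf is usually the safer choice, because subtracting a CDF value very close to 1 from 1 loses precision.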

Looking at our sample, what is the estimated probability of an observation greater than 2, that is, the proportion of sampled values above 2?

In [23]:
# Compute and print the sample proportion of observations greater than 2
sample_prob = (data > 2).mean()
print(sample_prob)

0.023
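
How close should the sample proportion be to the true probability? One rough guide is the standard error of a proportion, sqrt(p * (1 - p) / n), which for p of about 0.0228 and n = 1000 is roughly 0.0047. The sketch below redraws a sample to illustrate this; the `random_state=0` seed and the `seeded_*` names are assumptions added here for reproducibility, so the exact sample proportion will differ from the 0.023 shown above.

```python
import math

from scipy.stats import norm

n_obs = 1000
true_prob = norm.sf(2)  # P(X > 2) for a standard normal

# Standard error of a sample proportion estimated from n_obs independent draws
std_error = math.sqrt(true_prob * (1 - true_prob) / n_obs)

# random_state is an assumption added here so the draws are reproducible
seeded_data = norm.rvs(size=n_obs, random_state=0)
seeded_sample_prob = (seeded_data > 2).mean()

print("true probability:", true_prob)
print("sample proportion:", seeded_sample_prob)
print("standard error:", std_error)
```

A sample proportion within a standard error or two of the true value, like the 0.023 above, is what we would expect from 1000 draws.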
