# Core Statistics Using Python
### Hana Choi, Simon Business School, University of Rochester


# Some Useful Distributions

## Topics covered

- Evaluating Standard Normal CDF, PDF, and the inverse of Standard Normal CDF
- Evaluating a Normal distribution with an arbitrary mean and standard deviation
- Evaluating other distributions besides the Normal (e.g., Uniform)

## Here are the packages/modules we need for this notebook

In [None]:
# Importing "stats" module from the scipy package
# scipy.stats module provides a large number of statistical functions, probability distributions, and statistical tests.
from scipy import stats 

# Importing the norm and uniform distribution objects from the scipy.stats module
# Each object provides methods (cdf, sf, ppf, pdf, ...) for working with the normal or uniform distribution.
from scipy.stats import norm
from scipy.stats import uniform

# Let's use Python for computing probabilities
## Evaluating the Standard Normal CDF

In [None]:
# How much of the standard normal lies to the left of zero?
result=norm.cdf(0)
print(result)

In [None]:
# How about to the left of -1?
result=norm.cdf(-1)
print(result)

In [None]:
# You can also skip assigning and printing the result if you want
norm.cdf(-1)

In [None]:
# However, if you do this, the code cell displays only the value of the last expression
norm.cdf(0)
norm.cdf(-1)

In [None]:
# How about to the right of (i.e. greater than) -1?
result=1-norm.cdf(-1)
print(result)

In [None]:
# There is another way to get the probability above the cutoff value as well.
# To do so, you can compute the survival function (sf) instead
# You are now asking for the "upper tail" probability
# You will, of course, get the same answer!

result=norm.sf(-1)
print(result)
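
The two approaches agree here, but they can differ far out in the tail. The sketch below assumes the same `norm` object imported above, and the cutoff of 10 is chosen only for illustration. Subtracting from 1 loses precision once the CDF is indistinguishable from 1 in floating point, while `sf` computes the upper tail directly.

```python
from scipy.stats import norm

# Near the center of the distribution, both approaches agree closely
upper_a = 1 - norm.cdf(-1)
upper_b = norm.sf(-1)
print(upper_a, upper_b)

# Far in the upper tail, 1 - cdf rounds to zero but sf stays positive
far_a = 1 - norm.cdf(10)
far_b = norm.sf(10)
print(far_a, far_b)
```

For the cutoffs used in this notebook either form is fine. `sf` is the safer habit when the probability of interest is very small.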

## Evaluating the Inverse of Standard Normal CDF

- You can also compute the value such that a given percent of the standard normal is less than it
- Here, you are evaluating the inverse of the Normal CDF (which SciPy calls the ppf, short for "percent point function")

In [None]:
# For example, what is the number such that 15% of the standard normal is less than that number?
result=norm.ppf(0.15)
print(result)

In [None]:
# What is the number such that 50% of the standard normal is less than that number?
result=norm.ppf(0.5)
print(result)
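
One way to convince yourself that `ppf` really is the inverse of `cdf` is to feed its output back into `cdf`. This is a minimal sketch using the 15% example from above. The variable names are only for illustration.

```python
from scipy.stats import norm

p = 0.15
x = norm.ppf(p)        # value with 15% of the standard normal below it
p_back = norm.cdf(x)   # plugging x back into the CDF should recover p
print(x, p_back)
```

Because less than half of the distribution lies below it, the cutoff `x` is negative.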

## Evaluating the Standard Normal PDF

In [None]:
# What is f(0) of the Standard Normal distribution?
result=norm.pdf(0)
print(result)

In [None]:
# The normal distribution is symmetric, so f(x) = f(-x)
print( norm.pdf(1) )
print( norm.pdf(-1) )
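
For reference, the standard normal density is $f(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$. The sketch below writes that formula out by hand and compares it with `norm.pdf`. The helper `std_normal_pdf` is hypothetical, not part of SciPy, and is included only to make the comparison.

```python
import math
from scipy.stats import norm

def std_normal_pdf(x):
    # Hypothetical helper: standard normal density written out by hand
    return math.exp(-x**2 / 2) / math.sqrt(2 * math.pi)

# The hand-written formula and norm.pdf should agree
print(std_normal_pdf(0), norm.pdf(0))
print(std_normal_pdf(1), norm.pdf(1))
```

Since $x$ enters only through $x^2$, the formula also shows why $f(x) = f(-x)$.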

## How about evaluating the Normal CDF with an arbitrary mean and standard deviation?
- To do this, you simply need to provide the mean (location) and standard deviation (scale)

In [None]:
# How much of the N(10,5^2) distribution lies below 5?
result=norm.cdf(5,loc=10,scale=5)
print(result)

In [None]:
# How much of the N(10,5^2) distribution lies above 5?
result = 1-norm.cdf(5,loc=10,scale=5)
print(result)

In [None]:
# You can also evaluate it using SF (again, you are now asking for the "upper tail" probability)
result=norm.sf(5,loc=10,scale=5)
print(result)

In [None]:
# You can also calculate the inverse CDF
# For example, what is the number such that 15% of the N(10,5^2) distribution is less than that number?
result=norm.ppf(0.15,loc=10,scale=5)
print(result)
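
Supplying `loc` and `scale` is equivalent to standardizing by hand and using the standard normal. The sketch below checks this for the $N(10, 5^2)$ example. The variable names are only for illustration.

```python
from scipy.stats import norm

mu, sigma = 10, 5      # mean and standard deviation of N(10, 5^2)
x = 5

direct = norm.cdf(x, loc=mu, scale=sigma)

z = (x - mu) / sigma   # z-score: number of standard deviations from the mean
via_z = norm.cdf(z)    # same probability from the standard normal

print(direct, via_z)
```

Note that `scale` is the standard deviation (5 here), not the variance (25).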

# Lights Example
- Recall the details of the example: LED bulb lifespans are distributed $N(4.2, 0.6^2)$
- This notation means distributed Normal with Mean = 4.2 and Standard Deviation = 0.6
## We want to compute a few things related to how long these bulbs will last
- $Pr(Y<4)$
- $Pr(Y>3)$
- $Pr(3<Y<5)$

In [None]:
# Pr(Y<4)
print(norm.cdf(4,loc=4.2,scale=0.6))

In [None]:
# Pr(Y>3)
print(1-norm.cdf(3,loc=4.2,scale=0.6))

# Note that you can also get the same answer this way
print(norm.sf(3,loc=4.2,scale=0.6))

In [None]:
# Pr(3<Y<5)
print(norm.cdf(5,loc=4.2,scale=0.6)-norm.cdf(3,loc=4.2,scale=0.6))
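
Repeating `loc=4.2, scale=0.6` in every call invites typos. One option, sketched here, is to create a "frozen" distribution once by calling `norm(...)` with the parameters and then reuse it. The name `bulb` is only for illustration.

```python
from scipy.stats import norm

# Fix the parameters once: bulb represents N(4.2, 0.6^2)
bulb = norm(loc=4.2, scale=0.6)

p_below_4 = bulb.cdf(4)                 # Pr(Y<4)
p_above_3 = bulb.sf(3)                  # Pr(Y>3)
p_between = bulb.cdf(5) - bulb.cdf(3)   # Pr(3<Y<5)

print(p_below_4, p_above_3, p_between)
```

The frozen object has the same methods (`cdf`, `sf`, `ppf`, `pdf`) and should give the same numbers as the calls above.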

## We can also compute the inverse CDF
### Compute the bulb life span such that 95% of bulbs last less than this time
- Here, we want to find the value of $y$ such that $Pr(Y<y)=0.95$ 

In [None]:
print(norm.ppf(0.95,loc=4.2,scale=0.6))

### How about the bulb life span such that 95% of bulbs last longer than this time?
- Find the value of $y$ such that $Pr(Y>y)=0.95$

In [None]:
print(norm.ppf(0.05,loc=4.2,scale=0.6))
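
The call above works because $Pr(Y>y)=0.95$ is the same statement as $Pr(Y<y)=0.05$. SciPy also provides `isf`, the inverse of the survival function, which takes the upper-tail probability directly. The sketch below shows that the two routes agree.

```python
from scipy.stats import norm

# Route 1: convert the upper-tail probability to a lower-tail one, then use ppf
y_ppf = norm.ppf(0.05, loc=4.2, scale=0.6)

# Route 2: pass the upper-tail probability straight to the inverse survival function
y_isf = norm.isf(0.95, loc=4.2, scale=0.6)

print(y_ppf, y_isf)

# Check: the share of bulbs lasting longer than this value should be 0.95
print(norm.sf(y_isf, loc=4.2, scale=0.6))
```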

# Evaluating other distributions besides the Normal 
- For example, we can evaluate the uniform distribution from A to B, denoted Unif[A,B]
- Here A is the lower limit and B is the upper limit.
- SciPy calls the lower limit A the "location" (`loc`) and the width B-A the "scale" (`scale`)
- Let's look at the gas tank example where the amount of gas G in my Hyundai Sonata tank is distributed Unif[0,12], so $G \sim \text{Unif}[0,12] $
- The symbol $\sim$ means "distributed as"
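
The location/scale convention is easy to get wrong when A is not 0, because `scale` is the width B-A rather than the upper limit B. The sketch below uses a hypothetical Unif[2,10], which is not part of the gas example, to show the difference.

```python
from scipy.stats import uniform

a, b = 2, 10                         # hypothetical limits, for illustration only

dist = uniform(loc=a, scale=b - a)   # Unif[2, 10]: scale is the width B-A
wrong = uniform(loc=a, scale=b)      # this is Unif[2, 12], not Unif[2, 10]

print(dist.cdf(6))    # 6 is halfway through [2, 10]
print(wrong.cdf(6))   # a different answer, because the interval is wider
```

In the gas example below A is 0, so B-A and B happen to coincide.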

## Gas Example
- Suppose we want to compute
- $Pr(G<6)$: the probability that I have $\frac{1}{2}$ a tank or less
- $Pr(G>9)$: the probability that I have $\frac{3}{4}$ a tank or more
- $Pr(3<G<9)$: the probability that I have between $\frac{1}{4}$ and $\frac{3}{4}$ a tank 

In [None]:
# Note, we have set A equal to 0 and B equal to 12 (so it's Unif[0,12]) with location 0 and scale 12
# Pr(G<6): the probability that I have 1/2 a tank or less
print(uniform.cdf(6, loc=0, scale=12))

In [None]:
# Pr(G>9): the probability that I have 3/4 a tank or more
print(1-uniform.cdf(9, loc=0, scale=12))

# The survival function option works here too
print(uniform.sf(9, loc=0, scale=12))

In [None]:
# Pr(3<G<9): the probability that I have between 1/4 and 3/4 a tank
print(uniform.cdf(9, loc=0, scale=12)-uniform.cdf(3, loc=0, scale=12))
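
For the uniform distribution these probabilities can also be checked by hand, since the CDF of Unif[A,B] is $(x-A)/(B-A)$ for $A \le x \le B$. The sketch below compares that closed form with SciPy. The helper `unif_cdf` is hypothetical and included only for the comparison.

```python
from scipy.stats import uniform

def unif_cdf(x, a, b):
    # Hypothetical helper: closed-form CDF of Unif[a, b]
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

# Pr(3<G<9) for G ~ Unif[0,12], computed by hand and with SciPy
by_hand = unif_cdf(9, 0, 12) - unif_cdf(3, 0, 12)
with_scipy = uniform.cdf(9, loc=0, scale=12) - uniform.cdf(3, loc=0, scale=12)

print(by_hand, with_scipy)
```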