# More Distributions and the Central Limit Theorem

## The normal distribution

- Normal distribution: symmetrical, defined by its mean and standard deviation (std)
- Standard normal distribution: mean = 0, std = 1

<img src='https://www.scribbr.de/wp-content/uploads/2023/01/Standard-normal-distribution.webp' width='400'><img src='https://www.scribbr.com/wp-content/uploads/2023/02/standard-normal-distribution-example.webp' width='400'>

In [49]:
from scipy.stats import norm

### Cumulative Distribution Function (CDF)

In [50]:
# What fraction of stock prices are lower than $100? (mean = $110, std = $7)
norm.cdf(100, 110, 7)

0.07656372550983476

In [51]:
# What percent of stock prices are greater than $100 
1 - norm.cdf(100, 110, 7)

0.9234362744901652

In [52]:
# What percent of stock prices are between $105 and $110 (the mean)?
norm.cdf(110, 110, 7) - norm.cdf(105, 110, 7)

0.2624747379730235

### Probability Point Function (PPF)

The `Probability Point Function` or `PPF` is the inverse of the CDF.

Specifically, given a probability y, the PPF returns the point below which a fraction y of the distribution lies. It can be thought of as the percentile function, since it gives the value at a given percentile of the data.

In [53]:
# What stock price are 90% of stocks lower than?
norm.ppf(0.9, 110, 7)

118.97086095881221

In [54]:
# What stock price are 90% of stocks greater than?
# Same as the price that 10% of stocks are lower than, so use the 10th percentile
norm.ppf((1-0.9), 110, 7)

101.02913904118779
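
As a quick check on the "inverse of the CDF" idea, the sketch below converts a price to a percentile and back again, reusing the mean of 110 and std of 7 from the cells above. It also uses `norm.isf` (the inverse survival function), which is an alternative way to ask "what value are 90% of stocks greater than?". The variable names are illustrative only.

```python
from scipy.stats import norm

mean, std = 110, 7  # same stock-price parameters as the cells above

# Round trip: price -> percentile (CDF) -> price (PPF)
price = 100
percentile = norm.cdf(price, mean, std)
recovered_price = norm.ppf(percentile, mean, std)

# Value that 90% of prices exceed, computed two equivalent ways
via_ppf = norm.ppf(1 - 0.9, mean, std)
via_isf = norm.isf(0.9, mean, std)

# Symmetry: the 10th and 90th percentiles are equally far from the mean
low = norm.ppf(0.1, mean, std)
high = norm.ppf(0.9, mean, std)

print(recovered_price)
print(via_ppf, via_isf)
print(mean - low, high - mean)
```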

### Generating random numbers

In [55]:
# Generate 10 random exam grades
# norm.rvs(mean, std, size)
norm.rvs(76, 8, size=10)

array([82.70745075, 79.72023046, 78.17387624, 81.5279456 , 62.72500409,
       76.15815126, 87.58794319, 77.31134805, 70.14491495, 84.72219228])

## The central limit theorem

- The sampling distribution of a statistic such as the sample mean becomes closer to the normal distribution as the sample size increases, **no matter the original distribution being sampled from** (as long as that distribution has a finite variance).
- Samples should be random and independent
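
A small simulation can make this concrete. The sketch below is only an illustration: it assumes a right-skewed exponential population with mean 4 and an arbitrary seed, neither of which comes from the notes. It compares the skewness of sample means for small and large samples; a skewness closer to 0 indicates a more symmetric, more normal-looking sampling distribution.

```python
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(seed=0)  # arbitrary seed, for reproducibility


def sample_means(sample_size, n_samples=5000):
    """Draw n_samples samples of the given size and return each sample's mean."""
    # Population is exponential with mean 4: clearly not normal (right-skewed)
    draws = rng.exponential(scale=4, size=(n_samples, sample_size))
    return draws.mean(axis=1)


means_small = sample_means(2)    # tiny samples: still skewed
means_large = sample_means(100)  # larger samples: close to symmetric

print(skew(means_small), skew(means_large))
print(means_large.mean())  # close to the population mean of 4
```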

## The Poisson Distribution

- Probability of some # of events occurring over a fixed period of time
- $\lambda$
    - average number of events per time interval
    - expected value = $\lambda$
    - The distribution's peak (mode) is at or just below $\lambda$

<img src ="https://www.scribbr.nl/wp-content/uploads/2022/08/Poisson-distribution-graph.webp" width= '400'>

In [56]:
from scipy.stats import poisson

### Probability Mass Function (PMF)
The probability mass function (PMF) of a discrete random variable X gives, for each value in the sample space, the probability that X is equal to that value.

In [57]:
# Probability of a single value
# poisson.pmf(# of events, lambda)
poisson.pmf(10,15)

0.04861075082960534
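
For reference, the Poisson PMF has the closed form $P(X = k) = \lambda^k e^{-\lambda} / k!$. The sketch below evaluates that formula by hand and compares it with `poisson.pmf` for the same inputs as the cell above (the names `lam`, `manual` and `library` are illustrative).

```python
from math import exp, factorial

from scipy.stats import poisson

lam, k = 15, 10  # same lambda and event count as the cell above

# Closed-form Poisson PMF: lambda^k * e^(-lambda) / k!
manual = lam**k * exp(-lam) / factorial(k)
library = poisson.pmf(k, lam)

print(manual, library)
```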

### Cumulative Distribution Function (CDF)

In [58]:
# If the average number of phone calls an office received per day is 15, what is P(# phone calls in a day <= 10)
# poisson.cdf(# of events, lambda)
poisson.cdf(10, 15)

0.11846441152901499

In [59]:
# P(# phone calls in a day > 10)
1 - poisson.cdf(10, 15)

0.881535588470985
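
Because the Poisson distribution is discrete, its CDF at 10 is just the sum of the PMF over 0 through 10. The sketch below checks that, and also shows `poisson.sf` (the survival function), which returns the upper tail P(X > 10) directly instead of computing `1 - cdf`.

```python
from scipy.stats import poisson

lam = 15  # average number of calls per day, as above

cdf_direct = poisson.cdf(10, lam)
# Sum P(X = 0) + P(X = 1) + ... + P(X = 10)
cdf_summed = sum(poisson.pmf(k, lam) for k in range(11))

# Upper tail P(X > 10), equivalent to 1 - cdf
upper_tail = poisson.sf(10, lam)

print(cdf_direct, cdf_summed)
print(upper_tail)
```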

### Sampling from a Poisson distribution

In [60]:
# poisson.rvs(lambda, size)
poisson.rvs(10, size=15)

array([12, 11,  3, 10, 10,  5, 10, 12,  9, 13,  9,  7,  9,  9, 13])

### The central limit theorem

- The CLT still applies: the sampling distribution of the sample mean of Poisson data is approximately normal and centered on $\lambda$
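
A rough simulation of this, assuming $\lambda$ = 10, 2000 samples of 50 draws each, and an arbitrary seed (all illustrative choices): the sample means should cluster around $\lambda$, with a spread close to $\sqrt{\lambda / n}$.

```python
import numpy as np
from scipy.stats import poisson

lam = 10          # assumed rate for this illustration
sample_size = 50  # draws per sample

# 2000 independent samples, one per row
samples = poisson.rvs(lam, size=(2000, sample_size), random_state=1)
sample_means = samples.mean(axis=1)

print(sample_means.mean())                                   # close to lambda
print(sample_means.std(ddof=1), np.sqrt(lam / sample_size))  # similar values
```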

## Exponential distribution

- Models the time between Poisson events
- Uses the same $\lambda$ as the Poisson distribution (the rate); SciPy's `scale` parameter is 1 / $\lambda$
- Continuous (time)

<img src="https://upload.wikimedia.org/wikipedia/commons/8/86/WrappedExponentialPDF.png" width="400">


On average, one shoe is manufactured every 4 minutes:
- $\lambda$ = 1/4 = 0.25 shoes created each minute
- scale = 1 / $\lambda$ = 4 minutes between shoes, on average

In [61]:
from scipy.stats import expon

In [62]:
# P(time < 1 min)
# expon.cdf(x, scale=1/lambda)
expon.cdf(1, scale=4)

0.22119921692859515

In [63]:
# P(time > 5 min) 
1 - expon.cdf(5, scale=4)

0.28650479686019015

In [64]:
# P(1 min < time < 5 min) 
expon.cdf(5, scale=4) - expon.cdf(1, scale=4)

0.49229598621121473
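
The exponential CDF has the closed form $F(x) = 1 - e^{-\lambda x}$, which makes the rate/scale relationship easy to verify. The sketch below reuses the shoe example ($\lambda$ = 0.25 per minute, scale = 4 minutes); the variable names are illustrative.

```python
from math import exp

from scipy.stats import expon

rate = 0.25        # lambda: shoes per minute
scale = 1 / rate   # SciPy's scale: average minutes between shoes

# Closed-form CDF: 1 - e^(-lambda * x)
manual = 1 - exp(-rate * 1)
library = expon.cdf(1, scale=scale)

# The expected waiting time equals the scale
mean_wait = expon.mean(scale=scale)

print(manual, library)
print(mean_wait)
```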

## (Student's) t-distribution

The t-distribution has a degrees of freedom (df) parameter, which affects the thickness of the tails:
- Lower df = thicker tails and higher standard deviation, so observations are more likely to fall further from the mean.
- Higher df = closer to the normal distribution

<img src='https://www.scribbr.co.uk/wp-content/uploads/2020/08/t_distribution_comparisons.png' width='400'>
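
To put numbers on the tail thickness, the sketch below compares the probability of falling more than 2 units above the center for several df values against the standard normal. The threshold of 2 and the df values are arbitrary choices for illustration.

```python
from scipy.stats import norm, t

threshold = 2.0  # arbitrary cutoff for the upper tail

# P(T > threshold) for increasing degrees of freedom
tail_by_df = {df: t.sf(threshold, df) for df in (1, 5, 30, 1000)}
normal_tail = norm.sf(threshold)

for df, tail in tail_by_df.items():
    print(f"df={df}: {tail:.4f}")
print(f"normal: {normal_tail:.4f}")
```

The tail probability shrinks as df grows and approaches the normal value.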

## Log-normal distribution

A continuous probability distribution of a random variable whose logarithm is normally distributed.

<img src="https://upload.wikimedia.org/wikipedia/commons/thumb/8/89/Log-normal-pdfs.png/300px-Log-normal-pdfs.png" width="400">
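
The definition can be checked directly: exponentiating normal draws gives log-normal draws, and taking their log recovers the normal parameters. The sketch below assumes $\mu$ = 0 and $\sigma$ = 0.5 (illustrative values) and uses SciPy's `lognorm` parameterization, where `s` is $\sigma$ and `scale` is $e^{\mu}$.

```python
import numpy as np
from scipy.stats import lognorm

mu, sigma = 0.0, 0.5  # assumed parameters of the underlying normal
rng = np.random.default_rng(seed=0)  # arbitrary seed

# exp(normal) is log-normal; values are always positive
normal_draws = rng.normal(mu, sigma, size=10_000)
lognormal_draws = np.exp(normal_draws)

# SciPy's parameterization: s = sigma, scale = exp(mu)
dist = lognorm(s=sigma, scale=np.exp(mu))

print(np.median(lognormal_draws), dist.median())  # both near exp(mu)
print(lognormal_draws.mean(), dist.mean())        # both near exp(mu + sigma**2 / 2)
print(np.log(lognormal_draws).mean())             # near mu
```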

Examples in finance:

- **Stock Prices**: In finance, stock prices are commonly assumed to follow a log-normal distribution. This assumption is based on the idea that the price changes of a stock are proportional and can be thought of as the product of many small percentage changes over time. The logarithm of stock prices often exhibits a more symmetrical, bell-shaped distribution.

- **Asset Returns**: Gross returns (1 + the simple return) on stocks or markets are often assumed to follow a log-normal distribution, which is equivalent to assuming that log-returns are normally distributed. Log-returns are typically more symmetrically distributed than simple percentage returns.

- **Option Pricing**: Log-normal distribution is used in option pricing models like the Black-Scholes model. This model assumes that the underlying asset price follows a log-normal distribution. This assumption helps in calculating the probability distribution of future prices, which is fundamental for options pricing.

- **Income Distribution**: In some cases, income or wealth distribution within a population is modeled using a log-normal distribution. For example, when studying income inequality or wealth distribution, economists might use the log-normal distribution to describe how income or wealth is distributed across different segments of the population.

- **Interest Rates**: Interest rates are less commonly assumed to be log-normal than prices, but some financial models treat the log of the interest rate as normally distributed, which is the same as assuming the rate itself is log-normal. This assumption keeps modeled rates positive and simplifies the analysis of interest rate movements.