## LLN & CLT

Sections:
- Law of Large Numbers
- Central Limit Theorem
- Framing for Applications

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import scipy

#### 1. Simulate Standard Uniform variables

In [None]:
# every value in [0, 1) is equally likely (constant density); because the distribution is continuous, any single exact value has probability 0
u = np.random.uniform(0, 1)
print(u)

We can simulate many draws at once, collect them in a vector, and visualise the result.

In [None]:
# with many draws the histogram is roughly flat, like a horizontal line
u = np.random.uniform(0, 1, 100_000)
print(u)
plt.hist(u, bins=20)
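
The flat histogram can also be checked numerically. For a standard uniform variable, the probability of landing in an interval equals the interval's length, and the theoretical mean and variance are 1/2 and 1/12. The sketch below is illustrative only: the seeded generator `rng` is an assumption added for reproducibility (the cells above use the unseeded `np.random` functions), and the bounds `a` and `b` are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded generator, assumed here for reproducibility
u = rng.uniform(0, 1, 100_000)

a, b = 0.2, 0.5  # arbitrary illustrative interval inside [0, 1]
frac_in_interval = np.mean((u > a) & (u < b))

print(frac_in_interval)  # should be close to b - a = 0.3
print(np.mean(u))        # should be close to 1/2
print(np.var(u))         # should be close to 1/12
```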

#### 2. Simulate Normal variables
- About 68% of draws from a standard normal fall within one standard deviation of the mean, i.e. between -1 and 1

In [None]:
# the normal distribution has no lower or upper limit (it is unbounded); instead we supply a mean and a standard deviation
n = np.random.normal(0, 1)
print(n)

In [None]:
# with many draws the histogram takes the shape of a bell curve
n = np.random.normal(0, 1, 10_000)
print(n)
plt.hist(n, bins=20);

In [None]:
sns.kdeplot(n);

In [None]:
# the sample mean should be close to the expected value, 0
print(np.mean(n))
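
The one-standard-deviation rule for the normal distribution can be checked the same way. Roughly 68% of standard normal draws should fall between -1 and 1, and roughly 95% between -2 and 2. This is a minimal sketch: the seeded `rng` and the variable name `z` are assumptions for illustration, not part of the cells above.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded generator, assumed here for reproducibility
z = rng.normal(0, 1, 100_000)   # standard normal draws (mean 0, standard deviation 1)

# share of draws within one and two standard deviations of the mean
within_one_sd = np.mean(np.abs(z) < 1)
within_two_sd = np.mean(np.abs(z) < 2)

print(within_one_sd)  # roughly 0.68
print(within_two_sd)  # roughly 0.95
```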

### Law of Large Numbers

*As the sample size increases indefinitely, the difference between the estimate given by a sample and the population parameter will be bounded by any arbitrary distance with certainty.*

In other words, as the sample size grows, the distance between the sample estimate and the true population average shrinks toward zero. Formally, for any $\epsilon > 0$, $P(|\bar{X}_n - \mu| < \epsilon) \to 1$ as $n \to \infty$.

*"for some arbitrary positive epsilon, as n approaches infinity, we are certain this difference is bounded by this number"*

In [None]:
epsilon = .01
l = []
for i in range(1_000):
    # generate 50,000 draws from a standard normal distribution
    n = np.random.normal(0, 1, 50_000)
    # compute the sample mean, xbarn
    xbarn = np.mean(n)
    # check whether |xbarn| is less than epsilon and record the indicator (1 or 0)
    if np.abs(xbarn) < epsilon:
        l.append(1)
    else:
        l.append(0)
# compute the raw probability
print(sum(l)/len(l))
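
To see the convergence rather than a single probability, the same experiment can be repeated for several sample sizes. The sketch below estimates the probability that the sample mean lies within `epsilon` of the true mean (0) and should show it rising toward 1 as the sample size grows. The helper `prob_within_epsilon`, the list `sample_sizes`, the repetition count `reps`, and the seeded `rng` are all hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded generator, assumed here for reproducibility
epsilon = 0.01
reps = 200  # hypothetical number of repetitions per sample size


def prob_within_epsilon(sample_size):
    """Estimate P(|xbarn - 0| < epsilon) for standard normal draws."""
    hits = 0
    for _ in range(reps):
        xbarn = rng.normal(0, 1, sample_size).mean()
        hits += int(abs(xbarn) < epsilon)
    return hits / reps


sample_sizes = [100, 1_000, 10_000, 50_000]  # illustrative sample sizes
probs = [prob_within_epsilon(m) for m in sample_sizes]

# the estimated probability should increase with the sample size
for m, p in zip(sample_sizes, probs):
    print(m, p)
```

A smaller `epsilon` needs a larger sample size before the probability gets close to 1, which is the "for any arbitrary positive epsilon" part of the statement.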

### Central Limit Theorem

*"As the sample size increases indefinitely, the probability of observing the standardized version of the estimate can be approximated with a standard normal cumulative distribution function."*

With sufficiently large *n*, approximately:    
    $\bar{X}_n \sim N\left(\mu, \frac{\sigma^2}{n}\right)$, that is, mean $\mu$ and standard deviation $\frac{\sigma}{\sqrt{n}}$

In [None]:
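# for large samples the distribution of sample means is approximately normal, whatever the source distribution (as long as its variance is finite)
# here we repeat 1,000 times: take 10,000 draws from a uniform distribution and record their mean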
ubars = []
for i in range(1_000):
    u = np.random.uniform(0, 1, 10_000)
    xbarn = np.mean(u)
    ubars.append(xbarn)

In [None]:
# each individual draw is uniform, so one might expect a flat shape, but the sample means follow an approximately normal (bell-shaped) distribution
sns.kdeplot(ubars);
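
The bell shape can be backed up with numbers. A standard uniform variable has mean 1/2 and standard deviation $\sqrt{1/12}$, so the sample means should be centred near 1/2 with a spread near $\sqrt{1/12}/\sqrt{n}$. After standardizing, about 95% of them should fall within ±1.96. This is a sketch under those assumptions; the names `sample_size`, `reps`, `se`, `z`, and `coverage`, and the seeded `rng`, are introduced here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded generator, assumed here for reproducibility
sample_size = 10_000
reps = 1_000

# one sample mean per repetition, each from sample_size uniform draws
ubars = np.array([rng.uniform(0, 1, sample_size).mean() for _ in range(reps)])

mu = 0.5                         # mean of a standard uniform variable
sigma = np.sqrt(1 / 12)          # standard deviation of a standard uniform variable
se = sigma / np.sqrt(sample_size)  # standard deviation of the sample mean under the CLT

# standardize the sample means and measure the share within +/- 1.96
z = (ubars - mu) / se
coverage = np.mean(np.abs(z) < 1.96)

print(np.mean(ubars))     # should be close to mu
print(np.std(ubars), se)  # the two values should be close to each other
print(coverage)           # should be close to 0.95
```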