# Central Limit Theorem

In this notebook, we will use ```numpy``` to draw samples from several different distributions and check whether the distribution of the sample mean actually approaches a normal distribution as the sample size gets larger.

### A. Uniform Distribution

When the distribution is uniform, there is an equal chance of getting any number between the specified minimum and maximum. ```np.random.random()``` draws from the interval $[0, 1)$.

1. Generate samples with ```np.random.random()```.
2. Compute sample means with ```np.mean()```.
3. Plot a histogram with ```matplotlib.pyplot.hist()```.


In [None]:
# Draw samples from np.random.random()

import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
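
A minimal sketch of the first step. The sample size of 5 and the fixed seed are assumptions made for illustration; the notebook does not specify either.

```python
import numpy as np

np.random.seed(0)  # assumption: fixed seed so the run is repeatable

# Draw one sample of 5 values, uniform on [0, 1)
sample = np.random.random(size=5)  # sample size 5 is an assumption

print(sample)
print("sample mean:", np.mean(sample))
```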



In [None]:
# Repeat 500 times, calculate the sample mean each time
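
One possible way to repeat the draw 500 times is to generate a 2-D array in which each row is one sample, then average along the rows. The sample size of 5, the seed, and the number of bins are assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(0)  # assumption: fixed seed for repeatability
n_repeats, sample_size = 500, 5  # sample size 5 is an assumption

# Each row is one sample of 5 uniform draws
samples = np.random.random(size=(n_repeats, sample_size))

# Averaging across axis 1 gives one sample mean per row
sample_means = np.mean(samples, axis=1)

plt.hist(sample_means, bins=20)
plt.xlabel("sample mean")
plt.ylabel("count")
plt.title("Uniform: 500 sample means, n = 5")
```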



In [None]:
# Repeat 500 times with sample size increased to 30
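
A sketch of the same experiment with a sample size of 30. As an extra check that is not part of the original steps, it compares the observed spread of the sample means with the standard error $\sigma / \sqrt{n}$, using the fact that a uniform distribution on $[0, 1)$ has variance $1/12$.

```python
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(0)  # assumption: fixed seed for repeatability
n_repeats, sample_size = 500, 30

samples = np.random.random(size=(n_repeats, sample_size))
sample_means = np.mean(samples, axis=1)

# Standard error of the mean: sigma / sqrt(n), with sigma^2 = 1/12
predicted_sd = np.sqrt(1 / 12 / sample_size)
observed_sd = np.std(sample_means, ddof=1)
print("observed SD of sample means:", observed_sd)
print("predicted SD of sample means:", predicted_sd)

plt.hist(sample_means, bins=20)
plt.xlabel("sample mean")
plt.ylabel("count")
plt.title("Uniform: 500 sample means, n = 30")
```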



If you want to overlay a smoothed density curve on the histogram, use ```seaborn.histplot()``` with ```kde=True``` (the older ```seaborn.distplot()``` is deprecated):

In [None]:
import seaborn as sns


### B. Bernoulli Distribution

A Bernoulli distribution gives 1 with probability $p$ and 0 with probability $1-p$.

1. Generate samples with ```scipy.stats.bernoulli.rvs()```.
2. Compute sample means with ```np.mean()```.
3. Plot a histogram with ```seaborn.displot()```.

We will use a Bernoulli distribution with $p=0.6$, meaning that we will get 1's 60% of the time.

In [None]:
# Bernoulli trials

from scipy.stats import bernoulli
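
A minimal sketch of drawing Bernoulli trials with $p = 0.6$. The number of trials (10) and the ```random_state``` value are assumptions chosen for illustration.

```python
import numpy as np
from scipy.stats import bernoulli

p = 0.6

# rvs draws random variates; random_state fixes the seed (an assumption)
trials = bernoulli.rvs(p, size=10, random_state=0)

print(trials)
print("fraction of 1's:", np.mean(trials))
```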



In [None]:
# Repeat 500 times, calculate the sample mean each time



In [None]:
# Repeat 500 times with sample size increased to 30



### C. Exponential Distribution

In an exponential distribution, the probability density decreases as the value gets larger, so small values are far more likely than large ones.

1. Generate samples with ```np.random.exponential()```.
2. Compute sample means with ```np.mean()```.
3. Plot a histogram with ```seaborn.displot()```.

In [None]:
# Exponential distribution
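
A minimal sketch of drawing from an exponential distribution. The scale of 1.0 (which is also the NumPy default), the sample size of 1000, and the seed are assumptions. Comparing the mean with the median is one quick way to see the right skew.

```python
import numpy as np

np.random.seed(0)  # assumption: fixed seed for repeatability

# scale is the mean of the distribution (1 / rate); 1.0 is an assumption
sample = np.random.exponential(scale=1.0, size=1000)

# In a right-skewed distribution the mean sits above the median
print("mean:", np.mean(sample))
print("median:", np.median(sample))
```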


In [None]:
# Repeat 500 times, calculate the sample mean each time



In [None]:
# Repeat 500 times with sample size increased to 30



### Variance Estimate 

How accurately does the sample variance approximate the population variance?
We will use the Bernoulli distribution as an example.

$$
Var[X] = E[X^2] - E[X]^2
$$

For Bernoulli trials with $p = 0.6$, $X^2 = X$ (since $X$ is either 0 or 1), so $E[X^2] = E[X] = 0.6$. This gives a population variance of $Var[X] = 0.6 - 0.6^2 = 0.24$.
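
The arithmetic can be checked directly from the definition of expectation. This is a small sketch; the variable names are illustrative only.

```python
p = 0.6

# E[X] and E[X^2] from the definition: sum of value * probability
e_x = 1 * p + 0 * (1 - p)
e_x2 = 1**2 * p + 0**2 * (1 - p)

# Var[X] = E[X^2] - E[X]^2, which simplifies to p * (1 - p)
var_from_moments = e_x2 - e_x**2
var_closed_form = p * (1 - p)

print(var_from_moments, var_closed_form)
```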

In [None]:
# Sample size = 5
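
A sketch for the small-sample case. It draws three separate samples of size 5 to show how much the estimate moves around. Using ```ddof=1``` (the unbiased sample variance) and the particular seeds are assumptions; the notebook does not say which variance estimator is intended.

```python
import numpy as np
from scipy.stats import bernoulli

p, sample_size = 0.6, 5

variances = []
for seed in range(3):  # three different seeds, chosen arbitrarily
    sample = bernoulli.rvs(p, size=sample_size, random_state=seed)
    # ddof=1 divides by n - 1 instead of n (an assumption about the estimator)
    sample_var = np.var(sample, ddof=1)
    variances.append(sample_var)
    print(sample, "sample variance:", sample_var)

print("population variance:", p * (1 - p))
```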


In [None]:
# Sample size = 30
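
For a sample size of 30, one option is to repeat the estimate many times and look at its average and spread. The 500 repetitions and ```ddof=1``` are assumptions made for this sketch.

```python
import numpy as np
from scipy.stats import bernoulli

p, sample_size, n_repeats = 0.6, 30, 500  # 500 repeats is an assumption

samples = bernoulli.rvs(p, size=(n_repeats, sample_size), random_state=0)

# One sample variance per row; ddof=1 gives the unbiased estimator
sample_vars = np.var(samples, axis=1, ddof=1)

print("mean of sample variances:", sample_vars.mean())
print("SD of sample variances:", sample_vars.std(ddof=1))
print("population variance:", p * (1 - p))
```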


In [None]:
# Sample size = 1000
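
A sketch for a single large sample of 1000 trials, reporting how far the estimate lands from the population variance. The seed and ```ddof=1``` are assumptions.

```python
import numpy as np
from scipy.stats import bernoulli

p, sample_size = 0.6, 1000
population_var = p * (1 - p)

sample = bernoulli.rvs(p, size=sample_size, random_state=0)
sample_var = np.var(sample, ddof=1)  # unbiased estimator (an assumption)

print("sample variance:", sample_var)
print("absolute error:", abs(sample_var - population_var))
```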


As we can see, when the sample size is small, the sample variance is very likely to differ noticeably from the population variance.

When the sample size is large, the sample variance gives a good estimate of the population variance.