<h1 id="tocheading">Table of Contents and Notebook Setup</h1>
<div id="toc"></div>

In [3]:
%%javascript
$.getScript('https://kmahelona.github.io/ipython_notebook_goodies/ipython_notebook_toc.js')


In [11]:
import numpy as np 
import pandas as pd

# The Central Limit Theorem

Suppose we have $N$ independent distributions of random variables of <b>any kind</b>. These variables can be normally distributed, uniformly distributed, or follow any other distribution we like. Let's label these distributions as follows:

$$\{x_1(k), x_2(k), \dots, x_N(k)\}$$

with $E[x_i(k)]=\mu_i$ and $E[(x_i(k)-\mu_i)^2]=\sigma_i^2$.

These distributions consist of measurements (they <b>are not</b> probability density functions). They can be thought of as vectors with finitely many measurements, indexed by $k$. Note that the more measurements we make, the more closely the data approach the actual probability distribution.
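The claim that more measurements bring the data closer to the underlying distribution can be checked numerically. The following is a minimal sketch, assuming a uniform variable on $[0, 1)$ (true mean $1/2$, true variance $1/12$); the seed and the sample sizes are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # arbitrary seed, only for reproducibility

# A uniform variable on [0, 1) has true mean 1/2 and true variance 1/12.
true_mean, true_var = 0.5, 1.0 / 12.0

estimates = {}
for num_measurements in (10, 1_000, 100_000):
    samples = rng.random(num_measurements)  # one vector of measurements
    estimates[num_measurements] = (samples.mean(), samples.var())
    print(num_measurements, estimates[num_measurements])
```

The printed sample mean and variance should drift toward `true_mean` and `true_var` as the number of measurements grows.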

## Creating a New Distribution

Thinking of the distributions as vectors, we can sum over them and create a new distribution:

$$x=\frac{1}{N} \sum_{i=1}^N x_i(k)$$

This new distribution takes the mean value for each vector entry. Consider the example below, where $x_1=[9,8,12]$, $x_2=[10,11,8]$, and $x_3=[12,7,9]$.

In [32]:
x_1 = np.array([9,8,12])
x_2 = np.array([10,11,8])
x_3 = np.array([12,7,9])

x = (1./3)*(x_1+x_2+x_3)
x

array([ 10.33333333,   8.66666667,   9.66666667])

The new entries are the element-wise mean of the three data sets.
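When there are many data sets, writing out the sum by hand becomes unwieldy. One possible alternative, sketched here with the same three vectors, is to stack them into a 2-D array and average along the first axis; the name `stacked` is introduced only for this illustration.

```python
import numpy as np

x_1 = np.array([9, 8, 12])
x_2 = np.array([10, 11, 8])
x_3 = np.array([12, 7, 9])

# One row per data set, one column per measurement index k.
stacked = np.stack([x_1, x_2, x_3])

# Averaging over axis 0 gives the element-wise mean across data sets.
x = stacked.mean(axis=0)
print(x)
```

This gives the same values as the explicit `(1./3)*(x_1+x_2+x_3)` and works unchanged for any number of data sets.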

## The Prediction of the Central Limit Theorem

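The central limit theorem predicts that, for large $N$, the distribution $x$ approaches a normal distribution with mean $\frac{1}{N}\sum_{i=1}^N \mu_i$ and variance $\frac{1}{N^2}\sum_{i=1}^N \sigma_i^2$, provided the variances $\sigma_i^2$ are finite and no single variable dominates the sum.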
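To illustrate that the prediction does not require identical distributions, the sketch below mixes two kinds of variables: 15 uniform on $[0, 1)$ and 15 exponential with scale 1. This mix, the seed, and the sizes `N` and `K` are assumptions made for the illustration. For independent variables the mean of $x$ is $\frac{1}{N}\sum_i \mu_i$ and its variance is $\frac{1}{N^2}\sum_i \sigma_i^2$, and for a normal distribution roughly 68% of values lie within one standard deviation of the mean.

```python
import numpy as np

rng = np.random.default_rng(seed=1)  # arbitrary seed, only for reproducibility

N = 30        # number of data sets x_i
K = 100_000   # measurements per data set

# Half the data sets are uniform on [0, 1), half are exponential with scale 1.
uniform_part = rng.random((N // 2, K))
exponential_part = rng.exponential(scale=1.0, size=(N // 2, K))
samples = np.vstack([uniform_part, exponential_part])

# Element-wise mean over the N data sets.
x = samples.mean(axis=0)

# Known means and variances: uniform (1/2, 1/12), exponential (1, 1).
mu = np.concatenate([np.full(N // 2, 0.5), np.full(N // 2, 1.0)])
var = np.concatenate([np.full(N // 2, 1.0 / 12.0), np.full(N // 2, 1.0)])
predicted_mean = mu.sum() / N
predicted_std = np.sqrt(var.sum()) / N

# Fraction of entries of x within one predicted standard deviation of the mean.
within_one_sigma = np.mean(np.abs(x - predicted_mean) < predicted_std)

print(x.mean(), predicted_mean)
print(x.std(), predicted_std)
print(within_one_sigma)
```

Because the exponential variables are skewed, the agreement with the normal value is approximate at $N=30$ and should tighten as $N$ grows.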

# Programming Examples

## Application of Central Limit Theorem on Uniform Distributions

Below we make 50 measurements of a uniformly distributed variable (drawn from $[0, 1)$ with `np.random.rand`) for each data set $x_i(k)$. In total we measure $N=30$ data sets.

In [62]:
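num_datasets = 30
num_measurements = 50

# Each row holds one data set: 50 draws from the uniform distribution on [0, 1).
# DataFrame.append was removed in pandas 2.0, so the rows are collected first.
data = [np.random.rand(num_measurements) for _ in range(num_datasets)]
index = ['$x_{}$'.format(i) for i in range(num_datasets)]
df = pd.Series(data, index=index).to_frame('Data Set')

df.head()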

| | Data Set |
|---|---|
| $x_0$ | [0.457998615832, 0.0652584516164, 0.1847594649... |
| $x_1$ | [0.15520448996, 0.187205538144, 0.620046926802... |
| $x_2$ | [0.105056391327, 0.693403246722, 0.48446242380... |
| $x_3$ | [0.254089585794, 0.00362408349563, 0.518781290... |
| $x_4$ | [0.690431188056, 0.958970692975, 0.56146227269... |
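A frame like this stores one array per row, so the element-wise mean defined earlier cannot be taken with a plain column operation. One way to recover it, sketched here on a freshly generated frame (so the values differ from the table above), is to stack the stored arrays into a 2-D array first. The names `stacked` and `rng` and the seed are assumptions for this illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=2)  # arbitrary seed, only for reproducibility
num_datasets, num_measurements = 30, 50

# Same layout as above: one row per data set, each cell holding 50 measurements.
data = [rng.random(num_measurements) for _ in range(num_datasets)]
index = ['$x_{}$'.format(i) for i in range(num_datasets)]
df = pd.Series(data, index=index).to_frame('Data Set')

# Stack the per-row arrays into shape (num_datasets, num_measurements).
stacked = np.stack(df['Data Set'].tolist())

# Element-wise mean over the data sets: one value per measurement index k.
x = stacked.mean(axis=0)
print(x.shape)
```

Each entry of `x` is then the average of 30 uniform measurements, which is the quantity the central limit theorem describes.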


The central limit theorem claims that if we 