# Section 0. What is the Central Limit Theorem?

## Introduction

There are two usual ways of justifying the widespread use of the Gaussian, or "normal", distribution. One is convenience: [TODO: TALK ABOUT SPECIAL PROPERTIES]. The second, somewhat related, is the Central Limit Theorem. Informally, it can be stated as:

> The distribution of the sum of many small independent random variables approaches a normal distribution whose mean and variance are the sums of the means and variances of the original variables.
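
As a sketch of what this means in the simplest case, assume (more strongly than the informal statement requires) that the variables $X_1, \dots, X_n$ are independent and identically distributed, with finite mean $\mu$ and finite variance $\sigma^2 > 0$. The symbols $X_i$, $S_n$, $\mu$ and $\sigma$ are notation introduced here for illustration. The classical form of the theorem then reads:

$$
S_n = \sum_{i=1}^{n} X_i,
\qquad
\frac{S_n - n\mu}{\sigma\sqrt{n}} \;\xrightarrow{\;d\;}\; \mathcal{N}(0, 1)
\quad \text{as } n \to \infty .
$$

In other words, for large $n$ the sum $S_n$ is approximately distributed as $\mathcal{N}(n\mu,\, n\sigma^2)$.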

[TODO: ADD MORE]

## Some examples

We can see the theorem in action by looking at how sums of independent random variables drawn from different distributions end up distributed. We start by generating a large sample from each of several different distributions:

In [None]:
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(1234)

# number of examples of each class to use
N = 1000000

# number of bins
B = 200

# draws N samples each from uniform, triangular, exponential and Bernoulli(0.5) distributions
uniform_vars = np.random.uniform(0, 1, N)
triangular_vars = np.random.triangular(0, 1, 3, N)
exponential_vars = np.random.exponential(3, N)
bernoulli_vars = np.random.choice(2, N)

# plots their distributions
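fig, axs = plt.subplots(2, 2, figsize=(15, 5))
# density=True normalizes each histogram to unit area (replaces the removed `normed` argument)
axs[0,0].hist(uniform_vars, B, density=True)
axs[0,1].hist(triangular_vars, B, density=True)
axs[1,0].hist(exponential_vars, B, density=True)
axs[1,1].hist(bernoulli_vars, B, density=True)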
plt.show()

If we now split each sample into groups of a relatively small size and sum within each group, we observe a surprising convergence in the shapes of the distributions,

In [None]:
B = 200
G = 100

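# sums the samples in groups of size G: each column of the (G, N // G)
# array is one group, so summing over axis 0 yields N // G sums
def group_sum(rv):
    return rv.reshape((G, -1)).sum(axis=0)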
sum_uv = group_sum(uniform_vars)
sum_tv = group_sum(triangular_vars)
sum_ev = group_sum(exponential_vars)
sum_bv = group_sum(bernoulli_vars)

# plots their distributions
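fig, axs = plt.subplots(2, 2, figsize=(15, 5))
# density=True normalizes each histogram to unit area (replaces the removed `normed` argument)
axs[0,0].hist(sum_uv, B, density=True)
axs[0,1].hist(sum_tv, B, density=True)
axs[1,0].hist(sum_ev, B, density=True)
axs[1,1].hist(sum_bv, B, density=True)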
plt.show()

with some important differences:

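  * Mean and variance: each sum has its own mean and variance, equal to G times the mean and variance of the underlying distribution.
  * The sum of Bernoulli variables only takes integer values (a "conserved quantity"), so its distribution remains discrete.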
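
Both differences can be checked numerically. The following is a minimal sketch, not part of the original notebook: it assumes a fresh `np.random.default_rng` generator, a hypothetical sample count `M`, and the standard closed-form means and variances of the four distributions used above. It compares the sample mean and variance of each sum with G times the single-variable values.

```python
import numpy as np

# Separate generator for this sketch (an assumption; the cells above use np.random.seed)
rng = np.random.default_rng(1234)

G = 100      # group size, matching the cell above
M = 20_000   # number of sums per distribution (hypothetical, kept small for speed)

# name -> (sampler, mean of one variable, variance of one variable)
cases = {
    "uniform":     (lambda size: rng.uniform(0, 1, size),       1 / 2, 1 / 12),
    "triangular":  (lambda size: rng.triangular(0, 1, 3, size), 4 / 3, 7 / 18),
    "exponential": (lambda size: rng.exponential(3, size),      3.0,   9.0),
    "bernoulli":   (lambda size: rng.integers(0, 2, size),      1 / 2, 1 / 4),
}

results = {}
for name, (sample, mu, var) in cases.items():
    # M sums, each over G independent draws (one group per column)
    sums = sample((G, M)).sum(axis=0)
    # store the sums with the mean and variance expected for a sum of G draws
    results[name] = (sums, G * mu, G * var)
    print(f"{name:12s} mean {sums.mean():8.2f} (expected {G * mu:8.2f})  "
          f"variance {sums.var():8.2f} (expected {G * var:8.2f})")

# the Bernoulli sums stay on the integers, unlike the other three
bernoulli_sums = results["bernoulli"][0]
print("Bernoulli sums are all integers:", bool(np.all(bernoulli_sums == np.round(bernoulli_sums))))
```

The sample values should land close to the expected ones, with small deviations due to sampling noise that shrink as `M` grows.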