# The Central Limit Theorem

Tutorial: https://www.hackerrank.com/challenges/s10-the-central-limit-theorem-1/tutorial

The central limit theorem states that under certain (fairly common) conditions, the sum of many random variables will have an approximately normal distribution.

https://en.wikipedia.org/wiki/Normal_distribution#Central_limit_theorem

For large $n$, the distribution of the sample sum $S_n$ is close to the normal distribution $N(\mu', \sigma')$, where $\mu$ and $\sigma$ are the mean and standard deviation of a single variable and:
$$\mu' = n \cdot \mu$$
$$\sigma' = \sqrt{n} \cdot \sigma$$
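The two formulas can be checked empirically. The sketch below is only an illustration: the choice of Uniform(0, 1) variables, the values of `n` and `trials`, and the fixed seed are all assumptions made for the example, not part of the tutorial. It simulates many sums, compares their mean and standard deviation with $n \cdot \mu$ and $\sqrt{n} \cdot \sigma$, and counts how many sums fall within one $\sigma'$ of $\mu'$ (a normal distribution predicts roughly 68%).

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is reproducible (arbitrary choice)

n = 30          # number of variables in each sum (illustrative choice)
trials = 20000  # number of simulated sums (illustrative choice)

# A single Uniform(0, 1) draw has mean 1/2 and standard deviation sqrt(1/12).
mu = 0.5
sigma = math.sqrt(1.0 / 12.0)

# Values predicted by the central limit theorem for the sum of n draws.
predicted_mean = n * mu
predicted_stdev = math.sqrt(n) * sigma

# Draw `trials` independent sums, each of n uniform variables.
sums = [sum(random.random() for _ in range(n)) for _ in range(trials)]

empirical_mean = statistics.fmean(sums)
empirical_stdev = statistics.stdev(sums)

# Fraction of sums within one predicted standard deviation of the predicted mean.
within_one = sum(abs(s - predicted_mean) <= predicted_stdev for s in sums) / trials

print("mean: ", round(empirical_mean, 3), "predicted:", round(predicted_mean, 3))
print("stdev:", round(empirical_stdev, 3), "predicted:", round(predicted_stdev, 3))
print("fraction within one stdev:", round(within_one, 3))
```

The empirical values should land close to the predicted ones, and the agreement generally improves as `n` and `trials` grow.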

### Exercise I

In [7]:
'''Standardized Z'''
import math

max_load = 9800 # maximum load that can be transported (pounds)
n = 49 # number of boxes

# Box weight follows distribution ~ (mean = 205, stdev = 15)
mean = 205 # per box (pounds)
stdev = 15 # per box(pounds)

# The sum S of all 49 box weights is approximately normal with (MEAN, STDEV), where:
MEAN = mean * n
STDEV = stdev * math.sqrt(n)

# Standardize to get z, which follows the standard normal distribution ~ (0, 1)
z = (max_load - MEAN) / STDEV

def phi(x):
    'Cumulative distribution function for the standard normal distribution'
    return (1.0 + math.erf(x / math.sqrt(2.0))) / 2.0

# Probability that elevator can safely transport all 49 boxes (Weight of 49 boxes is below max_load)
result = phi(z)
print(round(result, 4))

0.0098
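As a cross-check, the same probability can be obtained without a hand-written `phi`, using `statistics.NormalDist` from the Python standard library (available from Python 3.8). This is an alternative sketch, not the tutorial's method; the names `total_weight` and `probability` are introduced here only for illustration.

```python
import math
from statistics import NormalDist

n = 49                 # number of boxes
mean, stdev = 205, 15  # per-box mean and standard deviation (pounds)
max_load = 9800        # maximum load (pounds)

# Approximate distribution of the total weight S of all 49 boxes.
total_weight = NormalDist(mu=n * mean, sigma=math.sqrt(n) * stdev)

# P(S <= max_load), evaluated directly with no manual standardization step.
probability = total_weight.cdf(max_load)
print(round(probability, 4))
```

Because `NormalDist.cdf` handles the mean and standard deviation itself, this should agree with the standardized computation above.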


### Exercise II

In [3]:
import math
'''question'''
# number of students
n = 100
# number of tickets per student ~ (mean = 2.4, stdev = 2.0)
mean = 2.4
stdev = 2.0
# number of tickets left
left = 250
# What is the probability that all 100 students will be able to purchase tickets?
'''answer'''
# Total number of tickets wanted by the n students: S ~ (MEAN, STDEV)
MEAN = n * mean
STDEV = math.sqrt(n) * stdev
standardized_S = (left - MEAN) / STDEV
# standardized_S ~ standard normal distribution ( mean = 0, stdev = 1)

def phi(x):
    'Cumulative distribution function for the standard normal distribution'
    return (1.0 + math.erf(x / math.sqrt(2.0))) / 2.0

print(round(phi(standardized_S),4))

0.6915
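Written out by hand, the arithmetic behind this result is:

$$\mu' = 100 \cdot 2.4 = 240, \qquad \sigma' = \sqrt{100} \cdot 2.0 = 20$$
$$z = \frac{250 - 240}{20} = 0.5, \qquad P(S_{100} \le 250) \approx \Phi(0.5) \approx 0.6915$$

Here $\Phi$ denotes the standard normal cumulative distribution function, which is what `phi` computes.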


### Exercise III

In [17]:
import math
'''question'''
# sample of 100 from a population ~ (mean_pop = 500, stdev_pop = 80)
n = 100
mean_pop = 500
stdev_pop = 80

# sample mean ~ (mean_sam = mean_pop, stdev_sam = stdev_pop / math.sqrt(n))
mean_sam = mean_pop
stdev_sam = stdev_pop / math.sqrt(n)

# Compute the interval that covers the middle 95% of the distribution of the sample mean
p = .95
# z-score that bounds the middle 95% of a standard normal distribution = 1.96
z = 1.96

'''answer'''
A = mean_sam - stdev_sam * z
B = mean_sam + stdev_sam * z
print(round(A, 2))
print(round(B, 2))

484.32
515.68
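The z-score for a central interval does not have to be looked up in a table. A minimal sketch, assuming Python 3.8+ for `statistics.NormalDist`: the middle `p` of the distribution leaves `(1 - p) / 2` in each tail, so the upper bound is the `1 - (1 - p) / 2` quantile of the standard normal distribution.

```python
from statistics import NormalDist

p = 0.95  # coverage of the central interval

# The middle 95% leaves 2.5% in each tail, so the upper cut-off
# is the 0.975 quantile of the standard normal distribution.
z = NormalDist().inv_cdf(1 - (1 - p) / 2)
print(round(z, 2))
```

For `p = 0.95` this gives a value of approximately 1.96.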
