$
    \def\given{\,\middle|\,}
    \newcommand{\setm}{\setminus}
    \newcommand{\Pr}[1]{\mathbf{P}\left(#1 \right)}
    \newcommand{\qp}[1]{\left(#1\right)}
    \newcommand{\Var}[1]{\mathrm{Var}\left[#1\right]}
    \newcommand{\E}[1]{\mathrm{E}\left[#1\right]}
$

# Approximations with the Central Limit Theorem

- Submitted by Colton Grainger for Math 451-1: Probability Theory, Engineering Outreach.
- Text: Samaniego, F.J., *Stochastic Modeling and Mathematical Statistics*, CRC Press, 2014.
    - § 5.5, Problems 24, 27, 28, 31, 35

### Prob 24

Let the $X_i$ be independent and identically distributed (iid) RVs with pmf $p(1) = \frac12$, $p(-1) = \frac12$.

For any of the $X_i$, we have 

- $\mu = \E{X_i} = 0$,
- $\sigma^2 = \Var{X_i} = 1$.
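
As a quick check of the two moments above, here is a minimal sketch that computes the mean and variance of a finite pmf with exact fractions. The helper name `moments` is hypothetical and not part of the original notebook.

```python
from fractions import Fraction

def moments(pmf):
    """Return (mean, variance) of a finite pmf given as {value: probability}."""
    mean = sum(x * p for x, p in pmf.items())
    second = sum(x * x * p for x, p in pmf.items())  # E[X^2]
    return mean, second - mean ** 2

# pmf of a single X_i in this problem: +1 or -1, each with probability 1/2
pmf = {1: Fraction(1, 2), -1: Fraction(1, 2)}
mu, var = moments(pmf)
print(mu, var)  # 0 1
```

The same helper applies to any other two-point pmf once the probabilities are swapped in.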

Now consider the sum of $n$ of the $X_i$, 

$$
S_n = \sum_{i=1}^n X_i.
$$

For large $n$, we'll approximate the distribution of $S_n$ with a normal distribution. That is, with

$$
g(x) = \frac{x - n\mu}{\sqrt{n}\sigma},
$$

the Central Limit Theorem guarantees that $g(S_n)$ [converges in distribution](https://en.wikipedia.org/wiki/Convergence_of_random_variables#Convergence_in_distribution) to $Z \sim N(0,1)$ as $n \to \infty$.
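
One way to see this convergence at work is a small simulation. The sketch below is an illustration rather than part of the original solution: it draws many copies of $g(S_n)$ for $n = 500$ and compares the empirical CDF with $\Phi$ at a few points. The seed, the number of trials, and the helper names are arbitrary assumptions, and `statistics.NormalDist` from the standard library stands in for `scipy.stats.norm`.

```python
import random
from statistics import NormalDist

random.seed(0)            # fixed seed so the run is repeatable
n, trials = 500, 20000    # assumed simulation size

def sample_standardized_sum():
    # n fair random bits; each 1-bit is a +1 step and each 0-bit a -1 step
    heads = bin(random.getrandbits(n)).count("1")
    s_n = 2 * heads - n
    return s_n / n ** 0.5  # g(S_n) with mu = 0 and sigma = 1

samples = [sample_standardized_sum() for _ in range(trials)]

def empirical_cdf(z):
    # fraction of simulated values of g(S_n) that are <= z
    return sum(x <= z for x in samples) / trials

for z in (-1.0, 0.0, 1.0):
    print(z, round(empirical_cdf(z), 3), round(NormalDist().cdf(z), 3))
```

Because $S_n$ only takes even values here, the empirical CDF is a step function, so small gaps from $\Phi$ at individual points are expected.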

We want $\Pr{S_n \geq 25}$ or, equivalently, since $g$ is strictly increasing,
$$
\Pr{g(S_n) \geq g(25)}.
$$

With $n = 500$, we approximate 
$$
\begin{align}
    \Pr{g(S_n) \geq g(25)} &\approx \Pr{ Z \geq g(25)}\\
        & = 1 - \Phi(g(25))\\
        & \approx 0.1318.
\end{align}
$$

In [None]:
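from math import sqrt
from scipy.stats import norm

mean = 0      # mu, the mean of a single X_i
stddev = 1    # sigma, the standard deviation of a single X_i
n = 500

def g(x):
    # standardize x against S_n using the current globals mean, stddev and n;
    # float() guarantees a plain numeric result
    return float((x - n*mean) / (stddev*sqrt(n)))

print(1 - norm.cdf(g(25)))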
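
Since $S_n$ is a shifted and scaled binomial, the normal approximation can be compared with the exact tail probability. The following is a sketch of that comparison and not part of the original solution; the variable names are hypothetical, and `statistics.NormalDist` is used so the block needs only the standard library.

```python
from math import ceil, comb, sqrt
from statistics import NormalDist

n = 500
# S_n = 2*H - n, where H ~ Binomial(n, 1/2) counts the +1 steps,
# so S_n >= 25 exactly when H >= (25 + n)/2, i.e. H >= 263.
h_min = ceil((25 + n) / 2)
exact = sum(comb(n, h) for h in range(h_min, n + 1)) / 2 ** n

clt = 1 - NormalDist().cdf(25 / sqrt(n))  # normal approximation, mu = 0, sigma = 1
print(round(exact, 4), round(clt, 4))
```

The two numbers should agree closely here, in part because 25 sits midway between the attainable values 24 and 26 and so acts like a continuity-corrected cutoff.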

### Prob 27

Let the $X_i$ be independent, identically distributed random variables with probability mass function $p(1) = \frac9{19}$ and $p(-1) = \frac{10}{19}$. 

We find
- $\mu = \E{X_i} = -\frac{1}{19}$,
- $\sigma^2 = \Var{X_i} = \frac{360}{361}$.

By the Central Limit Theorem, $\Pr{\frac{S_n - n\mu}{\sqrt{n}\sigma} \leq x} \approx \Phi(x)$, for large $n$. In our case, $n = 1000$. 

We thus approximate 
$$
\begin{align}    
        \Pr{S_n \geq 0} &= 1 - \Pr{S_n < 0}\\
        &= 1 - \Pr{\frac{S_n - n\mu}{\sqrt{n}\sigma}  < -\frac{n\mu}{\sqrt{n}\sigma}}
        \\
        &\approx 1 - \Phi\left(-\frac{n\mu}{\sqrt{n}\sigma}\right)\\
        &\approx 0.04779.
\end{align}
$$
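
For reference, the argument of $\Phi$ simplifies nicely with $n = 1000$, $\mu = -\frac1{19}$ and $\sigma = \frac{\sqrt{360}}{19}$. This worked step is an addition for clarity:

$$
-\frac{n\mu}{\sqrt{n}\sigma} = \frac{1000/19}{\sqrt{1000}\cdot\sqrt{360}/19} = \frac{1000}{\sqrt{360000}} = \frac{1000}{600} = \frac53,
$$

so the approximation is $1 - \Phi\left(\frac53\right)$.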

In [None]:
mean = -1/19
stddev = sqrt(1 - (1/19)**2)   # sigma = sqrt(E[X^2] - mu^2), with E[X^2] = 1
n = 1000

print(1 - norm.cdf(g(0)))
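
The exact probability is again a binomial tail, so it can be computed with integer arithmetic and set beside the normal approximation. This is a sketch for comparison only; the variable names are hypothetical and only the standard library is used.

```python
from math import comb, sqrt
from statistics import NormalDist

n = 1000
# S_n = 2*W - n, where W ~ Binomial(n, 9/19) counts the +1 outcomes,
# so S_n >= 0 exactly when W >= 500.
numerator = sum(comb(n, w) * 9 ** w * 10 ** (n - w) for w in range(500, n + 1))
exact = numerator / 19 ** n   # exact integers, converted to float only at the end

mu = -1 / 19
sigma = sqrt(360) / 19
clt = 1 - NormalDist().cdf(-n * mu / (sqrt(n) * sigma))
print(round(exact, 4), round(clt, 4))
```

Some gap between the two values is to be expected, because the approximation above applies no continuity correction.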

### Prob 28

Suppose the $X_i$ are $\stackrel{\text{iid}}{\sim} G(1/2)$, that is, geometric with success probability $\frac12$.

For each $X_i$, we have

- $\mu = \E{X_i} = 2$,
- $\sigma^2 = \Var{X_i} = 2$.

Now consider the sum
$$
S_n = \sum_{i=1}^n X_i.
$$

We desire $\Pr{S_n \geq 220}$ for $n = 100$. 

Since $n$ is large, we'll apply the Central Limit Theorem to approximate

$$
\begin{align}
    \Pr{S_n \geq 220} &= 1 - \Pr{S_n < 220}\\
        &= 1- \Pr{\frac{S_n - n\mu}{\sqrt{n}\sigma} < \frac{220 - n\mu}{\sqrt{n}\sigma}}\\
        &\approx  1- \Phi\left(\frac{220 - n\mu}{\sqrt{n}\sigma}\right)\\
        &\approx 0.07865.
\end{align}
$$

In [None]:
mean = 2
stddev = sqrt(2)
n = 100

print(1 - norm.cdf(g(220)))
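
A sum of geometric variables has a negative binomial distribution, so the exact tail can be written as a binomial sum. The sketch below assumes each $X_i$ counts trials up to and including the first success, which matches $\mu = 2$. It is a comparison added for illustration, with hypothetical variable names.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.5
# Assumption: X_i counts trials up to and including the first success, so S_n
# is the trial on which the n-th success occurs. Then S_n >= 220 exactly when
# the first 219 trials contain at most n - 1 = 99 successes.
exact = sum(comb(219, k) for k in range(n)) / 2 ** 219

mu, sigma = 1 / p, sqrt((1 - p) / p ** 2)
clt = 1 - NormalDist().cdf((220 - n * mu) / (sqrt(n) * sigma))
print(round(exact, 4), round(clt, 4))
```

The exact value is expected to sit somewhat above the approximation, since no continuity correction is applied to the cutoff 220.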

### Prob 31

Suppose the $X_i$ are iid RVs with 

- $\mu = \E{X_i} = 12$,
- $\sigma^2 = \Var{X_i} = 4$, so $\sigma = 2$.

Then, considering $S = \sum_{i=1}^{100} X_i$, we have 

$$
\begin{align}
    \Pr{S \leq 1230} 
        &\approx  \Phi\left(\frac{1230 - 100\cdot\mu}{10\cdot\sigma}\right)\\
        &\approx 0.9332.
\end{align}
$$

In [None]:
mean = 12
stddev = 2
n = 100

def g(x):
    # standardize x against S_n using the current globals mean, stddev and n;
    # float() guarantees a plain numeric result
    return float((x - n*mean) / (stddev*sqrt(n)))

print(norm.cdf(g(1230)))

### Prob 35

Suppose the $X_i$ are iid $\mathrm{Exp}\left(\frac1{10}\right)$ RVs.

Then

- $\mu = \E{X_i} = 10$,
- $\sigma^2 = \Var{X_i} = 100$.

Considering $S = \sum_{i=1}^{100} X_i$, we have 

$$
\begin{align}
    \Pr{S \leq 1050} 
        &\approx  \Phi\left(\frac{1050 - 1000}{100}\right)\\
        &\approx 0.6915.
\end{align}
$$

In [None]:
mean = 10
stddev = 10
n = 100

print(norm.cdf(g(1050)))
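
A sum of $n$ iid exponentials follows an Erlang (gamma) distribution, whose CDF can be written as a Poisson tail: $\Pr{S \leq x}$ equals the probability that a Poisson variable with mean $\lambda x$ is at least $n$. The sketch below uses that identity to compare the exact value with the normal approximation. The function name `erlang_cdf` is hypothetical, and $\frac1{10}$ is taken to be the rate, which matches $\mu = 10$.

```python
from math import exp
from statistics import NormalDist

n, rate, x = 100, 1 / 10, 1050

def erlang_cdf(x, n, rate):
    """P(S <= x) for S a sum of n iid Exp(rate) variables (Erlang distribution)."""
    lam = rate * x
    term = exp(-lam)       # Poisson(lam) probability of 0 events
    below = 0.0
    for k in range(n):     # accumulate P(Poisson(lam) <= n - 1)
        below += term
        term *= lam / (k + 1)
    return 1 - below

exact = erlang_cdf(x, n, rate)
clt = NormalDist(mu=n / rate, sigma=n ** 0.5 / rate).cdf(x)
print(round(exact, 4), round(clt, 4))
```

The exponential distribution is skewed, so a small difference between the exact and approximate values remains even at $n = 100$.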