
In [0]:
#@title Imports
!pip install -q symbulate
from symbulate import *

# Normal Distribution

The normal distribution is the most important distribution in probability and statistics. The reason has to do with asymptotic theory (what happens to sums and averages as $n \to \infty$), which will be covered in STAT 426.

The normal distribution goes by other names:
- Gaussian distribution (especially common in engineering)
- the bell curve

## The Standard Normal Distribution

A random variable $Z$ with a standard normal distribution is described by the following p.d.f. 

$$ p(z) = \frac{1}{\sqrt{2\pi}} e^{-z^2 / 2}. $$

It is defined for all real values $z$, from $-\infty$ to $\infty$.

The distribution looks like this:

In [0]:
Normal().plot()

### Calculating Probabilities

To calculate probabilities, we integrate the p.d.f. over the relevant region. For example,

$$ P(Z \leq 1) = \int_{-\infty}^1 \frac{1}{\sqrt{2\pi}} e^{-z^2 / 2}\,dz. $$

Unlike the other continuous distributions we have studied, the p.d.f. $p(z)$ has no elementary antiderivative. That means you cannot evaluate this integral with pencil and paper using the techniques you learned in calculus; it has to be evaluated numerically. Fortunately, you can do this easily in Symbulate.

For example, $P(Z \leq 1)$ is just the c.d.f. evaluated at $1$. The c.d.f. of the standard normal distribution is often represented by $\Phi(z)$. So we need to calculate $\Phi(1)$.

In [0]:
Normal().cdf(1)
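
To make "evaluated numerically" concrete, here is a minimal sketch that integrates the p.d.f. directly with Simpson's rule and compares the result with a library value of $\Phi(1)$. It uses Python's standard library (`statistics.NormalDist`) rather than Symbulate, and the helper names `std_normal_pdf` and `simpson`, as well as the choice to truncate the lower limit at $-10$, are assumptions of this sketch.

```python
import math
from statistics import NormalDist


def std_normal_pdf(z):
    """Standard normal p.d.f. p(z) = exp(-z^2 / 2) / sqrt(2 * pi)."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)


def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


# The lower limit -infinity is truncated at -10 (an assumption of this sketch);
# the probability below -10 is far too small to affect the result.
approx = simpson(std_normal_pdf, -10.0, 1.0)
reference = NormalDist().cdf(1)  # standard library value of Phi(1)
print(approx, reference)
```

The two printed numbers should agree to many decimal places, which is essentially what any c.d.f. routine does for you behind the scenes.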

How would you calculate $P(-2 < Z < 2)$?

In [0]:
# YOUR CODE HERE

### Where does the $\frac{1}{\sqrt{2\pi}}$ come from? (optional)

The $\frac{1}{\sqrt{2\pi}}$ turns out to be precisely the constant we need to make the total probability equal to 1. It seems oddly precise for a function that has no closed-form antiderivative. Where does it come from?

We need to show that
$$ I \overset{\text{def}}{=} \int_{-\infty}^\infty e^{-x^2 / 2}\,dx = \sqrt{2\pi}. $$

To do this, we will calculate 
$$ I^2 = \left(\int_{-\infty}^\infty e^{-x^2/2}\,dx \right) \left(\int_{-\infty}^\infty e^{-y^2/2}\,dy \right), $$
which we can write as the double integral 
$$ \int_{-\infty}^\infty \int_{-\infty}^\infty e^{-x^2/2} e^{-y^2/2}\,dx\,dy. $$
Now we change to polar coordinates $(x, y) \mapsto (r, \theta)$. Remembering that $r^2 = x^2 + y^2$ and $dx\,dy = r\,dr\,d\theta$, we have 
$$ \int_{0}^{2\pi}\int_{0}^\infty e^{-r^2/2} r\,dr\,d\theta. $$
Now the inner integral over $r$ is a simple $u$-substitution. (Let $u = r^2 / 2$ so that $du = r\,dr$.) The inner integral evaluates to $1$; integrating the constant function $1$ from $\theta=0$ to $2\pi$ yields $2\pi$. So 
$$ I^2 = 2\pi $$
and 
$$ I = \sqrt{2\pi}.$$
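
The result can also be sanity-checked numerically. The following is a rough sketch, not part of the derivation: it approximates $I$ with the trapezoid rule on $[-10, 10]$ (the truncation and the helper names `gaussian_kernel` and `trapezoid` are assumptions made here) and compares it with $\sqrt{2\pi}$.

```python
import math


def gaussian_kernel(x):
    """The unnormalized integrand exp(-x^2 / 2)."""
    return math.exp(-x * x / 2)


def trapezoid(f, a, b, n):
    """Composite trapezoid rule on [a, b] with n subintervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


# Truncating the infinite range to [-10, 10] is an assumption of this sketch;
# the integrand is negligibly small outside that interval.
I_approx = trapezoid(gaussian_kernel, -10.0, 10.0, 4000)
print(I_approx, math.sqrt(2 * math.pi))  # the integral vs. sqrt(2*pi)
print(I_approx ** 2, 2 * math.pi)        # its square vs. 2*pi
```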

## The (General) Normal Distribution

The standard normal distribution is centered at 0 with a variance of 1. In general, we can
- scale the bell shape to be as wide as we want, 
- shift the bell shape to be centered wherever we want.

If $Z$ is standard normal, then
$$ X = \mu + \sigma Z $$
is $\text{Normal}(\mu, \sigma)$, a normal distribution with mean $\mu$ and standard deviation $\sigma$.

In [0]:
Normal().plot()
Normal(mean=0, sd=2).plot()
Normal(mean=3, sd=0.5).plot()
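
The shift-and-scale relationship $X = \mu + \sigma Z$ can be checked by simulation. The sketch below uses only Python's standard library (not Symbulate); the seed and sample size are arbitrary choices made for illustration.

```python
import random
import statistics

random.seed(425)  # arbitrary seed, only so the run is reproducible

mu, sigma = 3.0, 0.5  # same parameters as the third curve plotted above
n = 100_000           # number of simulated draws (arbitrary)

# Draw standard normal values, then shift and scale them.
z = [random.gauss(0, 1) for _ in range(n)]
x = [mu + sigma * zi for zi in z]

sample_mean = statistics.fmean(x)
sample_sd = statistics.stdev(x)
print(sample_mean, sample_sd)
```

The sample mean and standard deviation of the transformed draws should land close to $\mu = 3$ and $\sigma = 0.5$.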

## Example

A binary communications system transmits either a $+2$V signal or a $-2$V signal. The signal incurs Gaussian noise with mean $0$V and standard deviation $1.5$V in the transmission process so that the received signal is 
$$ X = v + N, $$
where $v =$ transmitted voltage and $N =$ noise.

The receiver judges the intended signal by whether the received voltage is positive or negative. For example, if the received voltage is negative, the receiver infers the intended signal was $-2$V.

In [0]:
Normal(mean=-2, sd=1.5).plot()
Normal(mean=2, sd=1.5).plot()

### Question 1.

If a $-2$V signal is sent, what is the probability the receiver makes the wrong judgment?

**Modern Way**

Note that $X$ is a $\text{Normal}(\mu=-2, \sigma=1.5)$ random variable, and we are interested in $P(X > 0)$.

In [0]:
1 - Normal(mean=-2, sd=1.5).cdf(0)
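
As an independent check, the channel itself can be simulated. This is a minimal Monte Carlo sketch using Python's standard library instead of Symbulate; the seed, the number of trials, and the variable names are assumptions made for illustration.

```python
import random
from statistics import NormalDist

random.seed(0)  # arbitrary seed, only so the run is reproducible

v = -2.0            # transmitted voltage
noise_sd = 1.5      # standard deviation of the Gaussian noise
n_trials = 200_000  # number of simulated transmissions (arbitrary)

errors = 0
for _ in range(n_trials):
    received = v + random.gauss(0, noise_sd)  # X = v + N
    if received > 0:  # receiver would infer +2 V: a wrong judgment
        errors += 1

estimate = errors / n_trials
exact = 1 - NormalDist(mu=-2, sigma=1.5).cdf(0)  # P(X > 0)
print(estimate, exact)
```

The simulated error rate should be close to the exact probability, differing only by sampling noise.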

**In the olden days...**

Since $X$ is $\text{Normal}(\mu=-2, \sigma=1.5)$, we know that $X = -2 + 1.5 Z$, where $Z$ is standard normal. In other words, $Z = \frac{X - (-2)}{1.5}$. This process (subtracting the mean and dividing by the standard deviation) is called _standardization_.

By standardizing, we reduce the problem to a probability under the standard normal distribution:

$$ P(X > 0) = P\left(\frac{X - (-2)}{1.5} > \frac{0 - (-2)}{1.5}\right) = P(Z > 4/3) \approx P(Z > 1.33). $$

How do we find this probability? By looking it up in a table of standard normal probabilities, [like this one](https://web.calpoly.edu/~dsun09/tables/normal_cdf_table.pdf). This table gives us the value of the standard normal c.d.f. $\Phi(z)$ at different values. We see that $\Phi(1.33) \approx 0.9082$, so

$$ P(Z > 1.33) = 1 - P(Z \leq 1.33) = 1 - \Phi(1.33) \approx 1 - 0.9082 = 0.0918. $$
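
The standardization step can be sketched in code as well. The example below uses `statistics.NormalDist` from the standard library as a stand-in for both Symbulate and the printed table; rounding $z$ to two decimals to mimic a table lookup is an assumption of this sketch.

```python
from statistics import NormalDist

mu, sigma = -2.0, 1.5
x = 0.0

z = (x - mu) / sigma                         # standardize: z = 4/3
p_standardized = 1 - NormalDist().cdf(z)     # P(Z > z) under the standard normal
p_direct = 1 - NormalDist(mu, sigma).cdf(x)  # P(X > 0) computed directly

# A printed table only lists z to two decimals, so mimic that rounding.
p_table = 1 - NormalDist().cdf(round(z, 2))
print(z, p_standardized, p_direct, p_table)
```

The standardized and direct answers agree exactly; the table-style answer differs slightly because $z$ was rounded before the lookup.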

### Question 2.

If a $-2$V signal is sent, what is the 99th percentile of the received voltage? That is, what is the voltage $x$ such that $P(X \leq x) = .99$?

You are familiar with the c.d.f., which translates values $x$ into probabilities. But to answer this question, we need the inverse of the c.d.f.: a function that translates a probability, like $.99$, into a value $x$.

That function is called the **quantile function**.

In [0]:
Normal(mean=-2, sd=1.5).quantile(.99)

So there is a $0.99$ probability that the received voltage is less than about $1.49$V. Out of an abundance of caution, let's double-check this using the c.d.f.

In [0]:
Normal(mean=-2, sd=1.5).cdf(1.49)
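
To see why the quantile function really is the inverse of the c.d.f., here is a small sketch that recovers the 99th percentile by bisection on the c.d.f. and compares it with a built-in inverse. It uses `statistics.NormalDist` from the standard library rather than Symbulate; the helper `quantile_by_bisection` and the search bracket are hypothetical choices made for this illustration.

```python
from statistics import NormalDist

dist = NormalDist(mu=-2, sigma=1.5)


def quantile_by_bisection(cdf, p, lo, hi, tol=1e-10):
    """Find x in [lo, hi] with cdf(x) = p, assuming cdf is increasing."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cdf(mid) < p:
            lo = mid  # the quantile lies to the right of mid
        else:
            hi = mid  # the quantile lies at or to the left of mid
    return (lo + hi) / 2


# Bracket of mean +/- 10 standard deviations (a choice made for this sketch).
x_bisect = quantile_by_bisection(dist.cdf, 0.99, -17.0, 13.0)
x_builtin = dist.inv_cdf(0.99)  # the standard library's quantile function
print(x_bisect, x_builtin, dist.cdf(x_bisect))
```

Plugging the bisection result back into the c.d.f. should return a probability very close to $.99$.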

### Question 3.

Suppose we want to reduce the probability that the receiver makes the wrong judgment to 1%. How much would we have to reduce the standard deviation of the noise?

Now, the received signal $X$ follows a $\text{Normal}(\mu=-2, \sigma)$ distribution, where $\sigma$ is unknown. Our job is to find $\sigma$ so that $P(X > 0) = .01$.

Standardization can help here. We know that $Z = \frac{X - (-2)}{\sigma}$ follows a standard normal distribution, so we need 

$$.01 = P(X > 0) = P\left(Z > \frac{0 - (-2)}{\sigma}\right) = P(Z > 2 / \sigma). $$

If we can find the value of $z$ that achieves $P(Z > z) = .01$, then we can set it equal to $2/\sigma$ and solve for $\sigma$. 

Note that $P(Z > z) = .01$ is equivalent to $P(Z \leq z) = .99$, so we just need a quantile of the standard normal distribution.

In [0]:
Normal().quantile(.99)

So we need $2/\sigma = 2.326$, or $\sigma \approx 0.86$. We would need to reduce the standard deviation by about 43% relative to its current value of $1.5$.
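
The arithmetic in this question can be reproduced in a few lines. This is a sketch using `statistics.NormalDist` from the standard library in place of Symbulate; the variable names are illustrative only.

```python
from statistics import NormalDist

target_error = 0.01  # desired probability of a wrong judgment
current_sd = 1.5     # current noise standard deviation

z = NormalDist().inv_cdf(1 - target_error)  # z with P(Z > z) = 0.01
new_sd = 2 / z                              # solve 2 / sigma = z for sigma
reduction = 1 - new_sd / current_sd         # fractional reduction needed

# Check: the error probability P(X > 0) under the new standard deviation.
check = 1 - NormalDist(mu=-2, sigma=new_sd).cdf(0)
print(z, new_sd, reduction, check)
```

The final check plugs the solved $\sigma$ back into $P(X > 0)$, which should come out at the target of $.01$.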

In [0]:
Normal(mean=-2, sd=1.5).plot(xlim=(-6, 6))
Normal(mean=-2, sd=0.86).plot(xlim=(-6, 6))