# The Standard Normal Distribution

## Objectives
- Calculate probabilities involving the standard normal distribution.
- Calculate quantiles of the standard normal distribution corresponding to given probabilities.

## The Standard Normal Distribution
To begin to understand how to calculate probabilities for normal distributions, we will first study how to find probabilities for a special kind of normal distribution called the **standard normal distribution**. The standard normal distribution is the particular normal distribution with mean $\mu = 0$ and standard deviation $\sigma = 1$. The standard normal distribution is so important that we reserve the random variable $Z$ exclusively for it, so $Z \sim N(0, 1)$.

In [189]:
png("stdnormal1.png", width = 1000, height = 500)

par(mar = c(2, 0, 0, 0))

#library(repr)
#options(repr.plot.width = 7, repr.plot.height=4)
library("shape")

rangebrace = function(x0, x1, y, d, col){
    Y = seq(y, y+d, length = 500)
    Xl = (x1 - x0)/4 * ( (2/d)^(1/3) * sign(Y - (y + d/2))*abs(Y - (y + d/2))^(1/3) + 1 ) + (x0 + x1)/2
    Xr = (x1 - x0)/4 * ( -(2/d)^(1/3) * sign(Y - (y + d/2))*abs(Y - (y + d/2))^(1/3) - 1 ) + (x0 + x1)/2
    lines(Xl, Y, lwd = 3, col = col)
    lines(Xr, Y, lwd = 3, col = col)
    }

z = seq(-3, 3, 0.01)
y = dnorm(z)
plot(z, y, type="l", xlab = "", ylab = "", axes = FALSE, cex = 2)
polyz = c(-3, seq(-3, 3, 0.01), 3)
polyy = c(0, dnorm(seq(-3, 3, 0.01)), 0)
polygon(polyz, polyy, col="gray90", border="NA")

polyz = c(-3, seq(-3, 1, 0.01), 1)
polyy = c(0, dnorm(seq(-3, 1, 0.01)), 0)
polygon(polyz, polyy, col="lightsteelblue", border = "NA")
lines(c(1, 1), c(0, dnorm(1)))

lines(z, y, type="l", cex = 2)
axis(1, pos=0, at=-3:3, lwd.ticks = 0, cex.axis = 2)
text(3, -0.01, labels = "z", cex = 2.5, pos = 4, xpd = TRUE)

text(-0.3, 0.1, labels = expression(P(Z < 1)), cex = 2, col = "blue3")

#lines(c(0, 0), c(0, dnorm(0)), type = "l", cex = 2)

dev.off()

```{figure} stdnormal1.png
---
width: 100%
alt: The standard normal distribution with the area to the left of z = 1 shaded.
name: stdnormal1
---
The standard normal distribution with the area to the left of $z = 1$ shaded. The shaded area is equal to $P(Z < 1)$.
```

Suppose we want to calculate $P(Z < 1)$. This probability is equal to the area under the curve of the standard normal distribution and to the left of $z = 1$. (See {numref}`Figure {number} <stdnormal1>`.) This region has an unusual shape, so we can't calculate the area of this region using the usual geometric formulas for shapes like rectangles and triangles. For this reason, probabilities of normal distributions are found using computers or calculators instead of being calculated directly.

## Finding Probability Given a $z$-value

For $Z\sim N(0, 1)$, we can calculate the probability $P(Z < z)$ of randomly choosing a value less than $z$ using the R function <code>pnorm</code>:
```R
pnorm(q)
```
The argument <code>q</code> is the $z$-value for which we want the probability $P(Z < z)$. (Here, <code>q</code> stands for **q**uantile, since the probability that <code>pnorm(q)</code> returns is the theoretical proportion of the values that are less than the argument <code>q</code>.)

For example, let's use R to find $P(Z < 1)$.

In [1]:
pnorm(q = 1)

So $P(Z < 1) = 0.8413$.
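Since $P(Z < 1)$ is an area under the standard normal curve, we can sanity-check the result by approximating that area numerically. The sketch below uses R's built-in <code>integrate</code> function on the density function <code>dnorm</code>. This is only an illustration of the "area under the curve" idea, not a description of how <code>pnorm</code> works internally, and the variable name <code>area</code> is our own.

```R
# Approximate the area under the standard normal density to the left of z = 1.
area <- integrate(dnorm, lower = -Inf, upper = 1)

area$value      # numerical approximation of the shaded area
pnorm(q = 1)    # the same probability from pnorm
```

The two values should agree to several decimal places, which confirms that <code>pnorm(q = 1)</code> is reporting the shaded area in the figure.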

Note that <code>pnorm(q = z)</code> only directly computes probabilities of the form $P(Z < z)$, the probability that a randomly chosen value is <i>less</i> than $z$. What if we want to calculate $P(Z > z)$, the probability that a randomly chosen value is <i>greater</i> than $z$? Or, even trickier, what if we want to calculate $P(z_1 < Z < z_2)$, the probability that a randomly chosen value is <i>between</i> the values $z_1$ and $z_2$? We can still use the <code>pnorm</code> function indirectly to calculate these probabilities, as the next examples illustrate.

***
### Example 4.1.4
The data in a research study has a standard normal distribution.
1. If you randomly choose a data value from the study, what is the probability that the data value will be less than $z = -0.83$?
2. If you randomly choose a data value from the study, what is the probability that the data value will be greater than $z = -0.57$?
3. If you randomly choose a data value from the study, what is the probability that the data value will be between $z = -0.94$ and $z = 1.25$?

#### Solution
##### Part 1
Graphically, we want to find the area under the curve of the standard normal distribution and to the left of $z = -0.83$. This area is equal to $P(Z < -0.83)$.

In [190]:
png("stdnormal_ex1.png", width = 1000, height = 500)

par(mar = c(2, 0, 0, 0))

#library(repr)
#options(repr.plot.width = 7, repr.plot.height=4)

rangebrace = function(x0, x1, y, d, col){
    Y = seq(y, y+d, length = 500)
    Xl = (x1 - x0)/4 * ( (2/d)^(1/3) * sign(Y - (y + d/2))*abs(Y - (y + d/2))^(1/3) + 1 ) + (x0 + x1)/2
    Xr = (x1 - x0)/4 * ( -(2/d)^(1/3) * sign(Y - (y + d/2))*abs(Y - (y + d/2))^(1/3) - 1 ) + (x0 + x1)/2
    lines(Xl, Y, lwd = 3, col = col)
    lines(Xr, Y, lwd = 3, col = col)
    }

z = seq(-3, 3, 0.01)
y = dnorm(z)
plot(z, y, type="l", xlab = "", ylab = "", axes = FALSE, cex = 2)
polyz = c(-3, seq(-3, 3, 0.01), 3)
polyy = c(0, dnorm(seq(-3, 3, 0.01)), 0)
polygon(polyz, polyy, col="gray90", border="NA")

zval = -0.83
polyz = c(-3, seq(-3, zval, 0.01), zval)
polyy = c(0, dnorm(seq(-3, zval, 0.01)), 0)
polygon(polyz, polyy, col="lightsteelblue", border = "NA")
lines(c(zval, zval), c(0, dnorm(zval)))

lines(z, y, type="l", cex = 2)
axis(1, pos=0, at=c(-3, -0.83, 0, 3), labels=c("", -0.83, 0, ""), cex.axis = 2, lwd.ticks = 0)
text(3, -0.01, labels = "z", cex = 2.5, pos = 4, xpd = TRUE)

text(-1.37, 0.05, labels = expression(P(Z < -0.83)), cex = 2, col = "blue3")

#lines(c(0, 0), c(0, dnorm(0)), type = "l", cex = 2)

dev.off()

```{figure} stdnormal_ex1.png
---
width: 100%
alt: The standard normal distribution with the area to the left of z = -0.83 shaded.
name: stdnormal_ex1
---
The standard normal distribution with the area equal to $P(Z < -0.83)$ shaded.
```

We use the <code>pnorm</code> function to calculate the probability.

In [25]:
pnorm(q = -0.83)

So $P(Z < -0.83) = 0.2033$. This means that if we randomly choose a data value from the research study, there is about a $20.33\%$ chance that the value will be less than $z = -0.83$.

##### Part 2
We want to find $P(Z > -0.57)$. This is a bit tricky since <code>pnorm(q = -0.57)</code> gives the probability that a randomly chosen value is <i>less</i> than $z = -0.57$, but we want the probability that a randomly chosen value is <i>greater</i> than $z = -0.57$. How can we use R to calculate $P(Z > -0.57)$?

Instead of calculating $P(Z > -0.57)$ directly, we start with the <i>total</i> area under the <i>entire</i> curve (which is equal to $1$), then take away the area that we <i>don't</i> want (which equals $P(Z < -0.57)$). (See {numref}`Figure {number} <stdnormals_rtail_eqn>`.) This idea expressed as an equation gives us

$$ P(Z > -0.57) = 1 - P(Z < -0.57). $$

In [41]:
png("stdnormals_rtail_eqn.png", width = 1500, height = 275)

library("shape")

mat = matrix(c(1, 2, 3, 4, 5), ncol = 5)
layout(mat, widths = c(1, 0.1, 1, 0.1, 1))


z = seq(-3, 3, 0.01)
y = dnorm(z)
zval = -0.57
cexsize = 3


# Area P(Z > -0.57)
par(mar = c(3, 0, 0, 0), mgp = c(3, 2, 0))
plot(z, y, type="l", xlab = "", ylab = "", axes = FALSE, cex = 2, ylim = c(0, dnorm(0) + 0.06))
polyz = c(-3, seq(-3, 3, 0.01), 3)
polyy = c(0, dnorm(seq(-3, 3, 0.01)), 0)
polygon(polyz, polyy, col="gray90", border="NA")

polyz = c(zval, seq(zval, 3, 0.01), 3)
polyy = c(0, dnorm(seq(zval, 3, 0.01)), 0)
polygon(polyz, polyy, col="lightsteelblue", border = "NA")
lines(c(zval, zval), c(0, dnorm(zval)))

lines(z, y, type="l", cex = 2)
axis(1, pos=0, at=c(-3, zval, 0, 3), lab=c("", zval, 0, ""), lwd.ticks = 0, cex.axis = cexsize)

text(0, dnorm(0), labels = expression(P(Z > -0.57)), cex = cexsize, col = "blue3", pos = 3)


# Equal Sign
par(mar = c(0, 0, 0, 0))
plot.new()
text(0.5, 0.5, labels = "=", cex = 4)


# Area 1.00
par(mar = c(3, 0, 0, 0), mgp = c(3, 2, 0))
plot(z, y, type="l", xlab = "", ylab = "", axes = FALSE, cex = 2, ylim = c(0, dnorm(0) + 0.06))
polyz = c(-3, seq(-3, 3, 0.01), 3)
polyy = c(0, dnorm(seq(-3, 3, 0.01)), 0)
polygon(polyz, polyy, col="lightsteelblue", border="NA")
#lines(c(zval, zval), c(0, dnorm(zval)))

lines(z, y, type="l", cex = 2)
axis(1, pos=0, at=c(-3, zval, 0, 3), lab=c("", zval, 0, ""), lwd.ticks = 0, cex.axis = cexsize)

text(0, dnorm(0), labels = expression(1), cex = cexsize, col = "blue3", pos = 3)


# Minus Sign
par(mar = c(0, 0, 0, 0))
plot.new()
text(0.5, 0.5, labels = "–", cex = 4)


# Area P(Z < -0.57)
par(mar = c(3, 0, 0, 0), mgp = c(3, 2, 0))
plot(z, y, type="l", xlab = "", ylab = "", axes = FALSE, cex = 2, ylim = c(0, dnorm(0) + 0.06))
polyz = c(-3, seq(-3, 3, 0.01), 3)
polyy = c(0, dnorm(seq(-3, 3, 0.01)), 0)
polygon(polyz, polyy, col="gray90", border="NA")

polyz = c(-3, seq(-3, zval, 0.01), zval)
polyy = c(0, dnorm(seq(-3, zval, 0.01)), 0)
polygon(polyz, polyy, col="lightsteelblue", border = "NA")
lines(c(zval, zval), c(0, dnorm(zval)))

lines(z, y, type="l", cex = 2)
axis(1, pos=0, at=c(-3, zval, 0, 3), lab=c("", zval, 0, ""), lwd.ticks = 0, cex.axis = cexsize)

text(0, dnorm(0), labels = expression(P(Z < -0.57)), cex = cexsize, col = "blue3", pos = 3)


dev.off()

````{div} full-width
```{figure} stdnormals_rtail_eqn.png
---
width: 100%
alt: A diagram illustrating that P(Z > -0.57) = 1 - P(Z < -0.57).
name: stdnormals_rtail_eqn
---
A diagram illustrating that $P(Z > -0.57) = 1 - P(Z < -0.57)$. Rather than calculating $P(Z > -0.57)$ directly, we take the total area under the standard normal distribution (which equals $1$), then take away the part that we don't want (the part equal to $P(Z < -0.57)$). 
```
````

With this formula, we can compute $P(Z > -0.57)$ using R.

In [2]:
1 - pnorm(q = -0.57) 

Then $P(Z > -0.57) = 0.7157$. So if we randomly choose one of the data values from the research study, there is about a $71.57\%$ chance that the data value will be greater than $z = -0.57$.
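As a cross-check, here is a short sketch of two other ways to get the same upper-tail probability. The <code>pnorm</code> function has a <code>lower.tail</code> argument that switches it to return $P(Z > z)$, and the symmetry of the standard normal distribution about $0$ means $P(Z > z) = P(Z < -z)$. The variable names are ours, chosen only for illustration.

```R
z <- -0.57

# Subtract the lower-tail area from the total area of 1.
upper_by_complement <- 1 - pnorm(q = z)

# Ask pnorm for the upper tail directly.
upper_by_argument <- pnorm(q = z, lower.tail = FALSE)

# Use symmetry about 0: P(Z > z) = P(Z < -z).
upper_by_symmetry <- pnorm(q = -z)

round(c(upper_by_complement, upper_by_argument, upper_by_symmetry), 4)
```

All three approaches should give the same value, about $0.7157$.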

##### Part 3
We need to calculate $P(-0.94 < Z < 1.25)$. Once again, we can't calculate this probability directly using the <code>pnorm</code> function since, in this instance, we want the probability that a randomly chosen value is <i>in between</i> two other values, $z = -0.94$ and $z = 1.25$. Similar to what we did before, we begin by finding <i>all</i> of the area under the curve to the left of the larger $z$-value (which is equal to $P(Z < 1.25)$ in this case), then take away the extra area to the left of the smaller $z$-value (which equals $P(Z < -0.94)$). (See {numref}`Figure {number} <stdnormals_between_eqn>`.) This gives us the equation

$$ P(-0.94 < Z < 1.25) = P(Z < 1.25) - P(Z < -0.94). $$

In [40]:
png("stdnormals_between_eqn.png", width = 1500, height = 275)

library("shape")

mat = matrix(c(1, 2, 3, 4, 5), ncol = 5)
layout(mat, widths = c(1, 0.1, 1, 0.1, 1))


z = seq(-3, 3, 0.01)
y = dnorm(z)
z1 = -0.94
z2 = 1.25
cexsize = 3


# Area P(z1 < Z < z2)
par(mar = c(3, 0, 0, 0), mgp = c(3, 2, 0))
plot(z, y, type="l", xlab = "", ylab = "", axes = FALSE, cex = 2, ylim = c(0, dnorm(0) + 0.06))
polyz = c(-3, seq(-3, 3, 0.01), 3)
polyy = c(0, dnorm(seq(-3, 3, 0.01)), 0)
polygon(polyz, polyy, col="gray90", border="NA")

polyz = c(z1, seq(z1, z2, 0.01), z2)
polyy = c(0, dnorm(seq(z1, z2, 0.01)), 0)
polygon(polyz, polyy, col="lightsteelblue", border = "NA")
lines(c(z1, z1), c(0, dnorm(z1)))
lines(c(z2, z2), c(0, dnorm(z2)))

lines(z, y, type="l", cex = 2)
axis(1, pos=0, at=c(-3, z1, z2, 3), lab=c("", z1, z2, ""), lwd.ticks = 0, cex.axis = cexsize)

text(0, dnorm(0), labels = expression(P(~paste(-0.94 < Z) < 1.25)), cex = cexsize, col = "blue3", pos = 3)


# Equal Sign
par(mar = c(0, 0, 0, 0))
plot.new()
text(0.5, 0.5, labels = "=", cex = 4)


# Area P(Z < z2)
par(mar = c(3, 0, 0, 0), mgp = c(3, 2, 0))
plot(z, y, type="l", xlab = "", ylab = "", axes = FALSE, cex = 2, ylim = c(0, dnorm(0) + 0.06))
polyz = c(-3, seq(-3, 3, 0.01), 3)
polyy = c(0, dnorm(seq(-3, 3, 0.01)), 0)
polygon(polyz, polyy, col="gray90", border="NA")

polyz = c(-3, seq(-3, z2, 0.01), z2)
polyy = c(0, dnorm(seq(-3, z2, 0.01)), 0)
polygon(polyz, polyy, col="lightsteelblue", border = "NA")
#lines(c(z1, z1), c(0, dnorm(z1)))
lines(c(z2, z2), c(0, dnorm(z2)))

lines(z, y, type="l", cex = 2)
axis(1, pos=0, at=c(-3, z1, z2, 3), lab=c("", z1, z2, ""), lwd.ticks = 0, cex.axis = cexsize)

text(0, dnorm(0), labels = expression(P(Z < 1.25)), cex = cexsize, col = "blue3", pos = 3)


# Minus Sign
par(mar = c(0, 0, 0, 0))
plot.new()
text(0.5, 0.5, labels = "–", cex = 4)


# Area P(Z < z1)
par(mar = c(3, 0, 0, 0), mgp = c(3, 2, 0))
plot(z, y, type="l", xlab = "", ylab = "", axes = FALSE, cex = 2, ylim = c(0, dnorm(0) + 0.06))
polyz = c(-3, seq(-3, 3, 0.01), 3)
polyy = c(0, dnorm(seq(-3, 3, 0.01)), 0)
polygon(polyz, polyy, col="gray90", border="NA")

polyz = c(-3, seq(-3, z1, 0.01), z1)
polyy = c(0, dnorm(seq(-3, z1, 0.01)), 0)
polygon(polyz, polyy, col="lightsteelblue", border = "NA")
lines(c(z1, z1), c(0, dnorm(z1)))
#lines(c(z2, z2), c(0, dnorm(z2)))

lines(z, y, type="l", cex = 2)
axis(1, pos=0, at=c(-3, z1, z2, 3), lab=c("", z1, z2, ""), lwd.ticks = 0, cex.axis = cexsize)

text(0, dnorm(0), labels = expression(P(Z < -0.94)), cex = cexsize, col = "blue3", pos = 3)


dev.off()

````{div} full-width
```{figure} stdnormals_between_eqn.png
---
width: 100%
alt: A diagram illustrating that P(-0.94 < Z < 1.25) = P(Z < 1.25) - P(Z < -0.94).
name: stdnormals_between_eqn
---
A diagram illustrating that $P(-0.94 < Z < 1.25) = P(Z < 1.25) - P(Z < -0.94)$. Rather than calculating $P(-0.94 < Z < 1.25)$ directly, we take all the area under the curve to the left of $z = 1.25$ (which equals $P(Z < 1.25)$), then take away the part that we don't want (the part equal to $P(Z < -0.94)$).
```
````

Using this formula, we can use R to find $P(-0.94 < Z < 1.25)$.

In [42]:
pnorm(q = 1.25) - pnorm(q = -0.94)

So $P(-0.94 < Z < 1.25) = 0.7207$, meaning there is about a $72.07\%$ chance that a randomly chosen data value from the research study will be between $z = -0.94$ and $z = 1.25$.
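One way to build intuition for this result is to simulate it. The sketch below draws a large number of values from the standard normal distribution with <code>rnorm</code> and counts the proportion that land between the two $z$-values. The seed, the sample size, and the variable names are arbitrary choices of ours, and the simulated proportion will only approximate the exact probability.

```R
set.seed(1)               # arbitrary seed so the simulation is reproducible
draws <- rnorm(1e6)       # one million simulated standard normal values

# Proportion of simulated values strictly between -0.94 and 1.25.
simulated <- mean(draws > -0.94 & draws < 1.25)

# Exact probability from the subtraction formula.
exact <- pnorm(q = 1.25) - pnorm(q = -0.94)

c(simulated = simulated, exact = exact)
```

The simulated proportion should be close to the exact value of about $0.7207$, and it gets closer as the number of draws grows.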

***

### Example 4.1.5
Let $Z\sim N(0, 1)$.
1. Find $P(Z < 0.67)$.
2. Find $P(Z > -1.91)$.
3. Find $P(0.34 < Z < 1.62)$.

#### Solution
##### Part 1

In [28]:
png("stdnormal_ex2.png", width = 1000, height = 500)

par(mar = c(2, 0, 0, 0))

#library(repr)
#options(repr.plot.width = 7, repr.plot.height=4)

rangebrace = function(x0, x1, y, d, col){
    Y = seq(y, y+d, length = 500)
    Xl = (x1 - x0)/4 * ( (2/d)^(1/3) * sign(Y - (y + d/2))*abs(Y - (y + d/2))^(1/3) + 1 ) + (x0 + x1)/2
    Xr = (x1 - x0)/4 * ( -(2/d)^(1/3) * sign(Y - (y + d/2))*abs(Y - (y + d/2))^(1/3) - 1 ) + (x0 + x1)/2
    lines(Xl, Y, lwd = 3, col = col)
    lines(Xr, Y, lwd = 3, col = col)
    }

z = seq(-3, 3, 0.01)
y = dnorm(z)
plot(z, y, type="l", xlab = "", ylab = "", axes = FALSE, cex = 2)
polyz = c(-3, seq(-3, 3, 0.01), 3)
polyy = c(0, dnorm(seq(-3, 3, 0.01)), 0)
polygon(polyz, polyy, col="gray90", border="NA")

zval = 0.67
polyz = c(-3, seq(-3, zval, 0.01), zval)
polyy = c(0, dnorm(seq(-3, zval, 0.01)), 0)
polygon(polyz, polyy, col="lightsteelblue", border = "NA")
lines(c(zval, zval), c(0, dnorm(zval)))

lines(z, y, type="l", cex = 2)
axis(1, pos=0, at=c(-3, zval, 0, 3), labels=c("", zval, 0, ""), cex.axis = 2, lwd.ticks = 0)
text(3, -0.01, labels = "z", cex = 2.5, pos = 4, xpd = TRUE)

text(-0.45, 0.1, labels = expression(P(Z < 0.67)), cex = 2, col = "blue3")

#lines(c(0, 0), c(0, dnorm(0)), type = "l", cex = 2)

dev.off()

```{figure} stdnormal_ex2.png
---
width: 100%
alt: The area under the standard normal distribution and to the left of z = 0.67 is shaded. The shaded area is labeled P(Z < 0.67).
name: stdnormal_ex2
---
The area equal to $P(Z < 0.67)$.
```

We want to find $P(Z < 0.67)$, the probability that a randomly chosen value from the distribution is <i>less</i> than $z = 0.67$. We don't need to do anything special to use R to calculate this probability. We just use the <code>pnorm</code> function without any changes.

In [6]:
pnorm(q = 0.67)

So $P(Z < 0.67) = 0.74857$.

##### Part 2

In [23]:
png("stdnormals_rtail_eqn2.png", width = 1500, height = 275)

library("shape")

mat = matrix(c(1, 2, 3, 4, 5), ncol = 5)
layout(mat, widths = c(1, 0.1, 1, 0.1, 1))


z = seq(-3, 3, 0.01)
y = dnorm(z)
zval = -1.91
cexsize = 3


# Area P(Z > -1.91)
par(mar = c(3, 0, 0, 0), mgp = c(3, 2, 0))
plot(z, y, type="l", xlab = "", ylab = "", axes = FALSE, cex = 2, ylim = c(0, dnorm(0) + 0.06))
polyz = c(-3, seq(-3, 3, 0.01), 3)
polyy = c(0, dnorm(seq(-3, 3, 0.01)), 0)
polygon(polyz, polyy, col="gray90", border="NA")

polyz = c(zval, seq(zval, 3, 0.01), 3)
polyy = c(0, dnorm(seq(zval, 3, 0.01)), 0)
polygon(polyz, polyy, col="lightsteelblue", border = "NA")
lines(c(zval, zval), c(0, dnorm(zval)))

lines(z, y, type="l", cex = 2)
axis(1, pos=0, at=c(-3, zval, 0, 3), lab=c("", zval, 0, ""), lwd.ticks = 0, cex.axis = cexsize)

text(0, dnorm(0), labels = expression(P(Z > -1.91)), cex = cexsize, col = "blue3", pos = 3)


# Equal Sign
par(mar = c(0, 0, 0, 0))
plot.new()
text(0.5, 0.5, labels = "=", cex = 4)


# Area 1.00
par(mar = c(3, 0, 0, 0), mgp = c(3, 2, 0))
plot(z, y, type="l", xlab = "", ylab = "", axes = FALSE, cex = 2, ylim = c(0, dnorm(0) + 0.06))
polyz = c(-3, seq(-3, 3, 0.01), 3)
polyy = c(0, dnorm(seq(-3, 3, 0.01)), 0)
polygon(polyz, polyy, col="lightsteelblue", border="NA")
#lines(c(zval, zval), c(0, dnorm(zval)))

lines(z, y, type="l", cex = 2)
axis(1, pos=0, at=c(-3, zval, 0, 3), lab=c("", zval, 0, ""), lwd.ticks = 0, cex.axis = cexsize)

text(0, dnorm(0), labels = expression(1), cex = cexsize, col = "blue3", pos = 3)


# Minus Sign
par(mar = c(0, 0, 0, 0))
plot.new()
text(0.5, 0.5, labels = "–", cex = 4)


# Area P(Z < -1.91)
par(mar = c(3, 0, 0, 0), mgp = c(3, 2, 0))
plot(z, y, type="l", xlab = "", ylab = "", axes = FALSE, cex = 2, ylim = c(0, dnorm(0) + 0.06))
polyz = c(-3, seq(-3, 3, 0.01), 3)
polyy = c(0, dnorm(seq(-3, 3, 0.01)), 0)
polygon(polyz, polyy, col="gray90", border="NA")

polyz = c(-3, seq(-3, zval, 0.01), zval)
polyy = c(0, dnorm(seq(-3, zval, 0.01)), 0)
polygon(polyz, polyy, col="lightsteelblue", border = "NA")
lines(c(zval, zval), c(0, dnorm(zval)))

lines(z, y, type="l", cex = 2)
axis(1, pos=0, at=c(-3, zval, 0, 3), lab=c("", zval, 0, ""), lwd.ticks = 0, cex.axis = cexsize)

text(0, dnorm(0), labels = expression(P(Z < -1.91)), cex = cexsize, col = "blue3", pos = 3)


dev.off()

````{div} full-width
```{figure} stdnormals_rtail_eqn2.png
---
width: 100%
alt: A diagram illustrating that P(Z > -1.91) = 1 - P(Z < -1.91).
---
A diagram illustrating that $P(Z > -1.91) = 1 - P(Z < -1.91)$.
```
````

We want to find $P(Z > -1.91)$, the probability that a randomly chosen value from the distribution is <i>greater</i> than $z = -1.91$. To find $P(Z > -1.91)$, we take the total area under the standard normal distribution (which equals $1$), then take away the part that we don't want (the part equal to $P(Z < -1.91)$).

In [29]:
1 - pnorm(q = -1.91)

So $P(Z > -1.91) = 0.97193$.

##### Part 3

In [41]:
png("stdnormals_between_eqn2.png", width = 1500, height = 275)

library("shape")

mat = matrix(c(1, 2, 3, 4, 5), ncol = 5)
layout(mat, widths = c(1, 0.1, 1, 0.1, 1))


z = seq(-3, 3, 0.01)
y = dnorm(z)
z1 = 0.34
z2 = 1.62
cexsize = 3


# Area P(z1 < Z < z2)
par(mar = c(3, 0, 0, 0), mgp = c(3, 2, 0))
plot(z, y, type="l", xlab = "", ylab = "", axes = FALSE, cex = 2, ylim = c(0, dnorm(0) + 0.06))
polyz = c(-3, seq(-3, 3, 0.01), 3)
polyy = c(0, dnorm(seq(-3, 3, 0.01)), 0)
polygon(polyz, polyy, col="gray90", border="NA")

polyz = c(z1, seq(z1, z2, 0.01), z2)
polyy = c(0, dnorm(seq(z1, z2, 0.01)), 0)
polygon(polyz, polyy, col="lightsteelblue", border = "NA")
lines(c(z1, z1), c(0, dnorm(z1)))
lines(c(z2, z2), c(0, dnorm(z2)))

lines(z, y, type="l", cex = 2)
axis(1, pos=0, at=c(-3, z1, z2, 3), lab=c("", z1, z2, ""), lwd.ticks = 0, cex.axis = cexsize)

text(0, dnorm(0), labels = expression(P(~paste(0.34 < Z) < 1.62)), cex = cexsize, col = "blue3", pos = 3)


# Equal Sign
par(mar = c(0, 0, 0, 0))
plot.new()
text(0.5, 0.5, labels = "=", cex = 4)


# Area P(Z < z2)
par(mar = c(3, 0, 0, 0), mgp = c(3, 2, 0))
plot(z, y, type="l", xlab = "", ylab = "", axes = FALSE, cex = 2, ylim = c(0, dnorm(0) + 0.06))
polyz = c(-3, seq(-3, 3, 0.01), 3)
polyy = c(0, dnorm(seq(-3, 3, 0.01)), 0)
polygon(polyz, polyy, col="gray90", border="NA")

polyz = c(-3, seq(-3, z2, 0.01), z2)
polyy = c(0, dnorm(seq(-3, z2, 0.01)), 0)
polygon(polyz, polyy, col="lightsteelblue", border = "NA")
#lines(c(z1, z1), c(0, dnorm(z1)))
lines(c(z2, z2), c(0, dnorm(z2)))

lines(z, y, type="l", cex = 2)
axis(1, pos=0, at=c(-3, z1, z2, 3), lab=c("", z1, z2, ""), lwd.ticks = 0, cex.axis = cexsize)

text(0, dnorm(0), labels = expression(P(Z < 1.62)), cex = cexsize, col = "blue3", pos = 3)


# Minus Sign
par(mar = c(0, 0, 0, 0))
plot.new()
text(0.5, 0.5, labels = "–", cex = 4)


# Area P(Z < z1)
par(mar = c(3, 0, 0, 0), mgp = c(3, 2, 0))
plot(z, y, type="l", xlab = "", ylab = "", axes = FALSE, cex = 2, ylim = c(0, dnorm(0) + 0.06))
polyz = c(-3, seq(-3, 3, 0.01), 3)
polyy = c(0, dnorm(seq(-3, 3, 0.01)), 0)
polygon(polyz, polyy, col="gray90", border="NA")

polyz = c(-3, seq(-3, z1, 0.01), z1)
polyy = c(0, dnorm(seq(-3, z1, 0.01)), 0)
polygon(polyz, polyy, col="lightsteelblue", border = "NA")
lines(c(z1, z1), c(0, dnorm(z1)))
#lines(c(z2, z2), c(0, dnorm(z2)))

lines(z, y, type="l", cex = 2)
axis(1, pos=0, at=c(-3, z1, z2, 3), lab=c("", z1, z2, ""), lwd.ticks = 0, cex.axis = cexsize)

text(0, dnorm(0), labels = expression(P(Z < 0.34)), cex = cexsize, col = "blue3", pos = 3)


dev.off()

````{div} full-width
```{figure} stdnormals_between_eqn2.png
---
width: 100%
alt: A diagram illustrating that P(0.34 < Z < 1.62) = P(Z < 1.62) - P(Z < 0.34).
name: stdnormals_between_eqn2
---
A diagram illustrating that $P(0.34 < Z < 1.62) = P(Z < 1.62) - P(Z < 0.34)$.
```
````

We want to find $P(0.34 < Z < 1.62)$, the probability that a randomly chosen value from the distribution is <i>between</i> $z = 0.34$ and $z = 1.62$. To calculate $P(0.34 < Z < 1.62)$, we take <i>all</i> the area under the curve to the left of the larger $z$ value (which equals $P(Z < 1.62)$ in this case), then take away the area that we don't want to the left of the smaller $z$ value (which equals $P(Z < 0.34)$).

In [42]:
pnorm(q = 1.62) - pnorm(q = 0.34)

So $P(0.34 < Z < 1.62) = 0.31431$.
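Because <code>pnorm</code> accepts a vector of $z$-values, all three parts of this example can be computed in one pass. The following is a compact sketch; the names <code>z</code>, <code>p</code>, <code>part1</code>, <code>part2</code>, and <code>part3</code> are ours.

```R
# All z-values used in the three parts of the example.
z <- c(0.67, -1.91, 0.34, 1.62)

# Lower-tail probabilities P(Z < z) for each value, in the same order.
p <- pnorm(q = z)

part1 <- p[1]           # P(Z < 0.67)
part2 <- 1 - p[2]       # P(Z > -1.91)
part3 <- p[4] - p[3]    # P(0.34 < Z < 1.62)

round(c(part1, part2, part3), 5)
```

The rounded results should match the three answers found above.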

***

## Finding a $z$-value Given Probability
We've seen that, given a $z$-value, the R code <code>pnorm(q = z)</code> will tell us $P(Z < z)$, the probability that a randomly chosen value from the standard normal distribution is smaller than $z$. But how do we do the inverse? That is, given a probability, how do we find a $z$-value so that $P(Z < z)$ is equal to that probability?

We can do this using the <code>qnorm</code> function:
```R
qnorm(p)
```
Here, <code>p</code> is the given **p**robability. The function <code>qnorm(p)</code> tells us the $z$-value (or **q**uantile) so that $P(Z < z)$ equals <code>p</code>.

To illustrate the inverse relationship between the <code>pnorm</code> function and the <code>qnorm</code> function, let us take an arbitrary $z$-value, say $z = 0.52$. Let's use the <code>pnorm</code> function to find $P(Z < 0.52)$.

In [78]:
pnorm(q = 0.52)

So $P(Z < 0.52) = 0.69847$.

In [12]:
png("stdnormal_ex3.png", width = 1000, height = 500)

par(mar = c(2, 0, 0, 0))

#library(repr)
#options(repr.plot.width = 7, repr.plot.height=4)

rangebrace = function(x0, x1, y, d, col){
    Y = seq(y, y+d, length = 500)
    Xl = (x1 - x0)/4 * ( (2/d)^(1/3) * sign(Y - (y + d/2))*abs(Y - (y + d/2))^(1/3) + 1 ) + (x0 + x1)/2
    Xr = (x1 - x0)/4 * ( -(2/d)^(1/3) * sign(Y - (y + d/2))*abs(Y - (y + d/2))^(1/3) - 1 ) + (x0 + x1)/2
    lines(Xl, Y, lwd = 3, col = col)
    lines(Xr, Y, lwd = 3, col = col)
    }

z = seq(-3, 3, 0.01)
y = dnorm(z)
plot(z, y, type="l", xlab = "", ylab = "", axes = FALSE, cex = 2)
polyz = c(-3, seq(-3, 3, 0.01), 3)
polyy = c(0, dnorm(seq(-3, 3, 0.01)), 0)
polygon(polyz, polyy, col="gray90", border="NA")

zval = 0.52
polyz = c(-3, seq(-3, zval, 0.01), zval)
polyy = c(0, dnorm(seq(-3, zval, 0.01)), 0)
polygon(polyz, polyy, col="lightsteelblue", border = "NA")
lines(c(zval, zval), c(0, dnorm(zval)))

lines(z, y, type="l", cex = 2)
axis(1, pos=0, at=c(-3, zval, 0, 3), labels=c("", zval, 0, ""), cex.axis = 2, lwd.ticks = 0)
text(3, -0.01, labels = "z", cex = 2.5, pos = 4, xpd = TRUE)

text(-0.52, 0.1, labels = expression(P(Z < 0.52) == 0.69847), cex = 2, col = "blue3")

#lines(c(0, 0), c(0, dnorm(0)), type = "l", cex = 2)

dev.off()

```{figure} stdnormal_ex3.png
---
width: 100%
alt: The standard normal distribution. The area under the curve to the left of z = 0.52 is shaded. The shaded region is labeled P(Z < 0.52) = 0.69847.
name: stdnormal_ex3
---
The standard normal distribution with the area equal to $P(Z < 0.52) = 0.69847$ shaded.
```

Now let's see what happens when we plug this probability, $p = 0.69847$, into the <code>qnorm</code> function.

In [79]:
qnorm(p = 0.69847)

We get $z = 0.52$, which is the $z$-value we started with.
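The same round trip works for any $z$-value. Here is a small sketch that applies <code>pnorm</code> and then <code>qnorm</code> to a handful of arbitrarily chosen $z$-values (the particular values and variable names are ours). Note that when we typed in the rounded probability $0.69847$ by hand, <code>qnorm</code> could only recover $0.52$ approximately; passing the unrounded output of <code>pnorm</code> avoids that rounding.

```R
# A few arbitrary z-values, including the one used above.
z <- c(-1.5, 0, 0.52, 2)

p <- pnorm(q = z)        # z-values to probabilities
z_back <- qnorm(p = p)   # probabilities back to z-values

all.equal(z, z_back)     # TRUE when the round trip recovers the inputs
```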

In [1]:
png("pnorm_qnorm_inv.png", width = 1000, height = 300)

library("shape")

par(mar = c(0, 0, 0, 0))

a = 2
b = 1

top_theta = seq(1.5*pi/12, 10.5/12*pi, 0.01)
x = a*cos(top_theta)
y = b*sin(top_theta)

plot(x, y, type="l", xlab="", ylab="", axes=FALSE, col="blue3", lwd = 5, xlim=c(-a-0.25, a+0.25), ylim=c(-b-0.25, b+0.25))
Arrows(x[2], y[2], x[1], y[1], lwd = 5, arr.type = "triangle", arr.width = 0.3, col = "blue3")

bot_theta = seq(-10.5/12*pi, -1.5*pi/12, 0.01)
x = a*cos(bot_theta)
y = b*sin(bot_theta)

lines(x, y, col="red3", lwd = 5)
Arrows(x[2], y[2], x[1], y[1], lwd = 5, arr.type = "triangle", arr.width = 0.3, col = "red3")

text(-a, 0.15, labels=expression(z), cex=2.5)
text(-a, -0.15, labels=expression(N(0, 1)), cex=1.5)
text(a, 0, labels="Probability", cex=2.5)

text(0, b, labels="pnorm(q)", cex=2.5, col="blue3", pos=3)
text(0, -b, labels="qnorm(p)", cex=2.5, col="red3", pos=1)

dev.off()

```{figure} pnorm_qnorm_inv.png
---
width: 100%
alt: One arrow labeled pnorm(q) points from "z" to "Probability". Another arrow labeled qnorm(p) points the opposite direction, from "Probability" to "z".
name: pnorm_qnorm_inv
---
The <code>pnorm</code> and <code>qnorm</code> functions are inverse functions. For the <code>pnorm</code> function, we input the $z$-value, and the function outputs the corresponding probability $P(Z < z)$. For the <code>qnorm</code> function, we input a probability, and the function outputs the $z$-value so $P(Z < z)$ equals the probability we input.
```

Similar to the <code>pnorm</code> function, the <code>qnorm</code> function expects a probability with corresponding area in the lower tail of the standard normal distribution, not in the upper tail of the distribution. That is, if we input a probability into the <code>qnorm</code> function, the function will output the $z$-value so that the probability equals $P(Z < z)$, not $P(Z > z)$. As before, we can work around this limitation with a little arithmetic. The following examples illustrate the different cases.

***

### Example 4.1.6
Find $a$ so that $P(Z < a) = 0.85$.

#### Solution

In [65]:
png("stdnormal_ex4.png", width = 1000, height = 500)

par(mar = c(2, 0, 0, 0))

#library(repr)
#options(repr.plot.width = 7, repr.plot.height=4)

rangebrace = function(x0, x1, y, d, col){
    Y = seq(y, y+d, length = 500)
    Xl = (x1 - x0)/4 * ( (2/d)^(1/3) * sign(Y - (y + d/2))*abs(Y - (y + d/2))^(1/3) + 1 ) + (x0 + x1)/2
    Xr = (x1 - x0)/4 * ( -(2/d)^(1/3) * sign(Y - (y + d/2))*abs(Y - (y + d/2))^(1/3) - 1 ) + (x0 + x1)/2
    lines(Xl, Y, lwd = 3, col = col)
    lines(Xr, Y, lwd = 3, col = col)
    }

z = seq(-3, 3, 0.01)
y = dnorm(z)
plot(z, y, type="l", xlab = "", ylab = "", axes = FALSE, cex = 2)
polyz = c(-3, seq(-3, 3, 0.01), 3)
polyy = c(0, dnorm(seq(-3, 3, 0.01)), 0)
polygon(polyz, polyy, col="gray90", border="NA")

zval = qnorm(0.85)
polyz = c(-3, seq(-3, zval, 0.01), zval)
polyy = c(0, dnorm(seq(-3, zval, 0.01)), 0)
polygon(polyz, polyy, col="lightsteelblue", border = "NA")
lines(c(zval, zval), c(0, dnorm(zval)))

lines(z, y, type="l", cex = 2)
axis(1, pos=0, at=c(-3, zval, 0, 3), labels=c("", expression(a), 0, ""), cex.axis = 2, lwd.ticks = 0)
text(3, -0.01, labels = "z", cex = 2.5, pos = 4, xpd = TRUE)

text(-0.3, 0.1, labels = expression(P(Z < a) == 0.85), cex = 2, col = "blue3")

#lines(c(0, 0), c(0, dnorm(0)), type = "l", cex = 2)

dev.off()

```{figure} stdnormal_ex4.png
---
width: 100%
alt: The standard normal distribution. The area under the curve to the left of some value z=a is shaded. The shaded region is labeled P(Z < a) = 0.85.
name: stdnormal_ex4
---
The standard normal distribution. The area under the curve to the left of some unknown value $z = a$ is shaded. We want to find $z = a$ so that $P(Z < a) = 0.85$.
```

We want to find some value $z = a$ so that $P(Z < a) = 0.85$. In this case, we already know the probability, and we want to find the corresponding $z$-value. That means we need to use the <code>qnorm</code> function.

In [66]:
qnorm(p = 0.85)

So $a = 1.03643$, and $P(Z < 1.03643) = 0.85$. In other words, we have found that $z = 1.03643$ is larger than $85\%$ of the values in the distribution.
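To see what "larger than $85\%$ of the values" means in practice, we can compare the theoretical quantile with simulated data. This is only an illustrative sketch: the seed, the sample size, and the variable names are our own choices, and the simulated quantities will be close to, but not exactly equal to, the theoretical ones.

```R
set.seed(42)              # arbitrary seed for reproducibility
draws <- rnorm(1e6)       # simulated standard normal values

a <- qnorm(p = 0.85)      # theoretical 85% quantile

mean(draws < a)                  # proportion of draws below a, close to 0.85
quantile(draws, probs = 0.85)    # empirical 85% quantile, close to a
```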

***

### Example 4.1.7
Find $b$ so that $P(Z > b) = 0.59$.

#### Solution

In [5]:
png("stdnormal_ex5.png", width = 1000, height = 500)

par(mar = c(2, 0, 0, 0))

#library(repr)
#options(repr.plot.width = 7, repr.plot.height=4)

rangebrace = function(x0, x1, y, d, col){
    Y = seq(y, y+d, length = 500)
    Xl = (x1 - x0)/4 * ( (2/d)^(1/3) * sign(Y - (y + d/2))*abs(Y - (y + d/2))^(1/3) + 1 ) + (x0 + x1)/2
    Xr = (x1 - x0)/4 * ( -(2/d)^(1/3) * sign(Y - (y + d/2))*abs(Y - (y + d/2))^(1/3) - 1 ) + (x0 + x1)/2
    lines(Xl, Y, lwd = 3, col = col)
    lines(Xr, Y, lwd = 3, col = col)
    }

z = seq(-3, 3, 0.01)
y = dnorm(z)
plot(z, y, type="l", xlab = "", ylab = "", axes = FALSE, cex = 2)
polyz = c(-3, seq(-3, 3, 0.01), 3)
polyy = c(0, dnorm(seq(-3, 3, 0.01)), 0)
polygon(polyz, polyy, col="gray90", border="NA")

zval = qnorm(1 - 0.59)
polyz = c(zval, seq(zval, 3, 0.01), 3)
polyy = c(0, dnorm(seq(zval, 3, 0.01)), 0)
polygon(polyz, polyy, col="lightsteelblue", border = "NA")
lines(c(zval, zval), c(0, dnorm(zval)))

lines(z, y, type="l", cex = 2)
axis(1, pos=0, at=c(-3, zval, 0, 3), labels=c("", expression(b), 0, ""), cex.axis = 2, lwd.ticks = 0)
text(3, -0.01, labels = "z", cex = 2.5, pos = 4, xpd = TRUE)

text(0.9, 0.1, labels = expression(P(Z > b) == 0.59), cex = 2, col = "blue3")

#lines(c(0, 0), c(0, dnorm(0)), type = "l", cex = 2)

dev.off()

```{figure} stdnormal_ex5.png
---
width: 100%
alt: The standard normal distribution. The area under the curve to the right of some value z=b is shaded. The shaded region is labeled P(Z > b) = 0.59.
name: stdnormal_ex5
---
The standard normal distribution. The area under the curve to the right of some unknown value $z = b$ is shaded. We want to find $z = b$ so that $P(Z > b) = 0.59$.
```

We want to find some value $z = b$ so that $P(Z > b) = 0.59$. In this case, we already know the probability, and we want to find the corresponding $z$-value. That means we need to use the <code>qnorm</code> function.

However, the <code>qnorm</code> function expects a probability with corresponding area in the lower tail, but the area corresponding to our probability fills the upper tail of the distribution. To use the <code>qnorm</code> function, we first need to re-express the problem to one dealing with probability in the lower tail. To do so, note that since the total area under the curve is equal to $1$, it must be that $P(Z < b) = 1 - P(Z > b) = 1 - 0.59$. Since $P(Z < b)$ is the form of probability that <code>qnorm</code> expects as input, this formula gives us a way to find $b$ using the <code>qnorm</code> function.

In [73]:
qnorm(p = 1 - 0.59)

So $b = -0.22754$, and $P(Z > -0.22754) = 0.59$. In other words, we have found that $z = -0.22754$ is less than $59\%$ of the values in the distribution.
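
As a quick check on this result, the sketch below computes $b$ two ways and then confirms the upper-tail area with <code>pnorm</code>. It relies on the <code>lower.tail</code> argument that both <code>qnorm</code> and <code>pnorm</code> accept in base R; the variable names (<code>p_upper</code>, <code>b_complement</code>, <code>b_upper</code>, <code>check</code>) are chosen here for illustration only.

```r
# Upper-tail probability from the example
p_upper <- 0.59

# Approach used in the text: convert to a lower-tail probability first
b_complement <- qnorm(p = 1 - p_upper)

# Alternative: tell qnorm that p is an upper-tail area
b_upper <- qnorm(p = p_upper, lower.tail = FALSE)

# Check: the area to the right of b should be 0.59
check <- pnorm(q = b_complement, lower.tail = FALSE)

print(c(b_complement, b_upper, check))
```

Both approaches should agree up to floating-point rounding, so either can be used; the complement form makes the arithmetic from the text explicit.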

***

### Example 4.1.8

Find the $34$th percentile of the standard normal distribution.

#### Solution

In [1]:
png("stdnormal_ex6.png", width = 1000, height = 500)

par(mar = c(2, 0, 0, 0))

#library(repr)
#options(repr.plot.width = 7, repr.plot.height=4)

rangebrace = function(x0, x1, y, d, col){
    Y = seq(y, y+d, length = 500)
    Xl = (x1 - x0)/4 * ( (2/d)^(1/3) * sign(Y - (y + d/2))*abs(Y - (y + d/2))^(1/3) + 1 ) + (x0 + x1)/2
    Xr = (x1 - x0)/4 * ( -(2/d)^(1/3) * sign(Y - (y + d/2))*abs(Y - (y + d/2))^(1/3) - 1 ) + (x0 + x1)/2
    lines(Xl, Y, lwd = 3, col = col)
    lines(Xr, Y, lwd = 3, col = col)
    }

z = seq(-3, 3, 0.01)
y = dnorm(z)
plot(z, y, type="l", xlab = "", ylab = "", axes = FALSE, cex = 2)
polyz = c(-3, seq(-3, 3, 0.01), 3)
polyy = c(0, dnorm(seq(-3, 3, 0.01)), 0)
polygon(polyz, polyy, col="gray90", border="NA")

zval = qnorm(0.34)
polyz = c(-3, seq(-3, zval, 0.001), zval)
polyy = c(0, dnorm(seq(-3, zval, 0.001)), 0)
polygon(polyz, polyy, col="lightsteelblue", border = "NA")
lines(c(zval, zval), c(0, dnorm(zval)))

lines(z, y, type="l", cex = 2)
axis(1, pos=0, at=c(-3, zval, 0, 3), labels=c("", expression(z), 0, ""), cex.axis = 2, lwd.ticks = 0)
text(3, -0.01, labels = "z", cex = 2.5, pos = 4, xpd = TRUE)

text(-1, 0.1, labels = expression(P(Z < z) == 0.34), cex = 2, col = "blue3")

#lines(c(0, 0), c(0, dnorm(0)), type = "l", cex = 2)

dev.off()

```{figure} stdnormal_ex6.png
---
width: 100%
alt: The standard normal distribution. The area under the curve to the left of some value z is shaded. The shaded region is labeled P(Z < z) = 0.34.
name: stdnormal_ex6
---
The standard normal distribution. The area under the curve to the left of some unknown value $z$ is shaded. We want to find $z$ so that $P(Z < z) = 0.34$.
```

Recall that the $34$th percentile of the standard normal distribution would be the $z$-value larger than $34\%$ of the possible values in the distribution. In the language of probability, we want to find $z$ so that $P(Z < z) = 0.34$. Then since we know the probability, we can use the <code>qnorm</code> function to find the corresponding $z$-value.

In [3]:
qnorm(p = 0.34)

So $P(Z < -0.41246) = 0.34$, meaning the $34$th percentile of the standard normal distribution is $z = -0.41246$.
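
Because <code>pnorm</code> and <code>qnorm</code> are inverses of each other, one way to sanity-check a percentile is to feed it back into <code>pnorm</code>. The sketch below does that, and also illustrates the symmetry of the standard normal distribution by comparing the $34$th and $66$th percentiles. The variable names are illustrative, not part of the original notebook.

```r
# 34th percentile of the standard normal distribution
z34 <- qnorm(p = 0.34)

# Feeding the percentile back into pnorm should recover the probability
area_below <- pnorm(q = z34)

# By symmetry about 0, the 66th percentile is the negative of the 34th
z66 <- qnorm(p = 0.66)

print(c(z34, area_below, z66))
```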

***

### Example 4.1.9

Find the $z$-value less than $95\%$ of values in the standard normal distribution.

#### Solution

In [7]:
png("stdnormal_ex7.png", width = 1000, height = 500)

par(mar = c(2, 0, 0, 0))

#library(repr)
#options(repr.plot.width = 7, repr.plot.height=4)

rangebrace = function(x0, x1, y, d, col){
    Y = seq(y, y+d, length = 500)
    Xl = (x1 - x0)/4 * ( (2/d)^(1/3) * sign(Y - (y + d/2))*abs(Y - (y + d/2))^(1/3) + 1 ) + (x0 + x1)/2
    Xr = (x1 - x0)/4 * ( -(2/d)^(1/3) * sign(Y - (y + d/2))*abs(Y - (y + d/2))^(1/3) - 1 ) + (x0 + x1)/2
    lines(Xl, Y, lwd = 3, col = col)
    lines(Xr, Y, lwd = 3, col = col)
    }

z = seq(-3, 3, 0.01)
y = dnorm(z)
plot(z, y, type="l", xlab = "", ylab = "", axes = FALSE, cex = 2)
polyz = c(-3, seq(-3, 3, 0.01), 3)
polyy = c(0, dnorm(seq(-3, 3, 0.01)), 0)
polygon(polyz, polyy, col="gray90", border="NA")

zval = qnorm(1 - 0.95)
polyz = c(zval, seq(zval, 3, 0.01), 3)
polyy = c(0, dnorm(seq(zval, 3, 0.01)), 0)
polygon(polyz, polyy, col="lightsteelblue", border = "NA")
lines(c(zval, zval), c(0, dnorm(zval)))

lines(z, y, type="l", cex = 2)
axis(1, pos=0, at=c(-3, zval, 0, 3), labels=c("", expression(z), 0, ""), cex.axis = 2, lwd.ticks = 0)
text(3, -0.01, labels = "z", cex = 2.5, pos = 4, xpd = TRUE)

text(0.1, 0.1, labels = expression(P(Z > z) == 0.95), cex = 2, col = "blue3")

#lines(c(0, 0), c(0, dnorm(0)), type = "l", cex = 2)

dev.off()

```{figure} stdnormal_ex7.png
---
width: 100%
alt: The standard normal distribution. The area under the curve to the right of some value z is shaded. The shaded region is labeled P(Z > z) = 0.95.
name: stdnormal_ex7
---
The standard normal distribution. The area under the curve to the right of some unknown value $z$ is shaded. We want to find $z$ so that $P(Z > z) = 0.95$.
```

We want the $z$-value smaller than $95\%$ of values in the standard normal distribution, which means we want $z$ so that $P(Z > z) = 0.95$. Since we know the probability, we want to use the <code>qnorm</code> function to find the corresponding $z$-value. However, the probability we are given corresponds to an area in the upper tail of the distribution. By default, the <code>qnorm</code> function expects a probability in the lower tail of the distribution.

With a little arithmetic, we can see that if $P(Z > z) = 0.95$, then $P(Z < z) = 1 - P(Z > z) = 1 - 0.95 = 0.05$. In other words, if a $z$-value is smaller than exactly $95\%$ of the values in the distribution, it is also larger than the remaining $5\%$ of the values. Now that we have the probability in the form the <code>qnorm</code> function expects, we can use the function to find $z$.

In [8]:
qnorm(p = 1 - 0.95)

So $P(Z > -1.64485) = 0.95$, meaning $z = -1.64485$ is less than $95\%$ of the values in the standard normal distribution.
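
The same answer can also be reached through symmetry: the value with $95\%$ of the distribution above it is the mirror image of the value with $95\%$ of the distribution below it. The short sketch below compares the two computations; the names <code>z_complement</code> and <code>z_symmetry</code> are hypothetical and used only for this illustration.

```r
# Complement approach from the text: P(Z < z) = 1 - 0.95
z_complement <- qnorm(p = 1 - 0.95)

# Symmetry approach: negate the 95th percentile
z_symmetry <- -qnorm(p = 0.95)

# Confirm the area to the right of z is 0.95
area_above <- 1 - pnorm(q = z_complement)

print(c(z_complement, z_symmetry, area_above))
```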