# The Normal Distribution and $z$-Scores

## Objectives
- Use $z$-scores to find probabilities for non-standard normal distributions.
- Use $z$-scores to find quantiles for non-standard normal distributions.

## Finding Probability Given an $x$-value by Computing the $z$-score
In the last section, we learned how to find probabilities for the standard normal distribution, $Z \sim N(0, 1)$. To find probabilities generally for a non-standard normal distribution, $X \sim N(\mu, \sigma)$, we first convert the $x$-value of interest into a **$z$-score**.

The $z$-score of an $x$-value is how many standard deviations the $x$-value is away from the mean and in which direction. For example, consider $X \sim N(5, 2)$, the normal distribution with mean $\mu = 5$ and standard deviation $\sigma = 2$. Then $x = 11$ is three standard deviations above (or to the right of) the mean since

$$ 11 = \mu + 3\sigma = 5 + 3(2),$$

so the $z$-score of $x = 11$ is $z = 3$.

Similarly, $x = -3$ is four standard deviations below (or to the left of) the mean since

$$ -3 = \mu + (-4)\sigma = 5 + (-4)(2),$$

so the $z$-score of $x = -3$ is $z = -4$.

Generally, if $X \sim N(\mu, \sigma)$, an $x$-value and its $z$-score are related by the formula $x = \mu + z\sigma$. Solving this equation for $z$, we find that the formula for the $z$-score of an $x$-value is

$$ z = \frac{x - \mu}{\sigma}. $$
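
As a quick check of the two calculations above, the formula can be evaluated in R. This is only a sketch: the helper name <code>z_score</code> is our own and is not a built-in R function.

```r
# Hypothetical helper (our own name): z-score of x under N(mu, sigma)
z_score <- function(x, mu, sigma) {
  (x - mu) / sigma
}

mu <- 5
sigma <- 2

z_score(11, mu, sigma)   # 3
z_score(-3, mu, sigma)   # -4

# Going back the other way with x = mu + z * sigma
mu + 3 * sigma           # 11
mu + (-4) * sigma        # -3
```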

When calculating a probability for a non-standard normal distribution, we first find the $z$-score corresponding to an $x$-value because doing so transforms the problem from the non-standard normal distribution $X \sim N(\mu, \sigma)$ to the standard normal distribution $Z \sim N(0, 1)$. From the previous section, we know how to find a probability for the standard normal distribution using the <code>pnorm</code> function. So once we transform to the standard normal distribution by finding the $z$-score, we can use the <code>pnorm</code> function to find the probability.

For example, suppose we want to find the probability $P(X < x)$ to the left of an $x$-value in a non-standard normal distribution $X \sim N(\mu, \sigma)$. We first calculate the $z$-score using the formula $z = \frac{x - \mu}{\sigma}$. Then the probability we are looking for, $P(X < x)$, is equal to the probability $P(Z < z)$ to the left of the $z$-score in the standard normal distribution $Z \sim N(0, 1)$. So we can find $P(X < x)$ by first finding the $z$-score of $x$, then calculating $P(Z < z)$ using the <code>pnorm</code> function.
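
The equality $P(X < x) = P(Z < z)$ can be checked numerically. The sketch below uses the distribution $N(5, 2)$ from earlier with an illustrative value $x = 6$ (our choice, not from the text). It also relies on the fact that <code>pnorm</code> accepts optional <code>mean</code> and <code>sd</code> arguments, which gives a second way to compute the same probability.

```r
mu <- 5
sigma <- 2
x <- 6                      # illustrative x-value (an assumption for this sketch)

z <- (x - mu) / sigma       # z-score of x, here 0.5
p_via_z <- pnorm(q = z)     # P(Z < z) in the standard normal distribution

# Same probability computed directly in N(mu, sigma)
p_direct <- pnorm(q = x, mean = mu, sd = sigma)

all.equal(p_via_z, p_direct)   # TRUE
```

In this chapter we keep the $z$-score step explicit, since the $z$-score itself tells us how far the $x$-value is from the mean.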

In [22]:
png("pnorm_zscore.png", width = 1000, height = 250)

library("shape")

par(mar = c(0, 0, 0, 0))

a = 2
b = 1

top_theta = seq(1.5*pi/12, 10.5/12*pi, 0.01)
x1 = a*cos(top_theta) - a
x2 = a*cos(top_theta) + a
y = b*sin(top_theta)

plot(x1, y, type="l", xlab="", ylab="", axes=FALSE, col="blue3", lwd = 5, xlim=c(-2*a-0.35, 2*a+0.55), ylim=c(-0.35, b+0.55))
Arrows(x1[2], y[2], x1[1], y[1], lwd = 5, arr.type = "triangle", arr.width = 0.3, col = "blue3")
lines(x2, y, col="blue3", lwd=5)
Arrows(x2[2], y[2], x2[1], y[1], lwd = 5, arr.type = "triangle", arr.width = 0.3, col = "blue3")

text(-2*a, 0.15, labels=expression(x), cex=2.5)
text(-2*a, -0.15, labels=expression(N(mu, sigma)), cex=1.5)
text(0, 0.15, labels=expression(z), cex=2.5)
text(0, -0.15, labels=expression(N(0, 1)), cex=1.5)
text(2*a, 0, labels="Probability", cex=2.5)

text(-a, b, labels=expression(z == frac(x-mu,sigma)), cex=2.5, col="blue3", pos=3)
text(a, b, labels="pnorm(q)", cex=2.5, col="blue3", pos=3)

dev.off()

```{figure} pnorm_zscore.png
---
width: 100%
alt: An arrow labeled z = (x - mu)/sigma points from "x" to "z", then another arrow labeled pnorm(q) points from "z" to "Probability".
name: pnorm_zscore
---
To find the probability associated with an $x$-value in a non-standard normal distribution, first transform the problem to be over the standard normal distribution by finding the $z$-score using the formula $z = \frac{x - \mu}{\sigma}$, then use the <code>pnorm</code> function to compute the probability.
```

***

### Example 4.2.1
Let $X \sim N(15, 2.4)$.

1. Find $P(X < 13.5)$.
2. Find $P(X > 16.1)$.
3. Find $P(12.4 < X < 19.5)$.

#### Solution
Recall that $X \sim N(15, 2.4)$ means that $X$ is a normally distributed random variable with mean $\mu = 15$ and standard deviation $\sigma = 2.4$. 

##### Part 1
To find $P(X < 13.5)$, we first need to calculate the $z$-score of $x = 13.5$:

$$ z = \frac{x - \mu}{\sigma} = \frac{13.5 - 15}{2.4} = -0.625. $$

So the $z$-score of $x = 13.5$ is $z = -0.625$, meaning $P(X < 13.5) = P(Z < -0.625)$.

In [5]:
png("stdnormal_ex11.png", width = 1000, height = 500)

par(mar = c(2, 0, 0, 0))

#library(repr)
#options(repr.plot.width = 7, repr.plot.height=4)
library("shape")

rangebrace = function(x0, x1, y, d, col){
    Y = seq(y, y+d, length = 500)
    Xl = (x1 - x0)/4 * ( (2/d)^(1/3) * sign(Y - (y + d/2))*abs(Y - (y + d/2))^(1/3) + 1 ) + (x0 + x1)/2
    Xr = (x1 - x0)/4 * ( -(2/d)^(1/3) * sign(Y - (y + d/2))*abs(Y - (y + d/2))^(1/3) - 1 ) + (x0 + x1)/2
    lines(Xl, Y, lwd = 3, col = col)
    lines(Xr, Y, lwd = 3, col = col)
    }

z = seq(-3, 3, 0.01)
y = dnorm(z)
plot(z, y, type="l", xlab = "", ylab = "", axes = FALSE, cex = 2)
polyz = c(-3, seq(-3, 3, 0.01), 3)
polyy = c(0, dnorm(seq(-3, 3, 0.01)), 0)
polygon(polyz, polyy, col="gray90", border="NA")

zval = -0.625
polyz = c(-3, seq(-3, zval, 0.01), zval)
polyy = c(0, dnorm(seq(-3, zval, 0.01)), 0)
polygon(polyz, polyy, col="lightsteelblue", border = "NA")
lines(c(zval, zval), c(0, dnorm(zval)))

lines(z, y, type="l", cex = 2)
axis(1, pos=0, at=c(-3, zval, 0, 3), labels=c("", zval, 0, ""), cex.axis = 2, lwd.ticks = 0)
text(3, -0.01, labels = "z", cex = 2.5, pos = 4, xpd = TRUE)

text(-1.225, 0.05, labels = expression(P(Z < -0.625)), cex = 2, col = "blue3", pos = 3)

#Arrows(-2.75, 0.05, -2.75, 0.013, lwd = 5, arr.type = "triangle", arr.width = 0.3, col = "blue3")

#lines(c(0, 0), c(0, dnorm(0)), type = "l", cex = 2)

dev.off()

```{figure} stdnormal_ex11.png
---
width: 100%
alt: The area under the standard normal distribution and to the left of z = -0.625 is shaded. The shaded area is labeled P(Z < -0.625).
name: stdnormal_ex11
---
The standard normal distribution with the area equal to $P(Z < -0.625)$ shaded.
```

Now that we have converted the probability from a non-standard normal distribution to the standard normal distribution, we can use R to find the probability.

In [7]:
pnorm(q = -0.625)

Thus, $P(X < 13.5) = P(Z < -0.625) = 0.26599$.

##### Part 2
For this part, we want to find $P(X > 16.1)$. To calculate this probability, we first find the $z$-score of $x = 16.1$:

$$ z = \frac{x - \mu}{\sigma} = \frac{16.1 - 15}{2.4} = 0.45833. $$

Thus $P(X > 16.1) = P(Z > 0.45833)$.

In [10]:
png("stdnormal_ex12.png", width = 1000, height = 500)

par(mar = c(2, 0, 0, 0))

#library(repr)
#options(repr.plot.width = 7, repr.plot.height=4)
library("shape")

rangebrace = function(x0, x1, y, d, col){
    Y = seq(y, y+d, length = 500)
    Xl = (x1 - x0)/4 * ( (2/d)^(1/3) * sign(Y - (y + d/2))*abs(Y - (y + d/2))^(1/3) + 1 ) + (x0 + x1)/2
    Xr = (x1 - x0)/4 * ( -(2/d)^(1/3) * sign(Y - (y + d/2))*abs(Y - (y + d/2))^(1/3) - 1 ) + (x0 + x1)/2
    lines(Xl, Y, lwd = 3, col = col)
    lines(Xr, Y, lwd = 3, col = col)
    }

z = seq(-3, 3, 0.01)
y = dnorm(z)
plot(z, y, type="l", xlab = "", ylab = "", axes = FALSE, cex = 2)
polyz = c(-3, seq(-3, 3, 0.01), 3)
polyy = c(0, dnorm(seq(-3, 3, 0.01)), 0)
polygon(polyz, polyy, col="gray90", border="NA")

zval = 0.45833
polyz = c(zval, seq(zval, 3, 0.01), 3)
polyy = c(0, dnorm(seq(zval, 3, 0.01)), 0)
polygon(polyz, polyy, col="lightsteelblue", border = "NA")
lines(c(zval, zval), c(0, dnorm(zval)))

lines(z, y, type="l", cex = 2)
axis(1, pos=0, at=c(-3, zval, 0, 3), labels=c("", zval, 0, ""), cex.axis = 2, lwd.ticks = 0)
text(3, -0.01, labels = "z", cex = 2.5, pos = 4, xpd = TRUE)

text(1.03, 0.1, labels = expression(P(Z > 0.45833)), cex = 2, col = "blue3")

#lines(c(0, 0), c(0, dnorm(0)), type = "l", cex = 2)

dev.off()

```{figure} stdnormal_ex12.png
---
width: 100%
alt: The area under the standard normal distribution and to the right of z = 0.45833 is shaded. The shaded area is labeled P(Z > 0.45833).
name: stdnormal_ex12
---
The standard normal distribution with the area equal to $P(Z > 0.45833)$ shaded.
```

Next, we use R to calculate the probability. Recall that the <code>pnorm</code> function only gives the area to the left of the input $z$-value. To find the area to the right of a $z$-value, we start with the total area under the standard normal distribution (which is $1 = 100\%$), then subtract away the part that we don't want, which is the area left of the $z$-value.

In [11]:
1 - pnorm(q = 0.45833)

We find that $P(X > 16.1) = P(Z > 0.45833) = 0.32336$.
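
The sketch below repeats this calculation with two small variations, neither of which is required. First, the $z$-score is computed in R and stored at full precision instead of being rounded to $0.45833$. Second, it uses the <code>lower.tail = FALSE</code> argument of <code>pnorm</code>, which returns the area to the right directly. The variable names are our own.

```r
mu <- 15
sigma <- 2.4

z <- (16.1 - mu) / sigma                      # z-score kept at full precision

p_complement <- 1 - pnorm(q = z)              # total area minus the left area
p_upper <- pnorm(q = z, lower.tail = FALSE)   # right-tail area directly

all.equal(p_complement, p_upper)              # TRUE
```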

##### Part 3
We want to find $P(12.4 < X < 19.5)$. We need to find two $z$-scores: one for $x = 12.4$, and one for $x = 19.5$. The $z$-score of $x = 12.4$ is

$$ z = \frac{x - \mu}{\sigma} = \frac{12.4 - 15}{2.4} = -1.08333. $$

The $z$-score of $x = 19.5$ is

$$ z = \frac{x - \mu}{\sigma} = \frac{19.5 - 15}{2.4} = 1.875. $$

So $P(12.4 < X < 19.5) = P(-1.08333 < Z < 1.875)$.

In [17]:
png("stdnormal_ex13.png", width = 1000, height = 500)

par(mar = c(2, 0, 0, 0))

#library(repr)
#options(repr.plot.width = 7, repr.plot.height=4)
library("shape")

rangebrace = function(x0, x1, y, d, col){
    Y = seq(y, y+d, length = 500)
    Xl = (x1 - x0)/4 * ( (2/d)^(1/3) * sign(Y - (y + d/2))*abs(Y - (y + d/2))^(1/3) + 1 ) + (x0 + x1)/2
    Xr = (x1 - x0)/4 * ( -(2/d)^(1/3) * sign(Y - (y + d/2))*abs(Y - (y + d/2))^(1/3) - 1 ) + (x0 + x1)/2
    lines(Xl, Y, lwd = 3, col = col)
    lines(Xr, Y, lwd = 3, col = col)
    }

z = seq(-3, 3, 0.01)
y = dnorm(z)
plot(z, y, type="l", xlab = "", ylab = "", axes = FALSE, cex = 2)
polyz = c(-3, seq(-3, 3, 0.01), 3)
polyy = c(0, dnorm(seq(-3, 3, 0.01)), 0)
polygon(polyz, polyy, col="gray90", border="NA")

zval1 = -1.08333
zval2 = 1.875
polyz = c(zval1, seq(zval1, zval2, 0.01), zval2)
polyy = c(0, dnorm(seq(zval1, zval2, 0.01)), 0)
polygon(polyz, polyy, col="lightsteelblue", border = "NA")
lines(c(zval1, zval1), c(0, dnorm(zval1)))
lines(c(zval2, zval2), c(0, dnorm(zval2)))

lines(z, y, type="l", cex = 2)
axis(1, pos=0, at=c(-3, zval1, 0, zval2, 3), labels=c("", zval1, 0, zval2, ""), cex.axis = 2, lwd.ticks = 0)
text(3, -0.01, labels = "z", cex = 2.5, pos = 4, xpd = TRUE)

text(0.23, 0.1, labels = expression(P(~paste(-1.08333 < Z) < 1.875)), cex = 2, col = "blue3")

#lines(c(0, 0), c(0, dnorm(0)), type = "l", cex = 2)

dev.off()

```{figure} stdnormal_ex13.png
---
width: 100%
alt: The area under the standard normal distribution between z = -1.08333 and z = 1.875 is shaded. The shaded area is labeled P(-1.08333 < Z < 1.875).
name: stdnormal_ex13
---
The standard normal distribution with the area equal to $P(-1.08333 < Z < 1.875)$ shaded.
```

Next, we'll use R to calculate the probability. Since we want the probability between two $z$-values, recall that we first calculate all the area to the left of the larger $z$-value, then subtract the excess area to the left of the smaller $z$-value.

In [19]:
pnorm(q = 1.875) - pnorm(q = -1.08333)

Thus, $P(12.4 < X < 19.5) = P(-1.08333 < Z < 1.875) = 0.83027$.
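
As a cross-check on Part 3, the same probability can be computed without standardizing by hand, by passing the mean and standard deviation to <code>pnorm</code>. This is a sketch with variable names of our own choosing.

```r
mu <- 15
sigma <- 2.4

# P(12.4 < X < 19.5): area left of the larger value minus area left of the smaller
p_between <- pnorm(q = 19.5, mean = mu, sd = sigma) -
  pnorm(q = 12.4, mean = mu, sd = sigma)

# Same calculation through the z-scores
z_low <- (12.4 - mu) / sigma
z_high <- (19.5 - mu) / sigma
p_between_z <- pnorm(q = z_high) - pnorm(q = z_low)

all.equal(p_between, p_between_z)   # TRUE
```

Both values should agree with the $0.83027$ found above, up to the rounding of the $z$-scores.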

***


### Example 4.2.2

The heights of adult men in the United States are normally distributed with a mean of 70 inches (5 ft, 10 in) and a standard deviation of 4 inches. If you choose an adult male from the U.S. population at random, what is the probability that:
1. he will be shorter than $60$ inches ($5$ ft)?
2. he will be taller than $72$ inches ($6$ ft)?
3. he will be between $60$ inches and $72$ inches tall?

#### Solution
The distribution of the heights of adult males is $X \sim N(70, 4)$. This is a non-standard normal distribution. In each case, to find the probabilities, we will first convert to the standard normal distribution by finding the appropriate $z$-scores.

##### Part 1
We want $P(X < 60)$. To find the $z$-score associated with $x = 60$, we calculate

$$ z = \frac{x - \mu}{\sigma} = \frac{60 - 70}{4} = -2.5. $$

So the $z$-score of $x = 60$ is $z = -2.5$. This means $P(X < 60) = P(Z < -2.5)$.

In [15]:
png("stdnormal_ex8.png", width = 1000, height = 500)

par(mar = c(2, 0, 0, 0))

#library(repr)
#options(repr.plot.width = 7, repr.plot.height=4)
library("shape")

rangebrace = function(x0, x1, y, d, col){
    Y = seq(y, y+d, length = 500)
    Xl = (x1 - x0)/4 * ( (2/d)^(1/3) * sign(Y - (y + d/2))*abs(Y - (y + d/2))^(1/3) + 1 ) + (x0 + x1)/2
    Xr = (x1 - x0)/4 * ( -(2/d)^(1/3) * sign(Y - (y + d/2))*abs(Y - (y + d/2))^(1/3) - 1 ) + (x0 + x1)/2
    lines(Xl, Y, lwd = 3, col = col)
    lines(Xr, Y, lwd = 3, col = col)
    }

z = seq(-3, 3, 0.01)
y = dnorm(z)
plot(z, y, type="l", xlab = "", ylab = "", axes = FALSE, cex = 2)
polyz = c(-3, seq(-3, 3, 0.01), 3)
polyy = c(0, dnorm(seq(-3, 3, 0.01)), 0)
polygon(polyz, polyy, col="gray90", border="NA")

zval = -2.5
polyz = c(-3, seq(-3, zval, 0.01), zval)
polyy = c(0, dnorm(seq(-3, zval, 0.01)), 0)
polygon(polyz, polyy, col="lightsteelblue", border = "NA")
lines(c(zval, zval), c(0, dnorm(zval)))

lines(z, y, type="l", cex = 2)
axis(1, pos=0, at=c(-3, zval, 0, 3), labels=c("", zval, 0, ""), cex.axis = 2, lwd.ticks = 0)
text(3, -0.01, labels = "z", cex = 2.5, pos = 4, xpd = TRUE)

text(-2.75, 0.05, labels = expression(P(Z < -2.5)), cex = 2, col = "blue3", pos = 3)

Arrows(-2.75, 0.05, -2.75, 0.013, lwd = 5, arr.type = "triangle", arr.width = 0.3, col = "blue3")

#lines(c(0, 0), c(0, dnorm(0)), type = "l", cex = 2)

dev.off()

```{figure} stdnormal_ex8.png
---
width: 100%
alt: The area under the standard normal distribution and to the left of z = -2.5 is shaded. The shaded area is labeled P(Z < -2.5).
name: stdnormal_ex8
---
The standard normal distribution with the area equal to $P(Z < -2.5)$ shaded.
```

Now that we have converted the probability from a non-standard normal distribution to the standard normal distribution, we can use R to find the probability.

In [1]:
pnorm(q = -2.5)

Then $P(X < 60) = P(Z < -2.5) = 0.00621$. In other words, there is only a $0.621\%$ chance that the adult male you randomly choose from the population is less than $5$ feet tall.

##### Part 2
We want $P(X > 72)$. To find the $z$-score associated with $x = 72$, we calculate

$$ z = \frac{x - \mu}{\sigma} = \frac{72 - 70}{4} = 0.5. $$

So the $z$-score of $x = 72$ is $z = 0.5$. This means $P(X > 72) = P(Z > 0.5)$.

In [20]:
png("stdnormal_ex9.png", width = 1000, height = 500)

par(mar = c(2, 0, 0, 0))

#library(repr)
#options(repr.plot.width = 7, repr.plot.height=4)
library("shape")

rangebrace = function(x0, x1, y, d, col){
    Y = seq(y, y+d, length = 500)
    Xl = (x1 - x0)/4 * ( (2/d)^(1/3) * sign(Y - (y + d/2))*abs(Y - (y + d/2))^(1/3) + 1 ) + (x0 + x1)/2
    Xr = (x1 - x0)/4 * ( -(2/d)^(1/3) * sign(Y - (y + d/2))*abs(Y - (y + d/2))^(1/3) - 1 ) + (x0 + x1)/2
    lines(Xl, Y, lwd = 3, col = col)
    lines(Xr, Y, lwd = 3, col = col)
    }

z = seq(-3, 3, 0.01)
y = dnorm(z)
plot(z, y, type="l", xlab = "", ylab = "", axes = FALSE, cex = 2)
polyz = c(-3, seq(-3, 3, 0.01), 3)
polyy = c(0, dnorm(seq(-3, 3, 0.01)), 0)
polygon(polyz, polyy, col="gray90", border="NA")

zval = 0.5
polyz = c(zval, seq(zval, 3, 0.01), 3)
polyy = c(0, dnorm(seq(zval, 3, 0.01)), 0)
polygon(polyz, polyy, col="lightsteelblue", border = "NA")
lines(c(zval, zval), c(0, dnorm(zval)))

lines(z, y, type="l", cex = 2)
axis(1, pos=0, at=c(-3, zval, 0, 3), labels=c("", zval, 0, ""), cex.axis = 2, lwd.ticks = 0)
text(3, -0.01, labels = "z", cex = 2.5, pos = 4, xpd = TRUE)

text(1.05, 0.1, labels = expression(P(Z > 0.5)), cex = 2, col = "blue3")

#lines(c(0, 0), c(0, dnorm(0)), type = "l", cex = 2)

dev.off()

```{figure} stdnormal_ex9.png
---
width: 100%
alt: The area under the standard normal distribution and to the right of z = 0.5 is shaded. The shaded area is labeled P(Z > 0.5).
name: stdnormal_ex9
---
The standard normal distribution with the area equal to $P(Z > 0.5)$ shaded.
```

Now that we have converted the probability from a non-standard normal distribution to the standard normal distribution, we can use R to find the probability.

In [2]:
1 - pnorm(q = 0.5)

So $P(X > 72) = P(Z > 0.5) = 0.30854$. In other words, there is a $30.854\%$ chance that the adult male you randomly choose from the population is more than $6$ feet tall.

##### Part 3
We want $P(60 < X < 72)$. We need a $z$-score for both $x = 60$ and $x = 72$. Fortunately, we've already done the hard work. We already found the $z$-score of $x = 60$ is $z = -2.5$ in part 1, and we found the $z$-score of $x = 72$ is $z = 0.5$ in part 2. This means that

$$ P(60 < X < 72) = P(-2.5 < Z < 0.5). $$

In [28]:
png("stdnormal_ex10.png", width = 1000, height = 500)

par(mar = c(2, 0, 0, 0))

#library(repr)
#options(repr.plot.width = 7, repr.plot.height=4)
library("shape")

rangebrace = function(x0, x1, y, d, col){
    Y = seq(y, y+d, length = 500)
    Xl = (x1 - x0)/4 * ( (2/d)^(1/3) * sign(Y - (y + d/2))*abs(Y - (y + d/2))^(1/3) + 1 ) + (x0 + x1)/2
    Xr = (x1 - x0)/4 * ( -(2/d)^(1/3) * sign(Y - (y + d/2))*abs(Y - (y + d/2))^(1/3) - 1 ) + (x0 + x1)/2
    lines(Xl, Y, lwd = 3, col = col)
    lines(Xr, Y, lwd = 3, col = col)
    }

z = seq(-3, 3, 0.01)
y = dnorm(z)
plot(z, y, type="l", xlab = "", ylab = "", axes = FALSE, cex = 2)
polyz = c(-3, seq(-3, 3, 0.01), 3)
polyy = c(0, dnorm(seq(-3, 3, 0.01)), 0)
polygon(polyz, polyy, col="gray90", border="NA")

zval1 = -2.5
zval2 = 0.5
polyz = c(zval1, seq(zval1, zval2, 0.01), zval2)
polyy = c(0, dnorm(seq(zval1, zval2, 0.01)), 0)
polygon(polyz, polyy, col="lightsteelblue", border = "NA")
lines(c(zval1, zval1), c(0, dnorm(zval1)))
lines(c(zval2, zval2), c(0, dnorm(zval2)))

lines(z, y, type="l", cex = 2)
axis(1, pos=0, at=c(-3, zval1, 0, zval2, 3), labels=c("", zval1, 0, zval2, ""), cex.axis = 2, lwd.ticks = 0)
text(3, -0.01, labels = "z", cex = 2.5, pos = 4, xpd = TRUE)

text(-0.55, 0.1, labels = expression(P(~paste(-2.5 < Z) < 0.5)), cex = 2, col = "blue3")

#lines(c(0, 0), c(0, dnorm(0)), type = "l", cex = 2)

dev.off()

```{figure} stdnormal_ex10.png
---
width: 100%
alt: The area under the standard normal distribution between z = -2.5 and z = 0.5 is shaded. The shaded area is labeled P(-2.5 < Z < 0.5).
name: stdnormal_ex10
---
The standard normal distribution with the area equal to $P(-2.5 < Z < 0.5)$ shaded.
```

Now that we have converted the probability from a non-standard normal distribution to the standard normal distribution, we can use R to find the probability.

In [3]:
pnorm(q = 0.5) - pnorm(q = -2.5)

So $P(60 < X < 72) = P(-2.5 < Z < 0.5) = 0.68525$. That is, there is a $68.525\%$ chance that the adult male you choose is between $5$ feet and $6$ feet tall.
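
A useful sanity check for this example: a height is either below $60$ inches, between $60$ and $72$ inches, or above $72$ inches, so the three probabilities should add to $1$. The sketch below recomputes all three parts and checks the sum. The variable names are our own.

```r
mu <- 70      # mean height in inches
sigma <- 4    # standard deviation in inches

z_60 <- (60 - mu) / sigma   # -2.5
z_72 <- (72 - mu) / sigma   #  0.5

p_shorter <- pnorm(q = z_60)                     # P(X < 60)
p_taller  <- 1 - pnorm(q = z_72)                 # P(X > 72)
p_between <- pnorm(q = z_72) - pnorm(q = z_60)   # P(60 < X < 72)

# The three events cover every possible height, so this sum should be 1
p_shorter + p_between + p_taller
```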

***

## Finding an $x$-value Given Probability by Computing the $z$-score
In the above examples, we saw how to find a probability involving a non-standard normal distribution. Next, we will learn how to do the inverse: given a probability, we want to find a corresponding $x$-value in a non-standard normal distribution. For example, we might want to find $x$ so that $P(X < x)$ is equal to a certain probability.

To do so, we first use the <code>qnorm</code> function (which is the inverse of the <code>pnorm</code> function) to find the $z$-score associated with our initial probability. Once we have the $z$-score, we use the formula

$$ x = \mu + z\sigma $$

to calculate the $x$-value we are looking for.
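
The two steps can be sketched in R as follows. The distribution $N(5, 2)$ and the probability $0.9$ are illustrative choices of ours. The last lines use the fact that <code>qnorm</code>, like <code>pnorm</code>, accepts optional <code>mean</code> and <code>sd</code> arguments, so it can also return the $x$-value in one step.

```r
mu <- 5
sigma <- 2
p <- 0.9                   # illustrative left-tail probability

z <- qnorm(p = p)          # z-score with area p to its left
x <- mu + z * sigma        # convert the z-score back to an x-value

# One-step alternative using the mean and sd arguments
x_direct <- qnorm(p = p, mean = mu, sd = sigma)
all.equal(x, x_direct)     # TRUE

# Round trip: the area to the left of x in N(mu, sigma) is p again
pnorm(q = x, mean = mu, sd = sigma)
```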

In [16]:
png("pnorm_qnorm_zscore.png", width = 1000, height = 350)

library("shape")

par(mar = c(0, 0, 0, 0))

a = 2
b = 1

top_theta = seq(1.5*pi/12, 10.5/12*pi, 0.01)
x1 = a*cos(top_theta) - a
x2 = a*cos(top_theta) + a
y = b*sin(top_theta)

plot(x1, y, type="l", xlab="", ylab="", axes=FALSE, col="blue3", lwd = 5, xlim=c(-2*a-0.35, 2*a+0.55), ylim=c(-b-0.35, b+0.55))
Arrows(x1[2], y[2], x1[1], y[1], lwd = 5, arr.type = "triangle", arr.width = 0.3, col = "blue3")
lines(x2, y, col="blue3", lwd=5)
Arrows(x2[2], y[2], x2[1], y[1], lwd = 5, arr.type = "triangle", arr.width = 0.3, col = "blue3")

bot_theta = seq(-10.5/12*pi, -1.5*pi/12, 0.01)
x1 = a*cos(bot_theta) - a
x2 = a*cos(bot_theta) + a
y = b*sin(bot_theta)

lines(x1, y, col="red3", lwd = 5)
Arrows(x1[2], y[2], x1[1], y[1], lwd = 5, arr.type = "triangle", arr.width = 0.3, col = "red3")
lines(x2, y, col="red3", lwd = 5)
Arrows(x2[2], y[2], x2[1], y[1], lwd = 5, arr.type = "triangle", arr.width = 0.3, col = "red3")

text(-2*a, 0.15, labels=expression(x), cex=2.5)
text(-2*a, -0.15, labels=expression(N(mu, sigma)), cex=1.5)
text(0, 0.15, labels=expression(z), cex=2.5)
text(0, -0.15, labels=expression(N(0, 1)), cex=1.5)
text(2*a, 0, labels="Probability", cex=2.5)

text(-a, b, labels=expression(z == frac(x-mu,sigma)), cex=2.5, col="blue3", pos=3)
text(a, b, labels="pnorm(q)", cex=2.5, col="blue3", pos=3)

text(-a, -b, labels=expression(x == mu + z*sigma), cex=2.5, col="red3", pos=1)
text(a, -b, labels="qnorm(p)", cex=2.5, col="red3", pos=1)

dev.off()

```{figure} pnorm_qnorm_zscore.png
---
width: 100%
alt: At the top of the image, an arrow labeled z = (x - mu)/sigma points from "x" to "z", then another arrow labeled pnorm(q) points from "z" to "Probability". At the bottom of the image, an arrow labeled qnorm(p) points from "probability" to "z", then another arrow labeled x = mu + z*sigma points from "z" to "x".
name: pnorm_qnorm_zscore
---
The approach we take to solve a problem involving a non-standard normal distribution depends on whether we know an $x$-value and want to find an associated probability, or we know a probability and want to find an associated $x$-value. But in either case, we must find a $z$-score as an intermediary step.
```

***

### Example 4.2.3
Let $X \sim N(5, 0.6)$.

1. Find $a$ so that $P(X < a) = 0.75$.
2. Find $b$ so that $P(X > b) = 0.9$.

#### Solution
First, observe that the random variable $X$ is normally distributed with mean $\mu = 5$ and standard deviation $\sigma = 0.6$.

##### Part 1


In [35]:
S = 0.6
png("normal_ex6.png", width = 1000, height = 500*dnorm(0, sd = S)/dnorm(0))

par(mar = c(2, 0, 0, 0))

#library(repr)
#options(repr.plot.width = 7, repr.plot.height=4)

rangebrace = function(x0, x1, y, d, col){
    Y = seq(y, y+d, length = 500)
    Xl = (x1 - x0)/4 * ( (2/d)^(1/3) * sign(Y - (y + d/2))*abs(Y - (y + d/2))^(1/3) + 1 ) + (x0 + x1)/2
    Xr = (x1 - x0)/4 * ( -(2/d)^(1/3) * sign(Y - (y + d/2))*abs(Y - (y + d/2))^(1/3) - 1 ) + (x0 + x1)/2
    lines(Xl, Y, lwd = 3, col = col)
    lines(Xr, Y, lwd = 3, col = col)
    }

z = seq(-3, 3, 0.01)
y = dnorm(z, sd = S)
plot(z, y, type="l", xlab = "", ylab = "", ylim = c(0, dnorm(0, sd = S)), axes = FALSE, cex = 2)
polyz = c(-3, seq(-3, 3, 0.01), 3)
polyy = c(0, dnorm(seq(-3, 3, 0.01), sd = S), 0)
polygon(polyz, polyy, col="gray90", border="NA")

zval = qnorm(0.75, sd = S)
polyz = c(-3, seq(-3, zval, 0.001), zval)
polyy = c(0, dnorm(seq(-3, zval, 0.001), sd = S), 0)
polygon(polyz, polyy, col="lightsteelblue", border = "NA")
lines(c(zval, zval), c(0, dnorm(zval, sd = S)))

lines(z, y, type="l", cex = 2)
axis(1, pos=0, at=c(-3, zval, 0, 3), labels=c("", expression(a), 5, ""), cex.axis = 2, lwd.ticks = 0)
text(3, -0.01, labels = "x", cex = 2.5, pos = 4, xpd = TRUE)

text(-0.37, 0.1, labels = expression(P(X < a) == 0.75), cex = 2, col = "blue3")

#lines(c(0, 0), c(0, dnorm(0)), type = "l", cex = 2)

dev.off()

```{figure} normal_ex6.png
---
width: 100%
alt: A normal distribution with mean 5 is shown. The area under the curve to the left of an unknown value a is shaded, and the shaded region is labeled P(X < a) = 0.75.
name: normal_ex6
---
The distribution $N(5, 0.6)$ is shown. We want to find the value $a$ so that $P(X < a) = 0.75$.
```

We want to find $a$ so that $P(X < a) = 0.75$. Note that we are given a probability ($0.75$), and we want to find a particular $x$-value in the distribution ($x = a$). To do this, we first need to use the <code>qnorm</code> function to find the $z$-score associated with the probability:

In [36]:
qnorm(p = 0.75)

So the $z$-score is $z = 0.67449$, meaning $P(Z < 0.67449) = 0.75$.

Now that we have the $z$-score, we can calculate the value of $x = a$:

$$ a = \mu + z\sigma = 5 + 0.67449(0.6) = 5.40469. $$

Thus, $P(X < 5.40469) = 0.75$.
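
One way to check this answer is to plug $a$ back into <code>pnorm</code> and confirm that the area to its left is $0.75$. The sketch below does this, using the <code>mean</code> and <code>sd</code> arguments of <code>pnorm</code>. The variable names are our own.

```r
mu <- 5
sigma <- 0.6

z <- qnorm(p = 0.75)    # z-score with area 0.75 to its left
a <- mu + z * sigma     # corresponding x-value in N(5, 0.6)

# Area to the left of a in N(5, 0.6); this should recover 0.75
pnorm(q = a, mean = mu, sd = sigma)
```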

##### Part 2

In [43]:
S = 0.6
png("normal_ex7.png", width = 1000, height = 500*dnorm(0, sd = S)/dnorm(0))

par(mar = c(2, 0, 0, 0))

#library(repr)
#options(repr.plot.width = 7, repr.plot.height=4)

rangebrace = function(x0, x1, y, d, col){
    Y = seq(y, y+d, length = 500)
    Xl = (x1 - x0)/4 * ( (2/d)^(1/3) * sign(Y - (y + d/2))*abs(Y - (y + d/2))^(1/3) + 1 ) + (x0 + x1)/2
    Xr = (x1 - x0)/4 * ( -(2/d)^(1/3) * sign(Y - (y + d/2))*abs(Y - (y + d/2))^(1/3) - 1 ) + (x0 + x1)/2
    lines(Xl, Y, lwd = 3, col = col)
    lines(Xr, Y, lwd = 3, col = col)
    }

z = seq(-3, 3, 0.01)
y = dnorm(z, sd = S)
plot(z, y, type="l", xlab = "", ylab = "", ylim = c(0, dnorm(0, sd = S)), axes = FALSE, cex = 2)
polyz = c(-3, seq(-3, 3, 0.01), 3)
polyy = c(0, dnorm(seq(-3, 3, 0.01), sd = S), 0)
polygon(polyz, polyy, col="gray90", border="NA")

zval = qnorm(1 - 0.9, sd = S)
polyz = c(zval, seq(zval, 3, 0.001),3)
polyy = c(0, dnorm(seq(zval, 3, 0.001), sd = S), 0)
polygon(polyz, polyy, col="lightsteelblue", border = "NA")
lines(c(zval, zval), c(0, dnorm(zval, sd = S)))

lines(z, y, type="l", cex = 2)
axis(1, pos=0, at=c(-3, zval, 0, 3), labels=c("", expression(b), 5, ""), cex.axis = 2, lwd.ticks = 0)
text(3, -0.01, labels = "x", cex = 2.5, pos = 4, xpd = TRUE)

text(0.18, 0.1, labels = expression(P(X > b) == 0.9), cex = 2, col = "blue3")

#lines(c(0, 0), c(0, dnorm(0)), type = "l", cex = 2)

dev.off()

```{figure} normal_ex7.png
---
width: 100%
alt: A normal distribution with mean 5 is shown. The area under the curve to the right of an unknown value b is shaded, and the shaded region is labeled P(X > b) = 0.9.
name: normal_ex7
---
The distribution $N(5, 0.6)$ is shown. We want to find the value of $b$ so that $P(X > b) = 0.9$.
```

We want to find the value of $b$ so that $P(X > b) = 0.9$. Again, we are given a probability ($0.9$), and we need to find an $x$-value associated with that probability ($x = b$).

We will begin by using the <code>qnorm</code> function to find the $z$-value associated with the probability $0.9$. However, recall that the <code>qnorm</code> function assumes that the input represents a *left*-tailed probability. The probability of $0.9$ in this example is a *right*-tailed probability. To use the <code>qnorm</code> function, we first observe that since the probability to the right of $b$ is $P(X > b) = 0.9$, the probability to the left of $b$ is $P(X < b) = 1 - P(X > b) = 1 - 0.9$. With this observation, we are able to calculate the $z$-score:

In [44]:
qnorm(p = 1 - 0.9)

We get $z = -1.28155$.

Now that we have the $z$-score, we can calculate the value of $x = b$:

$$ b = \mu + z\sigma = 5 + (-1.28155)(0.6) = 4.23107. $$

Thus, $P(X > 4.23107) = 0.9$.
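
Since right-tailed problems are easy to get backwards, it is worth confirming that the area to the *right* of $b$ really is $0.9$. A minimal sketch, with variable names of our own:

```r
mu <- 5
sigma <- 0.6

z <- qnorm(p = 1 - 0.9)   # the left-tail area is 0.1
b <- mu + z * sigma       # b lies below the mean because z is negative

# Check: the area to the right of b in N(5, 0.6) should be 0.9
1 - pnorm(q = b, mean = mu, sd = sigma)
```

Because more than half of the area lies to the right of $b$, we should expect $b$ to be less than the mean of $5$, which it is.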

***

### Example 4.2.4
Scores on a calculus test are normally distributed with a mean score of $73$ and a standard deviation of $9$.
1. Find the $85$th percentile.
2. The highest $32\%$ of scores are higher than which score?

#### Solution
##### Part 1

In [1]:
S = 1.5
png("normal_ex4.png", width = 1000, height = 500*dnorm(0, sd = S)/dnorm(0))

par(mar = c(2, 0, 0, 0))

#library(repr)
#options(repr.plot.width = 7, repr.plot.height=4)

rangebrace = function(x0, x1, y, d, col){
    Y = seq(y, y+d, length = 500)
    Xl = (x1 - x0)/4 * ( (2/d)^(1/3) * sign(Y - (y + d/2))*abs(Y - (y + d/2))^(1/3) + 1 ) + (x0 + x1)/2
    Xr = (x1 - x0)/4 * ( -(2/d)^(1/3) * sign(Y - (y + d/2))*abs(Y - (y + d/2))^(1/3) - 1 ) + (x0 + x1)/2
    lines(Xl, Y, lwd = 3, col = col)
    lines(Xr, Y, lwd = 3, col = col)
    }

z = seq(-3, 3, 0.01)
y = dnorm(z, sd = S)
plot(z, y, type="l", xlab = "", ylab = "", ylim = c(0, dnorm(0, sd = S)), axes = FALSE, cex = 2)
polyz = c(-3, seq(-3, 3, 0.01), 3)
polyy = c(0, dnorm(seq(-3, 3, 0.01), sd = S), 0)
polygon(polyz, polyy, col="gray90", border="NA")

zval = qnorm(0.85, sd = S)
polyz = c(-3, seq(-3, zval, 0.001), zval)
polyy = c(0, dnorm(seq(-3, zval, 0.001), sd = S), 0)
polygon(polyz, polyy, col="lightsteelblue", border = "NA")
lines(c(zval, zval), c(0, dnorm(zval, sd = S)))

lines(z, y, type="l", cex = 2)
axis(1, pos=0, at=c(-3, zval, 0, 3), labels=c("", expression(x), 73, ""), cex.axis = 2, lwd.ticks = 0)
text(3, -0.01, labels = "x", cex = 2.5, pos = 4, xpd = TRUE)

text(-0.3, 0.1, labels = expression(P(X < x) == 0.85), cex = 2, col = "blue3")

#lines(c(0, 0), c(0, dnorm(0)), type = "l", cex = 2)

dev.off()

```{figure} normal_ex4.png
---
width: 100%
alt: A normal distribution with mean 73 is shown. The area under the curve to the left of an unknown value x is shaded, and the shaded region is labeled P(X < x) = 0.85.
name: normal_ex4
---
The distribution $N(73, 9)$ is shown. We want to find the $x$-value that is the $85$th percentile.
```

In this example, the $85$th percentile is the test score that is larger than $85\%$ of all test scores. Note that we are given a probability ($85\% = 0.85$), and we need to find a particular $x$-value (the test score that is the $85$th percentile). We will first use the <code>qnorm</code> function to find the $z$-score associated with the given probability.

In [63]:
qnorm(p = 0.85)

We get $z = 1.03643$. This $z$-score is the $85$th percentile of the *standard* normal distribution. To find the $x$-value that is the $85$th percentile of test scores, we calculate

$$ x = \mu + z\sigma = 73 + 1.03643(9) = 82.32790. $$

So a test score of $x = 82.32790$ is greater than $85\%$ of scores on the calculus test.
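
Because <code>qnorm</code> accepts a vector of probabilities, several percentiles can be found at once. In the sketch below, the $25$th and $50$th percentiles are extra values we added for comparison. Only the $85$th comes from the example.

```r
mu <- 73
sigma <- 9

probs <- c(0.25, 0.50, 0.85)   # 25th and 50th are extra, for comparison
z <- qnorm(p = probs)          # z-score of each percentile
scores <- mu + z * sigma       # matching test scores

scores
```

The middle value is the mean, $73$, since the $50$th percentile of a normal distribution is its mean, and the last value matches the $85$th percentile found above.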

##### Part 2

In [2]:
S = 1.5
png("normal_ex5.png", width = 1000, height = 500*dnorm(0, sd = S)/dnorm(0))

par(mar = c(2, 0, 0, 0))

#library(repr)
#options(repr.plot.width = 7, repr.plot.height=4)

rangebrace = function(x0, x1, y, d, col){
    Y = seq(y, y+d, length = 500)
    Xl = (x1 - x0)/4 * ( (2/d)^(1/3) * sign(Y - (y + d/2))*abs(Y - (y + d/2))^(1/3) + 1 ) + (x0 + x1)/2
    Xr = (x1 - x0)/4 * ( -(2/d)^(1/3) * sign(Y - (y + d/2))*abs(Y - (y + d/2))^(1/3) - 1 ) + (x0 + x1)/2
    lines(Xl, Y, lwd = 3, col = col)
    lines(Xr, Y, lwd = 3, col = col)
    }

z = seq(-3, 3, 0.01)
y = dnorm(z, sd = S)
plot(z, y, type="l", xlab = "", ylab = "", ylim = c(0, dnorm(0, sd = S)), axes = FALSE, cex = 2)
polyz = c(-3, seq(-3, 3, 0.01), 3)
polyy = c(0, dnorm(seq(-3, 3, 0.01), sd = S), 0)
polygon(polyz, polyy, col="gray90", border="NA")

zval = qnorm(1 - 0.32, sd = S)
polyz = c(zval, seq(zval, 3, 0.001),3)
polyy = c(0, dnorm(seq(zval, 3, 0.001), sd = S), 0)
polygon(polyz, polyy, col="lightsteelblue", border = "NA")
lines(c(zval, zval), c(0, dnorm(zval, sd = S)))

lines(z, y, type="l", cex = 2)
axis(1, pos=0, at=c(-3, zval, 0, 3), labels=c("", expression(x), 73, ""), cex.axis = 2, lwd.ticks = 0)
text(3, -0.01, labels = "x", cex = 2.5, pos = 4, xpd = TRUE)

text(1.35, 0.1, labels = expression(P(X > x) == 0.32), cex = 2, col = "blue3")

#lines(c(0, 0), c(0, dnorm(0)), type = "l", cex = 2)

dev.off()

```{figure} normal_ex5.png
---
width: 100%
alt: A normal distribution with mean 73 is shown. The area under the curve to the right of an unknown value x is shaded, and the shaded region is labeled P(X > x) = 0.32.
name: normal_ex5
---
The distribution $N(73, 9)$ is shown. We want to find the $x$-value with $32\%$ of test scores higher than it.
```

Again, we are given a probability ($32\% = 0.32$), and we want to find an $x$-value (the test score with $32\%$ of scores higher than it). We can use the <code>qnorm</code> function to find the $z$-score associated with the probability. However, the probability we are given is located in the *upper* tail of the distribution. The <code>qnorm</code> function expects probability in the *lower* tail of the distribution. But this is easy to calculate. Since the probability in the upper tail of the distribution is $0.32$, the area in the lower tail of the distribution is $1 - 0.32$.

In [69]:
qnorm(p = 1 - 0.32)

We get $z = 0.46770$. This means that $32\%$ of possible $z$-values in the *standard* normal distribution are larger than $z = 0.46770$. To find the $x$-value which is the test score with $32\%$ of test scores greater than it, we calculate

$$ x = \mu + z\sigma = 73 + 0.46770(9) = 77.20929. $$

So $32\%$ of test scores were higher than a score of $77.20929$.
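
As an alternative to subtracting from $1$ by hand, <code>qnorm</code> has a <code>lower.tail</code> argument: setting <code>lower.tail = FALSE</code> tells it to treat the input as an upper-tail probability. The sketch below combines this with the <code>mean</code> and <code>sd</code> arguments and compares the result with the $z$-score approach used above. The variable names are our own.

```r
mu <- 73
sigma <- 9

# Score with 32% of the distribution above it, asking for the upper tail directly
x <- qnorm(p = 0.32, mean = mu, sd = sigma, lower.tail = FALSE)

# Same score via the lower tail and the z-score formula
x_via_z <- mu + qnorm(p = 1 - 0.32) * sigma

all.equal(x, x_via_z)   # TRUE
```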