### Unit 3 - In-class exercises (Application of the Normal Distribution)


#### Exercise 1 (Age at Childbirth):

Over the last decade, the average age of a mother at childbirth has been 26.4 years, with a standard deviation of 5.8 years. The distribution of age at childbirth is approximately normal.

  

a) What age at childbirth puts a woman in the upper 2.5% of the age distribution? In other words, what is the 97.5th percentile of this age distribution?

#### Solution:

First, use `qnorm()` to identify the $z$-value associated with the 97.5th percentile: 1.96. Then convert this $z$-value to the scale of age at childbirth, where $X \sim N(\mu = 26.4, \sigma = 5.8)$.

$$X = \sigma Z + \mu = (5.8)(1.96) + 26.4 = 37.77$$

An age of about 38 (37.8) puts a woman in the upper 2.5% of the age distribution.

In [1]:
#use qnorm with the standard normal
qnorm(.975)
5.8*qnorm(.975) + 26.4

#have R convert to the given normal distribution
qnorm(.975, mean = 26.4, sd = 5.8)
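
As a sanity check on part (a), the cutoff can be fed back into `pnorm()` to confirm that the upper-tail area beyond it is 0.025. This is a minimal sketch; the variable names `cutoff` and `upper_tail` are illustrative and not part of the original solution.

```r
# Round-trip check: quantile -> probability
cutoff <- qnorm(0.975, mean = 26.4, sd = 5.8)

# Area to the right of the cutoff; should recover 1 - 0.975 = 0.025
upper_tail <- pnorm(cutoff, mean = 26.4, sd = 5.8, lower.tail = FALSE)

cutoff
upper_tail
```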

  b) What proportion of women who give birth are 21 years of age or older?


#### Solution:

Convert 21 to a $Z$-score, then use `pnorm()` to calculate the percentile.

$$Z = \dfrac{X- \mu}{\sigma} = \dfrac{21 - 26.4}{5.8} = -0.93$$

$P(X \geq 21) = P(Z \geq -0.93) = 0.824$; about 82.4% of women who give birth are 21 years of age or older. By default, `pnorm()` returns the lower-tail probability $P(X \leq x)$, so either subtract the result from 1 or set `lower.tail = FALSE`.


In [2]:
#use pnorm with the standard normal
z.score = (21 - 26.4)/5.8
1 - pnorm(z.score)
pnorm(z.score, lower.tail = FALSE)

#have R convert to the given normal distribution
pnorm(21, mean = 26.4, sd = 5.8, lower.tail = FALSE)
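
The exact answer can also be compared against a quick simulation. The sketch below is only an illustrative check, not part of the original solution: it draws a large sample from the assumed normal model and computes the share of simulated ages at or above 21. The seed and sample size are arbitrary choices.

```r
set.seed(1)  # arbitrary seed, used only for reproducibility

# Simulate ages at childbirth under the assumed N(26.4, 5.8) model
ages <- rnorm(1e5, mean = 26.4, sd = 5.8)

# Simulated proportion of mothers aged 21 or older
sim_prop <- mean(ages >= 21)

# Exact proportion from the normal model
exact_prop <- pnorm(21, mean = 26.4, sd = 5.8, lower.tail = FALSE)

c(simulated = sim_prop, exact = exact_prop)
```

The two values should agree to about two decimal places; the remaining gap is sampling variability.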

#### Exercise 2 (Mild Hypertension):

Suppose a mild hypertensive is defined as a person whose diastolic blood pressure (DBP) is between 90 and 100 mm Hg, inclusive. Assume that in the population of 35-44 year old men, mean DBP is 80 mm Hg, with variance 144. What is the probability that a randomly selected male from this population is a mild hypertensive?


#### Solution:

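Find $P(90 \leq X \leq 100) = P(X \leq 100) - P(X \leq 90)$; the equality holds because $X$ is continuous, so $P(X = 90) = 0$. The standard deviation is $\sigma = \sqrt{144} = 12$. Convert to $Z$-scores, then calculate the probabilities using `pnorm()`.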

$$Z = \dfrac{X - \mu}{\sigma} = \dfrac{100 - 80}{12} = 1.67$$
$$Z = \dfrac{X - \mu}{\sigma} = \dfrac{90 - 80}{12} = 0.83$$

The probability that a randomly selected male from this population is a mild hypertensive is about 0.15. The slight discrepancy between the two results below comes from rounding the $Z$-scores to two decimal places in the first calculation.


In [3]:
#calculations from z-scores
pnorm(1.67) - pnorm(0.83)

#let R convert
pnorm(100, mean = 80, sd = 12) - pnorm(90, mean = 80, sd = 12)
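
Because the problem states a variance rather than a standard deviation, it can help to make the square root explicit in code. The helper below is a hypothetical convenience function (the name `prob_between` and its arguments are assumptions for illustration); it takes the variance, converts it to a standard deviation, and returns the probability of falling in an interval.

```r
# Hypothetical helper: P(lower <= X <= upper) for a normal X,
# parameterized by the variance rather than the standard deviation
prob_between <- function(lower, upper, mean, variance) {
  sd <- sqrt(variance)  # pnorm() expects the standard deviation
  pnorm(upper, mean = mean, sd = sd) - pnorm(lower, mean = mean, sd = sd)
}

# Mild hypertension: DBP between 90 and 100 mm Hg
p_mild <- prob_between(90, 100, mean = 80, variance = 144)
p_mild
```

Passing `sd = 144` to `pnorm()` by mistake would give a very different answer, which is why the conversion is done inside the function.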

#### Exercise 3 (Normal Heights):

The distribution of height in a population is typically modeled with the normal distribution. Let $X$ represent the height of an American female and $Y$ represent the height of an American male: $X \sim$ N($\mu _{x} = 63.8$ in, $\sigma_{x} = 2.5$ in), $Y \sim$ N($\mu_{y} = 69.2$ in, $\sigma_{y} = 2.8$ in).



a) What is the probability that a random American male is at most 68 inches tall?
    

#### Solution:

First, convert $Y$ to a $Z$-score:
	
$$\text{P}(Y \leq 68) = \text{P}\left(\frac{Y-\mu_y}{\sigma_y} \leq \frac{68-\mu_y}{\sigma_y}\right) = \text{P}\left(Z \leq \frac{68-69.2}{2.8}\right) = \text{P}(Z \leq -0.43)$$

In [5]:
#use R to calculate the probability
pnorm(-.43)

#alternatively, use R to convert to a z and find the probability
pnorm(68, 69.2, 2.8)

The probability that a random American male is at most 68 inches tall is 0.334.

b) How tall would a person have to be in order to be taller than 90% of the male US population?

#### Solution:

\begin{align*}
	0.9 =& \text{P}(Y \leq k)\\ 
	0.9=& \text{P}\left(\frac{Y-\mu_y}{\sigma_y} \leq \frac{k-\mu_y}{\sigma_y}\right) \\
	0.9=& \text{P}\left( Z \leq \frac{k-69.2}{2.8} \right)
\end{align*}

Use `R` to find the $Z$-value with an area of 0.90 to its left.

In [6]:
qnorm(.9)

Solve for $k$ (in other words, transform the $Z$-value back into $Y$). A person must be at least 72.8 inches tall (about 6 ft. tall) to be taller than 90% of the male US population. 

In [7]:
qnorm(.9)*2.8+69.2
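
Written out, the algebra behind this step is as follows, with $z_{0.90} \approx 1.2816$ taken from `qnorm(.9)`:

\begin{align*}
	\frac{k-69.2}{2.8} &= z_{0.90} \approx 1.2816 \\
	k &= 69.2 + (1.2816)(2.8) \approx 72.79
\end{align*}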

Alternatively, use `R` in one step:

In [8]:
qnorm(.9, 69.2, 2.8)

c) The mean and standard deviation of US female and male heights can be converted to meters using formulas in *OI Biostat* 2.3. $X \sim$ N($\mu _{x} = 1.62$ m, $\sigma_{x} = 0.0635$ m), $Y \sim$ N($\mu_{y} = 1.76$ m, $\sigma_{y} = 0.0711$ m).
    
The retired Chinese basketball player Yao Ming was the tallest active player in the NBA at the time of his final season, at 2.29 m. What proportion of the US population is as tall as Yao Ming (or taller)? Assume that half the population are male and half are female.
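
The converted parameters can be reproduced with a short sketch. It assumes the relevant formulas are the usual rules for a linear rescaling, $E(aX) = aE(X)$ and $SD(aX) = |a|\,SD(X)$, with $a = 0.0254$ meters per inch; the variable names are illustrative.

```r
in_to_m <- 0.0254  # meters per inch (exact by definition)

# Parameters in inches, from the start of the exercise
mu_in <- c(female = 63.8, male = 69.2)
sd_in <- c(female = 2.5, male = 2.8)

# Rescaling multiplies both the mean and the standard deviation by the factor
mu_m <- mu_in * in_to_m
sd_m <- sd_in * in_to_m

round(mu_m, 2)
round(sd_m, 4)
```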
 

#### Solution:

If we assume that half of the population are male and half are female, this proportion is given by:
	
$$\dfrac{1}{2} P(Y \geq 2.29) + \dfrac{1}{2} P(X \geq 2.29)$$
	
\begin{align*}
	\dfrac{1}{2} P(Y \geq 2.29) + \dfrac{1}{2} P(X \geq 2.29) &= \dfrac{1}{2}[1 - P(Y \leq 2.29)] + \dfrac{1}{2}[1 - P(X \leq 2.29)] \\
	&= \dfrac{1}{2} \left[1 - \text{P}\left(\frac{Y-1.76}{0.0711} \leq \frac{2.29-1.76}{0.0711}\right)\right] + \dfrac{1}{2}\left[ 1 - \text{P}\left(\frac{X-1.62}{0.0635} \leq \frac{2.29-1.62}{0.0635}\right) \right] \\
	&= \dfrac{1}{2} \left[1 - \text{P}(Z\leq 7.45)\right] + \dfrac{1}{2}\left[1 - \text{P}(Z\leq 10.55) \right] \\
	&= 2.33 \times 10^{-14}
\end{align*}



In [9]:
0.5*pnorm(2.29, 1.76, 0.0711, lower.tail = FALSE) +
  0.5*pnorm(2.29, 1.62, 0.0635, lower.tail = FALSE)

The answers differ slightly due to rounding of the $z$-values in the written approach.
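
For probabilities this small, the choice of `lower.tail = FALSE` over `1 - pnorm(...)` also matters numerically. The sketch below illustrates the point for the female term, using the rounded $z$-value 10.55 from the written solution; the variable names are illustrative.

```r
# Naive complement: pnorm(10.55) rounds to exactly 1 in double precision,
# so the subtraction returns 0 and the tail probability is lost
naive_female <- 1 - pnorm(10.55)

# Asking for the upper tail directly preserves the tiny probability
direct_female <- pnorm(10.55, lower.tail = FALSE)

naive_female
direct_female
```

The female term is negligible next to the male term here, so the final answer is unaffected, but computing the upper tail directly is the safer habit for extreme $z$-values.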

#### Exercise 4 (Hypertension):

People are classified as hypertensive if their systolic blood pressure (SBP) is higher than a specified level for their age group. For ages 1-14, SBP has a mean of 105.0 and standard deviation 5.0, with hypertension level 115.0.  For ages 15-44, SBP has a mean of 125.0 and standard deviation 10.0, with hypertension level 140.0. Assume SBP is normally distributed.

(Problem from Rosner, **Fundamentals of Biostatistics**, 7th edition, pp. 138-139)

a) What proportion of 1- to 14-year-olds are hypertensive?

#### Solution:

Convert to a $Z$-score, then find the associated probability:
	
$$\text{P}(X \geq 115.0) = \text{P}\left(\frac{X-\mu}{\sigma} \geq \frac{115.0-\mu}{\sigma}\right) = \text{P}\left(Z \geq \frac{115.0-105.0}{5.0}\right) = \text{P}(Z \geq 2)$$
	
$$\text{P}(Z \geq 2) = 1 - \text{P}(Z \leq 2)  = 1 - .9772 = 0.0228$$

In [10]:
pnorm(115, 105, 5, lower.tail=FALSE)

2.28% of 1- to 14-year-olds are hypertensive.

b) What proportion of 15- to 44-year-olds are hypertensive?

#### Solution:

$$\text{P}(X \geq 140.0) = \text{P}\left(\frac{X-\mu}{\sigma} \geq \frac{140.0-\mu}{\sigma}\right) = \text{P}\left(Z \geq \frac{140.0-125.0}{10.0}\right) = \text{P}(Z \geq 1.5)$$
	
$$\text{P}(Z \geq 1.5) = 1 - \text{P}(Z \leq 1.5)  = 1 - .9332 = 0.0668$$


In [11]:
pnorm(140, 125, 10, lower.tail=FALSE)

6.68% of 15- to 44-year-olds are hypertensive.
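
Since parts (a) and (b) repeat the same calculation with different parameters, both can be handled at once because `pnorm()` is vectorized. The following is a minimal sketch; the data frame `sbp` and its column names are illustrative, not from the original exercise.

```r
# One row per age group, with the values given in the problem statement
sbp <- data.frame(
  age_group = c("1-14", "15-44"),
  mean      = c(105, 125),
  sd        = c(5, 10),
  cutoff    = c(115, 140)
)

# Z-score of each hypertension cutoff within its own age group
sbp$z <- (sbp$cutoff - sbp$mean) / sbp$sd

# Proportion above the cutoff (upper tail of the standard normal)
sbp$p_hypertensive <- pnorm(sbp$z, lower.tail = FALSE)

sbp
```

Laying the groups side by side makes it easy to see that the older group's cutoff sits fewer standard deviations above its mean (1.5 versus 2), which is why a larger share is classified as hypertensive.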