## Chapter 05
# Normal Probability Distributions

Adapted from ["Elementary Statistics - Picturing the World" 6th edition](https://www.amazon.com/Elementary-Statistics-Picturing-World-6th/dp/0321911210/)

In [1]:
from notebook.services.config import ConfigManager
cm = ConfigManager()
cm.update('livereveal', {
        'scroll': True,
        'width': "100%",
        'height': "100%",
})

{'width': '100%', 'height': '100%', 'scroll': True}

## 5.1. <br/>Introduction to Normal Distributions and <br/>the Standard Normal Distribution

### Definition of a Normal Distribution

- A **normal distribution** is a continuous probability distribution for a random variable $x$. 
- The graph of a normal distribution is called the **normal curve**.

### Properties of a Normal Distribution

A normal distribution has these properties:

1. The mean, median, and mode are equal.
2. The normal curve is bell-shaped and is symmetric about the mean.
3. The total area under the normal curve is equal to 1.
4. The normal curve approaches, but never touches, the $x$-axis as it extends farther and farther away from the mean.
5. Between $\mu - \sigma$ and $\mu + \sigma$ (in the center of the curve), the graph curves downward.<br/> The graph curves upward to the left of $\mu - \sigma$ and to the right of $\mu + \sigma$.<br/> The points at which the curve changes from curving upward to curving downward are called **inflection points**.

### Properties of a Normal Distribution

![](./image/5_1_normal_distribution.png)

### Properties of a Normal Distribution
- A discrete probability distribution can be graphed with a histogram. 
- To graph a continuous probability distribution, you can use a probability density function (pdf).
- A probability density function has two requirements:
    1. the total area under the curve is equal to 1, and 
    2. the function can never be negative.
- Formula for pdf: $y = \frac{1}{\sigma \sqrt{2 \pi}} e^{-(x-\mu)^{2}/(2 \sigma^{2})}$
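
As a minimal sketch, the pdf formula above can be evaluated directly and compared with SciPy's `stats.norm.pdf`. The helper name `normal_pdf` and the parameter values are illustrative assumptions, not part of the textbook; the integral over $\mu \pm 10\sigma$ is a numerical check that the total area is (very nearly) $1$.

```python
import math
from scipy import integrate, stats

def normal_pdf(x, mu, sigma):
    """Evaluate y = 1 / (sigma * sqrt(2*pi)) * exp(-(x - mu)^2 / (2 * sigma^2))."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

mu, sigma = 0, 1  # illustrative parameters (standard normal)

by_hand = normal_pdf(0.5, mu, sigma)
by_scipy = stats.norm.pdf(0.5, loc=mu, scale=sigma)

# Numerically integrate the pdf over mu +/- 10 sigma to check the area requirement.
area, _ = integrate.quad(normal_pdf, mu - 10 * sigma, mu + 10 * sigma, args=(mu, sigma))

print(by_hand, by_scipy, area)
```
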

#### Mean and Standard Deviation (recap)

![](./image/5_1_mean_and_std.png)

### Properties of a Normal Distribution [example]

Understanding Mean and Standard Deviation

![](./image/5_1_example_understanding_mean_and_std.png)

1. Which normal curve has a greater mean?
2. Which normal curve has a greater standard deviation?


### Properties of a Normal Distribution [solution]

1. The line of symmetry of curve A occurs at $x = 15$.<br/>The line of symmetry of curve B occurs at $x = 12$.<br/>So, curve A has a greater mean.
2. Curve B is more spread out than curve A.<br/>So, curve B has a greater standard deviation.

### Properties of a Normal Distribution [example]

Interpreting Graphs of Normal Distributions

- The scaled test scores for the New York State Grade 8 Mathematics Test are normally distributed. 
- The normal curve shown below represents this distribution. 
- What is the mean test score? Estimate the standard deviation of this normal distribution.

![](./image/5_1_example_interpreting_graph_of_normal_distribution.png)

### Properties of a Normal Distribution [solution]

![](./image/5_1_solution_interpreting_graph_of_normal_distribution.png)

The scaled test scores for the New York State Grade $8$ Mathematics Test are normally distributed with a mean of about $675$ and a standard deviation of about $35$.

### The Standard Normal Distribution

- The normal distribution with a mean of $0$ and a standard deviation of $1$ is called the standard normal distribution. 
- The horizontal scale of the graph of the standard normal distribution corresponds to $z$-scores.
- $z = \frac{\text{value} - \text{mean}}{\text{standard deviation}} = \frac{x - \mu}{\sigma}$

![](./image/5_1_standard_normal_distribution.png)

It is important that you know the difference between $x$ and $z$.
The random variable $x$ is sometimes called a raw score and represents values in a nonstandard normal distribution, whereas $z$ represents values in the standard normal distribution.

### Properties of the Standard Normal Distribution

1. The cumulative area is close to $0$ for $z$-scores close to $z = -3.49$.
2. The cumulative area increases as the $z$-scores increase.
3. The cumulative area for $z = 0$ is $0.5000$.
4. The cumulative area is close to $1$ for $z$-scores close to $z = 3.49$.
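
These four properties can be checked numerically. The sketch below uses `stats.norm.cdf`, the same function used in the examples that follow; the list of $z$-scores is an illustrative choice.

```python
from scipy import stats

# z-scores spanning the range covered by the Standard Normal Table
z_scores = [-3.49, -1.0, 0.0, 1.0, 3.49]
areas = [stats.norm.cdf(z) for z in z_scores]

for z, area in zip(z_scores, areas):
    print(f'z = {z:5.2f}   cumulative area = {area:.4f}')
```
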

### Using the Standard Normal Table [example]

1. Find the cumulative area that corresponds to a $z$-score of $1.15$.
2. Find the cumulative area that corresponds to a $z$-score of $-0.24$.

### Using the Standard Normal Table [solution]

- Cumulative area that corresponds to a $z$-score of $1.15$

![](./image/5_1_example_table_z_1_15.png)
![](./image/5_1_example_graph_z_1_15.png)

- Using [$z$-score calculator](https://www.calculator.net/z-score-calculator.html)
- Using SciPy

In [2]:
from scipy import stats
stats.norm.cdf(1.15)

0.8749280643628496

### Using the Standard Normal Table [solution]

- Cumulative area that corresponds to a $z$-score of $-0.24$

![](./image/5_1_example_table_z_0_24.png)
![](./image/5_1_example_graph_z_0_24.png)

- Using SciPy

In [3]:
from scipy import stats
stats.norm.cdf(-0.24)

0.40516512830220414

### Finding Areas Under the Standard Normal Curve [guidelines]

To find the area to the left of $z$, find the area that corresponds to $z$ in the Standard Normal Table.

![](./image/5_1_graph_left_of_z.png)

### Finding Areas Under the Standard Normal Curve [guidelines]

To find the area to the right of $z$, use the Standard Normal Table to find the area that corresponds to $z$.<br/>
Then subtract the area from $1$.
    
![](./image/5_1_graph_right_of_z.png)

### Finding Areas Under the Standard Normal Curve [guidelines]

To find the area between two $z$-scores, find the area corresponding to each $z$-score in the Standard Normal Table. <br/>
Then subtract the smaller area from the larger area.

![](./image/5_1_graph_between_two_z.png)
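
The three guidelines can be sketched as small helper functions. The function names are hypothetical, chosen only for this illustration; `stats.norm.sf` is SciPy's survival function, which returns the same area as `1 - stats.norm.cdf(z)`.

```python
from scipy import stats

def area_left(z):
    """Cumulative area to the left of z."""
    return stats.norm.cdf(z)

def area_right(z):
    """Area to the right of z, i.e. 1 minus the cumulative area."""
    return stats.norm.sf(z)

def area_between(z_low, z_high):
    """Area between two z-scores: larger cumulative area minus the smaller."""
    return stats.norm.cdf(z_high) - stats.norm.cdf(z_low)

# Illustrative z-scores
print(area_left(-1.0), area_right(1.0), area_between(-1.0, 1.0))
```
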

### Finding Area Under the Standard Normal Curve [example]

Find the area under the standard normal curve to the left of $z = -0.99$.

### Finding Area Under the Standard Normal Curve [solution]

The area under the standard normal curve to the left of $z = -0.99$ is shown.

![](./image/5_1_area_under_standard_normal_curve.png)

From the Standard Normal Table, this area is equal to $0.1611$.

In [4]:
from scipy import stats
stats.norm.cdf(-0.99)

0.1610870595108309

### Finding Area Under the Standard Normal Curve [example]

Find the area under the standard normal curve to the right of $z = 1.06$.

### Finding Area Under the Standard Normal Curve [solution]

The area under the standard normal curve to the right of $z = 1.06$ is shown.

![](./image/5_1_area_under_standard_normal_curve_right.png)

From the Standard Normal Table, the area to the left of $z = 1.06$ is $0.8554$. <br/>
$Area = 1 - 0.8554 = 0.1446$

In [5]:
from scipy import stats
1 - stats.norm.cdf(1.06)

0.1445722996639096

### Finding Area Under the Standard Normal Curve [example]

Find the area under the standard normal curve between $z = -1.5$ and $z = 1.25$.

### Finding Area Under the Standard Normal Curve [solution]

The area under the standard normal curve between $z = -1.5$ and $z = 1.25$ is shown.

![](./image/5_1_area_under_standard_normal_curve_between.png)

From the Standard Normal Table, the area to the left of $z = 1.25$ is $0.8944$ and the area to the left of $z = -1.5$ is $0.0668$. <br/>
$Area = 0.8944 - 0.0668 = 0.8276$

In [6]:
from scipy import stats
stats.norm.cdf(1.25) - stats.norm.cdf(-1.5)

0.8275430250642866

## 5.2. <br/>Normal Distributions: Finding Probabilities

### Probability and Normal Distributions

- When a random variable $x$ is normally distributed, we can find the probability that $x$ will lie in an interval by calculating the area under the normal curve for the interval. 
- To find the area under any normal curve, first convert the upper and lower bounds of the interval to $z$-scores. 
- Then use the standard normal distribution to find the area.

### Finding Probabilities for Normal Distributions [example]

- A survey indicates that people keep their cell phone an average of $1.5$ years before buying a new one. 
- The standard deviation is $0.25$ year. 
- A cell phone user is selected at random. 
- Find the probability that the user will keep his or her current phone for less than $1$ year before buying a new one. 
- Assume that the lengths of time people keep their phone are normally distributed and are represented by the variable $x$.

### Finding Probabilities for Normal Distributions [solution]

![](./image/5_2_example_graph_age_of_cellphone.png)

- The figure shows a normal curve with $\mu = 1.5$, $\sigma = 0.25$, and the shaded area for $x$ less than $1$.
- The $z$-score that corresponds to $1$ year is $z = \frac{x - \mu}{\sigma} = \frac{1 - 1.5}{0.25} = -2$
- The Standard Normal Table shows that $P(z < -2) = 0.0228$. 
- The probability that the user will keep his or her phone for less than 1 year before buying a new one is 0.0228.
- Using SciPy

In [7]:
from scipy import stats
stats.norm.cdf(-2)

0.022750131948179195

#### Interpretation  

- $2.28\%$ of cell phone users will keep their phone for less than $1$ year before buying a new one. 
- Because $2.28\%$ is less than $5\%$, this is an **unusual event**.
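
As an alternative sketch, SciPy can also work with the raw score directly: passing `loc` (the mean) and `scale` (the standard deviation) to `stats.norm.cdf` should give the same area as converting to a $z$-score first. The variable names are illustrative.

```python
from scipy import stats

mu, sigma = 1.5, 0.25  # years, taken from the cell phone example

# Route 1: let SciPy standardize internally
p_direct = stats.norm.cdf(1, loc=mu, scale=sigma)

# Route 2: convert to a z-score by hand, then use the standard normal
p_via_z = stats.norm.cdf((1 - mu) / sigma)

print(p_direct, p_via_z)
```
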

### Finding Probabilities for Normal Distributions [example]

- A survey indicates that for each trip to a supermarket, a shopper spends an average of $45$ minutes with a standard deviation of $12$ minutes in the store. 
- The lengths of time spent in the store are normally distributed and are represented by the variable $x$. 
- A shopper enters the store:
    1. Find the probability that the shopper will be in the store for each interval of time listed below. 
    2. Interpret your answer when 200 shoppers enter the store. How many shoppers would you expect to be in the store for each interval of time listed below?
- Interval of interest:
    - Q1: Between 24 and 54 minutes    
    - Q2: More than 39 minutes

### Finding Probabilities for Normal Distributions [solution]
##### Q1: Between 24 and 54 minutes

![](./image/5_2_example_supermarket_shopper_a.png)

- The figure shows a normal curve with $\mu = 45$ minutes and $\sigma = 12$ minutes. 
- The area for $x$ between $24$ and $54$ minutes is shaded.
- The $z$-scores that correspond to $24$ minutes and to $54$ minutes are:
    - $z_{1} = \frac{24 - 45}{12} = -1.75$
    - $z_{2} = \frac{54 - 45}{12} = 0.75$
- The probability that a shopper will be in the store between $24$ and $54$ minutes is:

$$ 
\begin{aligned}
P(24 < x < 54)  &= P(-1.75 < z < 0.75) \\
                &= P(z < 0.75) - P(z < -1.75) \\
                &= 0.7734 - 0.0401 = 0.7333
\end{aligned}
$$

- Using SciPy

In [8]:
from scipy import stats
stats.norm.cdf(0.75) - stats.norm.cdf(-1.75)

0.7333134907593146

#### Interpretation

When $200$ shoppers enter the store, you would expect $200 \times 0.7333 = 146.66$, or about $147$, shoppers to be in the store between $24$ and $54$ minutes.

### Finding Probabilities for Normal Distributions [solution]

##### Q2: More than 39 minutes

![](./image/5_2_example_supermarket_shopper_b.png)

- The figure shows a normal curve with $\mu = 45$ minutes and $\sigma = 12$ minutes. 
- The area for $x$ greater than $39$ minutes is shaded. 
- The $z$-score that corresponds to $39$ minutes is: $z = \frac{39 - 45}{12} = -0.5$
- The probability that a shopper will be in the store more than $39$ minutes is:

$$ 
\begin{aligned}
P(x > 39) &= P(z > -0.5) \\
          &= 1 - P(z < -0.5) \\
          &= 1 - 0.3085 = 0.6915
\end{aligned}
$$

- Using SciPy

In [9]:
from scipy import stats
1 - stats.norm.cdf(-0.5)

0.6914624612740131

#### Interpretation

When $200$ shoppers enter the store, you would expect $200 \times 0.6915 = 138.3$, or about $138$, shoppers to be in the store more than $39$ minutes.
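
Both supermarket questions can be reproduced in one place. This is a sketch using a "frozen" distribution object, `stats.norm(loc=45, scale=12)`, so that no manual conversion to $z$-scores is needed; the variable names are illustrative.

```python
from scipy import stats

shopping_time = stats.norm(loc=45, scale=12)  # minutes, from the example
n_shoppers = 200

# Q1: between 24 and 54 minutes
p_between = shopping_time.cdf(54) - shopping_time.cdf(24)

# Q2: more than 39 minutes (sf is the area to the right)
p_more = shopping_time.sf(39)

print(f'Q1: p = {p_between:.4f}, expected shoppers = {n_shoppers * p_between:.1f}')
print(f'Q2: p = {p_more:.4f}, expected shoppers = {n_shoppers * p_more:.1f}')
```
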

## 5.3. <br/>Normal Distributions: Finding Values

### Finding $z$-Scores

- In Section 5.2, we were given a normally distributed random variable $x$ and found the probability that $x$ would lie in an interval by calculating the area under the normal curve for that interval.
- But what if you are given a probability and want to find a value?

### Finding a $z$-Score Given an Area [example]

1. Find the $z$-score that corresponds to a cumulative area of $0.3632$.
2. Find the $z$-score that has $10.75\%$ of the distribution’s area to its right.

### Finding a $z$-Score Given an Area using the Standard Normal Table [solution]

Q1. Find the $z$-score that corresponds to a cumulative area of $0.3632$.


![](./image/5_3_example_table_given_an_area_a.png)

- Find the $z$-score that corresponds to an area of $0.3632$ by locating $0.3632$ in the Standard Normal Table. 
- The values at the beginning of the corresponding row and at the top of the corresponding column give the $z$-score. 
- For this area, the row value is $-0.3$ and the column value is $0.05$. 
- The $z$-score is $-0.35$, as shown in the figure.

![](./image/5_3_example_graph_given_an_area_a.png)

- Using [$z$-score calculator](https://www.calculator.net/z-score-calculator.html)
- Using SciPy

In [10]:
from scipy import stats
stats.norm.ppf(0.3632)

-0.34991831705262144

### Finding a $z$-Score Given an Area [solution]

Q2. Find the $z$-score that has $10.75\%$ of the distribution’s area to its right.

![](./image/5_3_example_table_given_an_area_b.png)

- Because the area to the right is $10.75\%$, the cumulative area is $1 - 0.1075 = 0.8925$. 
- Find the $z$-score that corresponds to an area of $0.8925$ by locating $0.8925$ in the Standard Normal Table. 
- For this area, the row value is $1.2$ and the column value is $0.04$. 
- The $z$-score is $1.24$, as shown in the figure.

![](./image/5_3_example_graph_given_an_area_b.png)

- [$z$-score calculator](https://www.calculator.net/z-score-calculator.html)
- Using SciPy

In [11]:
from scipy import stats
p = 1 - (10.75/100)
stats.norm.ppf(p)

1.2399334778907378

### Finding a $z$-Score Given a Percentile [example]

Find the $z$-score that corresponds to each percentile:

1. $P_{5}$         
2. $P_{50}$         
3. $P_{90}$

### Finding a $z$-Score Given a Percentile [solution]
Q1. $P_{5}$         


- [$z$-score calculator](https://www.calculator.net/z-score-calculator.html)
- To find the $z$-score that corresponds to $P_{5}$, find the $z$-score that corresponds to an area of $0.05$ (see the figure) by locating $0.05$ in the Standard Normal Table. 
- The areas closest to $0.05$ in the table are $0.0495$ $(z = -1.65)$ and $0.0505$ $(z = -1.64)$. 
- Because $0.05$ is halfway between the two areas in the table, use the $z$-score that is halfway between $-1.64$ and $-1.65$. 
- The $z$-score that corresponds to an area of $0.05$ is $-1.645$.

![](./image/5_3_example_graph_z_score_percentile_a.png)

- Using SciPy

In [12]:
from scipy import stats
stats.norm.ppf(5/100)

-1.6448536269514729

### Finding a $z$-Score Given a Percentile [solution]

Q2. $P_{50}$         


- [$z$-score calculator](https://www.calculator.net/z-score-calculator.html)
- To find the $z$-score that corresponds to $P_{50}$, find the $z$-score that corresponds to an area of $0.5$ (see the figure) by locating $0.5$ in the Standard Normal Table. 
- The area closest to $0.5$ in the table is $0.5000$, so the $z$-score that corresponds to an area of $0.5$ is $0$.

![](./image/5_3_example_graph_z_score_percentile_b.png)

- Using SciPy

In [13]:
from scipy import stats
stats.norm.ppf(50/100)

0.0

### Finding a $z$-Score Given a Percentile [solution]

Q3. $P_{90}$

- [$z$-score calculator](https://www.calculator.net/z-score-calculator.html)
- To find the $z$-score that corresponds to $P_{90}$, find the $z$-score that corresponds to an area of $0.9$ (see the figure) by locating $0.9$ in the Standard Normal Table. 
- The area closest to $0.9$ in the table is $0.8997$.
- The $z$-score that corresponds to an area of $0.9$ is about $1.28$.

![](./image/5_3_example_graph_z_score_percentile_c.png)

- Using SciPy

In [14]:
from scipy import stats
stats.norm.ppf(90/100)

1.2815515655446004

### Transforming a $z$-Score to an $x$-Value

$$ 
\begin{aligned}
z &= \frac{x - \mu}{\sigma} \\
z \sigma &= x - \mu \\
\mu + z \sigma &= x \\
x &= \mu + z \sigma
\end{aligned}
$$

### Finding an $x$-Value Corresponding to a $z$-Score [example]

- A veterinarian records the weights of cats treated at a clinic. 
- The weights are normally distributed, with a mean of $9$ pounds and a standard deviation of $2$ pounds. 
- Find the weights $x$ corresponding to $z$-scores of $1.96$, $-0.44$, and $0$.
- Interpret your results.

### Finding an $x$-Value Corresponding to a $z$-Score [solution]

- The $x$-value that corresponds to each standard $z$-score is calculated using the formula $x = \mu + z \sigma$. 
- Note that $\mu = 9$ and $\sigma = 2$.

$$ 
\begin{aligned}
z = 1.96 &: x = 9 + 1.96 \times 2 = 12.92 \\
z = -0.44 &: x = 9 + (-0.44) \times 2 = 8.12 \\
z = 0 &: x = 9 + 0 \times 2 = 9 \\
\end{aligned}
$$

#### Interpretation  

You can see that $12.92$ pounds is above the mean, $8.12$ pounds is below the mean, and $9$ pounds is equal to the mean.
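
The cat-weight calculation can be sketched in a few lines of Python. The round trip back to $z$-scores is an added sanity check, not part of the textbook solution.

```python
mu, sigma = 9, 2  # pounds, from the example
z_scores = [1.96, -0.44, 0]

# x = mu + z * sigma
weights = [mu + z * sigma for z in z_scores]

# Converting back with z = (x - mu) / sigma should recover the original z-scores
recovered = [(x - mu) / sigma for x in weights]

for z, x in zip(z_scores, weights):
    print(f'z = {z:5.2f}  ->  x = {x:.2f} pounds')
```
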

### Finding a Specific Data Value for a Given Probability

We can also use the normal distribution to find a specific data value ($x$-value) for a given probability.

### Finding a Specific Data Value [example]

- Scores for the California Peace Officer Standards and Training test are normally distributed, with a mean of $50$ and a standard deviation of $10$. 
- An agency will only hire applicants with scores in the top $10\%$. 
- What is the lowest score an applicant can earn and still be eligible to be hired by the agency?

### Finding a Specific Data Value [solution]

- Exam scores in the top $10\%$ correspond to the shaded region shown.

![](./image/5_3_california_peace_officer_score_graph.png)

- A test score in the top $10\%$ is any score above the $90^{th}$ percentile. 
- To find the score that represents the $90^{th}$ percentile, you must first find the $z$-score that corresponds to a cumulative area of $0.9$. 
- In the Standard Normal Table, the area closest to $0.9$ is $0.8997$.
- The $z$-score that corresponds to an area of $0.9$ is $z = 1.28$.
- [$z$-score calculator](https://www.calculator.net/z-score-calculator.html)
- To find the $x$-value, note that $\mu = 50$ and $\sigma = 10$, and use the formula $x = \mu + z \sigma = 50 + 1.28 \times 10 = 62.8$
- Using SciPy

In [15]:
from scipy import stats
mu = 50
std = 10
z = stats.norm.ppf(90/100)
x = mu + z * std
print(f'z-score: {z} \t x: {x}')

z-score: 1.2815515655446004 	 x: 62.815515655446006


#### Interpretation  

The lowest score an applicant can earn and still be eligible to be hired by the agency is about $63$.
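
As a shorter sketch, the cutoff can also be obtained in one call by passing `loc` and `scale` to `stats.norm.ppf`, or by asking for the upper-tail area directly with `stats.norm.isf` (the inverse survival function). Both should agree with the two-step calculation above.

```python
from scipy import stats

mu, sigma = 50, 10  # from the example

# 90th percentile of the score distribution
cutoff_ppf = stats.norm.ppf(0.90, loc=mu, scale=sigma)

# Same cutoff, expressed as "the score with 10% of the area to its right"
cutoff_isf = stats.norm.isf(0.10, loc=mu, scale=sigma)

print(cutoff_ppf, cutoff_isf)
```
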

### Finding a Specific Data Value [example]

- In a randomly selected sample of women ages $20 – 34$, the mean total cholesterol level is $181$ milligrams per deciliter with a standard deviation of $37.6$ milligrams per deciliter. 
- Assume the total cholesterol levels are normally distributed.
- Find the highest total cholesterol level a woman in this $20 – 34$ age group can have and still be in the bottom $1\%$.

### Finding a Specific Data Value [solution]

- Total cholesterol levels in the lowest $1\%$ correspond to the shaded region shown.

![](./image/5_3_cholesterol_levels_in_women_graph.png)

- A total cholesterol level in the lowest $1\%$ is any level below the $1^{st}$ percentile.
- To find the level that represents the $1^{st}$ percentile, we must first find the $z$-score that corresponds to a cumulative area of $0.01$. 
- In the Standard Normal Table, the area closest to $0.01$ is $0.0099$. 
- The $z$-score that corresponds to an area of $0.01$ is $z = -2.33$. 
- To find the $x$-value, note that $\mu = 181$ and $\sigma = 37.6$, and use the formula $x = \mu + z \times \sigma = 181 + (-2.33) \times 37.6 \approx 93.39$.
- Using SciPy

In [16]:
from scipy import stats
mu = 181
std = 37.6
z = stats.norm.ppf(1/100)
x = mu + z * std
print(f'z-score: {z} \t x: {x}')

z-score: -2.3263478740408408 	 x: 93.52931993606438


#### Interpretation  

The value that separates the lowest $1\%$ of total cholesterol levels for women in the $20–34$ age group from the highest $99\%$ is about $93$ milligrams per deciliter.

## 5.4. <br/>Sampling Distributions and the Central Limit Theorem

### Sampling Distributions

- In this section, you will study the relationship between a population mean and the means of samples taken from the population.
- A **sampling distribution** is the probability distribution of a sample statistic that is formed when samples of size $n$ are repeatedly taken from a population.
- If the **sample statistic is the sample mean**, then the distribution is the sampling distribution of sample means. 
- Every sample statistic has a sampling distribution.

### Sampling Distributions


![](./image/5_4_sample_population_venn_diagram.png)

- The rectangle represents a large population, and each circle represents a sample of size $n$. 
- Because the sample entries can differ, the sample means can also differ. 
- The mean of Sample $1$ is $\bar{x}_{1}$; the mean of Sample $2$ is $\bar{x}_{2}$; and so on. 
- The sampling distribution of the sample means for samples of size $n$ for this population consists of $\bar{x}_{1}, \bar{x}_{2}, \bar{x}_{3},$ and so on. 
- If the samples are drawn with replacement, then an infinite number of samples can be drawn from the population.

### Properties of Sampling Distributions of Sample Means

1. The mean of the sample means $\mu_{\bar{x}}$ is equal to the population mean $\mu$.<br/>
$\mu_{\bar{x}} = \mu$
2. The standard deviation of the sample means $\sigma_{\bar{x}}$ is equal to the population standard deviation $\sigma$ divided by the square root of the sample size $n$. <br/>
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$  <br/><br/>

The standard deviation of the sampling distribution of the sample means is called the **standard error of the mean**.

### A Sampling Distribution of Sample Means [example]

- You write the population values $\{1, 3, 5, 7\}$ on slips of paper and put them in a box. 
- Then you randomly choose two slips of paper, with replacement. 
- List all possible samples of size $n = 2$ and calculate the mean of each. 
- These means form the sampling distribution of the sample means. 
- Find the mean, variance, and standard deviation of the sample means. 
- Compare your results with the mean $\mu = 4$, variance $\sigma^{2} = 5$, and standard deviation $\sigma = \sqrt{5} \approx 2.236$ of the population.

### A Sampling Distribution of Sample Means [solution]

- List all $16$ samples of size $2$ from the population and the mean of each sample.

![](./image/5_4_sample_means_example_table.png)

![](./image/5_4_probability_distribution_of_sample_means_table.png)

- After constructing a probability distribution of the sample means, we can graph the sampling distribution using a probability histogram.
- Notice that the histogram is bell-shaped and symmetric, similar to a normal curve.

![](./image/5_4_probability_histogram_of_sampling_distribution.png)

The mean, variance, and standard deviation of the $16$ sample means are:

- $\mu_{\bar{x}} = 4$
- $(\sigma_{\bar{x}})^{2} = \frac{5}{2} = 2.5$
- $\sigma_{\bar{x}} = \sqrt{2.5} \approx 1.581$

These results satisfy the properties of sampling distributions because:

- $\mu_{\bar{x}} = \mu = 4$
- $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{\sqrt{5}}{\sqrt{2}} \approx 1.581$
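
The enumeration can be sketched in Python, assuming the population is $\{1, 3, 5, 7\}$ (the values consistent with $\mu = 4$ and $\sigma^{2} = 5$). `itertools.product` lists every ordered sample drawn with replacement; NumPy's `var` and `std` default to the population formulas (`ddof=0`), which is what is wanted here because all possible samples are listed.

```python
from itertools import product
import numpy as np

population = [1, 3, 5, 7]  # assumed population values
n = 2

# All ordered samples of size n drawn with replacement (4^2 = 16 of them)
samples = list(product(population, repeat=n))
sample_means = np.array([np.mean(s) for s in samples])

mean_of_means = sample_means.mean()
var_of_means = sample_means.var()   # ddof=0: population variance
std_of_means = sample_means.std()

print(len(samples), mean_of_means, var_of_means, std_of_means)

# Compare with the population parameters
print(np.mean(population), np.var(population) / n, np.std(population) / np.sqrt(n))
```
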


### The Central Limit Theorem

- The Central Limit Theorem forms the foundation for the inferential branch of statistics. 
- This theorem describes the relationship between the sampling distribution of sample means and the population that the samples are taken from. 
- The Central Limit Theorem is an important tool that provides the information you will need to use sample statistics to make inferences about a population mean.

### The Central Limit Theorem

1. If samples of size $n$, where $n \ge 30$, are drawn from any population with a mean $\mu$ and a standard deviation $\sigma$, then the sampling distribution of sample means approximates a normal distribution. The greater the sample size, the better the approximation.
2. If the population itself is normally distributed, then the sampling distribution of sample means is normally distributed for any sample size $n$. <br/><br/>

In either case, the sampling distribution of sample means has a mean equal to the population mean.<br/>
$\mu_{\bar{x}} = \mu$ <br/>

and a standard deviation equal to the population standard deviation divided by the square root of $n$. <br/>
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
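
A small simulation can illustrate the theorem. This is only a sketch under stated assumptions: the population is taken to be exponential with mean $2$ and standard deviation $2$ (a strongly skewed, non-normal choice made purely for illustration), and the seed, sample size, and number of samples are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed so the run is repeatable

mu, sigma = 2.0, 2.0      # an exponential distribution with scale 2 has mean 2 and std 2
n = 36                    # sample size (n >= 30)
num_samples = 10_000      # number of samples drawn

# Each row is one sample of size n; take the mean of every row
samples = rng.exponential(scale=2.0, size=(num_samples, n))
sample_means = samples.mean(axis=1)

print('mean of sample means:', sample_means.mean(), ' theory:', mu)
print('std of sample means: ', sample_means.std(), ' theory:', sigma / np.sqrt(n))
```

A histogram of `sample_means` should look roughly bell-shaped even though the population itself is skewed.
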

### The Central Limit Theorem

- The distribution of sample means has the same mean as the population. 
- But its standard deviation is less than the standard deviation of the population. 
- This tells you that the distribution of sample means has the same center as the population, but it is not as spread out.
- The distribution of sample means becomes less and less spread out (tighter concentration about the mean) as the sample size $n$ increases.

![](./image/5_4_distribution_of_sample_means.png)

### Interpreting the Central Limit Theorem [example]

- Cell phone bills for residents of a city have a mean of $\$47$ and a standard deviation of $\$9$, as shown in the figure. 
- Random samples of $100$ cell phone bills are drawn from this population, and the mean of each sample is determined.
- Find the mean and standard deviation of the sampling distribution of sample means. 
- Then sketch a graph of the sampling distribution.

![](./image/5_4_distribution_of_all_cell_phone_bills.png)

### Interpreting the Central Limit Theorem [solution]

- The mean of the sampling distribution is equal to the population mean, <br/>
**Mean of the sample means**: $\mu_{\bar{x}} = \mu = 47$

- and the standard deviation of the sample means is equal to the population standard deviation divided by $\sqrt{n}$. <br/>
**Standard deviation of the sample means**: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{9}{\sqrt{100}} = 0.9$

#### Interpretation

Because the sample size is greater than $30$, the sampling distribution can be approximated by a normal distribution with a mean of $\$47$ and a standard deviation of $\$0.90$, as shown in the figure.

![](./image/5_4_distribution_of_sample_means_example.png)

### Interpreting the Central Limit Theorem [example]

- Assume the training heart rates of all $20$-year-old athletes are normally distributed, with a mean of $135$ beats per minute and a standard deviation of $18$ beats per minute, as shown in the figure. 
- Random samples of size $4$ are drawn from this population, and the mean of each sample is determined. 
- Find the mean and standard deviation of the sampling distribution of sample means.
- Then sketch a graph of the sampling distribution.

![](./image/5_4_distribution_of_population_training_heart_rates.png)

### Interpreting the Central Limit Theorem [solution]

- **Mean of the sample means**: $\mu_{\bar{x}} = \mu = 135$ beats per minute
- **Standard deviation of the sample means**: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{18}{\sqrt{4}} = 9$ beats per minute

#### Interpretation  

From the Central Limit Theorem, because the population is normally distributed, the sampling distribution of the sample means is also normally distributed, as shown in the figure.

![](./image/5_4_distribution_of_sample_means_example_n_4.png)
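
The two Central Limit Theorem examples above follow the same arithmetic, which can be sketched with a small helper. The function name `sampling_distribution` is hypothetical, introduced only for this illustration.

```python
import math

def sampling_distribution(mu, sigma, n):
    """Return (mean, standard error) of the sampling distribution of sample means."""
    return mu, sigma / math.sqrt(n)

# Cell phone bills: mu = 47, sigma = 9, n = 100
bills_mean, bills_se = sampling_distribution(47, 9, 100)

# Training heart rates: mu = 135, sigma = 18, n = 4
heart_mean, heart_se = sampling_distribution(135, 18, 4)

print(bills_mean, bills_se)
print(heart_mean, heart_se)
```
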

### Probability and the Central Limit Theorem

- We can find the probability that a sample mean $\bar{x}$ will lie in a given interval of the $\bar{x}$ sampling distribution.
- To transform $\bar{x}$ to a $z$-score, you can use the formula: <br/>
$z = \frac{ \bar{x} - \mu_{\bar{x}} }{ \sigma_{\bar{x}} } = \frac{ \bar{x} - \mu }{ \sigma / \sqrt{n} }$
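
The formula can be sketched as a function. The parameters $\mu = 47$, $\sigma = 9$, and $n = 100$ are reused from the cell phone bill example; the sample mean $\bar{x} = 48$ is a hypothetical value chosen only to show the calculation.

```python
import math
from scipy import stats

def z_for_sample_mean(x_bar, mu, sigma, n):
    """z = (x_bar - mu) / (sigma / sqrt(n))"""
    return (x_bar - mu) / (sigma / math.sqrt(n))

# x_bar = 48 is a hypothetical sample mean, not taken from the textbook
z = z_for_sample_mean(48, mu=47, sigma=9, n=100)

# Probability that a sample mean falls below x_bar
p_below = stats.norm.cdf(z)

print(z, p_below)
```
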

### Finding Probabilities for Sampling Distributions [example]