# Introduction to Confidence Intervals - Means

1)  Answer the following questions using R and the normal-distribution
    functions `pnorm` and `qnorm`.

a.  Verify that the interval $\mu \pm 2\sigma$ contains about 95% of the
    distribution. Do this for a few means and standard deviations.

    

b.  Use `qnorm` to find the value $z_{*}$ such that $\mu \pm z_{*}\sigma$ will contain 80% of the distribution.
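
As a rough illustration of how `pnorm` and `qnorm` relate, the sketch below uses a one-standard-deviation interval and a 90% level rather than the values asked about above. The mean and standard deviation are arbitrary choices, not values from this activity.

```r
# Arbitrary illustrative values, not from the worksheet
mu <- 100
sigma <- 15

# pnorm gives the area to the left of a value, so the area inside
# mu +/- 1 * sigma is the difference of two left-tail areas.
area_within_1sd <- pnorm(mu + sigma, mean = mu, sd = sigma) -
  pnorm(mu - sigma, mean = mu, sd = sigma)

# qnorm inverts pnorm: a central 90% leaves 5% in each tail,
# so the multiplier is the 0.95 quantile of the standard normal.
z_90 <- qnorm(0.95)

# Check: the interval mu +/- z_90 * sigma should contain 90% of the distribution
area_within_z90 <- pnorm(mu + z_90 * sigma, mean = mu, sd = sigma) -
  pnorm(mu - z_90 * sigma, mean = mu, sd = sigma)

print(c(area_within_1sd, z_90, area_within_z90))
```

Changing `mu` and `sigma` should leave all three printed values unchanged, since the areas depend only on the number of standard deviations from the mean.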

2)  A Gallup poll from 1999 asked people how much money they
    expected to spend on Christmas presents in 1999. The sample mean of
    the 922 responses was \$857.

a)  Is the variable *amount expected to be spent on Christmas presents
    in 1999* qualitative or quantitative?

b)  Is \$857 a parameter or a statistic?

c)  Identify (in words) the parameter of interest in this study. What
    symbol represents this parameter?

d)  Do you know the value of the parameter in this study? Is it more
    likely to be close to \$857 than far away from it?

3)  Recall that the standard deviation of the sample mean
    $\bar{x}$ is given by $\frac{\sigma}{\sqrt{n}}$,
    where $\sigma$ is the population standard deviation and $n$ the
    sample size. Suppose for these expected shopping expenditures that
    the population standard deviation is $\sigma = 205$.

a.  Calculate the standard deviation of the sample mean.

b.  Add and subtract two standard deviations of the sample mean to and
    from the observed sample mean to form a reasonable interval
    estimate for the population mean $\mu$.

4)  Next we will simulate many intervals of the form
    $\bar{x} \pm z_{*}\frac{\sigma}{\sqrt{n}}$.
    Use the applet at <http://www.rossmanchance.com/applets/ConfSim.html> with
    the following settings to explore many individual confidence
    intervals. (An approximate 95% confidence interval is given by $\bar{x} \pm 2\sigma/\sqrt{n}$.)

<img src="./img/image1.png" width="150"> 

a.  Which is variable, the interval or the population mean?

b.  Now generate 100 intervals and determine the number of the intervals
    that fail to capture $\mu$.

<img src="./img/image2.png"  width="150">

How many intervals captured $\mu$?

c.  We don’t really know the value of the population standard deviation
    $\sigma$. What would be a reasonable value to substitute for
    $\sigma$?

### The Standard Error

  The estimated standard deviation of the sample mean, called the **standard error**, is $SE_{\bar{x}} = \frac{s}{\sqrt{n}}$.
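
A minimal sketch of this calculation in R, assuming the raw observations are available in a numeric vector. The data values and the name `x` are invented for illustration only.

```r
# Hypothetical sample, invented for illustration only
x <- c(12.1, 9.8, 11.4, 10.6, 13.0, 10.2, 11.9, 9.5)

n <- length(x)      # sample size
s <- sd(x)          # sample standard deviation (uses the n - 1 divisor)
se <- s / sqrt(n)   # standard error of the sample mean

print(se)
```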

## Simulation – The effect of replacing $\sigma$ with $s$

We will simulate the effect using the applet at
<http://www.rossmanchance.com/applets/ConfSim.html>.
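
For readers who prefer to run the simulation in R, the sketch below approximates the same comparison. It is not a reproduction of the applet: the population (normal with `mu = 50`, `sigma = 10`), the sample size `n = 5`, the number of repetitions, and the seed are all assumptions chosen for illustration.

```r
set.seed(1)  # arbitrary seed so the run is reproducible

# Assumed settings (not from the worksheet): normal population, small sample
mu <- 50
sigma <- 10
n <- 5
reps <- 10000
conf <- 0.95

# Critical values for a central 95% interval
z_star <- qnorm(1 - (1 - conf) / 2)
t_star <- qt(1 - (1 - conf) / 2, df = n - 1)

# Each column of `samples` is one simulated sample of size n
samples <- matrix(rnorm(n * reps, mean = mu, sd = sigma), nrow = n)
xbar <- colMeans(samples)
s <- apply(samples, 2, sd)

# An interval xbar +/- margin captures mu exactly when |xbar - mu| <= margin
covers <- function(margin) mean(abs(xbar - mu) <= margin)

coverage <- c(
  z_with_sigma = covers(z_star * sigma / sqrt(n)),
  z_with_s     = covers(z_star * s / sqrt(n)),
  t_with_s     = covers(t_star * s / sqrt(n))
)
print(round(coverage, 3))
```

Each entry of `coverage` is the proportion of simulated intervals that captured $\mu$ for one method. Increasing `n` and rerunning shows how the comparison depends on the sample size.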

5) Complete the following table.

  |**Method**           | Percent of intervals that capture $\mu$  |
  |---------------------|:-----------------------------------------|
  |**z** with $\sigma$  | *percent here*                           |
  |**z** with $s$       | *percent here*                           |
  |**t** with $s$       | *percent here*                           |

6)  What was the effect of replacing $\sigma$ with $s$ on the rate at
    which the intervals captured $\mu$? Why is this a big deal?

## The $t$ distribution

  To correct for replacing $\sigma$ with $s$, we use a critical value from the **t-distribution** instead of the normal distribution. The t-distribution has a parameter called **degrees of freedom**, which is given by the following formula when estimating a single mean.
  
  $$d.f. = n - 1$$
  
  We can use the `qt` function in R to find the correct critical value for a specific sample size and confidence level.
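
  A short sketch of such a call, using an assumed sample size of 20 and a 95% confidence level (these values are illustrative and do not appear in the exercises below):

```r
# Assumed example values (not from the worksheet's table)
n <- 20
conf <- 0.95

df <- n - 1
# A central 95% interval leaves 2.5% in each tail, so use the 0.975 quantile
t_star <- qt(1 - (1 - conf) / 2, df = df)

# The corresponding normal critical value, for comparison
z_star <- qnorm(1 - (1 - conf) / 2)

print(c(t_star = t_star, z_star = z_star))
```

  Note that `qt` takes a left-tail probability, so the confidence level has to be converted to `1 - (1 - conf) / 2` first.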

7)  Did using a *t* critical value fix the problem with the effect of
    replacing $\sigma$ with $s$? Explain.

## Find the *t* critical value using the table

8)  Find the *degrees of freedom* and the *t critical value* for each
    sample size and confidence level listed in the table.

  |**Sample size**  |**Confidence Level**   |**d.f.**   |**t critical value**  |
  |-----------------|----------------------:|---------- |----------------------|
  |5                |90%                    |           |                      |
  |12               |80%                    |           |                      |
  |33               |99%                    |           |                      |
  |1201             |99.9%                  |           |                      |

## Putting it all together

  **Confidence Interval for a population mean** $\mathbf{\mu}$
  
  $$\bar{x} \pm t_{*}\frac{s}{\sqrt{n}}$$
  
  where $\bar{x}$ is the sample mean, $s$ is the sample standard deviation, $n$ is the sample size, and $t_{*}$ represents the critical value from the *t* distribution with $n - 1$ degrees of freedom for the desired confidence level.
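
  The formula can be sketched in R as a small helper that works from summary statistics. The function name `t_interval` and the numbers passed to it are hypothetical, chosen only to show the mechanics.

```r
# Hypothetical helper: t interval for a population mean from summary statistics
t_interval <- function(xbar, s, n, conf = 0.95) {
  t_star <- qt(1 - (1 - conf) / 2, df = n - 1)  # critical value, n - 1 d.f.
  margin <- t_star * s / sqrt(n)                # t_star times the standard error
  c(lower = xbar - margin, upper = xbar + margin)
}

# Invented summary statistics, for illustration only
ci <- t_interval(xbar = 20, s = 4, n = 16, conf = 0.90)
print(ci)
```

  When the raw data are available in a vector `x`, `t.test(x, conf.level = 0.90)$conf.int` computes an interval of the same form.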

9)  In a sample of 130 healthy adults, the mean body temperature was
    98.249 degrees and the sample standard deviation was 0.733
    degrees.

a.  Calculate a 95% confidence interval for the population mean body
    temperature, based on the sample results for these 130 healthy
    adults.

*Code here*

b.  Write a sentence interpreting this interval. Then write a
    separate sentence interpreting what the phrase “95% confidence”
    means in a statistical sense.

## Confidence Interval for a population mean $\mu$ in JMP
  
i.   Enter the data in a data table or open a JMP file.
ii.  Go to **Analyze > Distribution**.
iii. Add the CI to the 