# Confidence Intervals and Classical Hypothesis Testing: Mean
*Curtis Miller*

Now we look at inference regarding the mean of a population. The **mean** is the average value, and in this context refers to the mean of quantitative variables.

## Confidence Interval for the Mean

You are employed by a company that fabricates chips and other electronic components. The company wants you to investigate the resistors it uses in producing its components. In particular, while the resistors used by the company are labeled with a particular resistance, the company wants to ensure the manufacturer of the resistors produces quality products. Your task is to test a sample of the resistors and verify that the observed resistance agrees with the labeled resistance.

You test the resistance (in $\text{k}\Omega$) of some resistors labeled $1\text{k}\Omega$ and obtain the following dataset (stored in a NumPy array).

In [None]:
import numpy as np

In [None]:
res = np.array([ 0.984,  0.988,  0.984,  0.987,  0.976,  0.997,  0.993,  0.985,
                 1.002,  0.987,  1.005,  0.993,  0.987,  0.992,  0.976,  0.998,
                 1.011,  0.971,  0.981,  1.008,  0.963,  0.992,  0.995,  0.99 ,
                 0.996,  0.99 ,  0.985,  0.997,  0.983,  0.981,  0.988,  0.991,
                 0.971,  0.982,  0.979,  1.008,  1.006,  1.006,  1.001,  0.999,
                 0.98 ,  0.996,  0.979,  1.009,  0.99 ,  0.996,  1.001,  0.981,
                 0.99 ,  0.987,  0.97 ,  0.992,  0.982,  0.983,  0.974,  0.999,
                 0.987,  1.002,  0.971,  0.982,  0.989,  0.985,  1.014,  0.991,
                 0.984,  0.992,  1.003,  0.985,  0.987,  0.985,  1.   ,  0.978,
                 0.99 ,  0.99 ,  0.985,  0.983,  0.981,  0.993,  0.993,  0.973,
                 1.   ,  0.982,  0.987,  0.988,  0.982,  0.978,  0.989,  1.   ,
                 0.983,  1.008,  0.997,  0.974,  0.988,  1.002,  0.988,  0.994,
                 0.991,  1.   ,  0.976,  0.987,  0.991,  1.010,  0.999,  1.002])
res.mean()

You now want to construct a confidence interval for the true resistance of the resistors.

You believe it's safe to assume that the data follows a Normal distribution; in that case, the confidence interval for the mean resistance is given by:

$$\bar{x} \pm t_{n - 1, 1 - \frac{\alpha}{2}} \frac{s}{\sqrt{n}}$$

where $\bar{x}$ is the sample mean, $s$ is the sample standard deviation, $n$ is the sample size, $\alpha$ is one minus the confidence level, and $t_{\nu, p}$ is the $p$th quantile of the [$t$ distribution](https://en.wikipedia.org/wiki/Student%27s_t-distribution) with $\nu$ degrees of freedom.

We can use functions from **statsmodels** to compute this interval.

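*(WARNING: The following function is a private **statsmodels** helper, as its leading underscore indicates. Its interface is NOT stable and may change between releases!)*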

In [None]:
from statsmodels.stats.weightstats import _tconfint_generic    # Computes confidence intervals

In [None]:
_tconfint_generic(mean=res.mean(),    # The mean of the data
                  std_mean=res.std(ddof=1)/np.sqrt(len(res)),    # The standard error of the mean (s/sqrt(n)); ddof=1 gives the SAMPLE standard deviation
                  dof=len(res) - 1,    # The degrees of freedom (n - 1)
                  alpha=(1 - 0.95),    # 1 minus the confidence level
                  alternative="two-sided")

Notice that 1 is *not* in the confidence interval. This leads you to suspect that the resistors the supplier produces are not being properly manufactured.
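As a cross-check that avoids the private helper, the interval formula above can be evaluated directly with the public `scipy.stats.t` distribution. This is a minimal sketch rather than the method used in the cells above. To keep it short and self-contained it uses only the first eight readings from `res` (the name `sample` is introduced here for illustration), so its numbers differ from the full-data interval.

```python
import numpy as np
from scipy import stats

# First eight readings from res, copied here so the block runs on its own
sample = np.array([0.984, 0.988, 0.984, 0.987, 0.976, 0.997, 0.993, 0.985])

n = len(sample)
xbar = sample.mean()
s = sample.std(ddof=1)    # Sample standard deviation (divides by n - 1)
se = s / np.sqrt(n)       # Standard error of the mean
alpha = 1 - 0.95          # One minus the confidence level

# Quantile t_{n-1, 1-alpha/2} of the t distribution
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)
lower, upper = xbar - t_crit * se, xbar + t_crit * se
print(lower, upper)

# The same interval from SciPy's built-in helper, for comparison
ref_lower, ref_upper = stats.t.interval(0.95, n - 1, loc=xbar, scale=se)
print(ref_lower, ref_upper)
```

Note the `ddof=1` argument: NumPy's `std()` divides by $n$ by default, while the formula calls for the sample standard deviation, which divides by $n - 1$.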

## Hypothesis Testing for the Mean

The confidence interval you computed suggests that the resistors' resistance level does not agree with the label. You now want to perform a hypothesis test to confirm your suspicion. In particular, you believe that the resistors have less resistance than specified.

You will be testing the hypotheses:

$$H_0: \mu = 1\text{k}\Omega$$
$$H_A: \mu < 1\text{k}\Omega$$

Since you are assuming that the resistance is Normally distributed, you use the test statistic:

$$t = \frac{\bar{x} - 1}{\frac{s}{\sqrt{n}}}$$

to determine if you should reject $H_0$ or not.

The function `_tstat_generic()` can perform such a test and yield a $p$-value.

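*(WARNING: The following function is a private **statsmodels** helper, as its leading underscore indicates. Its interface is NOT stable and may change between releases!)*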

In [None]:
from statsmodels.stats.weightstats import _tstat_generic

In [None]:
_tstat_generic(value1=res.mean(),    # The mean of the dataset
               value2=0,    # No second sample; the statistic is (value1 - value2 - diff)/std_diff
               diff=1,    # The mean under the null hypothesis
               std_diff=res.std(ddof=1)/np.sqrt(len(res)),    # The standard error of the mean (s/sqrt(n))
               dof=len(res) - 1,    # The degrees of freedom
               alternative="smaller")    # The direction of the alternative (the true mean is SMALLER than 1)

The p-value is minuscule! The data give strong evidence that the mean resistance of the resistors the manufacturer makes is less than $1\text{k}\Omega$. Your company is being fleeced by this manufacturer!
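The test statistic and its lower-tail p-value can also be computed by hand and compared against the public `scipy.stats.ttest_1samp()` function. The sketch below is illustrative only: it uses the first eight readings from `res`, and the names `sample` and `mu0` are assumptions introduced for this example.

```python
import numpy as np
from scipy import stats

# First eight readings from res, copied here so the block runs on its own
sample = np.array([0.984, 0.988, 0.984, 0.987, 0.976, 0.997, 0.993, 0.985])
mu0 = 1.0    # Mean resistance under the null hypothesis

n = len(sample)
se = sample.std(ddof=1) / np.sqrt(n)    # s / sqrt(n)
t_stat = (sample.mean() - mu0) / se

# H_A: mu < mu0, so the p-value is the lower-tail probability
p_value = stats.t.cdf(t_stat, df=n - 1)
print(t_stat, p_value)

# Reference result from SciPy's one-sample t-test
ref = stats.ttest_1samp(sample, popmean=mu0, alternative="less")
print(ref.statistic, ref.pvalue)
```

Because the alternative is one-sided, the p-value is a single tail area; a two-sided test would double the smaller tail.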

## Two-Sample Test for Common Mean

In light of your study, the manager of your division has decided to stop outsourcing resistor production. The company wants to start manufacturing its own resistors, and has started experimenting with different processes before engaging in full-scale production.

Right now there are two manufacturing processes, and you are tasked with determining whether the mean resistance of supposedly-$1\text{k}\Omega$ resistors is the same between the two processes. That is, given process A and process B, you wish to test

$$H_0: \mu_A = \mu_B$$
$$H_A: \mu_A \neq \mu_B$$

While you feel safe assuming that the resistance level of resistors is Normally distributed regardless of the manufacturing process employed, you don't assume that the standard deviation is common to all processes. In that case, you use the test statistic

$$t = \frac{\bar{x}_A - \bar{x}_B}{\sqrt{\frac{s_A^2}{n_A} + \frac{s_B^2}{n_B}}}$$

After some tests you obtain the following datasets for the resistance of resistors produced by the different processes.

In [None]:
res_A = np.array([ 1.002,  1.001,  1.   ,  0.999,  0.998,  1.   ,  1.001,  0.999,
                   1.002,  0.998,  1.   ,  0.998,  1.001,  1.001,  1.002,  0.997,
                   1.001,  1.   ,  1.001,  0.999,  0.998,  0.998,  1.002,  1.002,
                   0.996,  0.998,  0.997,  1.001,  1.002,  0.997,  1.   ,  1.   ,
                   0.998,  0.997])

res_B = np.array([ 0.995,  1.022,  0.993,  1.014,  0.998,  0.99 ,  0.998,  0.998,
                   0.99 ,  1.003,  1.016,  0.992,  1.   ,  1.002,  1.003,  1.005,
                   0.979,  1.012,  0.978,  1.01 ,  1.001,  1.026,  1.011,  1.   ,
                   0.98 ,  0.993,  1.016,  0.991,  0.986,  0.987,  1.012,  0.996,
                   1.013,  1.001,  0.984,  1.011,  1.01 ,  1.   ,  1.001])

This test, often called Welch's $t$-test, is performed by `ttest_ind()` from **statsmodels** when `usevar="unequal"` is set.

In [None]:
from statsmodels.stats.weightstats import ttest_ind

In [None]:
ttest_ind(res_A, res_B,    # The datasets
          alternative="two-sided",
          usevar="unequal")

In the above output, the middle number is the p-value. In this case the p-value is approximately 0.659, which is large, so we do not reject the null hypothesis. The data give no evidence that the two processes produce resistors with different mean resistance.
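To make the unequal-variance statistic concrete, here is a minimal sketch that computes it by hand and compares it with `scipy.stats.ttest_ind(..., equal_var=False)`. The degrees of freedom come from the Welch-Satterthwaite approximation, which the text above does not spell out and is added here as the standard choice for this test. Only the first eight readings from each process are used, and the names `a` and `b` are illustrative.

```python
import numpy as np
from scipy import stats

# First eight readings from res_A and res_B, copied so the block runs on its own
a = np.array([1.002, 1.001, 1.0, 0.999, 0.998, 1.0, 1.001, 0.999])
b = np.array([0.995, 1.022, 0.993, 1.014, 0.998, 0.99, 0.998, 0.998])

# Per-sample variance of the mean: s^2 / n
va_n = a.var(ddof=1) / len(a)
vb_n = b.var(ddof=1) / len(b)

t_stat = (a.mean() - b.mean()) / np.sqrt(va_n + vb_n)

# Welch-Satterthwaite approximation to the degrees of freedom
df = (va_n + vb_n) ** 2 / (va_n ** 2 / (len(a) - 1) + vb_n ** 2 / (len(b) - 1))

# Two-sided p-value: both tails of the t distribution
p_value = 2 * stats.t.sf(abs(t_stat), df)
print(t_stat, p_value, df)

# Reference result from SciPy's Welch test
ref = stats.ttest_ind(a, b, equal_var=False)
print(ref.statistic, ref.pvalue)
```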

## One-Way ANOVA

Before you were able to report your findings you received word that three more manufacturing processes were tested and you now have resistors for five manufacturing processes. Your supervisor wants to know if all of the resistors produced by these processes have the same mean resistance or if some processes produce resistors with a mean resistance different from the rest.

In other words, for resistors produced by processes A, B, C, D, or E, you need to test

$$H_0: \mu_A = \mu_B = \mu_C = \mu_D = \mu_E$$
$$H_A: H_0 \text{ is false}$$

The test for these hypotheses is known as one-way analysis of variance (ANOVA). ANOVA has assumptions: in addition to assuming that the data were drawn from Normal distributions, you must assume that those distributions share the same standard deviation. You would need to check this, but you are in a hurry.
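One common way to check the equal-spread assumption is Levene's test, available as `scipy.stats.levene()`. The sketch below is an assumption about how such a check might look, not a step this analysis actually performs. It uses the first eight readings from processes A and B under the illustrative names `group_a` and `group_b`; further groups can be passed as additional arguments.

```python
import numpy as np
from scipy import stats

# First eight readings from res_A and res_B, copied so the block runs on its own
group_a = np.array([1.002, 1.001, 1.0, 0.999, 0.998, 1.0, 1.001, 0.999])
group_b = np.array([0.995, 1.022, 0.993, 1.014, 0.998, 0.99, 0.998, 0.998])

# Null hypothesis: all groups have the same variance.
# center="median" is the variant that is more robust to non-Normal data.
stat, p_value = stats.levene(group_a, group_b, center="median")
print(stat, p_value)
```

A small p-value here would cast doubt on the common-standard-deviation assumption and therefore on the ANOVA result.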

You now have the following datasets in addition to the ones you started with.

In [None]:
res_C = np.array([ 1.005,  1.012,  1.003,  0.993,  0.998,  1.002,  1.002,  0.996,
                   0.999,  1.004,  1.006,  1.007,  0.991,  1.011,  1.   ,  1.   ,
                   1.005,  1.   ,  0.995,  0.995,  1.002,  1.002,  0.991,  1.003,
                   0.997,  0.994,  0.995,  1.   ,  1.001,  1.005,  0.992,  0.999,
                   0.999,  1.002,  1.   ,  0.994,  1.001,  1.007,  1.003,  0.993])

res_D = np.array([ 1.006,  0.996,  0.986,  1.004,  1.004,  1.   ,  1.   ,  0.993,
                   0.991,  0.992,  0.989,  0.996,  1.   ,  0.996,  1.001,  0.989,
                   1.   ,  1.004,  0.997,  0.99 ,  0.998,  0.994,  0.991,  0.995,
                   1.002,  0.997,  0.998,  0.99 ,  0.996,  0.994,  0.988,  0.996,
                   0.998])

res_E = np.array([ 1.009,  0.999,  0.995,  1.008,  0.998,  1.001,  1.001,  1.001,
                   0.993,  0.992,  1.007,  1.005,  0.997,  1.   ,  1.   ,  1.   ,
                   0.996,  1.005,  0.997,  1.013,  1.002,  1.006,  1.004,  1.002,
                   1.001,  0.999,  1.001,  1.004,  0.994,  0.999,  0.997,  1.004,
                   0.996])

The function `f_oneway()` from **scipy.stats** performs the one-way ANOVA test.

In [None]:
from scipy.stats import f_oneway

In [None]:
f_oneway(res_A, res_B, res_C, res_D, res_E)

The p-value of approximately 0.0347 is below the conventional 0.05 significance level, so we reject the null hypothesis that all processes yield resistors with the same mean resistance. The test does not tell us which processes differ from the rest.
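For intuition about what `f_oneway()` computes, the $F$ statistic compares the variation between group means with the variation within groups. The following is a minimal sketch under stated assumptions: it uses only the first eight readings from processes A, B, and C, and the variable names are illustrative.

```python
import numpy as np
from scipy import stats

# First eight readings from res_A, res_B, and res_C, copied so the block runs on its own
groups = [
    np.array([1.002, 1.001, 1.0, 0.999, 0.998, 1.0, 1.001, 0.999]),
    np.array([0.995, 1.022, 0.993, 1.014, 0.998, 0.99, 0.998, 0.998]),
    np.array([1.005, 1.012, 1.003, 0.993, 0.998, 1.002, 1.002, 0.996]),
]

k = len(groups)                           # Number of groups
n_total = sum(len(g) for g in groups)     # Total number of observations
grand_mean = np.concatenate(groups).mean()

# Between-group and within-group sums of squares
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

# F = (mean square between) / (mean square within)
f_stat = (ss_between / (k - 1)) / (ss_within / (n_total - k))

# Upper-tail probability of the F distribution with (k - 1, N - k) degrees of freedom
p_value = stats.f.sf(f_stat, k - 1, n_total - k)
print(f_stat, p_value)

# Reference result from SciPy
ref = stats.f_oneway(*groups)
print(ref.statistic, ref.pvalue)
```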