# t-statistic small-sample hypothesis test and confidence interval examples

## Problem statement

The mean emission of all engines of a new design needs to be below 20 ppm if the design is to meet new emission requirements. Ten engines are manufactured for testing purposes, and the emission level of each is determined. The emission data is available in the vector X.

Does the data supply sufficient evidence to conclude that this type of engine meets the new standard? Assume we are willing to risk a type I error with probability 0.01.



In [1]:
import numpy as np
from IPython.display import HTML, Image

In [2]:
## Data

X = np.array([[15.6,16.2,22.5,20.5,16.4,19.4,16.6,17.9,12.7,13.9]])

Let's see what the t table looks like:

In [3]:
Image(url= "http://i.stack.imgur.com/PiSUh.png")


For 9 degrees of freedom, the table shows that a t statistic falls within the following bounds with 95% probability:

$$-2.262 < t < 2.262$$
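
The table values can also be reproduced in code. The following is a minimal sketch, assuming SciPy is installed (it is not imported elsewhere in this notebook). `scipy.stats.t.ppf` is the inverse CDF of Student's t distribution; the variable names are illustrative.

```python
from scipy import stats

df = 9  # degrees of freedom: n - 1 for the ten engines

# Two-sided 95% bound: leaves 2.5% of the probability in each tail
t_two_sided_95 = stats.t.ppf(0.975, df)

# One-sided 1% bound for the lower tail (relevant when H1 is "mu < 20")
t_lower_1pct = stats.t.ppf(0.01, df)

print(round(t_two_sided_95, 3), round(t_lower_1pct, 3))  # 2.262 -2.821
```

Looking the values up this way avoids reading the wrong row or column of a printed table, and it works for degrees of freedom the table does not list.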

In [6]:
## First we want to compute the mean and standard deviation of the data

X_mean = np.mean(X)

## Note: np.std defaults to ddof=0 (population standard deviation)
sigma = np.std(X)

n = X.shape[1]

print("X mean=",X_mean," X standard deviation=",sigma, "# of observations=",n)

## And from the problem statement we know that

mu = 20


X mean= 17.169999999999998  X standard deviation= 2.8284448023604774 # of observations= 10
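
The standard deviation printed above comes from `np.std` with its default `ddof=0`, which divides by $n$. The t statistic is conventionally built on the sample standard deviation, which divides by $n - 1$. The sketch below compares the two on the same data; the variable names `sigma_population` and `sigma_sample` are illustrative and not part of the original notebook.

```python
import numpy as np

X = np.array([[15.6, 16.2, 22.5, 20.5, 16.4, 19.4, 16.6, 17.9, 12.7, 13.9]])

sigma_population = np.std(X)      # ddof=0: divides the squared deviations by n
sigma_sample = np.std(X, ddof=1)  # ddof=1: divides the squared deviations by n - 1

print("ddof=0:", sigma_population)
print("ddof=1:", sigma_sample)
```

With only ten observations the difference is noticeable (roughly 2.83 versus 2.98), so the choice affects the t statistic and the width of the confidence interval, although it does not change the conclusion of the test in this example.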


If we define the null hypothesis and the alternative hypothesis, we have:

$$H_0: \mu = 20 \text{ ppm}$$ \
$$H_1: \mu < 20 \text{ ppm}$$

We are going to reject $H_0$ if, assuming $H_0$ is true, the probability of observing a sample mean as low as $\bar{X}$ is below $1\%$.

To evaluate this, we use the t statistic, defined as:

$$ t = \frac{\bar{X} - \mu}{\frac{\hat{\sigma}}{\sqrt{n}}} $$
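
As a quick check of the formula, substituting the values computed earlier in this notebook ($\bar{X} = 17.17$, $\mu = 20$, $\hat{\sigma} \approx 2.8284$, $n = 10$) gives approximately:

$$ t = \frac{17.17 - 20}{\frac{2.8284}{\sqrt{10}}} \approx \frac{-2.83}{0.8944} \approx -3.164 $$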

In [10]:
## Function to compute the t statistic

def t_statistic(X_mean,mu,sigma,n):
    """Return the one-sample t statistic (X_mean - mu) / (sigma / sqrt(n))."""

    t = (X_mean - mu) / (sigma/np.sqrt(n))

    return t

In [26]:
## Print the t statistic

t_stat = t_statistic(X_mean,mu,sigma,n)

print("We have a t statistic = ",t_stat)

We have a t statistic =  -3.1640164131214195


In [25]:
## From the t table, the one-sided 99% threshold with n-1 = 9 degrees of freedom is 2.821

## As the t distribution is symmetric, the lower-tail threshold is its negative:

threshold = -2.821

if t_stat < threshold:

    print(True)
    print("Assuming the null hypothesis to be true, the probability of getting a t statistic of",
          t_stat, "or lower is below 1%, thus we can reject the null hypothesis")

else:

    print(False)
    print("Assuming the null hypothesis to be true, we do not have enough evidence",
          "at the 1% significance level to reject the null hypothesis")


True
Assuming the null hypothesis to be true, the probability of getting a t statistic of -3.1640164131214195 or lower is below 1%, thus we can reject the null hypothesis
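
The same decision can be cross-checked with a library routine that also reports a p-value. This is a minimal sketch, assuming SciPy 1.6 or newer (needed for the `alternative` argument of `scipy.stats.ttest_1samp`). Note that this routine uses the sample standard deviation (`ddof=1`), so its t statistic is about -3.00 rather than the -3.164 obtained above; the data is passed as a flat array for convenience.

```python
import numpy as np
from scipy import stats

X = np.array([15.6, 16.2, 22.5, 20.5, 16.4, 19.4, 16.6, 17.9, 12.7, 13.9])
mu = 20      # value of the mean under H0
alpha = 0.01  # accepted type I error probability

# alternative="less" tests H1: population mean < mu (one-sided, lower tail)
result = stats.ttest_1samp(X, popmean=mu, alternative="less")

print("t statistic:", result.statistic)
print("one-sided p-value:", result.pvalue)
print("reject H0 at the 1% level:", result.pvalue < alpha)
```

Comparing the p-value with 0.01 is equivalent to comparing the t statistic with the table threshold, and here both routes lead to rejecting $H_0$.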


Now let's compute a 95% confidence interval. From the t table, for a 95% confidence level and 9 degrees of freedom the t value is 2.262.

As the t distribution is symmetric, the t statistic must satisfy the following constraint:

$$-2.262 < t < 2.262$$\
$$-2.262 < \frac{\bar{X} - \mu}{\frac{\hat{\sigma}}{\sqrt{n}}} < 2.262$$\
$$-2.262 < \frac{17.17-\mu}{\frac{2.82}{\sqrt{10}}} < 2.262$$\
$$-2.262 < \frac{17.17-\mu}{0.8918} < 2.262$$\
$$-2.262 (0.8918) < 17.17-\mu < 2.262 (0.8918)$$\
$$-2.0173 < 17.17-\mu < 2.0173$$\
$$2.0173 > \mu - 17.17 > -2.0173$$\
$$2.0173 + 17.17 > \mu > -2.0173 + 17.17$$\
$$19.1873 > \mu > 15.1527$$


Hence, we are 95% confident that the true population mean $\mu$ lies between 15.1527 ppm and 19.1873 ppm. More precisely, intervals constructed this way cover $\mu$ in about 95% of repeated samples.
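
The interval can also be computed directly in code. The sketch below is one possible way to do it, assuming SciPy is available; the helper `t_confidence_interval` and its parameters are hypothetical names introduced here for illustration. It defaults to the sample standard deviation (`ddof=1`), so its bounds are slightly wider than the hand calculation, which used `ddof=0` and a rounded standard deviation.

```python
import numpy as np
from scipy import stats

def t_confidence_interval(data, confidence=0.95, ddof=1):
    """Two-sided t confidence interval for the mean of a 1-D sample."""
    data = np.asarray(data, dtype=float).ravel()
    n = data.size
    mean = data.mean()
    std_err = data.std(ddof=ddof) / np.sqrt(n)
    # Critical value that leaves (1 - confidence) / 2 in each tail
    t_crit = stats.t.ppf(0.5 + confidence / 2, n - 1)
    return mean - t_crit * std_err, mean + t_crit * std_err

X = np.array([15.6, 16.2, 22.5, 20.5, 16.4, 19.4, 16.6, 17.9, 12.7, 13.9])

lower, upper = t_confidence_interval(X)
print("95% confidence interval:", lower, upper)
```

Calling `t_confidence_interval(X, ddof=0)` instead gives bounds close to the hand-derived ones, with small differences due to rounding in the manual steps.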


