## Instructions
This is another simple example of a two-sample t-test (pooled, for when the variances are equal), but this time the test is one-sided. <br>
In a packing plant, a machine packs cartons with jars. A new machine is expected to pack faster, on average, than the machine currently in use. To test that hypothesis, the times it takes each machine to pack ten cartons are recorded. The results, in seconds, are shown in the tables in the file files_for_lab/machine.txt. Assuming the conditions for conducting the t-test are met, do the data provide sufficient evidence that the new machine is faster than the current one?


## Solution
- Null hypothesis: the mean packing time of the new machine is equal to or greater than that of the machine currently used (the new machine is not faster).
- Alternative hypothesis: the mean packing time of the new machine is lower than that of the machine currently used (the new machine is faster).
- It is a one-tailed test.
- Level of significance: 0.05

In [1]:
import math
import numpy as np
from scipy.stats import ttest_ind
from scipy.stats import ttest_ind_from_stats

In [2]:
old_machine = [42.7, 43.6, 43.8, 43.3, 42.5, 43.5, 43.1, 41.7, 44, 44.1]
new_machine = [42.1, 41, 41.3, 41.8, 42.4, 42.8, 43.2, 42.3, 41.8, 42.7]

old_n = len(old_machine)
old_mean = np.mean(old_machine)
old_std = np.std(old_machine, ddof=1) # ddof=1 to calculate the sample's std, instead of the population's (default) 
old_dof = old_n - 1

new_n = len(new_machine)
new_mean = np.mean(new_machine)
new_std = np.std(new_machine, ddof=1) # ddof=1 to calculate the sample's std, instead of the population's (default) 
new_dof = new_n - 1

print(f"""
The statistics of the old machine sample are:
size = {old_n}
mean = {round(old_mean, 2)}
standard deviation = {round(old_std, 2)}
degrees of freedom = {old_dof}


The statistics of the new machine sample are:
size = {new_n}
mean = {round(new_mean, 2)}
standard deviation = {round(new_std, 2)}
degrees of freedom = {new_dof}
""")


The statistics of the old machine sample are:
size = 10
mean = 43.23
standard deviation = 0.75
degrees of freedom = 9


The statistics of the new machine sample are:
size = 10
mean = 42.14
standard deviation = 0.68
degrees of freedom = 9



In [3]:
dof_old_new = old_dof + new_dof
sp = np.sqrt(((old_dof * old_std**2) + (new_dof * new_std**2)) / dof_old_new)
ttest = (old_mean - new_mean) / (sp * np.sqrt(1/old_n + 1/new_n))
ttest

3.3972307061176026
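
For reference, the cell above appears to implement the standard pooled-variance formulas. The symbols here are notation introduced for illustration: index 1 is the old machine, index 2 is the new machine, and $s_i$ is the sample standard deviation computed with `ddof=1`.

$$
s_p = \sqrt{\frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2}},
\qquad
t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}
$$

With $n_1 = n_2 = 10$, the statistic has $n_1 + n_2 - 2 = 18$ degrees of freedom. Because the difference is taken as old minus new, a positive $t$ points in the direction of the alternative hypothesis (the new machine takes less time).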

In [4]:
ttest_ind(old_machine, new_machine)

Ttest_indResult(statistic=3.3972307061176026, pvalue=0.0032111425007745158)

In [5]:
ttest_ind_from_stats(old_mean, old_std, old_n, new_mean, new_std, new_n)

Ttest_indResult(statistic=3.3972307061176026, pvalue=0.0032111425007745158)
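
The p-values returned above are two-sided, while the hypotheses are one-sided. As a minimal sketch, assuming SciPy 1.6 or later (where `ttest_ind` accepts the `alternative` argument), the one-sided p-value can be requested directly and cross-checked against the upper tail of the t distribution. The variable names `two_sided`, `one_sided` and `p_from_tail` are illustrative only.

```python
from scipy import stats

old_machine = [42.7, 43.6, 43.8, 43.3, 42.5, 43.5, 43.1, 41.7, 44, 44.1]
new_machine = [42.1, 41, 41.3, 41.8, 42.4, 42.8, 43.2, 42.3, 41.8, 42.7]

# Two-sided pooled t-test, as in the cells above (equal_var=True is the default).
two_sided = stats.ttest_ind(old_machine, new_machine)

# One-sided test: alternative="greater" tests mean(old) > mean(new),
# i.e. the new machine needs less time. Assumes SciPy >= 1.6.
one_sided = stats.ttest_ind(old_machine, new_machine, alternative="greater")

# Cross-check: upper-tail probability of the t distribution with 18 degrees of freedom.
dof = len(old_machine) + len(new_machine) - 2
p_from_tail = stats.t.sf(two_sided.statistic, dof)

print(f"two-sided p-value: {two_sided.pvalue:.4f}")
print(f"one-sided p-value: {one_sided.pvalue:.4f}")
print(f"tail probability:  {p_from_tail:.4f}")
```

Because the statistic is positive, the one-sided p-value is half of the two-sided one, so the decision at alpha = 0.05 does not change here.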

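The critical value for a one-tailed test with alpha = 0.05 and 18 degrees of freedom is 1.7341. <br>
https://www.easycalculation.com/statistics/t-distribution-critical-value-table.php <br>
The p-value reported by `ttest_ind` is two-sided; because the statistic is positive, i.e. in the direction of the alternative hypothesis, the one-sided p-value is half of it, about 0.0016. Since the statistic (3.40) is higher than the critical value and the one-sided p-value is lower than alpha, we reject the null hypothesis and conclude, at the 5% significance level, that the new machine is faster than the old one.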
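
The critical value can also be computed rather than read from a table. The following is a small sketch using `scipy.stats.t.ppf`; the names `alpha`, `dof`, `critical_value` and `reject_null` are illustrative, and the statistic is copied from the output above.

```python
from scipy import stats

alpha = 0.05                        # level of significance
dof = 18                            # 10 + 10 - 2 degrees of freedom
t_statistic = 3.3972307061176026    # statistic computed in the cells above

# Upper-tail critical value for a one-tailed test: the (1 - alpha) quantile.
critical_value = stats.t.ppf(1 - alpha, dof)

# Reject the null hypothesis when the statistic exceeds the critical value.
reject_null = t_statistic > critical_value

print(f"critical value: {critical_value:.4f}")
print(f"reject null hypothesis: {reject_null}")
```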