- Null Hypothesis (H0): The mean diameter of cutlets is the same in Unit A and Unit B.

- Alternative Hypothesis (H1): The mean diameter of cutlets differs between Unit A and Unit B.

In [1]:
import pandas as pd
import numpy as np
from scipy import stats
from scipy.stats import norm

# Load the dataset
data=pd.read_csv('Cutlets.csv')
data.head()

   Unit A  Unit B
0  6.8090  6.7703
1  6.4376  7.5093
2  6.9157  6.7300
3  7.3012  6.7878
4  7.4488  7.1522


In [2]:
unit_a = data['Unit A']  # select by column name rather than by position
unit_a.head()

0    6.8090
1    6.4376
2    6.9157
3    7.3012
4    7.4488
Name: Unit A, dtype: float64

In [3]:
unit_b = data['Unit B']  # select by column name rather than by position
unit_b.head()

0    6.7703
1    7.5093
2    6.7300
3    6.7878
4    7.1522
Name: Unit B, dtype: float64

In [4]:
# Assumption checks

# Shapiro-Wilk test (H0: the sample comes from a normal distribution).
# If p-value > 0.05, normality is not rejected.
_, p_value_normality_a = stats.shapiro(unit_a)
_, p_value_normality_b = stats.shapiro(unit_b)

# Levene's test (H0: both samples have equal variances).
# If p-value > 0.05, homogeneity of variances is not rejected.
_, p_value_var = stats.levene(unit_a, unit_b)

print("p_value_normality_a ", p_value_normality_a)
print("p_value_normality_b ", p_value_normality_b)
print("p_value_var ", p_value_var)

p_value_normality_a  0.31998491287231445
p_value_normality_b  0.5225146412849426
p_value_var  0.4176162212502553
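
The assumption checks above decide which two-sample test is appropriate. A minimal sketch of that decision logic follows. The helper name `choose_two_sample_test` and the fallback choices (Welch's t-test when variances differ, Mann-Whitney U when normality is rejected) are illustrative assumptions, not part of the original analysis, and the data is synthetic rather than taken from Cutlets.csv.

```python
import numpy as np
from scipy import stats


def choose_two_sample_test(a, b, alpha=0.05):
    """Pick a two-sample test from the assumption checks (hypothetical helper)."""
    _, p_norm_a = stats.shapiro(a)
    _, p_norm_b = stats.shapiro(b)
    _, p_var = stats.levene(a, b)

    if min(p_norm_a, p_norm_b) <= alpha:
        # Normality rejected for at least one sample: fall back to a rank-based test.
        name = "Mann-Whitney U"
        _, p_value = stats.mannwhitneyu(a, b, alternative="two-sided")
    elif p_var <= alpha:
        # Normality holds but variances differ: Welch's t-test.
        name = "Welch t-test"
        _, p_value = stats.ttest_ind(a, b, equal_var=False)
    else:
        # Both assumptions hold: pooled-variance (Student's) t-test.
        name = "Student t-test"
        _, p_value = stats.ttest_ind(a, b, equal_var=True)
    return name, p_value


# Synthetic stand-in samples, NOT the Cutlets.csv measurements.
rng = np.random.default_rng(0)
sample_a = rng.normal(loc=7.0, scale=0.3, size=35)
sample_b = rng.normal(loc=7.0, scale=0.3, size=35)

test_name, p = choose_two_sample_test(sample_a, sample_b)
print(test_name, p)
```

With the p-values printed above (all greater than 0.05), this logic would select the pooled-variance t-test, which is the test used in the next cell.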


In [5]:
# Two-sample t-test; equal_var=True (pooled variance) because Levene's test did not reject equal variances.
t_stat, p_value_ttest = stats.ttest_ind(unit_a, unit_b, equal_var=True)
print("Two-Sample t-Test:", p_value_ttest)

alpha = 0.05
if p_value_ttest < alpha:
    print('Reject the null hypothesis. There is a significant difference in the diameter of cutlets between Unit A and Unit B.')
else:
    print('Fail to reject the null hypothesis. There is no significant difference in the diameter of cutlets between Unit A and Unit B.')


Two-Sample t-Test: 0.4722394724599501
Fail to reject the null hypothesis. There is no significant difference in the diameter of cutlets between Unit A and Unit B.


The p-value of the two-sample t-test is 0.472.

Since this p-value is greater than the significance level of 0.05, we fail to reject the null hypothesis.

This suggests that there is not enough evidence to conclude a significant difference in the diameter of cutlets between Unit A and Unit B.
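
To make the p-value less of a black box, the pooled-variance t-statistic can be computed by hand and compared with `stats.ttest_ind`. The sketch below is only illustrative: the function name `pooled_t_test` is hypothetical, and the samples are synthetic stand-ins rather than the cutlet measurements.

```python
import numpy as np
from scipy import stats


def pooled_t_test(a, b):
    """Two-sided pooled-variance t-test computed from first principles."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n_a, n_b = a.size, b.size
    df = n_a + n_b - 2  # degrees of freedom

    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * a.var(ddof=1) + (n_b - 1) * b.var(ddof=1)) / df
    # Standard error of the difference in means.
    se = np.sqrt(pooled_var * (1 / n_a + 1 / n_b))

    t_stat = (a.mean() - b.mean()) / se
    # Two-sided p-value from the t distribution with df degrees of freedom.
    p_value = 2 * stats.t.sf(abs(t_stat), df)
    return t_stat, p_value, df


# Synthetic stand-in samples, NOT the Cutlets.csv measurements.
rng = np.random.default_rng(1)
x = rng.normal(loc=7.0, scale=0.3, size=30)
y = rng.normal(loc=7.1, scale=0.3, size=30)

t_manual, p_manual, df = pooled_t_test(x, y)
t_scipy, p_scipy = stats.ttest_ind(x, y, equal_var=True)
print(t_manual, p_manual, df)
print(t_scipy, p_scipy)
```

The two printed lines should agree up to floating-point error, which confirms that `equal_var=True` corresponds to the pooled-variance formula.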

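The following assumption checks and tests were carried out:

1. Normality (Shapiro-Wilk test): p = 0.320 for Unit A and p = 0.523 for Unit B, so normality is not rejected.
2. Homogeneity of variance (Levene's test): p = 0.418, so equal variances are not rejected.
3. Two-sample t-test (equal variances assumed): p = 0.472.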

Conclusion: At the 5% significance level, the t-test p-value (0.472) gives insufficient evidence to reject the null hypothesis, so we cannot claim that the mean diameter of cutlets differs between the two units.
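
A confidence interval for the difference in means can complement a "fail to reject" conclusion, because it shows how large a difference remains compatible with the data. The following is a sketch under the same equal-variance assumption. The function name `mean_diff_ci` is hypothetical and the samples are synthetic, so the printed interval says nothing about the actual cutlet data.

```python
import numpy as np
from scipy import stats


def mean_diff_ci(a, b, confidence=0.95):
    """Pooled-variance confidence interval for mean(a) - mean(b)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n_a, n_b = a.size, b.size
    df = n_a + n_b - 2

    pooled_var = ((n_a - 1) * a.var(ddof=1) + (n_b - 1) * b.var(ddof=1)) / df
    se = np.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    diff = a.mean() - b.mean()

    # Two-sided critical value of the t distribution.
    t_crit = stats.t.ppf(1 - (1 - confidence) / 2, df)
    return diff - t_crit * se, diff + t_crit * se


# Synthetic stand-in samples, NOT the Cutlets.csv measurements.
rng = np.random.default_rng(2)
u = rng.normal(loc=7.0, scale=0.3, size=30)
v = rng.normal(loc=7.0, scale=0.3, size=30)

lower, upper = mean_diff_ci(u, v)
p_val = stats.ttest_ind(u, v, equal_var=True).pvalue
print(lower, upper, p_val)
```

For the pooled-variance test, the 95% interval contains zero exactly when the two-sided p-value is greater than 0.05, so the interval and the test always lead to the same decision.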
