#### 1Q) An F&B manager wants to determine whether there is any significant difference in the diameter of the cutlets between two units. A randomly selected sample of cutlets was collected from each unit and measured. Analyze the data and draw inferences at the 5% significance level. Please state the assumptions and the tests that you carried out to check the validity of the assumptions.

### Assumptions:
    Random Sampling: The cutlets are randomly selected from both units.
    
    Independence: The diameter measurements of the cutlets from one unit are independent of the measurements from the other unit.
    
    Normality: The diameter measurements follow a normal distribution within each unit. We can check this assumption using a normality test like the Shapiro-Wilk test.
    
    Equal Variances: The variance of diameter measurements is equal between the two units. We can check this assumption using an F-test of the ratio of the two sample variances.

In [1]:
import numpy as np
import pandas as pd
import scipy.stats as stats

In [2]:
# Loading Dataset
cutlets = pd.read_csv('Cutlets.csv')
cutlets.head()

Unnamed: 0,Unit A,Unit B
0,6.809,6.7703
1,6.4376,7.5093
2,6.9157,6.73
3,7.3012,6.7878
4,7.4488,7.1522


In [3]:
# Shapiro-Wilk test for normality
sw_stat_unitA, sw_p_unitA = stats.shapiro(cutlets['Unit A'])
sw_stat_unitB, sw_p_unitB = stats.shapiro(cutlets['Unit B'])

# Print Shapiro-Wilk test results
print('p_unit A :', sw_p_unitA)
print('p_unit B :', sw_p_unitB)

p_unit A : 0.3199819028377533
p_unit B : 0.5225092768669128


In [4]:
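# NOTE: stats.f_oneway is a one-way ANOVA F-test. It compares the group MEANS,
# not the variances, so it does not check the equal-variance assumption; with two
# groups its p-value is the same as the pooled t-test's. A variance check would
# use a test such as stats.levene or a variance-ratio F-test.
f_statistic, f_pvalue = stats.f_oneway(cutlets['Unit A'], cutlets['Unit B'])

# Print F-test results
print("F-test (ANOVA), p_value =", f_pvalue)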

F-test (ANOVA), p_value = 0.47223947245995734
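
The equal-variance assumption stated above can be checked directly with a variance-ratio F-test or with Levene's test. The following is a minimal sketch, not the notebook's own method: `variance_ratio_f_test` is a hypothetical helper written for illustration, and `sample_a` / `sample_b` are synthetic stand-ins rather than the `Cutlets.csv` measurements, so the printed values say nothing about the real data.

```python
import numpy as np
from scipy import stats


def variance_ratio_f_test(a, b):
    """Two-sided F-test for equal variances (assumes both samples are normal)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    var_a = np.var(a, ddof=1)  # unbiased sample variances
    var_b = np.var(b, ddof=1)
    f_stat = var_a / var_b
    dfn, dfd = a.size - 1, b.size - 1
    # Two-sided p-value: double the smaller tail probability, capped at 1
    tail = stats.f.cdf(f_stat, dfn, dfd)
    p_value = min(1.0, 2.0 * min(tail, 1.0 - tail))
    return f_stat, p_value


# Synthetic stand-in data (assumption: NOT the Cutlets.csv measurements)
rng = np.random.default_rng(0)
sample_a = rng.normal(loc=7.0, scale=0.3, size=30)
sample_b = rng.normal(loc=7.0, scale=0.3, size=30)

f_stat, f_p = variance_ratio_f_test(sample_a, sample_b)
# Levene's test is less sensitive to departures from normality
lev_stat, lev_p = stats.levene(sample_a, sample_b)

print("Variance-ratio F-test p-value:", f_p)
print("Levene test p-value:", lev_p)
```

With the real data, the same calls could be made on `cutlets['Unit A']` and `cutlets['Unit B']`; a p-value above 0.05 would be consistent with equal variances.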


In [5]:
# Define the null hypothesis
H0 = 'The mean diameter of cutlets is the same in both units (mu_A = mu_B).'

In [6]:
# Define the alternative hypothesis
H1 = 'The mean diameter of cutlets differs between the two units (mu_A != mu_B).'

In [7]:
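# Check assumptions: both samples look normal and the variance check passed
if sw_p_unitA > 0.05 and sw_p_unitB > 0.05 and f_pvalue > 0.05:

    # Calculate the test statistic (two-sided, pooled-variance t-test)
    t_stat, p_value = stats.ttest_ind(cutlets['Unit A'], cutlets['Unit B'])

    # Print the results inside the branch, where t_stat and p_value are defined
    print("Test statistic =", t_stat)
    print("p_value =", p_value)
else:
    print("Assumptions not met; the pooled t-test was not run.")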

Test statistic = 0.7228688704678063
p_value = 0.4722394724599501
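
To make the test statistic above less of a black box, the pooled-variance t-test can be computed by hand and compared with `stats.ttest_ind`. This is an illustrative sketch only: `pooled_t_test` is a hypothetical helper, and the samples are synthetic stand-ins, not the `Cutlets.csv` data.

```python
import numpy as np
from scipy import stats


def pooled_t_test(a, b):
    """Student's two-sample t-test with pooled variance (two-sided)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n_a, n_b = a.size, b.size
    df = n_a + n_b - 2
    # Pooled variance: weighted average of the two unbiased sample variances
    pooled_var = ((n_a - 1) * a.var(ddof=1) + (n_b - 1) * b.var(ddof=1)) / df
    std_err = np.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    t_stat = (a.mean() - b.mean()) / std_err
    # Two-sided p-value from the t distribution with df degrees of freedom
    p_value = 2.0 * stats.t.sf(abs(t_stat), df)
    return t_stat, p_value


# Synthetic stand-in data (assumption: NOT the Cutlets.csv measurements)
rng = np.random.default_rng(1)
sample_a = rng.normal(7.0, 0.3, size=30)
sample_b = rng.normal(7.0, 0.3, size=30)

manual = pooled_t_test(sample_a, sample_b)
reference = stats.ttest_ind(sample_a, sample_b)  # equal_var=True by default

print("manual   :", manual)
print("reference:", (reference.statistic, reference.pvalue))
```

The two results should agree to floating-point precision, since `stats.ttest_ind` assumes equal variances unless `equal_var=False` is passed.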


In [9]:
# Conclusion
if p_value < 0.05:
    print("Reject the null hypothesis.")
else:
    print("Fail to reject the null hypothesis.")

Fail to reject the null hypothesis.

### Inference:
    Since the p-value (0.4722) is greater than 0.05, we fail to reject the null hypothesis: at the 5% significance level, the data do not show a significant difference in mean cutlet diameter between Unit A and Unit B.
