A F&B manager wants to determine whether there is any significant difference in the diameter of the cutlet between two units. A randomly selected sample of cutlets was collected from both units and measured. Analyze the data and draw inferences at 5% significance level. Please state the assumptions and tests that you carried out to check validity of the assumptions.

In [2]:
# Importing necessary libraries
import pandas as pd
import scipy.stats as stats

# Loading the data
data = pd.read_csv('Cutlets.csv')

# Separating data for Unit A and Unit B
unit_A = data['Unit A']
unit_B = data['Unit B']

# Normality (Shapiro-Wilk Test)
_, p_value_A = stats.shapiro(unit_A)
_, p_value_B = stats.shapiro(unit_B)

print("Shapiro-Wilk p-values:")
print("Unit A:", p_value_A)
print("Unit B:", p_value_B)

# Equality of Variances (Levene's Test)
_, p_value_levene = stats.levene(unit_A, unit_B)

print("\nLevene's Test p-value:", p_value_levene)

# Two-sample t-test (H0: mean diameters are equal; H1: they differ).
# Equal variances are assumed because Levene's p-value is above 0.05.
t_stat, p_value_ttest = stats.ttest_ind(unit_A, unit_B, equal_var=True)

print("\nTwo-Sample t-test results:")
print("t-statistic:", t_stat)
print("p-value:", p_value_ttest)

# Interpreting the results
alpha = 0.05
if p_value_ttest < alpha:
    print("\nReject the null hypothesis. There is a significant difference in the mean diameter of cutlets between Unit A and Unit B.")
else:
    print("\nFail to reject the null hypothesis. There is no significant difference in the mean diameter of cutlets between Unit A and Unit B.")


Shapiro-Wilk p-values:
Unit A: 0.31998491287231445
Unit B: 0.5225146412849426

Levene's Test p-value: 0.4176162212502553

Two-Sample t-test results:
t-statistic: 0.7228688704678063
p-value: 0.4722394724599501

Fail to reject the null hypothesis. There is no significant difference in the mean diameter of cutlets between Unit A and Unit B.
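The choice between the pooled and the Welch form of the t-test can be tied directly to the Levene result instead of being hard-coded. The following is a minimal sketch under stated assumptions, not the method used in the cell above: `compare_two_samples` is a hypothetical helper, and the two arrays are synthetic placeholders generated with NumPy, not the measurements in Cutlets.csv.

```python
import numpy as np
import scipy.stats as stats

def compare_two_samples(a, b, alpha=0.05):
    """Hypothetical helper: use Levene's test to pick pooled vs. Welch t-test."""
    _, p_levene = stats.levene(a, b)
    # Pooled (equal-variance) t-test only if Levene does not reject equal variances
    equal_var = bool(p_levene > alpha)
    t_stat, p_value = stats.ttest_ind(a, b, equal_var=equal_var)
    return {
        "equal_var": equal_var,
        "t_stat": float(t_stat),
        "p_value": float(p_value),
        "reject_h0": bool(p_value < alpha),
    }

# Synthetic placeholder data (assumption), NOT the cutlet measurements
rng = np.random.default_rng(0)
sample_a = rng.normal(loc=7.0, scale=0.3, size=35)
sample_b = rng.normal(loc=7.0, scale=0.3, size=35)

result = compare_two_samples(sample_a, sample_b)
print(result)
```

With the real data, `unit_A` and `unit_B` from the cell above would be passed in place of the synthetic arrays.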


A hospital wants to determine whether there is any difference in the average Turn Around Time (TAT) of reports of the laboratories on their preferred list. They collected a random sample and recorded TAT for reports of 4 laboratories. TAT is defined as the time from sample collection to report dispatch. Analyze the data and determine whether there is any difference in average TAT among the different laboratories at 5% significance level.
 

In [4]:
import pandas as pd
import scipy.stats as stats


data = pd.read_csv('LabTAT.csv')


print(data.head())

# one-way ANOVA
f_statistic, p_value = stats.f_oneway(data['Laboratory 1'], data['Laboratory 2'], data['Laboratory 3'], data['Laboratory 4'])

# results
print("\nOne-way ANOVA results:")
print("F-statistic:", f_statistic)
print("p-value:", p_value)


alpha = 0.05

if p_value < alpha:
    print("\nReject the null hypothesis. There is a significant difference in average TAT among the different laboratories.")
else:
    print("\nFail to reject the null hypothesis. There is no significant difference in average TAT among the different laboratories.")


   Laboratory 1  Laboratory 2  Laboratory 3  Laboratory 4
0        185.35        165.53        176.70        166.13
1        170.49        185.91        198.45        160.79
2        192.77        194.92        201.23        185.18
3        177.33        183.00        199.61        176.42
4        193.41        169.57        204.63        152.60

One-way ANOVA results:
F-statistic: 118.70421654401437
p-value: 2.1156708949992414e-57

Reject the null hypothesis. There is a significant difference in average TAT among the different laboratories.
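One-way ANOVA rests on the same assumptions checked for the cutlet data: approximately normal groups and equal variances. The cell above does not test them, so the following sketch shows how the checks could look. It uses only the five rows printed by `data.head()` as illustrative input (an assumption made to keep the example self-contained), so its p-values say nothing about the full LabTAT.csv sample.

```python
import pandas as pd
import scipy.stats as stats

# Illustration only: the five rows shown by data.head(), not the full dataset
head_rows = pd.DataFrame({
    'Laboratory 1': [185.35, 170.49, 192.77, 177.33, 193.41],
    'Laboratory 2': [165.53, 185.91, 194.92, 183.00, 169.57],
    'Laboratory 3': [176.70, 198.45, 201.23, 199.61, 204.63],
    'Laboratory 4': [166.13, 160.79, 185.18, 176.42, 152.60],
})

# Shapiro-Wilk per laboratory (H0: the sample comes from a normal distribution)
shapiro_p = {lab: stats.shapiro(head_rows[lab])[1] for lab in head_rows.columns}

# Levene's test across all laboratories (H0: the variances are equal)
_, levene_p = stats.levene(*(head_rows[lab] for lab in head_rows.columns))

for lab, p in shapiro_p.items():
    print(lab, "Shapiro-Wilk p-value:", p)
print("Levene's Test p-value:", levene_p)
```

On the real data, `head_rows` would be replaced by the `data` frame loaded from LabTAT.csv.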


Sales of products in four different regions are tabulated for males and females. Determine whether male-female buyer ratios are similar across regions.

In [9]:
import scipy.stats as stats


observed_values = [[50, 142, 131, 70],
                   [435, 1523, 1356, 750]]

# Chi-squared test for independence
chi2_stat, p_value, _, _ = stats.chi2_contingency(observed_values)

# results
print("\nChi-squared test results:")
print("Chi-squared statistic:", chi2_stat)
print("p-value:", p_value)

# Interpreting the results
alpha = 0.05

if p_value < alpha:
    print("\nReject the null hypothesis. Male-female buyer ratios are not similar across regions.")
else:
    print("\nFail to reject the null hypothesis. Male-female buyer ratios are similar across regions.")



Chi-squared test results:
Chi-squared statistic: 1.595945538661058
p-value: 0.6603094907091882

Fail to reject the null hypothesis. Male-female buyer ratios are similar across regions.
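`chi2_contingency` also returns the degrees of freedom and the expected frequencies, which the cell above discards. A short sketch of inspecting them follows. A common rule of thumb, stated here as an assumption rather than a requirement from the problem, is that every expected count should be at least 5 for the chi-squared approximation to be reliable.

```python
import scipy.stats as stats

# Same observed counts as in the cell above (two rows, four regions)
observed_values = [[50, 142, 131, 70],
                   [435, 1523, 1356, 750]]

chi2_stat, p_value, dof, expected = stats.chi2_contingency(observed_values)

print("Degrees of freedom:", dof)
print("Expected frequencies:")
print(expected.round(2))
print("Smallest expected count:", expected.min())
```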


TeleCall uses 4 centers around the globe to process customer order forms. They audit a certain % of the customer order forms. Any error in an order form renders it defective, and it has to be reworked before processing. The manager wants to check whether the defective % varies by center. Please analyze the data at 5% significance level and help the manager draw appropriate inferences.


In [19]:
import pandas as pd
import scipy.stats as stats

data = pd.read_csv('CustomerOrderForm.csv')


print(data.head())

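# Contingency table: cross-tabulates the Phillippines outcomes against the
# joint outcomes of the other three centers in the same row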
contingency_table = pd.crosstab(data['Phillippines'], [data['Indonesia'], data['Malta'], data['India']])

# Chi-squared test for independence
chi2_stat, p_value, _, _ = stats.chi2_contingency(contingency_table)

# results
print("\nChi-squared test results:")
print("Chi-squared statistic:", chi2_stat)
print("p-value:", p_value)

# Interpreting the results
alpha = 0.05

if p_value < alpha:
    print("\nReject the null hypothesis. There is a significant association between the center and the defectiveness of order forms.")
else:
    print("\nFail to reject the null hypothesis. There is no significant association between the center and the defectiveness of order forms.")


  Phillippines   Indonesia       Malta       India
0   Error Free  Error Free   Defective  Error Free
1   Error Free  Error Free  Error Free   Defective
2   Error Free   Defective   Defective  Error Free
3   Error Free  Error Free  Error Free  Error Free
4   Error Free  Error Free   Defective  Error Free

Chi-squared test results:
Chi-squared statistic: 3.1001395592512266
p-value: 0.6845505149379718

Fail to reject the null hypothesis. There is no significant association between the center and the defectiveness of order forms.
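The `pd.crosstab` call above pairs the Phillippines column with the combined outcomes of the other three columns, so the test it feeds is not a comparison of defect rates per center. To test whether the defective % varies by center, the table would normally hold one column per center and one row per outcome. The following is a minimal sketch of that layout. `outcome_counts` is a hypothetical helper, and the input is only the five rows printed by `data.head()`, so the resulting statistic is illustrative and the expected counts are far too small for a valid test.

```python
import pandas as pd
import scipy.stats as stats

# Illustration only: the five rows shown by data.head(), not the full dataset
sample = pd.DataFrame({
    'Phillippines': ['Error Free'] * 5,
    'Indonesia': ['Error Free', 'Error Free', 'Defective', 'Error Free', 'Error Free'],
    'Malta': ['Defective', 'Error Free', 'Defective', 'Error Free', 'Defective'],
    'India': ['Error Free', 'Defective', 'Error Free', 'Error Free', 'Error Free'],
})

def outcome_counts(df):
    """Hypothetical helper: rows are outcomes, columns are centers."""
    counts = pd.DataFrame({center: df[center].value_counts() for center in df.columns})
    # A center with no defective forms yields NaN, so fill with 0
    return counts.reindex(['Defective', 'Error Free']).fillna(0).astype(int)

table = outcome_counts(sample)
print(table)

# Chi-squared test on the 2 x 4 table of counts per center
chi2_stat, p_value, dof, expected = stats.chi2_contingency(table)
print("Degrees of freedom:", dof)
```

On the real data, `outcome_counts(data)` would build the table from the full CustomerOrderForm.csv, and the statistic and p-value would need to be recomputed from it.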


Inferences based on result:
No Significant Difference: At the 5% significance level (p-value ≈ 0.68), there is no statistically significant difference in the defectiveness of order forms among the centers (Philippines, Indonesia, Malta, and India).

Consistency Across Centers: The data give no evidence that order form quality differs between centers. Failing to reject the null hypothesis does not prove that the defect rates are identical.

Process Improvement: Since no specific center was identified as problematic, the focus should be on overall process improvement rather than singling out specific centers.

Continuous Monitoring: Ongoing monitoring of order form quality at each center is crucial to maintaining consistent quality standards over time.

Sample Size Consideration: Ensure that the sample size for each center is representative of actual order volume for more robust conclusions in the future.