# A/B Testing

This Python code runs a simple A/B-style comparison: it checks whether a new version of a samosa, rated by 20 friends, differs significantly from the old one, whose average rating is assumed to be 8.0. Because only the new samosa has sample data and the old one is represented by a fixed mean, the test used is a one-sample t-test.

First, the code calculates the sample mean (x̄) and sample standard deviation (s) of the new samosa ratings. It then computes the standard error (SE), which estimates how much the sample mean would vary from sample to sample by chance. From this it calculates the t-statistic, which measures how far the sample mean lies from the old mean in units of SE. Next, it uses the t-distribution with n − 1 degrees of freedom to calculate the two-tailed p-value: the probability of seeing a result at least this extreme if the new samosa's true mean rating were the same as the old one (i.e., if the null hypothesis H₀ is true). Finally, it compares the p-value to a significance level (alpha = 0.05) and prints whether the difference is statistically significant, that is, whether the observed change is too large to be plausibly explained by random chance alone.
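
For reference, the quantities described above can be written out as follows. These are the standard one-sample t-test formulas; the symbols $\mu_0$ (the old mean, `mu_old` in the code) and $T_{n-1}$ (a random variable following the t-distribution with $n-1$ degrees of freedom) are notation introduced here, not taken from the code.

$$
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i,
\qquad
s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2},
\qquad
SE = \frac{s}{\sqrt{n}}
$$

$$
t = \frac{\bar{x} - \mu_0}{SE},
\qquad
p = 2\,P\left(T_{n-1} \ge |t|\right)
$$

With the ratings used below, $n = 20$, $\bar{x} = 8.65$, $s \approx 0.22$ and $SE \approx 0.049$, so $t = (8.65 - 8.0)/SE \approx 13.28$.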

In [1]:
import numpy as np
from scipy import stats

# Sample data
# Old samosa (assumed population mean)
mu_old = 8.0

# New samosa ratings from 20 friends (sample)
new_samosa_ratings = [8.7, 8.3, 8.5, 8.6, 9.0,
                      8.8, 8.4, 8.5, 8.9, 9.1,
                      8.6, 8.7, 8.3, 8.6, 8.5,
                      8.8, 8.7, 8.6, 8.9, 8.5]

# Calculate sample mean and standard deviation
x_bar = np.mean(new_samosa_ratings)
s = np.std(new_samosa_ratings, ddof=1)  # sample standard deviation
n = len(new_samosa_ratings)

# Standard error
SE = s / np.sqrt(n)

# t-statistic
t_stat = (x_bar - mu_old) / SE

# Degrees of freedom
df = n - 1

# Two-tailed p-value from t-distribution
# (sf is the survival function, 1 - cdf; it keeps precision for very small tail probabilities)
p_value = 2 * stats.t.sf(abs(t_stat), df)

# Print results
print(f"Sample Mean = {x_bar:.2f}")
print(f"Sample Std Dev = {s:.2f}")
print(f"Standard Error = {SE:.3f}")
print(f"t-statistic = {t_stat:.3f}")
print(f"Degrees of Freedom = {df}")
print(f"p-value = {p_value:.4f}")

# Decision
alpha = 0.05
if p_value < alpha:
    print("🎉 Result: Reject H₀ — New samosa is likely different from old!")
else:
    print("😐 Result: Fail to reject H₀ — No strong evidence that samosas are different.")


Sample Mean = 8.65
Sample Std Dev = 0.22
Standard Error = 0.049
t-statistic = 13.283
Degrees of Freedom = 19
p-value = 0.0000
🎉 Result: Reject H₀ — New samosa is likely different from old!
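
As a sanity check, the manual calculation can be compared against SciPy's built-in one-sample t-test, `scipy.stats.ttest_1samp`. The sketch below is self-contained and reuses the same ratings; the helper name `one_sample_t` is hypothetical and introduced only for this illustration. It prints the p-value in scientific notation, since four decimal places round it to `0.0000`.

```python
import numpy as np
from scipy import stats

mu_old = 8.0
new_samosa_ratings = [8.7, 8.3, 8.5, 8.6, 9.0,
                      8.8, 8.4, 8.5, 8.9, 9.1,
                      8.6, 8.7, 8.3, 8.6, 8.5,
                      8.8, 8.7, 8.6, 8.9, 8.5]


def one_sample_t(ratings, mu0):
    """Manual two-tailed one-sample t-test (hypothetical helper).

    Returns (t_stat, p_value).
    """
    x = np.asarray(ratings, dtype=float)
    n = x.size
    se = x.std(ddof=1) / np.sqrt(n)               # standard error of the mean
    t_stat = (x.mean() - mu0) / se                # distance from mu0 in SE units
    p_value = 2 * stats.t.sf(abs(t_stat), n - 1)  # two-tailed tail probability
    return t_stat, p_value


t_manual, p_manual = one_sample_t(new_samosa_ratings, mu_old)

# SciPy's built-in test; it is two-sided by default.
result = stats.ttest_1samp(new_samosa_ratings, popmean=mu_old)

print(f"manual: t = {t_manual:.3f}, p = {p_manual:.3e}")
print(f"scipy : t = {result.statistic:.3f}, p = {result.pvalue:.3e}")
```

Both lines should report the same t-statistic (about 13.283) and the same very small p-value, which confirms that the step-by-step calculation matches the library routine.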
