
### **Steps for Performing a Hypothesis Test**

1. **State the Hypotheses**
  - **Null Hypothesis ($H_0$)**: The default assumption or claim to be tested (e.g., "There is no difference in means").

  - **Alternative Hypothesis ($H_1$)**: The claim you want to test for (e.g., "There is a difference in means").

2. **Choose the Significance Level (α)**
  - Common choices are 0.05, 0.01, or 0.10.

  - This is the threshold for deciding whether a result is statistically significant.

3. **Select the Appropriate Test**

  - Decide which statistical test suits your data and hypothesis (e.g., t-test, z-test, chi-square test, ANOVA).

4. **Calculate the Test Statistic**
  - Use the sample data to compute the test statistic (e.g., t, z, F, or chi-square value) based on the chosen test.

5. **Compute the p-value**
  - The p-value is the probability, assuming the null hypothesis is true, of observing a test statistic at least as extreme as the one calculated.

6. **Compare the p-value to the Significance Level**
  - **If p-value ≤ α**: Reject the null hypothesis ($H_0$). There is sufficient evidence to support the alternative hypothesis.

  - **If p-value > α**: Fail to reject the null hypothesis ($H_0$). There is insufficient evidence to support the alternative hypothesis.

7. **State the Conclusion**
  - Clearly state the result in the context of the problem, referencing whether the null hypothesis was rejected or not.
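
The decision rule in steps 6 and 7 can be sketched as a small helper. The function name `decide` and its return strings are illustrative choices, not part of any library.

```python
def decide(p_value, alpha=0.05):
    """Apply the step-6 rule: reject H0 when p-value <= alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    if p_value <= alpha:
        return "Reject H0"
    return "Fail to reject H0"


print(decide(0.012))  # p-value below alpha
print(decide(0.200))  # p-value above alpha
```

Keeping the rule in one place makes it harder to mix `<` and `<=` across different tests.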

---

#### **The Significance Level**

Imagine you’re a doctor testing a new medicine. You want to know if it really works better than the old one. To decide, you run an experiment with patients and collect data.

**The Courtroom Analogy** : Think of your experiment like a courtroom trial:

- Null Hypothesis ($H_0$): The new medicine is no better than the old one (the defendant is innocent).

- Alternative Hypothesis ($H_1$): The new medicine is better (the defendant is guilty).

In court, you don’t want to wrongly convict an innocent person. In science, you don’t want to wrongly claim your medicine works when it doesn’t. This mistake is called a **Type I error**.

**Enter the Significance Level ($\alpha$)**

The significance level, often written as $\alpha$, is like the judge setting a threshold for evidence. It’s the maximum probability you’re willing to accept for making a Type I error—declaring the medicine works when it actually doesn’t.

If you set $\alpha = 0.05$, you’re saying: “If the new medicine is actually no better than the old one, I’m willing to accept a 5% chance of wrongly concluding that it is better.”
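
A short simulation can make this concrete. The sketch below assumes a population with mean 500 and standard deviation 10 (hypothetical values chosen for illustration) in which the null hypothesis is true, and counts how often a two-sided z-test rejects it anyway.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha = 0.05
n_trials, n = 10_000, 50
mu0, sigma = 500, 10  # hypothetical population; H0 is true by construction

# One row per simulated experiment, all drawn under H0
samples = rng.normal(mu0, sigma, size=(n_trials, n))
z = (samples.mean(axis=1) - mu0) / (sigma / np.sqrt(n))
p_values = 2 * stats.norm.sf(np.abs(z))

# Every rejection here is a Type I error, because H0 is true
type_i_rate = np.mean(p_values <= alpha)
print(f"Share of false rejections: {type_i_rate:.3f}")
```

The share of false rejections should land close to $\alpha$, which is what the significance level controls.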

### **p-value**

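- The p-value measures how compatible the observed data are with the null hypothesis.
- A high p-value (above $\alpha$) means the data are consistent with the null hypothesis, so we fail to reject it.
- A low p-value (at or below $\alpha$) means data this extreme would be unlikely if the null hypothesis were true, so we reject it.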
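
As a rough illustration, the sketch below converts a few hypothetical z-statistics into two-tailed p-values. The z values are made up for the example; the point is that a more extreme statistic gives a smaller p-value.

```python
from scipy import stats

alpha = 0.05
z_values = [0.5, 1.0, 2.0, 2.5, 3.0]  # hypothetical test statistics

# Two-tailed p-value: probability in both tails beyond |z|
p_values = [2 * stats.norm.sf(abs(z)) for z in z_values]

for z, p in zip(z_values, p_values):
    verdict = "reject H0" if p <= alpha else "fail to reject H0"
    print(f"z = {z:>4}: p-value = {p:.4f} -> {verdict}")
```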

---

### **Z-Test**

A z-test is a statistical method used in hypothesis testing to determine if there is a significant difference between sample and population means, or between the means of two samples. It is particularly useful when the population standard deviation is known and the sample size is large (typically greater than 30).

There are two main types of z-test:

1. **One-Sample Z-Test:** Used to determine whether the mean of a single sample is different from a known population mean.
2. **Two-Sample Z-Test:** Used to compare the means of two independent samples to see if they are significantly different from each other.

**Assumptions of Z-Tests**

For a z-test to yield valid/correct results, these assumptions must be met:
1. **Normal Distribution:** The data should be approximately normally distributed. This assumption is satisfied with large sample sizes due to the central limit theorem.
2. **Known Population Standard Deviation:** The standard deviation of the population must be known. If it is unknown, using a `t-test` is more appropriate.
3. **Random Sampling:** The sample data should be randomly drawn from the population, ensuring that it is representative of the actual population data.
4. **Independence:** The samples must be independent of each other, particularly in two-sample z-tests.
5. **Continuous Data:** The z-test is applicable for continuous data, where the variable of interest can take any numeric value.

---

#### **1. One-Sample Z-test (One sample test for known pop variance)**

**Scenario 1: Manufacturing Quality Control**

A pharmaceutical company produces a specific tablet that must contain exactly 500 mg of an active ingredient. Historical data from their high-precision machines indicates a population standard deviation ($\sigma$) of 10 mg. A Quality Assurance (QA) engineer randomly samples 50 batches and finds the average content is 496 mg. They need to determine if the machine has drifted out of calibration.

**Null Hypothesis ($H_0$):** The machine is calibrated correctly. The mean content is equal to the target.$$H_0: \mu = 500$$

**Alternative Hypothesis ($H_1$):** The machine is out of calibration. The mean content is not equal to the target.$$H_1: \mu \neq 500$$

In [None]:
import numpy as np
from scipy import stats

# Given Data
population_mean = 500
population_std = 10
sample_size = 50
sample_mean = 496

# Calculate Z-Score
# Formula: (sample_mean - pop_mean) / (pop_std / sqrt(n))
z_statistic = (sample_mean - population_mean) / (population_std / np.sqrt(sample_size))

# Calculate P-Value (Two-tailed test)
p_value = stats.norm.sf(abs(z_statistic)) * 2

print(f"Z-Score: {z_statistic:.4f}")
print(f"P-Value: {p_value:.4f}")

if p_value < 0.05:
    print("Reject Null Hypothesis: The machine has drifted out of calibration.")
else:
    print("Fail to Reject Null: The deviation is likely due to chance.")
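
The same decision can be reached without a p-value by comparing the test statistic with a critical value. The sketch below reuses the tablet numbers from this scenario; the variable name `z_critical` is an illustrative choice.

```python
import numpy as np
from scipy import stats

population_mean, population_std = 500, 10
sample_size, sample_mean = 50, 496
alpha = 0.05

z_statistic = (sample_mean - population_mean) / (population_std / np.sqrt(sample_size))

# Two-tailed cutoff: alpha is split evenly between both tails (about 1.96 for alpha = 0.05)
z_critical = stats.norm.ppf(1 - alpha / 2)

print(f"|z| = {abs(z_statistic):.4f}, critical value = {z_critical:.4f}")
if abs(z_statistic) > z_critical:
    print("Reject Null Hypothesis: |z| exceeds the critical value.")
else:
    print("Fail to Reject Null Hypothesis: |z| is within the critical value.")
```

For a two-sided test, `abs(z_statistic) > z_critical` and `p_value < alpha` always give the same answer.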

---

**Scenario: Quality Assurance (Food Manufacturing)**

**Problem Statement:** A large food processing plant produces "Energy Bars" that are advertised to weigh exactly 50 grams. The strict quality control policy states that the production line standard deviation is known to be 2 grams due to mechanical variances.

A Quality Assurance Data Scientist randomly samples 100 bars from today's production batch and finds the average weight is 49.5 grams. They need to test if the machinery has drifted significantly from the target weight of 50g.

**Hypothesis Statements:**

- **Null Hypothesis ($H_0$):** The average weight of the bars is equal to the target. (The machine is working correctly).$$H_0: \mu = 50$$

- **Alternative Hypothesis ($H_1$):** The average weight of the bars is not equal to the target. (The machine is miscalibrated).$$H_1: \mu \neq 50$$

In [None]:
import numpy as np
import scipy.stats as stats

# 1. Setup Data
pop_mean = 50
pop_std_dev = 2
sample_size = 100
sample_mean = 49.5

# Raw sample weights (shown for reference; the test below uses the summary statistics above)
sample_data = np.array([50.40483822, 46.8761864 , 48.88114199, 51.30141821, 49.33128064,
       46.09371437, 50.14253783, 52.62251794, 51.48639237, 51.76837878,
       48.68765281, 50.161313  , 48.32608142, 49.34135682, 47.9729816 ,
       49.28430948, 51.73274552, 50.25960079, 48.7359658 , 47.13827422,
       49.71008232, 52.70339994, 46.66162822, 53.3433419 , 52.78276636,
       53.43381614, 49.63156066, 46.22869873, 46.73314012, 48.98528066,
       47.50090802, 49.75194656, 51.97821175, 51.29384476, 48.98192443,
       50.72538558, 48.00899826, 52.47304215, 49.33947076, 49.04525544,
       49.22226128, 48.01008311, 49.25012366, 46.82159139, 49.99530902,
       47.71889532, 49.68001116, 48.09053674, 48.82212249, 46.63723591,
       49.14591185, 47.97374877, 46.49607605, 50.50171503, 48.0091587 ,
       51.18485864, 48.94155265, 54.12698446, 50.75096089, 54.31920821,
       46.54053288, 48.53904457, 49.61707527, 48.66808338, 50.55790802,
       50.10122835, 47.79250497, 50.45203966, 51.7998047 , 47.50426235,
       47.84189047, 49.02526641, 48.2429361 , 49.19689951, 51.54477141,
       49.88651678, 50.97091732, 49.42268857, 49.63775632, 49.81267013,
       49.54469218, 49.17206591, 49.52264168, 52.81964232, 49.18501505,
       51.04107237, 49.6800387 , 51.60434694, 49.8592259 , 49.21821781,
       50.13179064, 49.44307184, 42.19376068, 52.79645355, 50.35153978,
       48.09518794, 50.83571755, 47.71391235, 50.35067636, 53.36916694])

# 2. Perform One-Sample Z-Test from the summary statistics
# Formula: (sample_mean - pop_mean) / (pop_std_dev / sqrt(n))
z_stat = (sample_mean - pop_mean) / (pop_std_dev / np.sqrt(sample_size))

# H1 is two-sided (mu != 50), so use the two-tailed p-value
p_val = 2 * stats.norm.sf(abs(z_stat))

# 3. Interpret Results
alpha = 0.05
print(f"Z-Statistic: {z_stat:.4f}")
print(f"P-Value: {p_val:.4f}")

if p_val < alpha:
    print("Reject Null Hypothesis: The machinery is miscalibrated (Significant difference).")
else:
    print("Fail to Reject Null Hypothesis: The production is within acceptable limits.")

Z-Statistic: -2.5000
P-Value: 0.0124
Reject Null Hypothesis: The machinery is miscalibrated (Significant difference).


---

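**Note**:

- A one-tailed p-value covers only one tail of the distribution (for example, `stats.norm.cdf(z_stat)` for the lower tail).

- When the alternative hypothesis uses ≠ (two-sided), double the smaller one-tailed p-value and reject the null hypothesis if the result is below 5%.

- When the alternative hypothesis uses < or > (one-sided), reject the null hypothesis if the one-tailed p-value in the direction of the alternative is below 5%.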
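
The three cases can be compared side by side. This sketch uses the z-statistic from the energy-bar example, $(49.5 - 50) / (2 / \sqrt{100}) = -2.5$; the variable names are illustrative.

```python
from scipy import stats

z_stat = -2.5  # energy-bar example

p_less = stats.norm.cdf(z_stat)               # H1: mu < 50 (lower tail)
p_greater = stats.norm.sf(z_stat)             # H1: mu > 50 (upper tail)
p_two_sided = 2 * stats.norm.sf(abs(z_stat))  # H1: mu != 50 (both tails)

print(f"H1: mu < 50  -> p-value = {p_less:.4f}")
print(f"H1: mu > 50  -> p-value = {p_greater:.4f}")
print(f"H1: mu != 50 -> p-value = {p_two_sided:.4f}")
```

The two-sided p-value is exactly twice the smaller one-tailed p-value, so the choice of alternative hypothesis should be made before looking at the data.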

---

#### **2. Two-Sample Z-test**

**Scenario 1: E-Commerce A/B Testing (Average Order Value)**

A large online fashion retailer, "StyleHub," is testing a new checkout page design (Design B) against their current design (Design A). They believe the new design streamlines the process and encourages users to add more small accessories before paying. They have collected data from 200 randomly selected users for each design. The Data Science team needs to determine if there is a statistically significant difference in the Average Order Value (AOV) between the two designs.

Hypothesis Statements:

Let $\mu_A$ be the mean AOV of Design A and $\mu_B$ be the mean AOV of Design B.

- **Null Hypothesis ($H_0$)**: There is no difference in the Average Order Value between the two designs.$$H_0: \mu_A - \mu_B = 0$$

- **Alternative Hypothesis ($H_1$)**: There is a significant difference in the Average Order Value between the two designs.$$H_1: \mu_A - \mu_B \neq 0$$

In [None]:
import numpy as np
from statsmodels.stats.weightstats import ztest

# 1. Simulate Data (Large sample size > 30 implies Z-test is applicable)
np.random.seed(42)

# Design A: Mean $120, Std Dev $15, N=200
design_a_aov = np.random.normal(loc=120, scale=15, size=200)

# Design B: Mean $125, Std Dev $18, N=200
design_b_aov = np.random.normal(loc=125, scale=18, size=200)

# 2. Perform Two-Sample Z-Test
# value=0 implies we are testing for a difference of 0
z_stat, p_value = ztest(design_a_aov, design_b_aov, value=0)

# 3. Interpret Results
alpha = 0.05
print(f"Z-Statistic: {z_stat:.4f}")
print(f"P-Value: {p_value:.4f}")

if p_value < alpha:
    print("Reject Null Hypothesis: There is a significant difference in AOV between designs.")
else:
    print("Fail to Reject Null Hypothesis: No significant difference found.")
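
Note that statsmodels' `ztest` estimates the standard deviations from the samples. When the population standard deviations are treated as known, the statistic can be computed directly from summary figures. The sketch below is one way to do that; the helper name `two_sample_z` is hypothetical, and the inputs assume the sample means equal the simulation targets (120 and 125) with known standard deviations of 15 and 18.

```python
import numpy as np
from scipy import stats


def two_sample_z(mean_a, mean_b, sigma_a, sigma_b, n_a, n_b):
    """Two-sided z-test for a difference in means with known population SDs."""
    standard_error = np.sqrt(sigma_a**2 / n_a + sigma_b**2 / n_b)
    z = (mean_a - mean_b) / standard_error
    p = 2 * stats.norm.sf(abs(z))
    return z, p


# Assumed summary statistics for Design A and Design B
z, p = two_sample_z(120, 125, 15, 18, 200, 200)
print(f"Z-Statistic: {z:.4f}")
print(f"P-Value: {p:.4f}")
```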

---

### **T-test**

#### **1. One-Sample t-test (One Sample test for unknown pop variance)**

**Business Problem Statement**

A beverage company claims that the average content of its soda cans is 330 ml. The quality control team wants to verify this claim by measuring the content of a random sample of cans.

**Hypotheses**
  - **Null hypothesis ($H_0$)**: The mean content in soda cans is 330 ml ($\mu = 330$).

  - **Alternative hypothesis ($H_1$)**: The mean content is not 330 ml ($\mu \neq 330$).

In [None]:
import numpy as np
from scipy.stats import ttest_1samp
from scipy.stats import shapiro

#sample data of 15 cans
sample_data = np.array([332, 329, 331, 328, 334, 330, 327, 333, 329, 331, 332, 328, 330, 334, 329])

#shapiro-wilk for normality of the sample
shapiro_stat, shapiro_p_value = shapiro(sample_data)
print(shapiro_p_value)

# 1-sample t-test on the data
t_statistic, p_value = ttest_1samp(sample_data, popmean = 330)
print(p_value)

0.5372052570493318
0.4250186718429414


In [None]:
import numpy as np
from scipy.stats import ttest_1samp
from scipy.stats import shapiro

#sample data of 15 cans
sample_data = np.array([332, 329, 331, 328, 334, 330, 327, 333, 329, 331, 332, 328, 330, 334, 329])

shapiro_stat, p_value_shapiro = shapiro(sample_data)
print(p_value_shapiro)

#perform the t-test using ttest-1samp function
t_statistic, p_value = ttest_1samp(sample_data, popmean = 330, alternative = 'two-sided')

print(p_value)

**Note**: If the sample data is not normally distributed, we can run the **Wilcoxon signed-rank test** as a non-parametric alternative to the one-sample t-test.
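
A minimal sketch of that alternative on the same 15 cans, assuming SciPy's `wilcoxon` applied to the differences from the claimed 330 ml:

```python
import numpy as np
from scipy.stats import wilcoxon

sample_data = np.array([332, 329, 331, 328, 334, 330, 327, 333, 329, 331, 332, 328, 330, 334, 329])

# Test whether the differences from 330 ml are centered on zero.
# Cans at exactly 330 ml give zero differences, which the default zero_method discards.
w_stat, p_value = wilcoxon(sample_data - 330)

print(f"Wilcoxon statistic: {w_stat:.1f}")
print(f"P-Value: {p_value:.4f}")
```

Because the data contain ties and zeros, SciPy may fall back to a normal approximation for the p-value, so treat the result as approximate for a sample this small.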

---

#### **2. Two Sample t-test**

**Business Problem Statement**

An HR department wants to compare the average monthly salaries of employees in two different departments (A and B) to determine if there is a significant difference.

**Hypotheses**

 - Null hypothesis ($H_0$): The mean salary in Department A equals that in Department B ($\mu_1 = \mu_2$).

 - Alternative hypothesis ($H_1$): The mean salaries are different ($\mu_1 \neq \mu_2$).

Before performing a two-sample t-test, it is important to check two key assumptions:

- Normality: Each sample should be approximately normally distributed.

- Equality of Variances: The two samples should have similar variances (for the standard t-test).

In [None]:
import numpy as np
from scipy.stats import shapiro, levene, ttest_ind, mannwhitneyu

# Monthly salaries in USD for two departments
dept_a = np.array([4500, 4700, 4200, 4800, 4600, 4400, 4550, 4650, 4750, 4300])
dept_b = np.array([4000, 4150, 4100, 3950, 4200, 4050, 4100, 4000, 4150, 4050])

#shapiro-wilk test for normality of the samples
shap_stat_a, shap_p_val_a = shapiro(dept_a)
shap_stat_b, shap_p_val_b = shapiro(dept_b)
print(shap_p_val_a, shap_p_val_b)

#levene test to check for equality of variances
levene_stat,levene_p_val = levene(dept_a, dept_b)
print(levene_p_val)

#two-sample independent t-test
t_stats, p_val = ttest_ind(dept_a, dept_b, equal_var = False)
print(p_val)

0.7713665744080458 0.8485975564649946
0.029281204462727414
1.4988271248674483e-05


**1. The sample data**

In [None]:
import numpy as np

# Monthly salaries in USD for two departments
dept_a = np.array([4500, 4700, 4200, 4800, 4600, 4400, 4550, 4650, 4750, 4300])
dept_b = np.array([4000, 4150, 4100, 3950, 4200, 4050, 4100, 4000, 4150, 4050])

**2. Use the Shapiro-Wilk test to check if each sample follows a normal distribution.**

In [None]:
from scipy.stats import shapiro

# Shapiro-Wilk test for normality
stat_a, p_a = shapiro(dept_a)
stat_b, p_b = shapiro(dept_b)

print(f"Dept A: p-value = {p_a:.4f}")
print(f"Dept B: p-value = {p_b:.4f}")

if p_a > 0.05:
    print("Dept A: Sample looks normal.")
else:
    print("Dept A: Sample does NOT look normal.")

if p_b > 0.05:
    print("Dept B: Sample looks normal.")
else:
    print("Dept B: Sample does NOT look normal.")

If the Shapiro-Wilk p-value is greater than 0.05, there is no significant evidence against normality, so the sample can be treated as approximately normally distributed.

**3. Use Levene's test to check if the variances of the two samples are equal.**

In [None]:
from scipy.stats import levene

# Levene's test for equal variances
stat_levene, p_levene = levene(dept_a, dept_b)

print(f"Levene's test p-value = {p_levene:.4f}")

if p_levene > 0.05:
    print("Variances are equal.")
    equal_var = True
else:
    print("Variances are NOT equal.")
    equal_var = False

If the p-value of Levene's test is greater than 0.05, there is no significant evidence of unequal variances, so you can assume equal variances.

**4. Choose the appropriate t-test based on the result of Levene's test.**

In [None]:
from scipy.stats import ttest_ind

# Two-sample t-test: Student's if variances are equal, Welch's otherwise
# (equal_var was set from Levene's test in the previous step)
t_stat, p_value = ttest_ind(dept_a, dept_b, equal_var=equal_var)

print(f"T-statistic: {t_stat:.2f}")
print(f"P-value: {p_value:.4f}")

If the p-value is less than 0.05, you can conclude there is a significant difference in means.
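
The four steps can also be combined into one cell that picks the test automatically. This is a sketch of one reasonable workflow, not the only one: it falls back to the Mann-Whitney U test when either sample fails the normality check, and `test_name` is an illustrative variable.

```python
import numpy as np
from scipy.stats import shapiro, levene, ttest_ind, mannwhitneyu

dept_a = np.array([4500, 4700, 4200, 4800, 4600, 4400, 4550, 4650, 4750, 4300])
dept_b = np.array([4000, 4150, 4100, 3950, 4200, 4050, 4100, 4000, 4150, 4050])
alpha = 0.05

_, p_a = shapiro(dept_a)
_, p_b = shapiro(dept_b)
_, p_levene = levene(dept_a, dept_b)

if p_a > alpha and p_b > alpha:
    # Both samples look normal: choose Student's or Welch's t-test from Levene's result
    equal_var = p_levene > alpha
    test_name = "Student's t-test" if equal_var else "Welch's t-test"
    stat, p_value = ttest_ind(dept_a, dept_b, equal_var=equal_var)
else:
    # Non-parametric fallback when normality is doubtful
    test_name = "Mann-Whitney U test"
    stat, p_value = mannwhitneyu(dept_a, dept_b, alternative="two-sided")

print(f"{test_name}: statistic = {stat:.2f}, p-value = {p_value:.6f}")
```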

**Scenario 3: R&D Battery Testing (Two-Sample Independent T-Test), Domain: Manufacturing / Hardware Engineering**

A tech company is selecting a battery supplier for their new smartwatch. Supplier A claims their battery lasts 24 hours. Supplier B is cheaper but claims similar performance.
**Constraint:** Testing is expensive and time-consuming. The engineering team can only test 12 batteries from each supplier. Because the population standard deviation is unknown and the samples are small ($n < 30$), a z-test is inappropriate; we use a t-test instead.

Null Hypothesis ($H_0$): The average battery life of Supplier A and Supplier B is identical.$$H_0: \mu_A = \mu_B$$

Alternative Hypothesis ($H_1$): The average battery life differs between the two suppliers.$$H_1: \mu_A \neq \mu_B$$

In [None]:
from scipy import stats
import pandas as pd

# 1. Data (Hours of battery life)
# Small sample sizes (n=12)
supplier_a = [24.1, 23.8, 24.5, 23.0, 24.2, 23.9, 24.0, 23.5, 24.8, 24.1, 23.7, 24.3]
supplier_b = [22.8, 23.1, 22.5, 23.0, 22.9, 23.5, 22.1, 23.3, 22.8, 23.0, 22.7, 23.4]

# 2. Perform T-Test (Assuming equal variance for this hardware spec)
t_stat, p_val = stats.ttest_ind(supplier_a, supplier_b, equal_var=True)

print(f"Supplier A Mean: {sum(supplier_a)/len(supplier_a):.2f} hrs")
print(f"Supplier B Mean: {sum(supplier_b)/len(supplier_b):.2f} hrs")
print(f"P-Value: {p_val:.6f}") # Using 6f because p-value might be very small

# 3. Decision
if p_val < 0.05:
    print("Reject Null: Mean battery life differs significantly between the suppliers.")
    # Interpretation: the test is two-sided, so compare the means above.
    # If A > B and significant, don't buy B even if it's cheaper!
else:
    print("Fail to Reject Null: No significant difference found.")
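
To see what `equal_var=True` does, the pooled-variance t-statistic can be computed by hand and compared with SciPy's result. The names `pooled_var`, `t_manual`, and `p_manual` are illustrative.

```python
import numpy as np
from scipy import stats

supplier_a = np.array([24.1, 23.8, 24.5, 23.0, 24.2, 23.9, 24.0, 23.5, 24.8, 24.1, 23.7, 24.3])
supplier_b = np.array([22.8, 23.1, 22.5, 23.0, 22.9, 23.5, 22.1, 23.3, 22.8, 23.0, 22.7, 23.4])

n_a, n_b = len(supplier_a), len(supplier_b)
df = n_a + n_b - 2

# Pooled variance: sample variances (ddof=1) weighted by their degrees of freedom
pooled_var = ((n_a - 1) * supplier_a.var(ddof=1) + (n_b - 1) * supplier_b.var(ddof=1)) / df

t_manual = (supplier_a.mean() - supplier_b.mean()) / np.sqrt(pooled_var * (1 / n_a + 1 / n_b))
p_manual = 2 * stats.t.sf(abs(t_manual), df)

t_scipy, p_scipy = stats.ttest_ind(supplier_a, supplier_b, equal_var=True)
print(f"Manual: t = {t_manual:.4f}, p = {p_manual:.6f}")
print(f"SciPy:  t = {t_scipy:.4f}, p = {p_scipy:.6f}")
```

If the equal-variance assumption is in doubt, running Levene's test first and switching to `equal_var=False` is the safer route.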

---

### **ANOVA**

Before performing a one-way ANOVA, it is essential to check the following key assumptions:

- Normality: Each group should be approximately normally distributed.

- Homogeneity of Variances: All groups should have similar variances.

- Independence: Observations should be independent (usually ensured by study design).

**Scenario 1: Marketing Campaign Effectiveness Problem**

**Statement**: A digital marketing team wants to determine which social media platform yields the highest user engagement score (0-100) for a new product launch. They run simultaneous campaigns on TikTok, Instagram, and LinkedIn and collect engagement scores from random users on each platform.

Hypothesis Formulation:
- **Null Hypothesis ($H_0$)**: There is no significant difference in the mean engagement scores across the three platforms.$$H_0: \mu_{TikTok} = \mu_{Instagram} = \mu_{LinkedIn}$$
- **Alternative Hypothesis ($H_1$)**: At least one platform has a mean engagement score different from the others.

In [None]:
import pandas as pd
import scipy.stats as stats
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import f_oneway, shapiro, levene, kruskal

tiktok = np.array([65.70876234, 75.24148111, 53.36094   , 88.31124168, 66.9498208 ,
       62.7274252 , 84.95705692, 73.6630608 , 74.53086367, 80.41976844,
       61.79220387, 74.1663686 , 84.6109146 , 65.48623431, 73.93197654,
       94.78663015, 78.69687923, 63.32408083, 63.86821659, 64.15995633,
       79.18828803, 66.04985321, 62.63865761, 77.99304857, 83.73391146,
       69.32029347, 55.91996341, 51.52365563, 75.99622561, 69.89560614])

instagram = np.array([76.38405464, 83.48538268, 68.92500799, 77.15189002, 60.37999486,
       54.9064827 , 84.64504956, 76.26070475, 51.35595813, 81.21324046,
       75.16737898, 66.89979229, 76.945872  , 85.60134782, 58.56430539,
       72.17314063, 52.95971012, 73.63052986, 63.54290254, 72.16740738,
       79.10188442, 83.40293232, 70.20805162, 58.08966378, 80.21496265,
       70.56094365, 70.68869961, 72.30945941, 76.05238316, 71.82903503])

linkedin = np.array([74.09639092, 62.94630478, 65.10572241, 81.18655794, 63.9444842 ,
       74.01858283, 65.93010531, 78.05168414, 57.61019555, 65.94576657,
       73.24615889, 85.56356553, 63.36566193, 65.25667837, 62.22340595,
       76.34393328, 77.79381458, 57.40172342, 61.51665284, 63.05031077,
       58.25281691, 66.58920115, 73.26048231, 40.97154251, 52.80879973,
       67.46343802, 60.08994255, 60.4971181 , 59.24497195, 80.8282937 ])

In [None]:
print("--- Assumption Checks ---")

# 2. Check Assumption 1: Normality (Shapiro-Wilk Test)
# H0: The data is normally distributed.
# H1: The data is NOT normally distributed.
# We fail to reject H0 if p-value > 0.05

groups = {'TikTok': tiktok, 'Instagram': instagram, 'LinkedIn': linkedin}
normality_passed = True

for name, data in groups.items():
    stat, p = shapiro(data)
    print(f"{name} Normality p-value: {p:.4f}")
    if p < 0.05:
        normality_passed = False
        print(f"--> Warning: {name} data may not be normally distributed.")

if normality_passed:
    print("Result: Normality assumption holds (all p > 0.05).")
else:
    print("Result: Normality assumption violated.")

print("\n")

# 3. Check Assumption 2: Homogeneity of Variance (Levene's Test)

stat, p_levene = levene(tiktok, instagram, linkedin)
print(f"Levene's Test p-value: {p_levene:.4f}")

variance_passed = True
if p_levene < 0.05:
    variance_passed = False
    print("Result: Homogeneity of variance assumption violated.")
else:
    print("Result: Homogeneity of variance assumption holds.")

print("\n-------------------------")

# 4. Run ANOVA (Only if assumptions are reasonably met)
# Note: ANOVA is robust to slight deviations, but if assumptions fail badly,
# we should use Kruskal-Wallis (Non-parametric).

if normality_passed and variance_passed:
    print("Assumptions met. Running One-Way ANOVA...")
    f_stat, p_value = f_oneway(tiktok, instagram, linkedin)
    print(f"F-Statistic: {f_stat:.2f}")
    print(f"P-Value: {p_value:.4f}")

    if p_value < 0.05:
        print("Final Conclusion: Significant difference found between platforms.")
    else:
        print("Final Conclusion: No significant difference found.")
else:
    print("Assumptions violated. Consider using Kruskal-Wallis H-test instead.")

--- Assumption Checks ---
TikTok Normality p-value: 0.8886
Instagram Normality p-value: 0.0931
LinkedIn Normality p-value: 0.3852
Result: Normality assumption holds (all p > 0.05).


Levene's Test p-value: 0.6090
Result: Homogeneity of variance assumption holds.

-------------------------
Assumptions met. Running One-Way ANOVA...
F-Statistic: 2.56
P-Value: 0.0829
Final Conclusion: No significant difference found.



---

**Comparing Customer Satisfaction Across Service Centers**

**Problem Statement**

An automotive company operates four service centers in different cities. The company collects customer satisfaction scores (on a scale of 1 to 10) from recent customers at each center. Management wants to determine if customer satisfaction differs significantly across the service centers.

**Hypotheses**

- **Null Hypothesis ($H_0$)**: The mean customer satisfaction scores are equal across all service centers ($\mu_1 = \mu_2 = \mu_3 = \mu_4$).

- **Alternative Hypothesis ($H_1$)**: At least one service center has a different mean satisfaction score.

In [None]:
import numpy as np
from scipy.stats import f_oneway, shapiro, levene, kruskal

# Satisfaction scores for each service center
center_1 = np.array([8, 9, 7, 8, 9])
center_2 = np.array([6, 7, 6, 8, 7])
center_3 = np.array([9, 8, 9, 10, 9])
center_4 = np.array([7, 6, 7, 8, 7])

# 1. Check normality for each group (Shapiro-Wilk test)
print("Normality (Shapiro-Wilk):")
for name, group in zip(['Center 1', 'Center 2', 'Center 3', 'Center 4'], [center_1, center_2, center_3, center_4]):
    stat, p = shapiro(group)
    print(f"{name}: p-value = {p:.4f}")

# 2. Check homogeneity of variances (Levene's test)
stat, p = levene(center_1, center_2, center_3, center_4)
print(f"\nLevene's test for equal variances: p-value = {p:.4f}")

# 3. Perform one-way ANOVA (use kruskal instead if the ANOVA assumptions are not met)
f_stat, p_value = f_oneway(center_1, center_2, center_3, center_4)
print(f"\nANOVA: F-statistic = {f_stat:.2f}, p-value = {p_value:.4f}")

**Note:** If the assumptions of ANOVA are not met, use the **Kruskal-Wallis H-test** instead.

**How to Interpret the results**

1. If all Shapiro-Wilk p-values > 0.05: There is no significant evidence against normality in any group.

2. If Levene’s p-value > 0.05: There is no significant evidence of unequal variances (homogeneity holds).

3. If ANOVA p-value < 0.05: At least one group mean is significantly different.
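
These rules can be turned into a single decision flow. The sketch below reuses the service-center scores and switches to the Kruskal-Wallis H-test when either assumption check fails; the 0.05 cutoffs and the name `test_name` are illustrative choices. With only five scores per center, the assumption checks have little power, so the outcome should be read with care.

```python
import numpy as np
from scipy.stats import f_oneway, shapiro, levene, kruskal

centers = [
    np.array([8, 9, 7, 8, 9]),   # Center 1
    np.array([6, 7, 6, 8, 7]),   # Center 2
    np.array([9, 8, 9, 10, 9]),  # Center 3
    np.array([7, 6, 7, 8, 7]),   # Center 4
]
alpha = 0.05

# Index [1] picks the p-value from each (statistic, p-value) result
normality_ok = all(shapiro(scores)[1] > alpha for scores in centers)
variance_ok = levene(*centers)[1] > alpha

if normality_ok and variance_ok:
    test_name = "One-way ANOVA"
    stat, p_value = f_oneway(*centers)
else:
    test_name = "Kruskal-Wallis H-test"
    stat, p_value = kruskal(*centers)

print(f"{test_name}: statistic = {stat:.2f}, p-value = {p_value:.4f}")
if p_value < alpha:
    print("At least one service center differs significantly.")
else:
    print("No significant difference found.")
```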

### **Chi-Squared Test**

**Problem Statement:**

A marketing team at a software company wants to determine if there is a relationship between the type of advertising medium (online, print, television) and software purchases. They collect data from a sample of customers, noting the advertising medium each customer was exposed to and whether they purchased the software.

**Hypotheses:**

- Null Hypothesis ($H_0$): There is no association between the type of advertising medium and software purchases. The variables are independent.
- Alternative Hypothesis ($H_1$): There is an association between the type of advertising medium and software purchases. The variables are not independent.

In [None]:
import numpy as np
from scipy.stats import chi2_contingency

# Observed frequency data in a contingency table
# Rows in the data array represent: Advertising Medium (Online, Print, Television)
# Columns in the data array represent: Purchase (Yes, No)
data = np.array([[30, 10],  # Online
                 [20, 20],  # Print
                 [50, 30]]) # Television

# Perform Chi-Square Test of Independence
chi2_stat, p_value, dof, expected = chi2_contingency(data)

# Interpret the results
alpha = 0.05
if p_value < alpha:
    print(f"Chi-Square Statistic: {chi2_stat:.4f}, p-value: {p_value:.4f}")
    print("We have sufficient evidence to reject the null hypothesis.")
    print("There is a significant association between the advertising medium and software purchases.")
else:
    print(f"Chi-Square Statistic: {chi2_stat:.4f}, p-value: {p_value:.4f}")
    print("We do not have sufficient evidence to reject the null hypothesis.")
    print("There is no significant association between the advertising medium and software purchases.")

Chi-Square Statistic: 5.3333, p-value: 0.0695
We do not have sufficient evidence to reject the null hypothesis.
There is no significant association between the advertising medium and software purchases.
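
To show where these numbers come from, the statistic can be rebuilt by hand: each expected count is (row total × column total) / grand total, and the statistic sums (observed − expected)² / expected over all cells. A minimal sketch using the same table:

```python
import numpy as np
from scipy.stats import chi2

observed = np.array([[30, 10],   # Online
                     [20, 20],   # Print
                     [50, 30]])  # Television

row_totals = observed.sum(axis=1, keepdims=True)
col_totals = observed.sum(axis=0, keepdims=True)

# Expected counts if medium and purchase were independent
expected = row_totals * col_totals / observed.sum()

chi2_stat = ((observed - expected) ** 2 / expected).sum()
dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)
p_value = chi2.sf(chi2_stat, dof)

print("Expected counts:\n", expected)
print(f"Chi-Square Statistic: {chi2_stat:.4f}, dof: {dof}, p-value: {p_value:.4f}")
```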


**Problem Statement 2:**
A restaurant chain wants to determine if there is an association between the type of cuisine offered (Italian, Chinese, Mexican) and customer satisfaction levels (satisfied, neutral, dissatisfied). They conduct a survey among customers across several locations to collect this data.

**Hypotheses:**
- Null Hypothesis ($H_0$): There is no association between the type of cuisine and customer satisfaction levels. The variables are independent.
- Alternative Hypothesis ($H_1$): There is an association between the type of cuisine and customer satisfaction levels. The variables are not independent.

In [None]:
import numpy as np
from scipy.stats import chi2_contingency

# Observed frequency data in a contingency table
# Rows: Type of Cuisine (Italian, Chinese, Mexican)
# Columns: Customer Satisfaction (Satisfied, Neutral, Dissatisfied)
data = np.array([[40, 30, 10],  # Italian
                 [35, 25, 20],  # Chinese
                 [25, 30, 15]]) # Mexican

# Perform Chi-Square Test of Independence
chi2_stat, p_value, dof, expected = chi2_contingency(data)

# Interpret the results
alpha = 0.05
if p_value < alpha:
    print(f"Chi-Square Statistic: {chi2_stat:.4f}, p-value: {p_value:.4f}")
    print("We have sufficient evidence to reject the null hypothesis.")
    print("There is a significant association between the type of cuisine and customer satisfaction levels.")
else:
    print(f"Chi-Square Statistic: {chi2_stat:.4f}, p-value: {p_value:.4f}")
    print("We do not have sufficient evidence to reject the null hypothesis.")
    print("There is no significant association between the type of cuisine and customer satisfaction levels.")
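
The test reports only whether an association exists, not which cells drive it. One common follow-up, sketched here as an optional extra, is to inspect the Pearson residuals, (observed − expected) / √expected; the name `residuals` is illustrative.

```python
import numpy as np
from scipy.stats import chi2_contingency

data = np.array([[40, 30, 10],   # Italian
                 [35, 25, 20],   # Chinese
                 [25, 30, 15]])  # Mexican

chi2_stat, p_value, dof, expected = chi2_contingency(data)

# Positive residual: more customers than independence predicts; negative: fewer
residuals = (data - expected) / np.sqrt(expected)

print("Pearson residuals (rows: cuisine, columns: satisfaction):")
print(np.round(residuals, 2))
```

The squared residuals add up to the chi-square statistic, so the cells with the largest absolute residuals contribute most to it.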

**Scenario 3: Banking Risk Management (Credit Default)**

**Problem Statement:** A commercial bank is analyzing its loan portfolio. The Risk Management team wants to understand if a customer's Housing Status (Own, Mortgage, Rent) influences their Loan Default Status (Defaulted, Paid Off). If there is a strong dependency, housing status will be weighted more heavily in the bank's future credit scoring models.

**Hypothesis Statements:**

- **Null Hypothesis ($H_0$):** Loan Default status is independent of Housing Status.
- **Alternative Hypothesis ($H_1$):** Loan Default status is dependent on Housing Status.

In [None]:
import pandas as pd
from scipy.stats import chi2_contingency

# 1. Create a Contingency Table directly
# In many reports, you are given the summarized table directly rather than raw rows.
# Rows: Housing Status (Mortgage, Own, Rent)
# Columns: Loan Status (Defaulted, Paid Off)

data = [
    [45, 350],  # Mortgage: 45 defaults, 350 paid off
    [15, 200],  # Own: 15 defaults, 200 paid off
    [60, 280]   # Rent: 60 defaults, 280 paid off
]

# Labels for clarity (not strictly needed for calculation but good for display)
rows = ['Mortgage', 'Own', 'Rent']
cols = ['Defaulted', 'Paid Off']
contingency_table = pd.DataFrame(data, index=rows, columns=cols)

print("--- Loan Portfolio Contingency Table ---")
print(contingency_table)
print("\n")

# 2. Perform Chi-Squared Test
chi2, p, dof, expected = chi2_contingency(contingency_table)

# 3. Interpret Results
alpha = 0.05
print(f"Chi2 Statistic: {chi2:.4f}")
print(f"P-Value: {p:.4f}")
print(f"Degrees of Freedom: {dof}")

if p < alpha:
    print("Reject Null Hypothesis: Default status is associated with Housing Status.")
else:
    print("Fail to Reject Null Hypothesis: No significant association between Housing Status and Default status.")

# Optional: View Expected Frequencies to check assumptions
# (Chi-Square assumes an expected frequency of at least 5 in each cell)
print("\nExpected Frequencies:\n", expected)
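
The expected-frequency check can be made explicit rather than read off the printed table. The sketch below applies the common rule of thumb of at least 5 per cell to the loan table; `assumption_ok` and `default_rates` are illustrative names.

```python
import numpy as np
from scipy.stats import chi2_contingency

data = np.array([[45, 350],   # Mortgage
                 [15, 200],   # Own
                 [60, 280]])  # Rent

_, _, _, expected = chi2_contingency(data)

min_expected = expected.min()
assumption_ok = bool((expected >= 5).all())
print(f"Smallest expected frequency: {min_expected:.2f}")
print(f"All expected frequencies >= 5: {assumption_ok}")

# Observed default rate per housing status, for context
default_rates = data[:, 0] / data.sum(axis=1)
for label, rate in zip(["Mortgage", "Own", "Rent"], default_rates):
    print(f"{label}: {rate:.1%} defaulted")
```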