### **Steps for Performing a Hypothesis Test Using the p-value Method**

1. **State the Hypotheses**
  - **Null Hypothesis ($H_0$)**: The default assumption or claim to be tested (e.g., "There is no difference in means").

  - **Alternative Hypothesis ($H_1$)**: The claim you want to find evidence for (e.g., "There is a difference in means").

2. **Choose the Significance Level (α)**
  - Common choices are 0.05, 0.01, or 0.10.

  - This is the threshold for deciding whether a result is statistically significant.

3. **Select the Appropriate Test**

  - Decide which statistical test suits your data and hypothesis (e.g., t-test, z-test, chi-square test, ANOVA).

4. **Calculate the Test Statistic**
  - Use the sample data to compute the test statistic (e.g., t, z, F, or chi-square value) based on the chosen test.

5. **Compute the p-value**
  - The p-value is the probability of observing a test statistic at least as extreme as the one calculated, assuming the null hypothesis is true.

6. **Compare the p-value to the Significance Level**
  - **If p-value ≤ α**: Reject the null hypothesis ($H_0$). There is sufficient evidence to support the alternative hypothesis.

  - **If p-value > α**: Fail to reject the null hypothesis ($H_0$). There is insufficient evidence to support the alternative hypothesis.

7. **State the Conclusion**
  - Clearly state the result in the context of the problem, referencing whether the null hypothesis was rejected or not.
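
The seven steps above can be sketched end to end in a few lines. This is a minimal illustration, not a general-purpose routine: the numbers (`mu_0`, `sigma`, `n`, `sample_mean`) are made-up values, the helper `p_value_decision` is a hypothetical name, and a one-sample z-test with a known population standard deviation is assumed for step 3.

```python
from math import sqrt

from scipy.stats import norm


def p_value_decision(p_value, alpha=0.05):
    """Step 6: compare the p-value with the significance level."""
    if p_value <= alpha:
        return "Reject H0"
    return "Fail to reject H0"


# Steps 1-2: H0: mu = 100, H1: mu != 100, alpha = 0.05 (illustrative values)
mu_0, alpha = 100, 0.05

# Step 3: one-sample z-test, assuming the population std dev is known
sigma, n, sample_mean = 15, 36, 106

# Step 4: test statistic
z = (sample_mean - mu_0) / (sigma / sqrt(n))

# Step 5: two-sided p-value from the standard normal distribution
p_value = 2 * norm.sf(abs(z))

# Steps 6-7: decision and conclusion
decision = p_value_decision(p_value, alpha)
print(f"z = {z:.2f}, p-value = {p_value:.4f} -> {decision}")
```

The decision rule is kept in its own function so that the same comparison can be reused with a p-value from any of the tests covered later.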

---

#### **The Significance Level**

Imagine you’re a doctor testing a new medicine. You want to know if it really works better than the old one. To decide, you run an experiment with patients and collect data.

**The Courtroom Analogy** : Think of your experiment like a courtroom trial:

- Null Hypothesis ($H_0$): The new medicine is no better than the old one (the defendant is innocent).

- Alternative Hypothesis ($H_1$): The new medicine is better (the defendant is guilty).

In court, you don’t want to wrongly convict an innocent person. In science, you don’t want to wrongly claim your medicine works when it doesn’t. This mistake is called a **Type I error**.

**Enter the Significance Level ($\alpha$)**

The significance level, written as $\alpha$, is like the judge setting a threshold for evidence. It is the maximum probability you are willing to accept of making a Type I error: declaring that the medicine works when it actually does not.

If you set $\alpha = 0.05$, you are saying: “If the new medicine is really no better than the old one, I accept a 5% chance of wrongly concluding that it is better.”
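
One way to see what $\alpha$ controls is to simulate many experiments in which the null hypothesis is true and count how often it is rejected anyway. The sketch below assumes a two-sided one-sample z-test on standard normal data; the seed, the number of experiments, and the sample size are arbitrary choices made for illustration.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
alpha = 0.05
n_experiments, n = 10_000, 30
mu_0, sigma = 0.0, 1.0  # H0 is true by construction

# Each row is one experiment drawn from the null distribution
samples = rng.normal(loc=mu_0, scale=sigma, size=(n_experiments, n))

# Two-sided z-test for every experiment at once
z = (samples.mean(axis=1) - mu_0) / (sigma / np.sqrt(n))
p_values = 2 * norm.sf(np.abs(z))

# Share of experiments that reject H0 even though it is true (Type I errors)
type_i_rate = np.mean(p_values <= alpha)
print(f"Share of false rejections: {type_i_rate:.3f}")
```

The printed share should land close to the chosen $\alpha$, which is exactly the Type I error rate the threshold is meant to cap.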

### **p-value**

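- The higher the p-value, the weaker the evidence against the null hypothesis, and the less likely it is to be rejected.
- The lower the p-value, the stronger the evidence against the null hypothesis; once it falls to $\alpha$ or below, the null hypothesis is rejected.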
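
To make the relationship concrete, the sketch below converts a z-statistic into a p-value for each kind of alternative hypothesis. The helper name `z_p_value` and the example z-values are assumptions chosen for illustration.

```python
from scipy.stats import norm


def z_p_value(z, alternative="two-sided"):
    """Return the p-value of a z-statistic under the standard normal distribution."""
    if alternative == "two-sided":
        return 2 * norm.sf(abs(z))  # both tails
    if alternative == "greater":
        return norm.sf(z)           # upper tail only
    if alternative == "less":
        return norm.cdf(z)          # lower tail only
    raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")


# The more extreme the statistic, the smaller the p-value
for z in (0.5, 1.96, 3.0):
    print(f"z = {z:>4}: two-sided p-value = {z_p_value(z):.4f}")
```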

---

### **Z-Test**

A z-test is a statistical method used in hypothesis testing to determine if there is a significant difference between sample and population means, or between the means of two samples. It is particularly useful when the population standard deviation is known and the sample size is large (typically greater than 30).

There are two main types of z-test:

1. **One-Sample Z-Test:** Used to determine whether the mean of a single sample is different from a known population mean.
2. **Two-Sample Z-Test:** Used to compare the means of two independent samples to see if they are significantly different from each other.

**Assumptions of Z-Tests**

For a z-test to yield valid results, these assumptions must be met:
1. **Normal Distribution:** The data should be approximately normally distributed. With large samples this requirement relaxes, because the central limit theorem makes the sampling distribution of the mean approximately normal.
2. **Known Population Standard Deviation:** The standard deviation of the population must be known. If it is unknown, a t-test is more appropriate.
3. **Random Sampling:** The sample data should be randomly drawn from the population, ensuring that it is representative of the actual population data.
4. **Independence:** The samples must be independent of each other, particularly in two-sample z-tests.
5. **Continuous Data:** The z-test is applicable for continuous data, where the variable of interest can take any numeric value.
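
For reference, the test statistics of the two variants can be written as follows. The notation is the standard textbook one and is an assumption here, not something defined elsewhere in this document: $\bar{x}$ is a sample mean, $\mu_0$ the hypothesized population mean, $\sigma$ a known population standard deviation, and $n$ a sample size, with subscripts 1 and 2 marking the two samples.

$$
z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} \qquad \text{(one-sample)}
$$

$$
z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\sigma_1^2 / n_1 + \sigma_2^2 / n_2}} \qquad \text{(two-sample, under } H_0: \mu_1 = \mu_2 \text{)}
$$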

---

#### **1. One-Sample Z-test**

**Business Problem Statement**

A retail chain claims that the average monthly spending of its loyalty program members is ₹5,000. The management wants to verify this claim using a random sample of 50 members. Population standard deviation is ₹600.

**Hypotheses**
- **Null hypothesis ($H_0$)**: The average monthly spending is ₹5,000 ($\mu = 5000$).

- **Alternative hypothesis ($H_1$)**: The average monthly spending is not ₹5,000 ($\mu \neq 5000$).

**Business Scenario**
- Claimed mean: ₹5,000

- Sample mean: ≈ ₹5,284 (computed from the sample in the code below)

- Population std dev: ₹600

- Sample size: 50

In [1]:
import numpy as np

sample = np.array([6258.43140758, 5440.09432502, 5787.24279046, 6544.53591952,
       6320.53479409, 4613.63327207, 5770.05305052, 5109.18567502,
       5138.06868892, 5446.35910116, 5286.4261427 , 6072.56410418,
       5656.62263509, 5273.0050099 , 5466.31793965, 5400.20459642,
       6096.44744389, 5076.90504174, 5387.84062099, 4687.54255642,
       3668.2061105 , 5592.17115726, 5718.66171932, 4754.70098776,
       6561.85277439, 4327.38059524, 5227.45511038, 5087.68968998,
       6119.66752862, 6081.61526194, 5292.96845542, 5426.89751176,
       4667.32855142, 4011.52211907, 4991.2527104 , 5293.80938146,
       5938.17440844, 5921.42790927, 4967.60390956, 5018.61834965,
       4570.86822096, 4347.98923769, 4176.23788562, 6370.46523714,
       4894.20869095, 4937.15541903, 4448.32278397, 5666.4942135 ,
       4231.66129147, 5072.35583187])

pop_std_dev = 600
sample_mean = np.mean(sample)
pop_mean = 5000
sample_size = 50

z_stat = (sample_mean - pop_mean) / (pop_std_dev / (np.sqrt(sample_size)))

print(f"Z-statistic: {z_stat:.2f}")

Z-statistic: 3.35


In [2]:
sample_mean

np.float64(5284.335563387601)

In [None]:
import scipy.stats as stats
stats.norm.cdf(z_stat)

np.float64(0.9995972919804501)

In [None]:
p_val = 1 - stats.norm.cdf(z_stat)

In [None]:
p_val

np.float64(0.0004027080195498911)

In [None]:
p_val * 2  # double the one-tailed p-value because the alternative hypothesis is two-sided

np.float64(0.0008054160390997822)
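
The cells above can be folded into one function that also covers the final two steps of the procedure, the comparison with $\alpha$ and the conclusion. This is a sketch under assumptions: the function name `one_sample_ztest` is hypothetical, $\alpha = 0.05$ is assumed because the problem statement does not fix a significance level, and the sample mean is the value printed above.

```python
import numpy as np
from scipy.stats import norm


def one_sample_ztest(sample_mean, pop_mean, pop_std_dev, n):
    """Two-sided one-sample z-test with a known population standard deviation."""
    z = (sample_mean - pop_mean) / (pop_std_dev / np.sqrt(n))
    p = 2 * norm.sf(abs(z))  # sf(x) = 1 - cdf(x), evaluated in both tails
    return z, p


# Values from the loyalty-program example; the sample mean is the one printed above
z_stat, p_value = one_sample_ztest(5284.335563387601, 5000, 600, 50)

alpha = 0.05  # assumed significance level
if p_value <= alpha:
    decision = "Reject H0: average monthly spending differs from 5,000."
else:
    decision = "Fail to reject H0: no evidence that spending differs from 5,000."

print(f"Z-statistic: {z_stat:.2f}, p-value: {p_value:.6f}")
print(decision)
```

Using `norm.sf` instead of `1 - norm.cdf` gives the same tail probability while avoiding a subtraction of two nearly equal numbers when the statistic is large.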

---

#### **2. Two-Sample Z-Test**

**Business Problem Statement**

A bank wants to compare the average transaction amounts of customers from two different cities to determine if there is a significant difference.

**Hypotheses**
 - **Null hypothesis ($H_0$)**: The average transaction amount is the same in both cities ($\mu_1 = \mu_2$).

 - **Alternative hypothesis ($H_1$)**: The average transaction amount differs between the two cities ($\mu_1 \neq \mu_2$).

**Business Scenario**

- **City 1:** mean ≈ 2500, std ≈ 400, n = 60

- **City 2:** mean ≈ 2300, std ≈ 420, n = 60

In [None]:
import numpy as np
from statsmodels.stats.weightstats import ztest

# Generate samples for both cities
np.random.seed(1)
sample_city1 = np.random.normal(loc=2500, scale=400, size=60)
sample_city2 = np.random.normal(loc=2300, scale=420, size=60)

# Perform two-sample z-test
z_stat, p_value = ztest(sample_city1, sample_city2, alternative='two-sided')

print(f"Z-statistic: {z_stat:.2f}")
print(f"P-value: {p_value:.4f}")

Z-statistic: 2.67
P-value: 0.0076
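
The `ztest` call above estimates the standard deviation from the samples themselves. If the population standard deviations are treated as known, which is the textbook z-test assumption, the statistic can be computed directly. The sketch below assumes the values 400 and 420 from the business scenario are the true population standard deviations; the function name `two_sample_ztest_known_sigma` is hypothetical.

```python
import numpy as np
from scipy.stats import norm


def two_sample_ztest_known_sigma(x1, x2, sigma1, sigma2):
    """Two-sided two-sample z-test with known population standard deviations."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    # Standard error of the difference between the two sample means
    se = np.sqrt(sigma1**2 / x1.size + sigma2**2 / x2.size)
    z = (x1.mean() - x2.mean()) / se
    return z, 2 * norm.sf(abs(z))


# Regenerate the same samples as in the cell above
np.random.seed(1)
sample_city1 = np.random.normal(loc=2500, scale=400, size=60)
sample_city2 = np.random.normal(loc=2300, scale=420, size=60)

z_known, p_known = two_sample_ztest_known_sigma(sample_city1, sample_city2, 400, 420)
print(f"Z-statistic (known sigma): {z_known:.2f}")
print(f"P-value (known sigma): {p_known:.4f}")
```

The result will usually be close to, but not identical with, the `ztest` output, because the two approaches use different standard deviations in the denominator.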


---

### **T-test**

#### **1. One-Sample t-test**

**Business Problem Statement**

A beverage company claims that the average content of its soda cans is 330 ml. The quality control team wants to verify this claim by measuring the content of a random sample of cans.

**Hypotheses**
  - **Null hypothesis ($H_0$)**: The mean content is 330 ml ($\mu = 330$).

  - **Alternative hypothesis ($H_1$)**: The mean content is not 330 ml ($\mu \neq 330$).

In [None]:
import numpy as np
from scipy.stats import ttest_1samp
from scipy.stats import shapiro

# Sample data: contents (ml) of 15 cans
sample_data = np.array([332, 329, 331, 328, 334, 330, 327, 333, 329, 331, 332, 328, 330, 334, 329])

# Shapiro-Wilk test to check the normality assumption
shapiro_stat, p_value_shapiro = shapiro(sample_data)
print(p_value_shapiro)

# Perform the two-sided one-sample t-test with ttest_1samp
t_statistic, p_value = ttest_1samp(sample_data, popmean = 330, alternative = 'two-sided')

print(p_value)

0.5372052570493318
0.4250186718429414
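
To show what `ttest_1samp` computes, the sketch below rebuilds the statistic by hand from the usual formula $t = (\bar{x} - \mu_0) / (s / \sqrt{n})$, where $s$ is the sample standard deviation, and compares it with SciPy's result. The variable names are illustrative.

```python
import numpy as np
from scipy import stats

sample_data = np.array([332, 329, 331, 328, 334, 330, 327, 333, 329, 331, 332, 328, 330, 334, 329])
mu_0 = 330  # claimed mean content in ml

n = sample_data.size
sample_std = sample_data.std(ddof=1)  # ddof=1 gives the sample standard deviation

# t-statistic and two-sided p-value with n - 1 degrees of freedom
t_manual = (sample_data.mean() - mu_0) / (sample_std / np.sqrt(n))
p_manual = 2 * stats.t.sf(abs(t_manual), df=n - 1)

# Reference values from SciPy
t_scipy, p_scipy = stats.ttest_1samp(sample_data, popmean=mu_0)

print(f"manual: t = {t_manual:.4f}, p = {p_manual:.4f}")
print(f"scipy:  t = {t_scipy:.4f}, p = {p_scipy:.4f}")
```

Since the p-value is well above 0.05, the data give no evidence against the claimed mean of 330 ml.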


---

#### **2. Two-Sample t-test**

**Business Problem Statement**

An HR department wants to compare the average monthly salaries of employees in two different departments (A and B) to determine if there is a significant difference.

**Hypotheses**

 - Null hypothesis ($H_0$): The mean salary in Department A equals that in Department B ($\mu_1 = \mu_2$).

 - Alternative hypothesis ($H_1$): The mean salaries are different ($\mu_1 \neq \mu_2$).

Before performing a two-sample t-test, it is important to check two key assumptions:

- Normality: Each sample should be approximately normally distributed.

- Equality of Variances: The two samples should have similar variances (for the standard t-test).

In [8]:
import numpy as np
from scipy.stats import levene, shapiro, ttest_ind

# Monthly salaries in USD for two departments
dept_a = np.array([4500, 4700, 4200, 4800, 4600, 4400, 4550, 4650, 4750, 4300])
dept_b = np.array([4000, 4150, 4100, 3950, 4200, 4050, 4100, 4000, 4150, 4050])

# Shapiro-Wilk test for normality
stat_a, p_a = shapiro(dept_a)
stat_b, p_b = shapiro(dept_b)
print(p_a, p_b)

# Levene's test for equal variances
stat_levene, p_levene = levene(dept_a, dept_b)
print(p_levene)

# Two-sample t-test (Welch's version, which does not assume equal variances)
t_stat, p_value = ttest_ind(dept_a, dept_b, equal_var = False)
print(p_value)

0.7713665744080458 0.8485975564649946
0.029281204462727414
1.4988271248674483e-05


**1. The sample data**

In [None]:
import numpy as np

# Monthly salaries in USD for two departments
dept_a = np.array([4500, 4700, 4200, 4800, 4600, 4400, 4550, 4650, 4750, 4300])
dept_b = np.array([4000, 4150, 4100, 3950, 4200, 4050, 4100, 4000, 4150, 4050])

**2. Use the Shapiro-Wilk test to check if each sample follows a normal distribution.**

In [None]:
from scipy.stats import shapiro

# Shapiro-Wilk test for normality
stat_a, p_a = shapiro(dept_a)
stat_b, p_b = shapiro(dept_b)

print(f"Dept A: p-value = {p_a:.4f}")
print(f"Dept B: p-value = {p_b:.4f}")

if p_a > 0.05:
    print("Dept A: Sample looks normal.")
else:
    print("Dept A: Sample does NOT look normal.")

if p_b > 0.05:
    print("Dept B: Sample looks normal.")
else:
    print("Dept B: Sample does NOT look normal.")

Dept A: p-value = 0.7714
Dept B: p-value = 0.8486
Dept A: Sample looks normal.
Dept B: Sample looks normal.


If the p-value of the Shapiro-Wilk test is greater than 0.05, there is no evidence against normality, so the sample can be treated as approximately normally distributed.

**3. Use Levene's test to check if the variances of the two samples are equal.**

In [None]:
from scipy.stats import levene

# Levene's test for equal variances
stat_levene, p_levene = levene(dept_a, dept_b)

print(f"Levene's test p-value = {p_levene:.4f}")

if p_levene > 0.05:
    print("Variances are equal.")
    equal_var = True
else:
    print("Variances are NOT equal.")
    equal_var = False

Levene's test p-value = 0.0293
Variances are NOT equal.


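If the p-value of Levene's test is greater than 0.05, you can assume equal variances. Here it is 0.0293, so equal variances cannot be assumed and Welch's t-test is the appropriate choice.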

**4. Choose the appropriate t-test based on the result of Levene's test.**

In [None]:
from scipy.stats import ttest_ind

# Two-sample t-test: Student's if Levene's test indicated equal variances, Welch's otherwise
t_stat, p_value = ttest_ind(dept_a, dept_b, equal_var=equal_var)

print(f"T-statistic: {t_stat:.2f}")
print(f"P-value: {p_value:.4f}")

T-statistic: 7.02
P-value: 0.0000


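If the p-value is less than 0.05, you can conclude there is a significant difference in means. Here the p-value is about 0.000015, so we reject $H_0$: the mean salaries of the two departments differ.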
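
For readers who want to see what `equal_var=False` changes, the sketch below computes Welch's t-statistic and its degrees of freedom (the Welch-Satterthwaite approximation) by hand and checks them against `ttest_ind`. The variable names are illustrative.

```python
import numpy as np
from scipy import stats

dept_a = np.array([4500, 4700, 4200, 4800, 4600, 4400, 4550, 4650, 4750, 4300])
dept_b = np.array([4000, 4150, 4100, 3950, 4200, 4050, 4100, 4000, 4150, 4050])

n_a, n_b = dept_a.size, dept_b.size
var_a, var_b = dept_a.var(ddof=1), dept_b.var(ddof=1)  # sample variances

# Welch's statistic: each variance is scaled by its own sample size
se = np.sqrt(var_a / n_a + var_b / n_b)
t_welch = (dept_a.mean() - dept_b.mean()) / se

# Welch-Satterthwaite approximation of the degrees of freedom
df_welch = (var_a / n_a + var_b / n_b) ** 2 / (
    (var_a / n_a) ** 2 / (n_a - 1) + (var_b / n_b) ** 2 / (n_b - 1)
)
p_welch = 2 * stats.t.sf(abs(t_welch), df=df_welch)

# Reference values from SciPy
t_scipy, p_scipy = stats.ttest_ind(dept_a, dept_b, equal_var=False)

print(f"manual: t = {t_welch:.2f}, df = {df_welch:.2f}, p = {p_welch:.6f}")
print(f"scipy:  t = {t_scipy:.2f}, p = {p_scipy:.6f}")
```

Unlike the pooled version, Welch's test does not merge the two variances into one estimate, which is why it remains valid when one department's salaries are much more spread out than the other's.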

---

### **ANOVA**

Before performing a one-way ANOVA, it is essential to check the following key assumptions:

- Normality: Each group should be approximately normally distributed.

- Homogeneity of Variances: All groups should have similar variances.

- Independence: Observations should be independent (usually ensured by study design).

**Marketing Strategy Effectiveness**

**Problem Statement**

A retail company wants to evaluate the effectiveness of three different marketing strategies (Email, Social Media, and TV Ads) on monthly sales. The company runs each campaign in different regions and records the monthly sales generated from each strategy. Management wishes to know if there is a statistically significant difference in mean sales across these marketing strategies.

**Hypotheses**

- **Null Hypothesis ($H_0$)**: The mean sales are the same for all marketing strategies ($\mu_1 = \mu_2 = \mu_3$).

- **Alternative Hypothesis ($H_1$)**: At least one marketing strategy has a different mean sales figure.

In [None]:
import numpy as np
from scipy.stats import f_oneway, shapiro, levene

# Sample sales data (in thousands)
email_sales = np.array([52, 55, 53, 57, 54])
social_media_sales = np.array([60, 62, 61, 59, 63])
tv_ads_sales = np.array([58, 56, 57, 55, 54])

# 1. Check normality for each group (Shapiro-Wilk test)
print("Normality (Shapiro-Wilk):")
for name, group in zip(['Email', 'Social Media', 'TV Ads'], [email_sales, social_media_sales, tv_ads_sales]):
    stat, p = shapiro(group)
    print(f"{name}: p-value = {p:.4f}")

# 2. Check homogeneity of variances (Levene's test)
stat, p = levene(email_sales, social_media_sales, tv_ads_sales)
print(f"\nLevene's test for equal variances: p-value = {p:.4f}")

# 3. Perform one-way ANOVA
f_stat, p_value = f_oneway(email_sales, social_media_sales, tv_ads_sales)
print(f"\nANOVA: F-statistic = {f_stat:.2f}, p-value = {p_value:.4f}")

Normality (Shapiro-Wilk):
Email: p-value = 0.9276
Social Media: p-value = 0.9672
TV Ads: p-value = 0.9672

Levene's test for equal variances: p-value = 0.9290

ANOVA: F-statistic = 21.40, p-value = 0.0001
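
The F-statistic reported by `f_oneway` is the ratio of the variation between group means to the variation within groups. The sketch below recomputes it from the sums of squares for the same data; the variable names are illustrative.

```python
import numpy as np
from scipy.stats import f, f_oneway

email_sales = np.array([52, 55, 53, 57, 54])
social_media_sales = np.array([60, 62, 61, 59, 63])
tv_ads_sales = np.array([58, 56, 57, 55, 54])
groups = [email_sales, social_media_sales, tv_ads_sales]

all_values = np.concatenate(groups)
grand_mean = all_values.mean()
k, n_total = len(groups), all_values.size

# Between-group and within-group sums of squares
ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

# Mean squares: each sum of squares divided by its degrees of freedom
ms_between = ss_between / (k - 1)
ms_within = ss_within / (n_total - k)

f_manual = ms_between / ms_within
p_manual = f.sf(f_manual, k - 1, n_total - k)

f_scipy, p_scipy = f_oneway(*groups)
print(f"manual: F = {f_manual:.2f}, p = {p_manual:.4f}")
print(f"scipy:  F = {f_scipy:.2f}, p = {p_scipy:.4f}")
```

A large F means the group means are far apart relative to the scatter inside each group, which is what drives the small p-value in this example.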


---

**Comparing Customer Satisfaction Across Service Centers**

**Problem Statement**

An automotive company operates four service centers in different cities. The company collects customer satisfaction scores (on a scale of 1 to 10) from recent customers at each center. Management wants to determine if customer satisfaction differs significantly across the service centers.

**Hypotheses**

- **Null Hypothesis ($H_0$)**: The mean customer satisfaction scores are equal across all service centers ($\mu_1 = \mu_2 = \mu_3 = \mu_4$).

- **Alternative Hypothesis ($H_1$)**: At least one service center has a different mean satisfaction score.

In [None]:
import numpy as np
from scipy.stats import f_oneway, shapiro, levene

# Satisfaction scores for each service center
center_1 = np.array([8, 9, 7, 8, 9])
center_2 = np.array([6, 7, 6, 8, 7])
center_3 = np.array([9, 8, 9, 10, 9])
center_4 = np.array([7, 6, 7, 8, 7])

# 1. Check normality for each group (Shapiro-Wilk test)
print("Normality (Shapiro-Wilk):")
for name, group in zip(['Center 1', 'Center 2', 'Center 3', 'Center 4'], [center_1, center_2, center_3, center_4]):
    stat, p = shapiro(group)
    print(f"{name}: p-value = {p:.4f}")

# 2. Check homogeneity of variances (Levene's test)
stat, p = levene(center_1, center_2, center_3, center_4)
print(f"\nLevene's test for equal variances: p-value = {p:.4f}")

# 3. Perform one-way ANOVA
f_stat, p_value = f_oneway(center_1, center_2, center_3, center_4)
print(f"\nANOVA: F-statistic = {f_stat:.2f}, p-value = {p_value:.4f}")

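**How to Interpret the Results**

1. If all Shapiro-Wilk p-values > 0.05: There is no evidence against normality, so each group can be treated as approximately normal.

2. If Levene’s p-value > 0.05: There is no evidence of unequal variances, so homogeneity can be assumed.

3. If ANOVA p-value < 0.05: At least one group mean differs significantly from the others. ANOVA alone does not identify which one.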
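
These three rules can be applied programmatically. The sketch below wraps them in a helper and runs it on the service-center scores; the function name `check_and_run_anova`, the returned dictionary keys, and the 0.05 threshold for every check are assumptions made for illustration.

```python
import numpy as np
from scipy.stats import f_oneway, levene, shapiro


def check_and_run_anova(groups, alpha=0.05):
    """Apply the three interpretation rules to a list of groups."""
    shapiro_p = [shapiro(g)[1] for g in groups]  # index 1 is the p-value
    _, levene_p = levene(*groups)
    f_stat, anova_p = f_oneway(*groups)
    return {
        "normality_ok": all(p > alpha for p in shapiro_p),
        "equal_variances_ok": bool(levene_p > alpha),
        "anova_significant": bool(anova_p < alpha),
        "f_stat": float(f_stat),
        "anova_p": float(anova_p),
    }


center_1 = np.array([8, 9, 7, 8, 9])
center_2 = np.array([6, 7, 6, 8, 7])
center_3 = np.array([9, 8, 9, 10, 9])
center_4 = np.array([7, 6, 7, 8, 7])

result = check_and_run_anova([center_1, center_2, center_3, center_4])
for key, value in result.items():
    print(f"{key}: {value}")
```

With only five integer scores per center, the normality and variance checks have little power, so their outcome is best read as a rough sanity check.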

### **Chi-Squared Test**

**Problem Statement:**

A marketing team at a software company wants to determine if there is a relationship between the type of advertising medium (online, print, television) and software purchases. They collect data from a sample of customers, noting the advertising medium each customer was exposed to and whether they purchased the software.

**Hypotheses:**

- Null Hypothesis ($H_0$): There is no association between the type of advertising medium and software purchases. The variables are independent.
- Alternative Hypothesis ($H_1$): There is an association between the type of advertising medium and software purchases. The variables are not independent.

In [9]:
import numpy as np
from scipy.stats import chi2_contingency

# Observed frequency data in a contingency table
# Rows in the data array represent: Advertising Medium (Online, Print, Television)
# Columns in the data array represent: Purchase (Yes, No)
data = np.array([[30, 10],  # Online
                 [20, 20],  # Print
                 [50, 30]]) # Television

# Perform Chi-Square Test of Independence
chi2_stat, p_value, dof, expected = chi2_contingency(data)

# Report the test statistic and p-value
print(f"Chi-Square Statistic: {chi2_stat:.4f}, p-value: {p_value:.4f}")

# Interpret the results
alpha = 0.05
if p_value < alpha:
    print("We have sufficient evidence to reject the null hypothesis.")
    print("There is a significant association between the advertising medium and software purchases.")
else:
    print("We do not have sufficient evidence to reject the null hypothesis.")
    print("There is no significant association between the advertising medium and software purchases.")

Chi-Square Statistic: 5.3333, p-value: 0.0695
We do not have sufficient evidence to reject the null hypothesis.
There is no significant association between the advertising medium and software purchases.
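
The statistic above compares the observed counts with the counts that would be expected if medium and purchase were independent. The sketch below rebuilds the expected table and the statistic by hand for the same data; the variable names are illustrative.

```python
import numpy as np
from scipy.stats import chi2, chi2_contingency

observed = np.array([[30, 10],   # Online
                     [20, 20],   # Print
                     [50, 30]])  # Television

# Expected count per cell under independence: row total * column total / grand total
row_totals = observed.sum(axis=1, keepdims=True)
col_totals = observed.sum(axis=0, keepdims=True)
expected = row_totals * col_totals / observed.sum()

# Chi-square statistic and degrees of freedom: (rows - 1) * (columns - 1)
chi2_manual = ((observed - expected) ** 2 / expected).sum()
dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)
p_manual = chi2.sf(chi2_manual, dof)

print("Expected counts:")
print(expected)
print(f"Chi-Square Statistic: {chi2_manual:.4f}, dof: {dof}, p-value: {p_manual:.4f}")
```

The Television row contributes nothing to the statistic because its observed counts equal the expected ones; the whole deviation comes from the Online and Print rows.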


**Problem Statement 2:**

A restaurant chain wants to determine if there is an association between the type of cuisine offered (Italian, Chinese, Mexican) and customer satisfaction levels (satisfied, neutral, dissatisfied). They conduct a survey among customers across several locations to collect this data.

**Hypotheses:**
- Null Hypothesis ($H_0$): There is no association between the type of cuisine and customer satisfaction levels. The variables are independent.
- Alternative Hypothesis ($H_1$): There is an association between the type of cuisine and customer satisfaction levels. The variables are not independent.

In [None]:
import numpy as np
from scipy.stats import chi2_contingency

# Observed frequency data in a contingency table
# Rows: Type of Cuisine (Italian, Chinese, Mexican)
# Columns: Customer Satisfaction (Satisfied, Neutral, Dissatisfied)
data = np.array([[40, 30, 10],  # Italian
                 [35, 25, 20],  # Chinese
                 [25, 30, 15]]) # Mexican

# Perform Chi-Square Test of Independence
chi2_stat, p_value, dof, expected = chi2_contingency(data)

# Report the test statistic and p-value
print(f"Chi-Square Statistic: {chi2_stat:.4f}, p-value: {p_value:.4f}")

# Interpret the results
alpha = 0.05
if p_value < alpha:
    print("We have sufficient evidence to reject the null hypothesis.")
    print("There is a significant association between the type of cuisine and customer satisfaction levels.")
else:
    print("We do not have sufficient evidence to reject the null hypothesis.")
    print("There is no significant association between the type of cuisine and customer satisfaction levels.")

Chi-Square Statistic: 6.4983, p-value: 0.1649
We do not have sufficient evidence to reject the null hypothesis.
There is no significant association between the type of cuisine and customer satisfaction levels.
