# Customer Analysis - Statistical Hypothesis Testing

In [33]:
import pandas as pd
import numpy as np
from scipy import stats
import statsmodels.api as sm
from statsmodels.stats.weightstats import ztest
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Customer Analysis: Hypothesis Testing

This notebook demonstrates various **statistical tests** used in customer analysis:

1. **Z-Test**
2. **T-Test** (Independent and Paired/Related)
3. **ANOVA** (Analysis of Variance), followed by a **Tukey HSD** post-hoc test
4. **Chi-Square Test**

We will also define the **null hypothesis (H0)** and **alternative hypothesis (H1)** for each test.


In [34]:

np.random.seed(42)

# Create Customer Dataset with differences in Spending Score by Region
data = pd.DataFrame({
    'CustomerID': range(1, 21),
    'Age': np.random.randint(18, 60, 20),
    'Annual_Income': np.random.randint(30000, 120000, 20),
    'Gender': np.random.choice(['Male', 'Female'], 20),
    'Region': np.random.choice(['North', 'South', 'East', 'West'], 20)
})

# Assign Spending Scores drawn from a different range for each region
# (np.random.randint excludes the upper bound, so North gets 60-79, and so on)
score_ranges = {'North': (60, 80), 'South': (30, 50), 'East': (40, 60), 'West': (70, 90)}
for region, (low, high) in score_ranges.items():
    mask = data['Region'] == region
    data.loc[mask, 'Spending_Score'] = np.random.randint(low, high, mask.sum())

data.head()

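|   | CustomerID | Age | Annual_Income | Gender | Region | Spending_Score |
|---|------------|-----|---------------|--------|--------|----------------|
| 0 | 1 | 56 | 97969 | Male | West | 87.0 |
| 1 | 2 | 46 | 35311 | Male | East | 47.0 |
| 2 | 3 | 32 | 113104 | Male | West | 84.0 |
| 3 | 4 | 25 | 83707 | Male | North | 77.0 |
| 4 | 5 | 38 | 115305 | Male | West | 82.0 |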


# Z-Test

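**Definition:** A one-sample Z-Test checks whether the mean of a sample differs from a hypothesized population mean.  

**Assumptions:**  
- Known population variance, or a sample large enough (commonly n > 30) for the sample standard deviation to stand in for it  
- Independent observations  

**Note:** This dataset has only 20 customers and the population variance is unknown, so the large-sample assumption is not met. The Z-Test below is shown for illustration; a one-sample T-Test is the safer choice at this sample size.  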

**Example:** Test if the **average spending score** differs from 50.

**Hypotheses:**  
- Null Hypothesis (H0): Mean Spending Score = 50  
- Alternative Hypothesis (H1): Mean Spending Score ≠ 50


In [35]:
# Z-Test
spending_scores = data['Spending_Score']

z_stat, p_val = ztest(spending_scores, value=50)
print("Z-Test statistic:", z_stat)
print("P-value:", p_val)

if p_val < 0.05:
    print("Reject null hypothesis: Mean spending score is significantly different from 50")
else:
    print("Fail to reject null hypothesis: No significant difference from 50")


Z-Test statistic: 2.067106715493516
P-value: 0.03872409945835298
Reject null hypothesis: Mean spending score is significantly different from 50
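
The statistic reported by `ztest` can be reproduced by hand, which also makes the link to the one-sample T-Test visible. The sketch below is illustrative only: it uses a hypothetical sample of 20 scores (not the notebook's `data`), and the names `sample` and `mu0` are assumptions introduced here. The Z-Test and T-Test share the same statistic; they differ in the reference distribution used for the p-value.

```python
import numpy as np
from scipy import stats
from statsmodels.stats.weightstats import ztest

# Hypothetical sample of 20 spending scores (illustrative values only)
rng = np.random.default_rng(0)
sample = rng.integers(30, 90, size=20).astype(float)
mu0 = 50  # hypothesized population mean

# z = (sample mean - mu0) / (s / sqrt(n)), with s the sample standard deviation
n = sample.size
z_manual = (sample.mean() - mu0) / (sample.std(ddof=1) / np.sqrt(n))
p_manual = 2 * stats.norm.sf(abs(z_manual))  # two-sided p-value from the normal distribution

# Library result for comparison
z_lib, p_lib = ztest(sample, value=mu0)

# One-sample t-test: same statistic, p-value from the t distribution with n - 1 df
t_stat, p_t = stats.ttest_1samp(sample, popmean=mu0)

print("manual z:", z_manual, "ztest z:", z_lib)
print("z-test p:", p_lib, "t-test p:", p_t)
```

Because the t distribution has heavier tails than the normal distribution, the T-Test p-value is never smaller than the Z-Test p-value for the same statistic, which is why the T-Test is the more conservative choice for small samples.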


# T-Test

**Definition:** T-Test is used to compare means when the population variance is unknown or the sample size is small.

---

## Independent T-Test

**Example:** Compare **Annual Income** of **Male vs Female** customers.  

**Hypotheses:**  
- H0: Mean income of Male = Mean income of Female  
- H1: Mean income of Male ≠ Mean income of Female


In [36]:
# Independent T-Test
male_income = data[data['Gender']=='Male']['Annual_Income']
female_income = data[data['Gender']=='Female']['Annual_Income']

t_stat, p_val = stats.ttest_ind(male_income, female_income)
print("Independent T-Test statistic:", t_stat)
print("P-value:", p_val)

if p_val < 0.05:
    print("Reject null hypothesis: Income differs by Gender")
else:
    print("Fail to reject null hypothesis: No significant income difference by Gender")


Independent T-Test statistic: 1.106574013488821
P-value: 0.2830482404163571
Fail to reject null hypothesis: No significant income difference by Gender
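
By default `stats.ttest_ind` assumes both groups share the same variance. When that is doubtful, Welch's T-Test (`equal_var=False`) is a common alternative. The sketch below compares the two on hypothetical income samples with deliberately unequal spread; the arrays `group_a` and `group_b` are assumptions for illustration and are not taken from the notebook's data.

```python
import numpy as np
from scipy import stats

# Hypothetical income samples with unequal variances and unequal sizes
rng = np.random.default_rng(1)
group_a = rng.normal(70000, 5000, size=12)
group_b = rng.normal(75000, 25000, size=8)

# Student's t-test (pooled variance, the scipy default)
t_student, p_student = stats.ttest_ind(group_a, group_b)

# Welch's t-test (separate variance estimate for each group)
t_welch, p_welch = stats.ttest_ind(group_a, group_b, equal_var=False)

# Welch statistic by hand: difference in means over the unpooled standard error
se_unpooled = np.sqrt(group_a.var(ddof=1) / group_a.size + group_b.var(ddof=1) / group_b.size)
t_manual = (group_a.mean() - group_b.mean()) / se_unpooled

print("Student:", t_student, p_student)
print("Welch:  ", t_welch, p_welch)
```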


## Paired / Related T-Test

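**Definition:** A paired T-Test compares two measurements taken on the same subjects by testing whether the mean of their pairwise differences is zero.  

**Example:** Compare **Spending Score** vs **scaled Annual Income** of the same customer. The two variables are in different units, and the scaled income is mean-centered (its mean is 0 by construction), so this pairing only demonstrates the mechanics of `ttest_rel`. A realistic use is the same metric measured twice on each customer, such as before and after a campaign.  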

**Hypotheses:**  
- H0: Mean difference = 0  
- H1: Mean difference ≠ 0


In [37]:
# Paired T-Test
# Center Annual Income at 0 and express it in thousands
income_scaled = (data['Annual_Income'] - data['Annual_Income'].mean()) / 1000

t_stat, p_val = stats.ttest_rel(data['Spending_Score'], income_scaled)
print("Paired T-Test statistic:", t_stat)
print("P-value:", p_val)

if p_val < 0.05:
    print("Reject null hypothesis: Spending Score and scaled Income differ significantly")
else:
    print("Fail to reject null hypothesis: No significant difference between Spending Score and scaled Income")


Paired T-Test statistic: 9.459600133796954
P-value: 1.278056056330631e-08
Reject null hypothesis: Spending Score and scaled Income differ significantly
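
A paired T-Test is equivalent to a one-sample T-Test on the pairwise differences against a mean of 0. The sketch below shows this on hypothetical before/after spending scores for the same eight customers; the arrays `before` and `after` are made-up values for illustration.

```python
import numpy as np
from scipy import stats

# Hypothetical spending scores for the same 8 customers, before and after a campaign
before = np.array([52.0, 61.0, 47.0, 70.0, 58.0, 66.0, 49.0, 73.0])
after = np.array([55.0, 64.0, 46.0, 75.0, 63.0, 65.0, 54.0, 78.0])

# Paired t-test on the two measurements
t_rel, p_rel = stats.ttest_rel(after, before)

# Equivalent one-sample t-test on the differences (H0: mean difference = 0)
differences = after - before
t_one, p_one = stats.ttest_1samp(differences, popmean=0)

print("ttest_rel:  ", t_rel, p_rel)
print("ttest_1samp:", t_one, p_one)
```

Both calls return the same statistic and p-value, which is why the paired test only makes sense when the difference between the two columns is itself a meaningful quantity.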


# ANOVA Test

**Definition:** ANOVA (Analysis of Variance) compares means of **three or more groups**.  

**Example:** Compare **Spending Score** across different **Regions**.  

**Hypotheses:**  
- H0: Mean spending score is the same across all regions  
- H1: At least one region has a different mean spending score


In [38]:
# ANOVA Test
regions = [group['Spending_Score'].values for name, group in data.groupby('Region')]
f_stat, p_val = stats.f_oneway(*regions)
print("ANOVA F-statistic:", f_stat)
print("P-value:", p_val)

if p_val < 0.05:
    print("Reject null hypothesis: Spending Score differs by Region")
else:
    print("Fail to reject null hypothesis: No significant difference by Region")



ANOVA F-statistic: 87.47868862986823
P-value: 3.866487142272411e-10
Reject null hypothesis: Spending Score differs by Region
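
The F-statistic is the ratio of between-group variance to within-group variance. The sketch below computes it by hand and checks it against `stats.f_oneway`. The three groups are hypothetical values chosen for illustration, not the notebook's regions.

```python
import numpy as np
from scipy import stats

# Hypothetical spending scores for three groups (illustrative values only)
groups = [
    np.array([62.0, 68.0, 71.0, 65.0]),
    np.array([35.0, 41.0, 38.0, 44.0, 40.0]),
    np.array([50.0, 47.0, 55.0]),
]

all_values = np.concatenate(groups)
grand_mean = all_values.mean()
k = len(groups)      # number of groups
n = all_values.size  # total number of observations

# Between-group sum of squares: group size times squared distance of group mean from grand mean
ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
# Within-group sum of squares: squared distance of each value from its own group mean
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

# F = mean square between / mean square within
f_manual = (ss_between / (k - 1)) / (ss_within / (n - k))
p_manual = stats.f.sf(f_manual, k - 1, n - k)

f_lib, p_lib = stats.f_oneway(*groups)
print("manual F:", f_manual, "f_oneway F:", f_lib)
```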


# Post-hoc Test: Tukey HSD

**Definition:** After ANOVA indicates a significant difference, Tukey HSD helps identify **which pairs of groups differ significantly**.  

**Example:** Compare **Spending Score** between different **Regions** to see which regions differ.  

**Hypotheses for Tukey test (for each pair of groups):**  
- H0: Mean Spending Score of group1 = Mean Spending Score of group2  
- H1: Mean Spending Score of group1 ≠ Mean Spending Score of group2


In [39]:
# Tukey HSD Test (pairwise_tukeyhsd is already imported in the first cell)

# Prepare data
spending_scores = data['Spending_Score']
regions = data['Region']

# Perform Tukey HSD
tukey_result = pairwise_tukeyhsd(endog=spending_scores, groups=regions, alpha=0.05)
print(tukey_result)


 Multiple Comparison of Means - Tukey HSD, FWER=0.05  
group1 group2 meandiff p-adj   lower    upper   reject
------------------------------------------------------
  East  North  22.6667 0.0011   9.0872  36.2462   True
  East  South -11.9583 0.0174 -22.0292  -1.8875   True
  East   West  29.6667    0.0  19.4015  39.9318   True
 North  South  -34.625    0.0 -46.3852 -22.8648   True
 North   West      7.0 0.3659   -4.927   18.927  False
 South   West   41.625    0.0  33.9261  49.3239   True
------------------------------------------------------


# Interpretation

- The Tukey HSD table shows **all pairwise comparisons** between groups (Regions).  
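- Columns explained:  
  - **meandiff:** Mean of group2 minus mean of group1  
  - **p-adj:** P-value adjusted for multiple comparisons  
  - **lower / upper:** Bounds of the 95% confidence interval for meandiff  
  - **reject:** True if H0 is rejected at the family-wise error rate of 0.05 (means significantly different)  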

From this, we can identify **which specific regions have significantly different Spending Scores**.
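
The same quantities shown in the printed table are also available as arrays on the result object, which is convenient for filtering significant pairs in code. The sketch below uses three hypothetical groups labelled `A`, `B`, and `C` (not the notebook's regions) and relies on the `groupsunique`, `meandiffs`, and `reject` attributes of the statsmodels result; the pair order is assumed to follow the order of the printed table.

```python
from itertools import combinations

import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# Hypothetical scores for three groups with means 45, 65 and 85
scores = np.array([40.0, 45.0, 50.0, 60.0, 65.0, 70.0, 80.0, 85.0, 90.0])
labels = np.array(['A'] * 3 + ['B'] * 3 + ['C'] * 3)

result = pairwise_tukeyhsd(endog=scores, groups=labels, alpha=0.05)

# Pairs in the same order as the printed table: (A, B), (A, C), (B, C)
pairs = list(combinations(result.groupsunique, 2))

for (g1, g2), diff, rejected in zip(pairs, result.meandiffs, result.reject):
    # diff is mean(g2) - mean(g1), matching the meandiff column
    print(f"{g1} vs {g2}: meandiff={diff:.2f}, reject H0={rejected}")
```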

# Remarks on Tukey HSD Test Results

- The **Tukey HSD post-hoc test** identifies which specific regions have significantly different Spending Scores after a significant ANOVA result.  
- **Significant differences between regions (reject H0):**  
  - East vs North  
  - East vs South  
  - East vs West  
  - North vs South  
  - South vs West  
- **No significant difference (fail to reject H0):**  
  - North vs West  
- **Interpretation:**  
  - Ordered by mean Spending Score, the regions rank South < East < North < West, which mirrors the score ranges used to generate the data.  
  - North and West do not differ significantly (mean difference 7.0, p-adj 0.3659), while South and West differ the most (mean difference 41.625).  
- These insights can help in **targeted marketing, regional promotions, or resource allocation** for customer engagement.


# Chi-Square Test

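**Definition:** The Chi-Square test of independence checks for an **association between two categorical variables**.  

**Assumptions:**  
- Independent observations  
- Expected frequency of at least 5 in most cells. With only 20 customers spread over 8 cells, every expected count in this example is below 5, so the result is illustrative only.  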

**Example:** Test association between **Gender** and **Region**.  

**Hypotheses:**  
- H0: Gender and Region are independent  
- H1: Gender and Region are dependent


In [40]:
# Chi-Square Test
contingency_table = pd.crosstab(data['Gender'], data['Region'])

chi2_stat, p_val, dof, expected = stats.chi2_contingency(contingency_table)
print("Chi-Square statistic:", chi2_stat)
print("P-value:", p_val)
print("Degrees of freedom:", dof)
print("Expected frequencies:\n", expected)

if p_val < 0.05:
    print("Reject null hypothesis: Gender and Region are associated")
else:
    print("Fail to reject null hypothesis: No significant association between Gender and Region")


Chi-Square statistic: 2.9761904761904763
P-value: 0.3953107229602416
Degrees of freedom: 3
Expected frequencies:
 [[1.5 1.  4.  3.5]
 [1.5 1.  4.  3.5]]
Fail to reject null hypothesis: No significant association between Gender and Region
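
The expected frequencies come from the row and column totals: each cell's expected count is (row total × column total) / grand total. The sketch below reproduces them and the test statistic by hand, and counts cells with an expected frequency below 5. The `observed` table is a hypothetical 2×3 example, not the notebook's crosstab.

```python
import numpy as np
from scipy import stats

# Hypothetical 2x3 contingency table of observed counts (illustrative values only)
observed = np.array([
    [12, 18, 10],
    [8, 22, 30],
])

# Expected count per cell = row total * column total / grand total
row_totals = observed.sum(axis=1, keepdims=True)  # shape (2, 1)
col_totals = observed.sum(axis=0, keepdims=True)  # shape (1, 3)
expected_manual = row_totals @ col_totals / observed.sum()

# Chi-square statistic: sum over cells of (observed - expected)^2 / expected
chi2_manual = ((observed - expected_manual) ** 2 / expected_manual).sum()

chi2_lib, p_lib, dof, expected_lib = stats.chi2_contingency(observed)

# Count cells that fall short of the usual expected-frequency guideline
small_cells = int((expected_lib < 5).sum())

print("manual chi2:", chi2_manual, "library chi2:", chi2_lib)
print("cells with expected count < 5:", small_cells)
```

Running the same `small_cells` check on a real crosstab is a quick way to see whether the Chi-Square approximation can be trusted before reading the p-value.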


# Summary

- **Z-Test:** Checks if a sample mean differs from a known population mean.  
- **T-Test:** Compares means of two groups (independent or paired).  
- **ANOVA:** Compares means across 3+ groups; the **Tukey HSD** post-hoc test then identifies which pairs of groups differ.  
- **Chi-Square:** Tests association between categorical variables.  

Each test includes:
- **Null hypothesis (H0)**
- **Alternative hypothesis (H1)**
- **Test statistic**
- **P-value interpretation**
