#### Chi-Square Test
The Chi-Square (χ²) test is a statistical hypothesis test used for categorical variables to determine whether there is a significant difference between observed and expected frequencies.

**A. Test of Independence**

Purpose: Checks whether two categorical variables are related or independent.

Example: Is there a relationship between Gender and Spending Category?

Hypotheses:

H₀ (Null): The variables are independent (no relationship).

H₁ (Alternative): The variables are associated.

**B. Goodness of Fit Test**

Purpose: Checks if the distribution of a single categorical variable fits a theoretical or expected distribution.

Example: Is the gender distribution in the sample 50:50 (Male:Female)?

Hypotheses:

H₀ (Null): Observed distribution matches expected.

H₁ (Alternative): Observed distribution is significantly different from expected.

**Assumptions**

1. Observations are independent.
2. The sample should be large enough: as a rule of thumb, the expected frequency should be at least 5 in most cells.
3. Variables must be categorical (bin numerical variables into categories if needed).
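The expected-frequency assumption can be checked directly before trusting the test. The sketch below is one possible way to do it, not the notebook's own method: it hard-codes the Gender by spending-category counts from the contingency table built later in this notebook, and the variable names `min_expected` and `share_at_least_5` are illustrative.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Observed counts copied from this notebook's contingency table
# (rows: Female, Male; columns: low, medium, high).
observed = np.array([[25, 54, 33],
                     [24, 40, 24]])

# The fourth value returned by chi2_contingency holds the expected
# counts under the null hypothesis of independence.
_, _, _, expected = chi2_contingency(observed)

min_expected = expected.min()                   # smallest expected cell count
share_at_least_5 = (expected >= 5).mean()       # fraction of cells meeting the rule of thumb

print(f"Smallest expected count: {min_expected:.2f}")
print(f"Share of cells with expected count >= 5: {share_at_least_5:.0%}")
```

If some expected counts fall well below 5, merging sparse categories is a common remedy.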



In [1]:
import numpy as np
import pandas as pd

# Show full DataFrames without truncating rows, columns, or cell contents
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
pd.set_option('display.width', None)
pd.set_option('display.max_colwidth', None)

In [2]:
data = pd.read_csv(r'data\Mall_Customers.csv')
data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 200 entries, 0 to 199
Data columns (total 5 columns):
 #   Column                  Non-Null Count  Dtype 
---  ------                  --------------  ----- 
 0   CustomerID              200 non-null    int64 
 1   Gender                  200 non-null    object
 2   Age                     200 non-null    int64 
 3   Annual Income (k$)      200 non-null    int64 
 4   Spending Score (1-100)  200 non-null    int64 
dtypes: int64(4), object(1)
memory usage: 7.9+ KB


In [7]:
# Use Case 1: Test of Independence (Gender vs Spending Category)
# H₀: Gender and Spending Category are independent.
# H₁: Gender and Spending Category are associated.
# If p < 0.05, reject H₀.

# Spending Score is numeric (1-100), so bin it into categories first
bins = [0,33,66,100]
labels = ['low','medium','high']
data['spending_category'] = pd.cut(data['Spending Score (1-100)'],bins=bins,labels=labels)
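It helps to confirm where the bin edges fall. By default `pd.cut` uses right-inclusive intervals, so these bins are (0, 33], (33, 66], and (66, 100]. The small check below uses hypothetical scores chosen to sit on either side of each edge; `edge_scores` and `edge_categories` are illustrative names, not part of the dataset.

```python
import pandas as pd

bins = [0, 33, 66, 100]
labels = ['low', 'medium', 'high']

# Hypothetical scores placed on either side of each bin edge
edge_scores = pd.Series([1, 33, 34, 66, 67, 100])

# Default right=True: each interval includes its right edge only
edge_categories = pd.cut(edge_scores, bins=bins, labels=labels)

print(pd.DataFrame({'score': edge_scores, 'category': edge_categories}))
```

A score of exactly 0 would fall outside the first interval and become `NaN`; that does not matter here because the score ranges from 1 to 100.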

In [8]:
data.head()

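|     | CustomerID | Gender | Age | Annual Income (k$) | Spending Score (1-100) | spending_category |
| --- | ---------- | ------ | --- | ------------------ | ---------------------- | ----------------- |
| 0   | 1          | Male   | 19  | 15                 | 39                     | medium            |
| 1   | 2          | Male   | 21  | 15                 | 81                     | high              |
| 2   | 3          | Female | 20  | 16                 | 6                      | low               |
| 3   | 4          | Female | 23  | 16                 | 77                     | high              |
| 4   | 5          | Female | 31  | 17                 | 40                     | medium            |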


In [9]:
# The chi-square test needs a contingency table of observed counts
data_crosstab = pd.crosstab(data['Gender'],data['spending_category'])
data_crosstab

| Gender | low | medium | high |
| ------ | --- | ------ | ---- |
| Female | 25  | 54     | 33   |
| Male   | 24  | 40     | 24   |


In [18]:
from scipy.stats import chi2_contingency

# Run the test on the contingency table of observed counts
chi2, p, dof, expected = chi2_contingency(data_crosstab)

# Print test results
print(f"\nChi-Square Statistic: {chi2:.2f}")
print(f"P-value: {p:.4f}")
print(f"Degrees of Freedom: {dof}")
print("\nExpected Frequencies:\n", pd.DataFrame(expected, 
                                                index=data_crosstab.index, 
                                                columns=data_crosstab.columns))

# Conclusion at the 5% significance level
alpha = 0.05
if p < alpha:
    print("\n✅ Reject Null Hypothesis: There IS a significant association between Gender and Spending Category.")
else:
    print("\n❌ Fail to Reject Null Hypothesis: There is NO significant association between Gender and Spending Category.")



Chi-Square Statistic: 0.66
P-value: 0.7204
Degrees of Freedom: 2

Expected Frequencies:
 spending_category    low  medium   high
Gender                                 
Female             27.44   52.64  31.92
Male               21.56   41.36  25.08

❌ Fail to Reject Null Hypothesis: There is NO significant association between Gender and Spending Category.


#### Interpretation
1. The test finds no statistically significant association between gender and spending category.
2. The differences in spending categories between males and females are small enough to be explained by random sampling variation.
3. So we cannot conclude that males and females spend differently in this dataset, at least not in a statistically significant way.
4. If the null hypothesis is true (i.e., Gender and Spending Category are independent), there is about a 72% chance of observing differences in counts at least this extreme purely from random variation.
5. A p-value of 0.72 means the data are very compatible with the null hypothesis. Differences like these arise so often by chance that there is no statistical reason to believe there is a real association.


#### A Common Misunderstanding
Even if the raw counts seem different in the contingency table, the Chi-Square test asks:

Are these differences big enough to be meaningful, or are they just small random fluctuations?

In this case, the observed differences are **consistent with random variation** and do not provide evidence of a true relationship.
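Whether a difference is "big enough" depends on how much data backs it up. As a rough illustration, the sketch below compares the observed table with a hypothetical table that has exactly the same proportions but ten times as many customers. The scaled table is invented for illustration and is not part of the dataset.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Observed counts from the notebook (rows: Female, Male; columns: low, medium, high)
observed = np.array([[25, 54, 33],
                     [24, 40, 24]])

# Hypothetical table: identical proportions, ten times the sample size
scaled = observed * 10

chi2_small, p_small, _, _ = chi2_contingency(observed)
chi2_large, p_large, _, _ = chi2_contingency(scaled)

print(f"n = {observed.sum():>4}: chi2 = {chi2_small:.2f}, p = {p_small:.4f}")
print(f"n = {scaled.sum():>4}: chi2 = {chi2_large:.2f}, p = {p_large:.4f}")
```

The statistic grows in proportion to the sample size, so the same percentage gap that is unremarkable with 200 customers would cross the 0.05 threshold with 2,000.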

#### How the expected counts were calculated in this case

| Concept                | Formula/Logic                                                            |
| ---------------------- | ------------------------------------------------------------------------ |
| **Expected Count**     | $\frac{\text{Row Total} \times \text{Column Total}}{\text{Grand Total}}$ |
| **Degrees of Freedom** | $(\text{Rows} - 1) \times (\text{Columns} - 1)$                          |
| **Goal**               | To test whether the two variables are associated or not                  |
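The formulas in the table can be applied by hand to reproduce what `chi2_contingency` reports. This is a minimal sketch that assumes the observed counts from the contingency table above; names ending in `_manual` are illustrative. No continuity correction is applied, which matches SciPy's default behaviour for tables with more than one degree of freedom.

```python
import numpy as np
from scipy.stats import chi2

# Observed counts (rows: Female, Male; columns: low, medium, high)
observed = np.array([[25, 54, 33],
                     [24, 40, 24]])

row_totals = observed.sum(axis=1)     # [112, 88]
col_totals = observed.sum(axis=0)     # [49, 94, 57]
grand_total = observed.sum()          # 200

# Expected count = (row total * column total) / grand total, for every cell
expected_manual = np.outer(row_totals, col_totals) / grand_total

# Degrees of freedom = (rows - 1) * (columns - 1)
dof_manual = (observed.shape[0] - 1) * (observed.shape[1] - 1)

# Chi-square statistic = sum over cells of (observed - expected)^2 / expected
chi2_manual = ((observed - expected_manual) ** 2 / expected_manual).sum()

# P-value = upper-tail probability of the chi-square distribution
p_manual = chi2.sf(chi2_manual, dof_manual)

print(expected_manual)
print(f"chi2 = {chi2_manual:.2f}, dof = {dof_manual}, p = {p_manual:.4f}")
```

For example, the expected count for Female and low is 112 × 49 / 200 = 27.44, which matches the Expected Frequencies output above.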


#### Goodness of Fit Test

In [20]:
from scipy.stats import chisquare

# Count observed frequencies
observed_counts = data['spending_category'].value_counts().sort_index()
print("Observed counts:\n", observed_counts)

# Expected counts — equal distribution
expected_counts = [len(data) / 3] * 3  # assuming equal expected for 3 categories
print("\nExpected counts:\n", expected_counts)

# Perform Chi-Square Goodness of Fit Test
chi2_stat, p_val = chisquare(f_obs=observed_counts, f_exp=expected_counts)

print(f"\nChi-square Statistic: {chi2_stat:.2f}")
print(f"P-value: {p_val:.4f}")

# Interpretation
if p_val < 0.05:
    print("✅ Reject H₀: The distribution of spending categories is significantly different from expected.")
else:
    print("❌ Fail to reject H₀: The observed distribution is consistent with the expected distribution.")


Observed counts:
 spending_category
low       49
medium    94
high      57
Name: count, dtype: int64

Expected counts:
 [66.66666666666667, 66.66666666666667, 66.66666666666667]

Chi-square Statistic: 17.29
P-value: 0.0002
✅ Reject H₀: The distribution of spending categories is significantly different from expected.
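The 50:50 gender question from the Goodness of Fit description can be checked the same way. The sketch below hard-codes the gender counts implied by the row totals of the contingency table (112 Female, 88 Male) instead of reading the CSV; `observed_gender` and `expected_gender` are illustrative names.

```python
from scipy.stats import chisquare

# Gender counts taken from the row totals of the contingency table
observed_gender = [112, 88]           # Female, Male

# Under H0 the sample splits evenly between the two genders
expected_gender = [sum(observed_gender) / 2] * 2

gender_stat, gender_p = chisquare(f_obs=observed_gender, f_exp=expected_gender)

print(f"Chi-square Statistic: {gender_stat:.2f}")
print(f"P-value: {gender_p:.4f}")

if gender_p < 0.05:
    print("Reject H0: the gender split differs significantly from 50:50.")
else:
    print("Fail to reject H0: the gender split is consistent with 50:50.")
```

With these counts the statistic is 2.88 on 1 degree of freedom and the p-value is above 0.05, so the sample does not give significant evidence against an even split.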
