# Chi-Square Test

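The **Chi-Square ($\chi^2$) Test** is a non-parametric statistical test for categorical data. It compares observed frequencies with the frequencies expected under a null hypothesis, most commonly to determine whether there is a significant association between two categorical variables.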

## Types of Chi-Square Tests
1.  **Test for Independence:** Determines if two categorical variables are related (e.g., Gender vs. Voting Preference).
2.  **Goodness of Fit Test:** Determines if the observed distribution of a single categorical variable matches a hypothesized distribution (e.g., Is a die fair?).
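
As a rough illustration of the goodness-of-fit case, the sketch below runs `scipy.stats.chisquare` on made-up die-roll counts. The `observed` values are hypothetical and chosen only for illustration. When `f_exp` is omitted, SciPy assumes all categories are equally likely, which corresponds to the fair-die hypothesis.

```python
from scipy.stats import chisquare

# Hypothetical counts for faces 1..6 from 60 rolls (illustrative data only)
observed = [8, 12, 9, 11, 10, 10]

# With no f_exp given, chisquare assumes equal expected counts (60 / 6 = 10 each)
gof_stat, gof_p = chisquare(observed)

print(f"Chi-Square Statistic: {gof_stat:.4f}")
print(f"P-value: {gof_p:.4f}")

if gof_p < 0.05:
    print("Reject H0: the die does not appear to be fair.")
else:
    print("Fail to Reject H0: no significant evidence that the die is unfair.")
```

A different hypothesized distribution could be supplied through the `f_exp` argument, provided the expected counts sum to the same total as the observed counts.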

## Formula
$$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i}$$

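Where the sum runs over all categories (or all cells of the contingency table), and:
*   $O_i$: Observed frequency in category (cell) $i$
*   $E_i$: Expected frequency in category (cell) $i$ under the null hypothesis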
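
To make the formula concrete, here is a minimal plain-Python sketch that computes the statistic by hand for a small 2x2 table. The counts are illustrative, and the sketch assumes the test-for-independence setting, where each expected count is (row total × column total) / grand total.

```python
# Illustrative 2x2 table of observed counts (rows: groups, columns: categories)
observed = [[30, 10],
            [20, 20]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Under independence, E_ij = (row total * column total) / N
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Sum (O - E)^2 / E over every cell of the table
chi2_stat = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

print("Expected:", expected)
print(f"Chi-Square Statistic: {chi2_stat:.4f}")
```

This gives only the statistic. Turning it into a p-value requires the chi-square distribution with the appropriate degrees of freedom, which is (rows - 1) × (columns - 1) for a contingency table.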

## Assumptions
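1.  Data are categorical and recorded as counts (frequencies), not percentages or proportions.
2.  Observations are independent.
3.  Expected frequency in each cell should be at least 5. This is a common rule of thumb: with smaller expected counts, the chi-square approximation becomes unreliable.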
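
The expected-frequency assumption can be checked before running the test. The sketch below is one possible way to do it, using `scipy.stats.contingency.expected_freq` on a hypothetical small-sample table. The table values and the `meets_rule` name are illustrative assumptions.

```python
from scipy.stats.contingency import expected_freq

# Hypothetical small-sample table (illustrative data only)
small_table = [[3, 2],
               [1, 4]]

# Expected counts under independence, computed from the row and column totals
expected_small = expected_freq(small_table)
print(expected_small)

# Common rule of thumb: every expected count should be at least 5
meets_rule = bool((expected_small >= 5).all())
print("All expected counts >= 5:", meets_rule)
```

When the check fails for a 2x2 table, as it does here, an exact method such as `scipy.stats.fisher_exact` is often suggested as an alternative.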

## Test for Independence Example
*   **H0:** Variables are independent (No relationship).
*   **H1:** Variables are dependent (There is a relationship).

In [None]:
import pandas as pd
from scipy.stats import chi2_contingency

# Example: Gender vs Product Preference
# Data: Contingency Table
data = [[30, 10],  # Men: 30 like A, 10 like B
        [20, 20]]  # Women: 20 like A, 20 like B

# Create DataFrame for better view
df = pd.DataFrame(data, columns=['Product A', 'Product B'], index=['Men', 'Women'])
print("Observed Data:")
print(df)

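# Perform Chi-Square Test of Independence
# Note: for 2x2 tables (dof = 1), chi2_contingency applies Yates' continuity
# correction by default, so the statistic is slightly smaller than the raw formula value
stat, p, dof, expected = chi2_contingency(data)

print(f"\nChi-Square Statistic: {stat:.4f}")
print(f"P-value: {p:.4f}")
print(f"Degrees of Freedom: {dof}")
print("Expected Frequencies:")
print(expected)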

if p < 0.05:
    print("\nReject H0: There is a relationship between Gender and Product Preference.")
else:
    print("\nFail to Reject H0: No significant relationship found.")
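
For 2x2 tables, `chi2_contingency` applies Yates' continuity correction by default, so the reported statistic does not exactly equal the plain $\sum (O_i - E_i)^2 / E_i$ value. The sketch below compares both settings on the same table. The variable names are illustrative.

```python
from scipy.stats import chi2_contingency

data = [[30, 10],
        [20, 20]]

# Default for 2x2 tables (dof = 1): Yates' continuity correction is applied
stat_yates, p_yates, dof_yates, _ = chi2_contingency(data)

# correction=False gives the plain Pearson statistic from the formula
stat_plain, p_plain, dof_plain, _ = chi2_contingency(data, correction=False)

print(f"With correction:    stat={stat_yates:.4f}, p={p_yates:.4f}")
print(f"Without correction: stat={stat_plain:.4f}, p={p_plain:.4f}")
```

Both versions fall below the 0.05 threshold for this table, so the conclusion is the same either way. The corrected test is more conservative, and the two can disagree for borderline data.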