# Example (Chi-Square Test)
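Using the following data, determine whether there is a relationship between qualification and marital status. The null hypothesis is that the two variables are independent; the alternative hypothesis is that they are associated.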

![image-2.png](attachment:image-2.png)

In [52]:
import pandas as pd
from scipy.stats import chi2_contingency

In [68]:
# Given data

data = pd.read_csv("chi2.csv")

# Contingency Table

In the context of a chi-square test, a contingency table is a table that displays the observed frequencies of two categorical variables. Each cell in the table represents the count of observations falling into a specific combination of categories for the two variables. The purpose of creating a contingency table is to assess the association or independence between the two categorical variables.

In Python, a contingency table is usually built with the pd.crosstab function from the pandas library when the data has one row per observation. The resulting table is then passed to the chi-square test.
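As a minimal sketch of the raw-data case, assuming one row per respondent, pd.crosstab counts each combination of categories directly. The six rows below are made up for illustration and are not the contents of chi2.csv.

```python
import pandas as pd

# Hypothetical raw observations: one row per person (not the chi2.csv data)
raw = pd.DataFrame({
    "qualification": ["bachelor", "bachelor", "master", "master", "phd", "phd"],
    "marital_status": ["married", "divorced", "married", "married", "not married", "married"],
})

# crosstab counts how many rows fall into each combination of categories;
# combinations that never occur are filled with 0
raw_table = pd.crosstab(raw["qualification"], raw["marital_status"])
print(raw_table)
```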

In this example the data appears to be aggregated already: the 'count' column holds the frequency of each combination of categories. For that reason pd.pivot_table is used instead. It reshapes the DataFrame (df) into a contingency table with one row per qualification and one column per marital status, which is the input the chi-square test needs.

Here's a breakdown of the parameters:

df: The DataFrame containing the data.

values='count': Specifies the column to aggregate. In this case, it's the 'count' column from the DataFrame.

index='qualification': Specifies the variable to be used as the index (rows) of the pivot table. In this case, it's 'qualification'.

columns='marital_status': Specifies the variable to be used as the columns of the pivot table. In this case, it's 'marital_status'.

In [82]:
df = pd.DataFrame(data)

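# Create a contingency table.
# aggfunc='sum' adds up the counts; the default ('mean') would average them
# if a qualification/marital_status pair appeared in more than one row.
contingency_table = pd.pivot_table(df, values='count', index='qualification', columns='marital_status', aggfunc='sum')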

In [83]:
# Print the contingency table
print("Contingency Table:")
print(contingency_table)

Contingency Table:
marital_status  divorced  married  not married  windowed
qualification                                           
bachelor               9       45           21         9
high                   9       36           36         9
master                 3       36            9         6
middle                 6       12           18         3
phd                    3       21            6         3
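For readers without chi2.csv, the table can be rebuilt from the printed output. This is a sketch only: the long-format layout and the column names `qualification`, `marital_status`, and `count` are assumptions inferred from the pivot_table call, and the label "windowed" is kept exactly as it appears in the output.

```python
import pandas as pd

# Counts copied from the printed contingency table above
counts = {
    "bachelor": [9, 45, 21, 9],
    "high": [9, 36, 36, 9],
    "master": [3, 36, 9, 6],
    "middle": [6, 12, 18, 3],
    "phd": [3, 21, 6, 3],
}
statuses = ["divorced", "married", "not married", "windowed"]

# Long format: one row per (qualification, marital_status) pair with its count.
# This layout is an assumption about how chi2.csv is organised.
long_df = pd.DataFrame(
    [
        {"qualification": q, "marital_status": s, "count": c}
        for q, row in counts.items()
        for s, c in zip(statuses, row)
    ]
)

# Reshape back into a contingency table, summing counts per cell
rebuilt = pd.pivot_table(
    long_df, values="count", index="qualification",
    columns="marital_status", aggfunc="sum",
)
print(rebuilt)
print("Total observations:", int(rebuilt.to_numpy().sum()))
```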


In [84]:
# Perform the chi-square test
chi2, p, _, _ = chi2_contingency(contingency_table)

# Print the results
print(f"\nChi-Square Value: {chi2}")
print(f"P-Value: {p}")


Chi-Square Value: 23.566899766899766
P-Value: 0.02328101955707239
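The two values that were discarded above are the degrees of freedom and the table of expected frequencies. The sketch below recovers them and recomputes the statistic by hand as a check. The observed counts are copied from the printed contingency table, and the variable names are chosen for illustration.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Observed counts copied from the printed contingency table
observed = np.array([
    [9, 45, 21, 9],   # bachelor
    [9, 36, 36, 9],   # high
    [3, 36, 9, 6],    # master
    [6, 12, 18, 3],   # middle
    [3, 21, 6, 3],    # phd
])

stat, p_value, dof, expected = chi2_contingency(observed)

# Expected count under independence: row total * column total / grand total
row_totals = observed.sum(axis=1, keepdims=True)
col_totals = observed.sum(axis=0, keepdims=True)
expected_manual = row_totals * col_totals / observed.sum()

# Pearson statistic: sum over cells of (observed - expected)^2 / expected
stat_manual = ((observed - expected_manual) ** 2 / expected_manual).sum()

print("Degrees of freedom:", dof)  # (5 rows - 1) * (4 columns - 1)
print("Manual statistic:", stat_manual)
print("Cells with expected count below 5:", int((expected < 5).sum()))
```

Four of the 20 expected counts fall below 5. A common rule of thumb asks that no more than about 20% of cells do so, which makes this table a borderline case, and the p-value is best read with that in mind.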


In [75]:
# Compare the p-value with the significance level (assume alpha = 0.05)
alpha = 0.05
if p < alpha:
    print("\nReject the null hypothesis. There is a significant relationship between qualification and marital status.")
else:
    print("\nFail to reject the null hypothesis. There is not enough evidence of a relationship between qualification and marital status.")


Reject the null hypothesis. There is a significant relationship between qualification and marital status.
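The same decision can be reached by comparing the statistic with a critical value instead of comparing the p-value with alpha. The sketch below assumes 12 degrees of freedom, (5 - 1) * (4 - 1) for a 5 x 4 table, and reuses the statistic printed above.

```python
from scipy import stats

alpha = 0.05
dof = 12                        # (5 rows - 1) * (4 columns - 1)
statistic = 23.566899766899766  # chi-square value printed by the test above

# Critical value: the point whose upper-tail probability equals alpha
critical_value = stats.chi2.ppf(1 - alpha, dof)

# Upper-tail probability of the observed statistic, i.e. the p-value
p_from_stat = stats.chi2.sf(statistic, dof)

print(f"Critical value: {critical_value:.3f}")
print(f"Statistic exceeds critical value: {statistic > critical_value}")
print(f"P-value from statistic: {p_from_stat:.4f}")
```

The statistic exceeds the critical value exactly when the p-value is below alpha, so both routes lead to rejecting the null hypothesis here.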
