# Chi-Square Test for Independence

Problem Statement:

A company wants to understand if there's a significant relationship between the payment mode users choose (Cash, Credit Card, or Online Payment) and their geographic region (North, South, East, or West). By examining this relationship, the company aims to tailor its payment processing and marketing strategies according to regional preferences.

To test this, a Chi-square test for independence will be performed on a dataset containing user transactions across various regions and payment modes. The null and alternative hypotheses are as follows:

    Null Hypothesis (H₀): There is no association between the payment mode and the user’s geographic region; the two variables are independent, and any differences in the observed counts are due to random chance.
    Alternative Hypothesis (H₁): There is an association between the payment mode and the user’s geographic region; the two variables are not independent.

The results of this test will help determine if user preferences for payment methods vary significantly by region, which could inform targeted regional strategies for payment options.
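
For reference, the quantities behind the test can be written out as below. The notation is introduced here for illustration and does not come from the original notebook: $O_{ij}$ is the observed count for payment mode $i$ in region $j$, $E_{ij}$ is the count expected under independence, $R_i$ and $C_j$ are the row and column totals, $N$ is the total number of transactions, and $r$ and $c$ are the numbers of rows and columns.

```latex
E_{ij} = \frac{R_i \, C_j}{N}, \qquad
\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}, \qquad
\mathrm{df} = (r - 1)(c - 1)
```

With three payment modes and four regions, this gives $\mathrm{df} = (3 - 1)(4 - 1) = 6$.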

In [12]:
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency

In [13]:
# Load the contingency table; the first column (PaymentMode) becomes the row index
observed_data = pd.read_csv("contingency_data.csv", index_col=0)
observed_data

PaymentMode,East,North,South,West
Cash,11,8,8,5
Credit Card,3,9,10,8
Online Payment,3,8,19,8
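
Before running the test, it can help to look at the marginal totals, since the expected counts are built from them. The following is a minimal sketch in plain Python, with the counts typed in from the table above rather than read from `contingency_data.csv`; the variable names are illustrative assumptions, not part of the original notebook.

```python
# Observed counts copied from the table above
# (keys: payment mode, list positions: region).
regions = ["East", "North", "South", "West"]
observed = {
    "Cash": [11, 8, 8, 5],
    "Credit Card": [3, 9, 10, 8],
    "Online Payment": [3, 8, 19, 8],
}

# Row totals: number of transactions per payment mode.
row_totals = {mode: sum(counts) for mode, counts in observed.items()}

# Column totals: number of transactions per region.
col_totals = {
    region: sum(counts[j] for counts in observed.values())
    for j, region in enumerate(regions)
}

# Grand total: all transactions in the table.
grand_total = sum(row_totals.values())

print("Row totals:", row_totals)
print("Column totals:", col_totals)
print("Grand total:", grand_total)
```

With `observed_data` already loaded as a DataFrame, `observed_data.sum(axis=1)` and `observed_data.sum(axis=0)` should give the same row and column totals.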


In [14]:
alpha = 0.05  # significance level for the test

In [16]:
chi2, p, dof, expected = chi2_contingency(observed_data)
print("chi2:", chi2)
print("p-value:", p)
print("degrees of freedom:", dof)
print("Expected:", expected)

chi2: 12.92649237030042
p-value: 0.044218218761019314
degrees of freedom: 6
Expected: [[ 5.44  8.   11.84  6.72]
 [ 5.1   7.5  11.1   6.3 ]
 [ 6.46  9.5  14.06  7.98]]
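
The numbers reported by `chi2_contingency` can be cross-checked by hand. The sketch below is written in plain Python with the counts typed in from the table, so it runs without the CSV file. The helper `chi2_sf_even_dof` is a hypothetical name introduced here; it relies on the closed-form upper-tail probability that holds only when the degrees of freedom are even, which is the case for this table.

```python
import math

# Observed counts (columns: East, North, South, West).
observed = [
    [11, 8, 8, 5],    # Cash
    [3, 9, 10, 8],    # Credit Card
    [3, 8, 19, 8],    # Online Payment
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence: row total * column total / grand total.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson chi-square statistic: sum of (O - E)^2 / E over all cells.
chi2_stat = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom: (rows - 1) * (columns - 1).
dof = (len(observed) - 1) * (len(observed[0]) - 1)


def chi2_sf_even_dof(x, k):
    """Upper-tail probability of a chi-square variable; valid for even k only."""
    if k % 2 != 0:
        raise ValueError("closed form requires an even number of degrees of freedom")
    half = x / 2
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(k // 2))


p_value = chi2_sf_even_dof(chi2_stat, dof)

# Smallest expected count, useful for the common "at least 5 per cell" rule of thumb.
min_expected = min(min(row) for row in expected)

print("chi2:", chi2_stat)
print("degrees of freedom:", dof)
print("p-value:", p_value)
print("smallest expected count:", min_expected)
```

For general use, `scipy.stats.chi2.sf(chi2_stat, dof)` covers odd degrees of freedom as well; the closed form is used here only to keep the check free of dependencies.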


In [18]:
if p<=alpha:
  print("Reject the null hypothesis: There is a significant association between the payment mode and the user’s geographic region.")
else:
  print("Fail to reject the null hypothesis: There is not enough evidence of an association between the payment mode and the user’s geographic region.")

Reject the null hypothesis: There is a significant association between the payment mode and the user’s geographic region.
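
The test itself only says that some association exists, not where it comes from. One possible follow-up, sketched here as an assumption rather than as part of the original analysis, is to look at the Pearson residuals `(O - E) / sqrt(E)`, whose squares add up to the chi-square statistic. Cells with large absolute residuals contribute most to the result. The counts are again typed in from the table so the block runs on its own.

```python
import math

regions = ["East", "North", "South", "West"]
observed = {
    "Cash": [11, 8, 8, 5],
    "Credit Card": [3, 9, 10, 8],
    "Online Payment": [3, 8, 19, 8],
}

row_totals = {mode: sum(counts) for mode, counts in observed.items()}
col_totals = [sum(col) for col in zip(*observed.values())]
n = sum(col_totals)

# Pearson residual per cell: (observed - expected) / sqrt(expected).
residuals = {}
for mode, counts in observed.items():
    residuals[mode] = {}
    for region, o, c in zip(regions, counts, col_totals):
        e = row_totals[mode] * c / n
        residuals[mode][region] = (o - e) / math.sqrt(e)

# Cell with the largest absolute residual, i.e. the biggest single contribution.
largest_cell = max(
    ((mode, region) for mode in residuals for region in regions),
    key=lambda cell: abs(residuals[cell[0]][cell[1]]),
)

for mode in residuals:
    print(mode, {region: round(value, 2) for region, value in residuals[mode].items()})
print("Largest contribution:", largest_cell)
```

A positive residual means the cell has more transactions than independence would predict, and a negative one means fewer.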
