
# Chi-Square Test for Independence


### Q1: Define the Hypotheses
- **Null Hypothesis (H₀):** There is no association between the type of smart home device and customer satisfaction.
- **Alternative Hypothesis (H₁):** There is an association between the type of smart home device and customer satisfaction.
  


In [31]:
# Q2: Import required libraries
import pandas as pd
import scipy.stats as stats


### Q3: Create the Contingency Table

We will create a contingency table showing the counts of customer satisfaction levels for two types of smart devices.


In [33]:

# Q3: Create the data and convert it into a DataFrame

data = {
    'Very Satisfied': [50, 70],
    'Satisfied': [80, 100],
    'Neutral': [60, 90],
    'Unsatisfied': [30, 50],
    'Very Unsatisfied': [20, 50]
}

df = pd.DataFrame(data, index=['Smart Thermostat', 'Smart Light'])
df


| | Very Satisfied | Satisfied | Neutral | Unsatisfied | Very Unsatisfied |
|---|---|---|---|---|---|
| Smart Thermostat | 50 | 80 | 60 | 30 | 20 |
| Smart Light | 70 | 100 | 90 | 50 | 50 |
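In practice, a contingency table like this is usually tallied from one-row-per-respondent records rather than typed in directly. The snippet below is a minimal sketch of that step using `pd.crosstab`; the `records` DataFrame and its `device` and `satisfaction` columns are hypothetical and are not the data used in this notebook.

```python
import pandas as pd

# Hypothetical raw survey records (illustrative only, not the notebook's data)
records = pd.DataFrame({
    'device': ['Smart Thermostat', 'Smart Light', 'Smart Light',
               'Smart Thermostat', 'Smart Light'],
    'satisfaction': ['Satisfied', 'Neutral', 'Satisfied',
                     'Satisfied', 'Very Satisfied'],
})

# Count how many respondents fall into each (device, satisfaction) cell
table = pd.crosstab(records['device'], records['satisfaction'])
print(table)
```

Categories that never occur in the raw records do not appear as columns, so a real analysis may need to reindex the result to the full satisfaction scale.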


### Q4: Observed Frequencies

The observed frequencies are the raw counts from the contingency table (the data we have collected), extracted here as a NumPy array.


In [36]:
# Q4: Get observed frequencies
observed = df.values
print("Observed Frequencies:\n", observed)


Observed Frequencies:
 [[ 50  80  60  30  20]
 [ 70 100  90  50  50]]


### Q5: Perform the Chi-Square Test

We use the `chi2_contingency` function to calculate:
- The Chi-Square statistic
- p-value
- Degrees of freedom
- Expected frequencies
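
For reference, these quantities follow the standard Pearson chi-square definitions. The symbols are notation introduced here, not taken from the notebook: $O_{ij}$ is the observed count in row $i$, column $j$ of an $r \times c$ table, $R_i$ and $C_j$ are the row and column totals, and $N$ is the grand total.

$$
E_{ij} = \frac{R_i \, C_j}{N}, \qquad
\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}, \qquad
\text{dof} = (r - 1)(c - 1)
$$

For the $2 \times 5$ table used here, $\text{dof} = (2 - 1)(5 - 1) = 4$. The p-value is the probability that a chi-square variable with that many degrees of freedom is at least as large as the computed statistic.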


In [38]:
# Q5: Chi-Square calculation
chi2, p, dof, expected = stats.chi2_contingency(df)
print("Expected Frequencies:\n", expected)
print("\nChi-Square Statistic:", chi2)
print("Degrees of Freedom:", dof)
print("P-Value:", p)


Expected Frequencies:
 [[ 48.  72.  60.  32.  28.]
 [ 72. 108.  90.  48.  42.]]

Chi-Square Statistic: 5.638227513227513
Degrees of Freedom: 4
P-Value: 0.22784371130697179
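
As a cross-check on the output above, the same numbers can be reproduced with plain NumPy. This is a minimal sketch that re-enters the observed counts from Q3; the names `expected_manual`, `chi2_manual`, and `dof_manual` are introduced here for illustration only.

```python
import numpy as np

# Observed counts copied from the contingency table in Q3
observed = np.array([[50, 80, 60, 30, 20],
                     [70, 100, 90, 50, 50]])

row_totals = observed.sum(axis=1, keepdims=True)  # shape (2, 1)
col_totals = observed.sum(axis=0, keepdims=True)  # shape (1, 5)
grand_total = observed.sum()

# Expected count under independence: row total * column total / grand total
expected_manual = row_totals * col_totals / grand_total

# Pearson chi-square statistic: sum of (O - E)^2 / E over all cells
chi2_manual = ((observed - expected_manual) ** 2 / expected_manual).sum()

# Degrees of freedom: (rows - 1) * (columns - 1)
dof_manual = (observed.shape[0] - 1) * (observed.shape[1] - 1)

print("Expected (manual):\n", expected_manual)
print("Chi-square (manual):", chi2_manual)
print("Degrees of freedom (manual):", dof_manual)
```

The manual values agree with `chi2_contingency` because SciPy applies Yates' continuity correction only when there is a single degree of freedom, which is not the case for this table.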


### Q6: Compare Chi-Square Value with Critical Value

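To decide whether to reject the null hypothesis, we compare the Chi-Square statistic to the critical value of the chi-square distribution with 4 degrees of freedom at a 0.05 significance level. We reject H₀ only if the statistic exceeds the critical value.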


In [9]:
# Q6: Find critical value
alpha = 0.05
critical_value = stats.chi2.ppf(q=1-alpha, df=dof)
print("Critical Value at 0.05 significance level:", critical_value)

if chi2 > critical_value:
    print("\nConclusion: Reject the null hypothesis.")
    print("There is a significant association between device type and customer satisfaction.")
else:
    print("\nConclusion: Fail to reject the null hypothesis.")
    print("There is no significant association between device type and customer satisfaction.")


Critical Value at 0.05 significance level: 9.487729036781154

Conclusion: Fail to reject the null hypothesis.
There is no significant association between device type and customer satisfaction.
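
The critical-value comparison above is equivalent to comparing the p-value with α. The sketch below illustrates this, assuming the statistic and degrees of freedom printed in Q5; `chi2_stat`, `reject_by_p`, and `reject_by_critical` are names introduced here for illustration.

```python
from scipy import stats

alpha = 0.05
chi2_stat = 5.638227513227513  # statistic printed in Q5
dof = 4                        # degrees of freedom printed in Q5

# Survival function: P(X >= chi2_stat) for a chi-square variable with `dof` degrees of freedom
p_value = stats.chi2.sf(chi2_stat, dof)

# Both decision rules should always agree
reject_by_p = p_value < alpha
reject_by_critical = chi2_stat > stats.chi2.ppf(1 - alpha, dof)

print("p-value:", p_value)
print("Reject H0 (p-value rule):", reject_by_p)
print("Reject H0 (critical-value rule):", reject_by_critical)
```

Reporting the p-value alongside the decision is often more informative than the critical value alone, since it shows how far the result is from the threshold.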


### Q7: Final Conclusion

- The Chi-Square statistic is **5.6382**, and the critical value at 4 degrees of freedom with α = 0.05 is **9.4877**.
- Since **Chi-Square < Critical Value** (equivalently, the p-value of 0.2278 exceeds 0.05), we **fail to reject the null hypothesis**.
- This means the data do **not provide statistically significant evidence of an association** between the type of smart home device and customer satisfaction.
