##Chi Square Test

####Association between Device Type and Customer Satisfaction

####Background:
Mizzare Corporation has collected data on customer satisfaction levels for two types of smart home devices: Smart Thermostats and Smart Lights. They want to determine if there's a significant association between the type of device purchased and the customer's satisfaction level.

####Data Provided:

| Satisfaction Level   | Smart Thermostat | Smart Light |
|----------------------|-----------------|-------------|
| Very Satisfied      | 50              | 70          |
| Satisfied           | 80              | 100         |
| Neutral             | 60              | 90          |
| Unsatisfied         | 30              | 50          |
| Very Unsatisfied    | 20              | 50          |  
  

####Objective:
To use the Chi-Square test for independence to determine if there's a significant association between the type of smart home device purchased (Smart Thermostats vs. Smart Lights) and the customer satisfaction level.


_____

###Assignment Tasks:



_____

####1. State the Hypotheses:

* Null Hypothesis $( H_0 )$: The type of smart device purchased and customer satisfaction are independent (i.e., no association).

* Alternative Hypothesis $( H_A )$: The type of smart device purchased and customer satisfaction are not independent (i.e., there is an association).

____

In [1]:
# Import necessary libraries
import pandas as pd
from scipy.stats import chi2_contingency

####2. Compute the Chi-Square Statistic:


In [2]:
# Define the contingency table (observed frequencies)
# Rows represent satisfaction levels, and columns represent device types
data = [[50, 70], [80, 100],[60, 90],[30, 50],[20, 50]]

In [3]:
# Perform the Chi-Square test for independence
stat, p, dof, expected = chi2_contingency(data)

In [4]:
# Display the computed Chi-Square statistic, p-value, degrees of freedom, and expected frequencies
print("Chi-Square Statistic:", stat, "\n")
print("p-value:", p, "\n")
print("Degrees of Freedom:", dof, "\n")
print("Expected Frequencies:\n", expected, "\n")

Chi-Square Statistic: 5.638227513227513 

p-value: 0.22784371130697179 

Degrees of Freedom: 4 

Expected Frequencies:
 [[ 48.  72.]
 [ 72. 108.]
 [ 60.  90.]
 [ 32.  48.]
 [ 28.  42.]] 



* Chi-Square Statistic: 5.6382 → Measures how much observed values deviate from expected values.
* p-value: 0.2278 → Probability of obtaining a Chi-Square statistic at least this large if the null hypothesis is true.
* Degrees of Freedom (df): 4 → Computed as (rows - 1) * (columns - 1).
* Expected Frequencies: The counts expected in each cell if there were no association between device type and satisfaction, computed as (row total * column total) / grand total.
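
As a cross-check on the quantities described above, the statistic can be recomputed by hand from the observed table. This is a minimal sketch using NumPy; the names `observed`, `expected_manual`, `chi2_manual`, and `dof_manual` are illustrative and not part of the original notebook. No continuity correction is applied here, which is consistent with `chi2_contingency` for this table, since Yates' correction only applies when there is 1 degree of freedom.

```python
import numpy as np

# Observed counts: rows are satisfaction levels, columns are device types
observed = np.array([[50, 70], [80, 100], [60, 90], [30, 50], [20, 50]])

# Marginal totals
row_totals = observed.sum(axis=1, keepdims=True)   # shape (5, 1)
col_totals = observed.sum(axis=0, keepdims=True)   # shape (1, 2)
grand_total = observed.sum()

# Expected count per cell under independence: (row total * column total) / grand total
expected_manual = row_totals * col_totals / grand_total

# Chi-Square statistic: sum over all cells of (observed - expected)^2 / expected
chi2_manual = ((observed - expected_manual) ** 2 / expected_manual).sum()

# Degrees of freedom: (rows - 1) * (columns - 1)
dof_manual = (observed.shape[0] - 1) * (observed.shape[1] - 1)

print("Expected Frequencies:\n", expected_manual)
print("Chi-Square Statistic:", chi2_manual)
print("Degrees of Freedom:", dof_manual)
```

The values should agree with the `chi2_contingency` output shown above, up to floating-point rounding.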

_____

####3. Determine the Critical Value:
Use a significance level (alpha) of 0.05 and the degrees of freedom from the test, computed as (rows - 1) * (columns - 1) = (5 - 1) * (2 - 1) = 4.


In [5]:
# Import the stats module from scipy for statistical functions
from scipy import stats

In [6]:
# Define the significance level (alpha), which is the probability of rejecting the null hypothesis when it is actually true
alpha = 0.05
# Compute the critical value for the chi-square distribution
# This finds the value where 95% (1 - alpha) of the distribution lies below it, given the degrees of freedom (dof)
chi2_critical = stats.chi2.ppf(1 - alpha, dof)
chi2_critical

9.487729036781154

stats.chi2.ppf(1 - alpha, dof) calculates the critical value for a given alpha (0.05) and degrees of freedom (4).  
The output 9.4877 is the threshold above which we reject the null hypothesis.
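
To see why this value acts as the threshold, one can check that the area of the chi-square distribution to the right of the critical value equals alpha. The sketch below is illustrative only; it redefines `alpha` and `dof` with the values used in this notebook so that it runs on its own, and `upper_tail` is a name introduced for the example.

```python
from scipy import stats

alpha = 0.05  # significance level used in this notebook
dof = 4       # (5 - 1) * (2 - 1) for the 5 x 2 table

# Critical value: the point below which 95% of the distribution lies
chi2_critical = stats.chi2.ppf(1 - alpha, dof)

# Survival function (1 - CDF): the upper-tail area beyond the critical value
upper_tail = stats.chi2.sf(chi2_critical, dof)

print("Critical value:", chi2_critical)
print("Upper-tail area:", upper_tail)
```

The upper-tail area should come out equal to alpha (0.05), up to floating-point rounding.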

________


####4. Make a Decision:
Compare the Chi-Square statistic with the critical value to decide whether to reject the null hypothesis.

In [7]:
# Compare the computed Chi-Square Statistic with the Critical Value
if stat > chi2_critical:
    print("We Reject Null Hypothesis")  # If the test statistic is greater, we reject H0 (significant association)
else:
    print("We Fail to Reject Null Hypothesis")  # Otherwise, we fail to reject H0 (no significant association)

We Fail to Reject Null Hypothesis


* Chi-Square Statistic (5.6382) < Critical Value (9.4877)  
* Since our test statistic is less than the critical value, we fail to reject the null hypothesis.  
* Output: "We Fail to Reject Null Hypothesis"

Since the computed Chi-Square statistic (5.6382) is less than the critical value (9.4877), and equivalently the p-value (0.2278) is greater than 0.05,
we fail to reject the null hypothesis. At the 5% significance level, the data do not provide sufficient evidence of an association
between customer satisfaction and the type of smart home device purchased.
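
The same decision can be reached from the p-value alone, since comparing the p-value with alpha is equivalent to comparing the statistic with the critical value. The following is a minimal sketch in which the statistic is hard-coded from the output printed earlier so that the block runs on its own; `p_value` and `reject_h0` are illustrative names.

```python
from scipy import stats

stat = 5.638227513227513  # Chi-Square statistic reported by chi2_contingency above
dof = 4                   # degrees of freedom for the 5 x 2 table
alpha = 0.05              # significance level

# p-value: upper-tail probability of the chi-square distribution at the observed statistic
p_value = stats.chi2.sf(stat, dof)

# Reject H0 only when the p-value falls below alpha
reject_h0 = p_value < alpha

print("p-value:", p_value)
if reject_h0:
    print("We Reject Null Hypothesis")
else:
    print("We Fail to Reject Null Hypothesis")
```

With these inputs the p-value is about 0.2278, so the p-value rule agrees with the critical-value comparison.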

____