**CHI-SQUARE TEST**


Association between Device Type and Customer Satisfaction

Background:

Mizzare Corporation has collected data on customer satisfaction levels for two types of smart home devices: Smart Thermostats and Smart Lights. They want to determine if there's a significant association between the type of device purchased and the customer's satisfaction level.

Data Provided:

The data is summarized in a contingency table showing the counts of customers in each satisfaction level for both types of devices:

| Satisfaction | Smart Thermostat | Smart Light | Total |
|:------------ |:------------ |:------------ |:------------ |
| Very Satisfied | 50 | 70 | 120 |
| Satisfied | 80 | 100 | 180 |
| Neutral | 60 | 90 | 150 |
| Unsatisfied | 30 | 50 | 80 |
| Very Unsatisfied | 20 | 50 | 70 |
| Total | 240 | 360 | 600 |


Objective:

To use the Chi-Square test for independence to determine whether there is a significant association between the type of smart home device purchased (Smart Thermostats vs. Smart Lights) and the customer satisfaction level.

Assignment Tasks:

1. State the Hypotheses:

2. Compute the Chi-Square Statistic:

3. Determine the Critical Value:

Use a significance level (alpha) of 0.05 and the degrees of freedom, which for a test of independence is (number of rows − 1) × (number of columns − 1).

4. Make a Decision:

Compare the Chi-Square statistic with the critical value to decide whether to reject the null hypothesis.

Submission Guidelines:

•	Provide a detailed report of your analysis in a Python file, including each step outlined in the assignment tasks.

•	Include all calculations, the Chi-Square statistic, the critical value, and your conclusion.


1. State the Hypotheses:


Null Hypothesis ($H_0$): There is no significant association between the type of smart home device purchased and the customer satisfaction level.

Alternative Hypothesis ($H_A$): There is a significant association between the type of smart home device purchased and the customer satisfaction level.

2. Compute the Chi-Square Statistic:

We'll calculate the chi-square statistic using the observed frequencies from the contingency table and the expected frequencies.
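
For reference, the statistic for a test of independence takes the standard form below. The symbols are introduced here for illustration and are not part of the assignment text: $O_{ij}$ is the observed count in row $i$ and column $j$, $E_{ij}$ is the count expected under independence, $R_i$ and $C_j$ are the row and column totals, and $N$ is the grand total.

$$E_{ij} = \frac{R_i \, C_j}{N}, \qquad \chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$

For example, the expected count for Very Satisfied Smart Thermostat customers is $E_{11} = 120 \times 240 / 600 = 48$.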

In [1]:
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency

# Observed frequencies
data = np.array([[50, 70],
                 [80, 100],
                 [60, 90],
                 [30, 50],
                 [20, 50]])

# Create a DataFrame for better visualization
df = pd.DataFrame(data, columns=['Smart Thermostat', 'Smart Light'],
                  index=['Very Satisfied', 'Satisfied', 'Neutral', 'Unsatisfied', 'Very Unsatisfied'])

# Perform the Chi-Square test
# (the statistic is named chi2_stat so it does not clash with scipy.stats.chi2)
chi2_stat, p, dof, expected = chi2_contingency(df)

print(f"Chi-Square Statistic: {chi2_stat}")
print(f"P-value: {p}")
print(f"Degrees of Freedom: {dof}")
print("Expected Frequencies:\n", expected)


Chi-Square Statistic: 5.638227513227513
P-value: 0.22784371130697179
Degrees of Freedom: 4
Expected Frequencies:
 [[ 48.  72.]
 [ 72. 108.]
 [ 60.  90.]
 [ 32.  48.]
 [ 28.  42.]]
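
As a cross-check, the expected frequencies and the statistic can be reproduced by hand from the row and column totals. This is a minimal sketch using only NumPy; the names `observed`, `expected_manual`, and `chi2_manual` are illustrative and do not appear in the cells above.

```python
import numpy as np

# Observed counts from the contingency table
# (rows: satisfaction levels; columns: Smart Thermostat, Smart Light)
observed = np.array([[50, 70],
                     [80, 100],
                     [60, 90],
                     [30, 50],
                     [20, 50]])

row_totals = observed.sum(axis=1, keepdims=True)  # shape (5, 1)
col_totals = observed.sum(axis=0, keepdims=True)  # shape (1, 2)
grand_total = observed.sum()

# Expected count under independence: (row total * column total) / grand total
expected_manual = row_totals * col_totals / grand_total

# Chi-square statistic: sum of (O - E)^2 / E over all cells
chi2_manual = ((observed - expected_manual) ** 2 / expected_manual).sum()

print("Expected Frequencies (manual):\n", expected_manual)
print(f"Chi-Square Statistic (manual): {chi2_manual:.4f}")
```

The manual statistic should agree with the value from `chi2_contingency`, since no continuity correction is applied when the degrees of freedom exceed 1.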


3. Determine the Critical Value:

Using a significance level (alpha) of 0.05 and the degrees of freedom ($df$), we can find the critical value from the chi-square distribution table. The degrees of freedom for this test are calculated as:

$$df = (r - 1) \times (c - 1)$$

where $r$ is the number of rows and $c$ is the number of columns.

For our data:

$$df = (5 - 1) \times (2 - 1) = 4$$

At alpha = 0.05 and df = 4, the critical value is the 95th percentile of the chi-square distribution, which can be read from a table or computed with `chi2.ppf`.

In [2]:
from scipy.stats import chi2

alpha = 0.05
critical_value = chi2.ppf(1 - alpha, dof)
print(f"Critical Value: {critical_value}")


Critical Value: 9.487729036781154


4. Make a Decision:

Compare the Chi-Square statistic to the critical value.

- If the Chi-Square statistic is greater than the critical value, we reject the null hypothesis.

- If the Chi-Square statistic is less than or equal to the critical value, we fail to reject the null hypothesis.

In [3]:
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency, chi2

# Observed frequencies
data = np.array([[50, 70],
                 [80, 100],
                 [60, 90],
                 [30, 50],
                 [20, 50]])

# Create a DataFrame for better visualization
df = pd.DataFrame(data, columns=['Smart Thermostat', 'Smart Light'],
                  index=['Very Satisfied', 'Satisfied', 'Neutral', 'Unsatisfied', 'Very Unsatisfied'])

# Perform the Chi-Square test
chi2_stat, p, dof, expected = chi2_contingency(df)

# Calculate the critical value
alpha = 0.05
critical_value = chi2.ppf(1 - alpha, dof)

# Print results
print(f"Chi-Square Statistic: {chi2_stat}")
print(f"P-value: {p}")
print(f"Degrees of Freedom: {dof}")
print("Expected Frequencies:\n", expected)
print(f"Critical Value: {critical_value}")

# Conclusion
if chi2_stat > critical_value:
    print("Reject the null hypothesis - There is a significant association between the type of device and customer satisfaction level.")
else:
    print("Fail to reject the null hypothesis - No significant association between the type of device and customer satisfaction level.")


Chi-Square Statistic: 5.638227513227513
P-value: 0.22784371130697179
Degrees of Freedom: 4
Expected Frequencies:
 [[ 48.  72.]
 [ 72. 108.]
 [ 60.  90.]
 [ 32.  48.]
 [ 28.  42.]]
Critical Value: 9.487729036781154
Fail to reject the null hypothesis - No significant association between the type of device and customer satisfaction level.
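
The same decision can be reached from the p-value instead of the critical value: reject the null hypothesis when the p-value is below alpha. The sketch below assumes the statistic and degrees of freedom reported above and recomputes the p-value as the upper-tail probability of the chi-square distribution; the names `p_value` and `reject` are illustrative.

```python
from scipy.stats import chi2

chi2_stat = 5.638227513227513  # statistic reported in the output above
dof = 4                        # (5 - 1) * (2 - 1)
alpha = 0.05

# Upper-tail probability P(X >= chi2_stat) for X ~ chi-square(dof)
p_value = chi2.sf(chi2_stat, dof)

# Equivalent decision rule: reject H0 when p-value < alpha
reject = p_value < alpha

print(f"P-value: {p_value:.4f}")
print("Reject the null hypothesis" if reject else "Fail to reject the null hypothesis")
```

Because the p-value (about 0.228) exceeds 0.05, this rule agrees with the critical-value comparison: the data do not provide evidence of an association between device type and satisfaction level.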
