# CHI-SQUARE TEST

## Association between Device Type and Customer Satisfaction

### Background:

Mizzare Corporation has collected data on customer satisfaction levels for two types of smart home devices: Smart Thermostats and Smart Lights. They want to determine if there's a significant association between the type of device purchased and the customer's satisfaction level.

### Data Provided:

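The data is summarized in a contingency table showing the counts of customers in each satisfaction level for both types of devices:

| Satisfaction | Smart Thermostat | Smart Light | Total |
|---|---|---|---|
| Very Satisfied | 50 | 70 | 120 |
| Satisfied | 80 | 100 | 180 |
| Neutral | 60 | 90 | 150 |
| Unsatisfied | 30 | 50 | 80 |
| Very Unsatisfied | 20 | 50 | 70 |
| Total | 240 | 360 | 600 |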


### Objective:
Use the Chi-Square test for independence to determine whether there is a significant association between the type of smart home device purchased (Smart Thermostats vs. Smart Lights) and the customer satisfaction level.
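
The objective can be stated compactly in symbols. The notation below is an assumption introduced for illustration, not the assignment's own: $O_{ij}$ is the observed count in satisfaction row $i$ and device column $j$, $E_{ij}$ is the count expected if the two variables were independent, $R_i$ and $C_j$ are the row and column totals, and $N$ is the grand total.

$$
\begin{aligned}
H_0 &: \text{device type and satisfaction level are independent} \\
H_1 &: \text{device type and satisfaction level are associated} \\
E_{ij} &= \frac{R_i \, C_j}{N} \\
\chi^2 &= \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} \\
\text{df} &= (r - 1)(c - 1)
\end{aligned}
$$

With $r = 5$ satisfaction levels and $c = 2$ device types, this gives $\text{df} = (5 - 1)(2 - 1) = 4$.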


### Assignment Tasks:
1. State the Hypotheses:
2. Compute the Chi-Square Statistic:
3. Determine the Critical Value:
   Use a significance level (alpha) of 0.05 and the degrees of freedom, which for a test of independence is (number of rows minus 1) × (number of columns minus 1); here that is (5 - 1) × (2 - 1) = 4.
4. Make a Decision:
   Compare the Chi-Square statistic with the critical value to decide whether to reject the null hypothesis.
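
To make tasks 2 to 4 concrete, here is a minimal sketch that computes the statistic by hand with NumPy rather than calling a ready-made test function. It assumes the counts from the survey data in this assignment; the variable names (`row_totals`, `chi2_manual`, `reject_null`) are illustrative choices only.

```python
import numpy as np
from scipy import stats

# Observed counts (rows: Very Satisfied ... Very Unsatisfied;
# columns: Smart Thermostat, Smart Light).
observed = np.array([
    [50, 70],
    [80, 100],
    [60, 90],
    [30, 50],
    [20, 50],
])

# Marginal totals, kept 2-D so they broadcast into a full table.
row_totals = observed.sum(axis=1, keepdims=True)   # shape (5, 1)
col_totals = observed.sum(axis=0, keepdims=True)   # shape (1, 2)
grand_total = observed.sum()

# Expected count under independence: row total * column total / grand total.
expected = row_totals * col_totals / grand_total

# Chi-square statistic: sum of (observed - expected)^2 / expected over all cells.
chi2_manual = ((observed - expected) ** 2 / expected).sum()

# Degrees of freedom for a test of independence: (rows - 1) * (columns - 1).
dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)

# Critical value at alpha = 0.05 and the resulting decision.
alpha = 0.05
critical_value = stats.chi2.ppf(1 - alpha, dof)
reject_null = chi2_manual > critical_value

print(f"chi2 = {chi2_manual:.4f}, dof = {dof}, critical = {critical_value:.4f}")
print("Reject H0" if reject_null else "Fail to reject H0")
```

With these counts the statistic (about 5.64) falls below the critical value (about 9.49), so the sketch reports failing to reject the null hypothesis.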

In [1]:
# chi_square_analysis.py

import pandas as pd
import scipy.stats as stats

# Step 1: Define the observed data
data = {
    "Satisfaction": ["Very Satisfied", "Satisfied", "Neutral", "Unsatisfied", "Very Unsatisfied"],
    "Smart Thermostat": [50, 80, 60, 30, 20],
    "Smart Light": [70, 100, 90, 50, 50]
}

df = pd.DataFrame(data)
df.set_index("Satisfaction", inplace=True)

# Add a Total column (sum of Smart Thermostat and Smart Light)
df["Total"] = df["Smart Thermostat"] + df["Smart Light"]

# Step 2: Create the contingency table (only the first two columns)
observed = df[["Smart Thermostat", "Smart Light"]].values

# Step 3: Perform the Chi-Square Test
chi2_stat, p_val, dof, expected = stats.chi2_contingency(observed)

# Step 4: Determine the critical value
alpha = 0.05
critical_value = stats.chi2.ppf(1 - alpha, dof)

# Step 5: Print the results
print("Chi-Square Test for Independence")
print("----------------------------------")
print("\nObserved Frequencies with Totals:")
print(df)
print("----------------------------------")
print("\nExpected Frequencies:")
expected_df = pd.DataFrame(expected, index=df.index, columns=["Smart Thermostat", "Smart Light"])
print(expected_df.round(2))
print("----------------------------------")
print(f"\nChi-Square Statistic: {chi2_stat:.4f}")
print(f"Degrees of Freedom: {dof}")
print(f"Critical Value (α = {alpha}): {critical_value:.4f}")
print(f"P-Value: {p_val:.4f}")
print("----------------------------------")

# Step 6: Conclusion
if chi2_stat > critical_value:
    print("\nConclusion: Reject the null hypothesis.")
    print("There is a significant association between device type and customer satisfaction.")
else:
    print("\nConclusion: Fail to reject the null hypothesis.")
    print("There is no significant association between device type and customer satisfaction.")


Chi-Square Test for Independence
----------------------------------

Observed Frequencies with Totals:
                  Smart Thermostat  Smart Light  Total
Satisfaction                                          
Very Satisfied                  50           70    120
Satisfied                       80          100    180
Neutral                         60           90    150
Unsatisfied                     30           50     80
Very Unsatisfied                20           50     70
----------------------------------

Expected Frequencies:
                  Smart Thermostat  Smart Light
Satisfaction                                   
Very Satisfied                48.0         72.0
Satisfied                     72.0        108.0
Neutral                       60.0         90.0
Unsatisfied                   32.0         48.0
Very Unsatisfied              28.0         42.0
----------------------------------

Chi-Square Statistic: 5.6382
Degrees of Freedom: 4
Critical Value (α = 0.05): 9.48