# CHI-SQUARE TEST

## Association between Device Type and Customer Satisfaction
### Background:
#### Mizzare Corporation has collected data on customer satisfaction levels for two types of smart home devices: Smart Thermostats and Smart Lights. They want to determine if there's a significant association between the type of device purchased and the customer's satisfaction level.
### Data Provided:
#### The data is summarized in a contingency table showing the counts of customers in each satisfaction level for both types of devices:


| Satisfaction | Smart Thermostat | Smart Light | Total |
|---|---|---|---|
| Very Satisfied | 50 | 70 | 120 |
| Satisfied | 80 | 100 | 180 |
| Neutral | 60 | 90 | 150 |
| Unsatisfied | 30 | 50 | 80 |
| Very Unsatisfied | 20 | 50 | 70 |
| Total | 240 | 360 | 600 |

### Objective:
#### To use the Chi-Square test for independence to determine if there's a significant association between the type of smart home device purchased (Smart Thermostats vs. Smart Lights) and the customer satisfaction level.


### Assignment Tasks:
### 1. State the Hypotheses:
### Null Hypothesis (H₀):
#### There is no association between the type of smart home device (Smart Thermostat vs. Smart Light) and customer satisfaction level. In other words, customer satisfaction is independent of the device type.

### Alternative Hypothesis (H₁):
#### There is an association between the type of smart home device and customer satisfaction level, meaning that customer satisfaction depends on the device type.


### 2. Compute the Chi-Square Statistic:

$$\chi^2 = \sum_{i}\sum_{j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$

## where:

### $O_{ij}$ = observed frequency in row $i$, column $j$
### $E_{ij}$ = expected frequency in row $i$, column $j$
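
As a worked illustration of the formula, the expected count for a single cell can be computed by hand. The symbols $R_i$ (row total), $C_j$ (column total) and $N$ (grand total) are notation introduced here for the example; they do not appear in the original assignment. Under independence, each expected count is the row total times the column total divided by the grand total. For the Very Unsatisfied / Smart Thermostat cell:

$$E_{ij} = \frac{R_i \, C_j}{N}, \qquad E_{5,1} = \frac{70 \times 240}{600} = 28$$

$$\frac{(O_{5,1} - E_{5,1})^2}{E_{5,1}} = \frac{(20 - 28)^2}{28} = \frac{64}{28} \approx 2.29$$

The same calculation is repeated for all ten cells and the results are summed to obtain $\chi^2$.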



In [1]:
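import numpy as np
import scipy.stats as stats

# Observed counts: rows are satisfaction levels (Very Satisfied ... Very Unsatisfied),
# columns are Smart Thermostat and Smart Light.
observed = np.array([[50, 70],
                     [80, 100],
                     [60, 90],
                     [30, 50],
                     [20, 50]])

# Marginal totals, kept 2-D so they can be matrix-multiplied below.
row_totals = observed.sum(axis=1, keepdims=True)   # shape (5, 1)
col_totals = observed.sum(axis=0, keepdims=True)   # shape (1, 2)
total = observed.sum()

# Expected counts under independence: (row total * column total) / grand total.
expected = (row_totals @ col_totals) / total

# Chi-square statistic: sum over all cells of (O - E)^2 / E.
chi_square_stat = ((observed - expected) ** 2 / expected).sum()

# Degrees of freedom: (rows - 1) * (columns - 1).
df = (observed.shape[0] - 1) * (observed.shape[1] - 1)

# p-value: upper-tail probability of the chi-square distribution.
p_value = 1 - stats.chi2.cdf(chi_square_stat, df)

chi_square_stat, df, p_value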


(5.638227513227513, 4, 0.22784371130697179)
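
As a cross-check on the manual calculation, the same test can be run with SciPy's built-in `scipy.stats.chi2_contingency`. This is a minimal sketch rather than part of the original assignment; the `_check` variable names are chosen here only to avoid overwriting the values computed above. The statistic, degrees of freedom and p-value should agree with the manual result.

```python
import numpy as np
from scipy import stats

# Same observed counts as the contingency table
# (rows: satisfaction levels; columns: Smart Thermostat, Smart Light).
observed = np.array([[50, 70],
                     [80, 100],
                     [60, 90],
                     [30, 50],
                     [20, 50]])

# correction=False disables Yates' continuity correction. SciPy only applies
# that correction when df == 1, so it would not affect this 5x2 table anyway.
chi2_check, p_check, df_check, expected_check = stats.chi2_contingency(
    observed, correction=False
)

print(round(chi2_check, 4), df_check, round(p_check, 4))
print(expected_check)  # expected counts under independence
```

The returned expected counts are also a convenient way to inspect the table of $E_{ij}$ values without building it by hand.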

### 3. Determine the Critical Value:
#### Use a significance level (alpha) of 0.05 and the degrees of freedom for a test of independence: df = (number of rows − 1) × (number of columns − 1) = (5 − 1) × (2 − 1) = 4.

In [3]:
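import scipy.stats as stats

alpha = 0.05   # significance level
df = 4         # (5 - 1) * (2 - 1)

# Critical value: the 95th percentile of the chi-square distribution with df = 4.
critical_value = stats.chi2.ppf(1 - alpha, df)
critical_value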

9.487729036781154

### 4. Make a Decision:
#### Compare the Chi-Square statistic with the critical value to decide whether to reject the null hypothesis.


In [4]:
if chi_square_stat > critical_value:
    decision = "Reject the null hypothesis: There is a significant association between device type and customer satisfaction."
else:
    decision = "Fail to reject the null hypothesis: There is no significant association between device type and customer satisfaction."

decision


'Fail to reject the null hypothesis: There is no significant association between device type and customer satisfaction.'

#### Chi-Square Statistic (χ²): 5.64
#### Critical Value: 9.49
#### Decision Rule:
#### If χ² > 9.49, reject the null hypothesis (H₀).
#### If χ² ≤ 9.49, fail to reject the null hypothesis (H₀).
#### Since 5.64 < 9.49, we fail to reject the null hypothesis.
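
The critical-value rule is equivalent to comparing the p-value with alpha. The sketch below shows that both routes lead to the same decision. The names `chi_square_value`, `dof`, `p_val` and `crit` are hypothetical and used only in this example; the statistic is copied from the output shown earlier.

```python
from scipy import stats

alpha = 0.05
chi_square_value = 5.638227513227513   # statistic from the earlier calculation
dof = 4                                # (5 - 1) * (2 - 1)

# Survival function: P(X >= chi_square_value) for a chi-square variable with dof df.
p_val = stats.chi2.sf(chi_square_value, dof)

# Critical value at the same alpha, for comparison.
crit = stats.chi2.ppf(1 - alpha, dof)

reject_by_p = bool(p_val < alpha)
reject_by_crit = bool(chi_square_value > crit)

print(f"p-value = {p_val:.4f}, reject H0 (p-value rule): {reject_by_p}")
print(f"critical value = {crit:.4f}, reject H0 (critical-value rule): {reject_by_crit}")
```

Using `chi2.sf` instead of `1 - chi2.cdf` gives the same quantity and is usually more accurate for very small p-values.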

### Conclusion:
#### There is no statistically significant association between the type of smart home device (Smart Thermostat vs. Smart Light) and customer satisfaction level at the 0.05 significance level (χ² = 5.64, df = 4, p ≈ 0.228). The data do not provide sufficient evidence that satisfaction depends on device type; note that failing to reject H₀ does not prove that the two variables are independent.