## Assignment No. 4, Part I: Chi-Square Test

# Chi-Square Test Analysis
## Association between Device Type and Customer Satisfaction

### Background:
Mizzare Corporation has collected data on customer satisfaction levels for two types of smart home devices: **Smart Thermostats** and **Smart Lights**. They aim to determine if there's a significant association between the type of device purchased and the customer satisfaction level.

### Data Provided:

| Satisfaction       | Smart Thermostat | Smart Light | Total |
|--------------------|------------------|-------------|-------|
| Very Satisfied      | 50               | 70          | 120   |
| Satisfied           | 80               | 100         | 180   |
| Neutral             | 60               | 90          | 150   |
| Unsatisfied         | 30               | 50          | 80    |
| Very Unsatisfied    | 20               | 50          | 70    |
| **Total**           | **240**          | **360**     | **600** |

### Objective:
To use the **Chi-Square test for independence** to determine if there is a significant association between the type of smart home device purchased (Smart Thermostats vs. Smart Lights) and the customer satisfaction level.

---

### Steps Involved in the Chi-Square Test:

#### 1. Hypotheses:

- **Null Hypothesis (H₀):**  
   *There is no association between the type of smart home device (Smart Thermostat or Smart Light) and customer satisfaction level.*
  
- **Alternative Hypothesis (H₁):**  
   *There is an association between the type of smart home device and customer satisfaction level.*

#### 2. Compute the Chi-Square Statistic:

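First compute the expected frequency of each cell under the null hypothesis as (row total × column total) / grand total. Then, using the observed and expected frequencies, calculate the Chi-Square statistic with the formula: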

\[
\chi^2 = \sum \frac{(O - E)^2}{E}
\]

Where:
- \(O\) = Observed frequency
- \(E\) = Expected frequency
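
To make the formula concrete, here is a minimal sketch that applies it to the table above with NumPy. The variable names (`observed`, `expected`, `chi2_manual`) are illustrative, and the expected counts are assumed to follow the usual independence rule, (row total × column total) / grand total.

```python
import numpy as np

# Observed counts (rows: satisfaction levels; columns: Smart Thermostat, Smart Light)
observed = np.array([[50, 70],    # Very Satisfied
                     [80, 100],   # Satisfied
                     [60, 90],    # Neutral
                     [30, 50],    # Unsatisfied
                     [20, 50]])   # Very Unsatisfied

row_totals = observed.sum(axis=1)
column_totals = observed.sum(axis=0)
grand_total = observed.sum()

# Expected counts under independence: (row total * column total) / grand total
expected = np.outer(row_totals, column_totals) / grand_total

# Sum of (O - E)^2 / E over all ten cells
chi2_manual = ((observed - expected) ** 2 / expected).sum()
print(round(chi2_manual, 4))
```

The printed value should agree with the statistic that `chi2_contingency` reports for the same table, since no continuity correction is applied when df > 1.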

#### 3. Determine the Critical Value:

- Significance level (\(\alpha\)) = 0.05
- Degrees of freedom (df) = (Number of rows - 1) × (Number of columns - 1)
  \[
  df = (5 - 1)(2 - 1) = 4
  \]

Using the Chi-Square distribution table, find the critical value for \(df = 4\) and \(\alpha = 0.05\).
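
As an alternative to a printed table, the critical value can be looked up programmatically. This is a minimal sketch that assumes SciPy is installed; `chi2.ppf` is the inverse CDF (quantile function) of the chi-square distribution, and the variable names are illustrative.

```python
from scipy.stats import chi2

alpha = 0.05
df = (5 - 1) * (2 - 1)  # (rows - 1) * (columns - 1)

# Upper-tail critical value: the (1 - alpha) quantile of the chi-square distribution
critical_value = chi2.ppf(1 - alpha, df)
print(round(critical_value, 3))
```

Using `1 - alpha` here reflects that the Chi-Square test for independence is an upper-tail test: only large values of the statistic count as evidence against H₀.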

#### 4. Make a Decision:

- Compare the calculated Chi-Square statistic to the critical value.
- If the Chi-Square statistic > Critical value: Reject the null hypothesis.
- If the Chi-Square statistic ≤ Critical value: Fail to reject the null hypothesis.
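
The decision rule above can be sketched as a small helper. The function name `decide` is hypothetical, and the example inputs are assumptions taken from outside this step: 5.638 is the rounded statistic for the table in this assignment, and 9.488 is the tabulated critical value for df = 4 at α = 0.05.

```python
def decide(chi2_statistic: float, critical_value: float) -> str:
    """Apply the critical-value decision rule for a chi-square test."""
    if chi2_statistic > critical_value:
        return "Reject H0"
    return "Fail to reject H0"


# 5.638: rounded statistic for the table above (assumed input).
# 9.488: tabulated critical value for df = 4, alpha = 0.05 (assumed input).
print(decide(5.638, 9.488))
```

Comparing the statistic with the critical value is equivalent to comparing the p-value with α, so either form of the rule leads to the same decision.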

### Conclusion:
Based on the results, state whether there is a significant association between the type of device and customer satisfaction.

---


In [22]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from scipy.stats import chi2_contingency


In [23]:

# Data provided in the contingency table
observed_data = np.array([[50, 70],  # Very Satisfied
                          [80, 100], # Satisfied
                          [60, 90],  # Neutral
                          [30, 50],  # Unsatisfied
                          [20, 50]]) # Very Unsatisfied


In [24]:
observed_data


array([[ 50,  70],
       [ 80, 100],
       [ 60,  90],
       [ 30,  50],
       [ 20,  50]])

In [25]:
# Sum of each row and column
row_totals = np.sum(observed_data, axis=1)
column_totals = np.sum(observed_data, axis=0)
total = np.sum(row_totals)


In [26]:
# Calculating the expected frequencies
expected_data = np.outer(row_totals, column_totals) / total


In [27]:

# Performing the Chi-Square test
chi2_statistic, p_value, dof, expected = chi2_contingency(observed_data)


In [29]:
chi2_statistic


5.638227513227513

In [30]:
expected


array([[ 48.,  72.],
       [ 72., 108.],
       [ 60.,  90.],
       [ 32.,  48.],
       [ 28.,  42.]])

In [44]:
# Combine observed and expected counts in one table for side-by-side comparison
expected_df = pd.DataFrame(observed_data, columns=["Smart Thermostat", "Smart Light"])
expected_df["Satisfaction Level"] = ["Very Satisfied", "Satisfied", "Neutral", "Unsatisfied", "Very Unsatisfied"]
expected_df["Smart Thermostat Expected"] = expected[:, 0]
expected_df["Smart Light Expected"] = expected[:, 1]


In [45]:
expected_df


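|   | Smart Thermostat | Smart Light | Satisfaction Level | Smart Thermostat Expected | Smart Light Expected |
|---|------------------|-------------|--------------------|---------------------------|----------------------|
| 0 | 50               | 70          | Very Satisfied     | 48.0                      | 72.0                 |
| 1 | 80               | 100         | Satisfied          | 72.0                      | 108.0                |
| 2 | 60               | 90          | Neutral            | 60.0                      | 90.0                 |
| 3 | 30               | 50          | Unsatisfied        | 32.0                      | 48.0                 |
| 4 | 20               | 50          | Very Unsatisfied   | 28.0                      | 42.0                 |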


In [33]:
p_value


0.22784371130697179

In [34]:
alpha = 0.05


In [35]:
if p_value < alpha:
    print("\nConclusion: Reject the null hypothesis (H₀). There is a significant association between the type of device and customer satisfaction.")
else:
    print("\nConclusion: Fail to reject the null hypothesis (H₀). No significant association found between the type of device and customer satisfaction.")



Conclusion: Fail to reject the null hypothesis (H₀). No significant association found between the type of device and customer satisfaction.
