# 1. State the Hypotheses
- Null Hypothesis (H₀): There is no association between the type of smart home device purchased (Smart Thermostat vs. Smart Light) and customer satisfaction level.
- Alternative Hypothesis (H₁): There is an association between the type of smart home device purchased and customer satisfaction level.

In [63]:
import pandas as pd
import researchpy as rp

In [64]:
# Data: create a DataFrame from the observed frequencies

data = {
    'Device_Type': ['Smart Thermostat'] * 5 + ['Smart Light'] * 5,
    'Satisfaction': ['Very Satisfied', 'Satisfied', 'Neutral', 'Unsatisfied', 'Very Unsatisfied'] * 2,
    'Count': [50, 80, 60, 30, 20, 70, 100, 90, 50, 50]
}
df = pd.DataFrame(data)
df

        Device_Type      Satisfaction  Count
0  Smart Thermostat    Very Satisfied     50
1  Smart Thermostat         Satisfied     80
2  Smart Thermostat           Neutral     60
3  Smart Thermostat       Unsatisfied     30
4  Smart Thermostat  Very Unsatisfied     20
5       Smart Light    Very Satisfied     70
6       Smart Light         Satisfied    100
7       Smart Light           Neutral     90
8       Smart Light       Unsatisfied     50
9       Smart Light  Very Unsatisfied     50


In [65]:
# Creating the contingency table to match the format provided in the question
contingency_table = pd.crosstab(df['Satisfaction'], df['Device_Type'], values=df['Count'], aggfunc='sum', margins=True)
contingency_table

Device_Type       Smart Light  Smart Thermostat  All
Satisfaction
Neutral                    90                60  150
Satisfied                 100                80  180
Unsatisfied                50                30   80
Very Satisfied             70                50  120
Very Unsatisfied           50                20   70
All                       360               240  600
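
Before running the test, it can be worth checking the expected cell counts under H₀, because the chi-square approximation is generally considered reliable only when expected counts are not too small (a common rule of thumb is at least 5 per cell). The sketch below is illustrative and not part of the original analysis: it hard-codes the observed counts from the table above, and the variable names are assumptions.

```python
import numpy as np

# Observed counts from the contingency table above (margins excluded).
# Rows: Neutral, Satisfied, Unsatisfied, Very Satisfied, Very Unsatisfied
# Columns: Smart Light, Smart Thermostat
observed = np.array([
    [90, 60],
    [100, 80],
    [50, 30],
    [70, 50],
    [50, 20],
])

row_totals = observed.sum(axis=1)
col_totals = observed.sum(axis=0)
n = observed.sum()

# Expected count under independence: (row total * column total) / grand total
expected = np.outer(row_totals, col_totals) / n

print(expected)
print("Smallest expected count:", expected.min())
```

For this table the smallest expected count is 28 (Very Unsatisfied, Smart Thermostat), so the rule of thumb is comfortably met.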


# 2. Compute the Chi-Square Statistic

In [66]:
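# Expand the data so that each row represents one customer response,
# repeating every (Device_Type, Satisfaction) row 'Count' times
df_expanded = df.loc[df.index.repeat(df['Count'])].reset_index(drop=True)

# Perform the chi-square test using researchpy.
# rp.crosstab returns the crosstab table and a DataFrame of test results.
table, results = rp.crosstab(df_expanded['Satisfaction'], df_expanded['Device_Type'], test='chi-square')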

# Print the chi-square test results
print("\nChi-Square Test Results:")
print(results)


Chi-Square Test Results:
                Chi-square test  results
0  Pearson Chi-square ( 4.0) =    5.6382
1                    p-value =    0.2278
2                 Cramer's V =    0.0969



In [67]:
chi_square_stat = results['results'].iloc[0]
chi_square_stat

5.6382
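
As a cross-check that does not depend on researchpy, the same statistic can be computed directly from the observed counts with `scipy.stats.chi2_contingency`, which evaluates χ² = Σ (O − E)² / E over all cells. The snippet below is a sketch rather than the method used above: the array and variable names are illustrative, and Cramér's V is computed by hand from its usual definition, V = sqrt(χ² / (n × (min(rows, columns) − 1))).

```python
import numpy as np
from scipy.stats import chi2_contingency

# Observed counts (margins excluded).
# Rows: Neutral, Satisfied, Unsatisfied, Very Satisfied, Very Unsatisfied
# Columns: Smart Light, Smart Thermostat
observed = np.array([
    [90, 60],
    [100, 80],
    [50, 30],
    [70, 50],
    [50, 20],
])

# Yates' correction only applies when dof == 1; it is disabled here
# explicitly so the result is the plain Pearson statistic.
chi2, p_value, dof, expected = chi2_contingency(observed, correction=False)

# Cramér's V as an effect-size measure
n = observed.sum()
cramers_v = np.sqrt(chi2 / (n * (min(observed.shape) - 1)))

print(f"chi2 = {chi2:.4f}, dof = {dof}, p = {p_value:.4f}, V = {cramers_v:.4f}")
```

The values should agree with the researchpy results table above (χ² ≈ 5.6382, p ≈ 0.2278, V ≈ 0.0969).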

# 3. Determine the Critical Value
- The degrees of freedom (𝑑𝑓) for a chi-square test of independence are calculated as:
- 𝑑𝑓 = (number of rows − 1) × (number of columns − 1)
- For this table: rows = 5 (satisfaction categories), columns = 2 (device types).
- So, 𝑑𝑓 = (5 − 1) × (2 − 1) = 4.
- Using a significance level of 𝛼 = 0.05, we look up the critical value of the chi-square distribution with 4 degrees of freedom.
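
The steps in the list above can be sketched in a few lines so that the degrees of freedom are derived from the table dimensions instead of being typed in by hand. The names `n_rows`, `n_cols`, `alpha`, and `critical` are assumptions made for this illustration.

```python
from scipy.stats import chi2

n_rows, n_cols = 5, 2   # satisfaction categories, device types
alpha = 0.05            # significance level

# Degrees of freedom for a test of independence
dof = (n_rows - 1) * (n_cols - 1)

# Critical value: the (1 - alpha) quantile of the chi-square distribution
critical = chi2.ppf(1 - alpha, df=dof)

print(dof, round(critical, 2))
```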

In [68]:
import scipy.stats as stats

# Critical value: the 95th percentile (1 - alpha) of the chi-square distribution with 4 df
critical_value = stats.chi2.ppf(q=0.95, df=4)
critical_value.round(2)

9.49

# 4. Decision Making

In [69]:
# Decision based on the results

if chi_square_stat > critical_value:
    print("\nDecision: Reject the null hypothesis. There is a significant association between device type and satisfaction level.")
else:
    print("\nDecision: Fail to reject the null hypothesis. No significant association between device type and satisfaction level.")


Decision: Fail to reject the null hypothesis. No significant association between device type and satisfaction level.
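
The critical-value comparison above is equivalent to comparing the p-value with 𝛼. As a rough illustration, the p-value can be recovered from the reported statistic with the survival function of the chi-square distribution. The statistic is hard-coded here from the researchpy output, and the variable names are assumptions.

```python
from scipy.stats import chi2

chi_square_value = 5.6382  # Pearson chi-square reported by researchpy above
dof = 4
alpha = 0.05

# Survival function: P(X >= observed statistic) under H0
p_value = chi2.sf(chi_square_value, df=dof)

# Reject H0 only when the p-value falls below the significance level
reject_h0 = p_value < alpha

print(f"p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

Both rules always lead to the same decision, because the statistic exceeds the critical value exactly when the p-value is below 𝛼.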


# 5. Conclusion

In [70]:
# Conclusion based on the analysis above

print("Since the chi-square statistic (5.64) is less than the critical value (9.49), and the p-value (0.228) is greater than 0.05, we fail to reject the null hypothesis. This means there is no significant association between the type of smart home device purchased and the customer satisfaction level at the 0.05 significance level.")

Since the chi-square statistic (5.64) is less than the critical value (9.49), and the p-value (0.228) is greater than 0.05, we fail to reject the null hypothesis. This means there is no significant association between the type of smart home device purchased and the customer satisfaction level at the 0.05 significance level.
