## **Chi-Square Test**

In [1]:
import pandas as pd
import numpy as np
import scipy.stats as stats
data = pd.read_excel("/content/chi-square.xlsx")
data


Unnamed: 0,Satisfaction,Smart Thermostat,Smart Light,Total
0,Very Satisfied,50,70,120
1,Satisfied,80,100,180
2,Neutral,60,90,150
3,Unsatisfied,30,50,80
4,Very Unsatisfied,20,50,70
5,Total,240,360,600


Null Hypothesis (H0):
There is no significant association between the type of device purchased and customer satisfaction level.

Alternative Hypothesis (H1):
There is a significant association between the type of device purchased and customer satisfaction level.

In [2]:
# Observed counts (rows: satisfaction level; columns: Smart Thermostat, Smart Light)
data = np.array([[50, 70],    # Very Satisfied
                 [80, 100],   # Satisfied
                 [60, 90],    # Neutral
                 [30, 50],    # Unsatisfied
                 [20, 50]])   # Very Unsatisfied

data

array([[ 50,  70],
       [ 80, 100],
       [ 60,  90],
       [ 30,  50],
       [ 20,  50]])
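
Rather than retyping the counts, the observed matrix could also be taken from the loaded table by dropping the Total row and the Total column. The following is a minimal sketch, assuming the column names shown in the table output above. The frame is rebuilt inline under the hypothetical name `table` so that the snippet runs without the Excel file.

```python
import numpy as np
import pandas as pd

# Rebuild the table shown above so the example is self-contained
table = pd.DataFrame({
    "Satisfaction": ["Very Satisfied", "Satisfied", "Neutral",
                     "Unsatisfied", "Very Unsatisfied", "Total"],
    "Smart Thermostat": [50, 80, 60, 30, 20, 240],
    "Smart Light": [70, 100, 90, 50, 50, 360],
    "Total": [120, 180, 150, 80, 70, 600],
})

# Keep only the observed counts: drop the "Total" row and the "Total" column
observed = (
    table[table["Satisfaction"] != "Total"]
    [["Smart Thermostat", "Smart Light"]]
    .to_numpy()
)
print(observed)
```

Passing the marginal totals to the test would distort the result, so only the 5 x 2 block of observed counts is kept.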

In [3]:
from scipy import stats

# chi2_contingency returns the chi-square statistic, p-value,
# degrees of freedom, and the table of expected frequencies
chi2_stat, p_value, dof, expected = stats.chi2_contingency(data)
print("Chi-Square Statistic:", chi2_stat)
print("P-Value:", p_value)
print("Degrees of Freedom:", dof)
print("Expected Frequencies Table:")
print(expected)


Chi-Square Statistic: 5.638227513227513
P-Value: 0.22784371130697179
Degrees of Freedom: 4
Expected Frequencies Table:
[[ 48.  72.]
 [ 72. 108.]
 [ 60.  90.]
 [ 32.  48.]
 [ 28.  42.]]
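
The expected frequencies and the statistic reported above can be reproduced by hand, which may help show what the library call is doing. This is a sketch under the usual independence assumption, where each expected count is (row total x column total) / grand total. The names `observed`, `expected_manual`, `chi2_manual`, and `dof_manual` are illustrative and not part of the notebook.

```python
import numpy as np

# Observed counts from the contingency table (Total row and column excluded)
observed = np.array([[50, 70], [80, 100], [60, 90], [30, 50], [20, 50]])

row_totals = observed.sum(axis=1, keepdims=True)   # shape (5, 1)
col_totals = observed.sum(axis=0, keepdims=True)   # shape (1, 2)
grand_total = observed.sum()

# Expected count under independence: row total * column total / grand total
expected_manual = row_totals * col_totals / grand_total

# Pearson chi-square statistic: sum of (O - E)^2 / E over all cells
chi2_manual = ((observed - expected_manual) ** 2 / expected_manual).sum()

# Degrees of freedom: (rows - 1) * (columns - 1)
dof_manual = (observed.shape[0] - 1) * (observed.shape[1] - 1)

print(expected_manual)
print(chi2_manual, dof_manual)
```

Because the table has more than one degree of freedom, `chi2_contingency` applies no continuity correction, so the manual value should agree with the library output.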


In [4]:
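alpha = 0.05  # Significance level
# Chi-square critical value for a right-tailed test at level alpha
critical_value = stats.chi2.ppf(1 - alpha, dof)
print("Chi-Square Critical:", critical_value)
# P-value: probability of a statistic at least as large as the observed one
p_value = 1 - stats.chi2.cdf(chi2_stat, dof)
print("P-Value:", p_value)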

Chi-Square Critical: 9.487729036781154
P-Value: 0.22784371130697179
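
The critical-value rule and the p-value rule are two views of the same comparison, so they should always lead to the same decision. A small sketch, with the statistic and degrees of freedom hard-coded from the output above and with illustrative variable names:

```python
from scipy import stats

chi2_stat = 5.638227513227513   # statistic reported above
dof = 4
alpha = 0.05

critical_value = stats.chi2.ppf(1 - alpha, dof)
p_value = stats.chi2.sf(chi2_stat, dof)   # survival function, i.e. 1 - cdf

# Rule 1: reject H0 when the statistic reaches the critical value
reject_by_critical = chi2_stat >= critical_value
# Rule 2: reject H0 when the p-value is at most alpha
reject_by_p_value = p_value <= alpha

print(reject_by_critical, reject_by_p_value)
```

Since both rules agree, checking either one is sufficient when making the decision.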


In [15]:
# Recompute the chi-square critical value so that this cell does not depend
# on a variable that another cell may have overwritten
critical_value = stats.chi2.ppf(1 - alpha, dof)

# Print results
print("Chi-Square Statistic:", chi2_stat)
print("Chi-Square Critical:", critical_value)
print("P-Value:", p_value)
print("Degrees of Freedom:", dof)
print("Expected Frequencies Table:")
print(expected)
print(" ")
# Make a decision and state the conclusion
# (p_value <= alpha is equivalent to chi2_stat >= critical_value)
if p_value <= alpha:
    print("There is a significant association between the type of device purchased and customer satisfaction level.")
    print(f"The p-value of {p_value} is less than or equal to the significance level of {alpha}, so we reject the null hypothesis.")
else:
    print("There is no significant association between the type of device purchased and customer satisfaction level.")
    print(f"The p-value of {p_value} is greater than the significance level of {alpha}, so we fail to reject the null hypothesis.")


Chi-Square Statistic: 5.638227513227513
Chi-Square Critical: 9.487729036781154
P-Value: 0.22784371130697179
Degrees of Freedom: 4
Expected Frequencies Table:
[[ 48.  72.]
 [ 72. 108.]
 [ 60.  90.]
 [ 32.  48.]
 [ 28.  42.]]
 
There is no significant association between the type of device purchased and customer satisfaction level.
The p-value of 0.22784371130697179 is greater than the significance level of 0.05, so we fail to reject the null hypothesis.


## **Hypothesis Testing**

In [6]:
import numpy as np
import pandas as pd
import scipy.stats as stats

Null Hypothesis (H0): The mean weekly operating cost is equal to the value given by the model (W = $1,000 + $5X).

Alternative Hypothesis (H1): The mean weekly operating cost is higher than the value given by the model.


In [7]:
# Given data
sample_mean = 3050     # Sample mean weekly cost
n = 25                 # Sample size
std_dev_units = 25     # Standard deviation of the number of units (X)
X = 600                # Mean number of units
fixed_cost = 1000      # Fixed cost
var_cost_per_unit = 5  # Variable cost per unit


In [8]:
# Calculate the theoretical mean weekly cost from the model
theoretical_mean = fixed_cost + var_cost_per_unit * X
print("Theoretical Mean:", theoretical_mean)


Theoretical Mean: 4000


In [9]:
# Standard deviation of the weekly cost.
# For W = fixed_cost + var_cost_per_unit * X, SD(W) = |var_cost_per_unit| * SD(X);
# the coefficient scales the standard deviation directly, not by its square root.
std_dev = var_cost_per_unit * std_dev_units
print("Standard Deviation:", std_dev)

Standard Deviation: 125


In [10]:
std_error = std_dev / np.sqrt(n)
print("Standard Error:", std_error)

Standard Error: 25.0


In [11]:
# Test statistic (t)
t = (sample_mean - theoretical_mean) / std_error
print("Test Statistic (t):", t)

Test Statistic (t): -38.0


In [12]:
# Determine the critical value for a right-tailed test
alpha = 0.05   # Significance level
df = n - 1     # Degrees of freedom (df) = 25 - 1 = 24
critical_value = stats.t.ppf(1 - alpha, df)
print("Critical Value:", critical_value)

Critical Value: 1.7108820799094275


In [13]:
# Make a decision
# Compare the test statistic with the critical value
reject_null_hypothesis = t > critical_value
print("Reject Null Hypothesis:", reject_null_hypothesis)

Reject Null Hypothesis: False


In [14]:
# Conclusion and output results
print("Test Statistic (t):", t)
print("Critical Value:", critical_value)
print("Reject Null Hypothesis:", reject_null_hypothesis)
print("Conclusion:")
if reject_null_hypothesis:
    print("There is enough evidence to conclude that the weekly operating costs are higher than the model.")
else:
    print("There is not enough evidence to conclude that the weekly operating costs are higher than the model.")

Test Statistic (t): -38.0
Critical Value: 1.7108820799094275
Reject Null Hypothesis: False
Conclusion:
There is not enough evidence to conclude that the weekly operating costs are higher than the model.
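
The steps above can be collected into one function that also reports a p-value, which the notebook does not compute. This is a sketch only: `right_tailed_t_test` is a hypothetical helper name, and it assumes the standard deviation of the weekly cost follows the scaling rule SD(a + bX) = |b| * SD(X) for the linear cost model.

```python
import numpy as np
from scipy import stats

def right_tailed_t_test(sample_mean, hypothesized_mean, std_dev, n, alpha=0.05):
    """Hypothetical helper: one-sample, right-tailed t-test from summary statistics."""
    std_error = std_dev / np.sqrt(n)
    t_stat = (sample_mean - hypothesized_mean) / std_error
    df = n - 1
    critical_value = stats.t.ppf(1 - alpha, df)
    p_value = stats.t.sf(t_stat, df)  # P(T >= t_stat) under H0
    return t_stat, critical_value, p_value, bool(t_stat > critical_value)

# Cost model W = 1000 + 5X, with X having mean 600 and standard deviation 25
fixed_cost, var_cost_per_unit = 1000, 5
theoretical_mean = fixed_cost + var_cost_per_unit * 600
std_dev = var_cost_per_unit * 25   # SD(a + bX) = |b| * SD(X)

t_stat, critical_value, p_value, reject = right_tailed_t_test(
    3050, theoretical_mean, std_dev, n=25
)
print(t_stat, critical_value, p_value, reject)
```

Because the sample mean lies below the model's mean, the test statistic is negative and the right-tailed p-value is close to 1, which is consistent with failing to reject the null hypothesis.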
