# Chi-Square Test
Association between Device Type and Customer Satisfaction
Background:
Mizzare Corporation has collected data on customer satisfaction levels for two types of smart home devices: Smart Thermostats and Smart Lights. They want to determine if there's a significant association between the type of device purchased and the customer's satisfaction level.
Data Provided:
The data is summarized in a contingency table showing the counts of customers in each satisfaction level for both types of devices:


Objective:
To use the Chi-Square test for independence to determine if there's a significant association between the type of smart home device purchased (Smart Thermostats vs. Smart Lights) and the customer satisfaction level.
Assignment Tasks:
1. State the Hypotheses:
2. Compute the Chi-Square Statistic:
3. Determine the Critical Value:
Use a significance level (alpha) of 0.05 and the degrees of freedom, which for a test of independence equal (number of rows - 1) × (number of columns - 1).
4. Make a Decision:
Compare the Chi-Square statistic with the critical value to decide whether to reject the null hypothesis.
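
Before relying on a library call, the statistic in task 2 can be sketched by hand. The minimal example below assumes the 5×2 table of counts used in this notebook; it builds the expected counts from the row and column totals and sums (O - E)² / E over all cells. The names `expected_manual`, `chi2_manual` and `dof_manual` are illustrative only.

```python
import numpy as np

# Observed counts: 5 satisfaction levels (rows) x 2 device types (columns)
observed = np.array([[50, 70],
                     [80, 100],
                     [60, 90],
                     [30, 50],
                     [20, 50]])

row_totals = observed.sum(axis=1, keepdims=True)  # shape (5, 1)
col_totals = observed.sum(axis=0, keepdims=True)  # shape (1, 2)
grand_total = observed.sum()

# Expected count under independence: row total * column total / grand total
expected_manual = row_totals * col_totals / grand_total

# Chi-square statistic: sum of (O - E)^2 / E over all cells
chi2_manual = ((observed - expected_manual) ** 2 / expected_manual).sum()

# Degrees of freedom for a test of independence: (rows - 1) * (columns - 1)
dof_manual = (observed.shape[0] - 1) * (observed.shape[1] - 1)

print(f"Chi-square (manual): {chi2_manual:.2f}, dof: {dof_manual}")
```

The hand computation should agree with `stats.chi2_contingency`, which makes it a convenient cross-check.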

In [1]:
import numpy as np
import scipy.stats as stats

In [2]:
# Data provided
observed = np.array([[50, 70],
                     [80, 100],
                     [60, 90],
                     [30, 50],
                     [20, 50]])

# 1. State the Hypotheses:
# Null Hypothesis (H0): Device type and customer satisfaction level are independent (no association).
# Alternative Hypothesis (H1): Device type and customer satisfaction level are associated (not independent).

In [3]:
# 2. Compute the Chi-Square Statistic:
chi2, p, dof, expected = stats.chi2_contingency(observed)

In [4]:
# 3. Determine the Critical Value:
alpha = 0.05
critical_value = stats.chi2.ppf(1 - alpha, dof)

# 4. Make a Decision:
# Compare the Chi-Square statistic with the critical value
decision = "reject" if chi2 > critical_value else "fail to reject"

In [5]:
# Report
print("Chi-Square Test for Independence")
print(f"Observed Data:\n{observed}")
print(f"Expected Data (under H0):\n{expected}")
print(f"Chi-Square Statistic: {chi2:.2f}")
print(f"Degrees of Freedom: {dof}")
print(f"Critical Value at alpha={alpha}: {critical_value:.2f}")
print(f"P-value: {p:.4f}")
print(f"Decision: {decision} the null hypothesis.")

if decision == "reject":
    print("Conclusion: There is a significant association between the type of smart home device purchased and customer satisfaction level.")
else:
    print("Conclusion: There is no significant association between the type of smart home device purchased and customer satisfaction level.")

Chi-Square Test for Independence
Observed Data:
[[ 50  70]
 [ 80 100]
 [ 60  90]
 [ 30  50]
 [ 20  50]]
Expected Data (under H0):
[[ 48.  72.]
 [ 72. 108.]
 [ 60.  90.]
 [ 32.  48.]
 [ 28.  42.]]
Chi-Square Statistic: 5.64
Degrees of Freedom: 4
Critical Value at alpha=0.05: 9.49
P-value: 0.2278
Decision: fail to reject the null hypothesis.
Conclusion: There is no significant association between the type of smart home device purchased and customer satisfaction level.
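
The critical-value comparison and the p-value comparison are two views of the same decision rule. As a hedged sketch, assuming the same observed counts, the example below recomputes the p-value from the chi-square survival function and checks that both rules agree; the variable names are chosen here for illustration.

```python
import numpy as np
import scipy.stats as stats

observed = np.array([[50, 70],
                     [80, 100],
                     [60, 90],
                     [30, 50],
                     [20, 50]])

alpha = 0.05
chi2_stat, p_value, dof, _ = stats.chi2_contingency(observed)

# The p-value is the upper-tail probability of the chi-square distribution
p_from_sf = stats.chi2.sf(chi2_stat, dof)

# Rule 1: reject H0 when the statistic exceeds the critical value
critical_value = stats.chi2.ppf(1 - alpha, dof)
reject_by_critical = chi2_stat > critical_value

# Rule 2: reject H0 when the p-value falls below alpha
reject_by_p = p_value < alpha

print(f"p-value: {p_value:.4f}, p from sf: {p_from_sf:.4f}")
print(f"Reject by critical value: {reject_by_critical}, reject by p-value: {reject_by_p}")
```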


# Hypothesis Testing
Background:
Bombay hospitality Ltd. operates a franchise model for producing exotic Norwegian dinners throughout New England. The operating cost for a franchise in a week (W) is given by the equation W = $1,000 + $5X, where X represents the number of units produced in a week. Recent feedback from restaurant owners suggests that this cost model may no longer be accurate, as their observed weekly operating costs are higher.
Objective:
To investigate the restaurant owners' claim about the increase in weekly operating costs using hypothesis testing.
Data Provided:
	•	The theoretical weekly operating cost model: W = $1,000 + $5X
	•	Sample of 25 restaurants with a mean weekly cost of Rs. 3,050
	•	Number of units produced in a week (X) follows a normal distribution with a mean (μ) of 600 units and a standard deviation (σ) of 25 units
Assignment Tasks:
1. State the Hypotheses statement:
2. Calculate the Test Statistic:
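Use the following formula to calculate the test statistic (t):
$$t = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
where:
	•	x̄ = sample mean weekly cost (Rs. 3,050)
	•	μ = theoretical mean weekly cost according to the cost model (W = $1,000 + $5X for X = 600 units)
	•	σ = standard deviation of the weekly cost, 5 × 25 = 125 (the factor of 5 in the cost model scales the standard deviation of X)
	•	n = sample size (25 restaurants)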
3. Determine the Critical Value:
Using the alpha level of 5% (α = 0.05), determine the critical value from the standard normal (Z) distribution table.
4. Make a Decision:
Compare the test statistic with the critical value to decide whether to reject the null hypothesis.
5. Conclusion:
Based on the decision in step 4, conclude whether there is strong evidence to support the restaurant owners' claim that the weekly operating costs are higher than the model suggests.


In [6]:
import numpy as np
import scipy.stats as stats

# 1. State the Hypotheses:
# Null Hypothesis (H0): The mean weekly cost equals the value given by the cost model (mu = 1000 + 5 * 600 = 4000).
# Alternative Hypothesis (H1): The mean weekly cost is higher than the value given by the cost model (mu > 4000).

In [7]:
# Given data
sample_mean = 3050  # Sample mean weekly cost
theoretical_mean_cost_model = 1000 + 5 * 600  # Theoretical mean weekly cost
standard_deviation_units = 25  # Standard deviation of units produced
sample_size = 25  # Sample size

# Calculate the theoretical mean weekly cost
theoretical_mean = theoretical_mean_cost_model
# Calculate the population standard deviation for weekly cost
sigma = 5 * standard_deviation_units
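
The step from a standard deviation of 25 units to a cost standard deviation of 5 × 25 = 125 rests on the linear cost model: adding a constant does not change the spread, and multiplying by 5 scales it by 5. A small simulation can illustrate this; it is a sketch only, and the seed and sample size below are arbitrary assumptions rather than part of the assignment.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # fixed seed for reproducibility; value is arbitrary

# Simulate weekly units produced: X ~ Normal(mean=600, sd=25)
units = rng.normal(loc=600, scale=25, size=1_000_000)

# Apply the cost model W = 1000 + 5X
weekly_cost = 1000 + 5 * units

simulated_mean = weekly_cost.mean()
simulated_sd = weekly_cost.std()

# These should be close to 4000 and 125 respectively
print(f"Simulated mean cost: {simulated_mean:.1f}")
print(f"Simulated sd of cost: {simulated_sd:.1f}")
```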


In [8]:
# 2. Calculate the Test Statistic:
# Test statistic (t) formula: (sample_mean - theoretical_mean) / (sigma / sqrt(n))
test_statistic = (sample_mean - theoretical_mean) / (sigma / np.sqrt(sample_size))


In [9]:

# 3. Determine the Critical Value:
alpha = 0.05
# Since we are conducting a one-tailed test, we use the alpha level directly to get the critical value from the Z-distribution table.
critical_value = stats.norm.ppf(1 - alpha)


In [10]:
# 4. Make a Decision:
# Compare the test statistic with the critical value to decide whether to reject the null hypothesis.
decision = "reject" if test_statistic > critical_value else "fail to reject"


In [11]:
# 5. Conclusion:
conclusion = (
    "There is strong evidence to support the restaurant owners' claim that the weekly operating costs are higher than the model suggests."
    if decision == "reject"
    else "There is not enough evidence to support the restaurant owners' claim that the weekly operating costs are higher than the model suggests."
)


In [12]:
# Report
print("Hypothesis Testing for Weekly Operating Costs")
print(f"Sample Mean: {sample_mean}")
print(f"Theoretical Mean: {theoretical_mean}")
print(f"Population Standard Deviation: {sigma}")
print(f"Sample Size: {sample_size}")
print(f"Test Statistic: {test_statistic:.2f}")
print(f"Critical Value at alpha={alpha}: {critical_value:.2f}")
print(f"Decision: {decision} the null hypothesis.")
print("Conclusion:", conclusion)

Hypothesis Testing for Weekly Operating Costs
Sample Mean: 3050
Theoretical Mean: 4000
Population Standard Deviation: 125
Sample Size: 25
Test Statistic: -38.00
Critical Value at alpha=0.05: 1.64
Decision: fail to reject the null hypothesis.
Conclusion: There is not enough evidence to support the restaurant owners' claim that the weekly operating costs are higher than the model suggests.
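
The same decision can also be expressed through a p-value. As a minimal sketch using the figures from this section, the example below computes the upper-tail probability for the test statistic under the standard normal distribution; with a statistic far below zero, that probability is essentially 1, so the null hypothesis is not rejected in favor of "costs are higher". The variable names are illustrative.

```python
import numpy as np
import scipy.stats as stats

sample_mean = 3050                 # sample mean weekly cost
theoretical_mean = 1000 + 5 * 600  # cost model evaluated at X = 600
sigma = 5 * 25                     # standard deviation of weekly cost
sample_size = 25
alpha = 0.05

# Test statistic: (sample mean - hypothesized mean) / standard error
z = (sample_mean - theoretical_mean) / (sigma / np.sqrt(sample_size))

# Upper-tailed p-value: P(Z >= z) under the standard normal distribution
p_value = stats.norm.sf(z)

decision = "reject" if p_value < alpha else "fail to reject"
print(f"z = {z:.2f}, p-value = {p_value:.4f}, decision: {decision} the null hypothesis")
```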
