# HYPOTHESIS TESTING
# Background:
Bombay Hospitality Ltd. operates a franchise model for producing exotic Norwegian dinners throughout New England. The weekly operating cost of a franchise (W) is given by the equation W = Rs. 1,000 + Rs. 5X, where X represents the number of units produced in a week. Recent feedback from restaurant owners suggests that this cost model may no longer be accurate, as their observed weekly operating costs are higher.
# Objective:
To investigate the restaurant owners' claim about the increase in weekly operating costs using hypothesis testing.
# Data Provided:
* The theoretical weekly operating cost model: W = Rs. 1,000 + Rs. 5X
* Sample of 25 restaurants with a mean weekly cost of Rs. 3,050
* Number of units produced in a week (X) follows a normal distribution with a mean (μ) of 600 units and a standard deviation (σ) of 25 units
# Assignment Tasks:
1. State the Hypotheses statement:
2. Calculate the Test Statistic:
Use the following formula to calculate the test statistic (t):

$$t = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$

where:
* x̄ = sample mean weekly cost (Rs. 3,050)
* μ = theoretical mean weekly cost according to the cost model (W = 1,000 + 5X evaluated at X = 600 units)
* σ = standard deviation of the weekly cost, 5 × 25 = 125
* n = sample size (25 restaurants)
3. Determine the Critical Value:
Using the alpha level of 5% (α = 0.05), determine the critical value from the standard normal (Z) distribution table.
4. Make a Decision:
Compare the test statistic with the critical value to decide whether to reject the null hypothesis.
5. Conclusion:
Based on the decision in step 4, conclude whether there is strong evidence to support the restaurant owners' claim that the weekly operating costs are higher than the model suggests.


# ANSWERS:

# 1. State the Hypotheses statement:
* Null hypothesis (H0): The weekly operating costs are in line with the theoretical model, i.e., the mean weekly cost equals Rs. 1,000 + 5X evaluated at the mean output of X = 600 units, so μ = Rs. 4,000.
* Alternative hypothesis (H1): The weekly operating costs are higher than predicted by the theoretical model, i.e., μ > Rs. 4,000.
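
The figures used in the test follow from the cost model being a linear function of X. The short derivation below is a sketch that assumes the model W = 1,000 + 5X holds exactly and that X has mean 600 and standard deviation 25, as stated in the data; the symbols $\mu_W$ and $\sigma_W$ are introduced here only for illustration.

$$
\begin{aligned}
\mu_W &= E[1000 + 5X] = 1000 + 5\,E[X] = 1000 + 5(600) = 4000 \\
\sigma_W &= \sqrt{\operatorname{Var}(1000 + 5X)} = 5\,\sigma_X = 5(25) = 125
\end{aligned}
$$

The additive constant shifts the mean but does not affect the spread, which is why only the factor 5 appears in the standard deviation.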

# 2. Calculate the Test Statistic:

```python
import math
from scipy.stats import norm

# Sample mean weekly cost
x_bar = 3050

# Theoretical mean weekly cost according to the model
mu = 1000 + 5 * 600
print("mu : ", mu)

# Standard deviation
sigma = 5 * 25
print("sigma : ", sigma)

# Sample size
n = 25

# Calculate the test statistic
t = (x_bar - mu) / (sigma / math.sqrt(n))
print("t : ", t)
```

```
mu :  4000
sigma :  125
t :  -38.0
```


# 3. Determine the Critical Value:
* Since the alternative hypothesis is one-tailed (claiming that the costs are higher), we will use the right-tailed test.
* Using an alpha level of 5% (α = 0.05), the critical value from the standard normal distribution table corresponds to a Z-score of approximately 1.645.

```python
# Determine the critical value (right-tailed test, alpha = 0.05)
alpha = 0.05
critical_value = norm.ppf(1 - alpha)

# Output the results
print("Test Statistic (t):", round(t, 2))
print("Critical Value (Z):", critical_value)
```

```
Test Statistic (t): -38.0
Critical Value (Z): 1.6448536269514722
```


# 4. Make a Decision:

```python
if t > critical_value:
    print("Decision: Reject the null hypothesis")
    print("Conclusion: There is strong evidence to support the restaurant owners' claim.")
else:
    print("Decision: Fail to reject the null hypothesis")
    print("Conclusion: There is not enough evidence to support the restaurant owners' claim.")
```

```
Decision: Fail to reject the null hypothesis
Conclusion: There is not enough evidence to support the restaurant owners' claim.
```
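
As a cross-check on the critical-value decision, the same right-tailed test can be expressed through a p-value. The sketch below recomputes the statistic from the figures given in the data and uses `scipy.stats.norm.sf` for the upper-tail probability; the names `z`, `mu_0`, `p_value`, and `reject` are illustrative and not part of the original notebook.

```python
from math import sqrt
from scipy.stats import norm

# Inputs taken from the problem statement
x_bar = 3050             # sample mean weekly cost
mu_0 = 1000 + 5 * 600    # model mean at X = 600 units
sigma = 5 * 25           # standard deviation of weekly cost
n = 25                   # number of restaurants sampled
alpha = 0.05             # significance level

# Standardized test statistic (sigma is treated as known)
z = (x_bar - mu_0) / (sigma / sqrt(n))

# Right-tailed p-value: P(Z >= z) under the null hypothesis
p_value = norm.sf(z)

# Reject H0 only when the p-value falls below alpha
reject = p_value < alpha

print("z :", z)
print("p-value :", p_value)
print("Reject H0 :", bool(reject))
```

Both approaches lead to the same decision: a p-value above α corresponds to a test statistic below the critical value.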


# 5. Conclusion:

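Because the test statistic (-38.0) is smaller than the critical value (1.645), we fail to reject the null hypothesis. The sample mean of Rs. 3,050 is in fact below the model's predicted mean of Rs. 4,000, so there is not enough evidence to support the restaurant owners' claim that weekly operating costs are higher than the model suggests.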

# CHI-SQUARE TEST

Association between Device Type and Customer Satisfaction
# Background:
Mizzare Corporation has collected data on customer satisfaction levels for two types of smart home devices: Smart Thermostats and Smart Lights. They want to determine if there's a significant association between the type of device purchased and the customer's satisfaction level.
# Data Provided:
The data is summarized in a contingency table showing the counts of customers in each satisfaction level for both types of devices:
![image.png](attachment:image.png)

# Objective:
To use the Chi-Square test for independence to determine if there's a significant association between the type of smart home device purchased (Smart Thermostats vs. Smart Lights) and the customer satisfaction level.
# Assignment Tasks:
1. State the Hypotheses:
2. Compute the Chi-Square Statistic:
3. Determine the Critical Value:
Using the significance level (alpha) of 0.05 and the degrees of freedom, (number of rows - 1) * (number of columns - 1), determine the critical value from the Chi-Square distribution.
4. Make a Decision:
Compare the Chi-Square statistic with the critical value to decide whether to reject the null hypothesis.



# Answers

1. State the Hypotheses:
* Null hypothesis (H0): There is no association between the type of smart home device purchased and the customer's satisfaction level.
* Alternative hypothesis (H1): There is an association between the type of smart home device purchased and the customer's satisfaction level.

2. Compute the Chi-Square Statistic:
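* The formula for the chi-square statistic in a contingency table is:

$$\chi^2 = \sum_{i}\sum_{j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$

where $O_{ij}$ is the observed count in row i and column j, and $E_{ij}$ = (row i total × column j total) / grand total is the count expected under independence.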

3. Determine the Critical Value:
* We will use the significance level (alpha) of 0.05.
* Degrees of freedom = (Number of rows - 1) * (Number of columns - 1)
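
To make these quantities concrete, here is a worked instance using the totals of the observed table in this notebook (first row total 120, first column total 240, grand total 600, with 5 rows and 2 columns). The symbol $E_{11}$, the expected count in the first row and first column, is notation introduced here for illustration.

$$
\begin{aligned}
E_{11} &= \frac{120 \times 240}{600} = 48 \\
\text{df} &= (5 - 1)(2 - 1) = 4
\end{aligned}
$$

The observed count in that cell is 50, so its contribution to the statistic is $(50 - 48)^2 / 48 \approx 0.083$.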

```python
import numpy as np
from scipy.stats import chi2

# Define the observed frequencies
observed = np.array([[50, 70],
                      [80, 100],
                      [60, 90],
                      [30, 50],
                      [20, 50]])

# Calculate row totals
row_totals = np.sum(observed, axis=1)
print("row_totals : ",row_totals)

# Calculate column totals
col_totals = np.sum(observed, axis=0)
print("col_totals : ",col_totals)

# Calculate the grand total
grand_total = np.sum(observed)
print("grand_total : ",grand_total)

# Calculate the expected frequencies
expected = np.outer(row_totals, col_totals) / grand_total
print("expected : ",expected)

# Calculate the Chi-Square statistic
chi_squared = np.sum((observed - expected)**2 / expected)
print("chi_squared : ",chi_squared)

# Determine the degrees of freedom
degrees_of_freedom = (observed.shape[0] - 1) * (observed.shape[1] - 1)
print("degrees_of_freedom : ",degrees_of_freedom)

# Determine the critical value (alpha = 0.05)
alpha = 0.05
critical_value = chi2.ppf(1 - alpha, degrees_of_freedom)
print("critical_value : ",critical_value)

# Print the results
print("Chi-Square Statistic:", chi_squared)
print("Critical Value:", critical_value)

# Make a decision
if chi_squared > critical_value:
    print("Decision: Reject the null hypothesis")
    print("Conclusion: There is a significant association between the type of smart home device purchased and the customer's satisfaction level.")
else:
    print("Decision: Fail to reject the null hypothesis")
    print("Conclusion: There is no significant association between the type of smart home device purchased and the customer's satisfaction level.")
```

```
row_totals :  [120 180 150  80  70]
col_totals :  [240 360]
grand_total :  600
expected :  [[ 48.  72.]
 [ 72. 108.]
 [ 60.  90.]
 [ 32.  48.]
 [ 28.  42.]]
chi_squared :  5.638227513227513
degrees_of_freedom :  4
critical_value :  9.487729036781154
Chi-Square Statistic: 5.638227513227513
Critical Value: 9.487729036781154
Decision: Fail to reject the null hypothesis
Conclusion: There is no significant association between the type of smart home device purchased and the customer's satisfaction level.
```
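
The manual calculation can be cross-checked against SciPy's built-in test of independence. The sketch below is an optional verification, not part of the original solution; it passes the same observed counts to `scipy.stats.chi2_contingency`, which returns the statistic, the p-value, the degrees of freedom, and the expected frequencies. The variable names are illustrative.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Same observed counts as in the manual calculation
# (rows: satisfaction levels, columns: device types)
observed = np.array([[50, 70],
                     [80, 100],
                     [60, 90],
                     [30, 50],
                     [20, 50]])

# Yates' continuity correction only applies when dof == 1,
# so it has no effect on this 5 x 2 table
stat, p_value, dof, expected = chi2_contingency(observed)

alpha = 0.05
print("Chi-Square Statistic:", stat)
print("Degrees of freedom:", dof)
print("p-value:", p_value)
print("Reject H0:", bool(p_value < alpha))
```

A p-value above 0.05 is equivalent to the statistic falling below the critical value, so this check should agree with the decision reached above.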
