# HYPOTHESIS TESTING

**Data Provided:**
•	The theoretical weekly operating cost model: W = $1,000 + $5X
•	Sample of 25 restaurants with a mean weekly cost of Rs. 3,050
•	Number of units produced in a week (X) follows a normal distribution with a mean (μ) of 600 units and a standard deviation (σ) of 25 units
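The hypothesized mean and standard deviation of the weekly cost follow from the fact that W is a linear function of X. The sketch below works through that arithmetic; the variable names (`mu_x`, `sigma_x`, `fixed_cost`, `unit_cost`, `mu_w`, `sigma_w`) are illustrative and do not appear in the original notebook.

```python
# Parameters of X (units produced per week), taken from the data above
mu_x = 600       # mean of X
sigma_x = 25     # standard deviation of X

# Cost model W = fixed_cost + unit_cost * X
fixed_cost = 1000
unit_cost = 5

# A linear transformation shifts and scales the mean,
# and scales the standard deviation by |unit_cost|
mu_w = fixed_cost + unit_cost * mu_x     # expected weekly cost under the model
sigma_w = abs(unit_cost) * sigma_x       # standard deviation of weekly cost

print(f"Expected weekly cost under the model: {mu_w}")
print(f"Standard deviation of weekly cost: {sigma_w}")
```

These two values are the hypothesized mean and the σ used in the test statistic.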


# 1. State the Hypotheses:


Null Hypothesis (H₀):
The current cost formula is correct. The weekly cost to run a franchise is $1,000 plus $5 for each unit produced, so at the mean output of 600 units the mean weekly cost is μ = 1,000 + 5 × 600 = 4,000.

Alternative Hypothesis (H₁):
The current cost formula is not correct. The mean weekly cost differs from 4,000 (μ ≠ 4,000), which makes this a two-tailed test.

# 2. Calculate the Test Statistic:

In [20]:
import scipy.stats as stats
import numpy as np

# Given values
sample_mean = 3050                  # x̄, observed mean weekly cost
theoretical_mean = 1000 + 5 * 600   # μ = 4000, model cost at the mean output of 600 units
std_dev = 5 * 25                    # σ of W = 5 * standard deviation of X = 125
n = 25                              # Sample size
confidence = 0.95                   # Confidence level

# Degrees of freedom
df = n - 1

# Standard error of the sample mean
std_error = std_dev / np.sqrt(n)

# Test statistic: distance of the sample mean from the hypothesized mean, in standard errors
t_stat = (sample_mean - theoretical_mean) / std_error

# Two-sided critical value and margin of error for the confidence interval
t_critical = stats.t.ppf((1 + confidence) / 2, df=df)
margin_of_error = t_critical * std_error

# Lower and upper limits of the 95% confidence interval around the sample mean
print("lower limit",sample_mean-margin_of_error)
print("upper limit",sample_mean+margin_of_error)

# Two-tailed p-value
p_value = (1 - stats.t.cdf(abs(t_stat), df))*2
# Print results
print(f"T-statistic: {t_stat}")
print(f"Degrees of freedom: {df}")
print(f"P-value: {p_value}")

# Interpretation
if p_value < 0.05:
    print("Result: Reject the null hypothesis (significant difference).")
else:
    print("Result: Fail to reject the null hypothesis (no significant difference).")

lower limit 2998.4025359592997
upper limit 3101.5974640407003
T-statistic: -38.0
Degrees of freedom: 24
P-value: 0.0
Result: Reject the null hypothesis (significant difference).


**We reject the null hypothesis.**

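The data provide strong evidence that the actual mean weekly operating cost differs from the 4,000 predicted by the model (1,000 + 5 × 600). The sample mean of 3,050 is far lower, and the 95% confidence interval (2,998.40 to 3,101.60) does not contain 4,000. Therefore, the cost model does not describe current conditions.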
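The same conclusion can be cross-checked through the confidence interval: a two-tailed test at α = 0.05 rejects H₀ exactly when the hypothesized mean falls outside the 95% interval around the sample mean. The following is a minimal sketch of that check, reusing the values from the cell above; `hypothesized_mean`, `ci_lower`, `ci_upper`, and `inside_interval` are illustrative names.

```python
import numpy as np
from scipy import stats

sample_mean = 3050          # observed mean weekly cost
hypothesized_mean = 4000    # 1000 + 5 * 600, the cost predicted by the model
sigma_w = 5 * 25            # standard deviation of weekly cost
n = 25                      # sample size

# 95% confidence interval around the sample mean (t critical value, df = n - 1)
std_error = sigma_w / np.sqrt(n)
t_critical = stats.t.ppf(0.975, df=n - 1)
ci_lower = sample_mean - t_critical * std_error
ci_upper = sample_mean + t_critical * std_error

# H0 is rejected at the 5% level when the hypothesized mean lies outside the interval
inside_interval = ci_lower <= hypothesized_mean <= ci_upper
print(f"95% CI: ({ci_lower:.2f}, {ci_upper:.2f})")
print(f"Hypothesized mean inside interval: {inside_interval}")
```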

# 3. Determine the Critical Value:
Using the alpha level of 5% (α = 0.05), determine the critical value from the standard normal (Z) distribution table.

In [22]:
# Given alpha level
alpha = 0.05

# Calculate the critical value for a two-tailed test
z_critical = stats.norm.ppf(1 - alpha / 2)

print(f"Z-critical value for α = 0.05 (two-tailed): ±{z_critical:.2f}")
# Calculate the Z-score of the sample mean
z_score = (sample_mean - theoretical_mean) / (std_dev / np.sqrt(n))
print("Z-score:", z_score)

Z-critical value for α = 0.05 (two-tailed): ±1.96
Z-score: -38.0


# 4. Make a Decision:

If the calculated Z-value is less than −1.96 or greater than +1.96, we reject the null hypothesis (H₀).

Since Z = −38.0 < −1.96, we reject the null hypothesis.
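The decision rule above can be wrapped in a small helper. This is a sketch only: the function `two_tailed_z_test` is a hypothetical name, not part of the original notebook or of SciPy, and it assumes the population standard deviation is known.

```python
import math
from scipy import stats

def two_tailed_z_test(sample_mean, hypothesized_mean, sigma, n, alpha=0.05):
    """Return (z, z_critical, p_value, reject) for a two-tailed Z-test with known sigma."""
    z = (sample_mean - hypothesized_mean) / (sigma / math.sqrt(n))
    z_critical = stats.norm.ppf(1 - alpha / 2)   # e.g. about 1.96 for alpha = 0.05
    p_value = 2 * stats.norm.sf(abs(z))          # area in both tails beyond |z|
    reject = abs(z) > z_critical                 # critical-value decision rule
    return z, z_critical, p_value, reject

# Values from this analysis: x̄ = 3050, μ = 4000, σ = 125, n = 25
z, z_critical, p_value, reject = two_tailed_z_test(3050, 4000, 125, 25)
print(f"Z = {z:.1f}, critical value = ±{z_critical:.2f}, reject H0: {reject}")
```

Returning both the critical value and the p-value makes it easy to confirm that the two decision criteria agree.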

# CHI-SQUARE TEST

# 1. State the Hypotheses:

Null Hypothesis (H₀):
The type of smart home device a customer buys does not affect how satisfied they are.
(Customer satisfaction is independent of the device type.)

Alternative Hypothesis (H₁):
The type of smart home device a customer buys does affect how satisfied they are.
(Customer satisfaction is related to the device type.)

# 2. Compute the Chi-Square Statistic:

In [35]:
import scipy.stats as stats
import pandas as pd

# Create the observed data (contingency table)
# Each row represents a satisfaction level
# Each column represents a device type: Smart Thermostat or Smart Light

observed_data = [
    [50, 70],   # Very Satisfied
    [80, 100],  # Satisfied
    [60, 90],   # Neutral
    [30, 50],   # Unsatisfied
    [20, 50]    # Very Unsatisfied
]

# Define labels for clarity
satisfaction_levels = ["Very Satisfied", "Satisfied", "Neutral", "Unsatisfied", "Very Unsatisfied"]
device_types = ["Smart Thermostat", "Smart Light"]

# Perform the Chi-Square Test of Independence (a single call returns all four results)
chi2, p_value, dof, expected_data = stats.chi2_contingency(observed_data)

# Convert observed and expected data into DataFrames for better display
data_present = pd.DataFrame(observed_data, index=satisfaction_levels, columns=device_types)
data_expected = pd.DataFrame(expected_data, index=satisfaction_levels, columns=device_types)

# Display observed and expected frequencies
print("\nObserved Frequencies Table :")
print(data_present.round(2))
print("\nExpected Frequencies Table :")
print(data_expected.round(2))

# Print the results
print("=== Chi-Square Test of Independence ===")
print(f"Chi-square statistic      : {chi2:.2f}")
print(f"P-value                   : {p_value:.4f}")

# Decision rule at alpha = 0.05
alpha = 0.05
if p_value < alpha:
    print("\nConclusion: Reject the null hypothesis.")
    print("There IS a significant association between device type and satisfaction level.")
else:
    print("\nConclusion: Fail to reject the null hypothesis.")
    print("There is NO significant association between device type and satisfaction level.")




Observed Frequencies Table :
                  Smart Thermostat  Smart Light
Very Satisfied                  50           70
Satisfied                       80          100
Neutral                         60           90
Unsatisfied                     30           50
Very Unsatisfied                20           50

Expected Frequencies Table :
                  Smart Thermostat  Smart Light
Very Satisfied                48.0         72.0
Satisfied                     72.0        108.0
Neutral                       60.0         90.0
Unsatisfied                   32.0         48.0
Very Unsatisfied              28.0         42.0
=== Chi-Square Test of Independence ===
Chi-square statistic      : 5.64
P-value                   : 0.2278

Conclusion: Fail to reject the null hypothesis.
There is NO significant association between device type and satisfaction level.
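The expected frequencies and the statistic reported above can be reproduced by hand: each expected count is (row total × column total) / grand total, and the statistic sums (observed − expected)² / expected over all cells. The sketch below does this with NumPy; the names `observed`, `expected`, and `chi2_stat` are illustrative.

```python
import numpy as np

# Observed counts: rows are satisfaction levels, columns are device types
observed = np.array([
    [50, 70],    # Very Satisfied
    [80, 100],   # Satisfied
    [60, 90],    # Neutral
    [30, 50],    # Unsatisfied
    [20, 50],    # Very Unsatisfied
])

# Marginal totals
row_totals = observed.sum(axis=1, keepdims=True)   # shape (5, 1)
col_totals = observed.sum(axis=0, keepdims=True)   # shape (1, 2)
grand_total = observed.sum()

# Expected count under independence: row total * column total / grand total
expected = row_totals * col_totals / grand_total

# Chi-square statistic: sum of squared deviations scaled by the expected counts
chi2_stat = ((observed - expected) ** 2 / expected).sum()

print(expected)
print(f"Chi-square statistic: {chi2_stat:.2f}")
```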


# 3. Determine the Critical Value:

In [37]:
from scipy.stats import chi2
alpha = 0.05
df = 4  # (rows - 1) * (columns - 1) = (5 - 1) * (2 - 1)
critical_value = chi2.ppf(1 - alpha, df)
print(f"Critical value at α = {alpha} and df = {df} is: {critical_value:.4f}")

Critical value at α = 0.05 and df = 4 is: 9.4877


The Chi-Square test statistic (5.64) is less than the critical value (9.4877), so we fail to reject the null hypothesis.

# 4. Make a Decision:

Because:

Chi-Square Statistic (5.64) < Critical Value (9.488)

OR

p-value (0.2278) > α (0.05)

We fail to reject the null hypothesis.
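The two criteria always lead to the same decision, because the p-value is the upper-tail area beyond the statistic and the critical value is the point whose upper-tail area equals α. A minimal sketch of that equivalence follows, using the rounded statistic reported above; `n_rows`, `n_cols`, `dof`, `reject_by_critical`, and `reject_by_p` are illustrative names.

```python
from scipy import stats

# Table dimensions: 5 satisfaction levels by 2 device types
n_rows, n_cols = 5, 2
dof = (n_rows - 1) * (n_cols - 1)     # degrees of freedom for a test of independence

alpha = 0.05
chi2_stat = 5.64                      # statistic reported above (rounded)

# Criterion 1: compare the statistic with the critical value
critical_value = stats.chi2.ppf(1 - alpha, dof)
reject_by_critical = chi2_stat > critical_value

# Criterion 2: compare the p-value (upper-tail area) with alpha
p_value = stats.chi2.sf(chi2_stat, dof)
reject_by_p = p_value < alpha

print(f"df = {dof}, critical value = {critical_value:.4f}, p-value = {p_value:.4f}")
print(f"Reject H0 (critical value): {reject_by_critical}")
print(f"Reject H0 (p-value): {reject_by_p}")
```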