##  Chi-Square Test

Association between Device Type and Customer Satisfaction

Background:
Mizzare Corporation has collected data on customer satisfaction levels for two types of smart home devices: Smart Thermostats and Smart Lights. They want to determine if there's a significant association between the type of device purchased and the customer's satisfaction level.

Data Provided:
The data is summarized in a contingency table showing the counts of customers in each satisfaction level for both types of devices:

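| Satisfaction     | Smart Thermostat | Smart Light | Total |
|------------------|------------------|-------------|-------|
| Very Satisfied   | 50               | 70          | 120   |
| Satisfied        | 80               | 100         | 180   |
| Neutral          | 60               | 90          | 150   |
| Unsatisfied      | 30               | 50          | 80    |
| Very Unsatisfied | 20               | 50          | 70    |
| Total            | 240              | 360         | 600   |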

Objective:
To use the Chi-Square test for independence to determine if there's a significant association between the type of smart home device purchased (Smart Thermostats vs. Smart Lights) and the customer satisfaction level.
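
For reference, the test compares each observed count with the count expected if device type and satisfaction were independent. The symbols below ($O_{ij}$ for observed counts, $E_{ij}$ for expected counts, $R_i$ and $C_j$ for row and column totals, $N$ for the grand total, $r$ and $c$ for the number of rows and columns) are notation introduced here for illustration, not part of the original assignment.

$$
E_{ij} = \frac{R_i \, C_j}{N}, \qquad
\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}, \qquad
\text{df} = (r - 1)(c - 1)
$$

For the table above, the expected count for Very Satisfied thermostat buyers and the degrees of freedom work out to:

$$
E_{11} = \frac{120 \times 240}{600} = 48, \qquad \text{df} = (5 - 1)(2 - 1) = 4
$$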


In [None]:
import pandas as pd
from scipy.stats import chi2_contingency

# Define the contingency table
data = {'Smart Thermostat': [50, 80, 60, 30, 20],
        'Smart Light': [70, 100, 90, 50, 50]}
contingency_table = pd.DataFrame(data, index=['Very Satisfied', 'Satisfied', 'Neutral', 'Unsatisfied', 'Very Unsatisfied'])

# Display the contingency table
print("Contingency Table:")
display(contingency_table)

# Perform the Chi-Square test
chi2, p, dof, expected = chi2_contingency(contingency_table)

# Print the results
print("\nChi-Square Test Results:")
print(f"Chi-Square Statistic: {chi2}")
print(f"P-value: {p}")
print(f"Degrees of Freedom: {dof}")
print("Expected Frequencies Table:")
display(pd.DataFrame(expected, columns=contingency_table.columns, index=contingency_table.index))

# Interpret the results
alpha = 0.05
print("\nInterpretation:")
if p < alpha:
    print(f"Since the p-value ({p:.4f}) is less than the significance level ({alpha}), we reject the null hypothesis.")
    print("There is a significant association between the type of smart home device and customer satisfaction.")
else:
    print(f"Since the p-value ({p:.4f}) is greater than the significance level ({alpha}), we fail to reject the null hypothesis.")
    print("There is no significant association between the type of smart home device and customer satisfaction.")

Contingency Table:


                  Smart Thermostat  Smart Light
Very Satisfied                  50           70
Satisfied                       80          100
Neutral                         60           90
Unsatisfied                     30           50
Very Unsatisfied                20           50



Chi-Square Test Results:
Chi-Square Statistic: 5.638227513227513
P-value: 0.22784371130697179
Degrees of Freedom: 4
Expected Frequencies Table:


                  Smart Thermostat  Smart Light
Very Satisfied                48.0         72.0
Satisfied                     72.0        108.0
Neutral                       60.0         90.0
Unsatisfied                   32.0         48.0
Very Unsatisfied              28.0         42.0



Interpretation:
Since the p-value (0.2278) is greater than the significance level (0.05), we fail to reject the null hypothesis.
There is no significant association between the type of smart home device and customer satisfaction.
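
As a cross-check on the library output, the statistic can be recomputed by hand from the observed counts. The sketch below uses only plain Python. The names `observed`, `expected_manual`, `chi2_manual` and `dof_manual` are illustrative and do not appear in the notebook. It assumes no continuity correction, which matches `chi2_contingency` here because that correction is only applied when there is one degree of freedom.

```python
# Observed counts from the contingency table
# (rows: satisfaction levels; columns: Smart Thermostat, Smart Light).
observed = [
    [50, 70],   # Very Satisfied
    [80, 100],  # Satisfied
    [60, 90],   # Neutral
    [30, 50],   # Unsatisfied
    [20, 50],   # Very Unsatisfied
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under independence: row total * column total / grand total.
expected_manual = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-square statistic: sum of (O - E)^2 / E over all cells.
chi2_manual = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected_manual)
    for o, e in zip(obs_row, exp_row)
)

# Degrees of freedom: (rows - 1) * (columns - 1).
dof_manual = (len(observed) - 1) * (len(observed[0]) - 1)

print(f"Expected counts: {expected_manual}")
print(f"Chi-square statistic: {chi2_manual:.4f}")
print(f"Degrees of freedom: {dof_manual}")
```

The largest contributions come from the Very Unsatisfied row, where the observed counts (20 and 50) differ most from the expected counts (28 and 42).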


Assignment Tasks:

In [None]:
from scipy.stats import chi2

# Determine the critical value
critical_value = chi2.ppf(1 - alpha, dof)

print(f"Critical Value: {critical_value}")

Critical Value: 9.487729036781154
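
The critical-value rule and the p-value rule are two views of the same decision: the statistic exceeds the critical value exactly when the p-value falls below alpha. The following sketch illustrates this with the numbers reported above. The statistic is hard-coded from the earlier output so that the block runs on its own, and the `reject_by_*` names are illustrative.

```python
from scipy.stats import chi2

chi2_statistic = 5.638227513227513  # value reported by chi2_contingency above
dof = 4
alpha = 0.05

# Critical-value approach: reject when the statistic exceeds the cutoff.
critical_value = chi2.ppf(1 - alpha, dof)
reject_by_critical_value = chi2_statistic > critical_value

# P-value approach: reject when the upper-tail probability is below alpha.
p_value = chi2.sf(chi2_statistic, dof)
reject_by_p_value = p_value < alpha

print(f"Critical value: {critical_value:.4f}, reject: {reject_by_critical_value}")
print(f"P-value: {p_value:.4f}, reject: {reject_by_p_value}")
```

Both rules give the same answer here: the null hypothesis is not rejected at the 0.05 level.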


In [None]:
import pandas as pd
from scipy.stats import chi2_contingency, chi2

# --- Assignment Tasks ---

# 1. State the Hypotheses:
print("1. Hypotheses:")
print("   Null Hypothesis (H₀): There is no significant association between the type of smart home device and customer satisfaction level.")
print("   Alternative Hypothesis (H₁): There is a significant association between the type of smart home device and customer satisfaction level.")
print("-" * 30)

# Define the contingency table
data = {'Smart Thermostat': [50, 80, 60, 30, 20],
        'Smart Light': [70, 100, 90, 50, 50]}
contingency_table = pd.DataFrame(data, index=['Very Satisfied', 'Satisfied', 'Neutral', 'Unsatisfied', 'Very Unsatisfied'])

print("\nContingency Table:")
print(contingency_table)
print("-" * 30)


# 2. Compute the Chi-Square Statistic:
print("\n2. Chi-Square Test Results:")
chi2_statistic, p_value, dof, expected_frequencies = chi2_contingency(contingency_table)
print(f"   Chi-Square Statistic: {chi2_statistic}")
print(f"   P-value: {p_value}")
print(f"   Degrees of Freedom: {dof}")
print("\n   Expected Frequencies Table:")
print(pd.DataFrame(expected_frequencies, columns=contingency_table.columns, index=contingency_table.index))
print("-" * 30)

# 3. Determine the Critical Value:
alpha = 0.05
critical_value = chi2.ppf(1 - alpha, dof)
print("\n3. Critical Value:")
print(f"   Significance Level (alpha): {alpha}")
print(f"   Degrees of Freedom: {dof}")
print(f"   Critical Value: {critical_value}")
print("-" * 30)

# 4. Compare the Chi-Square Statistic with the Critical Value and 5. Draw a Conclusion:
print("\n4. Comparison and 5. Conclusion:")
print(f"   Calculated Chi-Square Statistic: {chi2_statistic:.4f}")
print(f"   Critical Value: {critical_value:.4f}")

if chi2_statistic > critical_value:
    print("   Decision: Reject the null hypothesis.")
    print("   Conclusion: There is a significant association between the type of smart home device and customer satisfaction.")
else:
    print("   Decision: Fail to reject the null hypothesis.")
    print("   Conclusion: There is no significant association between the type of smart home device and customer satisfaction at the 0.05 significance level.")

print("-" * 30)

1. Hypotheses:
   Null Hypothesis (H₀): There is no significant association between the type of smart home device and customer satisfaction level.
   Alternative Hypothesis (H₁): There is a significant association between the type of smart home device and customer satisfaction level.
------------------------------

Contingency Table:
                  Smart Thermostat  Smart Light
Very Satisfied                  50           70
Satisfied                       80          100
Neutral                         60           90
Unsatisfied                     30           50
Very Unsatisfied                20           50
------------------------------

2. Chi-Square Test Results:
   Chi-Square Statistic: 5.638227513227513
   P-value: 0.22784371130697179
   Degrees of Freedom: 4

   Expected Frequencies Table:
                  Smart Thermostat  Smart Light
Very Satisfied                48.0         72.0
Satisfied                     72.0        108.0
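Neutral                       60.0         90.0
Unsatisfied                   32.0         48.0
Very Unsatisfied              28.0         42.0
------------------------------

3. Critical Value:
   Significance Level (alpha): 0.05
   Degrees of Freedom: 4
   Critical Value: 9.487729036781154
------------------------------

4. Comparison and 5. Conclusion:
   Calculated Chi-Square Statistic: 5.6382
   Critical Value: 9.4877
   Decision: Fail to reject the null hypothesis.
   Conclusion: There is no significant association between the type of smart home device and customer satisfaction at the 0.05 significance level.
------------------------------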

##   Hypothesis Testing

Background:

Bombay Hospitality Ltd. operates a franchise model for producing exotic Norwegian dinners throughout New England. The operating cost for a franchise in a week (W) is given by the equation W = $1,000 + $5X, where X represents the number of units produced in a week. Recent feedback from restaurant owners suggests that this cost model may no longer be accurate, as their observed weekly operating costs are higher.

Objective:

To investigate the restaurant owners' claim about the increase in weekly operating costs using hypothesis testing.

Data Provided:

•	The theoretical weekly operating cost model: W = $1,000 + $5X
•	Sample of 25 restaurants with a mean weekly cost of Rs. 3,050
•	Number of units produced in a week (X) follows a normal distribution with a mean (μ) of 600 units and a standard deviation (σ) of 25 units
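
Because W is a linear function of X, the model implies a mean weekly cost of 1000 + 5 × 600 = 4000 and a standard deviation of 5 × 25 = 125. The simulation below is a rough sanity check of those two figures, not part of the assignment. The seed, the number of draws and all variable names are arbitrary choices made for this sketch.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

MEAN_UNITS = 600   # mean of X, from the problem statement
SD_UNITS = 25      # standard deviation of X, from the problem statement
N_DRAWS = 100_000  # number of simulated weeks (arbitrary)

# Draw weekly unit counts and apply the cost model W = 1000 + 5X.
units = [random.gauss(MEAN_UNITS, SD_UNITS) for _ in range(N_DRAWS)]
costs = [1000 + 5 * x for x in units]

simulated_mean = statistics.fmean(costs)
simulated_sd = statistics.pstdev(costs)

print(f"Simulated mean cost: {simulated_mean:.1f} (model implies 4000)")
print(f"Simulated SD of cost: {simulated_sd:.1f} (model implies 125)")
```

The simulated values should land close to 4000 and 125. Adding the constant 1000 shifts the mean but leaves the spread unchanged, while the factor 5 scales both.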


### 1. State the Hypotheses

*   **Null Hypothesis (H₀):** The true mean weekly operating cost is equal to the cost predicted by the theoretical model (W = $1,000 + $5X).
*   **Alternative Hypothesis (H₁):** The true mean weekly operating cost is greater than the cost predicted by the theoretical model (W > $1,000 + $5X).
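
Written in terms of the mean weekly cost, with $\mu_W$ as notation introduced here for the true mean of W, the model's prediction and the two hypotheses become:

$$
\mu_0 = 1000 + 5\,E[X] = 1000 + 5(600) = 4000
$$

$$
H_0: \mu_W = 4000 \qquad \text{vs.} \qquad H_1: \mu_W > 4000
$$

The alternative is one-sided because the owners claim costs are higher, which makes this a right-tailed test.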

Assignment Tasks:

##  1. State the hypotheses
##  2. Calculate the test statistic

In [None]:
# Data provided
sample_mean_cost = 3050  # x̄ (sample mean)
sample_size = 25         # n
mean_units_produced = 600 # μ for X
std_dev_units_produced = 25 # σ for X

# Theoretical cost model: W = 1000 + 5X
# Calculate theoretical mean weekly cost (μ)
theoretical_mean_cost = 1000 + 5 * mean_units_produced

# Calculate the standard deviation of the cost (σ)
# Since W = 1000 + 5X, the variance of W is Var(W) = Var(1000 + 5X) = 5^2 * Var(X)
# The standard deviation of W is σ_W = sqrt(Var(W)) = sqrt(25 * Var(X)) = 5 * std_dev_units_produced
std_dev_cost = 5 * std_dev_units_produced

# Calculate the standard error of the mean cost (σ_x̄)
standard_error = std_dev_cost / (sample_size**0.5)

# Calculate the test statistic (z); σ is known, so this is a Z-test rather than a t-test
z_statistic = (sample_mean_cost - theoretical_mean_cost) / standard_error

print(f"Theoretical Mean Weekly Cost (μ): {theoretical_mean_cost}")
print(f"Standard Deviation of Weekly Cost (σ_W): {std_dev_cost}")
print(f"Standard Error of the Mean Cost: {standard_error}")
print(f"Test Statistic (Z): {z_statistic}")

Theoretical Mean Weekly Cost (μ): 4000
Standard Deviation of Weekly Cost (σ_W): 125
Standard Error of the Mean Cost: 25.0
Test Statistic (Z): -38.0
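
The same arithmetic written out step by step, using $\bar{x}$ for the sample mean, $\mu_0$ for the mean cost predicted by the model and $\text{SE}$ for the standard error (symbols chosen here for illustration):

$$
\sigma_W = 5\,\sigma_X = 5(25) = 125, \qquad
\text{SE} = \frac{\sigma_W}{\sqrt{n}} = \frac{125}{\sqrt{25}} = 25
$$

$$
\frac{\bar{x} - \mu_0}{\text{SE}} = \frac{3050 - 4000}{25} = -38
$$

The negative sign shows that the sample mean lies below the model's prediction, by 38 standard errors.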


##  3. Determine the critical value

In [None]:
from scipy.stats import norm

# Significance level
alpha = 0.05

# For a one-tailed test (greater than), the critical value is at 1 - alpha
critical_z_value = norm.ppf(1 - alpha)

print(f"Significance Level (alpha): {alpha}")
print(f"Critical Z-Value: {critical_z_value}")

Significance Level (alpha): 0.05
Critical Z-Value: 1.6448536269514722



##  4. Make a Decision

To make a decision, we compare the calculated test statistic to the critical value.

The calculated test statistic is approximately -38.0.
The critical Z-value for a one-tailed test at a 0.05 significance level is approximately 1.645.
Since the test statistic (-38.0) is less than the critical value (1.645), we fail to reject the null hypothesis. The sample mean (3,050) is in fact well below the model's predicted mean cost (4,000), so the sample gives no support to the claim that costs are higher.
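
The same decision can be reached through the p-value of the right-tailed test. The sketch below uses the standard library's `statistics.NormalDist` as a stand-in for `scipy.stats.norm`, purely so that it runs without extra dependencies. The variable names are illustrative.

```python
from statistics import NormalDist

z_statistic = -38.0  # test statistic computed above
alpha = 0.05

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Right-tailed critical value: the point with area alpha to its right.
critical_z = standard_normal.inv_cdf(1 - alpha)

# Right-tailed p-value: probability of a Z at least as large as the one observed.
p_value_right_tail = 1 - standard_normal.cdf(z_statistic)

reject_null = p_value_right_tail < alpha

print(f"Critical Z-value: {critical_z:.4f}")
print(f"Right-tailed p-value: {p_value_right_tail:.4f}")
print(f"Reject H0: {reject_null}")
```

With a statistic this far into the left tail, the right-tailed p-value is essentially 1, so the null hypothesis is not rejected.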



##  5. Conclusion

In [None]:
import numpy as np
from scipy.stats import norm

# --- Hypothesis Testing for Restaurant Operating Costs ---

# 1. State the Hypotheses:
print("1. Hypotheses:")
print("   Null Hypothesis (H₀): The true mean weekly operating cost is equal to the cost predicted by the theoretical model (W = $1,000 + $5X).")
print("   Alternative Hypothesis (H₁): The true mean weekly operating cost is greater than the cost predicted by the theoretical model (W > $1,000 + $5X).")
print("-" * 30)

# Data provided
sample_mean_cost = 3050  # x̄ (sample mean)
sample_size = 25         # n
mean_units_produced = 600 # μ for X
std_dev_units_produced = 25 # σ for X

# Theoretical cost model: W = 1000 + 5X
# Calculate theoretical mean weekly cost (μ)
theoretical_mean_cost = 1000 + 5 * mean_units_produced

# Calculate the standard deviation of the cost (σ_W)
# Since W = 1000 + 5X, the variance of W is Var(W) = Var(1000 + 5X) = 5^2 * Var(X)
# The standard deviation of W is σ_W = sqrt(Var(W)) = sqrt(25 * Var(X)) = 5 * std_dev_units_produced
std_dev_cost = 5 * std_dev_units_produced

# Calculate the standard error of the mean cost (σ_x̄)
standard_error = std_dev_cost / (sample_size**0.5)

# 2. Calculate the Test Statistic:
# Since the population standard deviation of X is known and X is normally distributed,
# the sample mean is normally distributed for any sample size, so a Z-test applies (n=25 does not need to be large).
z_statistic = (sample_mean_cost - theoretical_mean_cost) / standard_error

print("\n2. Test Statistic Calculation:")
print(f"   Theoretical Mean Weekly Cost (μ): {theoretical_mean_cost}")
print(f"   Standard Deviation of Weekly Cost (σ_W): {std_dev_cost}")
print(f"   Standard Error of the Mean Cost: {standard_error}")
print(f"   Calculated Test Statistic (Z): {z_statistic}")
print("-" * 30)

# 3. Determine the Critical Value:
alpha = 0.05
# For a one-tailed test (greater than), the critical value is at 1 - alpha
critical_z_value = norm.ppf(1 - alpha)

print("\n3. Critical Value Determination:")
print(f"   Significance Level (alpha): {alpha}")
print(f"   Critical Z-Value (for a right-tailed test): {critical_z_value}")
print("-" * 30)

# 4. Make a Decision and 5. Draw a Conclusion:
print("\n4. Decision and 5. Conclusion:")
print(f"   Calculated Test Statistic (Z): {z_statistic:.4f}")
print(f"   Critical Z-Value: {critical_z_value:.4f}")

# Compare the test statistic to the critical value
if z_statistic > critical_z_value:
    print("   Decision: Reject the null hypothesis.")
    print("   Conclusion: There is strong evidence to support the restaurant owners' claim that the weekly operating costs are higher than the model suggests at the 0.05 significance level.")
else:
    print("   Decision: Fail to reject the null hypothesis.")
    print("   Conclusion: There is not enough statistical evidence to support the restaurant owners' claim that the weekly operating costs are higher than the model suggests at the 0.05 significance level.")

print("-" * 30)

1. Hypotheses:
   Null Hypothesis (H₀): The true mean weekly operating cost is equal to the cost predicted by the theoretical model (W = $1,000 + $5X).
   Alternative Hypothesis (H₁): The true mean weekly operating cost is greater than the cost predicted by the theoretical model (W > $1,000 + $5X).
------------------------------

2. Test Statistic Calculation:
   Theoretical Mean Weekly Cost (μ): 4000
   Standard Deviation of Weekly Cost (σ_W): 125
   Standard Error of the Mean Cost: 25.0
   Calculated Test Statistic (Z): -38.0
------------------------------

3. Critical Value Determination:
   Significance Level (alpha): 0.05
   Critical Z-Value (for a right-tailed test): 1.6448536269514722
------------------------------

4. Decision and 5. Conclusion:
   Calculated Test Statistic (Z): -38.0000
   Critical Z-Value: 1.6449
   Decision: Fail to reject the null hypothesis.
   Conclusion: There is not enough statistical evidence to support the restaurant owners' claim that the weekly oper