# CHI-SQUARE TEST

Association between Device Type and Customer Satisfaction
Background:
Mizzare Corporation has collected data on customer satisfaction levels for two types of smart home devices: Smart Thermostats and Smart Lights. They want to determine if there's a significant association between the type of device purchased and the customer's satisfaction level.
Data Provided:
The data is summarized in a contingency table showing the counts of customers in each satisfaction level for both types of devices:
| Satisfaction | Smart Thermostat | Smart Light | Total |
|---|---|---|---|
| Very Satisfied | 50 | 70 | 120 |
| Satisfied | 80 | 100 | 180 |
| Neutral | 60 | 90 | 150 |
| Unsatisfied | 30 | 50 | 80 |
| Very Unsatisfied | 20 | 50 | 70 |
| Total | 240 | 360 | 600 |
Objective:
To use the Chi-Square test for independence to determine if there's a significant association between the type of smart home device purchased (Smart Thermostats vs. Smart Lights) and the customer satisfaction level.
Assignment Tasks:
1. State the Hypotheses:
2. Compute the Chi-Square Statistic:
3. Determine the Critical Value:
Use a significance level (alpha) of 0.05 and the degrees of freedom for a contingency table, (number of rows − 1) × (number of columns − 1), which is (5 − 1) × (2 − 1) = 4 here.
4. Make a Decision:
Compare the Chi-Square statistic with the critical value to decide whether to reject the null hypothesis.
Submission Guidelines:
•	Provide a detailed report of your analysis in a Python file, including each step outlined in the assignment tasks.
•	Include all calculations, the Chi-Square statistic, the critical value, and your conclusion.


In [15]:
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency, chi2

In [16]:
# 1. State the Hypotheses:
# H0: There is no significant association between the type of device purchased and the customer's satisfaction level.
# H1: There is a significant association between the type of device purchased and the customer's satisfaction level.

# Data Provided
data = {
    "Satisfaction": ["Very Satisfied", "Satisfied", "Neutral", "Unsatisfied", "Very Unsatisfied"],
    "Smart Thermostat": [50, 80, 60, 30, 20],
    "Smart Light": [70, 100, 90, 50, 50]
}

# Creating the contingency table
df = pd.DataFrame(data)
df.set_index("Satisfaction", inplace=True)
print("Contingency Table:")
print(df)

Contingency Table:
                  Smart Thermostat  Smart Light
Satisfaction                                   
Very Satisfied                  50           70
Satisfied                       80          100
Neutral                         60           90
Unsatisfied                     30           50
Very Unsatisfied                20           50


In [17]:
# 2. Compute the Chi-Square Statistic:
chi2_stat, p, dof, expected = chi2_contingency(df)

print("\nChi-Square Test Results:")
print(f"Chi-Square Statistic: {chi2_stat}")
print(f"p-value: {p}")
print(f"Degrees of Freedom: {dof}")
print("Expected Frequencies:")
print(pd.DataFrame(expected, index=df.index, columns=df.columns))


Chi-Square Test Results:
Chi-Square Statistic: 5.638227513227513
p-value: 0.22784371130697179
Degrees of Freedom: 4
Expected Frequencies:
                  Smart Thermostat  Smart Light
Satisfaction                                   
Very Satisfied                48.0         72.0
Satisfied                     72.0        108.0
Neutral                       60.0         90.0
Unsatisfied                   32.0         48.0
Very Unsatisfied              28.0         42.0
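
To make the result above less of a black box, the following is a minimal sketch that recomputes the expected frequencies and the Pearson chi-square statistic by hand with NumPy. The variable names (`observed`, `expected_manual`, `chi2_manual`, `dof_manual`) are illustrative and not part of the original notebook. It should agree with `chi2_contingency` here because the continuity correction in that function only applies when the degrees of freedom equal 1.

```python
import numpy as np

# Observed counts (rows: satisfaction levels; columns: Smart Thermostat, Smart Light)
observed = np.array([
    [50, 70],    # Very Satisfied
    [80, 100],   # Satisfied
    [60, 90],    # Neutral
    [30, 50],    # Unsatisfied
    [20, 50],    # Very Unsatisfied
])

row_totals = observed.sum(axis=1, keepdims=True)  # shape (5, 1)
col_totals = observed.sum(axis=0, keepdims=True)  # shape (1, 2)
grand_total = observed.sum()

# Expected count under independence: row total * column total / grand total
expected_manual = row_totals * col_totals / grand_total

# Pearson chi-square statistic: sum over all cells of (O - E)^2 / E
chi2_manual = ((observed - expected_manual) ** 2 / expected_manual).sum()

# Degrees of freedom for an r x c table: (r - 1) * (c - 1)
dof_manual = (observed.shape[0] - 1) * (observed.shape[1] - 1)

print(expected_manual)
print(f"Chi-square (manual): {chi2_manual:.6f}, dof: {dof_manual}")
```

The largest contributions come from the "Very Unsatisfied" row, where the observed counts (20 and 50) differ most from the expected counts (28 and 42).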


In [18]:
# 3. Determine the Critical Value:
alpha = 0.05
critical_value = chi2.ppf(1 - alpha, dof)
print(f"\nCritical Value (alpha = {alpha}): {critical_value}")


Critical Value (alpha = 0.05): 9.487729036781154


In [19]:
# 4. Make a Decision:
if chi2_stat > critical_value:
    decision = "Reject the null hypothesis. There is a significant association between the type of device purchased and customer satisfaction."
else:
    decision = "Fail to reject the null hypothesis. There is no significant association between the type of device purchased and customer satisfaction."

print("\nDecision:")
print(decision)



Decision:
Fail to reject the null hypothesis. There is no significant association between the type of device purchased and customer satisfaction.
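
The same decision can be reached from the p-value instead of the critical value. The sketch below assumes the statistic and degrees of freedom printed earlier and checks that both decision rules agree; the names `reject_by_p` and `reject_by_critical` are illustrative.

```python
from scipy.stats import chi2

chi2_stat = 5.638227513227513  # statistic reported by chi2_contingency above
dof = 4
alpha = 0.05

# Upper-tail probability of the chi-square distribution at the observed statistic
p_value = chi2.sf(chi2_stat, dof)

# Critical value that cuts off the upper 5% of the distribution
critical_value = chi2.ppf(1 - alpha, dof)

reject_by_p = p_value < alpha
reject_by_critical = chi2_stat > critical_value

print(f"p-value: {p_value:.4f}")
print(f"Reject H0 (p-value rule): {reject_by_p}")
print(f"Reject H0 (critical-value rule): {reject_by_critical}")
```

Both rules are equivalent by construction: the statistic exceeds the critical value exactly when the p-value falls below alpha.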


#### The detailed report as a summary
Summary Report:

    Association between Device Type and Customer Satisfaction

    Hypotheses:
    - Null Hypothesis (H0): There is no significant association between the type of device purchased and the customer's satisfaction level.
    - Alternative Hypothesis (H1): There is a significant association between the type of device purchased and the customer's satisfaction level.

    Contingency Table:
                      Smart Thermostat  Smart Light
    Satisfaction
    Very Satisfied                  50           70
    Satisfied                       80          100
    Neutral                         60           90
    Unsatisfied                     30           50
    Very Unsatisfied                20           50

    Chi-Square Test Results:
    - Chi-Square Statistic: 5.638227513227513
    - p-value: 0.22784371130697179
    - Degrees of Freedom: 4
    - Expected Frequencies:
                      Smart Thermostat  Smart Light
    Satisfaction
    Very Satisfied                48.0         72.0
    Satisfied                     72.0        108.0
    Neutral                       60.0         90.0
    Unsatisfied                   32.0         48.0
    Very Unsatisfied              28.0         42.0

    Critical Value (alpha = 0.05): 9.487729036781154

    Conclusion:
    Fail to reject the null hypothesis. There is no significant association between the type of device purchased and customer satisfaction.
    

## HYPOTHESIS TESTING

Background:
Bombay Hospitality Ltd. operates a franchise model for producing exotic Norwegian dinners throughout New England. The weekly operating cost for a franchise (W) is given by the equation W = $1,000 + $5X, where X represents the number of units produced in a week. Recent feedback from restaurant owners suggests that this cost model may no longer be accurate, as their observed weekly operating costs are higher.
Objective:
To investigate the restaurant owners' claim about the increase in weekly operating costs using hypothesis testing.
Data Provided:
•	The theoretical weekly operating cost model: W = $1,000 + $5X
•	Sample of 25 restaurants with a mean weekly cost of Rs. 3,050
•	Number of units produced in a week (X) follows a normal distribution with a mean (μ) of 600 units and a standard deviation (σ) of 25 units
Assignment Tasks:
1. State the Hypotheses:
2. Calculate the Test Statistic:
Use the following formula to calculate the test statistic (t):
$$t = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
where:
•	x̄ = sample mean weekly cost (Rs. 3,050)
•	μ = theoretical mean weekly cost according to the cost model (W = $1,000 + $5X for X = 600 units)
•	σ = 5*25 units
•	n = sample size (25 restaurants)
3. Determine the Critical Value:
Using the alpha level of 5% (α = 0.05), determine the critical value from the standard normal (Z) distribution table.
4. Make a Decision:
Compare the test statistic with the critical value to decide whether to reject the null hypothesis.
5. Conclusion:
Based on the decision in step 4, conclude whether there is strong evidence to support the restaurant owners' claim that the weekly operating costs are higher than the model suggests.
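
As a worked check of the quantities defined in task 2, substituting the provided numbers gives the values below. This assumes that all variability in the weekly cost comes from X, so the standard deviation of W is 5 times that of X.

```latex
\mu = 1000 + 5(600) = 4000, \qquad
\sigma = 5 \times 25 = 125, \qquad
\frac{\sigma}{\sqrt{n}} = \frac{125}{\sqrt{25}} = 25

t = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{3050 - 4000}{25} = -38
```

The statistic is negative because the sample mean (3,050) lies below the theoretical mean (4,000), so it cannot exceed a positive upper-tail critical value.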

Submission Guidelines:
•	Prepare a Python file detailing each step of your hypothesis testing process.
•	Include calculations for the test statistic and the critical value.
•	Provide a clear conclusion based on your analysis.



In [2]:
import numpy as np
from scipy.stats import norm

In [3]:

# Data Provided
sample_mean = 3050
X_mean = 600  # Mean number of units produced
X_std = 25    # Standard deviation of units produced
n = 25        # Sample size

# Calculate the theoretical mean weekly cost
mu = 1000 + 5 * X_mean  # Theoretical mean weekly cost
sigma = 5 * X_std       # Standard deviation of weekly cost

# 2. Calculate the Test Statistic:
# Test statistic (t) = (x̄ - μ) / (σ / sqrt(n)); σ is treated as known, so it is compared against the standard normal (Z) distribution
t_statistic = (sample_mean - mu) / (sigma / np.sqrt(n))
print(f"Test Statistic: {t_statistic:.4f}")

# 3. Determine the Critical Value:
# Using a one-tailed test with alpha = 0.05
alpha = 0.05
critical_value = norm.ppf(1 - alpha)
print(f"Critical Value (alpha = {alpha}): {critical_value:.4f}")

# 4. Make a Decision:
if t_statistic > critical_value:
    decision = "Reject the null hypothesis. There is significant evidence to support the claim that the weekly operating costs are higher than the model suggests."
else:
    decision = "Fail to reject the null hypothesis. There is no significant evidence to support the claim that the weekly operating costs are higher than the model suggests."

print("\nDecision:")
print(decision)

# The detailed report as a summary
def hypothesis_test_summary():
    summary = f"""
Hypothesis Testing on Weekly Operating Costs

Hypotheses:
- Null Hypothesis (H0): The actual mean weekly operating cost is equal to the theoretical mean weekly cost (μ = {mu}).
- Alternative Hypothesis (H1): The actual mean weekly operating cost is greater than the theoretical mean weekly cost (μ > {mu}).

Provided Data:
- Sample Mean Weekly Cost (x̄): {sample_mean}
- Theoretical Mean Weekly Cost (μ): {mu}
- Standard Deviation (σ): {sigma}
- Sample Size (n): {n}

Calculations:
- Test Statistic (t): {t_statistic:.4f}
- Critical Value (alpha = {alpha}): {critical_value:.4f}

Conclusion:
{decision}
"""
    return summary

print("\nSummary Report:")
print(hypothesis_test_summary())


Test Statistic: -38.0000
Critical Value (alpha = 0.05): 1.6449

Decision:
Fail to reject the null hypothesis. There is no significant evidence to support the claim that the weekly operating costs are higher than the model suggests.

Summary Report:

Hypothesis Testing on Weekly Operating Costs

Hypotheses:
- Null Hypothesis (H0): The actual mean weekly operating cost is equal to the theoretical mean weekly cost (μ = 4000).
- Alternative Hypothesis (H1): The actual mean weekly operating cost is greater than the theoretical mean weekly cost (μ > 4000).

Provided Data:
- Sample Mean Weekly Cost (x̄): 3050
- Theoretical Mean Weekly Cost (μ): 4000
- Standard Deviation (σ): 125
- Sample Size (n): 25

Calculations:
- Test Statistic (t): -38.0000
- Critical Value (alpha = 0.05): 1.6449

Conclusion:
Fail to reject the null hypothesis. There is no significant evidence to support the claim that the weekly operating costs are higher than the model suggests.
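
As a cross-check on the critical-value decision, here is a minimal sketch of the equivalent p-value calculation for the upper-tailed test. It recomputes the statistic from the provided data; the names `z`, `p_value`, and `reject_h0` are illustrative and not part of the original notebook.

```python
import numpy as np
from scipy.stats import norm

sample_mean = 3050
mu = 1000 + 5 * 600   # theoretical mean weekly cost from the cost model
sigma = 5 * 25        # standard deviation of weekly cost
n = 25
alpha = 0.05

# Standardised distance of the sample mean from the theoretical mean
z = (sample_mean - mu) / (sigma / np.sqrt(n))

# Upper-tail probability P(Z > z), matching H1: mean cost is greater than mu
p_value = norm.sf(z)

reject_h0 = p_value < alpha
print(f"z = {z:.4f}, one-tailed p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

With a statistic this far below zero the upper-tail p-value is essentially 1, which agrees with the decision not to reject the null hypothesis.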

