# Hypothesis Testing

## Hypothesis Testing

### 1. State the Hypotheses:
- Null Hypothesis (H0): The mean weekly operating cost matches the theoretical model, i.e., 𝜇 = $1,000 + $5X.
- Alternative Hypothesis (H1): The mean weekly operating cost is higher than the theoretical model predicts, i.e., 𝜇 > $1,000 + $5X.

### 2. Calculate the Test Statistic:
The theoretical mean weekly cost (\( \mu \)) follows from the provided cost model with X = 600 units:

\( \mu = \$1{,}000 + \$5 \times 600 = \$4{,}000 \)
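The same arithmetic can be written as a small helper. This is a minimal sketch: the function name `theoretical_weekly_cost` and its parameter names are illustrative and not part of the problem statement. The standard deviation of 25 units for X is an assumption read off the data cell, where the cost standard deviation appears as `5 * 25`.

```python
def theoretical_weekly_cost(units, fixed_cost=1000, cost_per_unit=5):
    """Weekly cost under the linear model W = fixed_cost + cost_per_unit * X."""
    return fixed_cost + cost_per_unit * units

# Theoretical mean weekly cost for X = 600 units
mu = theoretical_weekly_cost(600)

# Assumption: X has a standard deviation of 25 units. A linear model scales
# that spread by the per-unit cost; the fixed cost adds no variability.
sigma = 5 * 25

print(mu, sigma)
```

Keeping the model in one function makes it easy to recompute the benchmark if the number of units changes.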


In [33]:
import scipy.stats as st
import numpy as np
import pandas as pd

In [34]:
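# Given data in problem statement
samp_mean = 3050      # sample mean weekly cost
pop_mean = 4000       # theoretical mean from the cost model: 1000 + 5 * 600
pop_std = 5 * 25      # standard deviation implied by the cost model (5 per unit * 25 units)
n = 25                # sample size
alpha = 0.05          # significance level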

In [35]:
# Calculate the t-statistic value
tvalue = (samp_mean - pop_mean) / (pop_std / np.sqrt(n))
print(f"Test Statistic (t-value): {tvalue}")

Test Statistic (t-value): -38.0
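To see where the value of -38.0 comes from, the statistic can be broken into its two parts: the standard error of the mean and the gap between the sample mean and the theoretical mean. The sketch below repeats the given numbers so that it runs on its own; the intermediate variable names are illustrative.

```python
import math

samp_mean, pop_mean, pop_std, n = 3050, 4000, 125, 25

# Standard error of the sample mean: sigma / sqrt(n) = 125 / 5
standard_error = pop_std / math.sqrt(n)

# Gap between what was observed and what the model predicts
difference = samp_mean - pop_mean

# Test statistic: the gap expressed in standard errors
statistic = difference / standard_error

print(standard_error, difference, statistic)
```

The negative sign shows that the sample mean lies 38 standard errors below the theoretical mean, not above it.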


In [36]:
# Calculate the p-value using scipy.stats
# H1 is mu > 4000, so the p-value is the upper-tail probability P(T >= tvalue)
pvalue = st.t.sf(tvalue, df=n-1)
print(f"P-value: {pvalue}")

P-value: 1.0


In [37]:
# Decision based on p-value
if pvalue < alpha:
    print('Reject the null hypothesis: the mean weekly operating cost is higher than the theoretical weekly operating cost.')
else:
    print('Fail to reject the null hypothesis: there is no evidence that the mean weekly operating cost is higher than the theoretical weekly operating cost.')


Fail to reject the null hypothesis: there is no evidence that the mean weekly operating cost is higher than the theoretical weekly operating cost.


In [38]:
# Determine the critical value for alpha = 0.05 (one-tailed, upper-tail test)
# By symmetry of the normal distribution, the upper critical value is -ppf(alpha)
critical_value = -st.norm.ppf(alpha)
print(f"Critical Value: {critical_value}")


Critical Value: 1.6448536269514729


### Results and Conclusion:
1. **Test Statistic Calculation:**
   - Test Statistic (\(t\)) = -38.0
2. **P-value Calculation:**
   - Upper-tail p-value ≈ 1.0. The lower-tail probability, `st.t.cdf(tvalue, df=n-1)`, is about \(2.96 \times 10^{-23}\), but it answers the opposite alternative (𝜇 < $4,000).
3. **Decision:**
   - Since the p-value (≈ 1.0) is greater than the alpha level (0.05), and the test statistic (-38.0) lies far below the upper critical value (1.645), we fail to reject the null hypothesis.
4. **Conclusion:**
   - The sample mean ($3,050) is below the theoretical mean ($4,000), so the data provide no evidence to support the restaurant owners' claim that the weekly operating costs are higher than the model suggests.

This code covers the hypothesis testing as required, with a clear decision-making process based on the calculated p-value and the given level of significance.
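Which tail of the distribution supplies the p-value depends on the direction of the alternative hypothesis. The following is a minimal sketch of that mapping; the helper name `tail_p_value` and its `alternative` labels are hypothetical and chosen only for illustration.

```python
import scipy.stats as st

def tail_p_value(statistic, df, alternative):
    """P-value of a t statistic for the given alternative hypothesis."""
    if alternative == "greater":      # H1: mu > mu0, upper tail
        return st.t.sf(statistic, df)
    if alternative == "less":         # H1: mu < mu0, lower tail
        return st.t.cdf(statistic, df)
    if alternative == "two-sided":    # H1: mu != mu0, both tails
        return 2 * st.t.sf(abs(statistic), df)
    raise ValueError(f"unknown alternative: {alternative!r}")

# Compare the three alternatives for the statistic computed in this section
for alt in ("greater", "less", "two-sided"):
    print(alt, tail_p_value(-38.0, 24, alt))
```

With a statistic of -38.0 and 24 degrees of freedom, the lower tail is vanishingly small while the upper tail is essentially 1. For H1: 𝜇 > $4,000, the upper tail is the relevant one.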

# Chi-Square Test

### Background:
Mizzare Corporation has collected data on customer satisfaction levels for two types of smart home devices: Smart Thermostats and Smart Lights. They want to determine if there's a significant association between the type of device purchased and the customer's satisfaction level.


### Objective: 
To use the Chi-Square test for independence to determine whether there is a significant association between the type of smart home device purchased (Smart Thermostats vs. Smart Lights) and the customer satisfaction level.

### State the Hypotheses

- Null Hypothesis (H0): Device type and satisfaction level are independent, i.e., there is no significant association between the type of smart home device purchased and the customer satisfaction level.
- Alternative Hypothesis (H1): Device type and satisfaction level are dependent, i.e., there is an association between the type of smart home device purchased and the customer satisfaction level.

In [39]:
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency
import scipy.stats as st

In [40]:
# Given data
data = {
    'Satisfaction': ['Very Satisfied', 'Satisfied', 'Neutral', 'Unsatisfied', 'Very Unsatisfied'],
    'Smart Thermostat': [50, 80, 60, 30, 20],
    'Smart Light': [70, 100, 90, 50, 50]
}


In [41]:
data


{'Satisfaction': ['Very Satisfied',
  'Satisfied',
  'Neutral',
  'Unsatisfied',
  'Very Unsatisfied'],
 'Smart Thermostat': [50, 80, 60, 30, 20],
 'Smart Light': [70, 100, 90, 50, 50]}

In [42]:
# Convert the data to a DataFrame
df = pd.DataFrame(data)


In [43]:
df

       Satisfaction  Smart Thermostat  Smart Light
0    Very Satisfied                50           70
1         Satisfied                80          100
2           Neutral                60           90
3       Unsatisfied                30           50
4  Very Unsatisfied                20           50


In [44]:
# Display the data
print("Data:")
print(df)

Data:
       Satisfaction  Smart Thermostat  Smart Light
0    Very Satisfied                50           70
1         Satisfied                80          100
2           Neutral                60           90
3       Unsatisfied                30           50
4  Very Unsatisfied                20           50


In [45]:
# Create the contingency table
contingency_table = df[['Smart Thermostat', 'Smart Light']].values
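Taking `.values` drops the satisfaction labels. As an optional alternative, the labels can be kept by moving `Satisfaction` into the index, which also makes the row and column totals easy to inspect. This is a sketch only; the name `labelled_table` is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Satisfaction': ['Very Satisfied', 'Satisfied', 'Neutral', 'Unsatisfied', 'Very Unsatisfied'],
    'Smart Thermostat': [50, 80, 60, 30, 20],
    'Smart Light': [70, 100, 90, 50, 50],
})

# Keep the satisfaction levels as row labels instead of discarding them
labelled_table = df.set_index('Satisfaction')

row_totals = labelled_table.sum(axis=1)   # customers per satisfaction level
col_totals = labelled_table.sum(axis=0)   # customers per device type

print(labelled_table)
print(row_totals)
print(col_totals)
```

The totals are the marginals from which the expected frequencies of the test are built.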

In [46]:
# Perform the Chi-Square test for independence
chi2_statistic, p_value, degrees_of_freedom, expected_frequencies = chi2_contingency(contingency_table)


In [47]:
# Output the Chi-Square statistic, p-value, degrees of freedom, and expected frequencies
print(f"\nChi-Square Statistic: {chi2_statistic}")
print(f"P-value: {p_value}")
print(f"Degrees of Freedom: {degrees_of_freedom}")
print(f"Expected Frequencies:\n{expected_frequencies}")


Chi-Square Statistic: 5.638227513227513
P-value: 0.22784371130697179
Degrees of Freedom: 4
Expected Frequencies:
[[ 48.  72.]
 [ 72. 108.]
 [ 60.  90.]
 [ 32.  48.]
 [ 28.  42.]]
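The expected frequencies and the statistic reported above can be reproduced by hand, which helps show what `chi2_contingency` is doing. The sketch below assumes no continuity correction, which matches SciPy's default behaviour here because the correction is applied only when there is a single degree of freedom. The variable names are illustrative.

```python
import numpy as np

observed = np.array([[50, 70], [80, 100], [60, 90], [30, 50], [20, 50]])

row_totals = observed.sum(axis=1)
col_totals = observed.sum(axis=0)
grand_total = observed.sum()

# Expected count under independence: row total * column total / grand total
expected = np.outer(row_totals, col_totals) / grand_total

# Pearson chi-square statistic: sum of (O - E)^2 / E over all cells
chi2_by_hand = ((observed - expected) ** 2 / expected).sum()

# Degrees of freedom: (rows - 1) * (columns - 1)
dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)

print(expected)
print(chi2_by_hand, dof)
```

Most of the statistic comes from the "Satisfied" and "Very Unsatisfied" rows, where observed and expected counts differ by 8.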


In [48]:
# Determine the critical value for alpha = 0.05
alpha = 0.05
critical_value = st.chi2.ppf(1 - alpha, degrees_of_freedom)
print(f"Critical Value: {critical_value}")

Critical Value: 9.487729036781154
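The critical-value rule and the p-value rule are two views of the same test and always lead to the same decision. A short sketch, using the statistic and degrees of freedom reported above as hard-coded inputs:

```python
import scipy.stats as st

chi2_statistic = 5.638227513227513
dof = 4
alpha = 0.05

# P-value: upper-tail probability of the chi-square distribution
p_value = st.chi2.sf(chi2_statistic, dof)

# Critical value: the point that leaves alpha in the upper tail
critical_value = st.chi2.ppf(1 - alpha, dof)

reject_by_p = p_value < alpha
reject_by_critical = chi2_statistic > critical_value

print(p_value, critical_value, reject_by_p, reject_by_critical)
```

Both flags come out false, so either rule leads to failing to reject the null hypothesis.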


In [49]:
# Make the decision
if chi2_statistic > critical_value:
    print("Reject the null hypothesis: There is a significant association between the type of smart home device purchased and the customer satisfaction level.")
else:
    print("Fail to reject the null hypothesis: There is no significant association between the type of smart home device purchased and the customer satisfaction level.")

Fail to reject the null hypothesis: There is no significant association between the type of smart home device purchased and the customer satisfaction level.


### Results and Conclusion
1. **Chi-Square Statistic Calculation:**
   - Chi-Square Statistic: 5.638
   - P-value: 0.2278
   - Degrees of Freedom: 4
   - Critical Value: 9.488 (for α = 0.05 and 4 degrees of freedom)
2. **Decision:**
   - Since the Chi-Square statistic (5.638) is less than the critical value (9.488), and equivalently the p-value (0.2278) exceeds α = 0.05, we fail to reject the null hypothesis.
3. **Conclusion:**
   - At the 5% significance level, the data do not show a significant association between the type of smart home device purchased and the customer satisfaction level.