# Hypothesis Testing

### 1. State the Hypotheses:

Null Hypothesis (H0): The mean weekly operating cost matches the theoretical model, i.e., μ = 1000 + 5X.

Alternative Hypothesis (H1): The mean weekly operating cost is higher than the theoretical model predicts, i.e., μ > 1000 + 5X.

### 2. Calculate the Test Statistic:



The theoretical mean weekly cost (μ) can be calculated from the provided cost model with X = 600: μ = 1000 + 5 × 600 = $4,000
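
The substitution above can be checked with a small helper. This is only a minimal sketch: the function name `theoretical_weekly_cost` and its parameter names are illustrative and do not come from the original notebook.

```python
def theoretical_weekly_cost(units, fixed_cost=1000, cost_per_unit=5):
    """Weekly cost predicted by the linear model: fixed_cost + cost_per_unit * units."""
    return fixed_cost + cost_per_unit * units


# X = 600 units is the value substituted in the text above
mu = theoretical_weekly_cost(600)
print(mu)  # 4000
```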

In [18]:
import numpy as np 
import pandas as pd
import scipy.stats as st

In [19]:
# Given data in problem statement
sample_mean = 3050
population_mean = 4000
population_std = 125
n = 25
alpha = 0.05

In [22]:
# Calculate the t-statistic value
t_value = (sample_mean-population_mean) / (population_std / np.sqrt(n))
print(f'test_statistic : {t_value}')

test_statistic : -38.0
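
Written out with the values defined above, the statistic is the standardized distance between the sample mean and the theoretical mean. Here the symbols x̄, μ, σ and n stand for `sample_mean`, `population_mean`, `population_std` and `n`:

```latex
\[
t = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}
  = \frac{3050 - 4000}{125 / \sqrt{25}}
  = \frac{-950}{25}
  = -38
\]
```

The negative sign shows that the sample mean lies below the theoretical mean.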


In [24]:
# Calculate the p-value using scipy.stats
# H1 is right-tailed (mu > 4000), so use the upper-tail probability (survival function)
p_value = st.t.sf(t_value, df=n-1)
print(f'p_value : {p_value}')

p_value : 1.0


In [26]:
# Decision based on p-value: reject H0 only when p_value < alpha
if p_value < alpha:
    print('Reject the null hypothesis, i.e., the mean weekly operating cost is higher than the theoretical weekly operating cost.')
else:
    print('Fail to reject the null hypothesis, i.e., there is no evidence that the mean weekly operating cost is higher than the theoretical weekly operating cost.')


Fail to reject the null hypothesis, i.e., there is no evidence that the mean weekly operating cost is higher than the theoretical weekly operating cost.


In [28]:
# Determine the critical value for alpha = 0.05 (right-tailed test)
critical_value = st.norm.ppf(1 - alpha)
print(f'critical_value : {critical_value:.4f}')

critical_value : 1.6449
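
For a one-sided alternative of the form μ > μ0, both the p-value and the critical value are taken from the upper tail of the reference distribution. The sketch below illustrates this under the assumption of a z-test with a known population standard deviation. The names `p_right`, `z_critical` and `reject_h0` are illustrative and not part of the original notebook.

```python
import numpy as np
import scipy.stats as st

# Values from the problem statement above
sample_mean, population_mean, population_std, n, alpha = 3050, 4000, 125, 25, 0.05

# Standardized test statistic
z = (sample_mean - population_mean) / (population_std / np.sqrt(n))

# Upper-tail p-value: P(Z >= z) under H0
p_right = st.norm.sf(z)

# Upper-tail critical value: H0 is rejected only if z exceeds it
z_critical = st.norm.ppf(1 - alpha)

reject_h0 = z > z_critical
print(f'z = {z}, p_right = {p_right}, z_critical = {z_critical:.4f}, reject H0: {reject_h0}')
```

Both rules agree here: the statistic is far below the upper-tail critical value and the upper-tail p-value is essentially 1, so H0 is not rejected.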


# Results and conclusion

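### Test Statistic Calculation:

Test Statistic (t) = -38.0


### P-value Calculation:


P-value = P(T ≥ -38.0) ≈ 1.0 for the right-tailed alternative (H1: μ > 4000). The left-tail probability P(T ≤ -38.0) ≈ 2.96e-23 would apply only to a left-tailed alternative.


### Decision: 

Since the p-value (≈ 1.0) is greater than the alpha level (0.05), we fail to reject the null hypothesis.


### Conclusion: 

There is no evidence to support the restaurant owners' claim that the weekly operating costs are higher than the model suggests. The sample mean (3,050) is in fact well below the theoretical mean (4,000).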


This code covers the hypothesis testing as required, with a clear decision-making process based on the calculated p-value and the given level of significance.


# Chi-Square Test

### Background:

Mizzare Corporation has collected data on customer satisfaction levels for two types of smart home devices: Smart Thermostats and Smart Lights. They want to determine if there's a significant association between the type of device purchased and the customer's satisfaction level.

### Objective:

To use the Chi-Square test for independence to determine if there's a significant association between the type of smart home device purchased (Smart Thermostats vs. Smart Lights) and the customer satisfaction level.

### State the Hypotheses

Null Hypothesis (H0): Device type and satisfaction level are independent, i.e., there is no significant association between the type of smart home device purchased and the customer satisfaction level.

Alternative Hypothesis (H1): Device type and satisfaction level are dependent, i.e., there is a significant association between the type of smart home device purchased and the customer satisfaction level.

In [1]:
import numpy as np 
import pandas as pd
from scipy.stats import chi2_contingency
import scipy.stats as st

In [2]:
# Given data
data = {
    'satisfaction' : ['very satisfied','satisfied','neutral','unsatisfied','very unsatisfied'],
    'smart_Thermostat' : [50,80,60,30,20],
    'Smart_light' : [70,100,90,50,50]
}

In [3]:
# Convert the data to a DataFrame
data = pd.DataFrame(data)

In [4]:
data.head()

       satisfaction  smart_Thermostat  Smart_light
0    very satisfied                50           70
1         satisfied                80          100
2           neutral                60           90
3       unsatisfied                30           50
4  very unsatisfied                20           50


In [5]:
# Display the data
print(data)

       satisfaction  smart_Thermostat  Smart_light
0    very satisfied                50           70
1         satisfied                80          100
2           neutral                60           90
3       unsatisfied                30           50
4  very unsatisfied                20           50


In [6]:
# Create the contingency table
contingency_table = data[['smart_Thermostat','Smart_light']]

In [9]:
# Perform the Chi-Square test for independence
chi2_statistic,p_value,dof,expected_frequency = chi2_contingency(contingency_table)

In [11]:
# Output the Chi-Square statistic, p-value, degrees of freedom, and expected frequencies
print(f'chi2_statistic : {chi2_statistic}')
print(f'p_value : {p_value}')
print(f'dof : {dof}')
print(f'expected_frequency : {expected_frequency}')

chi2_statistic : 5.638227513227513
p_value : 0.22784371130697179
dof : 4
expected_frequency : [[ 48.  72.]
 [ 72. 108.]
 [ 60.  90.]
 [ 32.  48.]
 [ 28.  42.]]
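
The expected frequencies reported above follow from the independence assumption: each cell's expected count is (row total × column total) / grand total, and the statistic sums (observed − expected)² / expected over all cells. The following is a minimal NumPy sketch that reproduces these numbers by hand. The names `observed`, `expected`, `chi2_manual` and `dof_manual` are illustrative and not part of the original notebook.

```python
import numpy as np

# Observed counts: rows are satisfaction levels,
# columns are (Smart Thermostat, Smart Light)
observed = np.array([[50, 70], [80, 100], [60, 90], [30, 50], [20, 50]])

row_totals = observed.sum(axis=1, keepdims=True)  # shape (5, 1)
col_totals = observed.sum(axis=0, keepdims=True)  # shape (1, 2)
grand_total = observed.sum()

# Expected counts under independence
expected = row_totals * col_totals / grand_total

# Pearson chi-square statistic and degrees of freedom
chi2_manual = ((observed - expected) ** 2 / expected).sum()
dof_manual = (observed.shape[0] - 1) * (observed.shape[1] - 1)

print(expected)
print(f'chi2_manual : {chi2_manual:.6f}, dof : {dof_manual}')
```

Because this table has more than one degree of freedom, `chi2_contingency` applies no continuity correction, so the manual value should match its output.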


In [13]:
# Determine the critical value for alpha = 0.05
alpha =0.05
critical_val = st.chi2.ppf(1-alpha,dof)
print(f' critical val : {critical_val}')

 critical val : 9.487729036781154


In [16]:
# Make the decision
if chi2_statistic > critical_val:
    print('Reject the null hypothesis: There is a significant association between the type of smart home device purchased and the customer satisfaction level.')
else:
    print('Fail to reject the null hypothesis: There is no significant association between the type of smart home device purchased and the customer satisfaction level.')

Fail to reject the null hypothesis: There is no significant association between the type of smart home device purchased and the customer satisfaction level.
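
The same decision can be reached by comparing the p-value with alpha instead of comparing the statistic with the critical value. A short sketch, reusing the statistic and degrees of freedom printed above (the names `p_from_sf`, `reject_by_p` and `reject_by_critical` are illustrative):

```python
import scipy.stats as st

chi2_statistic = 5.638227513227513  # value printed by chi2_contingency above
dof = 4
alpha = 0.05

# Upper-tail probability of the chi-square distribution
p_from_sf = st.chi2.sf(chi2_statistic, dof)
critical_val = st.chi2.ppf(1 - alpha, dof)

# The two decision rules are equivalent
reject_by_p = p_from_sf < alpha
reject_by_critical = chi2_statistic > critical_val

print(f'p_from_sf : {p_from_sf:.4f}')
print(f'reject by p-value : {reject_by_p}, reject by critical value : {reject_by_critical}')
```

Either rule leads to the same outcome because the critical value is, by construction, the point whose upper-tail probability equals alpha.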


### Results and Conclusion:

Chi-Square Statistic Calculation:

Chi-Square Statistic: 5.638

P-value: 0.2278

Degrees of Freedom: 4

Critical Value: 9.488 (for α = 0.05 and 4 degrees of freedom)

Decision: Since the Chi-Square statistic (5.638) is less than the critical value (9.488), we fail to reject the null hypothesis.

Conclusion: There is no significant association between the type of smart home device purchased and the customer satisfaction level.