#Hypothesis Testing

##Background:
Bombay hospitality Ltd. operates a franchise model for producing exotic Norwegian dinners throughout New England. The operating cost for a franchise in a week (W) is given by the equation W = $1,000 + $5X, where X represents the number of units produced in a week. Recent feedback from restaurant owners suggests that this cost model may no longer be accurate, as their observed weekly operating costs are higher.

##Objective:
To investigate the restaurant owners' claim about the increase in weekly operating costs using hypothesis testing.

##Data Provided:
The theoretical weekly operating cost model: W = $1,000 + $5X

Sample of 25 restaurants with a mean weekly cost of Rs. 3,050

Number of units produced in a week (X) follows a normal distribution with a mean (μ) of 600 units and a standard deviation (σ) of 25 units




# Stating Hypothesis Statement

H_0 (Null Hypothesis): mu = 1000 + 5(600) = 4000

H_1 (Alternate Hypothesis): mu > 4000
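The value 4000 in the hypotheses comes from passing the distribution of X through the linear cost model. A minimal sketch of that arithmetic follows; the variable names are illustrative assumptions, not part of the original notebook. It also gives the standard deviation of W used in the next step.

```python
# Moments of W = fixed_cost + unit_cost * X, where X has mean mu_x and SD sigma_x.
# Variable names here are illustrative, not from the original notebook.
fixed_cost = 1000   # intercept of the cost model
unit_cost = 5       # cost per unit produced
mu_x = 600          # mean units produced per week
sigma_x = 25        # standard deviation of units produced per week

mu_w = fixed_cost + unit_cost * mu_x    # E[W] = a + b * E[X]
sigma_w = abs(unit_cost) * sigma_x      # SD[W] = |b| * SD[X]; the intercept adds no spread

print(mu_w, sigma_w)
```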


#Finding T statistic

In [3]:
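# finding the test statistic
import numpy as np
x_bar = 3050  # sample mean weekly cost (given)
mu = 4000     # model mean: W = 1000 + 5*600
s = 5*25      # standard deviation of W = 5 * sigma of X (derived from the model)
n = 25        # sample size
t_stat = (x_bar - mu)/(s/np.sqrt(n))
t_stat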

-38.0

#Determining Critical Value

We need the z critical value for a one-tailed (right-tailed) test at alpha = 0.05, since we are interested in whether the cost is higher than the model predicts.

In [4]:
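# the standard deviation comes from the known distribution of X, so the normal (z) critical value is used
# finding z critical for the right tail at alpha = 0.05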
import scipy.stats as stats
stats.norm.ppf(0.95)

1.6448536269514722

#Making decision

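The test is right-tailed (H_1: mu > 4000), so we reject the null hypothesis only if the test statistic exceeds the critical value of 1.645. Here t_stat = -38 is far below the critical value, so we fail to reject the null hypothesis. Comparing the absolute value of the statistic with the critical value would only be appropriate for a two-tailed test.

#Conclusion

The sample gives no evidence that weekly operating costs are higher than the model predicts. The strongly negative test statistic shows that the sample mean (3,050) is far below the mean given by the model (4,000), which contradicts the owners' statement. The model may still be inaccurate, but in the opposite direction from the one the owners suggest.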
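As a cross-check, here is a minimal sketch of the right-tailed decision rule applied to the figures in this problem. The function name `right_tailed_z_test` is hypothetical, and the sketch assumes the standard deviation of W (125) is treated as known.

```python
import numpy as np
from scipy import stats

def right_tailed_z_test(x_bar, mu0, sigma, n, alpha=0.05):
    """Right-tailed z test of H_0: mu = mu0 against H_1: mu > mu0 (hypothetical helper)."""
    z = (x_bar - mu0) / (sigma / np.sqrt(n))   # standardized sample mean
    z_crit = stats.norm.ppf(1 - alpha)         # upper-tail critical value
    p_value = stats.norm.sf(z)                 # P(Z >= z) under H_0
    reject = bool(z > z_crit)                  # reject only when z lies in the upper tail
    return z, z_crit, p_value, reject

z, z_crit, p_value, reject = right_tailed_z_test(x_bar=3050, mu0=4000, sigma=125, n=25)
print(z, z_crit, p_value, reject)
```

The `reject` flag reports whether the statistic exceeds the critical value. A large negative statistic gives a right-tailed p-value close to 1, so the sign of the statistic matters in a one-tailed test, not only its magnitude.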

#ChiSquare Test
##Background:
Mizzare Corporation has collected data on customer satisfaction levels for two types of smart home devices: Smart Thermostats and Smart Lights. They want to determine if there's a significant association between the type of device purchased and the customer's satisfaction level.
##Data Provided:
The data is summarized in a contingency table showing the counts of customers in each satisfaction level for both types of devices:
<table>
  <tr>
    <th>Satisfaction</th>
    <th>Smart Thermostat</th>
    <th>Smart Light</th>
    <th>Total</th>
  </tr>
  <tr>
    <td>Very Satisfied</td>
    <td>50</td>
    <td>70</td>
    <td>120</td>
  </tr>
  <tr>
    <td>Satisfied</td>
    <td>80</td>
    <td>100</td>
    <td>180</td>
  </tr><tr>
    <td>Neutral</td>
    <td>60</td>
    <td>90</td>
    <td>150</td>
  </tr><tr>
    <td>Unsatisfied</td>
    <td>30</td>
    <td>50</td>
    <td>80</td>
  </tr><tr>
    <td>Very Unsatisfied</td>
    <td>20</td>
    <td>50</td>
    <td>70</td>
  </tr>
</table>

##Objective:
To use the Chi-Square test for independence to determine if there's a significant association between the type of smart home device purchased (Smart Thermostats vs. Smart Lights) and the customer satisfaction level.

#Stating Hypothesis Statement

Null (H_0): There is no association between device type and satisfaction level.

Alternate (H_1): There is an association between device type and satisfaction level.


#Computing chiSquare statistic

In [7]:
# compute results
obs = [[50,70],[80,100],[60,90],[30,50],[20,50]]
chi2, p, dof, expected = stats.chi2_contingency(obs)
print(chi2)
print(p)
print(dof)
print(expected)

5.638227513227513
0.22784371130697179
4
[[ 48.  72.]
 [ 72. 108.]
 [ 60.  90.]
 [ 32.  48.]
 [ 28.  42.]]
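The expected counts and the statistic printed above can be reproduced by hand, which makes clear what `chi2_contingency` computes. The sketch below uses only NumPy; the names ending in `_manual` are illustrative. The table has 4 degrees of freedom, so no continuity correction is involved (SciPy applies Yates' correction only when there is 1 degree of freedom).

```python
import numpy as np

# Observed counts: rows = satisfaction levels, columns = (Smart Thermostat, Smart Light)
obs = np.array([[50, 70], [80, 100], [60, 90], [30, 50], [20, 50]])

row_totals = obs.sum(axis=1, keepdims=True)   # shape (5, 1)
col_totals = obs.sum(axis=0, keepdims=True)   # shape (1, 2)
grand_total = obs.sum()

# Expected count under independence: row total * column total / grand total
expected_manual = row_totals * col_totals / grand_total

# Pearson chi-square statistic: sum over cells of (O - E)^2 / E
chi2_manual = ((obs - expected_manual) ** 2 / expected_manual).sum()

# Degrees of freedom: (rows - 1) * (columns - 1)
dof_manual = (obs.shape[0] - 1) * (obs.shape[1] - 1)

print(expected_manual)
print(chi2_manual, dof_manual)
```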


#Finding critical value

In [8]:
# finding critical value
stats.chi2.ppf(0.95,4)

9.487729036781154

Since chi2_stat (5.638) < chi2_critical (9.488), and equivalently p = 0.228 > 0.05, we fail to reject the null hypothesis. There is no significant evidence of an association between device type and satisfaction level.
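The critical-value comparison and the p-value comparison always lead to the same decision. A short sketch, assuming alpha = 0.05 and reusing the statistic and degrees of freedom reported above (the variable names are illustrative):

```python
from scipy import stats

chi2_stat = 5.638227513227513   # statistic reported by chi2_contingency above
dof = 4                         # (5 - 1) * (2 - 1)
alpha = 0.05                    # significance level

chi2_critical = stats.chi2.ppf(1 - alpha, dof)   # upper-tail critical value
p_value = stats.chi2.sf(chi2_stat, dof)          # P(chi-square >= chi2_stat) under H_0

reject_by_critical = bool(chi2_stat > chi2_critical)
reject_by_p = bool(p_value < alpha)

print(chi2_critical, p_value, reject_by_critical, reject_by_p)
```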