# HYPOTHESIS TESTING

## Bombay Hospitality Ltd. Weekly Operating Cost Analysis

**Background**

- Bombay Hospitality Ltd. operates a franchise model for producing exotic Norwegian dinners throughout New England.
- The operating cost for a franchise in a week (W) is given by the equation `W = $1,000 + $5X`, where X represents the number of units produced in a week.
- Recent feedback from restaurant owners suggests that this cost model may no longer be accurate, as their observed weekly operating costs are higher.

**Objective**

- To investigate the restaurant owners' claim about the increase in weekly operating costs using hypothesis testing.

**Data Provided**

- Theoretical weekly operating cost model: `W = $1,000 + $5X`
- Sample of 25 restaurants with a mean weekly cost of \$3,050
- Number of units produced in a week (X) follows a normal distribution with a mean (μ) of 600 units and a standard deviation (σ) of 25 units

### **1. State the Hypotheses:**

To investigate the claim that the actual weekly operating costs are higher than the theoretical model suggests, we can set up the hypotheses for hypothesis testing.

#### Hypotheses Statements:

**Null Hypothesis (H₀):**
The mean weekly operating cost for the restaurants equals the cost predicted by the theoretical model, i.e., the observed cost does not differ from the expected cost. With a mean production of $\mu_X = 600$ units:

$H_0: \mu_W = 1{,}000 + 5\mu_X = 4{,}000$

**Alternative Hypothesis (H₁):**
The mean weekly operating cost for the restaurants is higher than the cost predicted by the theoretical model at the mean production level $\mu_X$, i.e., the observed cost is greater than the expected cost.

$H_1: \mu_W > 1{,}000 + 5\mu_X = 4{,}000$

The hypothesis test will determine whether there is statistically significant evidence to support the restaurant owners' claim that the weekly operating costs have increased beyond what the theoretical model predicts.

### **2. Test Statistic Calculation:**

In [2]:
# Sample mean (x_bar)

sample_mean = 3050
print('Sample Mean = ',sample_mean)

Sample Mean =  3050


In [3]:
# Calculating the theoretical mean (W = $1,000 + $5X)
X = 600            # Mean number of units per week, given in the problem statement
W = 1000 + 5 * X   # Theoretical mean weekly cost from the model
print("Theoretical mean (W) is: ", W)  # Printing the result

Theoretical mean (W) is:  4000


**Population Standard Deviation ($\sigma_W$):**
Since the number of units produced follows a normal distribution with $\sigma_X = 25$ units, and $W$ is a linear function of $X$ (the constant term adds no variability), the standard deviation of the weekly operating cost is:

$\sigma_W = 5 \times \sigma_X$

In [5]:
# Population Standard Deviation 
pop_std = 5*25
print("Population Standard Deviation: ", pop_std)

Population Standard Deviation:  125
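
As a sanity check on the scaling rule $\sigma_W = 5 \times \sigma_X$, the sketch below simulates weekly production and applies the cost model to each draw. The seed and the number of simulated weeks are arbitrary choices made for this illustration and are not part of the original problem.

```python
import numpy as np

# Assumption: X ~ Normal(600, 25), as stated in the problem.
# The seed and sample count are arbitrary, chosen only for reproducibility.
rng = np.random.default_rng(seed=42)
units = rng.normal(loc=600, scale=25, size=1_000_000)

# Apply the cost model W = 1000 + 5X to every simulated week
weekly_cost = 1000 + 5 * units

sim_mean = weekly_cost.mean()
sim_std = weekly_cost.std()
print("Simulated mean of W:", round(sim_mean, 1))
print("Simulated std of W: ", round(sim_std, 1))
```

The simulated mean and standard deviation should land close to the analytical values of 4000 and 125 used in the test.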


In [6]:
# Given sample size (n)

n = 25
print('Sample size (n) =',n)


Sample size (n) = 25


#### Test Statistic:

The test statistic for a one-sample z-test is calculated as:

$z = \frac{\bar{X} - \mu_W}{\frac{\sigma_W}{\sqrt{n}}}$

In [13]:
# Test statistic (z-test)
import numpy as np

z_statistic = (sample_mean-W)/(pop_std/np.sqrt(n))
print('Test Statistic (Z) =',z_statistic)

Test Statistic (Z) = -38.0
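
To make the arithmetic behind the statistic explicit, the following sketch splits it into the standard error and the z-score. The helper name `one_sample_z` is hypothetical and introduced only for illustration; the inputs are the values given in this notebook.

```python
import math

def one_sample_z(sample_mean, pop_mean, pop_std, n):
    """Return (standard_error, z) for a one-sample z-test."""
    standard_error = pop_std / math.sqrt(n)        # sigma_W / sqrt(n)
    z = (sample_mean - pop_mean) / standard_error  # distance from the mean in SE units
    return standard_error, z

se, z = one_sample_z(sample_mean=3050, pop_mean=4000, pop_std=125, n=25)
print("Standard error:", se)
print("z statistic:", z)
```

The sample mean sits 950 below the theoretical mean, and each standard error is worth 25, which is where the value of -38 comes from.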


### **3. Determine the Critical Value**

In [11]:
# Given Significance level (alpha)

alpha = 0.05
print('Significance Level (alpha) =',alpha)

Significance Level (alpha) = 0.05


- Since the alternative hypothesis is $H_1: \mu_W > 4{,}000$, this is a one-tailed test (right-tailed).

- The critical value $z_{\text{critical}}$ is the z-score that corresponds to the upper tail probability of $\alpha = 0.05$ in a standard normal distribution.

In [12]:
# Z-critical Calculation

from scipy import stats
Z_critical = stats.norm.ppf(1-alpha)
print('Z Critical =',round(Z_critical,3))

Z Critical = 1.645
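
For comparison, this sketch shows how the critical value would change with the direction of the alternative hypothesis. Only the right-tailed value applies to this problem; the left-tailed and two-tailed values are included purely as an illustration.

```python
from scipy import stats

alpha = 0.05

# Right-tailed test (used here): reject H0 if z > right_tail
right_tail = stats.norm.ppf(1 - alpha)

# Left-tailed test: reject H0 if z < left_tail
left_tail = stats.norm.ppf(alpha)

# Two-tailed test: reject H0 if |z| > two_tail
two_tail = stats.norm.ppf(1 - alpha / 2)

print("Right-tailed critical value:", round(right_tail, 3))
print("Left-tailed critical value: ", round(left_tail, 3))
print("Two-tailed critical value:  ", round(two_tail, 3))
```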


### **4. Making Decision**

For this right-tailed test, we reject the null hypothesis only if $z_{\text{statistic}} > z_{\text{critical}}$; otherwise, we fail to reject it.

In [14]:
z_statistic < Z_critical

True

### Interpretation:
- **$z_{\text{statistic}} < z_{\text{critical}}$**:
  - In this case, $z_{\text{statistic}} = -38$ is much less than $z_{\text{critical}} = 1.645$.
  - The rejection region for this right-tailed test is $z > 1.645$. The test statistic lies far in the left tail of the standard normal distribution, well outside the rejection region.

  **Therefore, we fail to reject the null hypothesis.**
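
The same decision can be reached through a p-value instead of a critical value. The sketch below assumes the right-tailed alternative stated earlier and uses `stats.norm.sf`, the survival function of the standard normal distribution, to obtain the upper-tail probability.

```python
from scipy import stats

z_statistic = -38.0   # test statistic computed in this notebook
alpha = 0.05          # significance level

# Right-tailed p-value: probability of a z-score at least this large under H0
p_value = stats.norm.sf(z_statistic)

reject_null = p_value < alpha
print("p-value:", p_value)
print("Reject H0:", reject_null)
```

Because the statistic lies so far below zero, the upper-tail probability is essentially 1, which agrees with the critical-value comparison.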

### **5. Conclusion**

- **Fail to Reject the Null Hypothesis**: Because $z_{\text{statistic}}$ is less than $z_{\text{critical}}$, we do **not** have sufficient evidence to reject the null hypothesis $H_0$.
- The sample mean of 3,050 is in fact below the 4,000 predicted by the model, so there is no statistical evidence to support the claim that the actual mean weekly operating costs are higher than the theoretical model predicts.
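
To put the conclusion in cost terms, this sketch works out how large the sample mean would have needed to be for the test to reject $H_0$, under the same assumptions as above (theoretical mean 4000, $\sigma_W = 125$, $n = 25$, $\alpha = 0.05$). The variable name `threshold_mean` is introduced here only for illustration.

```python
import math
from scipy import stats

pop_mean = 4000   # theoretical mean weekly cost
pop_std = 125     # sigma_W = 5 * sigma_X
n = 25            # sample size
alpha = 0.05      # significance level

standard_error = pop_std / math.sqrt(n)
z_critical = stats.norm.ppf(1 - alpha)

# Smallest sample mean that would fall in the right-tailed rejection region
threshold_mean = pop_mean + z_critical * standard_error
print("Sample mean needed to reject H0:", round(threshold_mean, 2))
```

The observed sample mean of 3,050 falls well short of this threshold.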

# CHI-SQUARE TEST

## Chi-Square Test for Association Between Device Type and Customer Satisfaction

**Background**

Mizzare Corporation has collected data on customer satisfaction levels for two types of smart home devices: Smart Thermostats and Smart Lights. They want to determine if there's a significant association between the type of device purchased and the customer's satisfaction level.

**Data Provided**

The data is summarized in a contingency table showing the counts of customers in each satisfaction level for both types of devices:

| Satisfaction       | Smart Thermostat | Smart Light | Total |
|--------------------|------------------|-------------|-------|
| Very Satisfied      | 50               | 70          | 120   |
| Satisfied           | 80               | 100         | 180   |
| Neutral             | 60               | 90          | 150   |
| Unsatisfied         | 30               | 50          | 80    |
| Very Unsatisfied    | 20               | 50          | 70    |
| **Total**           | **240**          | **360**     | **600**|

**Objective**

To use the Chi-Square test for independence to determine if there's a significant association between the type of smart home device purchased (Smart Thermostats vs. Smart Lights) and the customer satisfaction level.


### 1. **Formulate Hypotheses**
   - **Null Hypothesis $(H_0)$**: There is no association between the type of device purchased and customer satisfaction level. In other words, the type of device and customer satisfaction are independent.
   
   - **Alternative Hypothesis $(H_1)$**: There is an association between the type of device purchased and customer satisfaction level. The type of device and customer satisfaction are not independent.

### 2. **Chi-Square Statistic Calculation**

#### Importing libraries and creating numpy array of observed frequencies

In [20]:
import scipy.stats as stats
import numpy as np

# Observed frequencies
observed = np.array([
    [50, 70],   # Very Satisfied
    [80, 100],  # Satisfied
    [60, 90],   # Neutral
    [30, 50],   # Unsatisfied
    [20, 50]    # Very Unsatisfied
])
print('Observed Frequencies:')
print(observed)


Observed Frequencies:
[[ 50  70]
 [ 80 100]
 [ 60  90]
 [ 30  50]
 [ 20  50]]


**Expected Frequencies**

   For each cell in the table, calculate the expected frequency using the formula:

   $E_{ij} = \frac{\text{Row Total}_i \times \text{Column Total}_j}{\text{Grand Total}}$
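
The formula can be applied to every cell at once with an outer product of the row and column totals. This is a minimal sketch using the observed counts from the table; `stats.chi2_contingency` returns the same matrix, so computing it by hand is only a way to see where the numbers come from.

```python
import numpy as np

# Observed counts: rows are satisfaction levels, columns are device types
observed = np.array([
    [50, 70],    # Very Satisfied
    [80, 100],   # Satisfied
    [60, 90],    # Neutral
    [30, 50],    # Unsatisfied
    [20, 50],    # Very Unsatisfied
])

row_totals = observed.sum(axis=1)   # one total per satisfaction level
col_totals = observed.sum(axis=0)   # one total per device type
grand_total = observed.sum()

# E_ij = (row total i * column total j) / grand total
expected = np.outer(row_totals, col_totals) / grand_total
print(expected)
```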

**Chi-Square Statistic Calculation**

   The Chi-Square statistic is calculated as:

   $\chi^2 = \sum \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$

   Where $O_{ij}$ is the observed frequency and $E_{ij}$ is the expected frequency.
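
A direct translation of this sum into NumPy might look like the sketch below. It recomputes the expected frequencies from the observed table so that it runs on its own; the name `chi2_manual` is introduced here for illustration.

```python
import numpy as np

observed = np.array([
    [50, 70],    # Very Satisfied
    [80, 100],   # Satisfied
    [60, 90],    # Neutral
    [30, 50],    # Unsatisfied
    [20, 50],    # Very Unsatisfied
])

# Expected counts under independence
expected = np.outer(observed.sum(axis=1), observed.sum(axis=0)) / observed.sum()

# Sum of (O - E)^2 / E over all cells
chi2_manual = ((observed - expected) ** 2 / expected).sum()
print("Chi-Square Statistic:", round(chi2_manual, 3))
```

Most of the total comes from the "Very Unsatisfied" row, where the observed counts differ from the expected counts by 8 in each column.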

**Degrees of Freedom**

   The degrees of freedom for this test are calculated as:

   $df = (r-1) \times (c-1)$

   Where $r$ is the number of rows and $c$ is the number of columns in the contingency table. For this table, $r = 5$ and $c = 2$, so $df = 4$.

**P-Value and Conclusion**

   Using the Chi-Square statistic and the degrees of freedom, determine the p-value from the Chi-Square distribution. Compare the p-value to the significance level (commonly $\alpha = 0.05$) to decide whether to reject the null hypothesis.
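
The p-value is the upper-tail area of the Chi-Square distribution beyond the statistic. The sketch below assumes the rounded statistic of 5.638 for this table and uses `stats.chi2.sf` to obtain that area.

```python
from scipy import stats

chi2_value = 5.638         # Chi-Square statistic for this table (rounded)
n_rows, n_cols = 5, 2      # satisfaction levels x device types
significance_level = 0.05

dof = (n_rows - 1) * (n_cols - 1)

# Upper-tail probability of the Chi-Square distribution with `dof` degrees of freedom
p_value = stats.chi2.sf(chi2_value, dof)

print("Degrees of Freedom:", dof)
print("P-Value:", round(p_value, 3))
print("Reject H0:", p_value < significance_level)
```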

**Let's perform the calculations now.**

In [30]:
# Perform Chi-Square test for independence
chi2_statistics, p_value, dof, expected = stats.chi2_contingency(observed)

# Output the results
print(f"Chi-Square Statistic: {round(chi2_statistics,3)}")
print(f"Degrees of Freedom: {dof}")
print(f"P-Value: {round(p_value,3)}")
print("Expected Frequencies:")
print(expected)

Chi-Square Statistic: 5.638
Degrees of Freedom: 4
P-Value: 0.228
Expected Frequencies:
[[ 48.  72.]
 [ 72. 108.]
 [ 60.  90.]
 [ 32.  48.]
 [ 28.  42.]]


### 3. **Determining the Critical Value:**

Using the calculated degrees of freedom and a chosen significance level (commonly 0.05), we can find the critical Chi-Square value with `chi2.ppf` from the `scipy.stats` module.

In [25]:
from scipy.stats import chi2

# Calculating chi_critical
significance_level = 0.05
critical_chi_square = chi2.ppf(1 - significance_level, dof)

# Print the critical Chi-Square value
print("Critical Chi-Square Value:", round(critical_chi_square,3))

Critical Chi-Square Value: 9.488


### 4. **Making Decision:**

If the calculated Chi-Square statistic is greater than the critical value, or if the p-value is less than the chosen significance level, we reject the null hypothesis. Otherwise, we fail to reject the null hypothesis.

In [28]:
chi2_statistics < critical_chi_square

True

In [31]:
p_value > significance_level

True

### **Conclusion**

- The chi-square statistic (5.638) is less than the critical chi-square value (9.488), and the p-value (0.228) is greater than the significance level (0.05).

- This indicates that we **fail to reject the null hypothesis $(H_0)$**.

- There is no statistically significant association between the type of smart home device purchased (Smart Thermostats vs. Smart Lights) and the customer satisfaction level. In other words, the data provided by Mizzare Corporation do not show that customer satisfaction depends on the type of device purchased.

####  **Author Information:**
- **Author:-**  Er.Pradeep Kumar
- **LinkedIn:-**  [https://www.linkedin.com/in/pradeep-kumar-1722b6123/](https://www.linkedin.com/in/pradeep-kumar-1722b6123/)

#### **Disclaimer:**
This Jupyter Notebook and its contents are shared for educational purposes. The author, Pradeep Kumar, retains ownership and rights to the original content. Any modifications or adaptations should be made with proper attribution and permission from the author.