# Lab | Inferential statistics

## Instructions

1. It is assumed that the mean systolic blood pressure is `μ = 120 mm Hg`. In the Honolulu Heart Study, a sample of `n = 100` people had an average systolic blood pressure of `130.1 mm Hg` with a standard deviation of `21.21 mm Hg`. Is the group significantly different (with respect to systolic blood pressure!) from the regular population?

   - Set up the hypothesis test.
   - Write down all the steps followed for setting up the test.
   - Calculate the test statistic by hand and also code it in Python. It should be 4.76190. We will take a look at how to make decisions based on this calculated value.

In [1]:
# Libraries
import pandas as pd # manipulate dataframes
import numpy as np # numerical python
import math # standard-library math functions
import matplotlib.pyplot as plt # viz

# New libraries
import scipy.stats as stats 
import statsmodels.api as sm
import statsmodels.formula.api as smf

### Null and Alternative Hypothesis

In [3]:
# Hypotheses
null_hypothesis = "The mean systolic blood pressure is 120 mm Hg."  # H₀
alt_hypothesis = "The mean systolic blood pressure of the Honolulu Heart Study is not 120 mm Hg."  # H₁

print(f"Null Hypothesis (H₀): {null_hypothesis}")
print(f"Alternate Hypothesis (H₁): {alt_hypothesis}")

Null Hypothesis (H₀): The mean systolic blood pressure is 120 mm Hg.
Alternate Hypothesis (H₁): The mean systolic blood pressure of the Honolulu Heart Study is not 120 mm Hg.
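
Stated formally, with $\mu$ denoting the mean systolic blood pressure of the population the sample was drawn from, the two statements above describe a two-sided test:
$$ H_0: \mu = 120 \quad \text{vs.} \quad H_1: \mu \neq 120 $$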


### Level of Significance

In [4]:
# Significance level
alpha = 0.05
print(f"Level of Significance (α): {alpha}")

Level of Significance (α): 0.05


### Calculate Test Statistic
The formula for the z-statistic is:
$$ z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} $$
where:
- $\bar{x} $ is the sample mean
- $ \mu $ is the population mean
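- $ \sigma $ is the population standard deviation (here it is not known, so the sample standard deviation is used as an estimate, which is reasonable for a sample as large as $n = 100$)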
- $ n $ is the sample size
- $ \sqrt{n} $ is the square root of the sample size

In this example, we have:
- Sample mean $\bar{x}$ = 130.1 mm Hg
- Hypothesized population mean $\mu$ = 120 mm Hg
- Standard deviation $\sigma$ = 21.21 mm Hg (the sample standard deviation, used as an estimate of the population value)
- Sample size $n$ = 100

So, the z-statistic is calculated as:
$$ z = \frac{130.1 - 120}{ 21.21 / \sqrt{100}} $$
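
Worked by hand, the standard error is computed first, and the difference between the sample mean and the hypothesized mean is then divided by it:
$$ \frac{21.21}{\sqrt{100}} = 2.121, \qquad z = \frac{130.1 - 120}{2.121} = \frac{10.1}{2.121} \approx 4.7619 $$
This agrees with the expected value of 4.76190 given in the instructions.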

In [5]:
sample_mean = 130.1
pop_mean = 120
pop_std = 21.21
n = 100

In [6]:
# Calculate the test statistic (z-score)
z_stat = (sample_mean - pop_mean) / (pop_std / (n**0.5))
print(f"Test Statistic (z): {z_stat:.2f}")

Test Statistic (z): 4.76


### Calculate the P-value

In [9]:
# H₁ is two-sided (mean ≠ 120), so the p-value counts both tails.
# sf is the survival function, 1 - cdf, evaluated at |z|.
p_value = 2 * stats.norm.sf(abs(z_stat))  # two-tailed test
print(f"P-Value: {p_value:.4f}")

if p_value < alpha:
    print("Reject the Null Hypothesis: Significant result.")
else:
    print("Fail to Reject the Null Hypothesis: Not a significant result.")


P-Value: 0.0000
Reject the Null Hypothesis: Significant result.
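
The choice of tail matters a great deal for a statistic this far from zero. The following is a minimal sketch, not part of the original lab, that compares the lower-tailed, upper-tailed, and two-tailed p-values for the same z-statistic. It uses only the standard library's `statistics.NormalDist` so that it can serve as an independent check of the SciPy result. The variable names (`std_normal`, `p_lower`, `p_upper`, `p_two_sided`) are illustrative assumptions.

```python
from statistics import NormalDist

# Values taken from the lab statement
sample_mean, hypothesized_mean, sample_std, n = 130.1, 120, 21.21, 100
z = (sample_mean - hypothesized_mean) / (sample_std / n ** 0.5)

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

p_lower = std_normal.cdf(z)                # P(Z <= z)
p_upper = std_normal.cdf(-z)               # P(Z >= z), by symmetry
p_two_sided = 2 * std_normal.cdf(-abs(z))  # P(|Z| >= |z|)

print(f"lower-tailed: {p_lower:.6f}")
print(f"upper-tailed: {p_upper:.2e}")
print(f"two-tailed:   {p_two_sided:.2e}")
```

Only the two-tailed value corresponds to the alternative hypothesis "the mean is not 120 mm Hg". A lower-tailed p-value close to 1 would only say that the sample mean is not significantly *below* 120.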


### Critical Value

In [10]:
# Critical value for a two-tailed test: alpha is split across both tails
critical_value = stats.norm.ppf(1 - alpha / 2)
print(f"Critical Value: ±{critical_value:.2f}")

Critical Value: ±1.96
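
For a two-sided alternative, the critical-value decision compares the absolute value of the z-statistic with the upper α/2 quantile of the standard normal distribution. A minimal sketch of that comparison follows, again using only the standard library. The names `z_crit` and `reject_h0` are illustrative and do not appear in the original lab.

```python
from statistics import NormalDist

alpha = 0.05
# Values taken from the lab statement
sample_mean, hypothesized_mean, sample_std, n = 130.1, 120, 21.21, 100
z = (sample_mean - hypothesized_mean) / (sample_std / n ** 0.5)

# Upper alpha/2 quantile of the standard normal distribution
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Two-sided rule: reject H0 when |z| falls beyond the critical value
reject_h0 = abs(z) > z_crit

print(f"|z| = {abs(z):.2f}, critical value = ±{z_crit:.2f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

The critical-value rule and the two-tailed p-value rule always lead to the same decision at a given α. They are two views of the same comparison.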
