# Lab | Inferential statistics

## Instructions
1. It is assumed that the mean systolic blood pressure is μ = 120 mm Hg. In the Honolulu Heart Study, a sample of n = 100 people had an average systolic blood pressure of 130.1 mm Hg with a standard deviation of 21.21 mm Hg. Is the group significantly different (with respect to systolic blood pressure!) from the regular population?

2. Set up the hypothesis test.
   
3. Write down all the steps followed for setting up the test.
   
4. Calculate the test statistic by hand and also code it in Python. It should be 4.76190. We will take a look at how to make decisions based on this calculated value.
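
As a sketch of the by-hand calculation asked for in step 4, assuming the usual one-sample t statistic in which the sample standard deviation stands in for the unknown population value (the symbols $\bar{x}$, $\mu_0$ and $s$ are notation introduced here for the sample mean, the hypothesized mean and the sample standard deviation):

$$
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{130.1 - 120}{21.21 / \sqrt{100}} = \frac{10.1}{2.121} \approx 4.7619
$$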


### Import Libraries

In [10]:
import math
import numpy as np
from scipy import stats

### Given

In [3]:
pop_mean = 120
n = 100
samp_mean = 130.1
samp_std = 21.21

### Hypotheses

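- H0 : μ <= 120
- Ha : μ > 120

Here μ is the mean systolic blood pressure of the population the sample was drawn from; hypotheses are statements about this population mean, not about the observed sample mean. This is a one-sided (right-tailed) test.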

### Test Statistics

In [4]:
# Manually
statistic = (samp_mean - pop_mean)/(samp_std/math.sqrt(n))
statistic

4.761904761904759

In [5]:
# One-tailed p-value equivalent of this statistic (t distribution, n-1 degrees of freedom)
p_value = stats.t.sf(abs(statistic), n-1) 
p_value

3.2813509086043083e-06

> The p-value is less than 0.05, so we reject the null hypothesis: the group's mean systolic blood pressure is significantly higher than that of the regular population.
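
The question in the instructions asks whether the group is *different*, which is often read as a two-sided alternative. The following is a minimal sketch of that variant, assuming the same summary statistics as above; the names `t_stat`, `p_one_sided` and `p_two_sided` are introduced here for illustration only.

```python
import math
from scipy import stats

# Summary statistics given in the lab
pop_mean, n, samp_mean, samp_std = 120, 100, 130.1, 21.21

# Same test statistic as computed manually above
t_stat = (samp_mean - pop_mean) / (samp_std / math.sqrt(n))

# One-tailed p-value: area in the upper tail beyond |t|
p_one_sided = stats.t.sf(abs(t_stat), n - 1)

# Two-tailed p-value: the t distribution is symmetric, so double the tail area
p_two_sided = 2 * p_one_sided

print(p_one_sided, p_two_sided)
```

Under either reading the p-value stays far below 0.05, so the conclusion does not change.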

### Critical Value for alpha = 0.05 and dof 99

In [6]:
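# Right-tailed critical value for alpha = 0.05 with n-1 degrees of freedom
# (stored under its own name so the test statistic is not overwritten)
critical_value = stats.t.ppf(1-0.05, n-1)
critical_value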

1.6603911559963895

> For alpha = 0.05 and 99 degrees of freedom, the one-tailed critical value is 1.66. The test statistic (4.76) is well above it, so we reject the null hypothesis.
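
The critical-value decision can also be written out explicitly. This is a hedged sketch rather than part of the original lab: it compares the statistic against the one-tailed critical value used above and, as an assumed extra, against the two-tailed one. The names `crit_one`, `crit_two`, `reject_one` and `reject_two` are hypothetical.

```python
from scipy import stats

alpha = 0.05
dof = 99                      # n - 1 with n = 100
t_stat = 4.761904761904759    # test statistic computed earlier in the lab

# One-tailed (right) critical value: all of alpha sits in the upper tail
crit_one = stats.t.ppf(1 - alpha, dof)

# Two-tailed critical value (assumed variant): alpha is split across both tails
crit_two = stats.t.ppf(1 - alpha / 2, dof)

# Reject H0 when the statistic falls beyond the critical value
reject_one = t_stat > crit_one
reject_two = abs(t_stat) > crit_two

print(crit_one, crit_two, reject_one, reject_two)
```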

**End of lab**

Below, the same test is tried on a randomly generated sample, drawn from a normal distribution with the sample mean and standard deviation given above.

In [43]:
# generate samples
rvs = stats.norm.rvs(loc=samp_mean, scale=samp_std, size=n, random_state= 7)
rvs

array([165.95605018, 120.21746837, 130.79611567, 138.74342036,
       113.36694256, 130.1438108 , 130.08111492,  92.88229746,
       151.6845263 , 142.83657352, 116.83465146, 126.46146138,
       140.81739973, 124.55663043, 124.95129204,  99.27674964,
       141.86264842, 132.727514  , 135.92129498,  97.72241468,
       165.11134045, 133.37345671, 121.8887618 , 173.1366218 ,
       129.13736231,  99.33110479, 121.50511719,  81.56483669,
       152.35770081, 121.2665797 , 114.35043973, 152.84709149,
        95.08068675, 141.45645665,  86.31376203, 116.05560041,
       104.55849707, 161.10850305, 167.56027222, 123.11313432,
       147.93195207, 126.28248843, 142.14859263, 114.13232306,
        93.8661255 ,  91.85627745, 138.22601448, 177.77149107,
       135.81422068, 118.97313602, 170.65392007, 135.13317217,
       132.25141483, 135.45717377, 127.29227964, 123.53600682,
        99.66442489, 140.73944765, 128.0898127 , 155.40535241,
       122.27736028,  89.66589488, 127.9872585 , 166.14

In [44]:
# Round to 4 decimals; a bare .round() would round to a whole number
new_samp_mean = np.mean(rvs).round(4)
new_samp_mean

130.3559

In [59]:
# One-sample t-test on the generated sample (two-sided by default)
t_statistic, p_value = stats.ttest_1samp(a=rvs, popmean=pop_mean)
print("%.2f" % t_statistic,"%.4f" % p_value)

4.77 0.0000
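
To see what `stats.ttest_1samp` does with raw data, the following sketch recomputes the statistic by hand from the generated sample and compares it with SciPy's result. It assumes the same generation call and seed as above; `se`, `t_manual` and `p_manual` are names introduced here for illustration.

```python
import numpy as np
from scipy import stats

pop_mean = 120

# Same generated sample as above (loc and scale are the lab's sample mean and std)
rvs = stats.norm.rvs(loc=130.1, scale=21.21, size=100, random_state=7)

# Standard error uses the sample standard deviation (ddof=1)
se = np.std(rvs, ddof=1) / np.sqrt(len(rvs))

# Manual t statistic and two-sided p-value
t_manual = (np.mean(rvs) - pop_mean) / se
p_manual = 2 * stats.t.sf(abs(t_manual), len(rvs) - 1)

# SciPy's one-sample t-test, two-sided by default
t_scipy, p_scipy = stats.ttest_1samp(a=rvs, popmean=pop_mean)

print(np.isclose(t_manual, t_scipy), np.isclose(p_manual, p_scipy))
```

Because the default test is two-sided, its p-value is twice the one-tailed value that `stats.t.sf` gives for the same statistic.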
