# Instructions

It is assumed that the mean systolic blood pressure is μ = 120 mm Hg. In the Honolulu Heart Study, a sample of n = 100 people had an average systolic blood pressure of 130.1 mm Hg with a standard deviation of 21.21 mm Hg. 

Is the group significantly different (with respect to systolic blood pressure!) from the regular population?

Set up the hypothesis test.

Write down all the steps followed for setting up the test.

Calculate the test statistic by hand and also code it in Python. The result should be approximately 4.7619. We will then look at how to make a decision based on this calculated value.
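The hand calculation can be sketched as follows. This assumes a two-sided test (the question asks whether the group is "different", not specifically higher or lower) and a one-sample t-statistic, since only the sample standard deviation is given. The symbols $\mu_0$, $\bar{x}$, and $s$ are notation introduced here for the assumed population mean, the sample mean, and the sample standard deviation.

```latex
H_0:\ \mu = \mu_0 = 120 \qquad \text{vs.} \qquad H_1:\ \mu \neq 120

t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}
  = \frac{130.1 - 120}{21.21 / \sqrt{100}}
  = \frac{10.1}{2.121}
  \approx 4.7619,
\qquad \text{with } n - 1 = 99 \text{ degrees of freedom}
```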


In [15]:
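import math

# Values from the problem statement
sample_mean = 130.1   # sample average, mm Hg
pop_mean = 120        # assumed population mean, mm Hg
sample_std = 21.21    # sample standard deviation, mm Hg
n = 100               # sample size

# One-sample test statistic: (sample mean - population mean) / standard error
statistic = (sample_mean - pop_mean)/(sample_std/math.sqrt(n))
print("Statistic is: ", statistic)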

Statistic is:  4.761904761904759
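One way to turn this statistic into a decision is to compare it with a critical value or to compute a p-value. The sketch below assumes a two-sided test at a significance level of 0.05 and a t-distribution with n - 1 degrees of freedom; the names `alpha`, `t_stat`, `t_crit`, and `reject_h0` are introduced here for illustration only.

```python
import math
from scipy import stats

# Values from the problem statement
sample_mean = 130.1
pop_mean = 120
sample_std = 21.21
n = 100
alpha = 0.05  # assumed significance level

t_stat = (sample_mean - pop_mean) / (sample_std / math.sqrt(n))
df = n - 1  # degrees of freedom for a one-sample t-test

# Two-sided critical value: reject H0 when |t| exceeds it
t_crit = stats.t.ppf(1 - alpha / 2, df)

# Two-sided p-value: probability of a statistic at least this extreme under H0
p_value = 2 * stats.t.sf(abs(t_stat), df)

reject_h0 = abs(t_stat) > t_crit
print("t = {:.4f}, critical value = {:.4f}, p-value = {:.3g}".format(t_stat, t_crit, p_value))
print("Reject H0:", reject_h0)
```

Both criteria lead to the same decision: the statistic exceeds the critical value exactly when the p-value falls below `alpha`.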


In [16]:
import numpy as np
from scipy import stats
from numpy.random import normal


# Simulate 10 samples of size n and compute the t-statistic of each one
samples = {}

for i in range(10):
    sample_name = "sample_" + str(i)
    samples[sample_name] = normal(loc = 130.1, scale = 21.21, size = n)
    mean_key = sample_name + "_mean"
    samples[mean_key] = np.mean(samples[sample_name])
    std_key = sample_name + "_std"
    # ddof=1 gives the sample standard deviation (divides by n - 1)
    samples[std_key] = np.std(samples[sample_name], ddof=1)
    stat_key = sample_name + "_t-statistic"
    samples[stat_key] = (samples[mean_key] - pop_mean)/(samples[std_key]/math.sqrt(n))
    print("The t-statistic for the sample {} is: {}".format(i, samples[stat_key]))


The t-statistic for the sample 0 is: 4.604322773453268
The t-statistic for the sample 1 is: 4.850078365553911
The t-statistic for the sample 2 is: 4.359033901995202
The t-statistic for the sample 3 is: 3.7768897525732164
The t-statistic for the sample 4 is: 4.30875320775256
The t-statistic for the sample 5 is: 5.35015822812754
The t-statistic for the sample 6 is: 3.984059283824009
The t-statistic for the sample 7 is: 4.790820395711883
The t-statistic for the sample 8 is: 4.000141677811876
The t-statistic for the sample 9 is: 5.855107319520905
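As a sanity check, the hand-coded formula can be compared with the statistic that `scipy.stats.ttest_1samp` reports for the same data. This is a minimal sketch on one freshly simulated sample; the seed and the names `check_sample` and `manual_t` are arbitrary choices made here, not part of the exercise.

```python
import math
import numpy as np
from scipy import stats

# Seeded generator so the check is reproducible (seed value is arbitrary)
rng = np.random.default_rng(0)
check_sample = rng.normal(loc=130.1, scale=21.21, size=100)

# Manual one-sample t-statistic against the assumed population mean of 120
manual_t = (np.mean(check_sample) - 120) / (
    np.std(check_sample, ddof=1) / math.sqrt(len(check_sample))
)

# Same test computed by SciPy
result = stats.ttest_1samp(check_sample, 120)

print("Manual t-statistic:", manual_t)
print("SciPy t-statistic: ", result.statistic)
```

The two values should agree up to floating-point rounding, because `ttest_1samp` uses the sample standard deviation with `ddof=1`.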


In [18]:
alpha = 0.05  # significance level
print("Assuming a significance level of {}".format(alpha))
print()

for i in range(10):
    sample_name = "sample_" + str(i)
    # Two-sided one-sample t-test against the assumed population mean; [1] is the p-value
    p_value = stats.ttest_1samp(samples[sample_name], pop_mean)[1]
    print("The p-value of sample {} is: {:-5.3}".format(i, p_value))
    if p_value < alpha:
        print("Therefore we discard the null hypothesis Ho, as it's very unlikely to get sample {} given Ho.".format(i))
    print()

Assuming a significance level of 0.05

The p-value of sample 0 is: 1.23e-05
Therefore we discard the null hypothesis Ho, as it's very unlikely to get sample 0 given Ho.

The p-value of sample 1 is: 4.59e-06
Therefore we discard the null hypothesis Ho, as it's very unlikely to get sample 1 given Ho.

The p-value of sample 2 is: 3.2e-05
Therefore we discard the null hypothesis Ho, as it's very unlikely to get sample 2 given Ho.

The p-value of sample 3 is: 0.000271
Therefore we discard the null hypothesis Ho, as it's very unlikely to get sample 3 given Ho.

The p-value of sample 4 is: 3.88e-05
Therefore we discard the null hypothesis Ho, as it's very unlikely to get sample 4 given Ho.

The p-value of sample 5 is: 5.68e-07
Therefore we discard the null hypothesis Ho, as it's very unlikely to get sample 5 given Ho.

The p-value of sample 6 is: 0.00013
Therefore we discard the null hypothesis Ho, as it's very unlikely to get sample 6 given Ho.

The p-value of sample 7 is: 5.84e-06
Therefore