# What is the True Normal Human Body Temperature? 

#### Background

The mean normal body temperature was held to be 37$^{\circ}$C or 98.6$^{\circ}$F for more than 120 years since it was first conceptualized and reported by Carl Wunderlich in a famous 1868 book. But, is this value statistically correct?

<h3>Exercises</h3>

<p>In this exercise, you will analyze a dataset of human body temperatures and employ the concepts of hypothesis testing, confidence intervals, and statistical significance.</p>

<p>Answer the following questions <b>in this notebook below and submit to your GitHub account</b>.</p> 

<ol>
<li>  Is the distribution of body temperatures normal? 
    <ul>
    <li> Although this is not a requirement for the Central Limit Theorem to hold (read the introduction on Wikipedia's page about the CLT carefully: https://en.wikipedia.org/wiki/Central_limit_theorem), it gives us some peace of mind that the population may also be normally distributed if we assume that this sample is representative of the population.
    <li> Think about the way you're going to check for the normality of the distribution. Graphical methods are usually used first, but there are also other ways: https://en.wikipedia.org/wiki/Normality_test
    </ul>
<li>  Is the sample size large? Are the observations independent?
    <ul>
    <li> Remember that this is a condition for the Central Limit Theorem, and hence the statistical tests we are using, to apply.
    </ul>
<li>  Is the true population mean really 98.6 degrees F?
    <ul>
    <li> First, try a bootstrap hypothesis test.
    <li> Now, let's try frequentist statistical testing. Would you use a one-sample or two-sample test? Why?
    <li> In this situation, is it appropriate to use the $t$ or $z$ statistic? 
    <li> Now try using the other test. How is the result different? Why?
    </ul>
<li>  Draw a small sample of size 10 from the data and repeat both frequentist tests. 
    <ul>
    <li> Which one is the correct one to use? 
    <li> What do you notice? What does this tell you about the difference in application of the $t$ and $z$ statistic?
    </ul>
<li>  At what temperature should we consider someone's temperature to be "abnormal"?
    <ul>
    <li> As in the previous example, try calculating everything using the bootstrap approach, as well as the frequentist approach.
    <li> Start by computing the margin of error and confidence interval. When calculating the confidence interval, keep in mind that you should use the appropriate formula for one draw, and not N draws.
    </ul>
<li>  Is there a significant difference between males and females in normal temperature?
    <ul>
    <li> What testing approach did you use and why?
    <li> Write a story with your conclusion in the context of the original problem.
    </ul>
</ol>

You can include written notes in notebook cells using Markdown: 
   - In the control panel at the top, choose Cell > Cell Type > Markdown
   - Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet

#### Resources

+ Information and data sources: http://www.amstat.org/publications/jse/datasets/normtemp.txt, http://www.amstat.org/publications/jse/jse_data_archive.htm
+ Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet

****

#### Data Setup

In [None]:
import pandas as pd

df = pd.read_csv('data/human_body_temperature.csv')

In [None]:
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
from scipy import stats
%matplotlib inline

Use Seaborn style for graphic representation

In [None]:
sns.set()

Review the column info and the shape of the sample set. It has 130 rows and 3 columns.

In [None]:
print(df.info())
print(df.shape)

#### Question 1

Use a graphical method: plot the empirical CDF of the temperature data against the empirical CDF of a large random sample drawn from a normal distribution with the same mean and standard deviation.

In [None]:
# Rule of thumb: set the number of bins to the square root of the number of data points
bins = int(np.sqrt(len(df)))
print(bins)

In [None]:
_ = plt.hist('temperature', data=df, bins=bins)
_ = plt.xlabel('Temp')
_ = plt.ylabel('counts')

An initial view suggests that the data are roughly normally distributed. To check further, we draw a large sample from a normal distribution with NumPy, using the mean and standard deviation of the data.

In [None]:
mu = np.mean(df.temperature)
sigma = np.std(df.temperature)
samples = np.random.normal(mu, sigma, 10000)

In [None]:
# Compute the empirical CDF: sorted values x and cumulative proportions y
def ecdf(data):
    n = len(data)
    x = np.sort(data)
    y = np.arange(1, n+1) / n
    return x, y

In [None]:
x, y = ecdf(df.temperature)
x_theor, y_theor = ecdf(samples)

In [None]:
_ = plt.plot(x, y, marker='.', linestyle='none', label='original data')
_ = plt.plot(x_theor, y_theor, label='sample data')
_ = plt.xlabel('Temp')
_ = plt.ylabel('probability')
_ = plt.legend(loc='best')
plt.show()

The empirical CDF of the dataset closely follows the CDF of the normally distributed sample.
We can conclude that the data collected are approximately normally distributed.
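The graphical check can be backed up with a formal normality test, as the exercise suggests. The sketch below is only an illustration: it runs on a synthetic stand-in for `df.temperature` (an assumption, so that the cell is self-contained), using `scipy.stats.normaltest` and `scipy.stats.shapiro`. A large p-value means the test found no evidence against normality; it does not prove the data are normal.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for df.temperature (assumption: 130 roughly normal readings).
rng = np.random.default_rng(0)
temps = rng.normal(loc=98.25, scale=0.73, size=130)

# D'Agostino-Pearson test: combines sample skewness and kurtosis.
k2_stat, k2_p = stats.normaltest(temps)

# Shapiro-Wilk test: compares ordered values with expected normal order statistics.
sw_stat, sw_p = stats.shapiro(temps)

print('normaltest   p-value: {:.3f}'.format(k2_p))
print('Shapiro-Wilk p-value: {:.3f}'.format(sw_p))
```

In the notebook itself, `temps` would be replaced by `df.temperature`.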

#### Question 2

The dataset in this exercise has 130 observations. Per the [t-distribution table](http://www.sthda.com/english/wiki/t-distribution-table), the t distribution is already close to the normal distribution at 25-30 degrees of freedom, and beyond 120 it is nearly indistinguishable from the infinite-sample case. A sample of 130 is therefore large enough for the Central Limit Theorem to apply: the sample mean is approximately normally distributed.

Independence depends on how the data were collected. If all readings came from babies or small children, age would be a confounding factor; if they came from a gym, level of exercise would be. Since there is no specific information about how the data were collected, we assume that the observations are independent.
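To see why a sample size of 130 is considered large, here is a small simulation sketch of the Central Limit Theorem. It deliberately uses a skewed exponential population (an illustrative assumption, unrelated to the temperature data) and checks that means of samples of size 130 have the spread the theorem predicts, $\sigma/\sqrt{n}$.

```python
import numpy as np

rng = np.random.default_rng(1)

# Skewed population: exponential with mean 1 and standard deviation 1.
n, n_trials = 130, 5000
sample_means = rng.exponential(scale=1.0, size=(n_trials, n)).mean(axis=1)

# CLT prediction: sample means are centered at 1 with spread 1 / sqrt(n).
print('mean of sample means: {:.3f}'.format(sample_means.mean()))
print('std of sample means:  {:.3f}'.format(sample_means.std()))
print('CLT prediction:       {:.3f}'.format(1 / np.sqrt(n)))
```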

#### Question 3

Null Hypothesis - The true population mean temperature is 98.6F

First, shift the sample so that its mean equals the hypothesized population mean

In [None]:
sample_mean = np.mean(df.temperature)
print(sample_mean)
sample_shifted = df.temperature - sample_mean + 98.6

Create a function that draws one bootstrap sample and applies a statistic to it

In [None]:
def bootstrap_replicate_1d(data, func):
    return func(np.random.choice(data, size=len(data)))

Then, create another function to generate many bootstrap replicates 

In [None]:
def draw_bs_reps(data, func, size):
    bs_replicates = np.empty(size)
    for i in range(size):
        bs_replicates[i] = bootstrap_replicate_1d(data, func)
    return bs_replicates

Get the set of means for the 10000 bootstrap replicates

In [None]:
bs_replicates = draw_bs_reps(sample_shifted, np.mean, 10000)

Compute the 95% interval of the bootstrap replicates under the null hypothesis, and the p-value

In [None]:
conf_lv_95 = np.percentile(bs_replicates, [2.5, 97.5])
print(conf_lv_95)

In [None]:
p = np.sum(bs_replicates <= sample_mean)/len(bs_replicates)
print(p)

The observed sample mean of 98.2F lies outside the 95% interval of the bootstrap replicates generated under the null hypothesis, so a sample mean this low would occur less than 5% of the time if the population mean were 98.6F. In fact, none of the 10,000 replicates was as low as the observed sample mean, so the estimated p-value is zero (that is, below 1/10,000).

We therefore reject the null hypothesis that the population mean is 98.6F. This implies two possibilities:
1. The population mean temperature is not 98.6F
2. The sample is not representative of the true population
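The bootstrap test above is one-sided (it counts replicates at or below the observed mean). A two-sided variant can be sketched as follows; the function name `bootstrap_mean_test` and the synthetic data are hypothetical and only for illustration, not part of the original analysis.

```python
import numpy as np

def bootstrap_mean_test(data, mu0, n_reps=10000, seed=0):
    """Two-sided bootstrap p-value for H0: population mean == mu0."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    observed = data.mean()
    # Shift the data so that the null hypothesis holds exactly.
    shifted = data - observed + mu0
    # Draw all bootstrap samples at once by resampling indices with replacement.
    idx = rng.integers(0, len(shifted), size=(n_reps, len(shifted)))
    reps = shifted[idx].mean(axis=1)
    # Fraction of replicates at least as far from mu0 as the observed mean.
    return np.mean(np.abs(reps - mu0) >= abs(observed - mu0))

# Synthetic stand-in for df.temperature (assumption).
rng = np.random.default_rng(42)
fake_temps = rng.normal(98.25, 0.73, size=130)

p_far = bootstrap_mean_test(fake_temps, 98.6)
p_near = bootstrap_mean_test(fake_temps, fake_temps.mean())
print('H0 mean 98.6:        p = {:.4f}'.format(p_far))
print('H0 mean = sample mean: p = {:.4f}'.format(p_near))
```

When the hypothesized mean equals the sample mean, every replicate is at least as extreme as the observed value, so the p-value is 1.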

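This is a one-sample test, because a single sample is compared against a fixed value (98.6F) rather than against a second sample. Given the large sample size, a z-test is used first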

In [None]:
z = (sample_mean - 98.6) / (np.std(df.temperature) / np.sqrt(len(df)))
# two-sided p-value: twice the tail area beyond |z|
pval = 2 * stats.norm.cdf(-np.abs(z))
print('z-score: {:.2f}'.format(z))
print('p-value: {:.2f}'.format(pval))

The z-score is below -3 and the p-value rounds to zero. This is significant evidence to reject the null hypothesis.

t-test, one-sample

In [None]:
t_statistics = stats.ttest_1samp(df.temperature, 98.6)
print('t-score: {:.2f}'.format(t_statistics.statistic))
print('p-value: {:.2f}'.format(t_statistics.pvalue))

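The t-score is below -5 and the p-value rounds to zero, so we again reject the null hypothesis. Because the population standard deviation is unknown, the $t$ statistic is the appropriate choice; with 130 observations the two tests give nearly the same result.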
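For reference, the one-sample t statistic is $t = (\bar{x} - \mu_0) / (s / \sqrt{n})$, where $s$ is the sample standard deviation (`ddof=1`) and the p-value comes from a t distribution with $n - 1$ degrees of freedom. The sketch below computes it by hand on synthetic data (an assumption, so that the cell is self-contained) and compares it with `stats.ttest_1samp`.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for df.temperature (assumption).
rng = np.random.default_rng(7)
sample = rng.normal(98.25, 0.73, size=130)
mu0 = 98.6

# Sample standard deviation uses ddof=1 (divides by n - 1).
s = sample.std(ddof=1)
t_manual = (sample.mean() - mu0) / (s / np.sqrt(len(sample)))
# Two-sided p-value from the t distribution with n - 1 degrees of freedom.
p_manual = 2 * stats.t.sf(abs(t_manual), df=len(sample) - 1)

res = stats.ttest_1samp(sample, mu0)
print('manual: t = {:.4f}, p = {:.3g}'.format(t_manual, p_manual))
print('scipy:  t = {:.4f}, p = {:.3g}'.format(res.statistic, res.pvalue))
```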

#### Question 4

Null Hypothesis - The true population mean temperature is 98.6F

Draw a sample of ten from the dataset and repeat the frequentist tests from Question 3

In [None]:
ten_sample = df['temperature'].sample(n=10, random_state=3)
ten_sample_mean = np.mean(ten_sample)

z-test

In [None]:
z1 = (ten_sample_mean - 98.6) / (np.std(ten_sample) / np.sqrt(10))
# two-sided p-value: twice the tail area beyond |z1|
pval1 = 2 * stats.norm.cdf(-np.abs(z1))
print('z-score: {:.2f}'.format(z1))
print('p-value: {:.2f}'.format(pval1))

t-test

In [None]:
t_ten_sample = stats.ttest_1samp(ten_sample, 98.6)
print('t-score: {:.2f}'.format(t_ten_sample.statistic))
print('p-value: {:.2f}'.format(t_ten_sample.pvalue))

At the 5% significance level, we reject the null hypothesis under the z-test but fail to reject it under the t-test. The t-test is the correct choice for a sample this small: the t distribution has heavier tails, which gives a more conservative result. Both tests give higher p-values than with the full sample, which makes it harder to reject the null hypothesis.
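The gap between the two tests comes from their critical values. As a quick illustration, the sketch below compares the two-sided 95% critical value of the t distribution with that of the normal distribution for a few sample sizes (the sizes 10, 30, and 130 are chosen only for illustration).

```python
from scipy import stats

# Two-sided 95% critical value of the standard normal distribution.
z_crit = stats.norm.ppf(0.975)

for n in (10, 30, 130):
    # Critical value of the t distribution with n - 1 degrees of freedom.
    t_crit = stats.t.ppf(0.975, df=n - 1)
    print('n = {:>3}: t critical = {:.3f}, z critical = {:.3f}'.format(n, t_crit, z_crit))
```

The t critical value is always larger than the z critical value and approaches it as the sample size grows, which is why the choice of test matters for 10 observations and hardly at all for 130.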

#### Question 5

At the 95% confidence level, we would regard a temperature as abnormal if it lies about 2 standard deviations or more from the population mean

In [None]:
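# Resample the original (unshifted) data so the interval is centered on the sample mean
bs_replicates_obs = draw_bs_reps(df.temperature, np.mean, 10000)
conf_lv_95 = np.percentile(bs_replicates_obs, [2.5, 97.5])
print('Bootstrap confidence interval: {}'.format(conf_lv_95))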

In [None]:
# The z-based interval needs the margin of error from the original sample: z * std / sqrt(n)
margin_error = 1.96 * np.std(df.temperature) / np.sqrt(len(df.temperature))
confid_lv_z = [sample_mean - margin_error, sample_mean + margin_error]
print('Margin of error: {}'.format(margin_error))
print('z-test confidence interval: {}'.format(confid_lv_z))

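The z-based method gives a similar 95% confidence interval for the mean, roughly 98.13F to 98.38F. Note that this interval describes the uncertainty in the population mean: a value outside it is unusual for a sample mean, not necessarily for one person's reading.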
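The exercise hint asks for the formula for one draw rather than N draws. A minimal sketch of the difference, on synthetic data (an assumption, so that the cell is self-contained): the interval for the mean uses the standard error $s/\sqrt{n}$, while the range for a single reading uses the standard deviation $s$ itself and is therefore much wider.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for df.temperature (assumption).
rng = np.random.default_rng(3)
temps = rng.normal(98.25, 0.73, size=130)

mean, s = temps.mean(), temps.std(ddof=1)
z_crit = stats.norm.ppf(0.975)

# N draws: 95% confidence interval for the population mean.
se = s / np.sqrt(len(temps))
ci_mean = (mean - z_crit * se, mean + z_crit * se)

# One draw: approximate 95% range for a single person's temperature.
range_single = (mean - z_crit * s, mean + z_crit * s)

print('CI for the mean:     {:.2f} to {:.2f}'.format(*ci_mean))
print('Range for one draw:  {:.2f} to {:.2f}'.format(*range_single))
```

Under this reading, a temperature would be flagged as abnormal when it falls outside the single-draw range, not outside the interval for the mean.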

t-based interval for the sample of 10: with df = 9, the two-sided 95% critical t value is about 2.262

In [None]:
t_crit = stats.t.ppf(0.975, df=len(ten_sample) - 1)  # about 2.262 for df = 9
margin_of_error_1 = t_crit * np.std(ten_sample, ddof=1) / np.sqrt(len(ten_sample))
confid_lv_t = [ten_sample_mean - margin_of_error_1, ten_sample_mean + margin_of_error_1]
print('Margin of error: {}'.format(margin_of_error_1))
print('Confidence interval: {}'.format(confid_lv_t))

Note that the margin of error, and therefore the confidence interval, is wider for the sample of 10. This is due to the small sample size.

#### Question 6

Null Hypothesis - males and females have the same mean body temperature.

We will use a two-sample bootstrap method

In [None]:
# Obtain samples for males and females
males = df[df.gender == 'M'].temperature
females = df[df.gender == 'F'].temperature
# Observed difference from the existing sample
mean_diff = np.mean(males) - np.mean(females)
print(mean_diff)

In [None]:
# Shift sample
males_shifted = males - np.mean(males) + mu
females_shifted = females - np.mean(females) + mu

In [None]:
# Draw replicates
bs_rep_males = draw_bs_reps(males_shifted, np.mean, 10000)
bs_rep_females = draw_bs_reps(females_shifted, np.mean, 10000)

In [None]:
# Calculate difference in mean among the replicates
bs_rep_diff = np.abs(bs_rep_males - bs_rep_females)

# Calculate the p-value: the fraction of replicates with a difference at least as large as the observed one
p = np.sum(bs_rep_diff >= np.abs(mean_diff)) / len(bs_rep_diff)
print('p-value: {}'.format(p))

There is about a 2% chance of seeing a difference at least as large as the observed one if the null hypothesis is true. This is significant evidence to reject the null hypothesis at the 5% significance level
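A permutation test is a closely related resampling approach, shown here as an alternative technique rather than the method used above. Instead of shifting each group to a common mean, it pools the two groups and reshuffles the group labels. The function name `permutation_diff_means` and the synthetic groups are hypothetical and only for illustration.

```python
import numpy as np

def permutation_diff_means(a, b, n_perm=5000, seed=0):
    """Two-sided permutation p-value for H0: both groups share one distribution."""
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    observed = abs(a.mean() - b.mean())
    pooled = np.concatenate([a, b])
    count = 0
    for _ in range(n_perm):
        # Reassign group labels at random by shuffling the pooled data.
        perm = rng.permutation(pooled)
        diff = abs(perm[:len(a)].mean() - perm[len(a):].mean())
        count += int(diff >= observed)
    return count / n_perm

# Synthetic groups (assumption): one clearly separated pair, one identical pair.
rng = np.random.default_rng(5)
group_a = rng.normal(0.0, 1.0, size=65)
group_b = rng.normal(3.0, 1.0, size=65)

p_separated = permutation_diff_means(group_a, group_b)
p_identical = permutation_diff_means(group_a, group_a)
print('separated groups: p = {:.4f}'.format(p_separated))
print('identical groups: p = {:.4f}'.format(p_identical))
```

In the notebook, the call would be `permutation_diff_means(males, females)`.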

Apply a two-sample z-test at the 5% significance level

In [None]:
z_diff_in_mean = mean_diff / np.sqrt(np.std(males)**2 / len(males) + np.std(females)**2 / len(females))
p_val_diff_in_mean = 2 * (1 - stats.norm.cdf(np.abs(z_diff_in_mean)))
print('z-score: {:.2f}'.format(z_diff_in_mean))
print('p-value: {:.2f}'.format(p_val_diff_in_mean))

In [None]:
# For validation purposes, repeat the z-test using statsmodels
from statsmodels.stats.weightstats import ztest
z_test = ztest(males, females)
print('z-score: {:.2f}'.format(z_test[0]))
print('p-value: {:.2f}'.format(z_test[1]))

The p-value is below 5%, so we reject the null hypothesis at the 5% significance level

In [None]:
# Use scipy.stats to run a two-sample t-test and confirm the result
t_test1 = stats.ttest_ind(males, females)
print('t-score: {:.2f}'.format(t_test1.statistic))
print('p-value: {:.2f}'.format(t_test1.pvalue))

The p-value is about 2%, so we have significant evidence to reject the null hypothesis at the 5% significance level
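By default, `stats.ttest_ind` assumes that both groups have the same variance. If that assumption is in doubt, Welch's t-test (`equal_var=False`) is a common alternative. The sketch below uses synthetic groups (an assumption, not the notebook's data) and checks the Welch statistic against the formula $t = (\bar{x}_1 - \bar{x}_2) / \sqrt{s_1^2/n_1 + s_2^2/n_2}$.

```python
import numpy as np
from scipy import stats

# Synthetic stand-ins for the male and female temperatures (assumption).
rng = np.random.default_rng(11)
group_a = rng.normal(98.1, 0.70, size=65)
group_b = rng.normal(98.4, 0.74, size=65)

# Welch's t-test: does not assume equal variances.
welch = stats.ttest_ind(group_a, group_b, equal_var=False)

# The same statistic by hand, using sample variances (ddof=1).
se = np.sqrt(group_a.var(ddof=1) / len(group_a) + group_b.var(ddof=1) / len(group_b))
t_manual = (group_a.mean() - group_b.mean()) / se

print('Welch t = {:.3f}, p = {:.4f}'.format(welch.statistic, welch.pvalue))
print('manual t = {:.3f}'.format(t_manual))
```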

#### Conclusion

The mean normal body temperature of 98.6F does not appear to hold for this sample. The sample mean is 98.2F, which is 0.4F below the historical value, and the difference is statistically significant. The data also show a significant difference in mean temperature between males and females.

Why the decrease? The data cannot answer this, but one area worth exploring is the current living environment. Perhaps more people have moved to warmer areas, where their bodies do not need to run as warm. With advances in technology, much labor-intensive work has been replaced by machines, so people may need less food energy for daily tasks, which could lower body temperature. Relatedly, people today have more clothing and shelter to protect them from the environment, so over time the body may need less fat to maintain a high temperature.