# What is the True Normal Human Body Temperature? 

#### Background

For more than 120 years, the mean normal body temperature was held to be 37$^{\circ}$C or 98.6$^{\circ}$F, a value first conceptualized and reported by Carl Wunderlich in a famous 1868 book. But is this value statistically correct?

In [14]:
import pandas as pd
from scipy.stats.mstats import normaltest
from scipy import stats
import numpy as np

In [16]:
df = pd.read_csv('human_body_temperature.csv')

In [17]:
df.head()

| | temperature | gender | heart_rate |
|---|---|---|---|
| 0 | 99.3 | F | 68.0 |
| 1 | 98.4 | F | 81.0 |
| 2 | 97.8 | M | 73.0 |
| 3 | 99.2 | F | 66.0 |
| 4 | 98.0 | F | 73.0 |


In [18]:
df.dtypes

temperature    float64
gender          object
heart_rate     float64
dtype: object

In [19]:
df.describe()

| | temperature | heart_rate |
|---|---|---|
| count | 130.0 | 130.0 |
| mean | 98.249231 | 73.761538 |
| std | 0.733183 | 7.062077 |
| min | 96.3 | 57.0 |
| 25% | 97.8 | 69.0 |
| 50% | 98.3 | 74.0 |
| 75% | 98.7 | 79.0 |
| max | 100.8 | 89.0 |


### 1. Is the distribution of body temperatures normal?

#### How do we determine whether the distribution is normal? We use the D'Agostino-Pearson test for normality, which `normaltest` implements.
#### H0 - distribution is normal, HA - distribution is not normal.

In [7]:
print ('p-value:', normaltest(df.temperature)[1])

('p-value:', 0.2587479863488254)


### The p-value is nearly 26%, well above the 5% significance threshold I'll use, so we fail to reject the null hypothesis. The data are consistent with the assumption that the temperatures are normally distributed.
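The decision rule above can be wrapped in a small helper. This is only a sketch: the function name `looks_normal` and the synthetic sample are assumptions made for illustration and are not part of the original notebook. It uses `scipy.stats.normaltest`, the unmasked counterpart of the `mstats` function imported earlier.

```python
import numpy as np
from scipy import stats

def looks_normal(sample, alpha=0.05):
    """Hypothetical helper: return (decision, p_value).

    decision is True when H0 (the data are normal) is NOT rejected
    at significance level alpha.
    """
    statistic, p_value = stats.normaltest(sample)
    return bool(p_value >= alpha), float(p_value)

# Synthetic, strongly right-skewed data (not the temperature data),
# used only to show what a rejection looks like.
rng = np.random.default_rng(0)
skewed = rng.exponential(scale=1.0, size=500)

decision, p_value = looks_normal(skewed)
print(decision, p_value)
```

With the data frame loaded, `looks_normal(df.temperature)` would apply the same rule to the temperature column.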

### 2. Is the sample size large? Are the observations independent? 

#### The sample size (130) is greater than 30, so it is large enough.

#### Also, each observation belongs to an individual person and is independent of the other people. So the conclusion is that the observations are independent.

### 3. Is the true population mean really 98.6 degrees F? 
#### Would you use a one-sample or two-sample test? Why? 
#### In this situation, is it appropriate to use the t or z statistic? 
#### Now try using the other test. How is the result different? Why? 

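We use a one-sample test, because we are comparing a single sample mean against a fixed reference value. The population standard deviation is unknown, so a t-test is strictly the appropriate choice, but with 130 observations the z-test is a very close approximation, so we start with a z-test. H0: the mean is 98.6; HA: the mean is not 98.6. The significance level is 0.05.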


In [4]:
from statsmodels.stats.weightstats import ztest
ztest(df.temperature,value=98.6)

(-5.4548232923645195, 4.9021570141012155e-08)

### The p-value, the second value of the tuple, is minuscule. So we reject the null hypothesis that the true population mean is 98.6 degrees Fahrenheit.
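The question above also asks for the other test. As a rough sketch, the one-sample t-test can be reproduced from the summary statistics printed by `df.describe()` (count, mean and std); the variable names below are assumptions for illustration. The statistic is the same as for the z-test; only the reference distribution changes, from the standard normal to a t distribution with n - 1 degrees of freedom.

```python
import math
from scipy import stats

# Summary statistics copied from the df.describe() output above.
n = 130
sample_mean = 98.249231
sample_sd = 0.733183      # pandas reports the sample standard deviation (ddof=1)
mu0 = 98.6                # value under the null hypothesis

se = sample_sd / math.sqrt(n)        # standard error of the mean
t_stat = (sample_mean - mu0) / se    # same value as the z statistic

# Two-sided p-values under each reference distribution.
p_t = 2 * stats.t.sf(abs(t_stat), df=n - 1)
p_z = 2 * stats.norm.sf(abs(t_stat))

print(t_stat, p_t, p_z)
```

The t distribution has heavier tails, so its p-value is somewhat larger than the z-test's, but both are far below 0.05 and the conclusion does not change. With the data frame loaded, `stats.ttest_1samp(df.temperature, 98.6)` is the direct call.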

### 4. At what temperature should we consider someone's temperature to be "abnormal"? 
#### Start by computing the margin of error and confidence interval. 

In [8]:
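df_mean = np.mean(df.temperature)
df_sd = np.std(df.temperature)   # NumPy default ddof=0 (population formula)
se = df_sd/np.sqrt(len(df))      # standard error of the mean
me = 1.96*se                     # margin of error at 95% confidence
confidence_interval = [df_mean-me, df_mean+me]
confidence_interval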

[98.123679804428193, 98.374781734033363]

In [9]:
stats.norm.interval(.95,loc=df_mean,scale=df_sd/np.sqrt(len(df)))

(98.123682111456645, 98.37477942700491)

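### The endpoints of the two intervals agree to four decimal places. If someone's temperature falls outside these bounds, it would be classified as "abnormal". Note, however, that this interval bounds the population mean; a range for individual readings would use the standard deviation rather than the standard error and would be considerably wider.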
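The interval above bounds the population mean. As a rough sketch of an alternative, a range expected to contain about 95% of individual readings can be computed from the same summary statistics, assuming the temperatures are normally distributed and treating the sample mean and standard deviation as the population values. The variable names are assumptions for illustration.

```python
from scipy import stats

# Summary statistics copied from the df.describe() output above.
sample_mean = 98.249231
sample_sd = 0.733183

# Central 95% range for a single reading: scale is the standard deviation,
# not the standard error used for the confidence interval of the mean.
lower, upper = stats.norm.interval(0.95, loc=sample_mean, scale=sample_sd)

print(round(lower, 2), round(upper, 2))
```

Under these assumptions the range runs from roughly 96.8 to 99.7 degrees F, which is much wider than the confidence interval for the mean.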

### 5. Is there a significant difference between males and females in normal temperature? 
#### What test did you use and why? 
#### Write a story with your conclusion in the context of the original problem. 

We will do a two-sample hypothesis test, because we are comparing the means of two independent groups.

In [12]:
females = np.array(df.temperature[df.gender=='F'])
males = np.array(df.temperature[df.gender=='M'])

print(len(males))
print(len(females))

65
65


### The samples (65 observations each) are still large enough to use a z-test, but we will use a t-test this time. This decision is justifiable since the population standard deviation is unknown.

In [13]:
stats.ttest_ind(females,males)

Ttest_indResult(statistic=2.2854345381656103, pvalue=0.023931883122395609)

### The p-value is about 0.024, below the 0.05 threshold, so I reject the null hypothesis of equal means. There is a statistically significant difference between the normal temperatures of males and females. With a z-test I would still reject the null hypothesis.
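As a cross-check, the reported statistic can be converted to a two-sided p-value under both the t and the standard normal reference distributions. This is a sketch that reuses the statistic printed above; the variable names are assumptions, and the 128 degrees of freedom follow from the two groups of 65 with the pooled-variance default of `stats.ttest_ind`.

```python
from scipy import stats

# Statistic copied from the Ttest_indResult output above.
t_stat = 2.2854345381656103
df_pooled = 65 + 65 - 2   # degrees of freedom for the pooled two-sample t-test

p_t = 2 * stats.t.sf(t_stat, df=df_pooled)   # t reference distribution
p_z = 2 * stats.norm.sf(t_stat)              # standard normal reference

print(p_t, p_z)
```

Both p-values fall below 0.05, so the choice between the two reference distributions does not change the decision at this sample size.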

### Write a story with your conclusion in the context of the original problem.

#### It would not be possible to measure the body temperature of every human being on the planet. However, assuming that the sample dataset represents a mix of people with different backgrounds, ethnicities and ages, we can make the following conclusions:

#### 1. The distribution of body temperature overall can be considered to be normal.

#### 2. We are 99% confident that the true population mean is between 98.084 degrees and 98.415 degrees.

#### 3. Based on the 95% confidence interval for the mean, a temperature less than 98.12 degrees or greater than 98.38 degrees would be considered abnormal.

#### 4. There is a significant difference between the male population temperature and the female population temperature. Females tend to have a higher body temperature than males.
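The 99% interval quoted in conclusion 2 is not computed in the cells above. A minimal sketch of how it can be checked from the `df.describe()` summary statistics follows; the variable names are assumptions, and the normal approximation is used as in the 95% interval earlier.

```python
import math
from scipy import stats

# Summary statistics copied from the df.describe() output above.
n = 130
sample_mean = 98.249231
sample_sd = 0.733183      # sample standard deviation (ddof=1)

se = sample_sd / math.sqrt(n)   # standard error of the mean
ci_low, ci_high = stats.norm.interval(0.99, loc=sample_mean, scale=se)

print(round(ci_low, 3), round(ci_high, 3))
```

The result agrees with the bounds stated in conclusion 2 to three decimal places.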