## What is the true normal human body temperature? 

#### Background

The mean normal body temperature was held to be 37$^{\circ}$C or 98.6$^{\circ}$F for more than 120 years since it was first conceptualized and reported by Carl Wunderlich in a famous 1868 book. In 1992, this value was revised to 36.8$^{\circ}$C or 98.2$^{\circ}$F. 

#### Exercise
In this exercise, you will analyze a dataset of human body temperatures and employ the concepts of hypothesis testing, confidence intervals, and statistical significance.

Answer the following questions **in this notebook below and submit to your Github account**. 

1.  Is the distribution of body temperatures normal? 
    - Remember that this is a condition for the CLT, and hence the statistical tests we are using, to apply. 
2.  Is the true population mean really 98.6 degrees F?
    - Bring out the one-sample hypothesis test! In this situation, is it appropriate to apply a z-test or a t-test? How will the result be different?
3.  At what temperature should we consider someone's temperature to be "abnormal"?
    - Start by computing the margin of error and confidence interval.
4.  Is there a significant difference between males and females in normal temperature?
    - Set up and solve a two-sample hypothesis test.

You can include written notes in notebook cells using Markdown: 
   - In the control panel at the top, choose Cell > Cell Type > Markdown
   - Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet

#### Resources

+ Information and data sources: http://www.amstat.org/publications/jse/datasets/normtemp.txt, http://www.amstat.org/publications/jse/jse_data_archive.htm
+ Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet

****

In [1]:
import pandas as pd
from scipy.stats.mstats import normaltest
from scipy import stats
import numpy as np

In [2]:
df = pd.read_csv('data/human_body_temperature.csv')

In [3]:
df.head()

|   | temperature | gender | heart_rate |
|---|-------------|--------|------------|
| 0 | 99.3        | F      | 68.0       |
| 1 | 98.4        | F      | 81.0       |
| 2 | 97.8        | M      | 73.0       |
| 3 | 99.2        | F      | 66.0       |
| 4 | 98.0        | F      | 73.0       |


## Is the distribution of body temperatures normal?

To check whether the distribution is normal, I'll use the D'Agostino-Pearson normality test (`normaltest`). H0: the distribution is normal; HA: the distribution is not normal.

In [4]:
print ('p-value:', normaltest(df.temperature)[1])

p-value: 0.258747986349


### We cannot reject H0. The data are consistent with a normal distribution
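As a cross-check on a single normality test, a second test can be run on the same data. The sketch below is illustrative only: the helper name `normality_pvalues` is hypothetical, and the data are synthetic draws with arbitrarily chosen parameters, not the notebook's CSV.

```python
import numpy as np
from scipy import stats


def normality_pvalues(sample):
    """Return p-values from two normality tests for a 1-D sample."""
    sample = np.asarray(sample, dtype=float)
    # D'Agostino-Pearson test: combines skewness and kurtosis.
    _, p_dagostino = stats.normaltest(sample)
    # Shapiro-Wilk test: a second, independent check.
    _, p_shapiro = stats.shapiro(sample)
    return {"dagostino_pearson": p_dagostino, "shapiro_wilk": p_shapiro}


# Synthetic stand-in data (assumed values, NOT the real temperature column).
rng = np.random.default_rng(0)
synthetic = rng.normal(loc=98.2, scale=0.7, size=130)
pvals = normality_pvalues(synthetic)
print(pvals)
```

With the real data, `normality_pvalues(df.temperature)` would give both p-values side by side. A large p-value only means normality is not rejected, not that it is proven.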

## Is the true population mean really 98.6 degrees F?

H0 - population mean is 98.6, HA - population mean is not 98.6

In [5]:
SampleMean = df.temperature.mean()
SE = df.temperature.std() / np.sqrt(len(df))  # standard error of the mean
zScore = (SampleMean - 98.6) / SE  # using z-score because sample size > 30
p_value = stats.norm.sf(abs(zScore)) * 2  # using two-sided test
p_value

4.9021570141133797e-08

### We can reject H0. The true population mean does not equal 98.6 degrees F
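The exercise also asks how a t-test would differ from the z-test. A minimal sketch of that comparison follows, assuming the same test statistic is referred once to the standard normal and once to Student's t with n - 1 degrees of freedom. The helper `one_sample_tests` is a hypothetical name and the data are synthetic, not the notebook's CSV.

```python
import numpy as np
from scipy import stats


def one_sample_tests(sample, mu0):
    """Return (statistic, two-sided z p-value, two-sided t p-value)."""
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    # ddof=1 gives the sample standard deviation, matching pandas' .std().
    se = sample.std(ddof=1) / np.sqrt(n)
    stat = (sample.mean() - mu0) / se
    p_z = 2 * stats.norm.sf(abs(stat))       # normal reference distribution
    p_t = 2 * stats.t.sf(abs(stat), df=n - 1)  # t reference distribution
    return stat, p_z, p_t


# Synthetic stand-in data (assumed values, NOT the real temperature column).
rng = np.random.default_rng(0)
synthetic = rng.normal(loc=98.2, scale=0.7, size=130)
stat, p_z, p_t = one_sample_tests(synthetic, 98.6)
print(stat, p_z, p_t)
```

Because the t distribution has heavier tails, its p-value is never smaller than the z-test's for the same statistic. For samples well above 30 observations the two are usually close, so the conclusion rarely changes.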

## At what temperature should we consider someone's temperature to be "abnormal"?

In [6]:
critical_value = stats.norm.ppf(1 - 0.05 / 2)  # two-sided, 95% confidence
MI = critical_value * SE  # margin of error
CI = (SampleMean - MI, SampleMean + MI)
CI

(98.123196428181657, 98.375265110279898)

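### Based on this 95% confidence interval for the mean, we can consider results below 98.1 or above 98.4 abnormal

Note that this interval measures uncertainty in the population mean rather than the spread of individual readings, so it is a narrow threshold for judging a single person's temperature.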
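To make the distinction between the mean's confidence interval and the spread of individual readings concrete, here is a rough sketch that computes both. It assumes approximately normal data, and taking mean ± z·(standard deviation) as the range for individuals is one common convention, not the only one. The helper name is hypothetical and the data are synthetic.

```python
import numpy as np
from scipy import stats


def mean_ci_and_individual_range(sample, confidence=0.95):
    """Return (CI for the mean, approximate range covering individual values)."""
    sample = np.asarray(sample, dtype=float)
    mean = sample.mean()
    sd = sample.std(ddof=1)  # sample standard deviation
    z = stats.norm.ppf(1 - (1 - confidence) / 2)  # two-sided critical value
    margin = z * sd / np.sqrt(sample.size)  # margin of error for the mean
    ci_mean = (mean - margin, mean + margin)
    # Range expected to cover about `confidence` of individuals (normal assumption).
    individual = (mean - z * sd, mean + z * sd)
    return ci_mean, individual


# Synthetic stand-in data (assumed values, NOT the real temperature column).
rng = np.random.default_rng(0)
synthetic = rng.normal(loc=98.2, scale=0.7, size=130)
ci_mean, individual = mean_ci_and_individual_range(synthetic)
print(ci_mean, individual)
```

The individual range is wider than the confidence interval by a factor of the square root of the sample size, which is why the choice of interval changes what counts as "abnormal".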

## Is there a significant difference between males and females in normal temperature?

H0 - meanMale = meanFemale, HA - meanMale != meanFemale

In [7]:
male = df[df.gender == 'M'].temperature
female = df[df.gender == 'F'].temperature
meanMale = male.mean()
varMale = male.var()
meanFemale = female.mean()
varFemale = female.var()
# standard error of the difference between the two sample means
SE = (varMale / len(male) + varFemale / len(female)) ** 0.5

In [8]:
zScore = (meanMale - meanFemale) / SE  # using z-score because sample size > 30
p_value = stats.norm.sf(abs(zScore)) * 2  # using two-sided test
p_value

0.02228736076067726

### We can reject H0. Males and females have different mean temperatures at the 0.05 significance level
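The two-sample z-test above can be cross-checked against Welch's t-test, which uses the same unpooled standard error but a t reference distribution. The sketch below assumes SciPy's `ttest_ind` with `equal_var=False`. The helper `two_sample_z_and_welch` is a hypothetical name, and the two groups are synthetic draws with arbitrary parameters, not the notebook's male and female samples.

```python
import numpy as np
from scipy import stats


def two_sample_z_and_welch(a, b):
    """Return (z statistic, z p-value, Welch t statistic, Welch p-value)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Unpooled standard error of the difference in means.
    se = np.sqrt(a.var(ddof=1) / a.size + b.var(ddof=1) / b.size)
    z = (a.mean() - b.mean()) / se
    p_z = 2 * stats.norm.sf(abs(z))
    # Welch's t-test: same statistic, t distribution with estimated df.
    t_stat, p_welch = stats.ttest_ind(a, b, equal_var=False)
    return z, p_z, t_stat, p_welch


# Synthetic stand-in groups (assumed values, NOT the real data).
rng = np.random.default_rng(0)
group_a = rng.normal(loc=98.1, scale=0.7, size=65)
group_b = rng.normal(loc=98.4, scale=0.7, size=65)
z, p_z, t_stat, p_welch = two_sample_z_and_welch(group_a, group_b)
print(z, p_z, t_stat, p_welch)
```

With the real data, the call would be `two_sample_z_and_welch(male, female)` on the two temperature series. If the z-test and Welch p-values fall on opposite sides of 0.05, the conclusion deserves more caution.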