## What is the true normal human body temperature? 

#### Background

The mean normal body temperature was held to be 37$^{\circ}$C or 98.6$^{\circ}$F for more than 120 years since it was first conceptualized and reported by Carl Wunderlich in a famous 1868 book. In 1992, this value was revised to 36.8$^{\circ}$C or 98.2$^{\circ}$F. 

#### Exercise
In this exercise, you will analyze a dataset of human body temperatures and employ the concepts of hypothesis testing, confidence intervals, and statistical significance.

Answer the following questions **in this notebook below and submit to your Github account**. 

1.  Is the distribution of body temperatures normal? 
    - Remember that this is a condition for the CLT, and hence the statistical tests we are using, to apply. 
2.  Is the true population mean really 98.6 degrees F?
    - Bring out the one-sample hypothesis test! In this situation, is it appropriate to apply a z-test or a t-test? How will the result be different?
3.  At what temperature should we consider someone's temperature to be "abnormal"?
    - Start by computing the margin of error and confidence interval.
4.  Is there a significant difference between males and females in normal temperature?
    - Set up and solve a two-sample hypothesis test.

You can include written notes in notebook cells using Markdown: 
   - In the control panel at the top, choose Cell > Cell Type > Markdown
   - Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet

#### Resources

+ Information and data sources: http://www.amstat.org/publications/jse/datasets/normtemp.txt, http://www.amstat.org/publications/jse/jse_data_archive.htm
+ Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet

****

In [1]:
import pandas as pd

In [2]:
df = pd.read_csv('data/human_body_temperature.csv')

In [3]:
df.describe()

| | temperature | heart_rate |
|---|---|---|
| count | 130.0 | 130.0 |
| mean | 98.249231 | 73.761538 |
| std | 0.733183 | 7.062077 |
| min | 96.3 | 57.0 |
| 25% | 97.8 | 69.0 |
| 50% | 98.3 | 74.0 |
| 75% | 98.7 | 79.0 |
| max | 100.8 | 89.0 |


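To test whether the distribution is normal, we can use `scipy.stats.normaltest`. This is the D'Agostino-Pearson omnibus test, which combines skewness and kurtosis into a statistic that follows a chi-square distribution under the null hypothesis of normality.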

In [14]:
import scipy.stats as stats
import numpy as np

In [8]:
k2,pval=stats.normaltest(df.temperature) # k2 is the omnibus statistic (not a z-score)

In [9]:
pval

0.2587479863488254

Since the p-value (about 0.26) is larger than 0.05, we cannot reject the null hypothesis that the data come from a normal distribution.
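As a cross-check on the normality conclusion, a second test can be run alongside `normaltest`. The sketch below is only illustrative: `normality_report` is a hypothetical helper, and `fake_temps` is seeded synthetic data standing in for `df.temperature` (the real readings are not reproduced here). It pairs the D'Agostino-Pearson test with the Shapiro-Wilk test from `scipy.stats`.

```python
import numpy as np
from scipy import stats


def normality_report(sample, alpha=0.05):
    """Run two normality tests and flag whether either rejects normality."""
    sample = np.asarray(sample, dtype=float)
    _, p_omnibus = stats.normaltest(sample)  # D'Agostino-Pearson (skewness + kurtosis)
    _, p_shapiro = stats.shapiro(sample)     # Shapiro-Wilk
    return {
        "omnibus_p": float(p_omnibus),
        "shapiro_p": float(p_shapiro),
        "reject_normality": bool(p_omnibus < alpha or p_shapiro < alpha),
    }


# Synthetic stand-in for df.temperature (assumed values, not the real data)
rng = np.random.default_rng(0)
fake_temps = rng.normal(loc=98.25, scale=0.73, size=130)
report = normality_report(fake_temps)
print(report)

# A clearly non-normal sample for contrast
skewed_report = normality_report(rng.exponential(size=500))
print(skewed_report)
```

With the real data, the same helper could be called as `normality_report(df.temperature)`.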

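Since n > 30, we can use a z-test: we first calculate the standard error of the mean and then the z-score. Because the population standard deviation is unknown and is estimated from the sample, a t-test is strictly the more appropriate choice, but with n = 130 the two give nearly the same result.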

In [15]:
stde=stats.sem(df.temperature) # standard error of the mean 
z=(98.6-np.mean(df.temperature))/stde

In [16]:
z

5.4548232923640771

With |z| of about 5.45, far beyond the critical value of 1.96 at the 5% level, we can safely reject the null hypothesis. The mean temperature is therefore different from 98.6$^{\circ}$F.
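To see how the z-test and t-test differ here, the sketch below recomputes the test statistic from the summary statistics in the `df.describe()` output and converts it to a two-sided p-value under each reference distribution. The variable names are illustrative; the sign is negative because it uses the conventional (sample mean minus hypothesized mean) ordering, whereas the cell above subtracts in the opposite order.

```python
import math
from scipy import stats

# Summary statistics copied from the df.describe() output
n = 130
sample_mean = 98.249231
sample_std = 0.733183   # pandas reports the sample standard deviation (ddof=1)
mu0 = 98.6              # hypothesized population mean

sem = sample_std / math.sqrt(n)        # standard error of the mean
test_stat = (sample_mean - mu0) / sem  # same number for z and t tests

# Only the reference distribution differs between the two tests
p_z = 2 * stats.norm.sf(abs(test_stat))         # two-sided, standard normal
p_t = 2 * stats.t.sf(abs(test_stat), df=n - 1)  # two-sided, Student t, n-1 degrees of freedom

print(f"statistic = {test_stat:.4f}")
print(f"p-value (z-test) = {p_z:.2e}")
print(f"p-value (t-test) = {p_t:.2e}")
```

The t distribution has heavier tails, so its p-value is slightly larger, but both are far below 0.05 and lead to the same decision.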

In [17]:
np.mean(df.temperature)

98.24923076923078

In [19]:

stde


0.064304416837891024

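The 95% confidence interval for the mean is approximately mean ± 2 stde. Here that gives 98.25 ± 2 × 0.064, or roughly 98.12 to 98.38. This interval describes the uncertainty in the population mean, not the spread of individual readings.
To decide whether one person's temperature is abnormal, the range should be based on the sample standard deviation instead: 98.25 ± 2 × 0.73. Any temperature below about 96.8 or above about 99.7 can therefore be considered abnormal.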
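An interval built from the standard error describes uncertainty in the mean, while a range for individual readings uses the standard deviation. The sketch below computes both from the `df.describe()` summary statistics, assuming the readings are approximately normal and using the exact 95% critical value rather than the rounded factor of 2. The names `ci_mean` and `normal_range` are illustrative.

```python
import math
from scipy import stats

# Summary statistics copied from the df.describe() output
n = 130
sample_mean = 98.249231
sample_std = 0.733183

sem = sample_std / math.sqrt(n)
z_crit = stats.norm.ppf(0.975)  # two-sided 95% critical value, about 1.96

# Margin of error and confidence interval for the population mean
margin_of_error = z_crit * sem
ci_mean = (sample_mean - margin_of_error, sample_mean + margin_of_error)

# Approximate range covering 95% of individual readings (normality assumed)
normal_range = (sample_mean - z_crit * sample_std,
                sample_mean + z_crit * sample_std)

print(f"margin of error for the mean: {margin_of_error:.3f}")
print(f"95% CI for the mean: {ci_mean[0]:.2f} to {ci_mean[1]:.2f}")
print(f"95% range for individuals: {normal_range[0]:.2f} to {normal_range[1]:.2f}")
```

The second range is the one that answers the question about an individual's "abnormal" temperature; the first only says how precisely the mean is known.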

In [20]:
df.head()

| | temperature | gender | heart_rate |
|---|---|---|---|
| 0 | 99.3 | F | 68.0 |
| 1 | 98.4 | F | 81.0 |
| 2 | 97.8 | M | 73.0 |
| 3 | 99.2 | F | 66.0 |
| 4 | 98.0 | F | 73.0 |


In [28]:
df2=df[df.gender=='F']
df3=df[df.gender=='M']

In [30]:
stats.ttest_ind(df2.temperature,df3.temperature,equal_var=False)

Ttest_indResult(statistic=2.2854345381656112, pvalue=0.023938264182934196)
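To make explicit what `equal_var=False` does in the call above, the sketch below computes Welch's t statistic and its Welch-Satterthwaite degrees of freedom by hand and compares them with `stats.ttest_ind`. The helper `welch_t` and the two seeded synthetic groups are assumptions for illustration only; they are not the female and male subsets of the dataset.

```python
import numpy as np
from scipy import stats


def welch_t(a, b):
    """Welch's two-sample t-test (unequal variances), two-sided."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Squared standard error of each group mean
    va = a.var(ddof=1) / a.size
    vb = b.var(ddof=1) / b.size
    t_stat = (a.mean() - b.mean()) / np.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom
    dof = (va + vb) ** 2 / (va ** 2 / (a.size - 1) + vb ** 2 / (b.size - 1))
    p_value = 2 * stats.t.sf(abs(t_stat), dof)
    return t_stat, dof, p_value


# Hypothetical groups with made-up parameters, for illustration only
rng = np.random.default_rng(1)
group_a = rng.normal(loc=98.4, scale=0.8, size=60)
group_b = rng.normal(loc=98.1, scale=0.6, size=70)

t_manual, dof, p_manual = welch_t(group_a, group_b)
t_scipy, p_scipy = stats.ttest_ind(group_a, group_b, equal_var=False)

print(f"manual: t = {t_manual:.4f}, dof = {dof:.1f}, p = {p_manual:.4f}")
print(f"scipy:  t = {t_scipy:.4f}, p = {p_scipy:.4f}")
```

On the real data, the same helper could be called as `welch_t(df2.temperature, df3.temperature)`.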

Since the p-value (about 0.024) is smaller than 0.05, we can reject the null hypothesis that the two distributions have the same mean; therefore men and women have signi