## What is the true normal human body temperature? 

#### Background

The mean normal body temperature was held to be 37$^{\circ}$C or 98.6$^{\circ}$F for more than 120 years since it was first conceptualized and reported by Carl Wunderlich in a famous 1868 book. In 1992, this value was revised to 36.8$^{\circ}$C or 98.2$^{\circ}$F. 

#### Exercise
In this exercise, you will analyze a dataset of human body temperatures and employ the concepts of hypothesis testing, confidence intervals, and statistical significance.

Answer the following questions **in this notebook below and submit to your GitHub account**. 

1.  Is the distribution of body temperatures normal? 
    - Remember that this is a condition for the CLT, and hence the statistical tests we are using, to apply. 
2.  Is the true population mean really 98.6 degrees F?
    - Bring out the one-sample hypothesis test! In this situation, is it appropriate to apply a z-test or a t-test? How will the result be different?
3.  At what temperature should we consider someone's temperature to be "abnormal"?
    - Start by computing the margin of error and confidence interval.
4.  Is there a significant difference between males and females in normal temperature?
    - Set up and solve a two-sample hypothesis test.

You can include written notes in notebook cells using Markdown: 
   - In the control panel at the top, choose Cell > Cell Type > Markdown
   - Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet

#### Resources

+ Information and data sources: http://www.amstat.org/publications/jse/datasets/normtemp.txt, http://www.amstat.org/publications/jse/jse_data_archive.htm
+ Markdown syntax: http://nestacms.com/docs/creating-content/markdown-cheat-sheet

****

In [43]:
import pandas as pd
import numpy as np
from scipy import stats
from scipy.stats import kurtosis, skew, normaltest, ttest_1samp, ttest_ind


In [44]:
df = pd.read_csv('data/human_body_temperature.csv')
df.head(5)


|   | temperature | gender | heart_rate |
|---|-------------|--------|------------|
| 0 | 99.3        | F      | 68         |
| 1 | 98.4        | F      | 81         |
| 2 | 97.8        | M      | 73         |
| 3 | 99.2        | F      | 66         |
| 4 | 98.0        | F      | 73         |


## Is the distribution of body temperatures normal?

In [45]:
# For a normal distribution, Pearson kurtosis (fisher=False) is 3 and skew is 0
k = kurtosis(df['temperature'], axis=0, fisher=False, bias=True)
s = skew(df['temperature'], axis=0, bias=True)
print('Kurtosis is ', k)
print('Skew is ', s)

# D'Agostino-Pearson omnibus test; the null hypothesis is that the sample is normal
z, p = normaltest(df.temperature)

print('P values ', p)

Kurtosis is  3.7049597854114693
Skew is  -0.004367976879081625
P values  0.258747986349


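The p-value (about 0.26) is greater than 0.05, so we fail to reject the null hypothesis that the data come from a normal distribution. The skew is close to 0, and the kurtosis (about 3.7) is somewhat above the value of 3 expected for a normal distribution, but not by enough for the test to reject normality. From this we can treat the distribution as approximately normal.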
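As a cross-check on the reasoning above, the following is a minimal sketch that bundles skew, kurtosis, and two normality tests into one helper. The function name `normality_summary` and the synthetic sample are assumptions made for illustration (the synthetic values stand in for `df['temperature']` and are not the real data); the Shapiro-Wilk test is an additional test that the original analysis does not use.

```python
import numpy as np
from scipy import stats

def normality_summary(sample, alpha=0.05):
    """Hypothetical helper: skew, Pearson kurtosis and two normality-test p-values."""
    sample = np.asarray(sample, dtype=float)
    _, p_dagostino = stats.normaltest(sample)  # D'Agostino-Pearson omnibus test
    _, p_shapiro = stats.shapiro(sample)       # Shapiro-Wilk test (not in the original analysis)
    return {
        "skew": float(stats.skew(sample)),
        "kurtosis": float(stats.kurtosis(sample, fisher=False)),
        "p_dagostino": float(p_dagostino),
        "p_shapiro": float(p_shapiro),
        # Normality is rejected if either test falls below alpha
        "reject_normality": bool(min(p_dagostino, p_shapiro) < alpha),
    }

# Synthetic stand-in for df['temperature'] (assumed values, not the real data)
rng = np.random.default_rng(0)
synthetic_temps = rng.normal(loc=98.25, scale=0.73, size=130)
summary = normality_summary(synthetic_temps)
print(summary)
```

With the real data, `normality_summary(df['temperature'])` would be the equivalent call. A p-value above `alpha` only means that normality is not rejected; it does not prove the data are normal.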


## Is the true population mean really 98.6 degrees F?

Null hypothesis: the population mean is equal to 98.6.

Alternative hypothesis: the population mean is not equal to 98.6.
 


In [52]:
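from statsmodels.stats.weightstats import ztest

# One-sample z-test against 98.6; keep the result instead of discarding it
z_statistic, z_pvalue = ztest(df.temperature, value=98.6)

# One-sample t-test against 98.6
statistic, pvalue = ttest_1samp(df.temperature, 98.6)

statistic, pvalue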


(-5.4548232923645195, 2.4106320415561276e-07)

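 In both cases the p-value is far below 0.05 (about 2.4e-07 for the t-test), so we reject the null hypothesis: the data do not support a population mean of 98.6 degrees F. The t-test is the appropriate choice here because the population standard deviation is unknown and has to be estimated from the sample.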
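To make the difference between the two tests concrete, here is a minimal sketch that computes the test statistic by hand and then looks up the p-value under both the normal and the t distribution. The helper name `one_sample_z_and_t` and the synthetic sample are assumptions for illustration, not the real data. It assumes the z-test uses the sample standard deviation (`ddof=1`), which is also the default in statsmodels' `ztest`, so both tests share the same statistic and differ only in the reference distribution.

```python
import numpy as np
from scipy import stats

def one_sample_z_and_t(sample, popmean):
    """Hypothetical helper: shared test statistic plus z-based and t-based p-values."""
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    standard_error = sample.std(ddof=1) / np.sqrt(n)
    statistic = (sample.mean() - popmean) / standard_error
    p_z = 2 * stats.norm.sf(abs(statistic))         # two-sided, normal reference
    p_t = 2 * stats.t.sf(abs(statistic), df=n - 1)  # two-sided, t reference
    return statistic, p_z, p_t

# Synthetic stand-in for df.temperature (assumed values, not the real data)
rng = np.random.default_rng(0)
synthetic_temps = rng.normal(loc=98.25, scale=0.73, size=130)
statistic, p_z, p_t = one_sample_z_and_t(synthetic_temps, 98.6)
print(statistic, p_z, p_t)
```

Because the t distribution has heavier tails than the normal, the t-based p-value is never smaller than the z-based one; the gap shrinks as the sample grows.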
 

## At what temperature should we consider someone's temperature to be "abnormal"?

In [53]:
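mean = np.mean(df.temperature)
# Standard error of the mean (ddof=0 uses the population formula for the standard deviation)
sem = stats.sem(df.temperature, axis=None, ddof=0)

# 95% confidence interval for the mean: mean +/- 1.96 * standard error
lower_bound = mean - 1.96 * sem
upper_bound = mean + 1.96 * sem

lower_bound, upper_bound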

(98.123679804428193, 98.374781734033363)

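By this calculation, any temperature below 98.12 or above 98.37 degrees F would be considered abnormal. Note that this is the 95% confidence interval for the population mean, so it describes the uncertainty in the average rather than the spread of individual readings. A range for individual temperatures would use the standard deviation instead of the standard error and would be considerably wider.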
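The distinction between an interval for the mean and a range for individual readings can be sketched as follows. The helper name `mean_ci_and_reference_range` and the synthetic sample are assumptions for illustration (not the real data), and the sketch uses `ddof=1` for the standard deviation, whereas the cell above uses `ddof=0`. The individual range assumes the readings are approximately normal.

```python
import numpy as np
from scipy import stats

def mean_ci_and_reference_range(sample, confidence=0.95):
    """Hypothetical helper: CI for the mean and a normal-theory range for individuals."""
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    mean = sample.mean()
    std = sample.std(ddof=1)
    z = stats.norm.ppf(0.5 + confidence / 2)  # about 1.96 for 95%
    margin_mean = z * std / np.sqrt(n)        # margin of error for the mean
    margin_individual = z * std               # spread of individual readings
    ci_mean = (mean - margin_mean, mean + margin_mean)
    reference_range = (mean - margin_individual, mean + margin_individual)
    return ci_mean, reference_range

# Synthetic stand-in for df.temperature (assumed values, not the real data)
rng = np.random.default_rng(0)
synthetic_temps = rng.normal(loc=98.25, scale=0.73, size=130)
ci_mean, reference_range = mean_ci_and_reference_range(synthetic_temps)
print(ci_mean, reference_range)
```

The reference range is wider than the confidence interval by a factor of the square root of the sample size, which is why flagging individuals with the interval for the mean would label most healthy readings as abnormal.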

## Is there a significant difference between males and females in normal temperature?

Null hypothesis: there is no difference in mean normal temperature between males and females.

Alternative hypothesis: there is a difference in mean normal temperature between males and females.


In [54]:
temp_females = df.temperature[df.gender == 'F']
temp_males = df.temperature[df.gender == 'M']
# Two-sample t-test assuming equal variances (the ttest_ind default)
t_statistic, pvalue = ttest_ind(temp_females, temp_males)
# Critical t value from a t table for a two-tailed test at alpha = 0.05 (0.025 in each tail)
t_critical = 1.984

t_statistic, t_critical, pvalue
 



(2.2854345381656103, 1.984, 0.023931883122395609)

 The two-tailed p-value is 0.0239, which is below 0.05, and the t statistic (2.285) exceeds the critical value (1.984).
 By conventional criteria, this difference is considered to be statistically significant,
 so we can reject the null hypothesis.
 Hence there is a significant difference in mean temperature between men and women.
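Rather than reading the critical value from a table, it can be computed from the t distribution. The sketch below does that and also runs Welch's t-test, which drops the equal-variance assumption. The helper name `two_sample_summary`, the group sizes, and the synthetic samples are assumptions for illustration, not the real data, and Welch's test is an addition that the original analysis does not use.

```python
import numpy as np
from scipy import stats

def two_sample_summary(a, b, alpha=0.05):
    """Hypothetical helper: pooled and Welch t-tests plus a computed critical value."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    t_statistic, p_pooled = stats.ttest_ind(a, b)         # assumes equal variances
    _, p_welch = stats.ttest_ind(a, b, equal_var=False)   # Welch's t-test
    dof = a.size + b.size - 2                             # pooled degrees of freedom
    t_critical = stats.t.ppf(1 - alpha / 2, dof)          # two-tailed critical value
    return {
        "t_statistic": float(t_statistic),
        "t_critical": float(t_critical),
        "p_pooled": float(p_pooled),
        "p_welch": float(p_welch),
    }

# Synthetic stand-ins for the female and male samples (assumed values, not the real data)
rng = np.random.default_rng(0)
synthetic_females = rng.normal(loc=98.4, scale=0.74, size=65)
synthetic_males = rng.normal(loc=98.1, scale=0.70, size=65)
result = two_sample_summary(synthetic_females, synthetic_males)
print(result)
```

Comparing the absolute t statistic against the critical value and comparing the pooled p-value against `alpha` are two views of the same decision, so they always agree.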
 