
###**Body Temperature**
---
Normal human body temperature, as kids are taught in North America, is 98.6°F. But how well is this supported by data? In a real study, researchers would obtain body-temperature measurements from randomly chosen healthy people. Here we simulate those measurements instead, using a random number generator in Python.

<img src="https://hips.hearstapps.com/hmg-prod/images/face-thermograpy-carotid-royalty-free-image-1578608252.jpg?resize=640:*" width="400" height="300">


**Image taken from**: [Popular-Mechanics](https://www.popularmechanics.com/science/a30459459/body-temperature-decrease/)

**Establish the null and alternative hypotheses**

**H0**: The mean human body temperature is 98.6°F

**Ha**: The mean human body temperature is different from 98.6°F  
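
In symbols, writing $\mu$ for the population mean body temperature in °F (the same $\mu$ notation used for the population mean later in this notebook), the two hypotheses above describe a two-sided test:

\begin{align}
        H_0 &: \mu = 98.6 \\
        H_a &: \mu \neq 98.6
    \end{align}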

###**Libraries**
---

In [None]:
from scipy.stats import norm
from scipy.stats import t
import pandas as pd
import numpy as np

###**1. Simulate Data**
---

In [None]:
# Simulate data
np.random.seed(42)  # For reproducibility
n_sample = 25  # Number of simulated records
mean_simulated = 98.524  # Mean of the simulated population (°F)
sd_simulated = 0.678  # Standard deviation of the simulated population (°F)
DoF = n_sample - 1  # Degrees of freedom for the one-sample t-test
Body_Temp = np.random.normal(mean_simulated, sd_simulated, n_sample)  # Draw n_sample values from a normal distribution
df = pd.DataFrame({"Body_Temperature": Body_Temp})  # Store the records in a DataFrame
print(df.head())

   Body_Temperature
0         98.860772
1         98.430257
2         98.963133
3         99.556614
4         98.365244


###**2. Calculate the Sample Mean and Standard Deviation**
---

In [None]:
sample_mean = df["Body_Temperature"].mean() # Observed sample mean
sample_sd = df["Body_Temperature"].std() # Observed sample SD
print(f"Sample Mean: {sample_mean:.4f}")
print(f"Sample Standard Deviation: {sample_sd:.4f}")

Sample Mean: 98.4131
Sample Standard Deviation: 0.6485
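
The pandas `.std()` method divides by n − 1 by default (`ddof=1`), which is the sample standard deviation that the t-test needs, whereas `np.std` divides by n unless told otherwise. The sketch below illustrates the difference. It regenerates the same simulated data so that it runs on its own, and the names `sd_pandas`, `sd_numpy_default`, and `sd_numpy_sample` are illustrative, not part of the notebook.

```python
import numpy as np
import pandas as pd

# Recreate the simulated data with the same seed and parameters as above
np.random.seed(42)
body_temp = np.random.normal(98.524, 0.678, 25)
series = pd.Series(body_temp)

sd_pandas = series.std()                      # pandas default: ddof=1 (divide by n - 1)
sd_numpy_default = np.std(body_temp)          # NumPy default: ddof=0 (divide by n)
sd_numpy_sample = np.std(body_temp, ddof=1)   # NumPy with ddof=1 matches pandas

print(f"pandas .std():      {sd_pandas:.4f}")
print(f"np.std (ddof=0):    {sd_numpy_default:.4f}")
print(f"np.std (ddof=1):    {sd_numpy_sample:.4f}")
```

If the data were held in a NumPy array rather than a DataFrame, passing `ddof=1` explicitly would be needed to get the same standard error.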


###**3. Define the Claimed Mean**
---

In [None]:
mu_claimed = 98.6  # Hypothesized mean °F

###**4. Test Hypothesis employing the one-sample t-test**
---

\begin{align}
        t_{stat} = \frac{\bar{Y} - \mu}{SE_{\bar{Y}}}
    \end{align}
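where $\bar{Y}$ is the sample mean, $\mu$ is the population mean claimed under **H0**, and $SE_{\bar{Y}}$ is the standard error of the sample mean:

\begin{align}
        {SE_{\bar{Y}}} = \frac{S}{\sqrt{n}}
    \end{align}

Here $S$ is the sample standard deviation and $n$ is the sample size. If the data are normally distributed and **H0** is true, $t_{stat}$ follows a t-distribution with $n - 1$ degrees of freedom.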

In [None]:
# Calculate the t-statistic
SE = sample_sd/(n_sample**0.5) # Standard error of the sample mean
t_stat = (sample_mean - mu_claimed) / SE

# Two-sided p-value: probability in both tails beyond |t_stat|
p_value = 2*(t.cdf(-abs(t_stat), DoF))
print(f"t-stat: {t_stat:.2f}, p-value: {p_value:.4f}")
print("------------------------------------")
alpha = 0.05 # Significance level
if p_value > alpha:
  print("We fail to reject the null hypothesis \n(i.e. the data are consistent with a mean human body temperature of 98.6 °F)")
else:
  print("We reject the null hypothesis \n(i.e. the mean human body temperature is different from 98.6 °F)")


t-stat: -1.44, p-value: 0.1626
------------------------------------
We fail to reject the null hypothesis 
(i.e. the data are consistent with a mean human body temperature of 98.6 °F)
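
As a cross-check on the manual calculation, SciPy provides `scipy.stats.ttest_1samp`, which runs the same two-sided one-sample t-test in a single call. The sketch below regenerates the simulated data so that it is self-contained; the names `t_manual`, `p_manual`, `t_scipy`, and `p_scipy` are illustrative.

```python
import numpy as np
from scipy import stats

# Recreate the simulated data with the same seed and parameters as above
np.random.seed(42)
body_temp = np.random.normal(98.524, 0.678, 25)
mu_claimed = 98.6  # Hypothesized mean (°F)

# Manual calculation, following the formulas in this section
se = body_temp.std(ddof=1) / np.sqrt(body_temp.size)
t_manual = (body_temp.mean() - mu_claimed) / se
p_manual = 2 * stats.t.cdf(-abs(t_manual), df=body_temp.size - 1)

# Built-in one-sample t-test (two-sided by default)
result = stats.ttest_1samp(body_temp, popmean=mu_claimed)
t_scipy, p_scipy = result.statistic, result.pvalue

print(f"manual: t = {t_manual:.4f}, p = {p_manual:.4f}")
print(f"scipy:  t = {t_scipy:.4f}, p = {p_scipy:.4f}")
```

The two rows should agree up to floating-point rounding, since both compute the same statistic with n − 1 degrees of freedom.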


###**5. The 95% Confidence Interval for the mean**
---
A **95%** confidence interval is a range of values computed from the sample by a procedure that, over many repeated samples, produces intervals containing the true population mean **95%** of the time.

In [None]:
# Critical value for a two-sided 95% interval: the 97.5th percentile
# of the t-distribution with DoF degrees of freedom
t_critic_value = t.ppf(0.975, DoF)
lower_bound = sample_mean - t_critic_value*SE
upper_bound = sample_mean + t_critic_value*SE
print(f"The 95% confidence interval of the population mean is: \n{lower_bound:.4f} < μ < {upper_bound:.4f}")

The 95% confidence interval of the population mean is: 
98.1454 < μ < 98.6808
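
The same interval can also be obtained with the `interval` method of `scipy.stats.t`, which takes the confidence level, the degrees of freedom, and the centre and scale of the interval. The sketch below compares it with the manual bounds; it regenerates the simulated data so that it runs on its own, and the names `manual_lower`, `manual_upper`, `lower`, and `upper` are illustrative.

```python
import numpy as np
from scipy import stats

# Recreate the simulated data with the same seed and parameters as above
np.random.seed(42)
body_temp = np.random.normal(98.524, 0.678, 25)

dof = body_temp.size - 1          # Degrees of freedom
sample_mean = body_temp.mean()
se = stats.sem(body_temp)         # Standard error of the mean (ddof=1 by default)

# Manual bounds using the 97.5th percentile of the t-distribution
t_crit = stats.t.ppf(0.975, dof)
manual_lower = sample_mean - t_crit * se
manual_upper = sample_mean + t_crit * se

# Built-in interval: confidence level first, then degrees of freedom
lower, upper = stats.t.interval(0.95, dof, loc=sample_mean, scale=se)

print(f"manual: {manual_lower:.4f} < mu < {manual_upper:.4f}")
print(f"scipy:  {lower:.4f} < mu < {upper:.4f}")
```

A two-sided 95% interval and a two-sided t-test at α = 0.05 always agree: the hypothesized mean lies inside the interval exactly when the test fails to reject the null hypothesis.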
