The t-test is one of the most commonly used tests in inferential statistics. It is typically used to check whether there is a significant difference between the means of two groups or, in its one-sample form, between a sample mean and a hypothesized value. For example, say we have a dataset containing the height of each student in a class. We want to check whether the average height is 176 cm or not:

- Population: All students in that class
- Parameter of interest: μ, the mean height of the students in the class
- Null hypothesis: The average height is μ = 176
- Alternative hypothesis: μ ≠ 176 (two-sided, which is what ttest_1samp tests by default)
- Significance level: α = 0.05
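
For reference, the statistic behind a one-sample t-test can be written as follows. This is the standard textbook form, not something stated in the notebook itself; the symbols x̄ (sample mean), s (sample standard deviation with an n − 1 denominator), n (sample size), and μ₀ (the mean assumed under the null hypothesis) are introduced here for illustration.

```latex
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}, \qquad \text{df} = n - 1
```

Under the null hypothesis, t follows a Student's t-distribution with n − 1 degrees of freedom, and the P-value is the probability of seeing a value at least as extreme as the observed one.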

In [17]:
# import libraries needed
from scipy.stats import ttest_1samp
import numpy as np
import pandas as pd

In [11]:
# create dataset
height = np.array([172, 184, 174, 168, 174, 183, 173, 173, 184, 179, 171, 173, 181, 183, 172, 178, 170, 182, 181, 172, 175, 170, 168, 178, 170, 181, 180, 173, 183, 180, 177, 181, 171, 173, 171, 182, 180, 170, 172, 175, 178, 174, 184, 177, 181, 180, 178, 179, 175, 170, 182, 176, 183, 179, 177])

In [18]:
print(height)
print(type(height))

[172 184 174 168 174 183 173 173 184 179 171 173 181 183 172 178 170 182
 181 172 175 170 168 178 170 181 180 173 183 180 177 181 171 173 171 182
 180 170 172 175 178 174 184 177 181 180 178 179 175 170 182 176 183 179
 177]
<class 'numpy.ndarray'>


In [20]:
df_height = pd.DataFrame(height)

In [12]:
# calculate mean
height_average = np.mean(height)

In [13]:
print("Average height is = {0:.3f}".format(height_average))

Average height is = 176.545
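
To see what the library call in the next cell does with this mean, here is a minimal sketch that computes the t-statistic and two-sided P-value by hand. It assumes a hypothesized mean of 176 (the value passed to ttest_1samp in the test cell); the names mu0, t_manual, and p_manual are illustrative and not part of the original notebook.

```python
import numpy as np
from scipy import stats

# Same height data as in the notebook (55 students, in cm)
height = np.array([172, 184, 174, 168, 174, 183, 173, 173, 184, 179, 171,
                   173, 181, 183, 172, 178, 170, 182, 181, 172, 175, 170,
                   168, 178, 170, 181, 180, 173, 183, 180, 177, 181, 171,
                   173, 171, 182, 180, 170, 172, 175, 178, 174, 184, 177,
                   181, 180, 178, 179, 175, 170, 182, 176, 183, 179, 177])

mu0 = 176  # assumed hypothesized mean, matching the ttest_1samp call

n = height.size
sample_mean = height.mean()
sample_std = height.std(ddof=1)       # sample standard deviation (n - 1 denominator)
std_error = sample_std / np.sqrt(n)   # standard error of the mean

# t-statistic: distance of the sample mean from mu0 in standard-error units
t_manual = (sample_mean - mu0) / std_error

# Two-sided P-value from the t-distribution with n - 1 degrees of freedom
p_manual = 2 * stats.t.sf(abs(t_manual), df=n - 1)

print("t = {:.4f}, P-value = {:.4f}".format(t_manual, p_manual))
```

The values computed this way should agree with what ttest_1samp returns for the same data and hypothesized mean.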


In [15]:
# use a one-sample t-test to compute the t-statistic and P-value
tstat, pval = ttest_1samp(height, 176)

print("P-value = {}".format(pval))

if pval < 0.05:
    print("We are rejecting the null hypothesis.")
else:
    print("We are failing to reject the null hypothesis.")

P-value = 0.3988118409084731
We are failing to reject the null hypothesis.


Our significance level is alpha = 0.05 and the computed P-value is about 0.399. Since the P-value is greater than alpha, we fail to reject the null hypothesis. This does not prove that the average height is exactly 176 cm; it means that, at the 5% significance level, the sample does not provide enough evidence that the average height differs from 176 cm.
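
Two related checks can make this conclusion more concrete. The sketch below computes a 95% confidence interval for the mean height and, separately, a one-sided P-value for the alternative μ > 176. It assumes SciPy 1.6 or later, where ttest_1samp accepts the alternative keyword; the variable names (mu0, ci_low, ci_high, p_greater) are illustrative and not from the original notebook.

```python
import numpy as np
from scipy import stats

# Same height data as in the notebook (55 students, in cm)
height = np.array([172, 184, 174, 168, 174, 183, 173, 173, 184, 179, 171,
                   173, 181, 183, 172, 178, 170, 182, 181, 172, 175, 170,
                   168, 178, 170, 181, 180, 173, 183, 180, 177, 181, 171,
                   173, 171, 182, 180, 170, 172, 175, 178, 174, 184, 177,
                   181, 180, 178, 179, 175, 170, 182, 176, 183, 179, 177])

mu0 = 176     # hypothesized mean used in the test above
alpha = 0.05  # significance level

n = height.size
mean = height.mean()
std_error = height.std(ddof=1) / np.sqrt(n)

# Two-sided 95% confidence interval for the mean, based on the t-distribution
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)
ci_low = mean - t_crit * std_error
ci_high = mean + t_crit * std_error
print("95% CI for the mean: ({:.2f}, {:.2f})".format(ci_low, ci_high))

# Default two-sided test (alternative: mu != mu0)
t_stat, p_two_sided = stats.ttest_1samp(height, mu0)

# One-sided test (alternative: mu > mu0); requires SciPy >= 1.6
_, p_greater = stats.ttest_1samp(height, mu0, alternative="greater")
print("two-sided P-value = {:.4f}".format(p_two_sided))
print("one-sided P-value = {:.4f}".format(p_greater))
```

A hypothesized mean that falls inside the 95% interval is one the two-sided test would not reject at alpha = 0.05, which is the same information as the P-value comparison expressed on the scale of the data.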

In [22]:
df_height.head()

     0
0  172
1  184
2  174
3  168
4  174


In [23]:
df_height.to_csv("heights.csv", index=False)
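
As a quick sanity check on the export, the file can be read back and compared with the original array. This is a minimal sketch that uses only the first five heights and writes to a temporary directory so it does not overwrite anything; csv_path, df_loaded, and restored are illustrative names. Note that because the DataFrame was created without column names, the single column is labelled 0, which read_csv returns as the string "0".

```python
import os
import tempfile

import numpy as np
import pandas as pd

# First five heights only, to keep the example short
height = np.array([172, 184, 174, 168, 174])
df_height = pd.DataFrame(height)

# Write to a temporary location (hypothetical path for this sketch)
csv_path = os.path.join(tempfile.mkdtemp(), "heights.csv")
df_height.to_csv(csv_path, index=False)

# Read the file back; the header row "0" becomes a string column name
df_loaded = pd.read_csv(csv_path)
restored = df_loaded["0"].to_numpy()

print(df_loaded.columns.tolist())
print(np.array_equal(restored, height))
```

Passing a column name when building the DataFrame, for example pd.DataFrame(height, columns=["height"]), would give the CSV a more descriptive header.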