# T-TEST

A t-test is a statistical test used to compare means. The two most common types for comparing two sets of measurements are:
- Independent T-Test: Used to compare the means of two separate, unrelated groups. For example, you might want to know if there's a difference in test scores between male and female students.
- Paired T-Test: Used to compare the means of the same group measured at two different times or under two different conditions. For instance, you might want to determine whether patients' test scores after a particular therapy differ from their scores before it.
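
The difference between the two variants can be seen on a small example. The sketch below uses invented scores (`group_a` and `group_b` are hypothetical, not taken from any dataset in this notebook) and runs both `stats.ttest_ind` and `stats.ttest_rel` on the same numbers. It only illustrates that the choice of test changes the result, so the test should match how the data were collected.

```python
import numpy as np
from scipy import stats

# Hypothetical scores, invented for illustration only.
group_a = np.array([72, 85, 90, 66, 78, 81])
group_b = np.array([75, 88, 94, 70, 80, 86])

# Independent t-test: treats the two arrays as unrelated samples.
t_ind, p_ind = stats.ttest_ind(group_a, group_b)

# Paired t-test: treats element i of each array as the same subject measured twice.
t_rel, p_rel = stats.ttest_rel(group_a, group_b)

print(f"independent: t={t_ind:.3f}, p={p_ind:.4f}")
print(f"paired:      t={t_rel:.3f}, p={p_rel:.4f}")
```

With these numbers the paired test yields a much smaller p-value, because the pairwise differences are very consistent even though the scores vary widely within each group.
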

When conducting a t-test, you have two hypotheses:
- Null Hypothesis (H0): This states that there's no difference between the means being compared.
- Alternative Hypothesis (H1): This states that there is a difference between the means being compared.

The output of a t-test provides two main components:
- T-Statistic: This measures how large the difference between the two means is relative to the standard error of that difference.
- P-Value: This is the probability of observing a t-statistic at least as extreme as the one computed, assuming the null hypothesis is true. The smaller the p-value, the stronger the evidence against the null hypothesis.
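
To make the two outputs concrete, here is a minimal sketch that computes a paired t-statistic and its two-sided p-value by hand and compares them with `stats.ttest_rel`. The `before` and `after` values are invented for illustration. For a paired test, the t-statistic is the mean of the pairwise differences divided by its standard error, with `n - 1` degrees of freedom.

```python
import numpy as np
from scipy import stats

# Hypothetical paired measurements, invented for illustration only.
before = np.array([150, 162, 148, 155, 160, 151], dtype=float)
after = np.array([144, 158, 149, 150, 152, 147], dtype=float)

diff = before - after
n = diff.size

# Standard error of the mean difference (sample standard deviation, ddof=1).
se = diff.std(ddof=1) / np.sqrt(n)

# t-statistic: mean difference expressed in standard-error units.
t_manual = diff.mean() / se

# Two-sided p-value from the t distribution with n - 1 degrees of freedom.
p_manual = 2 * stats.t.sf(abs(t_manual), df=n - 1)

# The library result should agree with the manual calculation.
t_scipy, p_scipy = stats.ttest_rel(before, after)
print(np.isclose(t_manual, t_scipy), np.isclose(p_manual, p_scipy))
```
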

In t-test analysis, you set a significance level beforehand (usually 0.05) as the threshold for deciding whether the result is statistically significant. If the p-value is less than the chosen threshold, you reject the null hypothesis and conclude that there's a statistically significant difference between the means being compared. If the p-value is greater than or equal to the threshold, you fail to reject the null hypothesis: there isn't enough evidence to claim a difference, which is not the same as showing that no difference exists.
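
The decision rule above can be sketched as a small helper. The function name `interpret_p_value` and its `alpha` parameter are hypothetical, introduced here only for illustration; they are not part of SciPy.

```python
def interpret_p_value(pval: float, alpha: float = 0.05) -> str:
    """Return the test decision for a p-value at significance level alpha."""
    if pval < alpha:
        return "reject null hypothesis"
    # A large p-value means insufficient evidence, not proof of no difference.
    return "fail to reject null hypothesis"


print(interpret_p_value(0.003))  # reject null hypothesis
print(interpret_p_value(0.20))   # fail to reject null hypothesis
```
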


### Example from the data about Blood Pressure

In [1]:
import pandas as pd
from scipy import stats
from statsmodels.stats import weightstats as stests

In [2]:
df = pd.read_csv('blood_pressure.csv')

In [3]:
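# Summary statistics for the two columns being compared
# (not displayed: a notebook cell shows only its last expression).
df[['bp_before', 'bp_after']].describe()
# First five rows of the dataset.
df.head(5)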

| | patient | sex | agegrp | bp_before | bp_after |
|---|---|---|---|---|---|
| 0 | 1 | Male | 30-45 | 143 | 153 |
| 1 | 2 | Male | 30-45 | 163 | 170 |
| 2 | 3 | Male | 30-45 | 153 | 168 |
| 3 | 4 | Male | 30-45 | 153 | 142 |
| 4 | 5 | Male | 30-45 | 146 | 141 |


In [4]:
df.shape

(120, 5)

In [5]:
# Paired t-test: each patient has one measurement before and one after.
ttest, pval = stats.ttest_rel(df['bp_before'], df['bp_after'])
print(pval)

0.0011297914644840823


In [6]:
# Compare the p-value against the 0.05 significance level.
if pval < 0.05:
    print("reject null hypothesis")
else:
    print("fail to reject null hypothesis")

reject null hypothesis


_It can be concluded that there is sufficient evidence that the patients' mean blood pressure differs between the before and after measurements (p ≈ 0.0011)._

### Another example from the data about Fish Market

In [7]:
df = pd.read_csv('Fish.csv')

In [8]:
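# Summary statistics for the two columns being compared
# (not displayed: a notebook cell shows only its last expression).
df[['Length1', 'Length2']].describe()
# First five rows of the dataset.
df.head(5)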

| | Species | Weight | Length1 | Length2 | Length3 | Height | Width |
|---|---|---|---|---|---|---|---|
| 0 | Bream | 242.0 | 23.2 | 25.4 | 30.0 | 11.52 | 4.02 |
| 1 | Bream | 290.0 | 24.0 | 26.3 | 31.2 | 12.48 | 4.3056 |
| 2 | Bream | 340.0 | 23.9 | 26.5 | 31.1 | 12.3778 | 4.6961 |
| 3 | Bream | 363.0 | 26.3 | 29.0 | 33.5 | 12.73 | 4.4555 |
| 4 | Bream | 430.0 | 26.5 | 29.0 | 34.0 | 12.444 | 5.134 |


In [9]:
df.shape

(159, 7)

In [10]:
# Paired t-test: Length1 and Length2 are two measurements of the same fish.
ttest, pval = stats.ttest_rel(df['Length1'], df['Length2'])
print(pval)

8.98060011577915e-76


In [11]:
# Compare the p-value against the 0.05 significance level.
if pval < 0.05:
    print("reject null hypothesis")
else:
    print("fail to reject null hypothesis")

reject null hypothesis


_It can be concluded that there is sufficient evidence that the mean of Length1 differs from the mean of Length2 (p ≈ 8.98e-76)._