<a href="https://www.kaggle.com/code/hassaneskikri/t-tests-statistical-tests?scriptVersionId=168274997" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

In [1]:
%%html
<style>
    *{
        font-family: 'Arial', sans-serif;
        align-items: center;
        justify-content: center;
        max-width : 1000px;
    }
    h1{
        color: #FFD700;
        border-bottom: 3px solid #FFD700;
        text-align:center;
        padding-bottom: 0.3em;
        font-weight: bold;
    }
    h2{
        color:#2dd4bf;
        padding-bottom: 0.3em;
    }
    p, ol, ul {
        font-size: 18px;
        line-height: 1.5;
        color: #eee;
    }
    a {
        color: #d946ef;
        text-decoration: none;
    }
    a:hover {
        text-decoration: underline;
        color : #86198f;
    }
    img{
        display: flex;
        margin-left: auto;
        margin-right: auto;
        width: 700px;
        height: auto;
        text-align: center;
        border-radius: 15px;
    }
    
</style>


A `t-test` is a statistical test used to compare the means of two groups, or the mean of one group against a reference value. It is often used in hypothesis testing to determine whether a process or treatment actually has an effect on the population of interest, or whether two groups differ from one another.

# Types of t-Tests

There are three main types of t-tests, each suited for different testing scenarios:

- `Independent Samples t-Test:` Compares the means between two independent or unrelated groups on the same variable. For example, comparing the average heights of men vs. women.

- `Paired Samples t-Test (Dependent t-Test):` Compares the means from the same group at different times or under two different conditions. For example, measuring students' test scores before and after a training session.

- `One-Sample t-Test:` Tests the mean of a single group against a known mean. For example, comparing the average intelligence quotient (IQ) of a sample to the population mean.
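
All three tests share the same shape: a difference in means divided by its standard error. As a reference sketch in standard textbook notation (these symbols are introduced here and are not defined elsewhere in this notebook), let $\bar{x}$ be a sample mean, $s$ a sample standard deviation, $n$ a sample size, $\mu_0$ the reference mean, and $\bar{d}$, $s_d$ the mean and standard deviation of the paired differences.

One-sample t-test, with $n - 1$ degrees of freedom:

$$
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}
$$

Paired samples t-test, with $n - 1$ degrees of freedom, where $n$ is the number of pairs:

$$
t = \frac{\bar{d}}{s_d / \sqrt{n}}
$$

Independent samples t-test with pooled variance (this form assumes equal variances), with $n_1 + n_2 - 2$ degrees of freedom:

$$
t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}},
\qquad
s_p^2 = \frac{(n_1 - 1)\, s_1^2 + (n_2 - 1)\, s_2^2}{n_1 + n_2 - 2}
$$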

![image.png](attachment:fe6a2852-a580-4588-97b4-5497dd86d7ee.png)

# When to Use a t-Test

The choice to use a t-test depends on the research design and the nature of the data. Here are some criteria:

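- Sample Size: Especially useful for small samples (fewer than about 30 observations), where the population standard deviation is unknown and must be estimated from the sample. The test remains valid for larger samples, where its results approach those of a z-test.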
- Data Type: Used with continuous data (e.g., height, weight).
- Distribution: Assumes that the data follows a normal distribution, though the t-test is relatively robust to violations of this assumption when the sample size is large.
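- Variance: The standard independent samples t-test assumes homogeneity of variance between the two groups, which can be checked with Levene’s test for equality of variances. When the variances differ, Welch's t-test is the usual alternative.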
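
The distribution and variance criteria above can be checked in code before running the test. The sketch below is one possible workflow, not a prescribed one: the two groups are made-up simulated data, the 0.05 threshold is an assumed convention, and `group_a` / `group_b` are hypothetical names. It uses `scipy.stats.shapiro` for normality and `scipy.stats.levene` for equal variances, then switches to Welch's t-test through `equal_var=False` when Levene's test rejects equal variances.

```python
import numpy as np
from scipy import stats

# Hypothetical data: two simulated groups with different spreads
rng = np.random.default_rng(42)  # arbitrary seed, for reproducibility only
group_a = rng.normal(loc=50, scale=5, size=25)
group_b = rng.normal(loc=53, scale=12, size=25)

# Shapiro-Wilk test: the null hypothesis is that the sample is normally distributed
shapiro_stat_a, shapiro_p_a = stats.shapiro(group_a)
shapiro_stat_b, shapiro_p_b = stats.shapiro(group_b)

# Levene's test: the null hypothesis is that both groups have equal variances
levene_stat, levene_p = stats.levene(group_a, group_b)

# Assumed decision rule: keep the pooled-variance test only if Levene's test
# does not reject equal variances at the 0.05 level; otherwise use Welch's test
equal_var = bool(levene_p >= 0.05)
t_stat, p_value = stats.ttest_ind(group_a, group_b, equal_var=equal_var)

print(f"Shapiro p-values: A = {shapiro_p_a:.3f}, B = {shapiro_p_b:.3f}")
print(f"Levene p-value: {levene_p:.3f} -> equal_var = {equal_var}")
print(f"t-statistic: {t_stat:.3f}, p-value: {p_value:.3f}")
```

Some practitioners skip the preliminary variance test and use Welch's t-test by default, since it behaves well whether or not the variances are equal.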

![image.png](attachment:4bd0d8ed-b33f-44c8-b700-51e0f68f89d7.png)

# Implementation in Python

### Independent Samples t-Test

Imagine we're comparing the average heights of men and women. We'll create two arrays of heights, assuming they follow a normal distribution, with men generally being taller.

In [2]:
import numpy as np
from scipy import stats

# Fix the seed so the simulated samples are reproducible
np.random.seed(0)
men_heights = np.random.normal(175, 7, 30)  # Mean = 175 cm, SD = 7 cm, n = 30
women_heights = np.random.normal(168, 6, 30)  # Mean = 168 cm, SD = 6 cm, n = 30

# Two-sided independent samples t-test (assumes equal variances by default)
t_stat, p_value = stats.ttest_ind(men_heights, women_heights)

print(f"Independent t-test results -- t-statistic: {t_stat}, p-value: {p_value}")


Independent t-test results -- t-statistic: 6.85647405609917, p-value: 5.104108367265407e-09


**Result:** The t-statistic is 6.856, with a p-value of approximately 5.10e-09.

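**Conclusion:** Because the p-value is far below the conventional 0.05 threshold, there is a statistically significant difference in average heights between men and women. The positive t-statistic indicates that the men in this sample are taller on average.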
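
To see where the t-statistic comes from, it can be recomputed by hand. The sketch below regenerates the same simulated heights, applies the pooled-variance formula (which is what `stats.ttest_ind` uses with its default `equal_var=True`), and compares the result with SciPy. The names `pooled_var`, `t_manual`, and `p_manual` are introduced here for illustration only.

```python
import numpy as np
from scipy import stats

# Same simulated data as in the cell above
np.random.seed(0)
men_heights = np.random.normal(175, 7, 30)
women_heights = np.random.normal(168, 6, 30)

n1, n2 = len(men_heights), len(women_heights)

# Sample variances (ddof=1 gives the unbiased estimator)
var1 = men_heights.var(ddof=1)
var2 = women_heights.var(ddof=1)

# Pooled variance and the standard error of the difference in means
pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
std_error = np.sqrt(pooled_var * (1 / n1 + 1 / n2))

# t-statistic and two-sided p-value from the t distribution
t_manual = (men_heights.mean() - women_heights.mean()) / std_error
dof = n1 + n2 - 2
p_manual = 2 * stats.t.sf(abs(t_manual), dof)

# Compare with SciPy's result
t_scipy, p_scipy = stats.ttest_ind(men_heights, women_heights)
print(f"Manual: t = {t_manual}, p = {p_manual}")
print(f"SciPy:  t = {t_scipy}, p = {p_scipy}")
print("Match:", np.isclose(t_manual, t_scipy) and np.isclose(p_manual, p_scipy))
```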

### Paired Samples t-Test

Let's consider measuring students' test scores before and after a training session to see whether the session had an effect.

In [3]:

# Fix the seed so the simulated scores are reproducible
np.random.seed(1)
before_scores = np.random.normal(70, 10, 30)  # Mean score = 70, SD = 10, n = 30
after_scores = before_scores + np.random.normal(5, 2, 30)  # Each student gains about 5 points (SD = 2)

# Paired t-test on the same students; a positive t means scores went up
t_stat, p_value = stats.ttest_rel(after_scores, before_scores)

print(f"Paired t-test results -- t-statistic: {t_stat}, p-value: {p_value}")


Paired t-test results -- t-statistic: 16.715970130327225, p-value: 1.9912548942679014e-16


**Result:** The t-statistic is 16.716, with a p-value of approximately 1.99e-16.

**Conclusion:** Because the p-value is far below the conventional 0.05 threshold, the increase in students' test scores after the training session is statistically significant.
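
A paired t-test is equivalent to a one-sample t-test on the per-student differences, tested against a mean of 0. The sketch below illustrates this with the same simulated scores; `differences`, `t_paired`, and `t_one_sample` are names introduced here for illustration.

```python
import numpy as np
from scipy import stats

# Same simulated data as in the cell above
np.random.seed(1)
before_scores = np.random.normal(70, 10, 30)
after_scores = before_scores + np.random.normal(5, 2, 30)

# Per-student change in score
differences = after_scores - before_scores

# Paired t-test on the two sets of scores
t_paired, p_paired = stats.ttest_rel(after_scores, before_scores)

# One-sample t-test asking whether the mean difference is 0
t_one_sample, p_one_sample = stats.ttest_1samp(differences, 0.0)

print(f"Paired:     t = {t_paired}, p = {p_paired}")
print(f"One-sample: t = {t_one_sample}, p = {p_one_sample}")
print("Match:", np.isclose(t_paired, t_one_sample))
```

This is why the paired test is more sensitive than an independent test on the same data: the large between-student variation cancels out in the differences.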

### One-Sample t-Test
Suppose we want to compare the average IQ of a sample group against the known population mean of 100.

In [4]:

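# Fix the seed so the simulated sample is reproducible
np.random.seed(2)
sample_iq = np.random.normal(102, 15, 30)  # Mean = 102, SD = 15, n = 30

population_mean = 100  # Known population mean to test against

# Two-sided one-sample t-test of the sample mean against the population mean
t_stat, p_value = stats.ttest_1samp(sample_iq, population_mean)

print(f"One-sample t-test results -- t-statistic: {t_stat}, p-value: {p_value}")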


One-sample t-test results -- t-statistic: -1.1199029398662517, p-value: 0.2719448206483876


**Result:** The t-statistic is -1.120, with a p-value of approximately 0.272.

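**Conclusion:** Because the p-value exceeds the conventional 0.05 threshold, we fail to reject the null hypothesis: there is no statistically significant difference between the sample group's average IQ and the known population mean IQ of 100.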
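
A confidence interval gives a complementary view of the same result. The sketch below recomputes the one-sample t-statistic by hand and builds a 95% confidence interval for the sample mean. The 95% level is an assumed convention, and `std_error`, `t_crit`, `ci_low`, and `ci_high` are names introduced here for illustration. For a two-sided test, the interval contains the population mean exactly when the p-value is at least 0.05.

```python
import numpy as np
from scipy import stats

# Same simulated data as in the cell above
np.random.seed(2)
sample_iq = np.random.normal(102, 15, 30)
population_mean = 100

n = len(sample_iq)
sample_mean = sample_iq.mean()

# Standard error of the mean, using the sample standard deviation (ddof=1)
std_error = sample_iq.std(ddof=1) / np.sqrt(n)

# Manual t-statistic
t_manual = (sample_mean - population_mean) / std_error

# 95% confidence interval from the t distribution with n - 1 degrees of freedom
t_crit = stats.t.ppf(0.975, df=n - 1)
ci_low = sample_mean - t_crit * std_error
ci_high = sample_mean + t_crit * std_error

t_scipy, p_scipy = stats.ttest_1samp(sample_iq, population_mean)

print(f"Manual t = {t_manual}, SciPy t = {t_scipy}")
print(f"Sample mean = {sample_mean:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
print("CI contains population mean:", ci_low <= population_mean <= ci_high)
```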

# Resources

- [t-test](https://www.scribbr.com/statistics/t-test/)

- [Z-statistic vs t-statistics](https://www.youtube.com/watch?v=DEkPZv5ppHI&t=100s&ab_channel=AceTutors)