## Two-Sample t-test

The two-sample t-test is a statistical hypothesis test used to determine if two independent samples have different means. It assesses whether the difference in the means of two groups is statistically significant or if it could have occurred by chance.

### How it Works:
- Null Hypothesis ($H_0$): The population means of the two groups are equal.

- Alternative Hypothesis ($H_a$): The population means of the two groups differ.

- Assumptions:

    - Each group follows a normal distribution.
    - The variances of the two groups are equal (for the pooled, equal-variance Student's t-test; when the variances are unequal, Welch's t-test is used instead).
    - The samples are independent.

- Test Statistic: The test statistic (t-statistic) is calculated based on the difference in sample means adjusted for sample sizes and standard deviations.

- Decision: For a two-tailed test, if the absolute value of the calculated t-statistic is larger than the critical value from the t-distribution (based on the chosen significance level, usually 0.05, and the degrees of freedom), we reject the null hypothesis and conclude that there is a significant difference between the means of the two groups. Otherwise, we fail to reject the null hypothesis.
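The steps above can be sketched with `scipy.stats.ttest_ind`. This is a minimal illustration, not a prescribed workflow: the two groups are hypothetical data drawn from normal distributions, and the means, spreads, and sample sizes are assumptions chosen only for demonstration.

```python
import numpy as np
from scipy import stats

# Hypothetical data: two independent groups drawn from normal distributions.
rng = np.random.default_rng(seed=0)
group_a = rng.normal(loc=50.0, scale=5.0, size=30)
group_b = rng.normal(loc=53.0, scale=5.0, size=35)

# Pooled (equal-variance) Student's t-test.
pooled = stats.ttest_ind(group_a, group_b, equal_var=True)

# Welch's t-test, which does not assume equal variances.
welch = stats.ttest_ind(group_a, group_b, equal_var=False)

alpha = 0.05
for name, result in [("pooled", pooled), ("Welch", welch)]:
    decision = "reject H0" if result.pvalue < alpha else "fail to reject H0"
    print(f"{name}: t = {result.statistic:.3f}, p = {result.pvalue:.3f} -> {decision}")
```

Comparing the p-value with the significance level is equivalent to comparing the absolute t-statistic with the two-tailed critical value.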

### Use Cases:

- #### Medical Research:

    - Example: Testing the effectiveness of a new drug by comparing the average recovery time of patients who receive the drug versus those who receive a placebo.
    - Use: Researchers use a two-sample t-test to determine if there is a statistically significant difference in recovery times between the two groups.

- #### Market Research:

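    - Example: Comparing the average sales of a product before and after a marketing campaign, using independent samples for the two periods (if the same stores are measured twice, the samples are not independent and a paired t-test is the appropriate choice).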
    - Use: Marketers use a two-sample t-test to assess whether the marketing campaign has had a significant impact on sales.

- #### Education:

    - Example: Comparing the exam scores of students who used different study methods (e.g., traditional lectures vs. online tutorials).
    - Use: Educators use a two-sample t-test to evaluate which study method leads to better academic performance.

- #### Quality Control:

    - Example: Comparing the mean strength of two different materials used in manufacturing.
    - Use: Engineers use a two-sample t-test to determine if there is a significant difference in the strength characteristics between the two materials.

Sample 1:

    Sample size n1 = 40
    Sample mean weight x1 = 300
    Sample standard deviation s1 = 18.5

Sample 2:

    Sample size n2 = 38
    Sample mean weight x2 = 305
    Sample standard deviation s2 = 16.7


Calculate the test statistic and p-value.


$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_p\sqrt{1/n_1 + 1/n_2}}$$

where $\bar{x}_1$ and $\bar{x}_2$ are the sample means and $s_p$ is the pooled standard deviation:

$$s_p = \sqrt{\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1 + n_2 - 2}}$$

In [None]:
import numpy as np

# 1. Calculate the pooled standard deviation s_p
s_p = np.sqrt(((40 - 1) * 18.5**2 + (38 - 1) * 16.7**2) / (40 + 38 - 2))

In [None]:
t = (300 - 305)/(s_p * np.sqrt(1/40 + 1/38))

- Degrees of freedom: $\text{df} = n_1 + n_2 - 2$

In [None]:
from scipy import stats

# Lower-tail critical value of the t-distribution at the 0.05 significance level (about -1.665)
t_critical = stats.t.ppf(0.05, 40 + 38 - 2)

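$$ t_{0.05, 76} = -t_{0.95, 76} \approx -1.665 $$

Because $t \approx -1.25$ is greater than $-1.665$ (equivalently, $|t| < t_{0.95, 76}$), we fail to reject the null hypothesis at the 0.05 significance level.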
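As a cross-check, the same statistic and a p-value can be obtained directly from the summary statistics with `scipy.stats.ttest_ind_from_stats`. This is a sketch that assumes the equal-variance (pooled) test; the lower-tail p-value is computed separately from the t-distribution CDF.

```python
from scipy import stats

# Summary statistics from the example above.
result = stats.ttest_ind_from_stats(
    mean1=300, std1=18.5, nobs1=40,
    mean2=305, std2=16.7, nobs2=38,
    equal_var=True,  # pooled-variance test, matching the s_p formula
)
t_stat = result.statistic
p_two_sided = result.pvalue

# One-sided (lower-tail) p-value from the t-distribution with n1 + n2 - 2 degrees of freedom.
dof = 40 + 38 - 2
p_lower = stats.t.cdf(t_stat, dof)

print(f"t = {t_stat:.4f}")
print(f"two-sided p-value = {p_two_sided:.4f}")
print(f"lower-tail p-value = {p_lower:.4f}")
```

Both p-values exceed 0.05, which agrees with the critical-value comparison.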

The following two-sample t-test was generated for the AUTO83B.DAT data set. The data set contains miles per gallon for U.S. cars (sample 1) and for Japanese cars (sample 2); the summary statistics for each sample are shown below.

    SAMPLE 1:
        NUMBER OF OBSERVATIONS      = 249
        MEAN                        =  20.14458
        STANDARD DEVIATION          =   6.41470
        STANDARD ERROR OF THE MEAN  =   0.40652
  
    SAMPLE 2:
        NUMBER OF OBSERVATIONS      = 79
        MEAN                        = 30.48101
        STANDARD DEVIATION          =  6.10771
        STANDARD ERROR OF THE MEAN  =  0.68717

We are testing the hypothesis that the population means are equal for the two samples. 

We assume that the variances for the two samples are equal.

H0:  μ1 = μ2

Ha:  μ1 ≠ μ2

In [None]:
def run_sp(s_1, n_1, s_2, n_2):
    
    s_p = np.sqrt(((n_1-1) * s_1**2 + (n_2-1) * s_2**2)/(n_1 + n_2 - 2))
    
    return s_p
    
s_p = run_sp(6.4147, 249, 6.10771, 79)
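The remaining steps for this data set can be sketched as follows: compute the t-statistic from the pooled standard deviation, then compare it with the two-tailed critical value. The significance level of 0.05 is an assumption carried over from the earlier section; the variable names are illustrative.

```python
import numpy as np
from scipy import stats

# Summary statistics for the two samples (U.S. cars, Japanese cars).
n_1, mean_1, s_1 = 249, 20.14458, 6.41470
n_2, mean_2, s_2 = 79, 30.48101, 6.10771

# Pooled standard deviation and t-statistic under the equal-variance assumption.
s_p = np.sqrt(((n_1 - 1) * s_1**2 + (n_2 - 1) * s_2**2) / (n_1 + n_2 - 2))
t_stat = (mean_1 - mean_2) / (s_p * np.sqrt(1 / n_1 + 1 / n_2))

# Two-tailed critical value at alpha = 0.05 (assumed significance level).
alpha = 0.05
dof = n_1 + n_2 - 2
t_critical = stats.t.ppf(1 - alpha / 2, dof)

reject_null = abs(t_stat) > t_critical
print(f"s_p = {s_p:.4f}, t = {t_stat:.4f}, critical = {t_critical:.4f}")
print("reject H0" if reject_null else "fail to reject H0")
```

Here the absolute t-statistic is far larger than the critical value, so the hypothesis of equal population means is rejected.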

## Binomial Distribution Null Hypothesis Test

### Example Scenario
Suppose you are studying a new medication that is claimed to have a 70% success rate in treating a certain condition. You want to test whether this claim is true.

### Null Hypothesis ($H_0$)
The null hypothesis will state that the true probability of success $p$
is equal to the claimed probability. In this case, the null hypothesis is:
$$H_0: p=0.7$$

### Alternative Hypothesis ($H_a$)
The alternative hypothesis could be that the true probability of success is different from the claimed probability. This could be a two-tailed test (not equal to 0.70) or a one-tailed test (greater than or less than 0.70). Here, we'll consider a two-tailed test:
$$H_a: p\neq0.7$$

Suppose you conduct 100 trials and observe 65 successes.


In [None]:
n = 100
p_measured = 65/100
p_0 = 0.7

# Standard deviation of the sample proportion under the null hypothesis (uses p_0)
standard_deviation = np.sqrt(p_0 * (1 - p_0) / n)

In [None]:
z_score = (p_measured - p_0)/standard_deviation

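For a 95% confidence level (two-tailed test), the critical z-values are approximately $\pm1.96$. Here $z \approx -1.09$, which lies between $-1.96$ and $1.96$, so we fail to reject the null hypothesis.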
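A p-value gives the same conclusion. The sketch below computes the two-tailed p-value from the normal approximation and, for comparison, an exact binomial test with `scipy.stats.binomtest` (available in SciPy 1.7 and later). Using the exact test is an addition for comparison, not part of the calculation above.

```python
import numpy as np
from scipy import stats

n, successes, p_0 = 100, 65, 0.7
p_measured = successes / n

# Normal approximation: standard error under H0 and the resulting z-score.
standard_error = np.sqrt(p_0 * (1 - p_0) / n)
z_score = (p_measured - p_0) / standard_error
p_value_normal = 2 * stats.norm.sf(abs(z_score))

# Exact two-sided binomial test for comparison.
exact = stats.binomtest(successes, n, p_0, alternative="two-sided")

print(f"z = {z_score:.4f}, normal-approximation p-value = {p_value_normal:.4f}")
print(f"exact binomial p-value = {exact.pvalue:.4f}")
```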

### Example Scenario

We want to determine whether two coins have the same probability of landing heads by flipping each coin several times and counting the number of heads.

### Null Hypothesis ($H_0$)

The two coins have the same probability of landing heads.

$$H_0: p_1 = p_2$$

### Alternative Hypothesis ($H_a$)

The two coins do not have the same probability of landing heads.

$$H_a: p_1 \neq p_2$$


### Standard Error


$$
\text{SE} = \sqrt{\hat{p} \left(1 - \hat{p}\right) \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}
$$

where:
- $\hat{p}_1$ is the sample proportion of heads for the first coin,
- $\hat{p}_2$ is the sample proportion of heads for the second coin,
- $n_1$ is the number of flips for the first coin,
- $n_2$ is the number of flips for the second coin,
- $\hat{p}$ is the pooled proportion, calculated as:

$$
\hat{p} = \frac{X_1 + X_2}{n_1 + n_2}
$$

with $X_1$ and $X_2$ being the number of heads observed for the first and second coin, respectively.


Here $p_1$ and $p_2$ are the probabilities of landing heads for Coin 1 and Coin 2, respectively.

### z-score

$$z = \frac{(\hat{p}_1 - \hat{p}_2)}{\text{SE}}$$


- Coin 1: n = 100, heads: 55
- Coin 2: n = 100, heads: 50

In [None]:
p_1 = 55/100
p_2 = 50/100

pool_probability = (55 + 50)/(100 + 100)

standard_deviation = np.sqrt(pool_probability * (1-pool_probability) * (1/100 + 1/100))

z = (p_1 - p_2)/standard_deviation
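To finish the test, the z-score can be compared with the two-tailed critical value or converted to a p-value. This is a sketch that assumes a significance level of 0.05, as in the previous example; the variable names are illustrative.

```python
import numpy as np
from scipy import stats

heads_1, flips_1 = 55, 100
heads_2, flips_2 = 50, 100

p_hat_1 = heads_1 / flips_1
p_hat_2 = heads_2 / flips_2

# Pooled proportion and standard error under H0: p_1 = p_2.
p_pooled = (heads_1 + heads_2) / (flips_1 + flips_2)
standard_error = np.sqrt(p_pooled * (1 - p_pooled) * (1 / flips_1 + 1 / flips_2))

z = (p_hat_1 - p_hat_2) / standard_error
p_value = 2 * stats.norm.sf(abs(z))   # two-tailed p-value
z_critical = stats.norm.ppf(0.975)    # about 1.96 for alpha = 0.05

reject_null = abs(z) > z_critical
print(f"z = {z:.4f}, p-value = {p_value:.4f}")
print("reject H0" if reject_null else "fail to reject H0")
```

With these counts the z-score is well inside the critical values, so the data do not show that the coins differ.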

## Chi-Squared Test

The chi-squared test is used to determine if there is a significant association between two categorical variables. It is often used for:

- Goodness-of-Fit Test: To determine if a sample matches a population with a specific distribution.
- Test of Independence: To determine if there is an association between two categorical variables in a contingency table.
    


### Example: Are gender and color preference independent?

|  | Blue | Green | Pink |
|----------|----------|----------|----------|
| Boy | 100 | 150 | 20 |
| Girl | 20 | 30 | 180 |


### Null Hypothesis ($H_0$)

For the population of elementary school students, gender and favorite colors are not related.

### Alternative Hypothesis ($H_a$)

For the population of elementary school students, gender and favorite colors are related.

### Set Alpha

${\alpha}=0.05$

### Degrees of Freedom

$\text{df} = (n_{\text{row}}-1)(n_{\text{col}}-1) = (2-1)(3-1) = 2$

### Decision Rule

$\chi^2_{0.05, 2} = 5.99$

If ${\chi}^2 > 5.99$, reject the null hypothesis.
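The critical value can be looked up with `scipy.stats.chi2.ppf` rather than a table. This is a short sketch using the significance level and table dimensions stated above.

```python
from scipy import stats

alpha = 0.05
n_row, n_col = 2, 3  # two genders, three colors

# Degrees of freedom for a test of independence.
dof = (n_row - 1) * (n_col - 1)

# Upper-tail critical value: reject H0 when the statistic exceeds it.
chi2_critical = stats.chi2.ppf(1 - alpha, dof)
print(f"df = {dof}, critical value = {chi2_critical:.2f}")
```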

The ${\chi}^2$-value formula is given by:

$$
\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i}
$$


where:
- $O_i$ is the observed frequency for the $i$-th category,
- $E_i$ is the expected frequency for the $i$-th category.

|  | Blue | Green | Pink | Total |
|----------|----------|----------|----------|-------|
| Boy | 100 | 150 | 20 | 270 |
| Girl | 20 | 30 | 180 | 230 |
| Total | 120 | 180 | 200 | 500 |

- Each expected frequency is the row total times the column total divided by the grand total. For example, $E_i$ for Boy & Blue is 120 * 270 / 500 = 64.8.

### Can we perform the calculation with Python?

In [None]:
import pandas as pd

df = pd.DataFrame(data=[[100, 150, 20], [20, 30, 180]], columns=['Blue', "Green", "Pink"], index=['Boy', 'Girl'])

In [None]:
df

In [None]:
df.sum(axis=0)

In [None]:
df.sum(axis=1)

In [None]:
color_dict = df.sum(axis=0).to_dict()
gender_dict = df.sum(axis=1).to_dict()

total_sample = df.sum().sum()

In [None]:
total_sample

In [None]:
chi2 = 0

for idx, row in df.iterrows():
    print(f"*****{idx}*****")
    for c in df.columns:
        E = color_dict[c] * gender_dict[idx]/total_sample
        print(f"{c}: {E}")
        chi2 += (df.loc[idx, c] - E)**2/E

In [None]:
chi2
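The same calculation can be written without explicit loops. The sketch below is an alternative formulation using NumPy's outer product to build the table of expected frequencies; it is meant to illustrate the formula, not to replace the loop above.

```python
import numpy as np

observed = np.array([[100, 150, 20],
                     [20, 30, 180]])

row_totals = observed.sum(axis=1)   # totals per gender
col_totals = observed.sum(axis=0)   # totals per color
grand_total = observed.sum()

# Expected frequency of each cell: row total * column total / grand total.
expected = np.outer(row_totals, col_totals) / grand_total

# Chi-squared statistic: sum of (O - E)^2 / E over all cells.
chi2_stat = ((observed - expected) ** 2 / expected).sum()

print(expected)
print(f"chi-squared = {chi2_stat:.2f}")
```

The statistic is far above the critical value of 5.99, so the null hypothesis of independence is rejected.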

### It can be even faster with library `scipy`

In [None]:
from scipy.stats import chi2_contingency

In [None]:
chi2_contingency([[100, 150, 20], [20, 30, 180]])

In [None]:
output = chi2_contingency([[100, 150, 20], [20, 30, 180]])

In [None]:
output.statistic

In [None]:
output.pvalue
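The result of `chi2_contingency` also carries the degrees of freedom and the expected frequencies, and it can be unpacked like a tuple. A minimal sketch of reading the result and applying the decision rule, assuming the significance level of 0.05 set earlier:

```python
from scipy.stats import chi2_contingency

observed = [[100, 150, 20], [20, 30, 180]]

# The result unpacks into statistic, p-value, degrees of freedom, expected frequencies.
stat, p_value, dof, expected = chi2_contingency(observed)

alpha = 0.05
print(f"chi-squared = {stat:.2f}, df = {dof}, p-value = {p_value:.3g}")
print(expected)
print("reject H0" if p_value < alpha else "fail to reject H0")
```

Comparing the p-value with the significance level is equivalent to comparing the statistic with the critical value 5.99.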