# Steps to Assess Statistical Significance:
1. Formulate Hypotheses:
    - Null Hypothesis (H0): there is no effect, and any observed results are due to random chance.
    - Alternative Hypothesis (H1 or Ha): there is a real effect or difference.

2. Choose a Significance Level (α): The probability of rejecting the null hypothesis when it is true (Type I error). Commonly 0.05.

3. Select an Appropriate Statistical Test: Common tests include t-tests, chi-square tests, ANOVA, and regression analysis.

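4. Calculate the Test Statistic and P-value: The p-value is the probability of observing a result at least as extreme as the one obtained, assuming the null hypothesis is true. Reject H0 when the p-value is below α.

5. Report Results and Consider Practical Significance: A statistically significant effect can still be too small to matter in practice, so report the effect size alongside the p-value.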
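
The five steps above can be sketched end to end on simulated data. This is a minimal illustration, not a prescribed workflow: the group names, means, sample sizes, and the choice of Cohen's d as the effect size are all assumptions made for the example.

```python
import numpy as np
from scipy import stats

# Hypothetical data: a fixed seed keeps the illustration reproducible.
rng = np.random.default_rng(0)
control = rng.normal(loc=50.0, scale=10.0, size=200)
treatment = rng.normal(loc=55.0, scale=10.0, size=200)

# Step 1: H0 says the two population means are equal; H1 says they differ.
# Step 2: choose the significance level before looking at the data.
alpha = 0.05

# Step 3: two independent groups with a numeric outcome -> independent t-test.
# Step 4: compute the test statistic and the p-value.
t_stat, p_value = stats.ttest_ind(control, treatment)

# Step 5: report the decision together with an effect size (Cohen's d).
pooled_sd = np.sqrt((control.var(ddof=1) + treatment.var(ddof=1)) / 2)
cohens_d = (treatment.mean() - control.mean()) / pooled_sd
reject_h0 = p_value < alpha

print(f"t = {t_stat:.3f}, p = {p_value:.4f}, reject H0: {reject_h0}")
print(f"Cohen's d = {cohens_d:.2f}")
```

Reporting the effect size next to the p-value is what makes the practical-significance judgement in step 5 possible.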

# Common Statistical Tests
### 1. T-tests
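**Purpose**: Compare the means of two groups. <br>
**Assumptions**: Approximately normally distributed data (or a sample large enough for the means to be approximately normal), equal variances between groups (for the standard independent t-test), and independence of observations.<br>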
**Types**: <br>
    - Independent (Two-sample) T-test: Compares the means of two independent groups (e.g., comparing the heights of men and women). <br>
    - Paired (Dependent) T-test: Compares the means of two related groups (e.g., comparing the weights of individuals before and after a diet).
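
The equal-variance assumption can be checked and, if it looks doubtful, relaxed. The sketch below uses hypothetical groups with very different spreads; `stats.levene` tests whether the variances are equal, and `equal_var=False` switches `stats.ttest_ind` to Welch's t-test, which does not pool the variances. The data and group names are assumptions for illustration only.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Hypothetical groups with the same mean but very different spreads and sizes.
small_noisy = rng.normal(loc=0.0, scale=5.0, size=15)
large_quiet = rng.normal(loc=0.0, scale=1.0, size=150)

# Levene's test checks the equal-variance assumption (H0: variances are equal).
_, levene_p = stats.levene(small_noisy, large_quiet)

# Student's t-test pools the variances; Welch's t-test does not.
student = stats.ttest_ind(small_noisy, large_quiet, equal_var=True)
welch = stats.ttest_ind(small_noisy, large_quiet, equal_var=False)

print(f"Levene p-value:  {levene_p:.4f}")
print(f"Student p-value: {student.pvalue:.4f}")
print(f"Welch p-value:   {welch.pvalue:.4f}")
```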

### 2. ANOVA (Analysis of Variance):
**Purpose**: Compare the means of two or more groups.<br>
**Assumptions**: Assumes normal distribution of data, homogeneity of variances, and independence of observations.<br>
**Types**:<br>
    - One-way ANOVA: Compares the means of multiple groups based on one factor (e.g., comparing the test scores of students from different schools).<br>
    - Two-way ANOVA: Compares the means of groups based on two factors (e.g., comparing the test scores of students from different schools and different grades). <br>
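**Note**: A one-way ANOVA on two independent groups is equivalent to an independent two-sample t-test with equal variances assumed: the F-statistic equals the square of the t-statistic, and the p-values are identical.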
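
The equivalence in the note can be checked numerically. A small sketch on simulated data (the group means and sizes are arbitrary assumptions):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
group_a = rng.normal(loc=0.0, scale=1.0, size=30)
group_b = rng.normal(loc=0.5, scale=1.0, size=30)

t_stat, t_p = stats.ttest_ind(group_a, group_b)   # pooled-variance t-test
f_stat, f_p = stats.f_oneway(group_a, group_b)    # one-way ANOVA with two groups

print(f"t^2 = {t_stat**2:.6f}, F = {f_stat:.6f}")
print(f"t-test p = {t_p:.6f}, ANOVA p = {f_p:.6f}")
```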

### 3. Chi-square Tests:
**Purpose**: Test the relationship between categorical variables. <br>
**Assumptions**: Assumes a sufficient sample size (expected frequencies in each cell should be at least 5) and independence of observations. <br>
**Types**: <br>
    - Test of Independence: Assesses whether there is an association between two categorical variables (e.g., gender and voting preference). <br>
    - Goodness-of-Fit Test: Tests whether the distribution of a single categorical variable matches an expected distribution (e.g., testing if a die is fair).

### 4. Regression Analysis:
**Purpose**: Model the relationship between a dependent variable and one or more independent variables.<br>
**Assumptions**: Assumes linearity (linear relationship between dependent and independent variables), independence of errors, homoscedasticity (constant variance of errors), and normality of errors for linear regression. <br>
**Types**:<br>
    - Simple Linear Regression: Models the relationship between a dependent variable and a single independent variable (e.g., predicting sales based on advertising budget).<br>
    - Multiple Linear Regression: Models the relationship between a dependent variable and multiple independent variables (e.g., predicting house prices based on size, location, and number of rooms).<br>
    - Logistic Regression: Models the probability of a binary outcome, which makes it suitable for binary classification problems (e.g., predicting whether a customer will buy a product or not).<br>
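
As a minimal sketch of simple linear regression, `stats.linregress` fits a line and tests whether the slope differs from zero. The advertising and sales numbers below are simulated, and the true slope and noise level are assumptions chosen for the example.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
# Hypothetical data: sales rise by about 2.5 units per unit of advertising budget.
budget = np.linspace(0, 100, 50)
sales = 20.0 + 2.5 * budget + rng.normal(scale=10.0, size=budget.size)

fit = stats.linregress(budget, sales)
print(f"slope = {fit.slope:.3f}, intercept = {fit.intercept:.3f}")
print(f"R^2 = {fit.rvalue**2:.3f}, p-value for slope = {fit.pvalue:.2e}")

# Residuals should scatter around zero with roughly constant spread
# if the linearity and homoscedasticity assumptions hold.
residuals = sales - (fit.intercept + fit.slope * budget)
print(f"Mean residual: {residuals.mean():.2e}")
```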

# Examples

In [2]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import scipy.stats as stats
rng = np.random.default_rng()

## T-Tests
### Independent (Two-sample):
**Groups**: Compares the means of two independent groups. Subjects in one group have no connection to the subjects in the other group.<br>
**Examples**: Comparing the heights of men and women, or test scores of students from two different schools

In [3]:
def reject_or_accept(A, B, alpha=0.05, type='independent'):
    """Return True if the t-test rejects the null hypothesis at level alpha."""
    if type == 'independent':
        t_stat, p_value = stats.ttest_ind(A, B) # Group 1 vs Group 2
    elif type == 'paired':
        t_stat, p_value = stats.ttest_rel(A, B) # Before and After
    else:
        raise ValueError('Test type should be either independent or paired.')
    return bool(p_value < alpha)

def simulate_t_tests(alpha=0.05, type='independent'):
    results = {
        10: [], 
        100: [],
        1000: [],
        10000: []
    }
    for i, sample_size in enumerate(results.keys()):
        for _ in range(1000):
            A = rng.standard_normal(sample_size)
            B = rng.standard_normal(sample_size)
            results[sample_size].append(reject_or_accept(A, B, alpha, type=type))
        # Both samples come from the same distribution, so every rejection is a Type I error
        # and the rejection rate should be close to alpha.
        n_reject = sum(results[sample_size])
        n_accept = len(results[sample_size]) - n_reject
        print(f"Sample Size: {sample_size} {(len(results)-i)*' '} Reject: {n_reject}, Accepted: {n_accept}")

print("Alpha = 0.05 ______________________________________________________")
simulate_t_tests(alpha=0.05, type='independent')
print("\nAlpha = 0.1 ______________________________________________________")
simulate_t_tests(alpha=0.1, type='independent')

Alpha = 0.05 ______________________________________________________
Sample Size: 10      Reject: 49, Accepted: 951
Sample Size: 100     Reject: 61, Accepted: 939
Sample Size: 1000    Reject: 58, Accepted: 942
Sample Size: 10000   Reject: 51, Accepted: 949

Alpha = 0.1 ______________________________________________________
Sample Size: 10      Reject: 111, Accepted: 889
Sample Size: 100     Reject: 101, Accepted: 899
Sample Size: 1000    Reject: 90, Accepted: 910
Sample Size: 10000   Reject: 101, Accepted: 899
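
The simulation above draws both groups from the same distribution, so it only measures the false-positive rate. A complementary sketch, under the assumption of a small true difference in means (0.2 standard deviations, an arbitrary value chosen for illustration), shows how the rejection rate (the power of the test) grows with sample size:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
alpha = 0.05
effect = 0.2       # hypothetical true difference in means, in standard deviations
n_trials = 500

power = {}
for sample_size in (10, 100, 1000):
    rejections = 0
    for _ in range(n_trials):
        a = rng.standard_normal(sample_size)
        b = rng.standard_normal(sample_size) + effect
        rejections += int(stats.ttest_ind(a, b).pvalue < alpha)
    power[sample_size] = rejections / n_trials
    print(f"Sample Size: {sample_size:<6} Rejection rate: {power[sample_size]:.3f}")
```

Unlike the false-positive rate, which stays near alpha at every sample size, the rejection rate here depends strongly on how much data is collected.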


### Paired (Dependent):
**Groups**: Compares the means of two related groups. Subjects in these groups are connected, typically through repeated measurements on the same subjects. <br>
**Examples**: Comparing the weights of individuals before and after a diet, or the test scores of students before and after a specific training program.

In [4]:
print("Alpha = 0.05 ______________________________________________________")
simulate_t_tests(alpha=0.05, type='paired')
print("\nAlpha = 0.1 ______________________________________________________")
simulate_t_tests(alpha=0.1, type='paired')

Alpha = 0.05 ______________________________________________________
Sample Size: 10      Reject: 47, Accepted: 953
Sample Size: 100     Reject: 51, Accepted: 949
Sample Size: 1000    Reject: 47, Accepted: 953
Sample Size: 10000   Reject: 49, Accepted: 951

Alpha = 0.1 ______________________________________________________
Sample Size: 10      Reject: 87, Accepted: 913
Sample Size: 100     Reject: 97, Accepted: 903
Sample Size: 1000    Reject: 106, Accepted: 894
Sample Size: 10000   Reject: 98, Accepted: 902
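
A paired t-test is the same as a one-sample t-test on the per-subject differences, which is why it can detect a small within-subject change that an independent test misses. The sketch below uses hypothetical before/after weights; the 1 kg average loss and the spreads are assumptions for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
# Hypothetical weights (kg): large spread between people, small change per person.
before = rng.normal(loc=80.0, scale=15.0, size=30)
after = before - 1.0 + rng.normal(scale=1.0, size=30)

paired = stats.ttest_rel(before, after)
one_sample = stats.ttest_1samp(before - after, popmean=0.0)
independent = stats.ttest_ind(before, after)

print(f"Paired p-value:       {paired.pvalue:.4f}")
print(f"One-sample p-value:   {one_sample.pvalue:.4f}")
print(f"Independent p-value:  {independent.pvalue:.4f}")
```

The independent test treats the between-person spread as noise, so it needs a much larger effect or sample to reach the same conclusion.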


### Bonferroni correction
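When multiple hypothesis tests are conducted, the chance of at least one false positive grows with the number of tests. The Bonferroni correction keeps the overall (family-wise) Type I error rate at or below α by comparing each p-value to α / m, where m is the number of tests, or equivalently by multiplying each p-value by m (capped at 1) and comparing it to α.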

In [6]:
sample_size = 100
n_tests = 3
data1 = rng.standard_normal(sample_size)
data2 = rng.standard_normal(sample_size)
data3 = rng.standard_normal(sample_size)
t_stat1, p_val1 = stats.ttest_ind(data1, data2)
t_stat2, p_val2 = stats.ttest_ind(data1, data3)
t_stat3, p_val3 = stats.ttest_ind(data2, data3)
print(f"Original p-values: {p_val1:.4f}, {p_val2:.4f}, {p_val3:.4f}")

# Bonferroni correction
alpha = 0.05
bonferroni_alpha = alpha / n_tests
print(f"Bonferroni corrected alpha: {bonferroni_alpha:.4f}")

adjusted_p_val1 = min(p_val1 * n_tests, 1)
adjusted_p_val2 = min(p_val2 * n_tests, 1)
adjusted_p_val3 = min(p_val3 * n_tests, 1)
print(f"Adjusted p-values: {adjusted_p_val1:.4f}, {adjusted_p_val2:.4f}, {adjusted_p_val3:.4f}")

Original p-values: 0.3707, 0.3088, 0.0686
Bonferroni corrected alpha: 0.0167
Adjusted p-values: 1.0000, 0.9263, 0.2059
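
To see what the correction buys, the following sketch repeats the three pairwise tests many times on data where every null hypothesis is true and counts how often at least one test rejects. The number of simulations and the seed are arbitrary assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
alpha, n_tests, sample_size, n_sims = 0.05, 3, 100, 2000

any_raw = 0         # simulations with at least one uncorrected rejection
any_corrected = 0   # simulations with at least one Bonferroni rejection
for _ in range(n_sims):
    # Three groups drawn from the same distribution, so every H0 is true.
    d1, d2, d3 = (rng.standard_normal(sample_size) for _ in range(3))
    p_values = np.array([
        stats.ttest_ind(d1, d2).pvalue,
        stats.ttest_ind(d1, d3).pvalue,
        stats.ttest_ind(d2, d3).pvalue,
    ])
    any_raw += int((p_values < alpha).any())
    any_corrected += int((p_values < alpha / n_tests).any())

fwer_raw = any_raw / n_sims
fwer_corrected = any_corrected / n_sims
print(f"Family-wise error rate without correction: {fwer_raw:.3f}")
print(f"Family-wise error rate with Bonferroni:    {fwer_corrected:.3f}")
```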


## Chi-square 
### Test of Independence:

In [101]:
# Rows represent categories of variable 1 (e.g., Gender: Male, Female)
# Columns represent categories of variable 2 (e.g., Preference: Yes, No)
data = np.array([[30, 20],   #   Male: 30 Yes, 20 No
                 [35, 15]])  # Female: 35 Yes, 15 No

chi2_stat, p_value, dof, expected = stats.chi2_contingency(data)
True if p_value < 0.05 else False

False
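
Two details of `stats.chi2_contingency` are worth inspecting on this table: the expected counts it returns (which should all be at least 5 for the test to be reliable) and the fact that, for 2x2 tables, it applies Yates' continuity correction by default. A short sketch comparing the corrected and uncorrected statistics:

```python
import numpy as np
from scipy import stats

data = np.array([[30, 20],   #   Male: 30 Yes, 20 No
                 [35, 15]])  # Female: 35 Yes, 15 No

# Default for 2x2 tables: Yates' continuity correction is applied.
chi2_yates, p_yates, dof, expected = stats.chi2_contingency(data)
# Plain Pearson statistic, without the correction.
chi2_plain, p_plain, _, _ = stats.chi2_contingency(data, correction=False)

print("Expected counts:\n", expected)
print(f"Smallest expected count: {expected.min():.1f}")
print(f"With Yates' correction: chi2 = {chi2_yates:.4f}, p = {p_yates:.4f}")
print(f"Without correction:     chi2 = {chi2_plain:.4f}, p = {p_plain:.4f}")
```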

[Khan Academy: Contingency table chi-square test](https://www.youtube.com/watch?v=hpWdDmgsIRE)

Null Hypothesis: Herbs do nothing <br>
Alternative Hypothesis: Herbs do something

In [122]:
#                 Herb 1      Herb 2   Placebo
data = np.array([[20,         30,      30],        # Sick 
                 [100,        110,     90]])       # Not Sick

chi2_stat, p_value, dof, expected = stats.chi2_contingency(data)
print(f"Chi-squared Statistic: {chi2_stat:.04f}")
print(f"P-value: {p_value:.04f}")

alpha = 0.05
if p_value < alpha:
    print("Reject the null hypothesis - sickness rates differ between the herb and placebo groups.")
else:
    print("Fail to reject the null hypothesis - no significant evidence that the herbs affect sickness rates.")

Chi-squared Statistic: 2.5258
P-value: 0.2828
Fail to reject the null hypothesis - no significant evidence that the herbs affect sickness rates.


### Goodness-of-Fit Test:
[Khan Academy Example](https://www.youtube.com/watch?v=2QeDRsxSF9M)

In [118]:
#          Day:     M   T   W   T   F   S
expected_freq =   [.1, .1, .15, .2, .3, .15]  # Expected proportion of observations in each category (must sum to 1).
observed_counts = [30, 14, 34, 45, 57, 20]  # Observed values in each category.
total = sum(observed_counts)
# Scale the proportions to counts so they sum to the same total as the observations.
# Avoid int() here: truncation can make the totals differ, which stats.chisquare rejects.
# If f_exp is omitted, stats.chisquare assumes all categories are equally likely.
expected_counts = [p * total for p in expected_freq]

chi2_stat, p_value = stats.chisquare(f_obs=observed_counts, f_exp=expected_counts)

print(f"Chi-squared Statistic: {chi2_stat:.04f}")
print(f"P-value: {p_value:.04f}")

alpha = 0.05
if p_value < alpha:
    print("Reject the null hypothesis - the observed frequencies do not match the expected distribution.")
else:
    print("Fail to reject the null hypothesis - the observed frequencies are consistent with the expected distribution.")


Chi-squared Statistic: 11.4417
P-value: 0.0433
Reject the null hypothesis - the observed frequencies do not match the expected distribution.
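
The statistic reported above can be reproduced by hand, which makes clear what `stats.chisquare` computes. The sketch below uses the same observed counts and expected proportions; only the variable names are new.

```python
import numpy as np
from scipy import stats

observed = np.array([30, 14, 34, 45, 57, 20])
expected = np.array([.1, .1, .15, .2, .3, .15]) * observed.sum()

# Pearson's statistic: sum of (observed - expected)^2 / expected over the categories.
chi2_manual = ((observed - expected) ** 2 / expected).sum()
dof = len(observed) - 1                      # number of categories minus one
p_manual = stats.chi2.sf(chi2_manual, dof)   # upper-tail probability

chi2_scipy, p_scipy = stats.chisquare(f_obs=observed, f_exp=expected)
print(f"Manual: chi2 = {chi2_manual:.4f}, p = {p_manual:.4f}")
print(f"SciPy:  chi2 = {chi2_scipy:.4f}, p = {p_scipy:.4f}")
```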


## One-way ANOVA
Tests whether two or more groups have the same population mean. 

[Zedstatistics Example](https://www.youtube.com/watch?v=9cnSWads6oo)

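$
\text{Total Sum of Squares = SST} = \sum_{i=1}^{c} \sum_{j=1}^{n_i} (X_{ij} - \overline{\overline{X}})^2
$

$
\text{Sum of Squares Within Groups = SSW} = \sum_{i=1}^{c} \sum_{j=1}^{n_i} (X_{ij} - \overline{X}_i)^2
$

$
\text{Sum of Squares Between Groups = SSB} = \sum_{i=1}^{c} n_i (\overline{X}_i - \overline{\overline{X}})^2
$


$
\text{F} = \frac{\frac{\text{SSB}}{c - 1}} {\frac{\text{SSW}}{n - c}}
\\ \text{Where:}
\\ \quad \text{c = number of categories (groups)}
\\ \quad \text{n = total number of observations}
\\ \quad n_i = \text{number of observations in group } i
\\ \quad X_{ij} = \text{observation } j \text{ in group } i
\\ \quad \overline{X}_i = \text{mean of group } i
\\ \quad \overline{\overline{X}} = \text{mean of all observations (grand mean)}
\\ \quad \text{SST = SSW + SSB}
$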


In [149]:
def SSW(group):
    # Within-group sum of squares: squared deviations from the group's own mean.
    X_bar = sum(group) / len(group)
    return sum([(num-X_bar)**2 for num in group])

def SSB(group, global_mean):
    # One group's contribution to the between-group sum of squares.
    X_bar = sum(group) / len(group)
    return len(group) * (X_bar - global_mean)**2

def f_stat(ssb, ssw, n, c):
    # F = mean square between / mean square within, using totals over all groups.
    return (ssb / (c-1)) / (ssw / (n-c))

group1 = [1, 5, 9]
group2 = [3, 5, 7]
group3 = [4, 5, 6]
groups = [group1, group2, group3]

n = sum([len(group) for group in groups])
c = len(groups)
global_mean = sum(sum(group) for group in groups) / n  # grand mean over all observations

for group in groups:
    print(group, "   SSW:", SSW(group), "   SSB:", SSB(group, global_mean))

# The F-statistic is computed once, from the totals over all groups.
total_SSW = sum(SSW(group) for group in groups)
total_SSB = sum(SSB(group, global_mean) for group in groups)
print("F-Statistic:", f_stat(total_SSB, total_SSW, n, c))

[1, 5, 9]    SSW: 32.0    SSB: 0.0
[3, 5, 7]    SSW: 8.0    SSB: 0.0
[4, 5, 6]    SSW: 2.0    SSB: 0.0
F-Statistic: 0.0


In [150]:
group1 = [1, 3, 5]
group2 = [5, 7, 9]
group3 = [4, 5, 6]
groups = [group1, group2, group3]

n = sum([len(group) for group in groups])
c = len(groups)
global_mean = sum(sum(group) for group in groups) / n  # grand mean over all observations

for group in groups:
    print(group, "   SSW:", SSW(group), "   SSB:", SSB(group, global_mean))

# The F-statistic is computed once, from the totals over all groups.
total_SSW = sum(SSW(group) for group in groups)
total_SSB = sum(SSB(group, global_mean) for group in groups)
print("F-Statistic:", f_stat(total_SSB, total_SSW, n, c))

[1, 3, 5]    SSW: 8.0    SSB: 12.0
[5, 7, 9]    SSW: 8.0    SSB: 12.0
[4, 5, 6]    SSW: 2.0    SSB: 0.0
F-Statistic: 4.0
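
As a cross-check, the hand computation can be compared with `stats.f_oneway`, and the p-value can be read from the F distribution with (c - 1, n - c) degrees of freedom. This sketch is self-contained and recomputes the sums of squares for the same three groups; the variable names are chosen for the example.

```python
from scipy import stats

groups = [[1, 3, 5], [5, 7, 9], [4, 5, 6]]
n = sum(len(g) for g in groups)   # total number of observations
c = len(groups)                   # number of groups
grand_mean = sum(sum(g) for g in groups) / n

# Total within-group and between-group sums of squares.
ssw = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
ssb = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)

f_manual = (ssb / (c - 1)) / (ssw / (n - c))
p_manual = stats.f.sf(f_manual, c - 1, n - c)   # upper-tail probability

f_scipy, p_scipy = stats.f_oneway(*groups)
print(f"Manual: F = {f_manual:.4f}, p = {p_manual:.4f}")
print(f"SciPy:  F = {f_scipy:.4f}, p = {p_scipy:.4f}")
```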


In [126]:
# Sample data
group1 = [20, 21, 19, 22, 24]
group2 = [28, 32, 30, 29, 27]
group3 = [25, 29, 27, 26, 28]

# Perform one-way ANOVA
f_statistic, p_value = stats.f_oneway(group1, group2, group3)

print(f"F-statistic: {f_statistic:.04f}")
print(f"P-value: {p_value:.04f}")

# Interpretation
alpha = 0.05
if p_value < alpha:
    print("Reject null hypothesis - significant differences exist between the groups.")
else:
    print("Fail to reject null hypothesis - no significant difference between the groups.")

F-statistic: 25.8788
P-value: 0.0000
Reject null hypothesis - significant differences exist between the groups.
