## Hypothesis Testing in Statistics and Data Science

In statistics and data science, hypothesis testing is a procedure for judging whether sample data provide enough evidence to reject an assumption (the null hypothesis) about a population. The test you choose depends on the type of data, the sample size, the number of groups, and the goal of the analysis (comparing means, proportions, variances, etc.). Here's an overview of when and how to use the most common tests:

### 1. **Z-Test**
- **Purpose**: Compares a sample mean to a known population mean when the population standard deviation is known, or compares proportions. Typically used with large samples (n ≥ 30 is a common rule of thumb).
- **Types of Z-Tests**:
  - **One-sample Z-test**: Comparing a single sample mean to a known population mean.
  - **Two-sample Z-test**: Comparing the means of two independent samples when both population variances are known.
  - **Proportion Z-test**: Comparing sample proportions to population proportions or comparing proportions between two samples.
- **When to Use**:
  - **One-sample Z-test**: When testing if a sample mean differs from a known population mean with known variance.
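  - **Two-sample Z-test**: When comparing two independent sample means and both population variances are known. This is rare in practice, because population variances are usually unknown.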
  - **Proportion Z-test**: When comparing proportions with large sample sizes.
- **Assumptions**:
  - The data follows a normal distribution or the sample size is large enough for the Central Limit Theorem to apply.
  - Population variance (standard deviation) is known.
  - Samples are independent.
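
As a minimal sketch of the one-sample case, the z statistic and a two-sided p-value can be computed with the standard library alone. The function name `one_sample_z_test` and all of the numbers below are hypothetical, chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(sample_mean, pop_mean, pop_sd, n):
    """Return (z, two-sided p-value) for a one-sample z-test."""
    standard_error = pop_sd / sqrt(n)
    z = (sample_mean - pop_mean) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical data: 50 observations averaging 172.0, tested against a
# population mean of 170.0 with a known population standard deviation of 8.0.
z, p = one_sample_z_test(sample_mean=172.0, pop_mean=170.0, pop_sd=8.0, n=50)
print(f"z = {z:.3f}, p = {p:.4f}")
```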

### 2. **T-Test**
- **Purpose**: Used to compare means when the population standard deviation is unknown.
- **Types of T-Tests**:
  - **One-sample T-test**: Compare the sample mean to a known population mean.
  - **Independent Two-sample T-test**:
    - **Equal variances assumed (Student's T-test)**.
    - **Unequal variances assumed (Welch's T-test)**.
  - **Paired T-test**: Compare means from the same group at different times or under different conditions (dependent samples).
- **When to Use**:
  - **Any sample size**, especially when the sample size is small (n < 30).
  - **Unknown population standard deviation**.
  - Data is approximately normally distributed (robust to small deviations from normality).
- **Assumptions**:
  - For **Independent T-tests**:
    - Samples are independent.
    - Data in each group is normally distributed.
    - Variances are equal (for Student's T-test); Welch's T-test does not require equal variances.
  - For **Paired T-test**:
    - Differences between pairs are normally distributed.
    - Pairs are matched or related.
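
The T-test variants above map onto SciPy roughly as follows. This is a sketch with made-up scores, not a recommended workflow. In particular, the same two lists are passed to both the independent and the paired call only to show the function signatures; real data is either independent or paired, not both.

```python
from scipy import stats

# Hypothetical exam scores, for illustration only.
method_a = [78, 85, 69, 91, 74, 88, 80, 77]
method_b = [72, 70, 65, 84, 68, 79, 71, 66]

# One-sample: does the mean of method_a differ from a reference value of 75?
one_sample = stats.ttest_1samp(method_a, popmean=75)

# Independent two-sample: Student's (equal_var=True) vs. Welch's (equal_var=False).
student = stats.ttest_ind(method_a, method_b, equal_var=True)
welch = stats.ttest_ind(method_a, method_b, equal_var=False)

# Paired: treats the lists as two measurements on the same 8 subjects.
paired = stats.ttest_rel(method_a, method_b)

for name, res in [("one-sample", one_sample), ("Student", student),
                  ("Welch", welch), ("paired", paired)]:
    print(f"{name}: t = {res.statistic:.3f}, p = {res.pvalue:.4f}")
```

With equal group sizes, Student's and Welch's statistics coincide; only the degrees of freedom, and therefore the p-values, can differ.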

### 3. **Chi-Square Test**
- **Purpose**: Used for categorical data to test relationships between variables or test the goodness-of-fit between observed and expected frequencies.
- **Types of Chi-Square Tests**:
  - **Chi-Square Goodness-of-Fit Test**: Compares observed frequencies with expected frequencies based on a theoretical distribution.
  - **Chi-Square Test of Independence**: Tests if there is a relationship between two categorical variables (contingency tables).
- **When to Use**:
  - When both variables are categorical.
  - Testing the independence between variables (e.g., gender and voting preference).
  - Comparing observed vs. expected frequencies (e.g., testing a die for fairness).
- **Assumptions**:
  - Observations are independent.
  - Expected frequency in each cell should be at least 5 (or no more than 20% of cells have expected frequencies less than 5, and all expected frequencies are at least 1).
  - Data should be in the form of counts or frequencies.
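
Both Chi-Square variants are available in SciPy. The sketch below uses invented counts (120 die rolls and a 2x3 contingency table) and checks the expected-count assumption before reading the p-value.

```python
from scipy import stats

# Goodness of fit: 120 hypothetical die rolls against a fair-die expectation.
observed_rolls = [18, 22, 16, 25, 19, 20]
gof = stats.chisquare(observed_rolls)  # expected defaults to equal frequencies
print(f"goodness of fit: chi2 = {gof.statistic:.2f}, p = {gof.pvalue:.4f}")

# Independence: hypothetical 2x3 table of group (rows) by preference (columns).
table = [[30, 45, 25],
         [35, 30, 35]]
chi2, p_value, dof, expected = stats.chi2_contingency(table)
print(f"independence: chi2 = {chi2:.2f}, dof = {dof}, p = {p_value:.4f}")

# Expected counts under independence; all should be at least 5.
print("smallest expected count:", expected.min())
```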

### 4. **F-Test**
- **Purpose**: Used to compare variances between two populations or to assess if multiple groups have the same variance (important for ANOVA).
- **When to Use**:
  - When comparing the variances of two populations.
  - As a preliminary test for ANOVA to check homogeneity of variance (though Levene's Test is more robust).
- **Assumptions**:
  - Data in each group is normally distributed.
  - Samples are independent.
- **Caveats**:
  - Sensitive to departures from normality.
  - Alternative tests like **Levene's Test** or **Bartlett's Test** can be used for testing equality of variances.
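
SciPy has no single function for the two-sample variance-ratio F-test, so the sketch below builds it from the F distribution and runs Levene's test alongside it for comparison. The data and variable names are hypothetical, and doubling the smaller tail is one common convention for the two-sided p-value.

```python
import numpy as np
from scipy import stats

# Hypothetical measurements from two independent groups.
group_a = np.array([4.1, 5.3, 3.8, 6.0, 4.9, 5.5, 4.4, 5.1])
group_b = np.array([4.8, 5.0, 4.9, 5.2, 4.7, 5.1, 5.0, 4.9])

# Variance-ratio F statistic using sample variances (ddof=1).
var_a, var_b = group_a.var(ddof=1), group_b.var(ddof=1)
f_stat = var_a / var_b
df_a, df_b = len(group_a) - 1, len(group_b) - 1

# Two-sided p-value: double the smaller tail probability, capped at 1.
if f_stat > 1:
    tail = stats.f.sf(f_stat, df_a, df_b)
else:
    tail = stats.f.cdf(f_stat, df_a, df_b)
p_f = min(1.0, 2 * tail)

# Levene's test (median-centered by default in SciPy) is less sensitive
# to departures from normality.
levene = stats.levene(group_a, group_b)
print(f"F = {f_stat:.2f}, p = {p_f:.4f}; Levene p = {levene.pvalue:.4f}")
```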

### 5. **ANOVA (Analysis of Variance)**
- **Purpose**: Used to compare the means of three or more groups.
- **Types of ANOVA**:
  - **One-way ANOVA**: Tests for differences in the means across three or more independent groups based on one factor.
  - **Two-way ANOVA**: Tests for differences with two independent variables and can assess interaction effects.
  - **Repeated Measures ANOVA**: Extension of the paired T-test for more than two time points or conditions.
- **When to Use**:
  - **One-way ANOVA**: Comparing means across multiple groups (e.g., test scores from different teaching methods).
  - **Two-way ANOVA**: Analyzing the effect of two factors (e.g., teaching method and class size on test scores).
- **Assumptions**:
  - The residuals are normally distributed.
  - Homogeneity of variances (homoscedasticity) across groups.
  - Observations are independent.
- **Post-Hoc Tests**:
  - If ANOVA indicates significant differences, post-hoc tests (e.g., Tukey's HSD) are used to identify which groups differ.
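
A one-way ANOVA followed by a post-hoc comparison might look like the sketch below. The scores are invented, and `tukey_hsd` is only available in newer SciPy releases (1.8 or later, to the best of my knowledge), so the call is guarded.

```python
from scipy import stats

# Hypothetical test scores under three teaching methods.
lecture = [72, 75, 68, 80, 74, 71]
seminar = [78, 82, 85, 79, 88, 81]
online = [70, 65, 74, 69, 72, 68]

# One-way ANOVA: the null hypothesis is that all three group means are equal.
anova = stats.f_oneway(lecture, seminar, online)
print(f"F = {anova.statistic:.2f}, p = {anova.pvalue:.4f}")

# Post-hoc pairwise comparisons, run only if the omnibus test is significant
# and the installed SciPy provides tukey_hsd.
if anova.pvalue < 0.05 and hasattr(stats, "tukey_hsd"):
    posthoc = stats.tukey_hsd(lecture, seminar, online)
    print(posthoc)
```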

### 6. **Mann-Whitney U Test (Wilcoxon Rank-Sum Test)**
- **Purpose**: A non-parametric test to compare two independent samples when the normality assumption is violated.
- **When to Use**:
  - Comparing distributions of two independent samples (alternative to independent T-test).
  - Data is ordinal or continuous but not normally distributed.
- **Assumptions**:
  - Observations are independent.
  - The distributions of the two groups have the same shape (for the test to compare medians).
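
A short sketch with SciPy, using hypothetical 1-10 satisfaction ratings from two stores:

```python
from scipy import stats

# Hypothetical ordinal ratings from two independent groups of customers.
store_a = [7, 8, 6, 9, 7, 8, 10, 6, 9, 8]
store_b = [5, 6, 4, 7, 5, 6, 5, 7, 4, 6]

result = stats.mannwhitneyu(store_a, store_b, alternative="two-sided")
print(f"U = {result.statistic}, p = {result.pvalue:.4f}")
```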

### 7. **Wilcoxon Signed-Rank Test**
- **Purpose**: A non-parametric test used to compare two related samples when the normality assumption is violated.
- **When to Use**:
  - Comparing paired or matched samples (alternative to paired T-test).
  - Data is ordinal or continuous but not normally distributed.
- **Assumptions**:
  - Differences between pairs are symmetrically distributed around the median.
  - Observations are paired, and the pairs are independent of one another.
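
A short sketch with SciPy, using hypothetical blood pressure readings for the same 10 patients before and after treatment:

```python
from scipy import stats

# Hypothetical systolic blood pressure, one before/after pair per patient.
before = [142, 138, 150, 145, 160, 155, 148, 152, 139, 147]
after = [135, 136, 141, 148, 149, 149, 140, 142, 138, 142]

# The test ranks the absolute paired differences (before - after).
result = stats.wilcoxon(before, after)
print(f"W = {result.statistic}, p = {result.pvalue:.4f}")
```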

### 8. **Kruskal-Wallis H Test**
- **Purpose**: A non-parametric version of one-way ANOVA used to compare three or more independent groups when the normality assumption is violated.
- **When to Use**:
  - Comparing distributions across three or more groups (alternative to one-way ANOVA).
  - Data is ordinal or continuous but not normally distributed.
- **Assumptions**:
  - Observations are independent.
  - The distributions of the groups are similar in shape.
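
A short sketch with SciPy, using hypothetical satisfaction ratings from three stores:

```python
from scipy import stats

# Hypothetical 1-10 ratings from three independent groups of customers.
store_a = [7, 8, 6, 9, 7, 8]
store_b = [5, 6, 4, 7, 5, 6]
store_c = [8, 9, 9, 10, 7, 9]

result = stats.kruskal(store_a, store_b, store_c)
print(f"H = {result.statistic:.2f}, p = {result.pvalue:.4f}")
```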

### 9. **Correlation Tests**
- **Pearson's Correlation Coefficient (r)**:
  - **Purpose**: Measures the strength and direction of the linear relationship between two continuous variables.
  - **When to Use**:
    - Both variables are continuous and approximately normally distributed.
    - The relationship is linear.
  - **Assumptions**:
    - Variables are approximately normally distributed (this matters for the significance test, not for computing r itself).
    - Linear relationship exists.
    - No significant outliers.
- **Spearman's Rank Correlation Coefficient (ρ)**:
  - **Purpose**: A non-parametric measure of rank correlation; assesses how well the relationship between two variables can be described by a monotonic function.
  - **When to Use**:
    - Data is ordinal, interval, or ratio but not normally distributed.
    - Relationship is monotonic but not necessarily linear.
  - **Assumptions**:
    - Observations are independent.
    - Variables are at least ordinal.
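
The difference between linear and monotonic association shows up clearly on a toy example. In the sketch below (invented data), `y` is the square of `x`, so the relationship is perfectly monotonic but not linear.

```python
from scipy import stats

# Hypothetical paired measurements: y increases with x, but not linearly.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1, 4, 9, 16, 25, 36, 49, 64]

pearson_r, pearson_p = stats.pearsonr(x, y)
spearman_rho, spearman_p = stats.spearmanr(x, y)

print(f"Pearson r = {pearson_r:.3f}")
print(f"Spearman rho = {spearman_rho:.3f}")
```

Spearman's coefficient reaches 1 here because the ranks of `x` and `y` agree exactly, while Pearson's r stays below 1 because the points do not lie on a straight line.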

### 10. **Fisher's Exact Test**
- **Purpose**: Used to determine if there are nonrandom associations between two categorical variables in a 2x2 contingency table.
- **When to Use**:
  - When sample sizes are small and the Chi-Square Test assumptions are not met.
  - In 2x2 tables regardless of sample size for exact results.
- **Assumptions**:
  - Data are counts or frequencies.
  - Observations are independent.
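
A sketch with SciPy on a hypothetical 2x2 table. The expected counts are computed as well, to show a case where the Chi-Square rule of thumb (expected count of at least 5) is not met.

```python
from scipy import stats

# Hypothetical table: rows = drug / placebo, columns = recovered / not recovered.
table = [[8, 2],
         [3, 7]]

odds_ratio, p_fisher = stats.fisher_exact(table, alternative="two-sided")
print(f"odds ratio = {odds_ratio:.2f}, p = {p_fisher:.4f}")

# Expected counts under independence; here some fall below 5.
expected = stats.contingency.expected_freq(table)
print("smallest expected count:", expected.min())
```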

### 11. **McNemar's Test**
- **Purpose**: Used on paired nominal data to determine whether the row and column marginal frequencies are equal.
- **When to Use**:
  - Testing for changes in responses using paired data (e.g., before and after treatment).
- **Assumptions**:
  - Data are paired and nominal.
  - Pairs are independent of one another (the two observations within a pair are dependent by design).
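
McNemar's test depends only on the discordant pairs, that is, the pairs whose response changed. The sketch below implements the exact (binomial) version with the standard library. The function name `mcnemar_exact` and the counts are hypothetical, and larger samples are often analyzed with a chi-square approximation instead.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the two discordant-pair counts."""
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    # Under the null hypothesis, each discordant pair falls in either
    # cell with probability 0.5, so the count follows Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical pass/fail results before and after a training program:
# b = passed before but failed after, c = failed before but passed after.
p_value = mcnemar_exact(b=3, c=12)
print(f"p = {p_value:.4f}")
```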

---

### Summary Table of Test Selection

| Test                         | Purpose                                         | Data Type           | Sample Size   | Key Assumptions                                         | Example Scenario                                          |
|------------------------------|-------------------------------------------------|---------------------|---------------|---------------------------------------------------------|-----------------------------------------------------------|
| **Z-Test**                   | Compare means or proportions (1 or 2 samples)   | Continuous          | Large (n ≥ 30) | Known population variance, normality                    | Compare average height of sample to population mean       |
| **T-Test**                   | Compare means (1 or 2 samples)                  | Continuous          | Any size      | Unknown population variance, normality                  | Compare average scores of two teaching methods            |
| **Chi-Square Test**          | Test association between categorical variables  | Categorical         | Any size      | Expected count ≥ 5 in cells (some exceptions apply)     | Test if gender is associated with political preference    |
| **F-Test**                   | Compare variances                               | Continuous          | Any size      | Normality, independent samples                          | Test if two populations have the same variance in height  |
| **ANOVA**                    | Compare means across 3+ groups                  | Continuous          | Any size      | Normality, equal variances, independence                | Test if 3 teaching methods have different average scores  |
| **Mann-Whitney U Test**      | Compare distributions of 2 independent samples  | Ordinal/Continuous  | Any size      | Independent samples, similar shaped distributions       | Compare satisfaction scores between two stores            |
| **Wilcoxon Signed-Rank Test**| Compare paired samples                          | Ordinal/Continuous  | Any size      | Differences are symmetrically distributed               | Compare blood pressure before and after treatment         |
| **Kruskal-Wallis H Test**    | Compare distributions across 3+ groups          | Ordinal/Continuous  | Any size      | Independent samples, similar shaped distributions       | Test if 3 stores have different customer satisfaction     |
| **Pearson's Correlation**    | Assess linear relationship between variables    | Continuous          | Any size      | Normality, linearity, no outliers                       | Correlation between height and weight                     |
| **Spearman's Correlation**   | Assess monotonic relationship between variables | Ordinal/Continuous  | Any size      | Monotonic relationship                                  | Correlation between rank in class and stress levels       |
| **Fisher's Exact Test**      | Test association in small samples (2x2 tables)  | Categorical         | Small         | Data are counts, independence                           | Test if a new drug affects recovery rates                 |
| **McNemar's Test**           | Test changes in paired nominal data             | Categorical         | Any size      | Paired data, pairs independent of each other            | Test if a training program changes pass/fail rates        |

---

### Guidelines for Choosing the Right Test

1. **Type of Data**:
   - **Continuous**: Z-test, T-test, ANOVA, Pearson's Correlation.
   - **Categorical**: Chi-Square Test, Fisher's Exact Test, McNemar's Test.
   - **Ordinal or Non-normal Continuous**: Non-parametric tests (Mann-Whitney U, Wilcoxon Signed-Rank, Kruskal-Wallis H, Spearman's Correlation).

2. **Number of Groups or Variables**:
   - **One group**: One-sample tests (Z-test, T-test).
   - **Two groups**:
     - **Independent samples**: Independent T-test, Mann-Whitney U Test, Chi-Square Test.
     - **Paired samples**: Paired T-test, Wilcoxon Signed-Rank Test, McNemar's Test.
   - **Three or more groups**: ANOVA, Kruskal-Wallis H Test.

3. **Independent vs. Paired Samples**:
   - **Independent samples**: Groups are unrelated (e.g., different people in each group).
   - **Paired samples**: Measurements are related (e.g., same subjects before and after treatment).

4. **Assumptions**:
   - **Parametric Tests** (Z-test, T-test, ANOVA):
     - Normality of data (or residuals).
     - Homogeneity of variances (equal variances).
     - Independence of observations.
   - **Non-parametric Tests**:
     - Fewer assumptions about the data distribution.
     - Used when data does not meet parametric test assumptions.

5. **Sample Size**:
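   - **Large samples**: Z-test if the population variance is known; otherwise a T-test, which gives nearly identical results because the t distribution approaches the normal distribution as n grows.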
   - **Small samples**: T-tests, non-parametric tests.

6. **Variances**:
   - **Equal variances**: Use standard T-tests and ANOVA.
   - **Unequal variances**: Use Welch's T-test, consider non-parametric alternatives.

7. **Testing Relationships**:
   - **Correlation**:
     - **Linear relationship**: Pearson's Correlation.
     - **Monotonic relationship**: Spearman's Rank Correlation.
   - **Association between categorical variables**: Chi-Square Test, Fisher's Exact Test.

---

### Additional Considerations

- **Normality Tests**:
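  - **Shapiro-Wilk Test**: Tests for normality; commonly preferred for small to moderate samples.
  - **Kolmogorov-Smirnov Test**: Compares the sample to a fully specified reference distribution; if the mean and variance are estimated from the same data, the Lilliefors correction is needed.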

- **Homogeneity of Variance Tests**:
  - **Levene's Test**: Tests equality of variances; robust to departures from normality.
  - **Bartlett's Test**: Tests equality of variances; sensitive to departures from normality.

- **Effect Size Measures**:
  - Important to report along with p-values to indicate the magnitude of differences or relationships (e.g., Cohen's d, Eta-squared).

- **Multiple Comparisons**:
  - When performing multiple tests, adjust p-values (or the significance threshold) to control the family-wise Type I error rate (e.g., Bonferroni correction).

- **Confidence Intervals**:
  - Provide additional information about the precision of estimates and should be reported alongside test statistics.
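
Two of these considerations are simple enough to sketch with the standard library: Cohen's d with a pooled standard deviation, and the Bonferroni adjustment. The function names and input values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(a, b):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / (n_a + n_b - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


def bonferroni(p_values, alpha=0.05):
    """Return Bonferroni-adjusted p-values and the matching reject decisions."""
    m = len(p_values)
    adjusted = [min(1.0, p * m) for p in p_values]
    return adjusted, [p_adj < alpha for p_adj in adjusted]


# Hypothetical samples and p-values.
d = cohens_d([5, 6, 7, 8, 9], [3, 4, 5, 6, 7])
adjusted, reject = bonferroni([0.01, 0.04, 0.03])
print(f"Cohen's d = {d:.3f}")
print("adjusted p-values:", adjusted, "reject:", reject)
```

Bonferroni is conservative when the number of tests is large; it is used here only because the text names it.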

---

Each statistical test has its own assumptions and conditions. Selecting the appropriate test depends heavily on your data structure, the specific hypotheses you're testing, and whether the assumptions of the test are met. Always explore your data thoroughly before choosing a test, and consider consulting a statistician if in doubt.