- Hypothesis testing is a statistical method used to test an assumption or claim (hypothesis) about a population parameter. 

##### Steps
- 1. State the Hypotheses:
      - Null Hypothesis ($H_0$): This is the default assumption, often stating that there is no effect or no difference.
      - Alternative Hypothesis ($H_A$ or $H_1$): This is the hypothesis we seek evidence for, stating that there is an effect or a difference.
___

- 2. Choose the Significance Level: The significance level, denoted α, is the threshold used in hypothesis testing to decide whether the result of a statistical test is statistically significant.
 It is the probability of making a Type I error, which occurs when you reject a null hypothesis that is actually true (i.e., conclude that there is an effect when there is none).




- Key Points about the Significance Level (α):
      - It is the probability of rejecting the null hypothesis when it is actually true.
      - It sets the criterion for how extreme the data must be to reject $H_0$.
      - The lower the α, the stronger the evidence required to reject the null hypothesis.
- For example, 𝛼=0.05 means you are willing to accept a 5% chance of incorrectly rejecting the null hypothesis (i.e., making a Type I error).
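
The meaning of α as a Type I error rate can be checked by simulation. The sketch below is only an illustration under assumed settings (a standard normal population, a two-sided z-test, and the hypothetical function name `type_i_error_rate`): it repeatedly samples from a population where $H_0$ is true and counts how often the test rejects anyway.

```python
import random
from statistics import NormalDist

def type_i_error_rate(alpha=0.05, n=30, trials=5000, seed=0):
    """Estimate how often a two-sided z-test rejects a TRUE null hypothesis."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    std_normal = NormalDist()          # standard normal, used for p-values
    mu0, sigma = 0.0, 1.0              # assumed population; H0 (mu = mu0) is true
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        sample_mean = sum(sample) / n
        z = (sample_mean - mu0) / (sigma / n ** 0.5)
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
        if p_value < alpha:            # rejecting here is a Type I error
            rejections += 1
    return rejections / trials

rate = type_i_error_rate()
print(f"Estimated Type I error rate: {rate:.3f}")
```

With enough trials the estimated rate should land close to the chosen α, since every rejection in this setup is a false one.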


##### Interpretation:
- If the p-value (probability value) from the hypothesis test is less than α, you reject the null hypothesis ($H_0$) and conclude that there is statistically significant evidence for the alternative hypothesis ($H_A$).
- If the p-value is greater than or equal to α, you fail to reject the null hypothesis. This means you do not have enough evidence to support $H_A$; it does not prove that $H_0$ is true.
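
The decision rule above can be written as a small helper. This is a minimal sketch; the function name `decide` and its return strings are illustrative choices, not part of any library.

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Apply the rule: reject H0 only when the p-value is strictly below alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be strictly between 0 and 1")
    return "reject H0" if p_value < alpha else "fail to reject H0"

print(decide(0.03))        # p-value below the default alpha of 0.05
print(decide(0.05))        # p-value equal to alpha counts as "fail to reject"
print(decide(0.03, 0.01))  # same p-value, stricter alpha
```

The same p-value can lead to different decisions under different α, which is why α should be fixed before looking at the data.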

___

- 3. Select the Test Statistic: Depending on the nature of the data (e.g., sample size, whether the population variance is known), you may use different test statistics.

- A test statistic is a standardized value used in statistical hypothesis testing to determine whether to reject the null hypothesis. 
- It quantifies how far the sample data deviates from what is expected under the null hypothesis. 
- Depending on the type of test being conducted (e.g., for means, proportions, or variances), the test statistic follows a particular distribution under the null hypothesis (e.g., t-distribution, standard normal (z) distribution, chi-square distribution).

## 1. Z-Statistic (Z-Test)
Used when the **population standard deviation** is known and either the population is normally distributed or the sample size is large (usually \( n > 30 \)), so that the sample mean is approximately normal.

### Formula:
$$
Z = \frac{\bar{x} - \mu}{\frac{\sigma}{\sqrt{n}}}
$$

where:
- \( \bar{x} \) = sample mean
- \( \mu \) = population mean assumed under the null hypothesis
- \( \sigma \) = population standard deviation
- \( n \) = sample size

### When to Use:
- Known population standard deviation.
- Large sample size (typically \( n > 30 \)).
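
As an illustration of the Z formula, here is a minimal sketch using only the Python standard library. The function name `one_sample_z_test` and the numbers in the example are made up, and the p-value is two-sided by assumption.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(sample_mean, mu0, sigma, n):
    """Return (z, two-sided p-value) for H0: population mean equals mu0."""
    standard_error = sigma / sqrt(n)           # sigma / sqrt(n) from the formula
    z = (sample_mean - mu0) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical data: n = 64 observations, known sigma = 8, H0: mu = 50
z, p = one_sample_z_test(sample_mean=52.0, mu0=50.0, sigma=8.0, n=64)
print(f"z = {z:.2f}, p = {p:.4f}")
```

For a one-sided alternative, the p-value would instead be the single tail area beyond `z` in the direction of $H_A$.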

---

## 2. T-Statistic (T-Test)
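Used when the **population standard deviation** is unknown and must be estimated from the sample. This matters most when the sample size is small (usually \( n < 30 \)); for large samples the t-distribution approaches the standard normal distribution. Under the null hypothesis, and assuming roughly normal data, the statistic follows a t-distribution with \( n - 1 \) degrees of freedom.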

### Formula:
$$
t = \frac{\bar{x} - \mu}{\frac{s}{\sqrt{n}}}
$$

where:
- \( \bar{x} \) = sample mean
- \( \mu \) = population mean assumed under the null hypothesis
- \( s \) = sample standard deviation
- \( n \) = sample size

### When to Use:
- Unknown population standard deviation.
- Especially for small sample sizes (typically \( n < 30 \)); the test remains valid for larger samples.
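
The t formula can be sketched with the standard library alone. The sample below is invented for illustration and `one_sample_t_statistic` is a hypothetical helper name. Because the standard library has no t-distribution CDF, the sketch compares against a critical value taken from a standard t-table (2.262 for 9 degrees of freedom, two-sided, α = 0.05) rather than computing a p-value.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t_statistic(sample, mu0):
    """Return (t, degrees of freedom) for H0: population mean equals mu0."""
    n = len(sample)
    s = stdev(sample)                      # sample standard deviation (n - 1 divisor)
    t = (mean(sample) - mu0) / (s / sqrt(n))
    return t, n - 1

# Hypothetical measurements, H0: mu = 5.0
sample = [5.1, 4.9, 5.6, 5.8, 5.2, 5.5, 5.0, 5.3, 5.7, 5.4]
t, df = one_sample_t_statistic(sample, mu0=5.0)

T_CRITICAL = 2.262                         # t-table value: df = 9, two-sided, alpha = 0.05
print(f"t = {t:.3f}, df = {df}")
print("reject H0" if abs(t) > T_CRITICAL else "fail to reject H0")
```

A statistics library with a t-distribution implementation would give an exact p-value instead of a table lookup.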

---

## 3. Chi-Square Statistic (Chi-Square Test)
Used for categorical data in tests like **goodness-of-fit** or **independence**.

### Formula for Goodness-of-Fit:
$$
\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i}
$$

where:
- \( O_i \) = observed frequency in category \( i \)
- \( E_i \) = expected frequency in category \( i \) under the null hypothesis

### When to Use:
- Categorical data analysis (e.g., testing if data follows a specific distribution or testing independence in contingency tables).
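
A minimal sketch of the goodness-of-fit statistic follows, assuming a made-up example of 120 die rolls tested against a fair die. The helper name `chi_square_gof` is illustrative, and the critical value 11.070 (5 degrees of freedom, α = 0.05) comes from a standard chi-square table.

```python
def chi_square_gof(observed, expected):
    """Return (chi-square statistic, degrees of freedom) for a goodness-of-fit test."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if abs(sum(observed) - sum(expected)) > 1e-9:
        raise ValueError("observed and expected totals must match")
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, len(observed) - 1         # df = number of categories - 1

# Hypothetical counts for faces 1..6 over 120 rolls; a fair die expects 20 each
observed = [22, 17, 20, 26, 22, 13]
expected = [20] * 6
chi2, df = chi_square_gof(observed, expected)

CHI2_CRITICAL = 11.070                     # table value: df = 5, alpha = 0.05
print(f"chi2 = {chi2:.2f}, df = {df}")
print("reject H0" if chi2 > CHI2_CRITICAL else "fail to reject H0")
```

The degrees of freedom used here assume no parameters were estimated from the data; each estimated parameter would reduce them by one.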

---

## 4. F-Statistic (F-Test)
Defined as a ratio of two variances. It is used to compare the variances of two groups and, in **ANOVA** (Analysis of Variance), to compare the means of two or more groups.

### Formula:
$$
F = \frac{\text{Variance between groups}}{\text{Variance within groups}}
$$

### When to Use:
- Comparing the variances of two groups (ratio of the two sample variances).
- Used in **ANOVA** for comparing means across multiple groups.
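
For the ANOVA form of the F formula, "variance between groups" and "variance within groups" are usually computed as mean squares. The sketch below assumes a one-way ANOVA on tiny invented groups; `one_way_anova_f` is a hypothetical helper name.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    ss_between = 0.0                       # spread of group means around the grand mean
    ss_within = 0.0                        # spread of observations around their group mean
    for g in groups:
        group_mean = mean(g)
        ss_between += len(g) * (group_mean - grand_mean) ** 2
        ss_within += sum((x - group_mean) ** 2 for x in g)

    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical data: three groups of three observations each
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_between, df_within = one_way_anova_f(groups)
print(f"F = {f_stat:.2f} with ({df_between}, {df_within}) degrees of freedom")
```

A large F means the group means differ by more than the within-group scatter would explain; the cutoff comes from the F-distribution with these two degrees of freedom.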

---

## 5. Proportion Z-Statistic (Z-Test for Proportions)
Used to test hypotheses about **proportions** in a population.

### Formula:
$$
Z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1 - p_0)}{n}}}
$$

where:
- \( \hat{p} \) = sample proportion
- \( p_0 \) = population proportion under the null hypothesis
- \( n \) = sample size

### When to Use:
- Hypothesis tests involving proportions (e.g., testing whether the proportion of successes in a sample differs from a hypothesized population proportion).
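
The proportion formula can be sketched the same way as the Z-test for means. The function name `proportion_z_test` and the counts in the example are assumptions for illustration, and the p-value is two-sided.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z_test(successes, n, p0):
    """Return (z, two-sided p-value) for H0: population proportion equals p0."""
    p_hat = successes / n                          # sample proportion
    standard_error = sqrt(p0 * (1 - p0) / n)       # uses p0, not p_hat, under H0
    z = (p_hat - p0) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical data: 60 successes in 100 trials, H0: p = 0.5
z_prop, p_prop = proportion_z_test(successes=60, n=100, p0=0.5)
print(f"z = {z_prop:.2f}, p = {p_prop:.4f}")
```

Note that the standard error is built from the hypothesized proportion, because the statistic is evaluated as if $H_0$ were true.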

---

## 6. Mann-Whitney U Statistic (for Non-Parametric Tests)
Used when the data does not meet the assumptions for the **t-test** (e.g., when data is ordinal or not normally distributed). It is a non-parametric alternative to the independent samples t-test.

### Formula:
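The calculation is based on ranks. Pool the two independent samples, rank all observations together (tied values share the average rank), and sum the ranks of each sample:
$$
U_1 = R_1 - \frac{n_1(n_1 + 1)}{2}, \qquad U_2 = n_1 n_2 - U_1
$$

where:
- \( R_1 \) = sum of the ranks of sample 1
- \( n_1, n_2 \) = sizes of sample 1 and sample 2

The smaller of \( U_1 \) and \( U_2 \) is commonly reported as the test statistic.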

### When to Use:
- Non-parametric alternative to the t-test.
- When data does not follow a normal distribution or when working with ordinal data.
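
The rank-sum calculation can be sketched as follows. This is an illustrative implementation with invented data; `average_ranks` and `mann_whitney_u` are hypothetical helper names, and tied values are given the average of the ranks they span.

```python
def average_ranks(values):
    """Rank values from 1 upward, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1                          # extend over a run of tied values
        avg_rank = (i + j) / 2 + 1          # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks

def mann_whitney_u(x, y):
    """Return (U1, U2) computed from the rank sum of sample x."""
    ranks = average_ranks(list(x) + list(y))
    r1 = sum(ranks[: len(x)])               # rank sum of the first sample
    u1 = r1 - len(x) * (len(x) + 1) / 2
    u2 = len(x) * len(y) - u1
    return u1, u2

# Hypothetical independent samples
x = [1.1, 2.3, 3.0, 4.8]
y = [3.5, 5.2, 6.1, 7.4, 8.0]
u1, u2 = mann_whitney_u(x, y)
print(f"U1 = {u1}, U2 = {u2}, U = {min(u1, u2)}")
```

Turning U into a p-value requires either exact tables for small samples or a normal approximation for larger ones, which this sketch leaves out.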

---

## 7. Wilcoxon Signed-Rank Statistic (for Paired Samples)
A non-parametric test statistic used for paired data when the assumptions of the paired t-test (normality) are not met.

### Formula:
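Based on the ranks of the absolute differences between pairs of observations. Zero differences are usually dropped, the remaining absolute differences are ranked, and the ranks are summed separately for positive and negative differences:
$$
W^+ = \sum_{d_i > 0} R_i, \qquad W^- = \sum_{d_i < 0} R_i
$$

where:
- \( d_i \) = difference within pair \( i \)
- \( R_i \) = rank of \( |d_i| \)

A common convention reports the smaller sum, \( W = \min(W^+, W^-) \).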

### When to Use:
- Non-parametric alternative to the paired t-test.
- For comparing two related or paired samples when the data does not meet the normality assumption.
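
A sketch of the signed-rank calculation is shown below, using invented before/after measurements. The helper names are hypothetical, zero differences are dropped, and tied absolute differences share an average rank; both are common conventions rather than the only ones.

```python
def average_ranks(values):
    """Rank values from 1 upward, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1                          # extend over a run of tied values
        avg_rank = (i + j) / 2 + 1          # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks

def wilcoxon_signed_rank(before, after):
    """Return (W_plus, W_minus): rank sums of positive and negative differences."""
    diffs = [a - b for a, b in zip(after, before) if a != b]   # drop zero differences
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return w_plus, w_minus

# Hypothetical paired measurements on six subjects
before = [10, 12, 9, 14, 11, 13]
after = [12, 11, 13, 17, 11, 18]
w_plus, w_minus = wilcoxon_signed_rank(before, after)
print(f"W+ = {w_plus}, W- = {w_minus}, W = {min(w_plus, w_minus)}")
```

As a sanity check, the two sums always add up to m(m + 1)/2, where m is the number of non-zero differences.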

---

## Summary of When to Use Each Test Statistic:
| Test Statistic | When to Use |
|----------------|-------------|
| **Z-Statistic (Z-Test)** | Known population standard deviation; normal population or large sample size (\(n > 30\)). |
| **T-Statistic (T-Test)** | Unknown population standard deviation; especially small sample size (\(n < 30\)). |
| **Chi-Square Statistic** | Categorical data, goodness-of-fit, or tests of independence. |
| **F-Statistic (F-Test)** | Comparing two variances; ANOVA for comparing means across multiple groups. |
| **Proportion Z-Statistic** | Hypothesis tests for population proportions. |
| **Mann-Whitney U** | Non-parametric alternative for independent samples. |
| **Wilcoxon Signed-Rank** | Non-parametric alternative for paired data. |


___