<a href="https://colab.research.google.com/github/ranton256/stats_notebooks/blob/main/NonParametricTests.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Non-Parametric Tests




A non-parametric test is best understood by comparing it with a parametric test.


| Parametric Tests | Non-parametric Tests |
|------------------|----------------------|
| Estimate a parameter like μ, σ, or p (proportion) prior to hypothesis testing. | Hypothesis testing without parameter estimation. Involves counting or ranking. |
| Generally require a population to be normally distributed. | "Distribution-free statistics". |
| Only work for quantitative data. | Work for both qualitative and quantitative data. |
| More power. | Less power. |
| Need more detailed data (more information). | Work with less detailed data (less information). |
| Reach a given power with smaller sample sizes. | Need larger sample sizes to reach the same power. |

When the data meet a parametric test's assumptions, the parametric test is generally preferred to a non-parametric one because it has more power. On the other hand, if you reject H₀ with a non-parametric test, you can be more confident in your decision.

The table above is taken from [[4]](https://stats.libretexts.org/Sandboxes/admin/Introduction_to_Applied_Statistics_for_Psychology_Students_(Sarty)/16%3A_Non-parametric_Tests/16.01%3A_Prelude_to_Non-Parametric_Tests).
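The "counting or ranking" row can be made concrete with a small sketch. The data values and the `ranks` helper below are illustrative assumptions, not taken from [4]. The only point is that replacing one value with an extreme outlier moves the estimated mean and standard deviation a lot, while the ranks that a rank-based test would use stay the same.

```python
from statistics import mean, stdev


def ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        midrank = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            result[order[k]] = midrank
        i = j + 1
    return result


# Hypothetical measurements: identical except for the last value.
clean = [3.1, 2.9, 3.4, 3.0, 4.5]
with_outlier = [3.1, 2.9, 3.4, 3.0, 45.0]

for name, data in [("clean", clean), ("with outlier", with_outlier)]:
    print(name, "mean:", round(mean(data), 2), "sd:", round(stdev(data), 2),
          "ranks:", ranks(data))
```

Both data sets produce the ranks `[3.0, 1.0, 4.0, 2.0, 5.0]`, while the mean moves from 3.38 to 11.48.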

## Examples of non-parametric tests

* **Two Sample Wilcoxon Rank Sum Test - (Mann-Whitney U Test)** - Compares two independent groups to determine if their population distributions differ, and is applicable when comparing two groups that are not normally distributed.
* **Paired Wilcoxon Signed Rank Test** - Compares two related samples by ranking the paired differences to assess whether those differences are centered at zero, and is used for before-and-after studies or matched pairs.
* **Median Sign Test** - The median sign test is a test of a null hypothesis about the median, MD, of a population based on the binomial distribution. To use the test, every subject is assigned a score of +, 0, or - depending on whether their data value is greater than, equal to, or less than the H₀ median. [[5]](https://stats.libretexts.org/Sandboxes/admin/Introduction_to_Applied_Statistics_for_Psychology_Students_(Sarty)/16%3A_Non-parametric_Tests)
* **Kruskal-Wallis Test (H Test)** - The Kruskal-Wallis Test is a non-parametric one-way ANOVA. It works on ranks rather than raw values, so it detects differences in location between groups (differences in medians when the groups have identically shaped distributions) rather than differences in means. The distribution behind the test is a discrete distribution called the H distribution that assumes the group samples come from populations with identically shaped distributions. [[5]](https://stats.libretexts.org/Sandboxes/admin/Introduction_to_Applied_Statistics_for_Psychology_Students_(Sarty)/16%3A_Non-parametric_Tests)
* **Spearman Rank Correlation Coefficient** - This is a rank alternative to the Pearson correlation coefficient that may be used when the assumption of normality is not met for hypothesis testing. [[5]](https://stats.libretexts.org/Sandboxes/admin/Introduction_to_Applied_Statistics_for_Psychology_Students_(Sarty)/16%3A_Non-parametric_Tests)
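Two of the tests in this list are simple enough to sketch with the Python standard library. This is a minimal illustration under stated assumptions, not the procedure from [5]: the data are made up, the function names are hypothetical, scores of 0 in the sign test are dropped (a common convention), and Spearman's coefficient is computed as the Pearson correlation of the ranks. For real analyses, `scipy.stats.binomtest` and `scipy.stats.spearmanr` are the usual tools.

```python
from math import comb, sqrt


def sign_test(sample, h0_median):
    """Two-sided median sign test; returns (n_plus, n_minus, p_value)."""
    n_plus = sum(1 for v in sample if v > h0_median)
    n_minus = sum(1 for v in sample if v < h0_median)
    n = n_plus + n_minus  # assumption: values equal to the H0 median are dropped
    k = min(n_plus, n_minus)
    # P(X <= k) for X ~ Binomial(n, 0.5), doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return n_plus, n_minus, min(1.0, 2 * tail)


def ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the two rank vectors."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Hypothetical exam scores tested against an H0 median of 50.
scores = [52, 55, 48, 61, 50, 58, 63, 47, 57, 60]
n_plus, n_minus, p_sign = sign_test(scores, 50)
print(n_plus, n_minus, p_sign)

# Hypothetical paired observations (hours studied, quiz score).
hours = [1, 2, 3, 4, 5]
quiz = [2, 1, 4, 3, 5]
rho = spearman_rho(hours, quiz)
print(rho)
```

With these made-up numbers there are 7 plus signs and 2 minus signs (one score equals 50 and is dropped), giving a two-sided p-value of 92/512 ≈ 0.18, and the rank correlation is 0.8.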

## As explained by ChatGPT 4o mini

### Prompt Given

> Describe the concept, purpose,  assumptions, applicability and methods for Non-Parametric Tests as a tutorial for someone with basic but not advanced statistical background, such as a students in a stats 201 style course.

### Non-Parametric Tests: A Beginner’s Guide

#### What are Non-Parametric Tests?

Non-parametric tests are statistical methods that do not assume a specific distribution for the data. Unlike parametric tests (like t-tests or ANOVA), which assume that data follows a normal distribution, non-parametric tests are more flexible and can be used with a variety of data types, especially when the data does not meet the assumptions required for parametric tests.

#### Purpose of Non-Parametric Tests

The primary purpose of non-parametric tests is to analyze data when:
- The sample size is small.
- The data is ordinal (ranked) rather than interval or ratio (where differences between values are meaningful).
- The data does not meet the assumptions of normality or homogeneity of variance required by parametric tests.

Non-parametric tests are useful for:
- Comparing medians rather than means.
- Analyzing data that contains outliers or is skewed.
- Handling small sample sizes where normality cannot be assumed.

#### Assumptions of Non-Parametric Tests

1. **Data Type**: Non-parametric tests can handle ordinal or nominal data. They are also applicable to continuous data that doesn't meet the assumptions of parametric tests.
  
2. **Independence**: Observations should be independent of each other. This means that the value of one observation does not affect another.

3. **Distribution**: There are no specific distributional assumptions, which makes non-parametric tests suitable for data that may be skewed or contain outliers.

4. **Shape of Distribution**: While there are no strict assumptions about distribution, the underlying distributions should have a similar shape when comparing groups.

#### Applicability of Non-Parametric Tests

Non-parametric tests can be used in various scenarios, including:
- **Small Sample Sizes**: When the sample size is too small to reliably assess normality.
- **Non-Normal Distributions**: When data is not normally distributed or is skewed.
- **Ordinal Data**: When dealing with ranked data (e.g., survey results).
- **Heteroscedasticity**: When variances are unequal among groups.

#### Common Non-Parametric Tests

1. **Mann-Whitney U Test**:
   - **Purpose**: Compares two independent groups to determine if their population distributions differ.
   - **Applicability**: Used when comparing two groups that are not normally distributed.

2. **Wilcoxon Signed-Rank Test**:
   - **Purpose**: Compares two related samples to assess whether their population mean ranks differ.
   - **Applicability**: Used for before-and-after studies or matched pairs.

3. **Kruskal-Wallis H Test**:
   - **Purpose**: An extension of the Mann-Whitney U test for comparing three or more independent groups.
   - **Applicability**: Used when the assumption of normality is violated for multiple groups.

4. **Friedman Test**:
   - **Purpose**: Used to detect differences in treatments across multiple test attempts.
   - **Applicability**: Suitable for repeated measures on the same subjects.

5. **Chi-Square Test**:
   - **Purpose**: Tests the relationship between two categorical variables.
   - **Applicability**: Used for frequency data in contingency tables.
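
As a notebook-side illustration of the first and third tests in this list, the sketch below computes the Mann-Whitney U statistic and the Kruskal-Wallis H statistic directly from pooled ranks. Everything here is an assumption made for illustration: the three groups are made-up data, the function names are hypothetical, there is no tie correction, and the Mann-Whitney p-value uses the large-sample normal approximation without a continuity correction, which is rough for samples this small. In practice `scipy.stats.mannwhitneyu` and `scipy.stats.kruskal` handle these details.

```python
from math import sqrt
from statistics import NormalDist


def ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def mann_whitney_u(x, y):
    """Return (U for x, two-sided p-value from the normal approximation)."""
    n1, n2 = len(x), len(y)
    pooled_ranks = ranks(list(x) + list(y))
    rank_sum_x = sum(pooled_ranks[:n1])
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # no tie correction
    z = (u_x - mu) / sigma
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return u_x, p_value


def kruskal_wallis_h(*groups):
    """Return the H statistic (no tie correction) for two or more groups."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    pooled_ranks = ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(pooled_ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)


# Hypothetical measurements for three independent groups.
group_a = [12, 15, 14, 18, 16]
group_b = [9, 11, 10, 13, 8]
group_c = [20, 22, 19, 24, 21]

u_stat, u_p = mann_whitney_u(group_a, group_b)
h_stat = kruskal_wallis_h(group_a, group_b, group_c)
print("U =", u_stat, "approx p =", round(u_p, 4))
print("H =", round(h_stat, 2))
```

For these made-up groups U is 24 out of a possible 25 (approximate p of roughly 0.016) and H is 12.02. For large samples, H is usually compared with a chi-square distribution with k - 1 degrees of freedom, where k is the number of groups.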

#### Conclusion

Non-parametric tests are valuable tools in statistics, especially when the assumptions of parametric tests cannot be met. They provide robust alternatives for analyzing data, making them particularly useful for students and researchers working with diverse datasets. Understanding when and how to apply these tests can enhance your ability to conduct meaningful analyses and draw valid conclusions from your data.

## Claude's reply to that

### Prompt Given

> Evaluate this introduction to non-parametric tests for accuracy and clarity and suggest any appropriate corrections or clarifications. Please also suggest any relevant references.

### Response

Your introduction to non-parametric tests is generally accurate and clear. It provides a good overview for beginners. However, I can suggest a few minor corrections and clarifications to enhance its accuracy and completeness:

1. In the "Purpose of Non-Parametric Tests" section:
   - It's worth noting that non-parametric tests are not always less powerful than parametric tests. In some cases, especially when assumptions of parametric tests are violated, non-parametric tests can be more powerful.

2. In the "Assumptions of Non-Parametric Tests" section:
   - The point about "Shape of Distribution" is somewhat oversimplified. While some non-parametric tests (like the Mann-Whitney U test) assume similar shapes of distribution when comparing groups, this is not true for all non-parametric tests.

3. In the "Applicability of Non-Parametric Tests" section:
   - You could add that non-parametric tests are also useful when dealing with censored data, which is common in survival analysis.

4. In the "Common Non-Parametric Tests" section:
   - For the Mann-Whitney U Test, it's more accurate to say it tests whether one group tends to have higher values than the other, rather than just if distributions differ.
   - For the Wilcoxon Signed-Rank Test, it compares the distribution of differences in paired data, not just mean ranks.
   - The Chi-Square Test is often considered a separate category (distribution-free test) rather than strictly non-parametric.

5. Additional points to consider including:
   - Mention of permutation tests as a powerful non-parametric approach.
   - Discussion of the trade-offs between parametric and non-parametric tests (e.g., ease of interpretation, power under different conditions).

Suggested references to include:

1. Hollander, M., Wolfe, D. A., & Chicken, E. (2013). Nonparametric statistical methods (3rd ed.). John Wiley & Sons.

2. Siegel, S., & Castellan, N. J. (1988). Nonparametric statistics for the behavioral sciences (2nd ed.). McGraw-Hill.

3. Conover, W. J. (1999). Practical nonparametric statistics (3rd ed.). John Wiley & Sons.

4. Gibbons, J. D., & Chakraborti, S. (2011). Nonparametric statistical inference (5th ed.). Chapman and Hall/CRC.

5. Good, P. I. (2005). Permutation, parametric, and bootstrap tests of hypotheses (3rd ed.). Springer.

These adjustments and additions would make your introduction more comprehensive and nuanced, providing beginners with a solid foundation in non-parametric tests.
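
The response mentions permutation tests without showing one, so here is a minimal sketch of an exact two-sample permutation test on the difference in means. The data and the function name are hypothetical, and enumerating every split is only feasible for small samples; for larger ones a random subset of permutations is normally used (SciPy offers `scipy.stats.permutation_test` for that purpose).

```python
from itertools import combinations


def exact_permutation_test(x, y):
    """Return (observed mean difference, exact two-sided p-value)."""
    pooled = list(x) + list(y)
    n_x, n_y = len(x), len(y)
    total = sum(pooled)

    def mean_diff(x_sum):
        # Mean of the relabelled "x" group minus the mean of the remainder.
        return x_sum / n_x - (total - x_sum) / n_y

    observed = mean_diff(sum(x))
    n_splits = 0
    n_extreme = 0
    # Under H0 the group labels are exchangeable, so every way of choosing
    # which n_x pooled values are labelled "x" is equally likely.
    for idx in combinations(range(len(pooled)), n_x):
        n_splits += 1
        diff = mean_diff(sum(pooled[i] for i in idx))
        if abs(diff) >= abs(observed) - 1e-12:  # tolerance for float rounding
            n_extreme += 1
    return observed, n_extreme / n_splits


# Hypothetical treatment and control measurements.
treatment = [12, 15, 14, 18]
control = [9, 11, 10, 13]

observed_diff, p_value = exact_permutation_test(treatment, control)
print("observed difference:", observed_diff)
print("exact two-sided p-value:", round(p_value, 4))
```

Here 4 of the 70 possible splits give a difference at least as extreme as the observed 4.0, so the exact p-value is 4/70 ≈ 0.057. No distributional assumption is needed beyond exchangeability of the labels under H₀.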

## References

1. Hollander, M., Wolfe, D. A., & Chicken, E. (2013). Nonparametric statistical methods (3rd ed.). John Wiley & Sons. <https://www.wiley.com/en-us/Nonparametric+Statistical+Methods%2C+3rd+Edition-p-9780470387375>

2. Siegel, S., & Castellan, N. J. (1988). Nonparametric statistics for the behavioral sciences (2nd ed.). McGraw-Hill. At [Google Books](https://www.google.com/books/edition/_/ha3AQgAACAAJ?hl=en&sa=X&ved=2ahUKEwiks-3h_52JAxWMJzQIHQuAB7AQ8fIDegQIKhAo)

3. Gibbons, J. D., & Chakraborti, S. (2011). Nonparametric statistical inference (5th ed.). Taylor & Francis. At [Google Books](https://www.google.com/books/edition/_/Xhm4IgAACAAJ?hl=en&sa=X&ved=2ahUKEwiyn5OCgJ6JAxUREzQIHWxoOoYQre8FegQIAxAq)

4. LibreTexts. (2022). 16.1: Prelude to Non-Parametric Tests. <https://stats.libretexts.org/Sandboxes/admin/Introduction_to_Applied_Statistics_for_Psychology_Students_(Sarty)/16%3A_Non-parametric_Tests/16.01%3A_Prelude_to_Non-Parametric_Tests>

5. LibreTexts. (2022). 16: Non-parametric Tests. <https://stats.libretexts.org/Sandboxes/admin/Introduction_to_Applied_Statistics_for_Psychology_Students_(Sarty)/16%3A_Non-parametric_Tests>
