<a href="https://colab.research.google.com/github/poonamaswani/DataScienceAndAI/blob/main/CAM_DS_C101_Demo_3_2_3.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**First things first** - please go to 'File' and select 'Save a copy in Drive' so that you have your own version of this activity set up and ready to use.
Remember to update the portfolio index link to your own work once completed!

# 3.2.3 Exploring the Mann–Whitney U and Wilcoxon signed-rank tests

Follow the demonstration to learn how to perform a Mann–Whitney U and Wilcoxon signed-rank test in Python. In this video, you will learn how to:
- Perform a Mann–Whitney U test.
- Perform a Wilcoxon signed-rank test.
- Interpret the output of these tests.

## a. Mann–Whitney U test
Also known as the Wilcoxon rank-sum test, the Mann–Whitney U test is a non-parametric statistical test used to compare two independent groups, that is, to assess whether values in one group tend to be larger than values in the other.

Consider the following scenario: AB Consulting is conducting a survey among their employees to determine the effectiveness of two training programmes in improving employee productivity. They divide the participants into two groups of eight people. Participants have to write a test after completing the training programme. The higher the mark, the better the productivity of the participants. The following table indicates the test scores of participants out of 40:

|Group 1|Group 2|
|---|---|
|12|18|
|14|20|
|15|21|
|19|25|
|22|28|
|24|33|
|29|36|
|30|40|

The hypotheses are:
- $H_0$: There is no difference in median productivity between the two groups.
- $H_1$: There is a difference in median productivity between the two groups.



In [None]:
# Import the necessary libraries.
import numpy as np
from scipy.stats import mannwhitneyu

In [None]:
# Create sample data for the Mann–Whitney U test.
group1 = np.array([12, 14, 15, 19,
                   22, 24, 29, 30])
group2 = np.array([18, 20, 21, 25,
                   28, 33, 36, 40])

In [None]:
# Perform the Mann–Whitney U test.
mannwhitney_stat, mannwhitney_p = mannwhitneyu(group1,
                                               group2)

# View the output.
print("Mann-Whitney statistic:", mannwhitney_stat)
print("Mann-Whitney p-value:", mannwhitney_p)

Mann-Whitney statistic: 17.0
Mann-Whitney p-value: 0.13038073038073036


The output indicates that the test statistic ($U$) is 17.0 and the $p$-value is approximately 0.1304. Since the $p$-value is greater than the common alpha ($\alpha$) level of 0.05, we fail to reject $H_0$: the data do not provide sufficient evidence of a difference between the two groups.
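To see where the $U$ value of 17.0 comes from, the statistic can be reproduced by hand. The following is a minimal sketch, not the internal SciPy implementation: it counts the pairs in which a Group 1 score exceeds a Group 2 score, with ties counted as one half. The names `diff`, `u1` and `u2` are introduced here for illustration only.

```python
import numpy as np

group1 = np.array([12, 14, 15, 19, 22, 24, 29, 30])
group2 = np.array([18, 20, 21, 25, 28, 33, 36, 40])

# Compare every Group 1 score with every Group 2 score (8 x 8 = 64 pairs).
diff = group1[:, None] - group2[None, :]

# U for Group 1: pairs where the Group 1 score is higher; ties count as 0.5.
u1 = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)

# U for Group 2 follows from the identity U1 + U2 = n1 * n2.
u2 = group1.size * group2.size - u1

print("U for Group 1:", u1)
print("U for Group 2:", u2)
```

The value for Group 1 matches the statistic reported above. Only 17 of the 64 pairs favour Group 1, which reflects the generally higher scores in Group 2, although with eight participants per group this is not enough to reach significance at the 0.05 level.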

## b. Wilcoxon signed-rank test
A non-parametric statistical test used to compare paired samples to determine whether the population mean ranks differ.

Consider the following scenario: CU Medical Consultants tested the effectiveness of a new medical treatment for anxiety. The levels of anxiety of seven patients were measured before and after treatment in a pilot study. The results, out of 30, are presented in the following table:

|Before|After|
|---|---|
|22|24|
|20|21|
|25|27|
|30|26|
|22|23|
|24|25|
|26|25|

The hypotheses are:
- $H_0$: The median difference between anxiety levels before and after treatment is zero (the treatment has no effect).
- $H_1$: The median difference between anxiety levels before and after treatment is not zero (the treatment has an effect).

In [None]:
# Import the necessary libraries.
import numpy as np
from scipy.stats import wilcoxon

In [None]:
# Create sample data for the Wilcoxon signed-rank test.
# The two sets of measurements are paired (e.g. before and after a treatment).
data_before = np.array([22, 20, 25, 30,
                        22, 24, 26])
data_after = np.array([24, 21, 27, 26,
                       23, 25, 25])

In [None]:
# Perform Wilcoxon signed-rank test.
wilcoxon_stat, wilcoxon_p = wilcoxon(data_before,
                                     data_after)

# View the output.
print("Wilcoxon statistic:", wilcoxon_stat)
print("Wilcoxon p-value:", wilcoxon_p)

Wilcoxon statistic: 9.5
Wilcoxon p-value: 0.46875


The output indicates that the test statistic is 9.5 and the $p$-value is approximately 0.4688. This $p$-value is also greater than 0.05, so we fail to reject $H_0$: the data do not provide sufficient evidence of a difference between the 'before' and 'after' anxiety levels.
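The statistic of 9.5 can also be reproduced by hand. The sketch below assumes the usual signed-rank procedure: take the paired differences, rank their absolute values (tied values share an average rank) and sum the ranks separately for positive and negative differences. The names `diff`, `ranks`, `w_plus` and `w_minus` are illustrative and are not part of the SciPy output.

```python
import numpy as np
from scipy.stats import rankdata

data_before = np.array([22, 20, 25, 30, 22, 24, 26])
data_after = np.array([24, 21, 27, 26, 23, 25, 25])

# Paired differences (before minus after); none of them is zero here.
diff = data_before - data_after

# Rank the absolute differences; rankdata averages the ranks of tied values.
ranks = rankdata(np.abs(diff))

# Sum the ranks of the positive and the negative differences separately.
w_plus = ranks[diff > 0].sum()
w_minus = ranks[diff < 0].sum()

print("Sum of positive ranks:", w_plus)
print("Sum of negative ranks:", w_minus)
print("Smaller of the two sums:", min(w_plus, w_minus))
```

The smaller of the two rank sums equals the statistic reported above. Note that several absolute differences are tied (four differences of 1 and two of 2), which is why the rank sums are not whole numbers.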

# Key information
The demonstration illustrated how to perform and interpret the Mann–Whitney U and Wilcoxon signed-rank tests. The interpretation of the $p$-values would depend on the $\alpha$ level you set for your test (commonly 0.05 for a 5% significance level).
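As a small illustration of how the conclusion depends on $\alpha$, the sketch below applies the usual decision rule (reject $H_0$ when the $p$-value is below $\alpha$) to the two rounded $p$-values from this demonstration. The helper `decide` is a hypothetical name introduced here, and the $\alpha$ values other than 0.05 are chosen purely for illustration.

```python
def decide(p_value, alpha=0.05):
    """Return the test decision for a given p-value and significance level."""
    if p_value < alpha:
        return "reject H0"
    return "fail to reject H0"


# Rounded p-values from the two tests in this demonstration.
p_values = {"Mann-Whitney U": 0.1304, "Wilcoxon signed-rank": 0.4688}

# Illustrative significance levels; 0.05 is the one used in the demonstration.
for alpha in (0.01, 0.05, 0.10, 0.15):
    for test_name, p in p_values.items():
        print(f"alpha={alpha:.2f} | {test_name}: {decide(p, alpha)}")
```

In practice, $\alpha$ should be chosen before looking at the data rather than adjusted afterwards to obtain a desired conclusion.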

## Reflect
What are the practical applications of this technique?

### Portfolio entry
Select here, then select the pen from the toolbar to add your entry.