# Wilcoxon Signed Rank Sum Test

In [1]:
import pandas as pd
from scipy import stats

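If the data are not normally distributed, the one-sample t-test may not be appropriate (although this test is fairly robust against deviations from normality). Instead, we can use a nonparametric test, which makes a statement about the location of the distribution (the median, under the assumption of symmetry) rather than about the mean.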
* We can do this by performing a <b>Wilcoxon signed rank sum test</b>.
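Whether the normality assumption is plausible can be checked before choosing a test. The sketch below is one possible way to do this, using the Shapiro-Wilk test from `scipy.stats`. The `differences` array is hypothetical simulated data, not the blood pressure data used in the example, and the 0.05 cut-off is an assumed convention.

```python
import numpy as np
from scipy import stats

# Hypothetical paired differences (simulated, heavy-tailed but symmetric).
# These are NOT the blood pressure data used in the example.
rng = np.random.default_rng(0)
differences = rng.standard_t(df=2, size=50)

# Shapiro-Wilk tests the null hypothesis that the sample
# was drawn from a normal distribution.
shapiro_stat, shapiro_p = stats.shapiro(differences)
print(f"Shapiro-Wilk W = {shapiro_stat:.3f}, p = {shapiro_p:.4f}")

# Assumed convention: a small p-value is taken as evidence against normality,
# which would favour the nonparametric test.
prefer_nonparametric = shapiro_p < 0.05
print("Prefer Wilcoxon signed rank test:", prefer_nonparametric)
```

For paired data, the check is applied to the differences, since those are what both the paired t-test and the Wilcoxon test operate on.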

## Example

A typical use case is paired data, for example blood pressure measured for each individual before and after some treatment, condition, or time point.

* <b>Null Hypothesis</b>: The differences between the pairs follow a symmetric distribution around zero.

In [2]:
df = pd.read_csv("blood_pressure.csv")  # create a DataFrame from the .csv file
df[['bp_before', 'bp_after']].describe()  # summary statistics for both conditions

| | bp_before | bp_after |
|---|---|---|
| count | 120.0 | 120.0 |
| mean | 156.45 | 151.358333 |
| std | 11.389845 | 14.177622 |
| min | 138.0 | 125.0 |
| 25% | 147.0 | 140.75 |
| 50% | 154.5 | 149.5 |
| 75% | 164.0 | 161.0 |
| max | 185.0 | 185.0 |


There are two ways to go about this using the `scipy.stats.wilcoxon()` function. The first is to calculate the differences between the two conditions and pass them to the function. The second is simpler: pass the two conditions and let the function compute the differences.

* The first approach: calculate the difference between the two conditions.

In [4]:
df['bp_difference'] = df['bp_before'] - df['bp_after']
df.loc[df['bp_difference'] == 0, 'bp_difference']  # show the pairs with no change

41     0
74     0
103    0
115    0
Name: bp_difference, dtype: int64

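Since some differences are 0, these pairs need to be excluded from the ranking process. `scipy.stats.wilcoxon()` does this by default (`zero_method='wilcox'`), so no manual filtering is required.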
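To make the ranking process concrete, the following sketch computes the two-sided test statistic by hand and compares it with the result from `scipy.stats.wilcoxon()`. The array `d` is a small hypothetical set of differences (it is not taken from `blood_pressure.csv`), chosen so that it contains two zeros.

```python
import numpy as np
from scipy import stats

# Hypothetical differences (before - after), including two zeros.
d = np.array([6.0, -2.0, 0.0, 9.0, 4.0, -1.0, 0.0, 3.0, -7.0, 5.0, 8.0, 10.0])

# Step 1: drop the zero differences.
nonzero = d[d != 0]

# Step 2: rank the absolute differences (ties would receive average ranks).
ranks = stats.rankdata(np.abs(nonzero))

# Step 3: sum the ranks of the positive and of the negative differences.
w_plus = ranks[nonzero > 0].sum()
w_minus = ranks[nonzero < 0].sum()

# Step 4: the two-sided statistic is the smaller of the two sums.
statistic = min(w_plus, w_minus)
print("W+ =", w_plus, "W- =", w_minus, "statistic =", statistic)

# Compare with SciPy, which discards zeros when zero_method='wilcox'.
result = stats.wilcoxon(d, zero_method="wilcox")
print("scipy statistic =", result.statistic)
```

The hand-computed statistic should agree with the one SciPy reports. The p-value is left to SciPy, since its calculation depends on the sample size and on whether ties or zeros are present.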

* First, pass the variable that contains the differences between the conditions.

In [5]:
stats.wilcoxon(df['bp_difference'])

WilcoxonResult(statistic=2234.5, pvalue=0.0014107333565442858)

* Second, pass both variables and let the function compute the differences.

In [6]:
stats.wilcoxon(df['bp_before'], df['bp_after'])

WilcoxonResult(statistic=2234.5, pvalue=0.0014107333565442858)

The findings are statistically significant at the conventional 0.05 level, so one can reject the null hypothesis.

<b>Interpretation:</b>
A Wilcoxon signed rank sum test was used to analyze the blood pressure before and after the intervention, to test whether the intervention had a significant effect on blood pressure. Blood pressure before the intervention was higher (mean ± SD = 156.45 ± 11.39 units) than blood pressure after the intervention (mean ± SD = 151.36 ± 14.18 units); there was a statistically significant decrease in blood pressure (T = 2234.5, p = 0.0014).
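The tests above are two-sided, so the direction of the change is read from the descriptive statistics. If a decrease is hypothesized in advance, one option is a one-sided test via the `alternative` parameter of `scipy.stats.wilcoxon()`. The sketch below uses small hypothetical `before` and `after` arrays, not the blood pressure data.

```python
import numpy as np
from scipy import stats

# Hypothetical paired measurements (NOT the blood pressure data above).
before = np.array([150, 148, 160, 155, 142, 158, 149, 165, 170, 152], dtype=float)
after = np.array([144, 150, 151, 151, 143, 155, 156, 160, 162, 142], dtype=float)

# Two-sided test: are the differences (before - after) symmetric about zero?
two_sided = stats.wilcoxon(before, after)

# One-sided test: alternative='greater' asks whether the differences
# (before - after) tend to be positive, i.e. whether values decreased.
one_sided = stats.wilcoxon(before, after, alternative="greater")

print("two-sided p =", two_sided.pvalue)
print("one-sided p =", one_sided.pvalue)
```

The direction of a one-sided test should be chosen before looking at the data. Otherwise the two-sided p-value is the safer quantity to report.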