# One-Sample Wilcoxon Signed Rank Test
*By P. Stikker*<br>
https://PeterStatistics.com<br>
https://www.youtube.com/stikpet<br>

## Introduction

Two tests can be used to test a hypothesized median. The first does exactly that and is known as the sign test. However, another test is used more frequently and has a much scarier name: the one-sample Wilcoxon signed rank test (Wilcoxon, 1945). This second test uses rankings (it ranks the absolute differences between the scores and the hypothesized median) and because of this might give a slightly different result. Its advantage is that it can catch some smaller differences, and it is the one I will explain here. If you are curious about what the Wilcoxon test does exactly, I'd recommend <a href="https://onlinecourses.science.psu.edu/stat414/node/319">this site</a>. 
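To make the ranking idea a bit more concrete, here is a minimal sketch of the two rank sums computed by hand. The helper name `signed_rank_sums` and the toy scores are made up for illustration, and scores equal to the hypothesized median are simply dropped, which is only one of several ways to handle them.

```python
import numpy as np
from scipy.stats import rankdata

def signed_rank_sums(scores, hyp_median):
    """Hypothetical helper: return (W_plus, W_minus) for a one-sample signed rank test."""
    diffs = np.asarray(scores, dtype=float) - hyp_median
    diffs = diffs[diffs != 0]          # drop scores equal to the hypothesized median
    ranks = rankdata(np.abs(diffs))    # rank the absolute differences; ties share the average rank
    w_plus = ranks[diffs > 0].sum()    # rank sum of scores above the hypothesized median
    w_minus = ranks[diffs < 0].sum()   # rank sum of scores below the hypothesized median
    return w_plus, w_minus

toy_scores = [1, 2, 3, 3, 4, 4, 4]     # made-up data, not from the GSS file
w_plus, w_minus = signed_rank_sums(toy_scores, 2.5)
print(w_plus, w_minus)
```

If the hypothesized median were right, the two sums should be roughly equal. The test looks at how far apart they are.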

In the Example section, I'll discuss how to perform a one-sample Wilcoxon Signed Rank Test with an example.

## Example

First we'll need some example data. I'll import it as a Pandas dataframe, so will need 'pandas'.

In [1]:
import pandas as pd       # https://pandas.pydata.org/

And then load the example data using <a href="https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html">'read_csv'</a>. 

In [2]:
myDf = pd.read_csv('../Data/csv/GSS2012a.csv')

In [3]:
myField = myDf['accntsci'].dropna()

The 'accntsci' field, selected above with the missing values dropped, stores its categories as the original text labels. The test needs numeric values, so we should re-code the field.

Let's first see which options there are (using '<a href="https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.unique.html">unique</a>').

In [4]:
myField.unique()

array(['Very scientific', 'Pretty scientific', 'Not too scientific',
       'Not scientific at all'], dtype=object)

Let's assign these to numeric values by making a dictionary out of the coding:

In [5]:
myCoding = {'Not scientific at all': 1, 'Not too scientific': 2, 'Pretty scientific': 3, 'Very scientific': 4}

And now to replace the labels with their new codes (using '<a href="https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.replace.html">replace</a>'):

In [6]:
myField2 = myField.replace(myCoding)
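One thing to watch with 'replace' is that a label missing from the dictionary is left unchanged, so a typo in the coding can go unnoticed. As a hedged alternative sketch (not what this notebook uses), the pandas 'map' method turns unmapped labels into NaN, which makes them easy to spot. The names `toy_labels`, `toy_coding` and `toy_codes` are made up for this illustration.

```python
import pandas as pd

# made-up labels, standing in for the real field
toy_labels = pd.Series(['Very scientific', 'Not too scientific',
                        'Pretty scientific', 'Very scientific'])
toy_coding = {'Not scientific at all': 1, 'Not too scientific': 2,
              'Pretty scientific': 3, 'Very scientific': 4}

toy_codes = toy_labels.map(toy_coding)          # labels not in the dictionary become NaN
unmapped = toy_labels[toy_codes.isna()].unique()  # labels that could not be coded

print(toy_codes.tolist())
print(list(unmapped))
```

An empty list of unmapped labels means every label was covered by the coding.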

The Wilcoxon test can be found in the scipy.stats library, so let's import that:

In [7]:
from scipy.stats import wilcoxon  # https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.wilcoxon.html

The test requires a hypothesized median. I'd like to use the middle value of the possible scores for this. In the example that's right between 'Not too scientific' (2) and 'Pretty scientific' (3), so at 2.5.

In [8]:
medHyp = 2.5
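The 2.5 can also be derived from the coding rather than typed in. A small sketch, assuming the midpoint of the lowest and highest possible codes is the value wanted (the names `toy_coding` and `toy_mid` are made up here):

```python
# same coding as in the example, repeated so this snippet runs on its own
toy_coding = {'Not scientific at all': 1, 'Not too scientific': 2,
              'Pretty scientific': 3, 'Very scientific': 4}

possible_codes = list(toy_coding.values())
toy_mid = (min(possible_codes) + max(possible_codes)) / 2   # midpoint of the possible scores
print(toy_mid)
```

This uses the possible scores. Taking the minimum and maximum of the observed scores instead gives a different midpoint whenever an extreme category was never chosen.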

The test looks at the difference between each score and this hypothesized median, so we pass 'myField2-medHyp' to it. The test itself can then be done as shown below.

In [9]:
rank, pVal = wilcoxon(myField2-medHyp, zero_method = 'wilcox', correction = False)
print(rank)
print(pVal)

129626.0
8.078555909697749e-33


The 'zero_method' is about how the test should deal with zero differences, that is, scores that are equal to the hypothesized median. The options are:

* 'pratt', which includes the zero differences when determining the ranks, but then drops the ranks of the zeros themselves.
* 'wilcox', which simply removes the zero differences first.
* 'zsplit', which includes the zero differences and splits their ranks equally between the positive and negative ones.

The 'correction' parameter sets whether a continuity correction is applied when the p-value is calculated with the normal approximation.
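To see what the three 'zero_method' options do, here is a small sketch on made-up differences that include two zeros. The array `toy_diffs` is invented for illustration, and warnings are silenced because SciPy may complain about using the normal approximation on such a small sample.

```python
import warnings
import numpy as np
from scipy.stats import wilcoxon

# made-up differences (score minus hypothesized median), two of them zero
toy_diffs = np.array([0, 0, 1, -2, 3, 4, -5, 6, 7, 8, 9, 10], dtype=float)

toy_stats = {}
with warnings.catch_warnings():
    warnings.simplefilter('ignore')   # silence small-sample / zero-difference warnings
    for method in ('wilcox', 'pratt', 'zsplit'):
        res = wilcoxon(toy_diffs, zero_method=method, correction=False)
        toy_stats[method] = float(res.statistic)   # smaller of the two rank sums

print(toy_stats)
```

With 'wilcox' the zeros are dropped before ranking, so the two negative differences get ranks 2 and 5 (statistic 7). With 'pratt' the zeros take up ranks 1 and 2 first, pushing the negative differences to ranks 4 and 7 (statistic 11). With 'zsplit' half of the zeros' rank total of 3 is added on top of that (statistic 12.5).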

The 129626 is the test statistic (stored in the variable 'rank' above). It is either the sum of the ranks of the positive differences or the sum of the ranks of the negative differences, whichever is smaller.

The last value is the p-value (sig. in SPSS terms). It is the probability of a result as in the sample, or even more extreme, if the population median were indeed 2.5. Usually, if this value is below .05 we reject that assumption about the population and report that there is a significant difference.

For reporting, a so-called Z-value is often noted as well, but unfortunately it is not returned by the wilcoxon function. Guess we'll have to reverse-engineer that.

We'll need 'norm' (the normal distribution) from scipy.stats for that:

In [10]:
from scipy.stats import norm   # https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.norm.html

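Then use the found p-value to find the corresponding z-value. The p-value is two-sided, so it is halved first. Note that this always gives a negative z-value, and that it only recovers the test's own z-value if the p-value came from the normal approximation: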

In [11]:
norm.ppf(pVal/2)

-11.931822145966912
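As a check on the reverse-engineering, the z-value can also be computed directly from the test statistic. The sketch below uses the usual normal approximation with a tie correction and no continuity correction, on made-up differences (`toy_diffs`). The sample has 60 non-zero differences so that SciPy should pick the normal approximation by itself. That last point is an assumption about SciPy's automatic method choice, not something this notebook verifies.

```python
import numpy as np
from scipy.stats import norm, rankdata, wilcoxon

# made-up differences: 60 values, no zeros, with ties
toy_diffs = np.tile([-1.0, 2.0, 3.0, -4.0, 5.0, 6.0], 10)

n = len(toy_diffs)
ranks = rankdata(np.abs(toy_diffs))
w = min(ranks[toy_diffs > 0].sum(), ranks[toy_diffs < 0].sum())   # smaller rank sum

# mean and variance of the rank sum under the null hypothesis, with tie correction
_, tie_counts = np.unique(np.abs(toy_diffs), return_counts=True)
mean_w = n * (n + 1) / 4
var_w = n * (n + 1) * (2 * n + 1) / 24 - (tie_counts**3 - tie_counts).sum() / 48
z_formula = (w - mean_w) / np.sqrt(var_w)

# the reverse-engineered version, as used in this notebook
res = wilcoxon(toy_diffs, zero_method='wilcox', correction=False)
z_from_p = norm.ppf(res.pvalue / 2)

print(z_formula, z_from_p)
```

If the two numbers agree, the reverse-engineered z-value is the one the test used internally. If they differ, the p-value was most likely computed with an exact method instead.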

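I like to automate things, so below is a function that will do all the work. If no hypothesized median is given, it uses the midpoint between the lowest and highest observed scores. The mean function from Python's statistics library is imported here as well, although the function below calculates that midpoint directly with min and max: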

In [12]:
from statistics import mean    # https://docs.python.org/3/library/statistics.html#statistics.mean

Now for my own function:

In [13]:
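# Wilcoxon one-sample test

def wilcoxonOS(myData, field, catCoding=None, hypMed=None):
    # select the field and drop the missing values
    myField = myData[field].dropna()

    # re-code the labels to numeric values, if a coding is given
    if catCoding is not None:
        myField = myField.replace(catCoding)
    myFreq = myField.value_counts()

    # default hypothesized median: midpoint of the lowest and highest observed scores
    myMed = hypMed
    if hypMed is None:
        myMed = (min(myFreq.index) + max(myFreq.index)) / 2

    # test statistic and two-sided p-value
    rank, pVal = wilcoxon(myField - myMed, zero_method='wilcox', correction=False)
    # reverse-engineered z-value (always negative)
    zVal = norm.ppf(pVal / 2)

    return rank, pVal, zVal, myMed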

Let's test it out:

In [14]:
wilcoxonOS(myDf, 'accntsci', catCoding={'Not scientific at all': 1, 'Not too scientific': 2, 'Pretty scientific': 3, 'Very scientific': 4}, hypMed = 2.5)

(129626.0, 8.078555909697749e-33, -11.931822145966912, 2.5)

And another example:

In [15]:
myDf = pd.read_csv('../Data/csv/StudentStatistics.csv')
wilcoxonOS(myDf, 'Teach_Motivate', catCoding = {'Fully Disagree': 1, 'Disagree': 2, 'Neither disagree nor agree': 3, 'Agree': 4, 'Fully agree': 5})

(236.5, 0.005298523793092, -2.7883013105590395, 3.0)

## References

Wilcoxon, F. (1945). Individual comparisons by ranking methods. *Biometrics Bulletin, 1*(6), 80–83. doi:10.2307/3001968