Fixed error in calculation of Mann-Whitney statistic and added support to specify alternative hypothesis #144

Closed

@sebp
sebp commented Jan 31, 2012

The mannwhitneyu function does not always return the correct U statistic. See the following example:

from scipy.stats import mannwhitneyu

x = [19.8958398126694,19.5452691647182,19.0577309166425,21.716543054589,20.3269502208702,20.0009273294025,19.3440043632957,20.4216806548105,19.0649894736528,18.7808043120398,19.3680942943298,19.4848044069953,20.7514611265663,19.0894948874598,19.4975522356628,18.9971170734274,20.3239606288208,20.6921298083835,19.0724259532507,18.9825187935021,19.5144462609601,19.8256857844223,20.5174677102032,21.1122407995892,17.9490854922535,18.2847521114727,20.1072217648826,18.6439891962179,20.4970638083542,19.5567594734914]
y = [19.2790668029091,16.993808441865,18.5416338448258,17.2634018833575,19.1577183624616,18.5119655377495,18.6068455037221,18.8358343362655,19.0366413269742,18.1135025515417,19.2201873866958,17.8344909022841,18.2894380745856,18.6661374133922,19.9688601693252,16.0672254617636,19.00596360572,19.201561539032,19.0487501090183,19.0847908674356]

u, p = mannwhitneyu(x, y)
print(u, p)

In the example above, u is 102, but it should be 498.
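
The two values are complementary: x has 30 observations and y has 20, and 102 + 498 = 30 * 20 = 600. As a rough cross-check that does not call scipy at all, U can be counted directly as the number of (x, y) pairs in which the x value is larger. The helper name u_statistic is hypothetical and only serves this illustration; it is not part of the patch.

```python
# Data copied from the example above.
x = [19.8958398126694, 19.5452691647182, 19.0577309166425, 21.716543054589,
     20.3269502208702, 20.0009273294025, 19.3440043632957, 20.4216806548105,
     19.0649894736528, 18.7808043120398, 19.3680942943298, 19.4848044069953,
     20.7514611265663, 19.0894948874598, 19.4975522356628, 18.9971170734274,
     20.3239606288208, 20.6921298083835, 19.0724259532507, 18.9825187935021,
     19.5144462609601, 19.8256857844223, 20.5174677102032, 21.1122407995892,
     17.9490854922535, 18.2847521114727, 20.1072217648826, 18.6439891962179,
     20.4970638083542, 19.5567594734914]
y = [19.2790668029091, 16.993808441865, 18.5416338448258, 17.2634018833575,
     19.1577183624616, 18.5119655377495, 18.6068455037221, 18.8358343362655,
     19.0366413269742, 18.1135025515417, 19.2201873866958, 17.8344909022841,
     18.2894380745856, 18.6661374133922, 19.9688601693252, 16.0672254617636,
     19.00596360572, 19.201561539032, 19.0487501090183, 19.0847908674356]


def u_statistic(a, b):
    """Hypothetical helper: count pairs (ai, bj) with ai > bj.

    Ties contribute one half each, the usual convention.
    """
    return sum(1.0 if ai > bj else 0.5 if ai == bj else 0.0
               for ai in a for bj in b)


u_x = u_statistic(x, y)  # U for the first sample
u_y = u_statistic(y, x)  # U for the second sample

# The two statistics always add up to len(x) * len(y).
print(u_x, u_y, u_x + u_y)
```

Under this pairwise-count definition, 498 is the U of the first sample and 102 is the U of the second, so the reported value is the smaller of the two.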

In addition, you can now specify the alternative hypothesis. It defaults to 'less' for backwards compatibility.
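
To illustrate what choosing an alternative changes, here is a minimal sketch, not the code from this patch. It assumes the plain normal approximation with no tie or continuity correction, and the function name mann_whitney_sketch and the option names 'less', 'greater' and 'two-sided' are choices made for the illustration.

```python
import math


def mann_whitney_sketch(x, y, alternative="two-sided"):
    """Hypothetical sketch: U of the first sample plus an approximate p-value.

    Uses the normal approximation without tie or continuity correction,
    so p-values are only rough for small samples.
    """
    n1, n2 = len(x), len(y)
    # U counts pairs where the x value exceeds the y value (ties count 1/2).
    u = sum(1.0 if xi > yj else 0.5 if xi == yj else 0.0
            for xi in x for yj in y)
    mean = n1 * n2 / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean) / sd
    lower = 0.5 * math.erfc(-z / math.sqrt(2.0))  # P(Z <= z)
    upper = 0.5 * math.erfc(z / math.sqrt(2.0))   # P(Z >= z)
    if alternative == "less":
        p = lower
    elif alternative == "greater":
        p = upper
    elif alternative == "two-sided":
        p = min(1.0, 2.0 * min(lower, upper))
    else:
        raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")
    return u, p


a = [3, 5, 7, 9]
b = [1, 2, 4, 6]
u, p_less = mann_whitney_sketch(a, b, alternative="less")
_, p_greater = mann_whitney_sketch(a, b, alternative="greater")
_, p_two_sided = mann_whitney_sketch(a, b, alternative="two-sided")
print(u, p_less, p_greater, p_two_sided)
```

The U statistic is the same in all three calls; only the tail of the reference distribution used for the p-value changes.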

Code adapted from https://svn.r-project.org/R/trunk/src/library/stats/R/wilcox.test.R

@rgommers
Member

Hi, it would be good to fix this, but your code is based on GPL-licensed code. Therefore we can't use it as is. Would you be able to provide a new patch not based on R code?

Is the example above, which should give 498 as a result, your own? If so, it can be turned into a regression test in scipy/stats/tests/test_stats.py.

@rgommers
Member

The addition of an alternative keyword looks good to me.

@sebp
sebp commented Feb 1, 2012

Sorry, I didn't pay attention to the license issue. Obviously, I can't go back in time and not look at the R code. If it helps, R's wilcox.test does more things that scipy's version does not cover (one-sample tests, exact p-values). I changed the code a little bit and added a test case as well.

@rgommers
Member
rgommers commented Feb 1, 2012

The tests look good. Those are not based on R code, right? I guess what we should do is take those tests plus a description of what needs changing, and then let someone who didn't see this code or the R code fix the code independently.

@sebp
sebp commented Feb 1, 2012

That's correct. The tests are not based on R; I wrote them myself.

@rgommers
Member
rgommers commented Feb 5, 2012

I've put the tests in a separate commit, with one more fix for the existing test, and opened a ticket with this bug report at http://projects.scipy.org/scipy/ticket/1593.

I'll close this PR due to the licensing issue. I'll ask on the mailing list if someone is willing to re-implement the fix based on the tests. That's the safe way to do this.

@rgommers rgommers closed this Feb 5, 2012