-
-
Notifications
You must be signed in to change notification settings - Fork 5.4k
MAINT: stats.wilcoxon: improve documentation and tests #21867
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@chrisb83 Could you take a look at these follow-up suggestions?
scipy/stats/_morestats.py
Outdated
- The default, ``method='auto'``, selects between the two: when | ||
``len(d) <= 50`` and there are no zeros and no ties, the exact method | ||
is used; if the sample size is small and there are zeros or ties, the | ||
p-value is computed using `permutation_test`; | ||
otherwise, the approximate method is used. The p-value computed by | ||
the permutation test is deterministic since it is only used if the | ||
sample size is small enough to iterate over all possible outcomes. | ||
- The default, ``method='auto'``, selects between the two: | ||
``method='exact'`` is used when ``len(d) <= 50``, and | ||
``method='asymptotic'`` is used otherwise. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have not yet defined "ties" or "zeros". Above, it specifies
Assume that all elements of
d
are independent and identically distributed observations, and all are distinct and nonzero.
So let's document what happens under this assumption, then define ties and zeros and document how those affect the behavior.
if n_zero > 0 and method == "exact": | ||
warnings.warn("Exact p-value calculation does not work if there are " | ||
"zeros. Consider using method `asymptotic` or " | ||
"`stats.PermutationMethod`", | ||
stacklevel=2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As noted in #21567 (comment):
'exact'
is also not exact if there are ties, but there is no similar warning in that case.- Users have expressed confusion about what a "zero" is in the past.
- The recommended normal approximation /
method='asymptotic'
is not exact, either, so this doesn't motivate why one would be better than the other. (A permuation test would also not be exact unless the sample size is small.)
It's tough to address all the subtleties of accurate hypothesis testing in a warning; that's why we've elaborated on this stuff in the documentation. And I'm concerned that incomplete warnings like this can do more harm than good: it may give users the impression that the code will help them avoid shooting themselves in the foot, but scipy.stats
doesn't do that consistently. If we think a more consistent policy across the board is important, that is something we can consider, but for now, let's not give a false impression.
if 0 < d.shape[-1] < 10 and method == "asymptotic": | ||
warnings.warn("Sample size too small for normal approximation.", stacklevel=2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly here: does this imply that a sample size of 10 is large enough for a normal approximation?
Note that there wasn't actually a test for this warning before. It was always just filtered out, so presumably it was not deemed to be critical.
@@ -1471,7 +1471,7 @@ def test_wilcoxon_bad_arg(self): | |||
assert_raises(ValueError, stats.wilcoxon, [1, 2], [1, 2], "dummy") | |||
assert_raises(ValueError, stats.wilcoxon, [1, 2], [1, 2], | |||
alternative="dummy") | |||
assert_raises(ValueError, stats.wilcoxon, [1]*10, mode="xyz") | |||
assert_raises(ValueError, stats.wilcoxon, [1]*10, method="xyz") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
method
is the new name for mode
. I didn't change the tests when the change was made, but now seems like a good time to do it. In test_approx_mode
below, I test that mode
is still accepted.
w, p = stats.wilcoxon(d) | ||
assert_equal((w, p), stats.wilcoxon(d, method="asymptotic")) | ||
|
||
d[d == 0] = 1 # tie |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We test with a zero; we should test with a tie.
# n <= 50: if there are zeros in d = x-y, use PermutationMethod | ||
# this is a slower test to show that results are deterministic if n=13 | ||
pm = stats.PermutationMethod() | ||
d = np.arange(0, 13) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
method='exact'
gives the same result in this case. Let's test a case that allows us to distinguish between the two methods.
# `method='auto'`, there are zeros or ties in `d = x-y`, and `len(d) <= 13`. | ||
d = np.arange(-5, 8) # zero | ||
res = stats.wilcoxon(d) | ||
ref = (27.5, 0.3955078125) # stats.wilcoxon(d, method=PermutationMethod()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Save some time. It's OK to hard-code values given by SciPy code given that the intent of the test is to check that the result is deterministic.
ref = (27.5, 0.3955078125) # stats.wilcoxon(d, method=PermutationMethod()) | ||
assert_equal(res, ref) | ||
|
||
d[d == 0] = 1 # tie |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, we need to check for ties, too.
|
||
# n <= 50: if there are zeros in d = x-y, use PermutationMethod | ||
pm = stats.PermutationMethod() | ||
d = np.arange(0, 5) | ||
d = np.arange(-2, 5) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
method='exact'
gives the same result in this case. Let's test a case that allows us to distinguish between the two methods.
If the other tests pass, I'll fix the lint error and commit with |
[docs only]
Remaining doc failure seems unrelated, since it's about |
@chrisb83 Would you be willing to take a look here before we branch? The adjustments are pretty straightforward, I think. |
The majority of the changes is straight-forward (improved documentation, simplification of some tests). I am not sure how we deal with the removal of warnings.. For me, it is ok, though I could imagine that other people prefer to keep them. A warning is more visible to the user than the documentation of the limitations. So instead of removing the warning, one could also try to improve them. |
Certainly one could. This is part of what I meant by "If we think a more consistent policy across the board is important, that is something we can consider". That is not the direction I chose to go with this PR because it is a much bigger project. Instead, I chose to make an incremental improvement by removing the incomplete and potentially misleading warnings. However, I would welcome someone work on better warnings throughout stats. I'd suggest that this person open an issue about it, recommending that we audit functions for pitfalls and warn according to a unified policy. In the meantime, removing these incomplete warnings makes it clearer to users and developers that we do not currently protect against inaccurate results due to poor choice of |
OK, let's remove the warnings:
|
Reference issue
gh-21567
What does this implement/fix?
gh-21567 made the following improvements to
scipy.stats.wilcoxon
:method
parameter if it is explicitly passed by the user. In particular, ifmethod='exact'
is specified, use it regardless of whether there are ties/zeros.'auto'
method by performing a permutation test if the sample size is small and there are ties or zeros.method='approx'
tomethod='asymptotic'
for consistency with closely related testsmannwhitneyu
,kendalltau
, andpage_trend_test
.During the review process, I noticed some other items deserving of maintenance. This PR makes those adjustments; rationale is noted inline.