ENH: Add k-sample Anderson-Darling test to stats module #3183
@@ -28,7 +28,7 @@
            'boxcox_llf', 'boxcox', 'boxcox_normmax', 'boxcox_normplot',
            'shapiro', 'anderson', 'ansari', 'bartlett', 'levene', 'binom_test',
            'fligner', 'mood', 'wilcoxon',
-           'pdf_fromgamma', 'circmean', 'circvar', 'circstd',
+           'pdf_fromgamma', 'circmean', 'circvar', 'circstd', 'anderson_ksamp'
            ]
@@ -1131,6 +1131,137 @@ def rootfunc(ab,xj,N):
    return A2, critical, sig

def anderson_ksamp(*args):
    """The Anderson-Darling test for k-samples.

    The k-sample Anderson-Darling test is a modification of the
    one-sample Anderson-Darling test. It tests the null hypothesis
    that k samples are drawn from the same population without having
    to specify the distribution function of that population. The
    critical values depend on the number of samples.

    Parameters
    ----------
    sample1, sample2, ... : array_like
        k arrays of sample data

    Returns
    -------
    Tk : float
        Normalized k-sample Anderson-Darling test statistic, not
        adjusted for ties
    tm : array
        The critical values for significance levels 25%, 10%, 5%, 2.5%, 1%
    p : float
        An approximate significance level at which the null hypothesis
        for the provided samples can be rejected

    Raises
    ------
    ValueError
        If fewer than 2 samples are provided, a sample is empty, or no
        distinct observations are in the samples.

    See Also
    --------
    ks_2samp : 2 sample Kolmogorov-Smirnov test
    anderson : 1 sample Anderson-Darling test

    Notes
    -----
    [1]_ defines two versions of the k-sample Anderson-Darling test:
    one for continuous distributions and one for discrete
    distributions, in which ties between samples may occur. This
    routine computes the former. According to [1]_, the two test
    statistics differ only slightly if a few collisions due to
    round-off errors occur in the test not adjusted for ties between
    samples.

    .. versionadded:: 0.14.0

    References
    ----------
    .. [1] Scholz, F. W. & Stephens, M. A. (1987), K-Sample
           Anderson-Darling Tests, Journal of the American Statistical
           Association, Vol. 82, pp. 918-924.
    Examples
    --------
    >>> import numpy as np
    >>> from scipy import stats
    >>> np.random.seed(314159)

    The null hypothesis that the two random samples come from the same
    distribution can be rejected at the 5% level because the returned
    test value is greater than the critical value for 5% (1.961) but
    not at the 2.5% level. The interpolation gives an approximate
    significance level of 3.1%:

    >>> stats.anderson_ksamp(np.random.normal(size=50),
    ...                      np.random.normal(loc=0.5, size=30))
    (2.4632469079409978, array([ 0.325,  1.226,  1.961,  2.718,  3.752]),
    0.03130207656720708)

    The null hypothesis cannot be rejected for three samples from an
    identical distribution. The approximate p-value (87%) has to be
    computed by extrapolation and may not be very accurate:

    >>> stats.anderson_ksamp(np.random.normal(size=50),
    ...                      np.random.normal(size=30),
    ...                      np.random.normal(size=20))
    (-0.72478622084152444,
    array([ 0.44925884,  1.3052767 ,  1.9434184 ,  2.57696569,  3.41634856]),
    0.8732440333177699)

    """
    k = len(args)
    if (k < 2):
        raise ValueError("anderson_ksamp needs at least two samples")

Review comment: blank lines below this and the next couple of …

Review comment: only 2 samples? IIRC other tests need 5 to continue with a warning and 20 for no warning. 2 certainly isn't enough for useful results.

Reply: never mind, figured this one out. The wording is a bit confusing, I propose "two sets of samples". And then there should be the check for number of values per set of samples.
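A minimal sketch of the kind of per-set size check suggested above; the helper name, the threshold of 5 observations, and the warning text are all illustrative assumptions, not part of this PR:

```python
import warnings
import numpy as np

def _check_sample_sizes(samples, min_size=5):
    # Hypothetical helper: reject empty sets and warn when any set of
    # samples is smaller than an assumed minimum size.
    sizes = np.array([np.asarray(s).size for s in samples])
    if np.any(sizes == 0):
        raise ValueError("each set of samples needs at least one "
                         "observation")
    if np.any(sizes < min_size):
        warnings.warn("at least one set of samples has fewer than %d "
                      "observations; results may be unreliable" % min_size)
```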
    samples = list(map(np.asarray, args))
    Z = np.hstack(samples)
    N = Z.size
    Zstar = np.unique(Z)
    L = Zstar.size
    if not L > 1:
        raise ValueError("anderson_ksamp needs more than one distinct "
                         "observation")

    n = np.array([sample.size for sample in samples])
    if any(n == 0):
        raise ValueError("anderson_ksamp encountered sample without "
                         "observations")

    A2kN = 0.
    lj = np.array([(Z == zj).sum() for zj in Zstar[:-1]])
    Bj = lj.cumsum()
    for i in arange(0, k):
        fij = np.array([(samples[i] == zj).sum() for zj in Zstar[:-1]])
        Mij = fij.cumsum()
        inner = lj / float(N) * (N * Mij - Bj * n[i])**2 / (Bj * (N - Bj))

Review comment: Is this equation (6) in Scholz and Stephens, and not equation (3)? Discrete parent population, not continuous as in the docstring, lines 1174, 1175?

Reply: Yes, it is. The docstring needs to be fixed.
        A2kN += inner.sum() / n[i]

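For readers lining the loop up with the paper, the quantity accumulated above is, transcribed directly from the code (not quoted from the reference): $l_j$ counts observations equal to the $j$-th distinct value, $B_j$ is their cumulative sum, and $M_{ij}$ counts observations of sample $i$ among the first $j$ distinct values, so

$$
\mathrm{A2kN} \;=\; \sum_{i=1}^{k} \frac{1}{n_i} \sum_{j=1}^{L-1}
\frac{l_j}{N}\,\frac{\left(N M_{ij} - n_i B_j\right)^2}{B_j\,(N - B_j)} .
$$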
    h = (1. / arange(1, N)).sum()
    H = (1. / n).sum()
    g = 0
    for l in arange(1, N-1):
        inner = np.array([1. / ((N - l) * m) for m in arange(l+1, N)])
        g += inner.sum()

    a = (4*g - 6) * (k - 1) + (10 - 6*g)*H
    b = (2*g - 4)*k**2 + 8*h*k + (2*g - 14*h - 4)*H - 8*h + 4*g - 6
    c = (6*h + 2*g - 2)*k**2 + (4*h - 4*g + 6)*k + (2*h - 6)*H + 4*h
    d = (2*h + 6)*k**2 - 4*h*k
    sigmasq = (a*N**3 + b*N**2 + c*N + d) / ((N - 1.) * (N - 2.) * (N - 3.))

Review comment: this is based on equation (4)?

Reply: Yes.
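Transcribing the variance computation above into a single formula (again taken from the code, with $h$, $H$ and $g$ as computed just before):

$$
\sigma_N^2 \;=\; \frac{a N^3 + b N^2 + c N + d}{(N-1)(N-2)(N-3)},
\qquad
h = \sum_{i=1}^{N-1} \frac{1}{i},\quad
H = \sum_{i=1}^{k} \frac{1}{n_i},\quad
g = \sum_{l=1}^{N-2}\,\sum_{m=l+1}^{N-1} \frac{1}{(N-l)\,m} .
$$

The statistic is then standardized as $(\mathrm{A2kN} - (k-1))/\sigma_N$, which is what the two lines that follow compute.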
    m = k - 1
    Tk = (A2kN - m) / math.sqrt(sigmasq)

    # Coefficients for the critical value interpolation
    # tm = b0 + b1/sqrt(m) + b2/m computed below
    b0 = np.array([0.675, 1.281, 1.645, 1.96, 2.326])
    b1 = np.array([-0.245, 0.25, 0.678, 1.149, 1.822])
    b2 = np.array([-0.105, -0.305, -0.362, -0.391, -0.396])
Review comment: This deserves a comment above the line.
    tm = b0 + b1 / math.sqrt(m) + b2 / m
    pf = np.polyfit(tm, log(np.array([0.25, 0.1, 0.05, 0.025, 0.01])), 2)
    if Tk < tm.min() or Tk > tm.max():
        warnings.warn("approximate p-value will be computed by extrapolation")

Review comment: Is this warning needed? It shows up in most of the test cases, so I'm guessing it's not that uncommon (didn't check). If so, adding a note in the docstring might make more sense. If the warning has to be kept, it shouldn't show up in the test output (can be silenced within a …).

Reply: I'm not sure what the best pattern for cases like this is. I don't know how good the extrapolation is; it might have quite a large error in some ranges. For most use cases the exact p-value outside [0.01, 0.25] doesn't really matter, and just mentioning it in the docstring would be enough. But I guess there would be multiple-testing applications where smaller p-values are relevant, and users need to be aware that those are not very precise.

Reply: I don't think the quality of the interpolation is known. Scholz & Stephens vary the polynomial order depending on the number of samples and provide no guidance for what a general procedure should use. The test cases are taken from Scholz and Stephens and happen to be cases where the null hypothesis can be rejected at better than the 1% level. Given the unknown level of accuracy I'd prefer to keep the warning, unless there's a strong preference to move it to the docstring.

Reply: OK, that's fine with me then.

Reply: warning is fine with me too.
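If the warning is kept, it can be silenced locally with the standard library alone; a minimal sketch (the sample arrays here are placeholders):

```python
import warnings
import numpy as np
from scipy import stats

x = np.random.normal(size=50)   # placeholder data
y = np.random.normal(size=30)

# Suppress the extrapolation warning locally, e.g. inside a test.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", UserWarning)
    Tk, tm, p = stats.anderson_ksamp(x, y)
```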
    p = math.exp(np.polyval(pf, Tk))
    return Tk, tm, p

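To make the critical-value and p-value machinery concrete outside the function, here is a self-contained sketch of the same computation for an assumed four-sample case (m = k - 1 = 3) and an assumed statistic value; both numbers are illustrative, not taken from the PR:

```python
import math
import numpy as np

m = 3  # k - 1 for four samples (illustrative)
b0 = np.array([0.675, 1.281, 1.645, 1.96, 2.326])
b1 = np.array([-0.245, 0.25, 0.678, 1.149, 1.822])
b2 = np.array([-0.105, -0.305, -0.362, -0.391, -0.396])

# Critical values for significance levels 25%, 10%, 5%, 2.5%, 1%
tm = b0 + b1 / math.sqrt(m) + b2 / m

# Quadratic fit of log significance level against critical value,
# evaluated at an assumed statistic Tk = 2.0
pf = np.polyfit(tm, np.log([0.25, 0.1, 0.05, 0.025, 0.01]), 2)
Tk = 2.0
p = math.exp(np.polyval(pf, Tk))  # approximate p-value

# Outside [tm.min(), tm.max()] this becomes an extrapolation,
# which is what triggers the warning discussed above.
```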
def ansari(x,y):
    """
    Perform the Ansari-Bradley test for equal scale parameters
@@ -10,7 +10,7 @@
 from numpy.random import RandomState
 from numpy.testing import (TestCase, run_module_suite, assert_array_equal,
     assert_almost_equal, assert_array_less, assert_array_almost_equal,
-    assert_raises, assert_, assert_allclose, assert_equal, dec)
+    assert_raises, assert_, assert_allclose, assert_equal, dec, assert_warns)

 from scipy import stats


@@ -83,6 +83,64 @@ def test_bad_arg(self):
    assert_raises(ValueError, stats.anderson, [1], dist='plate_of_shrimp')

class TestAndersonKSamp(TestCase):
    def test_example1(self):
        # Example data from Scholz & Stephens (1987), originally
        # published in Lehmann (1995, Nonparametrics, Statistical
        # Methods Based on Ranks, p. 309)
        # Pass a mixture of lists and arrays
        t1 = [38.7, 41.5, 43.8, 44.5, 45.5, 46.0, 47.7, 58.0]
        t2 = np.array([39.2, 39.3, 39.7, 41.4, 41.8, 42.9, 43.3, 45.8])
        t3 = np.array([34.0, 35.0, 39.0, 40.0, 43.0, 43.0, 44.0, 45.0])
        t4 = np.array([34.0, 34.8, 34.8, 35.4, 37.2, 37.8, 41.2, 42.8])
        Tk, tm, p = assert_warns(UserWarning, stats.anderson_ksamp, t1, t2,
                                 t3, t4)
        assert_almost_equal(Tk, 4.449, 3)
        assert_array_almost_equal([0.4985, 1.3237, 1.9158, 2.4930, 3.2459],
                                  tm, 4)
        assert_almost_equal(p, 0.0021, 4)

Review comment: Are the Tk and p-values in the unit tests from Scholz and Stephens, or "regression test" numbers?

Reply: The Tk values are from Scholz and Stephens. The p-values differ by one at the last digit because I used a second-order polynomial instead of a linear one. The choice was motivated by looking at the interpolation of the two test cases.
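A quick way to see the effect being described; the critical values and statistic come from test_example1 above, while the comparison itself is illustrative:

```python
import math
import numpy as np

# Critical values and statistic from test_example1 above
tm = np.array([0.4985, 1.3237, 1.9158, 2.4930, 3.2459])
Tk = 4.449
levels = np.log([0.25, 0.1, 0.05, 0.025, 0.01])

# Compare linear and quadratic interpolation of log significance level
for order in (1, 2):
    pf = np.polyfit(tm, levels, order)
    print(order, math.exp(np.polyval(pf, Tk)))
```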
    def test_example2(self):
        # Example data taken from an earlier technical report of
        # Scholz and Stephens
        # Pass lists instead of arrays
        t1 = [194, 15, 41, 29, 33, 181]
        t2 = [413, 14, 58, 37, 100, 65, 9, 169, 447, 184, 36, 201, 118]
        t3 = [34, 31, 18, 18, 67, 57, 62, 7, 22, 34]
        t4 = [90, 10, 60, 186, 61, 49, 14, 24, 56, 20, 79, 84, 44, 59, 29,
              118, 25, 156, 310, 76, 26, 44, 23, 62]
        t5 = [130, 208, 70, 101, 208]
        t6 = [74, 57, 48, 29, 502, 12, 70, 21, 29, 386, 59, 27]
        t7 = [55, 320, 56, 104, 220, 239, 47, 246, 176, 182, 33]
        t8 = [23, 261, 87, 7, 120, 14, 62, 47, 225, 71, 246, 21, 42, 20, 5,
              12, 120, 11, 3, 14, 71, 11, 14, 11, 16, 90, 1, 16, 52, 95]
        t9 = [97, 51, 11, 4, 141, 18, 142, 68, 77, 80, 1, 16, 106, 206, 82,
              54, 31, 216, 46, 111, 39, 63, 18, 191, 18, 163, 24]
        t10 = [50, 44, 102, 72, 22, 39, 3, 15, 197, 188, 79, 88, 46, 5, 5,
               36, 22, 139, 210, 97, 30, 23, 13, 14]
        t11 = [359, 9, 12, 270, 603, 3, 104, 2, 438]
        t12 = [50, 254, 5, 283, 35, 12]
        t13 = [487, 18, 100, 7, 98, 5, 85, 91, 43, 230, 3, 130]
        t14 = [102, 209, 14, 57, 54, 32, 67, 59, 134, 152, 27, 14, 230, 66,
               61, 34]
        Tk, tm, p = assert_warns(UserWarning, stats.anderson_ksamp, t1, t2,
                                 t3, t4, t5, t6, t7, t8, t9, t10, t11, t12,
                                 t13, t14)
        assert_almost_equal(Tk, 3.288, 3)
        assert_array_almost_equal([0.5990, 1.3269, 1.8052, 2.2486, 2.8009],
                                  tm, 4)
        assert_almost_equal(p, 0.0041, 4)
    def test_not_enough_samples(self):
        assert_raises(ValueError, stats.anderson_ksamp, np.ones(5))

    def test_no_distinct_observations(self):
        assert_raises(ValueError, stats.anderson_ksamp, np.ones(5),
                      np.ones(5))

    def test_empty_sample(self):
        assert_raises(ValueError, stats.anderson_ksamp, np.ones(5), [])


class TestAnsari(TestCase):

    def test_small(self):
Review comment: Could you use some more descriptive variable names? `p` is so widely used that it may be OK, but `Tk` and `tm` are not. Same comment for the internal variables; they're almost all one-letter, which isn't readable.

Reply: Actually, why not name the return values consistent with `anderson`?

Reply: I changed the return values to be as consistent with `anderson` as possible. I prefer to keep the internal variable names unchanged. The implementation of methods from the literature can only be understood together with the paper describing the method. In such cases I find it much more helpful to have the variable names in the code match the variable names in the paper, rather than trying to come up with descriptive names that anybody trying to match the code to the paper then has to translate back.

Reply: OK, the paper is available online for free, so for internal variables that should be fine.