
ENH: stats: add Page's L test #12531

Merged
merged 40 commits into from
Jan 27, 2021

Conversation


@mdhaber mdhaber commented Jul 12, 2020

In the CZI Proposal, we indicated that we would add Page's L test.

This is part of our effort to address the top level Statistics Enhancements roadmap item "Expand the set of hypothesis tests."


Original post and updates (most have been addressed)
The initial commit is just proposed documentation so we can discuss the signature and functionality. Update: it's all here now.

I may be getting carried away with the three different methods. Some other stats methods implement only asymptotic approximations, so I assume I could do the same here, but I think that naively-implemented exact (permutation) and Monte Carlo methods would be useful without adding too much difficulty. Update: Naive exact was too slow, but I added a more efficient exact method.

Questions:

  1. Is it OK to have the function in its own file? Have I integrated it into scipy.stats correctly? Do you agree that pagel is the appropriate name, considering other tests are named like kendalltau, pearsonr, and spearmanr? Update: I followed epps_singleton_2samp as an example, which imports into stats.py and adds it into __all__ there. Should I move both of these directly to __init__.py?
  2. Can you suggest a better name for any argument?
  3. Should we allow the user to pass in raw data instead of ranks? If not, then we don't need the second argument, and perhaps a better name for the first argument would be ranks. Update: I think we should keep it. R does.
  4. Should we allow the user to pass in data with the columns out of the hypothesized order? If not, then we don't need the third argument.
  5. Do we want all the proposed methods? When there are no ties, 'auto' would simply select between 'exact' and 'asymptotic' based on Table 2 of the original paper. As Wikipedia states, "The approximation is reliable for more than 20 subjects with any number of conditions, for more than 12 subjects when there are 4 or more conditions, and for any number of subjects when there are 9 or more conditions."
  6. If we're going to have a Monte Carlo method, how should the user specify the number of samples? Can they just put an integer into the method argument instead of method='mc' plus a separate n_s argument?
  7. Reference [2] says "If there are ties within blocks, mean ranks can be used. Such ties reduce the variance of L... As a consequence, the p-value based on the uncorrected variance would be too large, hence waiving the correction would not violate the significance level." In other words, the asymptotic method estimates the p-value conservatively in the presence of ties. Is that OK, or is it important to adjust for the possibility of ties? [2] suggests Van de Wiel and Di Bucchianico for that. Update: That reference is not very explicit about how to perform the computation. I'd like to leave this out of the PR.
  8. I was originally planning on implementing the exact method naively - actually generating all the rank permutations and computing the L statistic for each. That is going to be too slow for some of the larger combinations of k and n. Is it ok to eliminate the exact method, or, if the user selects auto, resort to Monte Carlo when exact will be too slow and asymptotic will be too inaccurate? Update: I'm implementing the exact algorithm from this paper
  9. Does the documentation's explanation for the 'exact' and 'mc' calculations make sense? Are they correct?
  10. Is the original article quoted too much in the documentation?
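To make questions 8–9 concrete, here is a sketch of the naive exact approach described above (not the more efficient published algorithm that was ultimately implemented). The helper names `page_l_statistic` and `naive_exact_pvalue` are illustrative, not the PR's API; the L statistic is the sum over treatments of the hypothesized position times that treatment's rank sum:

```python
import itertools

def page_l_statistic(ranks):
    """Page's L: sum over treatments j of j * (rank sum of column j)."""
    n = len(ranks[0])
    col_sums = [sum(row[j] for row in ranks) for j in range(n)]
    return sum((j + 1) * col_sums[j] for j in range(n))

def naive_exact_pvalue(ranks):
    """P(L >= observed) under the null that every within-subject ordering
    is equally likely.  Cost is (n!)**m -- demonstration only, which is
    exactly why this approach is too slow for larger k and n."""
    m, n = len(ranks), len(ranks[0])
    observed = page_l_statistic(ranks)
    perms = list(itertools.permutations(range(1, n + 1)))
    hits = total = 0
    # Enumerate every combination of per-subject rank orderings.
    for choice in itertools.product(perms, repeat=m):
        total += 1
        hits += page_l_statistic(choice) >= observed
    return hits / total

ranks = [[1, 2, 3],
         [1, 3, 2],
         [1, 2, 3]]
print(page_l_statistic(ranks))   # 41 (= 1*3 + 2*7 + 3*8)
print(naive_exact_pvalue(ranks)) # 7/216, about 0.0324
```

A Monte Carlo variant would sample from `itertools.product`'s space at random instead of enumerating it, which is the trade-off question 6 is about.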

I'll add an example problem at the end of the documentation later. Fingers crossed Sphinx doesn't give me too much of a headache.... Update: surprisingly, no issues!

@WarrenWeckesser

@mdhaber mdhaber added the scipy.stats and enhancement labels Jul 12, 2020
@mdhaber mdhaber added this to the 1.6.0 milestone Jul 12, 2020
@@ -265,6 +265,7 @@
brunnermunzel
combine_pvalues
jarque_bera
pagel
Member
Immediate first thought: pagel like "bagel"? This doesn't seem like a good function name. Maybe page_l?

Contributor Author

Thanks for taking a look. page_l was the original name, but I changed it for consistency with kendalltau, spearmanr, pearsonr, johnsonsb, johnsonsu, mannwhitneyu, and friedmanchisquare: surname adjoined with variable name. There are some functions that have a _ after the surname, but they don't have the name of a variable after them; they're more of a description (e.g. fisher_exact, yeojohnson_normmax).

Member
Either is fine with me, but @mdhaber's argument for following the common pattern pushes me towards pagel.

Member
@rgommers, I'd like to merge this soon. Are you OK with pagel?

Member
Sure. It's a terrible name, but at least it's consistent with the other terrible names - and page_l also won't tell anyone what the function does.

Member
I've found two versions in R, one called page.trend.test (in the crank package) and another called page.test (in the cultevo package). In this blog post, the author of the cultevo package argues that the test isn't actually a trend test. If that argument is convincing, then including trend in the name isn't quite accurate.

Perhaps something with the word ordered, e.g. page_ordered_test (with or without the underscores, with or without the word test). The name Page is clearly associated with this test, so I think we want to keep page in there.

Naming things is hard. Suggestions for names that are not terrible would be appreciated!

Member
page_test or page_l_test is what I'd choose. We have ttest, binomtest, etc. as well, which makes the name a bit more descriptive.

Member
The only thing preventing me from merging this is @rgommers' objection to the name. I know @mdhaber prefers pagel, and (for better or worse) that style is consistent with many other names in scipy.stats. I don't have a strong preference, and when that happens I try to go with the original author's preference. Does anyone else have input? Is there a set of objective criteria we can apply to evaluate the quality of a name?

Member
Maybe just ping on the mailing list? If no one else objects or has a clear preference, go with pagel

in the following order: tutorial, lecture, seminar.

>>> table = [[3, 4, 3],
[2, 2, 4],
Member

minor: this requires ... at the start of each line to be valid doctest syntax.
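For illustration, the fix this comment asks for (prefixing each continuation line with `...`) can be verified with the standard `doctest` module. This snippet is a standalone sketch, not the PR's actual docstring:

```python
import doctest

docstring = '''
>>> table = [[3, 4, 3],
...          [2, 2, 4]]
>>> len(table)
2
'''

# Because the continuation line starts with "...", the parser treats the
# two lines as a single source statement and the example runs cleanly.
parser = doctest.DocTestParser()
test = parser.get_doctest(docstring, {}, 'example', None, 0)
results = doctest.DocTestRunner(verbose=False).run(test)
print(results.failed)  # 0 -- both examples pass
```

Without the `...` prefix, the parser would treat the second line as expected output for the first, and the doctest would fail.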

Contributor Author

Yup, thanks!

scipy/stats/_pagel.py (resolved review thread)
* there are :math:`n \geq 3` treatments,
* :math:`m \geq 2` subjects are observed for each treatment, and
* the observations are hypothesized to have a particular order.

Member

The Wikipedia page has useful context here, like it's related to spearmanr and has more statistical power than friedmanchisquare. I think when we add more of these lesser-known tests, such context is becoming more and more important to add.

Contributor Author

OK, adding.

@rgommers
Member

Have I integrated it into scipy.stats correctly?

It works, but is suboptimal. If it gets its own file then it should be imported directly from stats/__init__.py. However, I don't think this should be in its own file. If stats.py gets too large, we should just break it up in logical sections. One file named after one function this small is a bit odd.

@mdhaber
Contributor Author

mdhaber commented Jul 13, 2020

One file named after one function this small is a bit odd.

There is a lot more code to be written. If, in the end, it's too small for its own file, I'll move it. In the meantime, it's easier to develop in its own file.

If it gets its own file then it should be imported directly from stats/__init__.py.

I was following the example of epps_singleton_2samp, which gets imported into stats.py:

from ._hypotests import epps_singleton_2samp
from ._pagel import pagel

and included in __all__ there,

           'brunnermunzel', 'epps_singleton_2samp', 'pagel']

Should I fix both of them?

@chrisb83
Member

If stats.py gets too large, we should just break it up in logical sections. One file named after one function this small is a bit odd.

I created _hypotests.py for new statistical tests / hypothesis testing to avoid adding more and more code to stats.py. Could we add Page's L to that file (and potentially revise the way the functions are imported)?

import scipy.stats


Page_L_Result = namedtuple('Page_L_Result',
Contributor Author

@mdhaber mdhaber Jul 17, 2020

As @WarrenWeckesser suggested, I'm going to change this to some other sort of object that requires items to be accessed by name.

@mdhaber
Contributor Author

mdhaber commented Jul 17, 2020

TLDR: I added scipy/stats/pagel_exact.npy, which the exact method uses to look up pre-calculated PMF values, but CI can't find the file.

E   FileNotFoundError: [Errno 2] No such file or directory: 'C:\\hostedtoolcache\\windows\\Python\\3.7.8\\x64\\lib\\site-packages\\scipy\\stats\\pagel_exact.npy'

Help?


I try to load it with:

    dir_path = os.path.dirname(os.path.realpath(__file__))
    datafile = os.path.join(dir_path, "pagel_exact.npy")
    all_pmfs = np.load(datafile, allow_pickle=True).item()

How can I fix this?
update: can't check now, but maybe .gitignore needs an exception.
update: nope, it's not that.
update:
scipy\interpolate\tests\test_interpnd.py has a function:

def data_file(basename):
    return os.path.join(os.path.abspath(os.path.dirname(__file__)),
                        'data', basename)

and uses it like:

points = np.load(data_file('estimate_gradients_hang.npy'))

I changed my code to use abspath just like this function, but it's still not working. I don't see anything in .gitignore that would prevent the .npy files from getting copied. Do the CI definitions prevent it?
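For context on this kind of failure: a file that exists in the repository but is missing from the installed package usually means it was never declared as package data, so the build never copies it into site-packages. With the numpy.distutils-based build scipy used at the time, the declaration would look roughly like the following; this is a sketch of the general mechanism, and the exact contents of scipy/stats/setup.py are an assumption, not quoted from the PR:

```python
# Hypothetical excerpt in the style of scipy/stats/setup.py
# (numpy.distutils). Without add_data_files, non-Python files such as
# .npy data are left out of the installed package, and np.load() then
# fails with FileNotFoundError on CI even though git tracks the file.
def configuration(parent_package='', top_path=None):
    from numpy.distutils.misc_util import Configuration
    config = Configuration('stats', parent_package, top_path)
    config.add_data_files('pagel_exact.npy')
    return config
```

.gitignore only affects what git tracks; it has no effect on what the build step installs, which is why ruling it out (as the updates above did) was the right call.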

@mdhaber
Contributor Author

mdhaber commented Jul 19, 2020

Thanks @WarrenWeckesser!

@mdhaber
Contributor Author

mdhaber commented Jul 19, 2020

CircleCI was successful in 51c88a9 but fails in ed3b9d0. How did those changes break the doc build (without any error messages, of course)?

Update: sounds like data was a bad name.

/home/circleci/repo/build/testenv/lib/python3.7/site-packages/scipy/stats/morestats.py:docstring of scipy.stats.boxcox_llf:11: WARNING: py:obj reference target not found: data
/home/circleci/repo/build/testenv/lib/python3.7/site-packages/scipy/stats/morestats.py:docstring of scipy.stats.boxcox_llf:18: WARNING: py:obj reference target not found: data
/home/circleci/repo/build/testenv/lib/python3.7/site-packages/scipy/stats/morestats.py:docstring of scipy.stats.boxcox_llf:18: WARNING: py:obj reference target not found: data
writing output... [ 89%] generated/scipy.stats.kstwobign .. generated/scipy.stats.rv_discrete.support
writing output... [ 94%] generated/scipy.stats.rv_discrete.var .. sparse.csgraph
/home/circleci/repo/build/testenv/lib/python3.7/site-packages/scipy/stats/morestats.py:docstring of scipy.stats.yeojohnson_llf:12: WARNING: py:obj reference target not found: data
/home/circleci/repo/build/testenv/lib/python3.7/site-packages/scipy/stats/morestats.py:docstring of scipy.stats.yeojohnson_llf:19: WARNING: py:obj reference target not found: data

@WarrenWeckesser WarrenWeckesser left a comment

I made a bunch of suggested changes inline. There are also many lines in the test class TestPageL that are longer than 79 characters. Those should be fixed before we merge the PR.

I have one API change to consider. In almost every example that I find (web pages, text books, R function documentation), the given data is the raw observations, not the ranked observations. (Off the top of my head, the only case where I recall the given data being the ranked observations is the example from the original paper.) I suspect this is, in fact, the most common use-case. This means the users will almost always have to give the argument ranked=False when using the function. I think we should make that the default.

``'asymptotic'`` *p*-values, however, tend to be smaller (i.e. less
conservative) than the ``'exact'`` *p*-values.


Member

Remove a blank line.


Contributor Author

Oops; I'll include this in the next commit.

scipy/stats/_pagel.py (resolved review threads)
scipy/stats/tests/test_stats.py (resolved review threads)
@mdhaber
Contributor Author

mdhaber commented Dec 10, 2020

This means the users will almost always have to give the argument ranked=False when using the function. I think we should make that the default.

I remember I chose this default for speed. Ranking the data is the most expensive part of the (asymptotic) tests, I think, so I thought we should skip ranking by default.

But you're right; we should probably make it convenient first, and if the user cares about speed, they can pay attention to the available parameters.
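For reference, the within-subject ranking that a `ranked=False` default triggers (midranks for ties, as `scipy.stats.rankdata` computes) can be sketched in pure Python. `rank_row` is an illustrative helper, not the PR's internal name, and each subject's row is ranked independently:

```python
def rank_row(row):
    """Within-subject ranks for one row, using midranks for ties."""
    order = sorted(range(len(row)), key=lambda i: row[i])
    ranks = [0.0] * len(row)
    i = 0
    while i < len(row):
        # Extend j over the run of tied values starting at sorted position i.
        j = i
        while j + 1 < len(row) and row[order[j + 1]] == row[order[i]]:
            j += 1
        midrank = (i + j) / 2 + 1  # average of 1-based positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = midrank
        i = j + 1
    return ranks

print(rank_row([3, 1, 4, 1]))  # [3.0, 1.5, 4.0, 1.5]
```

Applying this per row is an O(m * n log n) preprocessing step, which is the cost being weighed against convenience in the comment above.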

@mdhaber
Contributor Author

mdhaber commented Dec 10, 2020

There are also many lines in the test class TestPageL that are longer than 79 characters. Those should be fixed before we merge the PR.

OK, please send me the formatted lines and I will include them, or you're welcome to push directly.

Update: done.

@mdhaber
Contributor Author

mdhaber commented Dec 10, 2020

This means the users will almost always have to give the argument ranked=False when using the function. I think we should make that the default.

Done.

@WarrenWeckesser WarrenWeckesser left a comment

Thanks Matt! I noticed three more tiny whitespace issues, otherwise this looks ready.

scipy/stats/tests/test_stats.py (resolved review threads)
Co-authored-by: Warren Weckesser <warren.weckesser@gmail.com>
@WarrenWeckesser WarrenWeckesser left a comment

I think it is ready! I'll hold off merging this until Monday, to give time for previous commenters (or anyone else) to take a look at the updated version.

We use the example from [3]_: 10 students are asked to rate three
teaching methods - tutorial, lecture, and seminar - on a scale of 1-5,
with 1 being the lowest and 5 being the highest. We have decided that
a confidence level of 99% is required to reject the null hypothsis in favor
Contributor Author

Suggested change
a confidence level of 99% is required to reject the null hypothsis in favor
a confidence level of 99% is required to reject the null hypothesis in favor

scipy/stats/_pagel.py (resolved review threads)
@mdhaber
Contributor Author

mdhaber commented Jan 25, 2021

@WarrenWeckesser Done, I think. If doctests fail due to line break, please resolve as you see fit.

@mdhaber
Contributor Author

mdhaber commented Jan 26, 2021

Well that's nice that doctests passed.

@WarrenWeckesser
Member

Thanks @mdhaber, merged. I added a brief note about it to the 1.7.0 release notes.
