ENH: stats: add nonparametric one-sample quantile test and CI #12680
Conversation
I just added a few drive-by comments--not a substitute for a stats guru review.
Thanks for the contribution. At the moment, there are no functions in SciPy that return a CI; #12609 adds a CI to a test. In the same spirit, one could implement a quantile test (see e.g. Conover, Practical Nonparametric Statistics) and, in addition to the p-value, return the CI. (we already have ...
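For reference, the test described in Conover is essentially a binomial test on counts relative to the hypothesized quantile. A minimal sketch of the idea (my own illustration, not code from this PR; the helper name is hypothetical):

```python
import numpy as np
from scipy import stats

def quantile_test_sketch(x, q, p=0.5):
    """Two-sided test of H0: the p-quantile of x's distribution equals q."""
    x = np.asarray(x)
    n = x.size
    t1 = np.count_nonzero(x <= q)  # observations at or below q
    t2 = np.count_nonzero(x < q)   # observations strictly below q
    # Under H0, both counts are Binomial(n, p).  A small t1 is evidence
    # that the true quantile lies above q; a large t2 that it lies below.
    p_greater = stats.binom.cdf(t1, n, p)       # one-sided p-value, 'greater'
    p_less = stats.binom.sf(t2 - 1, n, p)       # one-sided p-value, 'less'
    return min(2 * min(p_greater, p_less), 1.0)
```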
fixed minor documentation things
The only CI failure is in sparse.linalg and unrelated to this PR. Unless there are further comments, I believe this PR is ready to be merged.
@romain-jacob have you thought about the following suggestion?
Hi @chrisb83, yes, sorry I forgot to follow up on that one. The method I am proposing here is not related to hypothesis testing; it's an estimation method for quantiles. I do not know the quantile test you are referring to (I did not look it up), but if it is a test of practical relevance, then I guess it could certainly be added too. But I think that should be in a distinct PR, no?
In general, there is a very close relationship between CIs and hypothesis tests. E.g., if you estimate a mean and can construct confidence intervals, you can define a test, and vice versa. I am just wondering about the best way to integrate the functionality into SciPy. At the moment, just returning a CI without an estimate of the quantile looks somewhat odd. Are there other implementations, e.g. in R, that we can use as a benchmark?
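To illustrate the duality being described (a hypothetical sketch, not PR code): a (1 - alpha) confidence set for the p-quantile can be read off as the set of hypothesized values that a level-alpha two-sided test does not reject. Since distribution-free quantile CI endpoints are order statistics, it suffices to scan the sample values themselves:

```python
import numpy as np

def ci_by_test_inversion(x, p, alpha, two_sided_test):
    """Collect the sample values q0 that the test fails to reject."""
    accepted = [q0 for q0 in np.sort(np.asarray(x))
                if two_sided_test(x, q0, p) > alpha]
    return (accepted[0], accepted[-1]) if accepted else None
```

Plugging in something like the `quantile_test_sketch` above as `two_sided_test` recovers, up to discreteness conventions, the same interval as the direct order-statistic construction.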
I see your point, and indeed I am aware of the close connection between CIs and NHST, but I am not aware of any "test" that explicitly focuses on percentiles (except the median). That being said, I know very little about statistics in general, so such tests may well exist; I just could not find any. I browsed through the R packages and found QuantileNPCI, which seems to do about the same thing, although it appears to compute only two-sided CIs (not 100% sure yet, I just had a quick look). Unfortunately I am not proficient in R, so it's not easy for me to quickly compare the two implementations. But I can try computing the CI presented in the QuantileNPCI example; that will already do for a quick check.
Yes
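For the comparison, my understanding is that QuantileNPCI and similar tools build on the classic order-statistic construction: for i.i.d. data, the number of observations below the true p-quantile is Binomial(n, p), so a pair of order statistics brackets the quantile with known coverage. A rough sketch (my own, not a port of the R code; exact index conventions vary between implementations):

```python
import numpy as np
from scipy import stats

def quantile_ci_sketch(x, p, confidence=0.95):
    """Approximate two-sided distribution-free CI for the p-quantile."""
    x = np.sort(np.asarray(x))
    n = x.size
    alpha = 1 - confidence
    lo = int(stats.binom.ppf(alpha / 2, n, p))      # lower order statistic
    hi = int(stats.binom.ppf(1 - alpha / 2, n, p))  # upper order statistic
    if lo < 0 or hi >= n:
        return None  # not enough samples for the requested confidence
    return x[lo], x[hi]
```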
scipy/stats/_stats_py.py (Outdated)

```
calculate the p-value respectively. `1` corresponds to the "greater"
alternative hypothesis and `-1` to the "less". For the "two-sided"
alternative, a value of `+1` means, that given the data, there is
greater than equal likelihood that the population quantile is larger
```

Suggested change:

```diff
- greater than equal likelihood that the population quantile is larger
+ greater than or equal likelihood that the population quantile is larger
```

?
scipy/stats/_stats_py.py (Outdated)

```
or `T2`, the observed number of observations strictly less than the
hypothesized quantile. Two test statistics are required to handle the
possibility the data was generated from a discrete or mixed
distribution. `T1` is sensitive whether the population quantile is
```

Suggested change:

```diff
- distribution. `T1` is sensitive whether the population quantile is
+ distribution. `T1` is sensitive to whether the population quantile is
```
scipy/stats/_stats_py.py (Outdated)

```
greater than equal likelihood that the population quantile is larger
than the hypothesized value, while a value of `-1` means there is
strictly greater likelihood that it is less than the hypothesized
value.
```

I'm not familiar with this use of "likelihood". Also, what is the likelihood greater than?

Sorry. It’s nonsense. I wrote it on the plane earlier today while dozing. I’ll fix it up.
Naming things is so hard. I'm not happy with the following repr:

```
>>> rvs = stats.uniform.rvs(size=100, random_state=rng)
>>> stats.quantile_test(rvs, q=0.6, p=0.75, alternative='greater')
QuantileTestResult(statistic=64, statistic_sign=1, pvalue=0.00940696592998271)
```

My thought is that if we're encoding whether the statistic ... To me, I think the important thing that should be communicated is that ...

Part of the difficulty is that the statistic itself is always a count of values to the same side of the quantile, regardless of the alternative. Could this be refactored such that one of the statistics is changed to its complement? Then ...
I’ll think about it too. Sounds very promising.
Hello there! Thanks a lot for your help @steppi! I just wanted to say that it's been strange to see the work on this PR unfolding without me lifting a finger, but I'm really grateful to see more experienced contributors wrapping things up. So, many thanks :-)
Happy to help @romain-jacob! A group of us stats maintainers got together at SciPy 2023 and started working on wrapping up / reviewing outstanding PRs. Thanks for this valuable contribution!
@mdhaber I decided to go with ... I think T1 and T2 are nice because for the continuous case each approximates ...
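A small numerical illustration of why both statistics are needed (a hypothetical example, not from the PR): with a continuous distribution, T1 and T2 agree almost surely, while ties at the hypothesized quantile make them differ for discrete data.

```python
import numpy as np

rng = np.random.default_rng(1234)
q = 2.0  # hypothesized quantile value

x_cont = rng.normal(loc=2.0, size=1000)   # continuous: no ties at q
x_disc = rng.poisson(lam=2.0, size=1000)  # discrete: many ties at q

for x in (x_cont, x_disc):
    t1 = np.count_nonzero(x <= q)  # at or below q
    t2 = np.count_nonzero(x < q)   # strictly below q
    print(t1, t2, t1 - t2)         # difference = number of ties at q
```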
scipy/stats/_stats_py.py (Outdated)

```diff
@@ -10063,7 +10060,7 @@ def quantile_test(x, *, q=0, p=0.5, alternative='two-sided'):
     expect the null hypothesis to be rejected.

     >>> stats.quantile_test(rvs, q=0.5, p=0.5, alternative='greater')
-    QuantileTestResult(statistic=67, statistic_sign=1, pvalue=0.9997956114162866)
+    QuantileTestResult(statistic1=67, statistic_type=1, pvalue=0.9997956114162866)
```

@steppi isn't this supposed to be `statistic`, not `statistic1`? Here and elsewhere.
Yes
@romain-jacob Could you check the recent changes for accuracy? Once you approve, these corrections have been made, and @steppi has taken a final pass through the rendered documentation to look for inconsistencies like the ones I've highlighted, I think it's fine for @steppi to merge.
I just reviewed the current state and double-checked the code and the examples. The change to the ... So all good from my side :-)
Looks like most of the stylization issues are fixed. I'm re-reviewing so it's easier to keep track of what remains.
Looking pretty good @steppi. I think we should rebase, squashing consecutive commits by the same author. You want to do that, or want me to?
You can do it if you have time. Eating dinner now and won’t have a chance for a while.
I've rebased some pretty messy histories before, but this one seems really tough. Would you take a look? If it's too messy, LMK.
Sure. Will do.
- DOC: Sync docstring for test with actual implementation
- DOC: Indent hypotheses [skip actions]
- DOC: Fix docstring formatting [skip actions]
- DOC: Try to fix another sphinx issue [skip actions]
- TST: Fix hardcoded value in doctest [skip actions]
- TST: More fix for doctest [skip actions]
- DOC: More thrashing [skip actions]
- DOC: sphinx was confused by `p`th [skip actions]
- DOC: try to deal with more sphinx weirdness [skip actions]
- DOC: Change pth to latex [skip actions] [skip cirrus]
- DOC: Make p-value consistent in docs [skip actions] [skip cirrus]
- DOC: More docstring cleanup [skip actions] [skip cirrus]
- Update scipy/stats/_stats_py.py (Co-authored-by: Matt Haberland <mhaberla@calpoly.edu>)
- Update scipy/stats/_stats_py.py [skip actions] [skip cirrus] (Co-authored-by: Matt Haberland <mhaberla@calpoly.edu>)
- Remove blank line (squash)
- MAINT: Make result contain only one statistic + info
- MAINT: Bring some lines to 79 chars or less
- MAINT: Fix statistic not set for alternative two-side
- MAINT: Update statistic_type -> statistic_sign
- TST: Fix doctests [skip actions] [skip cirrus]
- MAINT: Change to statistic_type 1, 2
- MAINT: Fix formatting in tests
- Apply suggestions from code review [skip cirrus] [skip actions] (Co-authored-by: Matt Haberland <mhaberla@calpoly.edu>)
- DOC: Documentation fixes [skip actions] [skip cirrus]
- DOC: documentation fixes [skip actions] [skip cirrus]
- DOC: More doc fixes [skip actions] [skip cirrus]
- DOC: Adjust formatting for bullet list [skip actions] [skip cirrus]
- DOC: Adjust subscript to superscript in pth [skip cirrus] [skip actions]
- DOC: Remove extraneous blank line [skip actions] [skip cirrus]
- DOC: Unbold hypotheses [skip cirrus] [skip actions]
Congratulations and thanks for seeing it through!
Reference issue
Relates to statsmodels/statsmodels#6562.
Loosely relates to #10577 (additional statistical methods, including confidence intervals).
What does this implement?
This PR implements a non-parametric approach to compute confidence intervals for quantiles for a given confidence level and input `x`, which is either a set of samples (one-dimensional array_like) or the number of samples available. The confidence intervals are valid if and only if the samples are i.i.d.

Both one-sided and two-sided confidence intervals can be obtained (default is one-sided). The function returns two values: either the bounds for the two one-sided confidence intervals, or the lower and upper bounds of a two-sided confidence interval.

The return values are either the indexes of the bounds (if `x` is an integer) or sample values (if `x` is the set of samples). `None` is returned when there are not enough samples to compute the desired confidence interval.

This approach is well known and present in statistics textbooks, but it somehow got overlooked despite its simplicity and usefulness. An implementation in SciPy would help change that.
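For readers landing here later, a usage sketch of the functionality as it was eventually merged (attribute and method names reflect my reading of the final SciPy API; check the `scipy.stats.quantile_test` docs for your version):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(8675309)
rvs = stats.uniform.rvs(size=100, random_state=rng)

# Test H0: the 0.5-quantile (median) of the population equals 0.5
res = stats.quantile_test(rvs, q=0.5, p=0.5, alternative='two-sided')
print(res.statistic, res.statistic_type, res.pvalue)

# The result object also exposes the nonparametric CI from this PR
ci = res.confidence_interval(confidence_level=0.95)
print(ci.low, ci.high)
```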
Additional information