fix: use robust pseudo p-value for two-sided significance #514
Open
madhavcodez wants to merge 1 commit into
The two-sided alternative derived the p-value from percentiles of the reference distribution. When the conditional reference distribution is constant, those percentiles collapse to a single value, so both the lower and upper tail counts include every permutation. The resulting count exceeds p_permutations and the p-value can exceed one (e.g. 1.95 for a constant 19-permutation null). Replace the percentile approach with the equivalent robust pseudo p-value, 2 * (min(greater, lesser) + 1) / (permutations + 1), clipped at one. This keeps the directed (one-sided) p-value no larger than the two-sided value and stays bounded for degenerate nulls. Closes pysal#504
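The arithmetic behind the `1.95` figure can be checked with a minimal sketch. The helper `two_sided_pseudo_p` and the inclusive tail counting below are illustrative assumptions, a simplified stand-in for the percentile logic rather than the actual code in `esda/significance.py`; the null value `0.25` is arbitrary.

```python
import numpy as np


def two_sided_pseudo_p(observed, reference):
    """Hypothetical helper: clipped two-sided pseudo p-value for one statistic."""
    reference = np.asarray(reference, dtype=float)
    n_permutations = reference.size
    # Inclusive one-sided counts of permuted values at least as extreme.
    greater = int((reference >= observed).sum())
    lesser = int((reference <= observed).sum())
    p_value = 2 * (min(greater, lesser) + 1) / (n_permutations + 1)
    # Clip so a degenerate null cannot push the value above one.
    return min(p_value, 1.0)


# A constant null with 19 permutations; the observed statistic equals the constant.
constant_null = np.full(19, 0.25)
observed = 0.25

# Simplified stand-in for the percentile approach: both bounds collapse onto
# the constant, so every permutation is counted in both tails (19 + 19 = 38).
both_tails = int((constant_null <= observed).sum() + (constant_null >= observed).sum())
naive_p = (both_tails + 1) / (constant_null.size + 1)  # (38 + 1) / 20

# Robust version: 2 * (19 + 1) / 20 = 2.0 before clipping.
robust_p = two_sided_pseudo_p(observed, constant_null)

print(naive_p, robust_p)
```

The clip only matters when the smaller tail already covers at least half of the reference distribution, which is exactly the degenerate situation described above.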
Title
fix: use robust pseudo p-value for two-sided significance
Summary
The two-sided branch of `_permutation_significance` in `esda/significance.py` derived the p-value from percentiles of the reference distribution. When the conditional reference distribution is constant (every permuted value is identical), those percentiles collapse to a single value, so both the lower and upper tail counts include the entire reference distribution. The resulting count exceeds `p_permutations`, and the p-value can exceed one.

This replaces the percentile approach with the equivalent robust pseudo p-value suggested in the issue, `2 * (min(greater, lesser) + 1) / (permutations + 1)`, clipped at one. The result stays in `(0, 1]` for degenerate nulls and matches the existing one-sided counting conventions used by the `greater`/`lesser` branches, so the `directed <= two-sided` invariant still holds.

Changes
- `esda/significance.py`: two-sided branch now uses the clipped pseudo p-value.
- `esda/tests/test_significance.py`: regression tests covering
  - [...] the result is exactly `1.0` for both the scalar and vector inputs;
  - [...] statistic off the constant, asserting the pseudo p-value `0.1`;
  - [...] `0.016`;
  - [...] `two-sided == 2 * directed` identity on a one-sided statistic.

Testing
- `pytest esda/tests/test_significance.py` passes (10 tests), including the existing `test_execution_and_range` and `test_alternative_relationships`.
- [...] reports `1.95`) and passes with the fix.
- [...] (`test_moran.py`, `test_moran_local_mv.py`) passes.

Closes #504
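For reviewers who want to reproduce the bounded behaviour outside the test suite, here is a rough, self-contained sketch of the properties described above. The function `pseudo_p_value`, its `alternative` strings, and the `(count + 1) / (permutations + 1)` one-sided convention are assumptions made for illustration; they are not the `esda` API and may differ from the real implementation.

```python
import numpy as np


def pseudo_p_value(observed, reference, alternative="two-sided"):
    """Hypothetical vectorised helper (not the esda API).

    observed: scalar or shape (n,); reference: shape (n_permutations,) or
    (n, n_permutations), one row of permuted values per observation.
    """
    observed = np.atleast_1d(np.asarray(observed, dtype=float))
    reference = np.atleast_2d(np.asarray(reference, dtype=float))
    n_permutations = reference.shape[1]
    # Inclusive one-sided counts, computed row by row.
    greater = (reference >= observed[:, None]).sum(axis=1)
    lesser = (reference <= observed[:, None]).sum(axis=1)
    smaller_tail = np.minimum(greater, lesser)
    if alternative == "greater":
        p_value = (greater + 1) / (n_permutations + 1)
    elif alternative == "lesser":
        p_value = (lesser + 1) / (n_permutations + 1)
    elif alternative == "directed":
        p_value = (smaller_tail + 1) / (n_permutations + 1)
    elif alternative == "two-sided":
        p_value = 2 * (smaller_tail + 1) / (n_permutations + 1)
    else:
        raise ValueError(f"unknown alternative: {alternative!r}")
    # Clip at one so degenerate nulls stay bounded.
    return np.minimum(p_value, 1.0)


# Degenerate null: every permuted value equals the observed statistic.
constant = np.full((3, 19), 0.5)
on_constant = pseudo_p_value(np.full(3, 0.5), constant)
scalar_case = pseudo_p_value(0.5, constant[0])

# Statistic off the constant: the smaller tail is empty, so 2 * 1 / 20.
off_two_sided = pseudo_p_value(np.full(3, 0.9), constant)
off_directed = pseudo_p_value(np.full(3, 0.9), constant, alternative="directed")

# Non-degenerate null drawn from a seeded generator.
rng = np.random.default_rng(0)
reference = rng.normal(size=(5, 99))
statistic = rng.normal(size=5)
directed = pseudo_p_value(statistic, reference, alternative="directed")
two_sided = pseudo_p_value(statistic, reference)
```

Under these assumptions the degenerate rows come out at exactly one, the off-constant rows at `2 * 1 / 20`, and `directed` never exceeds `two_sided` because the two-sided value is the directed value doubled and then clipped.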