fix: use robust pseudo p-value for two-sided significance by madhavcodez · Pull Request #514 · pysal/esda

madhavcodez · 2026-06-30T02:46:21Z

Title

fix: use robust pseudo p-value for two-sided significance

Summary

The two-sided branch of _permutation_significance in esda/significance.py
derived the p-value from percentiles of the reference distribution. When the
conditional reference distribution is constant (every permuted value is
identical), those percentiles collapse to a single value, so both the lower and
upper tail counts include the entire reference distribution. The resulting count
exceeds p_permutations, and the p-value can exceed one.

>>> import numpy as np
>>> from esda.significance import calculate_significance
>>> calculate_significance(5.0, np.full((1, 19), 5.0), alternative="two-sided")
1.95
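
The 1.95 above can be reproduced by hand. The following is a minimal sketch, not the esda source: it assumes the old branch counted reference values at or below a lower percentile and at or above an upper percentile, then added one for the observed statistic. The 2.5 and 97.5 percentiles are illustrative only; any pair of percentiles collapses to the same value on a constant null.

```python
import numpy as np

# Constant reference distribution: 19 permutations, all equal to 5.0.
reference = np.full(19, 5.0)
n_permutations = reference.size

# Assumption: the tail thresholds are percentiles of the reference
# distribution. On a constant null, both collapse to the constant itself.
low, high = np.percentile(reference, [2.5, 97.5])

# Assumption: tails are counted inclusively, so every permutation lands in
# both the lower and the upper tail (19 + 19 = 38).
n_outside = (reference <= low).sum() + (reference >= high).sum()

# The usual +1 correction then gives (38 + 1) / (19 + 1).
p_value = (n_outside + 1) / (n_permutations + 1)
print(p_value)  # 1.95
```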

This replaces the percentile approach with the equivalent robust pseudo
p-value suggested in the issue:

2 * (min(greater, lesser) + 1) / (permutations + 1)

clipped at one. The result stays in (0, 1] for degenerate nulls and matches
the existing one-sided counting conventions used by the greater/lesser
branches, so the directed <= two-sided invariant still holds.
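
As a rough illustration of the formula above, here is a standalone sketch. The helper name `two_sided_pseudo_p` is hypothetical and is not part of esda; it also assumes that `greater` and `lesser` are inclusive counts (`>=` and `<=`), which is how the one-sided conventions are read here.

```python
import numpy as np


def two_sided_pseudo_p(test_stat, reference):
    """Hypothetical helper sketching the clipped two-sided pseudo p-value."""
    # One row of permuted values per observation.
    reference = np.atleast_2d(np.asarray(reference, dtype=float))
    # One test statistic per row, shaped for broadcasting against the rows.
    test_stat = np.atleast_1d(np.asarray(test_stat, dtype=float)).reshape(-1, 1)
    n_permutations = reference.shape[1]

    # Assumption: both tails are counted inclusively.
    greater = (reference >= test_stat).sum(axis=1)
    lesser = (reference <= test_stat).sum(axis=1)

    # Double the smaller tail (with the +1 correction), then clip at one.
    p = 2 * (np.minimum(greater, lesser) + 1) / (n_permutations + 1)
    return np.minimum(p, 1.0)


constant_null = np.full((1, 19), 5.0)
print(two_sided_pseudo_p(5.0, constant_null))  # statistic on the constant: [1.]
print(two_sided_pseudo_p(6.0, constant_null))  # statistic off the constant: [0.1]
```

With the statistic on the constant, both counts are 19, so the unclipped value is 2 * 20 / 20 = 2 and the clip brings it to 1. With the statistic off the constant, the smaller count is 0, giving 2 * 1 / 20 = 0.1.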

Changes

esda/significance.py: two-sided branch now uses the clipped pseudo p-value.
esda/tests/test_significance.py: regression tests covering
- the degenerate constant null with the statistic on the constant, asserting
  the result is exactly 1.0 for both scalar and vector inputs;
- the second failure mode of the old formula, a constant null with the
  statistic off the constant, asserting the pseudo p-value 0.1;
- a seeded normal null, asserting the exact pseudo p-value 0.016;
- the two-sided == 2 * directed identity on a one-sided statistic.

Testing

pytest esda/tests/test_significance.py passes (10 tests), including the
existing test_execution_and_range and test_alternative_relationships.
Each new assertion fails on the unpatched source (the degenerate scalar case
reports 1.95) and passes with the fix.
The broader regression suites for the function's consumers
(test_moran.py, test_moran_local_mv.py) also pass.

Closes #504

The two-sided alternative derived the p-value from percentiles of the reference distribution. When the conditional reference distribution is constant, those percentiles collapse to a single value, so both the lower and upper tail counts include every permutation. The resulting count exceeds p_permutations and the p-value can exceed one (e.g. 1.95 for a constant 19-permutation null). Replace the percentile approach with the equivalent robust pseudo p-value, 2 * (min(greater, lesser) + 1) / (permutations + 1), clipped at one. This keeps the directed (one-sided) p-value no larger than the two-sided value and stays bounded for degenerate nulls. Closes pysal#504

madhavcodez wants to merge 1 commit into pysal:main from madhavcodez:fix/degenerate-two-sided-pvalue
