Skip to content

Loading…

test case for R<-1 in scipy.stats.linregress and fix in the code #433

Closed
wants to merge 4 commits into from

3 participants

@kif

Code for preventing R to be lower than -1.0 due to accumulation of numerical error scipy.stats.linregress and a test-case (tested under debian7, python2.7, amd64 and probably dependent on the compiler as well)

@josef-pkt josef-pkt commented on an outdated diff
scipy/stats/tests/test_stats.py
@@ -743,6 +743,20 @@ def test_linregress(self):
res = (1.0, 5.0, 0.98229948625750, 7.45259691e-008, 0.063564172616372733)
assert_array_almost_equal(stats.linregress(x,y),res,decimal=14)
+ def test_regress_simple_negative_cor(self):
+ """
+ If the slope of the regression is negative the factor R tend to -1 not 1.
+ Sometimes rounding errors make it < -1 leading to
+ """
+ a, n = 1e-71, 100000
+ x = np.linspace(a, 2 * a, n)
+ y = np.linspace(2 * a, a, n)
+ stats.linregress(x, y)
+ res = stats.linregress(x, y)
+ assert_(res[2] >= -1, "R factor is in [-1..1]")
@josef-pkt SciPy member

the assertion is too loose, shouldn't this be
assert_equal(res[2], -1, ...)
there is a small chance that equality doesn't hold for all platforms
the we could use both (just add an almost equal)

assert_(res[2] >= -1, "R factor is in [-1..1]")
assert_almost_equal(res[2]), -1)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@josef-pkt josef-pkt commented on an outdated diff
scipy/stats/stats.py
@@ -2962,6 +2962,7 @@ def linregress(x, y=None):
else:
r = r_num / r_den
if (r > 1.0): r = 1.0 # from numerical error
+ elif (r < -1.0): r = -1.0 # from numerical error
@josef-pkt SciPy member

cosmetic: move r = ... into new lines (also the old)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@josef-pkt
SciPy member

@rgommers should go into 0.12

@kif

I think all your comment were taken into account. Josef, is it OK ?

@josef-pkt

error message could be improved, it's purpose is mainly to get the information about the test in case of failure
maybe just something like "perfect negative correlation case"

SciPy member

just additional info:
The numbers, actual and expected, are shown automatically in case of a test failure, so it doesn't need to be in the error message.

@rgommers
SciPy member

Squashed and merged as c415f8d. Make a minor change to the messages - they're comments now. Failure will then show the actual numerical values.

Thanks Jerome, Josef.

@rgommers rgommers closed this
@ClemensFMN ClemensFMN referenced this pull request
Commit has since been removed from the repository and is no longer available.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
This page is out of date. Refresh to see the latest.
Showing with 20 additions and 1 deletion.
  1. +5 −1 scipy/stats/stats.py
  2. +15 −0 scipy/stats/tests/test_stats.py
View
6 scipy/stats/stats.py
@@ -2961,7 +2961,11 @@ def linregress(x, y=None):
r = 0.0
else:
r = r_num / r_den
- if (r > 1.0): r = 1.0 # from numerical error
+ # test for numerical error propagation
+ if (r > 1.0):
+ r = 1.0
+ elif (r < -1.0):
+ r = -1.0
#z = 0.5*log((1.0+r+TINY)/(1.0-r+TINY))
df = n-2
t = r*np.sqrt(df/((1.0-r+TINY)*(1.0+r+TINY)))
View
15 scipy/stats/tests/test_stats.py
@@ -743,6 +743,21 @@ def test_linregress(self):
res = (1.0, 5.0, 0.98229948625750, 7.45259691e-008, 0.063564172616372733)
assert_array_almost_equal(stats.linregress(x,y),res,decimal=14)
+ def test_regress_simple_negative_cor(self):
+ """
+ If the slope of the regression is negative the factor R tend to -1 not 1.
+ Sometimes rounding errors makes it < -1 leading to stderr being NaN
+ """
+ a, n = 1e-71, 100000
+ x = np.linspace(a, 2 * a, n)
+ y = np.linspace(2 * a, a, n)
+ stats.linregress(x, y)
+ res = stats.linregress(x, y)
+ assert_(res[2] >= -1, "propagated numerical errors were not corrected")
+ assert_almost_equal(res[2], -1, "perfect negative correlation case")
+ assert_(not np.isnan(res[4]), "stderr should stay finite")
+
+
class TestHistogram(TestCase):
""" Tests that histogram works as it should, and keeps old behaviour
"""
Something went wrong with that request. Please try again.