Skip to content
This repository

BUG: incorrect limits in discrete ppf, fixes ticket:1802, see also ticket:1803 #394

Closed
wants to merge 4 commits into from

3 participants

Josef Perktold rajkrpan Ralf Gommers
Josef Perktold
Collaborator

corrects limits for bisection in discrete ppf, use .a and .b instead of .invcdf_a, .invcdf_b

make hypergeom.a always non-negative

test runs without errors with changes on top of scipy 0.11rc2

rajkrpan

It seems that the fix has some additional problem. For e.g., stats.hypergeom.ppf(0.02, 32335, 32335, 895) no longer works

Josef Perktold
Collaborator

The last commit raises a RuntimeError if the updating in the bisection stops. Added as a safeguard, even if it should never be reached if the code is correct.

I guess RuntimeError is the right exception, is it?

tested on scipy 0.9.0 without any of the other changes. In the original case of ticket:1802 an exception would be raised now instead of the endless loop.

Note: pull request needs a test run with the current scipy master, since I never run the master version.

Ralf Gommers
Owner

RuntimeError is ok, fix works fine also with master.

Question on the tickets: 1802 still has some unfixed issues listed (spd.hypergeom.pmf(0, 32335, 0, 895) --> nan, etc.), while 1803 seems fixed. You description in PR title says to close 1802 and leave 1803 open though?

Ralf Gommers
Owner

Never mind, see ticket 1806 now. Close both 1802 and 1803 then?

Josef Perktold
Collaborator

Yes, close both. Initially I wanted to keep 1803 open for further improvements but I think it should work well enough now. The only thing left would be improving sf/isf, but that's not directly relevant for this changes.

one cleanup I forgot: invcdf_a and invcdf_b can also be remove from the rv_discrete.__init__ . They are not used anymore

Ralf Gommers
Owner

OK, merged in 951dffc. Thanks Josef.

Ralf Gommers rgommers closed this January 05, 2013
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
This page is out of date. Refresh to see the latest.
22  scipy/stats/distributions.py
@@ -5354,10 +5354,10 @@ def _drv2_moment(self, n, *args):
5354 5354
     return tot
5355 5355
 
5356 5356
 def _drv2_ppfsingle(self, q, *args):  # Use basic bisection algorithm
5357  
-    b = self.invcdf_b
5358  
-    a = self.invcdf_a
  5357
+    b = self.b
  5358
+    a = self.a
5359 5359
     if isinf(b):            # Be sure ending point is > q
5360  
-        b = max(100*q,10)
  5360
+        b = int(max(100*q,10))
5361 5361
         while 1:
5362 5362
             if b >= self.b: qb = 1.0; break
5363 5363
             qb = self._cdf(b,*args)
@@ -5366,7 +5366,7 @@ def _drv2_ppfsingle(self, q, *args):  # Use basic bisection algorithm
5366 5366
     else:
5367 5367
         qb = 1.0
5368 5368
     if isinf(a):    # be sure starting point < q
5369  
-        a = min(-100*q,-10)
  5369
+        a = int(min(-100*q,-10))
5370 5370
         while 1:
5371 5371
             if a <= self.a: qb = 0.0; break
5372 5372
             qa = self._cdf(a,*args)
@@ -5380,7 +5380,7 @@ def _drv2_ppfsingle(self, q, *args):  # Use basic bisection algorithm
5380 5380
             return a
5381 5381
         if (qb == q):
5382 5382
             return b
5383  
-        if b == a+1:
  5383
+        if b <= a+1:
5384 5384
     #testcase: return wrong number at lower index
5385 5385
     #python -c "from scipy.stats import zipf;print zipf.ppf(0.01,2)" wrong
5386 5386
     #python -c "from scipy.stats import zipf;print zipf.ppf([0.01,0.61,0.77,0.83],2)"
@@ -5392,10 +5392,16 @@ def _drv2_ppfsingle(self, q, *args):  # Use basic bisection algorithm
5392 5392
         c = int((a+b)/2.0)
5393 5393
         qc = self._cdf(c, *args)
5394 5394
         if (qc < q):
5395  
-            a = c
  5395
+            if a != c:
  5396
+                a = c
  5397
+            else:
  5398
+                raise RuntimeError('updating stopped, endless loop')
5396 5399
             qa = qc
5397 5400
         elif (qc > q):
5398  
-            b = c
  5401
+            if b != c:
  5402
+                b = c
  5403
+            else:
  5404
+                raise RuntimeError('updating stopped, endless loop')
5399 5405
             qb = qc
5400 5406
         else:
5401 5407
             return c
@@ -6661,7 +6667,7 @@ def _rvs(self, M, n, N):
6661 6667
     def _argcheck(self, M, n, N):
6662 6668
         cond = rv_discrete._argcheck(self,M,n,N)
6663 6669
         cond &= (n <= M) & (N <= M)
6664  
-        self.a = N-(M-n)
  6670
+        self.a = max(N-(M-n), 0)
6665 6671
         self.b = min(n,N)
6666 6672
         return cond
6667 6673
     def _logpmf(self, k, M, n, N):
14  scipy/stats/tests/test_distributions.py
@@ -938,6 +938,20 @@ def test_norm_logcdf():
938 938
     finally:
939 939
         np.seterr(**olderr)
940 940
 
  941
+def test_hypergeom_interval_1802():
  942
+    #these two had endless loops
  943
+    assert_equal(stats.hypergeom.interval(.95, 187601, 43192, 757),
  944
+                 (152.0, 197.0))
  945
+    assert_equal(stats.hypergeom.interval(.945, 187601, 43192, 757),
  946
+                 (152.0, 197.0))
  947
+    #this was working also before
  948
+    assert_equal(stats.hypergeom.interval(.94, 187601, 43192, 757),
  949
+                 (153.0, 196.0))
  950
+
  951
+    #degenerate case .a == .b
  952
+    assert_equal(stats.hypergeom.ppf(0.02, 100, 100, 8), 8)
  953
+    assert_equal(stats.hypergeom.ppf(1, 100, 100, 8), 8)
  954
+
941 955
 
942 956
 if __name__ == "__main__":
943 957
     run_module_suite()
Commit_comment_tip

Tip: You can add notes to lines in a file. Hover to the left of a line to make a note

Something went wrong with that request. Please try again.