BUG: solved issue with ppf #216

Closed
wants to merge 2 commits into from

4 participants

@nokfi

I first submitted this as a branch called ticket1493. That branch got mixed up with the branch repair-typo, so I removed ticket1493 and pushed this new branch. There were also some comments on the old, now removed, branch. I include them as one comment here.

@nokfi

I replaced the ppf function. Once this works, I'll remove xa and xb.

josef-pkt commented 8 hours ago
The new ppf function looks good. As discussed on the mailing list, the existing test should cover this case.

Possible extra optimization:
_ppf_to_solve uses cdf, which needs to go through the generic wrapping code each time.
We could use _cdf instead, but that would require additional checking/tests to see whether check_args are properly set (I think they are).

nokfi commented 2 hours ago
As a test I naively replaced self.cdf by self._pdf in _ppf_to_solve, and ran test_continuous_basic.py. This fails, so a simple replacement will not suffice.

josef-pkt commented 2 hours ago
"replaced self.cdf by self._pdf" did you actually use _pdf or _cdf ?

My guess was that, since you limit the range to the interval [.a, .b], _cdf might work, but maybe it would require the open interval (.a, .b).
I tried some cases like this in the past; sometimes it worked, but often it broke on a few distributions.

nokfi commented 2 hours ago
On 17 May 2012 21:30, Josef Perktold wrote:
> "replaced self.cdf by self._pdf" did you actually use _pdf or _cdf ?

To be explicit, I did this:

def _ppf_to_solve(self, x, q,*args):
    return apply(self._cdf, (x, )+args)-q

That is, self._cdf rather than self.cdf.

> my guess was, that, since you limit the range to the interval [.a, .b], _cdf might work, but maybe it would require open interval (.a, .b).
> I tried some cases like this in the past, and sometimes it worked often it broke on a few distributions.

I like to include .a and .b in the search interval to cover the cases
q = 0 and q = 1. Sure, mathematically speaking the return value of
cdf(x) = 1 can be set to np.inf. However, this is, at least for me,
less informative than self.b (and likewise for self.a).

So we stick to cdf, rather than _cdf, at least for the moment?

josef-pkt commented an hour ago
Yes, stick with .cdf. Worth a try, but if it doesn't work it's unrelated to current pull request.
Your limiting to [a,b] is the right thing to do

(aside: q=0, q=1 is supposed to be handled by the generic part, but using left=.a and right=.b if finite lets brentq search arbitrarily close to the boundary, which is singular in some cases (pdf goes to inf). Although, it might still break at a singularity, or if cdf uses intquad and we want ppf(1e-10))
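Editorial sketch: the bracketing idea discussed above (use the support bounds when they are finite, otherwise grow a guess by a constant factor until it brackets q) can be illustrated with the standard library alone. This is not the code from the pull request. The names `find_bracket`, `ppf_bisect`, `normal_cdf` and `expon_cdf` are hypothetical, plain bisection stands in for `optimize.brentq`, and q is assumed to lie strictly between 0 and 1.

```python
import math


def normal_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def expon_cdf(x):
    """Standard exponential CDF; the support starts at 0."""
    return -math.expm1(-x) if x > 0 else 0.0


def find_bracket(cdf, q, a=-math.inf, b=math.inf, factor=10.0):
    """Return (left, right) such that cdf(left) <= q <= cdf(right)."""
    left = a if a > -math.inf else None
    right = b if b < math.inf else None
    # Compare with None explicitly: a finite bound of 0.0 is falsy, so a
    # truthiness test would treat it as if the bound were missing.
    if left is None:
        left = -factor if right is None else min(-factor, right)
        while cdf(left) - q > 0.0:
            right = left        # the old guess is a valid upper bound
            left *= factor
    if right is None:
        right = max(factor, left)
        while cdf(right) - q < 0.0:
            left = right        # the old guess is a valid lower bound
            right *= factor
    return left, right


def ppf_bisect(cdf, q, a=-math.inf, b=math.inf, xtol=1e-12, maxiter=200):
    """Invert cdf at q by bisection on the bracket from find_bracket."""
    left, right = find_bracket(cdf, q, a, b)
    for _ in range(maxiter):
        if right - left <= xtol:
            break
        mid = 0.5 * (left + right)
        if cdf(mid) < q:
            left = mid
        else:
            right = mid
    return 0.5 * (left + right)


median_normal = ppf_bisect(normal_cdf, 0.5)         # unbounded support
median_expon = ppf_bisect(expon_cdf, 0.5, a=0.0)    # lower bound is 0.0
```

The explicit `is None` comparison matters for distributions whose lower bound is exactly 0, such as the exponential case in the usage lines.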

@dlax
Collaborator

Does this replace PR #214? If so, you should close the latter @nokfi.

@rgommers
Owner

general_cont_ppf can indeed be removed.

@rgommers
Owner

I didn't see the git question in PR-214 before, but I'm working on disentangling some of these PRs now.

Nicky, I think you are now used to creating different branches for new features, but keep in mind that you should create each branch separately based on master, and not on each other (you did that correctly for this PR now). Otherwise the same commits show up in several PRs.

@rgommers
Owner

The change to test_continuous_basic.py is already in master, that's why this can't be merged automatically. Once this is done, I can rebase and merge it.

@nokfi

Sorry for the mess. Is there something that I should do about this, or is it easy to fix?

@rgommers
Owner

No, it's no problem. This one's not too bad, PR-205 was the one requiring some surgery and that's merged now.

@josef-pkt
Collaborator

docstring line 879 and following still contain explanation for xa xb AFAICS

The only problem left here is the removal of xa and xb from the init. This can possibly break existing code even if it's not necessary anymore.

I'm a bit puzzled that I don't find any xa, xb defined in any of the distributions when the instances are created. A quick look at my own code shows that it doesn't use them anymore either. I was convinced that I changed xa and xb for some distributions.

If we want to be conservative with respect to backwards compatibility, we could add for example a **kwds to the init method, and warn,

if 'xa' in kwds or 'xb' in kwds: 
    import warnings
    <deprecation warning....>

Since I recently advertised xa, xb on stackoverflow (http://stackoverflow.com/questions/10678546/creating-new-distributions-in-scipy), it might be better to issue a deprecation warning instead of raising an exception if someone defines xa, xb.

Other than the xa,xb in the init it's a good change. If the test suite passes (including slow tests), then there are no hidden problems with any distribution.

@rgommers
Owner

Better to just not remove xa and xb keywords from the signature in that case, add the deprecation warning for it and open a new ticket to remove them for 0.12. Adding a new **kwds parameter is not so nice.

@rgommers
Owner

I can do that, remove the left over doc and also remove general_cont_ppf and merge this.

@josef-pkt
Collaborator

sounds good,
(I didn't like **kwds much either, but it would have removed xa, xb from the visible signature, although almost no one will look at the init signature)

@josef-pkt
Collaborator

in case it's not obvious: if xa, xb stay as kwd arguments, the defaults should be set to None, so it's easy to check whether someone used a value as argument.
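Editorial sketch: the approach agreed on above (keep xa and xb in the signature, default them to None, and warn when either is passed) might look roughly like this. `ContinuousDist` is a hypothetical stand-in rather than the actual class in scipy/stats/distributions.py, and the warning text is an assumption.

```python
import warnings


class ContinuousDist:
    """Hypothetical stand-in for a continuous distribution base class."""

    def __init__(self, a=None, b=None, xa=None, xb=None, xtol=1e-14):
        # xa and xb default to None, so any explicit value is detectable
        # regardless of what the old defaults (-10.0, 10.0) were.
        if xa is not None or xb is not None:
            warnings.warn(
                "The xa and xb keywords are deprecated and are ignored.",
                DeprecationWarning,
                stacklevel=2,
            )
        self.a = a
        self.b = b
        self.xtol = xtol


# Existing code that still passes xa keeps working but triggers a warning.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    ContinuousDist(a=0.0, xa=-10.0)
    ContinuousDist(a=0.0)
```

Only the first call adds an entry to `caught`; the second call passes neither keyword and stays silent.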

@rgommers
Owner

Opened http://projects.scipy.org/scipy/ticket/1667 to track removal of these params.

@rgommers
Owner

recipinvgauss and foldcauchy were still initialized with an xb parameter by the way.

@josef-pkt
Collaborator

I'm glad you found them, I searched everywhere but only for xa, not for xb. (and I can still trust my memory)

@rgommers
Owner

foldcauchy doesn't seem to have a test for this specific issue though. recipinvgauss apparently wasn't working anyway, according to the FIXME note above it.

@rgommers
Owner

Pushed as ed7f037 and 49ed1d2. Thanks Nicky and Josef.

@rgommers rgommers closed this
@jnothman jnothman referenced this pull request from a commit
Commit has since been removed from the repository and is no longer available.
@rgommers rgommers referenced this pull request from a commit in rgommers/scipy
@rgommers rgommers DEP: remove deprecated keywords `xa, xb` from continuous distributions.
Closes gh-2192.  See gh-216 for discussion.
388cf0c
Commits on May 17, 2012
  1. @nokfi

    BUG: solved issue with ppf

    nokfi authored
  2. @nokfi
31 scipy/stats/distributions.py
@@ -434,11 +434,9 @@ def _build_random_array(fun, args, size=None):
 ## (needs cdf function) and uses brentq from scipy.optimize
 ## to compute ppf from cdf.
 class general_cont_ppf(object):
-    def __init__(self, dist, xa=-10.0, xb=10.0, xtol=1e-14):
+    def __init__(self, dist, xtol=1e-14):
         self.dist = dist
         self.cdf = eval('%scdf'%dist)
-        self.xa = xa
-        self.xb = xb
         self.xtol = xtol
         self.vecfunc = sgf(self._single_call,otypes='d')
     def _tosolve(self, x, q, *args):
@@ -1077,7 +1075,7 @@ def _pdf:
     """
-    def __init__(self, momtype=1, a=None, b=None, xa=-10.0, xb=10.0,
+    def __init__(self, momtype=1, a=None, b=None,
                  xtol=1e-14, badvalue=None, name=None, longname=None,
                  shapes=None, extradoc=None):
@@ -1095,8 +1093,6 @@ def __init__(self, momtype=1, a=None, b=None, xa=-10.0, xb=10.0,
             self.a = -inf
         if b is None:
             self.b = inf
-        self.xa = xa
-        self.xb = xb
         self.xtol = xtol
         self._size = 1
         self.m = 0.0
@@ -1177,7 +1173,28 @@ def _ppf_to_solve(self, x, q,*args):
         return apply(self.cdf, (x, )+args)-q
     def _ppf_single_call(self, q, *args):
-        return optimize.brentq(self._ppf_to_solve, self.xa, self.xb, args=(q,)+args, xtol=self.xtol)
+        left = right = None
+        if self.a > -np.inf:
+            left = self.a
+        if self.b < np.inf:
+            right = self.b
+
+        factor = 10.
+        if not left: # i.e. self.a = -inf
+            left = -1.*factor
+            while self._ppf_to_solve(left, q,*args) > 0.:
+                right = left
+                left *= factor
+            # left is now such that cdf(left) < q
+        if not right: # i.e. self.b = inf
+            right = factor
+            while self._ppf_to_solve(right, q,*args) < 0.:
+                left = right
+                right *= factor
+            # right is now such that cdf(right) > q
+
+        return optimize.brentq(self._ppf_to_solve, \
+                               left, right, args=(q,)+args, xtol=self.xtol)
     # moment from definition
     def _mom_integ0(self, x,m,*args):
5 scipy/stats/tests/test_continuous_basic.py
@@ -305,8 +305,9 @@ def check_sample_meanvar(sm,m,msg):
 @_silence_fp_errors
 def check_cdf_ppf(distfn,arg,msg):
-    npt.assert_almost_equal(distfn.cdf(distfn.ppf([0.001,0.5,0.999], *arg), *arg),
-                            [0.001,0.5,0.999], decimal=DECIMAL, err_msg= msg + \
+    values = [0.001,0.5,0.999]
+    npt.assert_almost_equal(distfn.cdf(distfn.ppf(values, *arg), *arg),
+                            values, decimal=DECIMAL, err_msg= msg + \
                             ' - cdf-ppf roundtrip')
 @_silence_fp_errors
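
Editorial sketch: the cdf-ppf roundtrip that check_cdf_ppf exercises can be shown in isolation with a distribution whose cdf and ppf have closed forms. The functions below are hypothetical and use only the standard library. They are not part of the SciPy test suite, and the tolerance rule is a simplification chosen for this example.

```python
import math


def expon_cdf(x, scale=1.0):
    """CDF of the exponential distribution with the given scale."""
    return -math.expm1(-x / scale) if x > 0 else 0.0


def expon_ppf(q, scale=1.0):
    """Inverse of expon_cdf for 0 <= q < 1."""
    return -scale * math.log1p(-q)


def check_cdf_ppf(cdf, ppf, values=(0.001, 0.5, 0.999), decimal=8):
    """Assert that cdf(ppf(q)) reproduces q for each probe value."""
    tol = 10.0 ** (-decimal)
    for q in values:
        roundtrip = cdf(ppf(q))
        assert abs(roundtrip - q) < tol, (
            "cdf-ppf roundtrip failed at q=%r: got %r" % (q, roundtrip)
        )


# Passes silently for a consistent cdf/ppf pair.
check_cdf_ppf(expon_cdf, expon_ppf)
```

Probing near both tails (0.001 and 0.999) as well as the median is what makes the check sensitive to a poorly chosen search bracket.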