Fix for cornercases where ppf and isf methods fails #463

Closed
wants to merge 5 commits into
from

Conversation

Projects
None yet
4 participants
@pbrod
Contributor

pbrod commented Mar 10, 2013

This PR fixes two problems as described in Ticket1131:

  1. ppf method fails on array-like 'loc' or 'scale'
  2. ppf and isf methods return wrong value at the boundary (p=0 or p=1)

See http://projects.scipy.org/scipy/ticket/1131

scipy/stats/distributions.py
+
+ proxy_value = self.b * scale + loc
+ if product(shape(proxy_value)) != 1:
+ proxy_value = extract(cond2, proxy_value * cond2)

This comment has been minimized.

Show comment Hide comment
@josef-pkt

josef-pkt Mar 22, 2013

Member

many of these kind of expressions in stats distributions look like they are written the inherited style

maybe we should start to explicitly broadcast, which would make the code easier to understand (I always had and have a problem which calculations are done for "real" and which just to broadcast to get matching shapes.)

numpy.broadcast_arrays

I think this should be equivalent (not tested):

proxy_value = extract(cond2, np.broadcast_arrays(proxy_value, cond2) [0])
@josef-pkt

josef-pkt Mar 22, 2013

Member

many of these kind of expressions in stats distributions look like they are written the inherited style

maybe we should start to explicitly broadcast, which would make the code easier to understand (I always had and have a problem which calculations are done for "real" and which just to broadcast to get matching shapes.)

numpy.broadcast_arrays

I think this should be equivalent (not tested):

proxy_value = extract(cond2, np.broadcast_arrays(proxy_value, cond2) [0])

This comment has been minimized.

Show comment Hide comment
@pv

pv Mar 22, 2013

Member

place and extract are equivalent to fancy indexing, right? I would prefer new code to use indexing rather than these (legacy) functions...

@pv

pv Mar 22, 2013

Member

place and extract are equivalent to fancy indexing, right? I would prefer new code to use indexing rather than these (legacy) functions...

This comment has been minimized.

Show comment Hide comment
@pbrod

pbrod Mar 28, 2013

Contributor

@josef-pkt: An alternative and shorter is proxy_value = argsreduce(cond2, proxy_value) [0]
@pv: The argument against is that fancy indexing is slower and especially for ndim>1 and that place also works on zero dimensional arrays while fancy indexing does not.

@pbrod

pbrod Mar 28, 2013

Contributor

@josef-pkt: An alternative and shorter is proxy_value = argsreduce(cond2, proxy_value) [0]
@pv: The argument against is that fancy indexing is slower and especially for ndim>1 and that place also works on zero dimensional arrays while fancy indexing does not.

scipy/stats/distributions.py
@@ -1464,7 +1465,11 @@ def ppf(self,q,*args,**kwds):
cond = cond0 & cond1
output = valarray(shape(cond),value=self.a*scale + loc)
place(output,(1-cond0)+(1-cond1)*(q!=0.0), self.badvalue)
- place(output,cond2,self.b*scale + loc)
+
+ proxy_value = self.b * scale + loc

This comment has been minimized.

Show comment Hide comment
@josef-pkt

josef-pkt Mar 22, 2013

Member

naming to make it easier to understand, this is the upper bound on the support

proxy_value -> upp for example

same for lower bound

@josef-pkt

josef-pkt Mar 22, 2013

Member

naming to make it easier to understand, this is the upper bound on the support

proxy_value -> upp for example

same for lower bound

This comment has been minimized.

Show comment Hide comment
@pbrod

pbrod Mar 28, 2013

Contributor

I agree. I will fix that.

@pbrod

pbrod Mar 28, 2013

Contributor

I agree. I will fix that.

@josef-pkt

This comment has been minimized.

Show comment Hide comment
@josef-pkt

josef-pkt Mar 22, 2013

Member

Looks fine, but I'm not able to follow all the shape manipulations (without stepping through the code with examples)

Member

josef-pkt commented Mar 22, 2013

Looks fine, but I'm not able to follow all the shape manipulations (without stepping through the code with examples)

@@ -492,7 +492,8 @@ def interval(self, alpha):
def valarray(shape,value=nan,typecode=None):
"""Return an array of all value.
"""
- out = reshape(repeat([value],product(shape,axis=0),axis=0),shape)
+
+ out = ones(shape, dtype=bool) * value

This comment has been minimized.

Show comment Hide comment
@rgommers

rgommers Mar 31, 2013

Member

A lot more understandable than the old code. This is not really performance-critical, so no need for using empty and .fill, which would be faster probably.

@rgommers

rgommers Mar 31, 2013

Member

A lot more understandable than the old code. This is not really performance-critical, so no need for using empty and .fill, which would be faster probably.

@@ -2436,7 +2446,7 @@ def _stats(self):
return inf, inf, nan, nan
def _entropy(self):
return log(4*pi)
- def _fitstart(data, args=None):
+ def _fitstart(self, data, args=None):

This comment has been minimized.

Show comment Hide comment
@rgommers

rgommers Mar 31, 2013

Member

Hmm, that was quite a blatant bug.

@rgommers

rgommers Mar 31, 2013

Member

Hmm, that was quite a blatant bug.

+ loc=[3]*7,scale=2)
+ NaN = np.NaN
+ expected = np.array([ NaN, 1. , 1.00056419, 3.
+ , 4.99943581, 5. , NaN])

This comment has been minimized.

Show comment Hide comment
@rgommers

rgommers Mar 31, 2013

Member

these 3 lines could fit on a single line, will change that

@rgommers

rgommers Mar 31, 2013

Member

these 3 lines could fit on a single line, will change that

@rgommers

This comment has been minimized.

Show comment Hide comment
@rgommers

rgommers Mar 31, 2013

Member

Merged in c0e24e6, thanks Per.

Member

rgommers commented Mar 31, 2013

Merged in c0e24e6, thanks Per.

@rgommers rgommers closed this Mar 31, 2013

cimarronm pushed a commit to cimarronm/scipy that referenced this pull request Jan 23, 2014

@pbrod pbrod deleted the pbrod:Ticket1131 branch Apr 28, 2015

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment