Boundary parameter handling #283

simonbyrne · 2014-09-16T12:10:43Z

How should we handle parameters which lie on the boundary? e.g.

Poisson(0) (see original discussion) FIXED!
~~Normal(0,0)~~
~~Beta(a,0) and Beta(0,b) for a,b > 0.~~

In most cases the limits end up being Dirac measures, though in some cases there can be ambiguity (e.g. Beta(0,0)).

If we do include these, we also need to decide how to handle skewness/kurtosis: presumably either NaN or throw an error.

Update:
Don't allow this for continuous distributions. Distributions that need update:

Poisson
Geometric
NegativeBinomial


andreasnoack · 2014-09-16T12:59:24Z

As argued on the list, I don't see the problem in extending the Poisson distribution to include λ=0. I doubt that skewness/kurtosis will be used for anything in the degenerate case anyway so whether they return Inf, NaN or an error is less important to me. I think the limits (Inf) are nicest and that it is okay that degenerate distributions return different results for skewness/kurtosis depending on which distribution they degenerate from.

I'd say let's wait and see whether there is demand before changing Beta and Normal. Going from continuous to discrete is a bit more dramatic, so it may not be as useful as the Poisson case.

simonbyrne · 2014-09-16T13:52:45Z

That seems like a reasonable idea. This would also match the handling of Binomial.

nalimilan · 2014-09-16T15:03:35Z

I agree with @andreasnoack. As long as the limits are clearly defined, there's no reason to raise an error.

spaceLem · 2014-09-16T15:22:23Z

As the one who brought it up, I'm for the change. It's a more useful result than an error (at least to me and other modellers who are likely to encounter it), and it makes sense in the limit lambda -> 0. Also it matches behaviour in Matlab, Octave, R, and the GSL (although not Scipy).

jiahao · 2014-09-16T15:33:55Z

The mailing list thread had a question about the limiting skewness of Poisson distributions. One can do a more careful derivation, but empirically it looks like the limit is well defined as positive infinity:

julia> using Distributions

julia> for i=1:15
           P = Poisson(10.0^-i)
           println(i,"\t", skewness(P))
       end
1   3.162277660168379
2   10.0
3   31.622776601683796
4   100.0
5   316.2277660168379
6   1000.0
7   3162.277660168379
8   10000.0
9   31622.776601683792
10  99999.99999999999
11  316227.76601683797
12  1.0e6
13  3.1622776601683795e6
14  1.0e7
15  3.1622776601683795e7
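
For reference, the numbers above agree with the standard closed form for the Poisson skewness. This is a textbook result rather than something derived in this thread: the mean, the variance and the third central moment of a Poisson distribution all equal λ.

```latex
\gamma_1(\lambda)
  = \frac{\mathbb{E}\left[(X-\lambda)^3\right]}{\operatorname{Var}(X)^{3/2}}
  = \frac{\lambda}{\lambda^{3/2}}
  = \lambda^{-1/2},
\qquad
\lim_{\lambda \to 0^{+}} \gamma_1(\lambda) = +\infty .
```

With λ = 10^(-i) this gives 10^(i/2), which matches the printed values up to rounding (3.162..., 10, 31.62..., and so on).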

johnmyleswhite · 2014-09-16T16:09:39Z

As the person who's most worried about this proposed change, I'd like to argue for making this kind of change much more systematically. What worried me about the original proposal is that it seemed to only offer a small bit of convenience at the potential expense of formal correctness in lots of other computations.

In general, I've come to strongly prefer making decisions about core packages based on sweeping principles of design that dictate how the package should behave no matter what specific case is being considered. As @simonbyrne points out, there are many other boundary cases we should consider before deciding to allow Poisson(0).

For all of those cases, we might adopt the design principle that whenever a boundary condition exists and has a clear well-defined limit, we adopt the value at the limit as the value for that boundary condition. In particular, whenever a boundary condition is equivalent to a Dirac measure, we produce outputs equivalent to those we would produce for a hypothetical Dirac measure distribution type.
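
A minimal sketch of what that principle could look like for Poisson(0), assuming a hypothetical point-mass type. `PointMass`, `point_pmf`, `point_cdf`, `point_mean`, `point_var` and `poisson_pmf` are names made up for illustration; none of them is an existing Distributions.jl API.

```julia
# Hypothetical point-mass ("Dirac") type, for illustration only.
struct PointMass
    x::Float64   # location of the atom
end

# All probability mass sits at d.x.
point_pmf(d::PointMass, k::Real) = k == d.x ? 1.0 : 0.0

# The cdf is a unit step at d.x.
point_cdf(d::PointMass, k::Real) = k >= d.x ? 1.0 : 0.0

point_mean(d::PointMass) = d.x
point_var(d::PointMass)  = 0.0

# Textbook Poisson pmf. At λ = 0 no special case is needed here:
# exp(-0.0) == 1.0 and 0.0^0 == 1.0 in Julia, while 0.0^k == 0.0 for k > 0.
poisson_pmf(λ::Real, k::Integer) = exp(-λ) * λ^k / factorial(k)

# On the boundary the Poisson pmf coincides with the point mass at 0.
d = PointMass(0.0)
for k in 0:5
    @assert poisson_pmf(0.0, k) == point_pmf(d, k)
end
```

Under this reading, the mean and variance at λ = 0 are both 0, which is what the hypothetical `PointMass(0.0)` reports as well.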

If we do adopt that kind of design principle, I'd like to make sure we apply it systematically and not wait for someone to complain about inconsistencies in how we handle different distributions.

I suspect this principle could affect a lot of other distributions, including at least:

Binomial with zero counts
Cauchy with 0 scale
Dirichlet with some alpha = 0
DiscreteUniform with lower bound = upper bound
Exponential with scale = 0
Gamma with shape = 0 and scale = 0
Geometric with p = 0 or p = 1
InverseGamma with shape = 0 and scale = 0
Laplace with scale = 0
Logistic with scale = 0
LogNormal with log standard deviation = 0
NegativeBinomial with p = 0
Uniform with lower bound = upper bound

So I'd say that, if we're going to embrace boundary conditions, we should really embrace them and figure out how this design principle would impact everything in Distributions.

andreasnoack · 2014-09-16T23:27:02Z

Okay, let me try to break the filibuster attempt. If we were paid to spend all our time on Distributions, I think your proposal would be reasonable. However, our resources are scarce, so we should try to devote them where they are most useful. I don't think this allows much time to be spent going through all the methods of the Gumbel distribution for a zero scale parameter.

@spaceLem proposed a small change to the Poisson distribution which would make it a bit easier to use in an application and I don't believe that the change will give problems elsewhere.

A compromise could be to extend the discrete distributions only. I think it makes sense because, as argued on the list, the change from continuous to a point measure is more dramatic and, I think, less relevant.

StefanKarpinski · 2014-09-16T23:51:45Z

Another way to look at this issue is to ask when to indicate a problem for certain values of distribution parameters. There are some values that are all-around useless and should cause an error immediately. Others, like those being discussed here, seem to be ok or not depending on the question one then asks about the resulting distribution object. In such cases, it seems reasonable and in line with Julia's dynamic nature to allow construction and sensible questions, and to defer errors until the wrong question is actually asked. It also seems like for a lot of these questions there's an arguably correct non-finite answer.
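
A rough sketch of that idea, using a hypothetical `SketchPoisson` type rather than the real `Poisson` from Distributions.jl. All names below are invented for illustration, and both options for the skewness are shown without claiming which one the package should adopt.

```julia
# Hypothetical type for illustration only; not the Distributions.jl Poisson.
struct SketchPoisson
    λ::Float64
    function SketchPoisson(λ::Real)
        # Values that are useless for every question fail at construction.
        (isnan(λ) || λ < 0) && throw(ArgumentError("λ must be non-negative, got $λ"))
        new(λ)
    end
end

# Sensible questions work on the boundary λ = 0.
sketch_mean(d::SketchPoisson) = d.λ
sketch_var(d::SketchPoisson)  = d.λ

# Option A: return the limiting, non-finite answer (1 / sqrt(0.0) == Inf).
sketch_skewness(d::SketchPoisson) = 1 / sqrt(d.λ)

# Option B: defer the error until this particular question is asked.
function strict_skewness(d::SketchPoisson)
    d.λ == 0 && throw(DomainError(d.λ, "skewness is undefined for a point mass"))
    return 1 / sqrt(d.λ)
end

d = SketchPoisson(0)          # construction succeeds on the boundary
println(sketch_mean(d))       # prints 0.0
println(sketch_skewness(d))   # prints Inf
```

Either way, `SketchPoisson(-1)` still fails immediately, since no question about it has a sensible answer.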

StefanKarpinski · 2014-09-16T23:54:13Z

Also, there's a middle ground between handling cases in ad hoc fashion and implementing it all at once, consistently: figure out a good principle and implement some cases, but don't try to deal with all of them right away.

nalimilan · 2014-09-17T12:33:59Z

That's what I was going to suggest. @johnmyleswhite's criteria are good, but we can wait for actual use cases to come up before implementing them. Starting with common cases is a good strategy.

StefanKarpinski · 2014-09-17T14:51:39Z

Yeah, having a coherent policy means that any time the issue comes up, everyone knows what to do.

lindahua · 2014-11-08T16:53:21Z

I have no problem with allowing zero rate for Poisson. However, allowing zero scale for continuous distributions seems to be a more complex problem. Dealing with atomic distributions is nontrivial. How can you tell an infinite density with probability mass 1.0 from one with probability mass 0.5?

lindahua · 2014-11-08T16:59:28Z

Also, now Poisson distributions depend on Rmath. Does R support zero rate?

andreasnoack · 2014-11-08T17:01:25Z

> rpois(10, 0)
 [1] 0 0 0 0 0 0 0 0 0 0

richardreeve · 2015-07-31T16:06:40Z

I've submitted pull request #398 to fix the Poisson(0) issue.

Fixing Boundary parameter handling #283 for Poisson(0)

richardreeve · 2015-08-04T17:05:50Z

The Poisson(0) issue is now fixed since pull requests #398 and #401 have been merged.

simonbyrne · 2017-11-30T17:57:04Z

I would be keen to allow Normal with std dev = 0, as it comes up fairly often.

Part of #283. I have hit this issue a few times: it can be very handy, especially for simulation. See also JuliaStats/StatsFuns.jl#62, which tweaks the underlying functions to be more useful.

itsdebartha · 2024-03-06T13:34:51Z

I would very much like to incorporate Geometric with the success parameter p=1. I came across a situation in a simulation study of a response-adaptive treatment allocation and encountered some error relating to zero(p) < p < one(p) when an allocation probability becomes 1. Moreover, I think including p=1 will be generalising this distribution a bit more.
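
To make the p=1 case concrete, here is a small sketch in plain Julia. It assumes the parameterization in which X counts the failures before the first success; `geometric_pmf` and `geometric_mean` are hypothetical helper names, not functions from Distributions.jl.

```julia
# Assumed parameterization: X counts failures before the first success,
# so P(X = k) = p * (1 - p)^k for k = 0, 1, 2, ...
function geometric_pmf(p::Real, k::Integer)
    # Accept the upper boundary p == 1, unlike the strict check 0 < p < 1.
    # p == 0 stays rejected in this sketch.
    zero(p) < p <= one(p) || throw(ArgumentError("need 0 < p <= 1, got $p"))
    k < 0 && return zero(float(p))
    # At p == 1 this is 1 * 0^k, and Julia defines 0.0^0 == 1.0.
    return p * (1 - p)^k
end

# Mean number of failures; equals 0 when every trial succeeds.
geometric_mean(p::Real) = (1 - p) / p

@assert geometric_pmf(1.0, 0) == 1.0   # all mass at k = 0
@assert geometric_pmf(1.0, 3) == 0.0
@assert geometric_mean(1.0) == 0.0
```

With this formula the p=1 case needs no special branch: the distribution collapses to a point mass at 0, in the same way that Poisson(0) does.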

Am willing to create a PR if this seems a satisfactory addition to the people here...

lindahua added the internal label Nov 3, 2014

lindahua added this to the v0.8 milestone Jul 29, 2015

johnmyleswhite added a commit that referenced this issue Aug 4, 2015

Merge pull request #398 from richardreeve/master

8c52fa6

Fixing Boundary parameter handling #283 for Poisson(0)

lindahua removed this from the v0.9 milestone Feb 4, 2017

simonbyrne mentioned this issue Oct 26, 2018

Allow Normal distribution with zero std deviation #789

Merged
