Out of support behavior in distribution functions is not consistent with base R functions #5

hennerw · 2019-01-31T18:31:24Z

With base R distribution functions if you ask the d function for a value it will return 0 rather than an error. Similarly for the p functions, if you ask for a q value greater or less than the support it will return 1 or 0 respectively.

Examples:

> pbinom(2,1,.5)
[1] 1
> pbinom(-1,1,.5)
[1] 0
> dbinom(2,1,.5)
[1] 0

With the rmutil functions out of support values give an error.

> pbetabinom(10,9,.2,6)
Error in pbetabinom(10, 9, 0.2, 6) : q must be <= size
> dbetabinom(10,9,.2,6)
Error in dbetabinom(10, 9, 0.2, 6) : y must be <= size
> dbetabinom(10,9,0.2,6)
Error in dbetabinom(10, 9, 0.2, 6) : y must be <= size
> dbetabinom(-1,9,0.2,6)
Error in dbetabinom(-1, 9, 0.2, 6) : y must contain non-negative values

The base R method of dealing with this is the mathematically correct one: P(2 heads, one coin flip) =0. P(5 or fewer heads on 3 coin flips)=1. These are defined, if typically uninteresting, values and should not generate an error.

I have not checked all of the functions, but it looks like the fix should be fairly simple. For pbetabinom at least, removing the check will generate the expected results with no additional code.

The text was updated successfully, but these errors were encountered:

swihart · 2019-02-01T15:58:40Z

Thanks for opening an issue. I replaced the fewest lines possible to get the functionality requested. Basically I replaced stop() with message() and updated the message. I had to keep in mind the vector case of inputs.

Try out the following and let me know if it fixes the issue for pbetabinom and dbetabinom...

## We want behavior like pbinom():

## this should handle the below support case
## i.e. negative values...
pbinom(c(-2,-1,0), 5, 0.5)
pbetabinom(c(-2,-1,0), 9, 0.2, 6)

## this should handle the above support case
## i.e. the q>size cases
pbinom(c(5,6,7), 5, 0.5)
pbetabinom(c(9,10,11), 9, 0.2, 6)

which should give:

> ## We want behavior like pbinom():
> 
> ## this should handle the below support case
> ## i.e. negative values...
> pbinom(c(-2,-1,0), 5, 0.5)
[1] 0.00000 0.00000 0.03125
> pbetabinom(c(-2,-1,0), 9, 0.2, 6)
Negative values of q replaced with 0
[1] 0.2859368 0.2859368 0.2859368
> 
> ## this should handle the above support case
> ## i.e. the q>size cases
> pbinom(c(5,6,7), 5, 0.5)
[1] 1 1 1
> pbetabinom(c(9,10,11), 9, 0.2, 6)
Elements of q that were greater than size were set equal to size
[1] 1 1 1

For the density functions, similar approach.

## We want behavior like dbinom():

## this should handle the below support case
## i.e. negative values...
dbinom(c(-2,-1,0), 5, 0.5)
dbetabinom(c(-2,-1,0), 9, 0.2, 6)

## this should handle the above support case
## i.e. the q>size cases
dbinom(c(5,6,7), 5, 0.5)
dbetabinom(c(9,10,11), 9, 0.2, 6)

should give

> ## We want behavior like dbinom():
> 
> ## this should handle the below support case
> ## i.e. negative values...
> dbinom(c(-2,-1,0), 5, 0.5)
[1] 0.00000 0.00000 0.03125
> dbetabinom(c(-2,-1,0), 9, 0.2, 6)
Negative values of y replaced with -1 to yield 0 value density
[1] 0.0000000 0.0000000 0.2859368
> 
> ## this should handle the above support case
> ## i.e. the q>size cases
> dbinom(c(5,6,7), 5, 0.5)
[1] 0.03125 0.00000 0.00000
> dbetabinom(c(9,10,11), 9, 0.2, 6)
Elements of y exist that are greater than size
[1] 0.0008552741 0.0000000000 0.0000000000

I'll just submit these changes to CRAN for pbetabinom and dbetabinom. Were there any other functions you really wanted to work this way?

hennerw · 2019-02-01T17:02:18Z

I think your solution is better than mine.

No correction needs to be made for the q > size cases as the result looks like it will be 1 anyway.

For q <0 I had been doing it inductively:

# after corrects for vector inputs
  if (any(q < 0)){
      # stop("q must contain non-negative values")
     # browser()
     if(all(q < 0)) return(rep(0,len))
     Val <- q>=0
     out <- rep(0,len)
     out[Val] <- pbetabinom_c(q[Val],size[Val],m[Val],s[Val])
     return(out)
   }

However you're right. ifelse(q<0,res,0) at the end would be the simplest solution.

The beta Binomial functions are currently the only functions I am making use of right now. I would suggest updating the rest of them for consistency with other base functions, but all of it's up to you.

Thank you for the quick response (And the great package)!

swihart · 2019-02-01T17:04:59Z

I'll leave this issue open and will work on the rest of the functions as I can. Thanks for writing! I've already submitted v1.1.2 to CRAN.

hennerw · 2019-02-13T19:07:22Z

Hi Swihart,

I took a look at the updated version today and I think that it is incorrect. I'm sorry I did not notice earlier.

Setting q to 0 in the case of q < 0 is not correct. 0 is a possible outcome, -1 is not. This results in incorrect results:

> pbetabinom(0, 5, 0.2, 6)
[1] 0.4297082
> pbetabinom(-1, 5, 0.2, 6)
Negative values of q replaced with 0
[1] 0.4297082

The result in the second case should be 0. It is possible to get 0 or fewer success out of 5, but it is not possible to get -1 or fewer out of 5, so the probability should be 0. (The q > size case does work out)

Here is an easy inductive fix: (Added after vectorization)

  if (any(q > size)){
    # Updating to correctly deal with this siduation
    # stop("q must be <= size")
    if(all(q>size)) return(rep(1,len))
    Val <- q<= size
    out <- rep(1,len)
    out[Val] <- pbetabinom_c(q[Val],size[Val],m[Val],s[Val])
    return(out)
  }
  if (any(q < 0)){
    # stop("q must contain non-negative values")
    if(all(q < 0)) return(rep(0,len))
    Val <- q>=0
    out <- rep(0,len)
    out[Val] <- pbetabinom_c(q[Val],size[Val],m[Val],s[Val])
    return(out)
  }

Here is my use case for out of range results (Besides consistency with base R functions):

I am flipping 20 coins with an unknown probability of coming up heads. After each flip I stop if the probability that I'll have fewer than 9 heads at the end of the experiment is less than 10% and declare the experiment a failure.

I would like to generate a heat map showing the probability of ending with fewer than 9 successes given any potential combination of observations/successes during the experiment:

I could add my own handling to the cases where the observed successes are already greater than 9 or the number of successes required to reach 9 is fewer than the remaining number of trials. However the coding is cleaner if the pbetabinom function can correctly handle those cases. (and again inline with the handling of the base R functions)

swihart · 2019-03-04T14:19:11Z

Is dbetabinom() okay? Is it just pbetabinom()?

swihart · 2019-03-04T14:49:04Z

I think I have a fix for pbetabinom that changes the original code as little as possible and give the desired results. Thank you for your example code and continued interest. I'll get it out in v1.1.3 soon.

swihart · 2019-03-04T14:50:40Z

## We want behavior like pbinom():

## this should handle the below support case
## i.e. negative values...
pbinom(c(-2,-1,0), 5, 0.5)
pbetabinom(c(-2,-1,0), 9, 0.2, 6)



## this should handle the above support case
## i.e. the q>size cases
pbinom(c(5,6,7), 5, 0.5)
pbetabinom(c(9,10,11), 9, 0.2, 6)

Now gives

> ## We want behavior like pbinom():
> 
> ## this should handle the below support case
> ## i.e. negative values...
> pbinom(c(-2,-1,0), 5, 0.5)
[1] 0.00000 0.00000 0.03125
> pbetabinom(c(-2,-1,0), 9, 0.2, 6)
Negative values of q detected.  `pbetabinom` returns 0 for such values.
[1] 0.0000000 0.0000000 0.2859368
> 
> 
> 
> ## this should handle the above support case
> ## i.e. the q>size cases
> pbinom(c(5,6,7), 5, 0.5)
[1] 1 1 1
> pbetabinom(c(9,10,11), 9, 0.2, 6)
Elements of q that were greater than size detected. `pbetabinom` returns 1 for such values.
[1] 1 1 1

swihart · 2019-03-04T16:15:35Z

check out v1.1.3. off to cran.

I'll look into patching up other [pd] functions in near future.

swihart added a commit that referenced this issue Feb 1, 2019

minimal changes for issue #5

de358cb

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Out of support behavior in distribution functions is not consistent with base R functions #5

Out of support behavior in distribution functions is not consistent with base R functions #5

hennerw commented Jan 31, 2019

swihart commented Feb 1, 2019

hennerw commented Feb 1, 2019

swihart commented Feb 1, 2019 •

edited

Loading

hennerw commented Feb 13, 2019

swihart commented Mar 4, 2019

swihart commented Mar 4, 2019 •

edited

Loading

swihart commented Mar 4, 2019

swihart commented Mar 4, 2019

Out of support behavior in distribution functions is not consistent with base R functions #5

Out of support behavior in distribution functions is not consistent with base R functions #5

Comments

hennerw commented Jan 31, 2019

swihart commented Feb 1, 2019

hennerw commented Feb 1, 2019

swihart commented Feb 1, 2019 • edited Loading

hennerw commented Feb 13, 2019

swihart commented Mar 4, 2019

swihart commented Mar 4, 2019 • edited Loading

swihart commented Mar 4, 2019

swihart commented Mar 4, 2019

swihart commented Feb 1, 2019 •

edited

Loading

swihart commented Mar 4, 2019 •

edited

Loading