Poor precision in BinomialDistribution for probability & logProbablilty #166

maralorn · 2020-06-24T08:03:08Z

A recent built on the nixos build server hydra failed with this error for statistics-0.15.2.0:

Tests for: BinomialDistribution
   <...>
   log probabilty check:                                                                  FAIL
     *** Failed! Falsified (after 34 tests):
     binomial 79 9.83897381493678e-2
     7
     probability    = 0.14935400356648396
     logProbability = -1.9014359280741342
     log p          = -1.9014359280741266
     eps            = 3.970429114115656e-15
     Use --quickcheck-replay=39638 to reproduce.

My best guess is, that this is not indicative of a bug in the library. Maybe the eps is to small?
I don‘t assume that this is the result of linking against a wrong library version or something, i.e. a issue in nixpkgs. But please enlighten me.

The text was updated successfully, but these errors were encountered:

Shimuuar · 2020-06-24T08:44:01Z

My bet is naive error estimation. It always fails somewhere in parameter space. I'll check it out

Shimuuar · 2020-08-13T19:40:38Z

Playing a bit binomial distribution shows that yes. It's a bug. Ironically test failure happens in region where algorithm performs tolerable.

Shimuuar · 2020-08-13T20:03:04Z

It turns out main contribution to error comes from binomial coefficient.

Shimuuar · 2022-01-07T19:49:35Z

I took second look. For probability main contribution to error is term (1-p)^(n-k) . It's possible to replace it with exp(log1p(-p)*(n-k)) but it isn't clear win for small (1-p). I wish I had pow1p. For logDensity we subtract two numbers of similar magnitude with usual consequences.

In the meantime I just increased error tolerance. test should not fail any more.

This should fix failure found in #166: log probabilty check: FAIL *** Failed! Falsified (after 34 tests): binomial 79 9.83897381493678e-2 7 probability = 0.14935400356648396 logProbability = -1.9014359280741342 log p = -1.9014359280741266 eps = 3.970429114115656e-15 It turns out that main contribution to error in probability is (1-p)^(n-k) and for logProbability we subtract values of same magnitude with usual consequences for precision

GeorgeCo · 2022-04-16T14:44:16Z

@Shimuuar Thanks for your work on this! This sounds like a regression, if so, does it make sense to open a low priority bug on that?

wrt to your comment above on pow1p, it would be nice if the package math-functions had pow1p. When I have a chance I'll file an ER for that and update this. I guess the ER will be to start with pow1p and ultimately add everything IEEE recommends, as listed in JuliaLang/julia#6148. BTW do you know of any library that has implementations? https://www.gnu.org/software/gsl/doc/html/math.html has some.

vcunat · 2024-03-27T06:38:40Z

statistics-0.16.2.1 on aarch64-darwin:

      log probability check:                                                                 FAIL
        *** Failed! Falsified (after 41 tests):
        negativeBinomial 98.7162369063961 0.7010879233617198
        39
        probability    = 4.97063436288968e-2
        logProbability = -3.0016227156162785
        log p          = -3.0016227156162563
        ulps[log]      = 50
        ulps[lin]      = 159
        Use --quickcheck-replay=805898 to reproduce.
        Use -p '!((/Pearson correlation/||/t_qr/)||/Tests for: FDistribution.1-CDF is correct/)&&/Tests for: NegativeBinomialDistribution.log probability check/' to rerun this test only.

Full log – I don't expect it's interesting and might be overwritten by retries of exactly the same build, but let me link it: https://cache.nixos.org/log/b7pzjh4s2dh41zgiwb9mvl0qky945xj6-statistics-0.16.2.1.drv

maralorn changed the title ~~Testfailure for Binomialdistribution~~ Test failure for BinomialDistribution Jun 24, 2020

Shimuuar changed the title ~~Test failure for BinomialDistribution~~ Poor precision in BinomialDistribution for probability & logProbablilty Jan 7, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Poor precision in BinomialDistribution for probability & logProbablilty #166

Poor precision in BinomialDistribution for probability & logProbablilty #166

maralorn commented Jun 24, 2020

Shimuuar commented Jun 24, 2020

Shimuuar commented Aug 13, 2020

Shimuuar commented Aug 13, 2020

Shimuuar commented Jan 7, 2022

GeorgeCo commented Apr 16, 2022

vcunat commented Mar 27, 2024

Poor precision in BinomialDistribution for probability & logProbablilty #166

Poor precision in BinomialDistribution for probability & logProbablilty #166

Comments

maralorn commented Jun 24, 2020

Shimuuar commented Jun 24, 2020

Shimuuar commented Aug 13, 2020

Shimuuar commented Aug 13, 2020

Shimuuar commented Jan 7, 2022

GeorgeCo commented Apr 16, 2022

vcunat commented Mar 27, 2024