Test "Pipes: HMM pipe model is equivalent to standard model" is flaky #283

turion · 2023-06-12T08:42:02Z

Failures like these occur often:

Failures:

  test/Spec.hs:141:5: 
  1) Pipes: HMM pipe model is equivalent to standard model
       Falsified (after 32 tests and 4 shrinks):
         [0.0,0.0,-1.6666666666666667,-1.6666666666666667]

  To rerun use: --match "/Pipes: HMM/pipe model is equivalent to standard model/"

Randomized with seed 309526202

This might even lead to monad-bayes getting marked broken in nixpkgs, see NixOS/nixpkgs#236431

The text was updated successfully, but these errors were encountered:

turion · 2023-06-12T08:42:56Z

Also see #260 , we should move pipes to a separate package. Then only that separate package would fail.

turion · 2023-06-12T08:54:33Z

Might be caused by nick8325/quickcheck#359

turion · 2023-06-12T09:49:34Z

Or similar to haskell/vector#460 it might be that the new generator produces more values for which we experience overflows.

turion · 2023-06-12T10:01:44Z

Easily reproduceable wtih this diff:

diff --git a/monad-bayes.cabal b/monad-bayes.cabal
index e74b1fe..94b3fed 100644
--- a/monad-bayes.cabal
+++ b/monad-bayes.cabal
@@ -189,7 +189,7 @@ test-suite monad-bayes-test
     , pipes
     , pretty-simple
     , profunctors
-    , QuickCheck
+    , QuickCheck == 2.14.3
     , random
     , statistics
     , text

And a file cabal.project.local containing:

tests: True

turion · 2023-06-12T10:28:02Z

The issue is in the hard sorting function in the compact function. I added the following diff to inspect the generated values:

diff --git a/test/TestPipes.hs b/test/TestPipes.hs
index 4efe68a..fe51a12 100644
--- a/test/TestPipes.hs
+++ b/test/TestPipes.hs
@@ -8,12 +8,14 @@ import Control.Monad.Bayes.Enumerator (enumerator)
 import Data.AEq (AEq ((~==)))
 import HMM (hmm, hmmPosterior)
 import Pipes.Prelude (toListM)
+import Test.QuickCheck (counterexample, Property, property)
 
 urns :: Int -> Bool
 urns n = enumerator (urn n) ~== enumerator (urnP n)
 
-hmms :: [Double] -> Bool
+hmms :: [Double] -> Property
 hmms observations =
   let hmmWithoutPipe = hmm observations
       hmmWithPipe = reverse . init <$> toListM (hmmPosterior observations)
-   in enumerator hmmWithPipe ~== enumerator hmmWithoutPipe
+      msg = unlines $ show <$> [("with   ", enumerator hmmWithPipe), ("without", enumerator hmmWithoutPipe)]
+  in counterexample msg $ property $ (fst <$> enumerator hmmWithPipe) ~== (fst <$> enumerator hmmWithoutPipe)

This gave e.g.

  1) Pipes: HMM pipe model is equivalent to standard model
       Falsified (after 9 tests and 2 shrinks):
         [0.0,0.0,-0.5]
         ("with   ",[([1,2,0],0.1546027526337784),([2,0,1],7.073671679025113e-2),([1,1,2],6.82502090909774e-2),([1,2,1],6.184110105351136e-2),([1,1,1],5.85001792208378e-2),([2,2,0],5.305253759268838e-2),([2,0,2],5.305253759268837e-2),([1,2,2],4.638082579013355e-2),([1,1,0],3.900011948055852e-2),([0,2,0],3.8836185197532215e-2),([2,1,2],3.643161738034025e-2),([2,0,0],3.53683583951256e-2),([1,0,1],3.533777203057793e-2),([2,1,1],3.122710061172023e-2),([0,1,2],2.6669130334371047e-2),([1,0,2],2.6503329022933453e-2),([0,1,1],2.2859254572318057e-2),([2,2,1],2.1221015037075363e-2),([2,1,0],2.0818067074480157e-2),([1,0,0],1.7668886015288984e-2),([2,2,2],1.591576127780652e-2),([0,2,1],1.5534474079012886e-2),([0,1,0],1.5239503048212037e-2),([0,2,2],1.1650855559259665e-2),([0,0,1],1.0356316052675273e-2),([0,0,2],7.767237039506448e-3),([0,0,0],5.1781580263376365e-3)])
         ("without",[([1,2,0],0.15460275263377876),([2,0,1],7.073671679025137e-2),([1,1,2],6.82502090909775e-2),([1,2,1],6.184110105351154e-2),([1,1,1],5.850017922083786e-2),([2,0,2],5.3052537592688506e-2),([2,2,0],5.3052537592688506e-2),([1,2,2],4.6380825790133653e-2),([1,1,0],3.900011948055858e-2),([0,2,0],3.883618519753228e-2),([2,1,2],3.643161738034034e-2),([2,0,0],3.536835839512568e-2),([1,0,1],3.533777203057802e-2),([2,1,1],3.122710061172028e-2),([0,1,2],2.666913033437113e-2),([1,0,2],2.650332902293353e-2),([0,1,1],2.2859254572318098e-2),([2,2,1],2.122101503707541e-2),([2,1,0],2.081806707448021e-2),([1,0,0],1.7668886015289015e-2),([2,2,2],1.5915761277806563e-2),([0,2,1],1.5534474079012931e-2),([0,1,0],1.5239503048212075e-2),([0,2,2],1.1650855559259689e-2),([0,0,1],1.0356316052675282e-2),([0,0,2],7.767237039506465e-3),([0,0,0],5.178158026337637e-3)])

Look closely at the 6th and 7th elements.

([2,2,0],5.305253759268838e-2),([2,0,2],5.305253759268837e-2)

vs.

([2,0,2],5.3052537592688506e-2),([2,2,0],5.3052537592688506e-2)

The floating point numbers are so close that they are ordered in two different ways. The equality check at the end is fine with small differences because it uses ~==, but the sorting algorithm distinguishes them because it uses hard comparison. A simple solution is to sort again before testing. PR upcoming.

reubenharry · 2023-06-12T22:57:44Z

The stuff with pipes was added by me, partly to make MCMC smoother, partly to explore the stuff we're now exploring with dunai/Rhine. It's really not important, and if it's a pain point, can happily be removed. Sorry for the hassle there.

Fix #283, fixes flaky pipes test

turion · 2023-06-13T11:14:37Z

No worries. It's actually helpful to have two different implementations of something for comparison. So I think there is value in that test. But it would be great if we can tackle #260.

turion mentioned this issue Jun 12, 2023

haskellPackages: update stackage and hackage NixOS/nixpkgs#236431

Merged

turion added a commit to turion/monad-bayes that referenced this issue Jun 12, 2023

Fix tweag#283, fixes flaky pipes test

d4f0d6e

turion closed this as completed in 060cbb4 Jun 13, 2023

turion added a commit that referenced this issue Jun 13, 2023

Merge pull request #284 from turion/fix_283

1979322

Fix #283, fixes flaky pipes test

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Test "Pipes: HMM pipe model is equivalent to standard model" is flaky #283

Test "Pipes: HMM pipe model is equivalent to standard model" is flaky #283

turion commented Jun 12, 2023

turion commented Jun 12, 2023

turion commented Jun 12, 2023

turion commented Jun 12, 2023

turion commented Jun 12, 2023

turion commented Jun 12, 2023

reubenharry commented Jun 12, 2023

turion commented Jun 13, 2023

Test "Pipes: HMM pipe model is equivalent to standard model" is flaky #283

Test "Pipes: HMM pipe model is equivalent to standard model" is flaky #283

Comments

turion commented Jun 12, 2023

turion commented Jun 12, 2023

turion commented Jun 12, 2023

turion commented Jun 12, 2023

turion commented Jun 12, 2023

turion commented Jun 12, 2023

reubenharry commented Jun 12, 2023

turion commented Jun 13, 2023