Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implement epsilon-based floating point number comparisons in tests #57

Merged
merged 47 commits into from
Jul 4, 2022
Merged

implement epsilon-based floating point number comparisons in tests #57

merged 47 commits into from
Jul 4, 2022

Conversation

sfindeisen
Copy link
Contributor

@sfindeisen sfindeisen commented Jun 15, 2022

Implements --eq-epsilon flag for cabal test which sets the precision for floating point value comparison.

Example 1 (no flag):

$ cabal test shortTestForCI --test-options='-p "/randomLL2 140 0 3 5 54282 5.0e-2/"'
[...]
Test suite shortTestForCI: RUNNING...
Short tests for CI
  MNIST RNN short tests
    randomLL2 140 0 3 5 54282 5.0e-2: OK (2.61s)

Example 2 (high precision):

$ cabal test shortTestForCI --test-options='-p "/randomLL2 140 0 3 5 54282 5.0e-2/" --eq-epsilon 0.0000000000000000001'
[...]
Test suite shortTestForCI: RUNNING...
Short tests for CI
  MNIST RNN short tests
    randomLL2 140 0 3 5 54282 5.0e-2: FAIL (2.68s)
      test/common/TestCommon.hs:62:
      wrong result: 30.061871495725264 is expected to be a member of [30.061871495723956,30.06187089990927]

Fixes #46

@sfindeisen sfindeisen marked this pull request as ready for review June 22, 2022 20:18
-- Ugly global epsilon used to compare floating point values.
eqEpsilonRef :: IORef Double
{-# NOINLINE eqEpsilonRef #-}
eqEpsilonRef = unsafePerformIO $ newIORef eqEpsilonDefault
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm proud of you.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It took me hours. It's so tricky! Did you know, for example, that this version:

epsilonRef :: Fractional a => IORef a
{-# NOINLINE epsilonRef #-}
epsilonRef = unsafePerformIO $ newIORef 3.1415

doesn't work?! Even if I explicitly write :: Double everywhere I use it. Why is that?!

Copy link
Owner

@Mikolaj Mikolaj Jun 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you try to construct a value of an existential type Exist a. Fractional a => IORef a. However, to construct a value of an existential type you need to provide that type, not only a value. And you don't do that.

I'm unclear about the details of existential vs universal types in Haskell, e.g., if it's about positive occurrences of a type variable, or strictly positive and what happens if a variable has mixed variance. However, in this example it's simple [edit: not so simple, see Tom's comment below]. The occurrence of a is not under any number of arrows, so the whole thing is existential, not universal. If it was existential, it could be abused: you could construct a reference by initializing it with a Float, but later update it with a Double, overwriting a nearby memory cell (assuming Floats are smaller than Doubles).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GHC implements type classes as dictionary passing, so

epsilonRef :: Fractional a => IORef a
epsilonRef = unsafePerformIO $ newIORef 3.1415

is rather similar to

epsilonRef :: Fractional a -> IORef a
epsilonRef fractionalDict = unsafePerformIO $ newIORef (fromRational fractionalDict 3.1415)

and it seems rather likely that this makes a fresh IORef containing 3.1415 at each use site. Would that explain what you see?

Copy link
Owner

@Mikolaj Mikolaj Jun 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fractional a -> IORef a

In that case, type variable a has an occurrence on the left hand side of an arrow so, yes, this could probably be a universal type if interpreted that way. Similarly, if this function took an argument to initialize the ref with: a -> IORef a. And if these are lambdas, the body of a lambda is never evaluated, so the ref is not created. But I guess

epsilonRef :: IORef a
epsilonRef = unsafePerformIO $ newIORef undefined

would be firmly in the existential territory and no lambda, so perhaps my explanation only applied to that one. Not tested.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tomjaguarpaw I can't comment on GHC's type classes implementation but yes, what I see is like a fresh IORef each time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a complete example:

import Data.IORef
import System.IO.Unsafe

epsilonRef :: Fractional a => IORef a
{-# NOINLINE epsilonRef #-}
epsilonRef = unsafePerformIO $ newIORef 3.1415

setEpsilon :: Fractional a => a -> IO ()
setEpsilon = atomicWriteIORef epsilonRef

main :: IO ()
main = do
  () <- (setEpsilon :: (Double -> IO ())) (0.2 :: Double)  -- doesn't work!
  epsilon <- (readIORef (epsilonRef :: IORef Double))
  print $ "epsilon: " ++ show epsilon

On my machine this prints:

"epsilon: 3.1415"

Is this what you'd expect?

@sfindeisen sfindeisen marked this pull request as draft June 23, 2022 05:54
@sfindeisen
Copy link
Contributor Author

Wait a sec with the review, I will tweak it a bit more.

@sfindeisen sfindeisen marked this pull request as ready for review June 29, 2022 10:08
@Mikolaj
Copy link
Owner

Mikolaj commented Jun 29, 2022

I assume this is ready for review? Github is not sending me email notifications when somebody changes PR status from draft to normal, so I need to guess from pauses in commit generation. :)

I'm surprised: HUnit-approx could be reused, after all, despite tasty not using HUnit? They are compatible enough regardless? That's great news. Is it because their internal types are not abstract and they agree? Was your initial implementation of assertApproxEqual approximately equal to that from HUnit-approx?

@Mikolaj Mikolaj requested a review from goldfirere June 29, 2022 19:08
@sfindeisen
Copy link
Contributor Author

I assume this is ready for review?

Yes please.

I'm surprised: HUnit-approx could be reused, after all, despite tasty not using HUnit? They are compatible enough regardless? That's great news. Is it because their internal types are not abstract and they agree? Was your initial implementation of assertApproxEqual approximately equal to that from HUnit-approx?

4x yes

Copy link
Owner

@Mikolaj Mikolaj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well done. And you validated @goldfirere's design by coming up with it independently.

It seems the tasty aux maintainers don't know that HUnit-approx works with their stuff. Perhaps that's because some additional code is needed, similar to what you wrote. May be worth turning their attention to this PR, even just to leave a paper trail for anybody browsing their bug tracker in search of such functions. If wonder if the extra code should go into HUnit-approx, tasty, tasty-HUnit-approx or if it has to be repeated in each application.

@goldfirere
Copy link
Collaborator

Oh, wow. I didn't think I'd be hearing from HUnit-approx again! :) Glad it was useful.

@Mikolaj
Copy link
Owner

Mikolaj commented Jun 30, 2022

Oh, wow. I didn't think I'd be hearing from HUnit-approx again! :) Glad it was useful.

We'll still make it big. :D

@sfindeisen
Copy link
Contributor Author

Thanks for the review @Mikolaj . What about the open discussions, esp. the one about Data.Vector.Generic? Will you have any comments there?

@Mikolaj
Copy link
Owner

Mikolaj commented Jul 1, 2022

Thanks for the review @Mikolaj . What about the open discussions, esp. the one about Data.Vector.Generic? Will you have any comments there?

Huh, github is not showing me anything. Could you give me links?

@sfindeisen
Copy link
Contributor Author

Huh, github is not showing me anything. Could you give me links?

If you click on Files changed tab then you'll see there Pending discussions.

@Mikolaj
Copy link
Owner

Mikolaj commented Jul 1, 2022

I can't see anything. The "Conversations" drop down menu shows an irrelevant discussion. Do you mean a different PR? I tried in a different browser, not logged in to github, but still nothing.

Copy link
Contributor Author

@sfindeisen sfindeisen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -427,9 +428,11 @@ readmeTests :: TestTree
readmeTests = testGroup "Simple tests for README"
[ testCase " Float (1.1, 2.2, 3.3)"
$ atanReadmeDReverse (V.fromList [1.1 :: Float, 2.2, 3.3])
-- TODO: change this to @?~
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This and the following TODO are tricky: if I just replace @?= with @?~, I am getting this compile-time error:

test/common/TestSingleGradient.hs:431:7: error:
    • No instance for (Traversable Vector) arising from a use of ‘@?~’
      There are instances for similar types:
        instance Traversable vector-0.12.3.1:Data.Vector.Vector
          -- Defined in ‘Data.Vector’
        instance Traversable Data.Vector.Vector
          -- Defined in ‘Data.Strict.Vector.Autogen’
    • In the second argument of ‘($)’, namely
        ‘atanReadmeDReverse (V.fromList [1.1 :: Float, 2.2, 3.3])
           @?~ (V.fromList [3.0715904, 0.18288425, 1.1761366], 4.937552)’
      In the expression:
        testCase " Float (1.1, 2.2, 3.3)"
          $ atanReadmeDReverse (V.fromList [1.1 :: Float, 2.2, 3.3])
              @?~ (V.fromList [3.0715904, 0.18288425, 1.1761366], 4.937552)
      In the second argument of ‘testGroup’, namely
        ‘[testCase " Float (1.1, 2.2, 3.3)"
            $ atanReadmeDReverse (V.fromList [1.1 :: Float, 2.2, ....])
                @?~ (V.fromList [3.0715904, ....], 4.937552),
          testCase " Double (1.1, 2.2, 3.3)"
            $ atanReadmeDReverse (V.fromList [1.1 :: Double, 2.2, ....])
                @?~ (V.fromList [3.071590389300859, ....], 4.9375516951604155)]’
    |
431 |       @?~ (V.fromList [3.0715904, 0.18288425, 1.1761366], 4.937552)
    |       ^^^
cabal: Failed to build lib:testLibrary from horde-ad-0.1.0.0 (which is
required by test:shortTestForCI from horde-ad-0.1.0.0).

Ok so I tried implementing the missing instance (in TestCommonEqEpsilon.hs):

import qualified Data.Vector.Generic as V

instance (V.Vector v a, AssertClose a) => AssertClose (v a, a) where
  (@?~) :: v a -> v a -> Assertion
  (@?~) (actual_xs, actual_x) (expected_xs, expected_x) =
    (@?~) actual_x expected_x >> assertCloseList "" (V.toList expected_xs) (V.toList actual_xs)

but then I get another error:

test/common/TestCommonEqEpsilon.hs:115:10: error:
    Duplicate instance declarations:
      instance (Traversable t, AssertClose a) => AssertClose (t a, a)
        -- Defined at test/common/TestCommonEqEpsilon.hs:115:10
      instance (V.Vector v a, AssertClose a) => AssertClose (v a, a)
        -- Defined at test/common/TestCommonEqEpsilon.hs:120:10
    |
115 | instance (Traversable t, AssertClose a) => AssertClose ((t a), a) where
    |          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
cabal: Failed to build lib:testLibrary from horde-ad-0.1.0.0 (which is
required by test:shortTestForCI from horde-ad-0.1.0.0).

This of course does not make any sense.

What am I doing wrong?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't work even if I fix the typo:

instance (V.Vector v a, AssertClose a) => AssertClose ((v a), a) where
  (@?~) :: (v a, a) -> (v a, a) -> Assertion
  (@?~) (actual_xs, actual_x) (expected_xs, expected_x) =
    (@?~) actual_x expected_x >> assertCloseList "" (V.toList expected_xs) (V.toList actual_xs)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^^ @Mikolaj here

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, I can see both comment chains in both tabs now. Looks like a not finalized review. The second error seems to be cause by pre-existing Traversable instance of Vector. The question is why it's not picked up initially, causing the first error. Would it help to importing (with empty import list) one of the modules it claims contains an instance?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The latter error is caused by boxed vectors having a Traversable instance and so providing an alternative instance of AssertClose. But we don't use boxed vectors, so this instance via Traversable is useless for us.

-- If the prefix is the empty string (i.e., @\"\"@), then the prefix is omitted
-- and only the expected and actual values are output.
assertClose :: forall a. (Fractional a, Ord a, Show a, HasCallStack)
=> String -- ^ The message prefix
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have those assertion messages all over the place. But they are not used, and assertClose is no longer exported. Shall I get rid of them?

In our tests I couldn't find a single assertion message either.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^^ @Mikolaj here

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I'd do that. If we suddenly need them, we can re-add.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or if tasty devs request it as a condition for including the new operations in the tasty package. Perhaps ask them now?

Copy link
Contributor Author

@sfindeisen sfindeisen Jul 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I asked them here.

@Mikolaj
Copy link
Owner

Mikolaj commented Jul 4, 2022

Thank you for the experiment. I turns out the instance is missing, except for boxed vectors. I wonder what deep reason is there for this and if we could add an orphan instance similar to the one for boxed vectors: https://hackage.haskell.org/package/vector-0.13.0.0/docs/src/Data.Vector.html#line-461

@Mikolaj
Copy link
Owner

Mikolaj commented Jul 4, 2022

Or, alternatively, you could provide the AssertClose instance only for Storable vectors, which is what we use.

@sfindeisen
Copy link
Contributor Author

sfindeisen commented Jul 4, 2022

First of all, our vector package version seems to be vector-0.12.3.1. But I can't see this version number anywhere: where is it coming from? I can see a lot of crap like:

$ find . -type f -name '*cabal*' -print0 | xargs -0 grep -i vector 
./dist-newstyle/build/x86_64-linux/ghc-9.2.2/horde-ad-0.1.0.0/build/autogen/cabal_macros.h:/* package vector-0.12.3.1 */
./dist-newstyle/build/x86_64-linux/ghc-9.2.2/horde-ad-0.1.0.0/build/autogen/cabal_macros.h:#ifndef VERSION_vector
./dist-newstyle/build/x86_64-linux/ghc-9.2.2/horde-ad-0.1.0.0/build/autogen/cabal_macros.h:#define VERSION_vector "0.12.3.1"
./dist-newstyle/build/x86_64-linux/ghc-9.2.2/horde-ad-0.1.0.0/build/autogen/cabal_macros.h:#endif /* VERSION_vector */
./dist-newstyle/build/x86_64-linux/ghc-9.2.2/horde-ad-0.1.0.0/build/autogen/cabal_macros.h:#ifndef MIN_VERSION_vector
./dist-newstyle/build/x86_64-linux/ghc-9.2.2/horde-ad-0.1.0.0/build/autogen/cabal_macros.h:#define MIN_VERSION_vector(major1,major2,minor) (\
./dist-newstyle/build/x86_64-linux/ghc-9.2.2/horde-ad-0.1.0.0/build/autogen/cabal_macros.h:#endif /* MIN_VERSION_vector */
./dist-newstyle/build/x86_64-linux/ghc-9.2.2/horde-ad-0.1.0.0/l/testLibrary/build/testLibrary/autogen/cabal_macros.h:/* package vector-0.12.3.1 */
./dist-newstyle/build/x86_64-linux/ghc-9.2.2/horde-ad-0.1.0.0/l/testLibrary/build/testLibrary/autogen/cabal_macros.h:#ifndef VERSION_vector
./dist-newstyle/build/x86_64-linux/ghc-9.2.2/horde-ad-0.1.0.0/l/testLibrary/build/testLibrary/autogen/cabal_macros.h:#define VERSION_vector "0.12.3.1"
./dist-newstyle/build/x86_64-linux/ghc-9.2.2/horde-ad-0.1.0.0/l/testLibrary/build/testLibrary/autogen/cabal_macros.h:#endif /* VERSION_vector */
./dist-newstyle/build/x86_64-linux/ghc-9.2.2/horde-ad-0.1.0.0/l/testLibrary/build/testLibrary/autogen/cabal_macros.h:#ifndef MIN_VERSION_vector
./dist-newstyle/build/x86_64-linux/ghc-9.2.2/horde-ad-0.1.0.0/l/testLibrary/build/testLibrary/autogen/cabal_macros.h:#define MIN_VERSION_vector(major1,major2,minor) (\
./dist-newstyle/build/x86_64-linux/ghc-9.2.2/horde-ad-0.1.0.0/l/testLibrary/build/testLibrary/autogen/cabal_macros.h:#endif /* MIN_VERSION_vector */
./horde-ad.cabal:                      HordeAd.Core.PairOfVectors
./horde-ad.cabal:                      HordeAd.Tool.MnistFcnnVector
./horde-ad.cabal:      , vector
./horde-ad.cabal:      , vector
./horde-ad.cabal:      , vector

The latest version, as you pointed out, is vector-0.13.0.0.

What is dist-newstyle and this autogen thing?

@Mikolaj
Copy link
Owner

Mikolaj commented Jul 4, 2022

But I can't see this version number anywhere: where is it coming from?

Cabal solved the constraints and that's (close to) the newest version permissible given what other packages require. You can see some of the process with cabal build -v2, though there may be better flags for that.

@Mikolaj
Copy link
Owner

Mikolaj commented Jul 4, 2022

What is dist-newstyle and this autogen thing?

That's where cabal builds things and, in particular, auto-generates some Haskell source files.

@sfindeisen
Copy link
Contributor Author

Or, alternatively, you could provide the AssertClose instance only for Storable vectors, which is what we use.

That's how I fixed it. Thanks. OK to merge?

@Mikolaj
Copy link
Owner

Mikolaj commented Jul 4, 2022

Looks great. @tomjaguarpaw: any further comments from you?

@tomjaguarpaw
Copy link
Collaborator

I haven't really been following, so nothing to add from me!

@Mikolaj
Copy link
Owner

Mikolaj commented Jul 4, 2022

Thank you Staszek. Thank you Tom. The branch seems rebased so, Staszek, please feel free to merge (perhaps with the squash method).

@sfindeisen sfindeisen merged commit f81327b into Mikolaj:master Jul 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

implement epsilon-based floating point number comparisons in tests
4 participants