Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Criterion never ends when run by stack bench --profile #214

Closed
t3hmrman opened this issue May 27, 2019 · 3 comments

Comments

Projects
None yet
2 participants
@t3hmrman
Copy link

commented May 27, 2019

Having a real headscratcher -- working on a tiny data-sketching library and I'm trying to do some benchmarking, but I'm finding that criterion loops endlessly for some reason when run with stack bench --profile (stack bench works great):

package.yaml:

benchmarks:
  countmin-bench:
    main: Bench.hs
    source-dirs: bench
    dependencies:
      - countmin
      - criterion
      - QuickCheck

## The next few lines don't work commented or uncommented
#    ghc-options:
#    - -threaded
#    - -rtsopts
#    - -with-rtsopts=-N

bench/Bench.hs:

{-# LANGUAGE NumericUnderscores #-}
import RIO
import Criterion.Main (defaultMain, bgroup, env, nf, bench)

import Data.Maybe (fromJust)
import BenchData (genRandomStrings)
import Hashes.FarmHash (hFarmHash)
import Hashes.MD5 (hMD5)
import Hashes.SHA1 (hSHA1)
import Lib (buildSketch, Sketch, CountMin(..))

-- | Default sketch, w/ all available hask fns used
defaultSketch :: Sketch
defaultSketch = fromJust $ buildSketch [hMD5, hFarmHash, hSHA1]

main :: IO ()
main = defaultMain [
        env (genRandomStrings 100_000) $ \ ~(values, _) ->
            bgroup "counting random strings (1)" [ bench "w/ Sketch" $ nf (totalCount . foldl' increment defaultSketch) values ]
       ]

-- main :: IO ()
-- main = genRandomStrings 100_000
--        >>= \(values, _) -> pure (foldl' increment defaultSketch values)
--        >>= putStrLn . show . totalCount

The commented out alternative main runs to completion w/ stack bench --profile -- it's not even doing any actual benchmarking/measurement, but I get the space usage I was after. When I run the uncommented code (which I think is a decent criterion main function), stack bench --profile never returns (after a while, I can see a partially written .prof with a ton of runs.

Here's the top portion of the .prof file left over when I stop stack bench --profile with ^C:

	Mon May 27 23:11 2019 Time and Allocation Profiling Report  (Final)

	   countmin-bench +RTS -p -RTS

	total time  =       32.16 secs   (32162 ticks @ 1000 us, 1 processor)
	total alloc = 38,942,052,296 bytes  (excludes profiling overheads)

COST CENTRE         MODULE                               SRC                                                   %time %alloc

increment           Lib                                  src/Lib.hs:(33,5)-(62,59)                              33.7   14.2
nf'.go.y            Criterion.Measurement.Types.Internal src/Criterion/Measurement/Types/Internal.hs:49:28-35   27.0   56.9
primitive           Control.Monad.Primitive              Control/Monad/Primitive.hs:97:3-16                      5.2    2.2
randomIvalInteger   System.Random                        System/Random.hs:(468,1)-(489,76)                       5.0    4.1
createBlock256      System.Random.TF.Gen                 src/System/Random/TF/Gen.hs:(41,1)-(47,26)              4.0    3.5
mash                System.Random.TF.Gen                 src/System/Random/TF/Gen.hs:(117,1)-(126,29)            3.1    1.3
>>=.\.(...)         Test.QuickCheck.Gen                  Test/QuickCheck/Gen.hs:68:15-35                         1.8    0.9
tfGenSplit          System.Random.TF.Gen                 src/System/Random/TF/Gen.hs:(166,1)-(173,35)            1.7    4.4
randomIvalInteger.f System.Random                        System/Random.hs:(486,8)-(489,76)                       1.6    0.3
>>=.\               Test.QuickCheck.Gen                  Test/QuickCheck/Gen.hs:(66,7)-(69,20)                   1.4    0.0
split               System.Random.TF.Gen                 src/System/Random/TF/Gen.hs:180:3-20                    1.2    0.6
mkTFGen             System.Random.TF.Gen                 src/System/Random/TF/Gen.hs:(133,1)-(134,36)            1.1    3.8
tfGenNext'          System.Random.TF.Gen                 src/System/Random/TF/Gen.hs:(157,1)-(163,32)            0.8    2.5
randomIvalIntegral  System.Random                        System/Random.hs:462:1-71                               0.8    1.2

Just a note, I'm also aware of weigh and plan to give it a try (instead of using stack --profile), but was perplexed by the behavior above.

[EDIT] - Additionally, when I use a hardcoded list of values (rather than the random generation), it still doesn't return, here's what I see after stopping it early (w/ ^C):

	Mon May 27 23:15 2019 Time and Allocation Profiling Report  (Final)

	   countmin-bench +RTS -p -RTS

	total time  =       46.13 secs   (46132 ticks @ 1000 us, 1 processor)
	total alloc = 147,537,374,176 bytes  (excludes profiling overheads)

COST CENTRE  MODULE                               SRC                                                         %time %alloc

nf'.go.y     Criterion.Measurement.Types.Internal src/Criterion/Measurement/Types/Internal.hs:49:28-35         42.7   27.3
incrementStr Types                                src/Types.hs:37:5-54                                         30.3   64.5
increment    Lib                                  src/Lib.hs:(33,5)-(62,59)                                    23.6    8.2
nf'.go       Criterion.Measurement.Types.Internal src/Criterion/Measurement/Types/Internal.hs:(48,5)-(50,49)    1.8    0.0

Apologies the code changes a tiny bit -- I use incrementStr instead of increment but regardless, it's still weird that it just hangs... I must be using nf wrong or something.

[EDIT2] - I've also tried with whnf instead of nf, same result

@RyanGlScott

This comment has been minimized.

Copy link
Collaborator

commented May 27, 2019

I'm afraid I can't say whether this is a bug or not simply because there's no way for me to reproduce this. The example depends on some countmin library (which is not on Hackage) and a mysterious Lib module (which I have no idea where it may come from). There's also several other external dependencies that I don't recognize, so it's difficult to say whether there are any adverse effects from that code or not, as I'm not familiar with them.

Trimming the example down to something with as few dependencies as possible would be helpful.

@t3hmrman

This comment has been minimized.

Copy link
Author

commented May 28, 2019

Hi @RyanGlScott my apologies -- I thought it might be something really obvious with my possible misuse of criterion -- the code will be/is open source all the code (it's for a short blog post), please find it @ mrman/haskell-countmin on GitLab.

I assume you don't want to look through all that random code, so I'll try and make a more minimal example, I'm seeing if removing the hash functions works (maybe it's because of the auto-cost-center creation stuff getting put inside of the hash functions...). I'm trying to look through the profiling documentation, and maybe I'll switch to manual cost-center insertion.

@t3hmrman

This comment has been minimized.

Copy link
Author

commented Jun 1, 2019

Hey so I'm not sure what was wrong, it might have something to do with my Sketch type not implementing NFData but in the meantime I got around it by using nfIO (and returning a type that does have an NFData instance) to avoid that requirement:

main :: IO ()
main = defaultMain [
        bgroup "counting emails" [ bench "w/ Sketch" $ nfIO $ pure $ totalCount (foldl' incrementStr defaultSketch emails) ]
       ]

Not ideal, but it at least gets me past the problem I was having for now.

I'm going to see if I can implement NFData for my Sketch type (it couldn't be automatically derived because it contains some functions, IIRC).

I'm going to close this for now, I think I didn't use criterion right (not implementing NFData from the get-go)

@t3hmrman t3hmrman closed this Jun 1, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.