Sort out benchmarks #315

Merged
Bodigrim merged 16 commits into haskell:master from Bodigrim:bench on Mar 28, 2021

@Bodigrim
Contributor

This continues the work started in #314 to simplify the package structure.

  • I replaced criterion with tasty-bench, a lightweight drop-in replacement. CI build times drop from 20 minutes to 5 minutes.
  • tasty-bench does not depend on text, so we can abandon the hs-source-dirs: ../src hack and benchmark the proper text package.
  • Further, there is no need to keep the benchmarks in a dedicated package: we can move them into the main one, simplifying the project structure. There is now only one package, with tests and benchmarks declared as such.
  • I merged the two benchmark suites into one, reducing the cabal configuration and decreasing link times.
  • As promised, I restored benchmark compatibility with GHC 7.10, and with 7.8 and 7.6 as well (because it was easy to do so).
  • The benchmarks used to take an impractically long time, hours and hours. This is because they included benchmarks for String (extremely long and noisy because of huge memory allocation) and for ByteString (which is nice, but doubles the run time and is out of scope for text). I removed them; the benchmarks now finish in around an hour.
  • Finally, I updated the build instructions. They are very simple now: once the test data has been downloaded, everything else is standard cabal invocations.

Member

@sjakobi sjakobi left a comment

Awesome! Having easy-to-run benchmarks is very useful to me right now! :)

README.markdown Outdated
Comment on lines +17 to +19
To run benchmarks please clone and unpack
[test files](https://github.com/bos/text-test-data)
into `benchmarks/text-test-data`.
Member

It took me a bit to figure out how to unpack the test files here. Including the actual commands might be useful:

$ git clone https://github.com/bos/text-test-data benchmarks/text-test-data
$ cd benchmarks/text-test-data
$ make

Member

Even better: Refer to tests-and-benchmarks.markdown.


Can't we just have a submodule for this?

Contributor Author

@sjakobi I updated README following your suggestion, thanks.

@kozross A submodule is a good idea, I think, but let's do it in a separate PR.

@sjakobi
Member

sjakobi commented Mar 15, 2021

With some of the benchmarks I was wondering whether they would ever finish, given that tasty-bench doubles the number of iterations each time the runtime deviation is too high. These are the ones that run longer than a second on my old ThinkPad:

$ grep -B2 -E '[0-9] s' bench-master0.out
      385 ms ± 3.3 ms
    StrictInitLength+ascii:    OK (3.31s)
      1.10 s ± 2.7 ms
--
      128 ms ± 5.1 ms
    LazyInitLength+ascii:      OK (3.72s)
      1.24 s ±  43 ms
--
  Multilang
    find_first:                OK (12.25s)
      4.09 s ± 167 ms
--
        879 ms ± 7.5 ms
      LazyText+ascii:          OK (6.51s)
        2.15 s ±  79 ms
--
        843 ms ± 8.9 ms
      LazyText+ascii:          OK (4.07s)
        1.30 s ±  19 ms
--
        111 ms ± 2.9 ms
      LazyText+ascii:          OK (3.73s)
        1.18 s ± 4.7 ms
--
        961 ms ±  26 ms
      LazyText+ascii:          OK (5.60s)
        1.84 s ±  25 ms
--
    intersperse
      Text+ascii:              OK (3.55s)
        1.13 s ±  36 ms
      LazyText+ascii:          OK (12.11s)
        3.98 s ±  74 ms
    isInfixOf
      Text+ascii:              OK (19.35s)
        6.40 s ± 134 ms
--
        822 ms ±  23 ms
      LazyText+ascii:          OK (6.44s)
        2.09 s ±  43 ms
--
    mapAccumR
      Text+ascii:              OK (6.01s)
        1.98 s ± 126 ms
      LazyText+ascii:          OK (6.37s)
        2.07 s ±  85 ms
--
        192 ms ± 2.7 ms
      LazyText+ascii:          OK (4.64s)
        1.49 s ±  34 ms
    replicate char
      Text+ascii:              OK (3.59s)
        1.14 s ± 4.6 ms
--
         18 ms ± 1.4 ms
      LazyText+ascii:          OK (31.63s)
        1.02 s ±  27 ms
--
    toLower
      Text+ascii:              OK (12.92s)
        4.25 s ±  93 ms
      LazyText+ascii:          OK (13.32s)
        4.41 s ±  42 ms
    toUpper
      Text+ascii:              OK (13.88s)
        4.60 s ±  42 ms
      LazyText+ascii:          OK (14.46s)
        4.79 s ±  53 ms
--
    words
      Text+ascii:              OK (18.92s)
        1.28 s ±  17 ms
      LazyText+ascii:          OK (35.66s)
        2.31 s ± 175 ms
    zipWith
      Text+ascii:              OK (10.67s)
        3.50 s ±  56 ms
      LazyText+ascii:          OK (12.86s)
        4.26 s ±  14 ms
--
           57 ms ± 4.6 ms
        LazyText+ascii:        OK (3.29s)
          1.04 s ± 9.6 ms
--
           74 ms ± 3.3 ms
        LazyText+ascii:        OK (3.81s)
          1.21 s ± 9.0 ms
--
          109 ms ± 3.1 ms
        LazyText+ascii:        OK (3.76s)
          1.20 s ±  20 ms
--
          110 ms ± 4.8 ms
        LazyText+ascii:        OK (3.65s)
          1.19 s ±  20 ms
--
           56 ms ± 3.1 ms
        LazyText+ascii:        OK (3.57s)
          1.13 s ± 3.7 ms
--
      intersperse
        Text+ascii:            OK (3.51s)
          1.14 s ±  17 ms
        LazyText+ascii:        OK (4.53s)
          1.45 s ±  12 ms
--
           85 ms ± 4.4 ms
        LazyText+ascii:        OK (3.35s)
          1.06 s ± 3.9 ms
--
           83 ms ± 1.4 ms
        LazyText+ascii:        OK (3.37s)
          1.06 s ± 5.6 ms
--
           53 ms ± 1.4 ms
        LazyText+ascii:        OK (3.28s)
          1.04 s ±  10 ms
--
           56 ms ± 3.3 ms
        LazyText+ascii:        OK (3.28s)
          1.04 s ± 4.8 ms
      toLower
        Text+ascii:            OK (11.51s)
          3.78 s ±  16 ms
        LazyText+ascii:        OK (12.25s)
          4.05 s ± 7.0 ms
      toUpper
        Text+ascii:            OK (12.53s)
          4.12 s ±  13 ms
        LazyText+ascii:        OK (13.26s)
          4.39 s ±  38 ms
--
      zipWith
        Text+ascii:            OK (7.99s)
          2.63 s ±  21 ms
        LazyText+ascii:        OK (12.23s)
          4.02 s ±  45 ms
--
    StripTags
      Text:                    OK (4.79s)
        1.59 s ±  90 ms
      TextByteString:          OK (4.07s)
        1.34 s ±  50 ms
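
The `grep -B2 -E '[0-9] s'` filter above keeps lines where a digit is followed by ` s`, that is, means reported in seconds, plus two lines of leading context, which is why some entries appear without their group name. A minimal sketch of a similar filter that pairs each result line with its mean is given below. It assumes the two-line layout seen in this log (a `name: OK (...s)` line followed by a `mean ± deviation` line) and handles only the `ms` and `s` units that occur here; `slow_benchmarks` is a hypothetical helper, not part of tasty-bench.

```python
import re

# Layout assumed from the log above, e.g. "Text+ascii:   OK (3.55s)".
RESULT = re.compile(r"^\s*(?P<name>.+?):\s+OK \((?P<wall>[0-9.]+)s\)\s*$")
# Mean line, e.g. "1.13 s ±  36 ms". Only the units seen in this log are handled.
MEAN = re.compile(r"^\s*(?P<value>[0-9.]+)\s+(?P<unit>ms|s)\s+±")

UNIT_TO_SECONDS = {"ms": 1e-3, "s": 1.0}


def slow_benchmarks(lines, threshold_s=1.0):
    """Return (name, mean in seconds) for entries whose mean is >= threshold_s."""
    slow = []
    name = None
    for line in lines:
        result = RESULT.match(line)
        if result:
            # Remember the benchmark name until its mean line arrives.
            name = result.group("name")
            continue
        mean = MEAN.match(line)
        if mean and name is not None:
            mean_s = float(mean.group("value")) * UNIT_TO_SECONDS[mean.group("unit")]
            if mean_s >= threshold_s:
                slow.append((name, mean_s))
            name = None
    return slow


if __name__ == "__main__":
    # Lines copied from the log above.
    sample = [
        "    StrictInitLength+ascii:    OK (3.31s)",
        "      1.10 s ± 2.7 ms",
        "    intersperse",
        "      Text+ascii:              OK (3.55s)",
        "        1.13 s ±  36 ms",
    ]
    for bench_name, mean_s in slow_benchmarks(sample):
        print(f"{bench_name}: {mean_s:.2f} s")
```

Unlike the grep context lines, this keeps only the benchmark's own name, so group headings such as `intersperse` would need separate tracking if they matter.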

@Bodigrim
Contributor Author

> These are the ones that run longer than a second on my old ThinkPad:

@sjakobi Mind you, criterion runs each benchmark for at least 5 seconds. The test data is extremely large: for instance, ascii.txt alone is 60 MB. According to your log, tasty-bench mostly succeeded after running only three iterations of a benchmark, which I think is reasonable behaviour.
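
As a rough cross-check of the "three iterations" reading, the wall-clock time that tasty-bench prints in parentheses can be divided by the reported mean. The sketch below does this for a few entries copied from the log. It assumes that wall time is dominated by the benchmark runs themselves, and that batches start at one iteration and double (1, 2, 4, ...), as the doubling was described earlier in this thread. `estimate_runs` and `doubling_batches` are hypothetical helper names, and Python is used here only for the arithmetic.

```python
import math

# (wall-clock seconds, mean seconds) copied from the log in this thread.
SAMPLES = {
    "StrictInitLength+ascii": (3.31, 1.10),
    "find_first": (12.25, 4.09),
    "isInfixOf Text+ascii": (19.35, 6.40),
    "words Text+ascii": (18.92, 1.28),
    "words LazyText+ascii": (35.66, 2.31),
}


def estimate_runs(wall_s: float, mean_s: float) -> float:
    """Approximate number of benchmark executions behind one log entry."""
    return wall_s / mean_s


def doubling_batches(total_runs: float) -> int:
    """Batches of 1, 2, 4, ... runs sum to 2**k - 1; recover the nearest k."""
    return round(math.log2(total_runs + 1))


if __name__ == "__main__":
    for name, (wall_s, mean_s) in SAMPLES.items():
        runs = estimate_runs(wall_s, mean_s)
        print(f"{name}: ~{runs:.1f} runs, ~{doubling_batches(runs)} batches")
```

Under these assumptions, a ratio near 3 corresponds to one batch of 1 run plus one batch of 2, while the two `words` entries come out near 15, which would be four batches (1 + 2 + 4 + 8).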

It would be nice to trim the test data to a more manageable size, because it takes 700 MB unpacked %) But let's do that separately from refurbishing the benchmarks.

@Bodigrim
Contributor Author

I tweaked the GC options to reduce noise in the measurements.

@tathougies @Lysxia @Boarders @parsonsmatt please review.

@Bodigrim
Contributor Author

@tathougies @Boarders @parsonsmatt unless there are comments / suggestions, I'd like to merge this by the end of the week.

@Boarders

LGTM!

@Bodigrim Bodigrim merged commit 020b94c into haskell:master Mar 28, 2021
@Bodigrim Bodigrim deleted the bench branch March 28, 2021 17:08

5 participants