
DEVLOG: A collection of notes accumulated during development.

[2011.06.24] Regression in stdGen performance

I just added a simple benchmark to make sure that whatever fix I introduce for trac ticket #5133 does not regress performance. Yet in doing so I discovered that I'm getting much worse performance out of rev 130e421e912d than I'm seeing in my installed random- package.

Current version:

How many random numbers can we generate in a second on one thread?
  Cost of rdtsc (ffi call):    100
  Approx getCPUTime calls per second: 234,553
  Approx clock frequency:  3,335,220,196
  First, timing with System.Random interface:
   68,550,189 random ints generated [constant zero gen]     ~ 48.65 cycles/int
      900,889 random ints generated [System.Random stdGen]  ~ 3,702 cycles/int

random- version:

How many random numbers can we generate in a second on one thread?
  Cost of rdtsc (ffi call):    75
  Approx getCPUTime calls per second: 215,332
  Approx clock frequency:  3,334,964,738
  First, timing with System.Random interface:
   71,683,748 random ints generated [constant zero gen]     ~ 46.52 cycles/int
   13,609,559 random ints generated [System.Random stdGen]  ~ 245 cycles/int

A >13X difference!! Both are compiled with the same options. The only difference is which System.Random is used.
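
The cycles/int column can be reproduced from the two quantities the benchmark reports. Since each test runs for one second, the figure appears to be simply the approximate clock frequency divided by the number of values generated. A minimal sketch of that arithmetic, assuming this is how the column is derived (the name cyclesPerInt is hypothetical and not part of the benchmark):

```haskell
-- Sketch only: reproduce the "cycles/int" column from the reported numbers.
-- Assumption: the benchmark runs for one second, so the clock frequency is
-- also the number of cycles spent generating `generated` values.
cyclesPerInt :: Integer -> Integer -> Double
cyclesPerInt clockHz generated = fromIntegral clockHz / fromIntegral generated

main :: IO ()
main = do
  let current   = cyclesPerInt 3335220196 900889    -- rev 130e421e912d
      installed = cyclesPerInt 3334964738 13609559  -- installed random package
  print current                -- cycles per int, current version
  print installed              -- cycles per int, installed version
  print (current / installed)  -- slowdown factor
```

With the numbers above this gives roughly 3,702 and 245 cycles/int, a ratio of about 15, which is consistent with the ">13X" estimate.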

When did the regression occur?

  • e059ed15172585310f9c -- 10/13/2010 perf still good
  • 6c43f80f48178ac617 -- SplittableGen introduced, still good perf
  • 130e421e912d394653a4 -- most recent, bad performance

OK... this is very odd. It was a heisenbug, because it has disappeared now! I'll leave this note here as a reminder to look for it in the future. -Ryan

[2011.06.24] Timing non-int types

The results are highly uneven:

Cost of rdtsc (ffi call):    84
Approx getCPUTime calls per second: 220,674
Approx clock frequency:  3,336,127,273
First, timing with System.Random interface:
  112,976,933 randoms generated [constant zero gen]         ~ 29.53 cycles/int
   14,415,176 randoms generated [System.Random stdGen]      ~ 231 cycles/int
       70,751 randoms generated [System.Random Floats]      ~ 47,153 cycles/int
       70,685 randoms generated [System.Random CFloats]     ~ 47,197 cycles/int
    2,511,635 randoms generated [System.Random Doubles]     ~ 1,328 cycles/int
       70,494 randoms generated [System.Random CDoubles]    ~ 47,325 cycles/int
      858,012 randoms generated [System.Random Integers]    ~ 3,888 cycles/int
    4,756,213 randoms generated [System.Random Bools]       ~ 701 cycles/int

As you can see, all the types that use the generic randomIvalFrac / randomFrac definitions perform badly. What's more, the above results INCLUDE an attempt to inline:

{-# INLINE randomIvalFrac #-}
{-# INLINE randomFrac #-}
{-# INLINE randomIvalDouble #-}

After reimplementing random/Float these are the new results:

Cost of rdtsc (ffi call):    100
Approx getCPUTime calls per second: 200,582
Approx clock frequency:  3,334,891,942
First, timing with System.Random interface:
  105,266,949 randoms generated [constant zero gen]         ~ 31.68 cycles/int
   13,593,392 randoms generated [System.Random stdGen]      ~ 245 cycles/int
   10,962,597 randoms generated [System.Random Floats]      ~ 304 cycles/int
   11,926,573 randoms generated [System.Random CFloats]     ~ 280 cycles/int
    2,421,520 randoms generated [System.Random Doubles]     ~ 1,377 cycles/int
    2,535,087 randoms generated [System.Random CDoubles]    ~ 1,315 cycles/int
      856,276 randoms generated [System.Random Integers]    ~ 3,895 cycles/int
    4,976,373 randoms generated [System.Random Bools]       ~ 670 cycles/int

(But I still need to propagate these changes throughout all types / API calls.)
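
This note does not record what the reimplementation looks like. One common way to avoid a generic fractional path is to take 24 random bits (the width of a Float significand) and scale them into [0,1). The sketch below illustrates that idea only; it is an assumption, not necessarily the change made here, and floatFrom32 is a hypothetical name:

```haskell
import Data.Bits ((.&.))
import Data.Int (Int32)

-- Hypothetical helper: map 32 random bits to a Float in [0,1).
-- Only the low 24 bits are kept, matching the 24-bit Float significand,
-- so every result is exactly representable.
floatFrom32 :: Int32 -> Float
floatFrom32 x = fromIntegral (x .&. mask24) / fromIntegral twoTo24
  where
    twoTo24 = 2 ^ (24 :: Int) :: Int32
    mask24  = twoTo24 - 1

main :: IO ()
main = mapM_ (print . floatFrom32) [0, 1, 0x00FFFFFF, -1]
```

The input would come from whatever 32-bit source the generator provides; the conversion itself is a mask, one integer-to-float conversion, and one division.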

[2011.06.28] Integer Generation via random and randomR

Back on the master branch I notice that while randomIvalInteger does well for small ranges, its advantage doesn't scale to larger ranges:

range (-100,100): 5,105,290 randoms generated [System.Random Integers] ~ 653 cycles/int

range (0,2^5000): 8,969 randoms generated [System.Random BIG Integers] ~ 371,848 cycles/int
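
Part of this gap is unavoidable: a source that yields one of `base` outcomes per draw needs at least log_base(k) draws to cover k distinct values. The sketch below counts that lower bound. The helper name drawsNeeded and the base of 2^31 are assumptions for illustration, not values taken from randomIvalInteger:

```haskell
-- Hypothetical helper: the smallest n such that base^n >= k, i.e. the
-- minimum number of draws from a source with `base` outcomes per draw
-- needed to cover k distinct values.
drawsNeeded :: Integer -> Integer -> Int
drawsNeeded base k = length (takeWhile (< k) (iterate (* base) 1))

main :: IO ()
main = do
  let base = 2 ^ (31 :: Int)               -- assumed outcomes per draw
  print (drawsNeeded base 201)             -- range (-100,100): 201 values
  print (drawsNeeded base (2 ^ (5000 :: Int) + 1))  -- range (0,2^5000)
```

Under these assumptions the small range needs a single draw while the large range needs well over a hundred, before counting the cost of the bignum arithmetic used to combine them.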

[2011.08.25] Validating release version rev 40bbfd2867

This is a bugfix release without SplittableGen. It passed (cd tests; make test) on my Mac OS X 10.6 machine.

I ran GHC validate using the following fingerprint


First validating in the context of a slightly stale GHC head (7.3.20110727) on a mac.

[2011.09.30] Redoing timings after bugfix in version

It looks like there has been a serious performance regression (still on the 3.33 GHz Nehalem).

How many random numbers can we generate in a second on one thread?
  Cost of rdtsc (ffi call):    38
  Approx getCPUTime calls per second: 7,121
  Approx clock frequency:  96,610,524
  First, timing
148,133,038 randoms generated [constant zero gen]         ~ 0.65 cycles/int
 12,656,455 randoms generated [System.Random stdGen/next] ~ 7.63 cycles/int

  Second, timing System.Random.random at different types:
    676,066 randoms generated [System.Random Ints]        ~ 143 cycles/int
  3,917,247 randoms generated [System.Random Word16]      ~ 24.66 cycles/int
  2,231,460 randoms generated [System.Random Floats]      ~ 43.29 cycles/int
  2,269,993 randoms generated [System.Random CFloats]     ~ 42.56 cycles/int
    686,363 randoms generated [System.Random Doubles]     ~ 141 cycles/int
  2,165,679 randoms generated [System.Random CDoubles]    ~ 44.61 cycles/int
    713,702 randoms generated [System.Random Integers]    ~ 135 cycles/int
  3,647,551 randoms generated [System.Random Bools]       ~ 26.49 cycles/int
  4,296,919 randoms generated [System.Random Chars]       ~ 22.48 cycles/int

  Next timing range-restricted System.Random.randomR:
  4,307,214 randoms generated [System.Random Ints]        ~ 22.43 cycles/int
  4,068,982 randoms generated [System.Random Word16s]     ~ 23.74 cycles/int
  2,059,264 randoms generated [System.Random Floats]      ~ 46.92 cycles/int
  1,960,359 randoms generated [System.Random CFloats]     ~ 49.28 cycles/int
    678,978 randoms generated [System.Random Doubles]     ~ 142 cycles/int
  2,009,665 randoms generated [System.Random CDoubles]    ~ 48.07 cycles/int
  4,296,452 randoms generated [System.Random Integers]    ~ 22.49 cycles/int
  3,689,999 randoms generated [System.Random Bools]       ~ 26.18 cycles/int
  4,367,577 randoms generated [System.Random Chars]       ~ 22.12 cycles/int
      6,650 randoms generated [System.Random BIG Integers] ~ 14,528 cycles/int