Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid weak gamma values in Hedgehog.Internal.Seed #198

Merged
merged 1 commit into from May 25, 2018

Conversation

Projects
None yet
3 participants
@moodmosaic
Copy link
Member

moodmosaic commented May 23, 2018

Resolves #191 ― See #191 (comment) ― ✔ Passes all dieharder tests. :neckbeard:


$ dieharder-input | dieharder -a -g 200
#=============================================================================#
#            dieharder version 3.31.1 Copyright 2003 Robert G. Brown          #
#=============================================================================#
   rng_name    |rands/second|   Seed   |
stdin_input_raw|  3.00e+07  |2424813648|
#=============================================================================#
        test_name   |ntup| tsamples |psamples|  p-value |Assessment
#=============================================================================#
   diehard_birthdays|   0|       100|     100|0.88800885|  PASSED  
      diehard_operm5|   0|   1000000|     100|0.19714083|  PASSED  
  diehard_rank_32x32|   0|     40000|     100|0.38815709|  PASSED  
    diehard_rank_6x8|   0|    100000|     100|0.94344462|  PASSED  
   diehard_bitstream|   0|   2097152|     100|0.96453841|  PASSED  
        diehard_opso|   0|   2097152|     100|0.06116614|  PASSED  
        diehard_oqso|   0|   2097152|     100|0.23506030|  PASSED  
         diehard_dna|   0|   2097152|     100|0.40548706|  PASSED  
diehard_count_1s_str|   0|    256000|     100|0.93024945|  PASSED  
diehard_count_1s_byt|   0|    256000|     100|0.37068664|  PASSED  
 diehard_parking_lot|   0|     12000|     100|0.35955403|  PASSED  
    diehard_2dsphere|   2|      8000|     100|0.99480570|  PASSED  
    diehard_3dsphere|   3|      4000|     100|0.90990947|  PASSED  
     diehard_squeeze|   0|    100000|     100|0.07403175|  PASSED  
        diehard_sums|   0|       100|     100|0.00188117|   WEAK   
        diehard_runs|   0|    100000|     100|0.86278014|  PASSED  
        diehard_runs|   0|    100000|     100|0.55399860|  PASSED  
       diehard_craps|   0|    200000|     100|0.72844871|  PASSED  
       diehard_craps|   0|    200000|     100|0.66463648|  PASSED  
 marsaglia_tsang_gcd|   0|  10000000|     100|0.21806049|  PASSED  
 marsaglia_tsang_gcd|   0|  10000000|     100|0.39660207|  PASSED  
         sts_monobit|   1|    100000|     100|0.40274382|  PASSED  
            sts_runs|   2|    100000|     100|0.65543595|  PASSED  
          sts_serial|   1|    100000|     100|0.39984641|  PASSED  
          sts_serial|   2|    100000|     100|0.12878157|  PASSED  
          sts_serial|   3|    100000|     100|0.32054279|  PASSED  
          sts_serial|   3|    100000|     100|0.92776893|  PASSED  
          sts_serial|   4|    100000|     100|0.88823427|  PASSED  
          sts_serial|   4|    100000|     100|0.95001042|  PASSED  
          sts_serial|   5|    100000|     100|0.15192563|  PASSED  
          sts_serial|   5|    100000|     100|0.30862638|  PASSED  
          sts_serial|   6|    100000|     100|0.82350748|  PASSED  
          sts_serial|   6|    100000|     100|0.97031733|  PASSED  
          sts_serial|   7|    100000|     100|0.47614863|  PASSED  
          sts_serial|   7|    100000|     100|0.30898843|  PASSED  
          sts_serial|   8|    100000|     100|0.47266871|  PASSED  
          sts_serial|   8|    100000|     100|0.07446469|  PASSED  
          sts_serial|   9|    100000|     100|0.05486187|  PASSED  
          sts_serial|   9|    100000|     100|0.34267304|  PASSED  
          sts_serial|  10|    100000|     100|0.73402378|  PASSED  
          sts_serial|  10|    100000|     100|0.89991549|  PASSED  
          sts_serial|  11|    100000|     100|0.91679991|  PASSED  
          sts_serial|  11|    100000|     100|0.12537296|  PASSED  
          sts_serial|  12|    100000|     100|0.99502899|   WEAK   
          sts_serial|  12|    100000|     100|0.56007136|  PASSED  
          sts_serial|  13|    100000|     100|0.24452691|  PASSED  
          sts_serial|  13|    100000|     100|0.11884602|  PASSED  
          sts_serial|  14|    100000|     100|0.54430001|  PASSED  
          sts_serial|  14|    100000|     100|0.75405497|  PASSED  
          sts_serial|  15|    100000|     100|0.32598548|  PASSED  
          sts_serial|  15|    100000|     100|0.37035555|  PASSED  
          sts_serial|  16|    100000|     100|0.99882550|   WEAK   
          sts_serial|  16|    100000|     100|0.51607634|  PASSED  
         rgb_bitdist|   1|    100000|     100|0.41313031|  PASSED  
         rgb_bitdist|   2|    100000|     100|0.02449850|  PASSED  
         rgb_bitdist|   3|    100000|     100|0.72811423|  PASSED  
         rgb_bitdist|   4|    100000|     100|0.99164240|  PASSED  
         rgb_bitdist|   5|    100000|     100|0.66028955|  PASSED  
         rgb_bitdist|   6|    100000|     100|0.32112062|  PASSED  
         rgb_bitdist|   7|    100000|     100|0.69304928|  PASSED  
         rgb_bitdist|   8|    100000|     100|0.76816554|  PASSED  
         rgb_bitdist|   9|    100000|     100|0.38328541|  PASSED  
         rgb_bitdist|  10|    100000|     100|0.90157436|  PASSED  
         rgb_bitdist|  11|    100000|     100|0.25196359|  PASSED  
         rgb_bitdist|  12|    100000|     100|0.65831596|  PASSED  
rgb_minimum_distance|   2|     10000|    1000|0.84039186|  PASSED  
rgb_minimum_distance|   3|     10000|    1000|0.33202552|  PASSED  
rgb_minimum_distance|   4|     10000|    1000|0.92499783|  PASSED  
rgb_minimum_distance|   5|     10000|    1000|0.43088655|  PASSED  
    rgb_permutations|   2|    100000|     100|0.24249630|  PASSED  
    rgb_permutations|   3|    100000|     100|0.02193102|  PASSED  
    rgb_permutations|   4|    100000|     100|0.29246097|  PASSED  
    rgb_permutations|   5|    100000|     100|0.82793302|  PASSED  
      rgb_lagged_sum|   0|   1000000|     100|0.85672954|  PASSED  
      rgb_lagged_sum|   1|   1000000|     100|0.66018089|  PASSED  
      rgb_lagged_sum|   2|   1000000|     100|0.14464533|  PASSED  
      rgb_lagged_sum|   3|   1000000|     100|0.61958921|  PASSED  
      rgb_lagged_sum|   4|   1000000|     100|0.41404517|  PASSED  
      rgb_lagged_sum|   5|   1000000|     100|0.21085881|  PASSED  
      rgb_lagged_sum|   6|   1000000|     100|0.59236601|  PASSED  
      rgb_lagged_sum|   7|   1000000|     100|0.85844213|  PASSED  
      rgb_lagged_sum|   8|   1000000|     100|0.67532583|  PASSED  
      rgb_lagged_sum|   9|   1000000|     100|0.22960920|  PASSED  
      rgb_lagged_sum|  10|   1000000|     100|0.42416912|  PASSED  
      rgb_lagged_sum|  11|   1000000|     100|0.85142981|  PASSED  
      rgb_lagged_sum|  12|   1000000|     100|0.66956390|  PASSED  
      rgb_lagged_sum|  13|   1000000|     100|0.02113307|  PASSED  
      rgb_lagged_sum|  14|   1000000|     100|0.38234099|  PASSED  
      rgb_lagged_sum|  15|   1000000|     100|0.72755886|  PASSED  
      rgb_lagged_sum|  16|   1000000|     100|0.46936398|  PASSED  
      rgb_lagged_sum|  17|   1000000|     100|0.47843518|  PASSED  
      rgb_lagged_sum|  18|   1000000|     100|0.45445810|  PASSED  
      rgb_lagged_sum|  19|   1000000|     100|0.09850389|  PASSED  
      rgb_lagged_sum|  20|   1000000|     100|0.01429892|  PASSED  
      rgb_lagged_sum|  21|   1000000|     100|0.79504340|  PASSED  
      rgb_lagged_sum|  22|   1000000|     100|0.90117125|  PASSED  
      rgb_lagged_sum|  23|   1000000|     100|0.86218324|  PASSED  
      rgb_lagged_sum|  24|   1000000|     100|0.20973249|  PASSED  
      rgb_lagged_sum|  25|   1000000|     100|0.29309874|  PASSED  
      rgb_lagged_sum|  26|   1000000|     100|0.66429288|  PASSED  
      rgb_lagged_sum|  27|   1000000|     100|0.62583441|  PASSED  
      rgb_lagged_sum|  28|   1000000|     100|0.86432468|  PASSED  
      rgb_lagged_sum|  29|   1000000|     100|0.99449190|  PASSED  
      rgb_lagged_sum|  30|   1000000|     100|0.30593771|  PASSED  
      rgb_lagged_sum|  31|   1000000|     100|0.03909729|  PASSED  
      rgb_lagged_sum|  32|   1000000|     100|0.92391541|  PASSED  
     rgb_kstest_test|   0|     10000|    1000|0.27848827|  PASSED  
     dab_bytedistrib|   0|  51200000|       1|0.99849294|   WEAK   
             dab_dct| 256|     50000|       1|0.76828849|  PASSED  
Preparing to run test 207.  ntuple = 0
        dab_filltree|  32|  15000000|       1|0.92608225|  PASSED  
        dab_filltree|  32|  15000000|       1|0.29066666|  PASSED  
Preparing to run test 208.  ntuple = 0
       dab_filltree2|   0|   5000000|       1|0.26840219|  PASSED  
       dab_filltree2|   1|   5000000|       1|0.66679931|  PASSED  
Preparing to run test 209.  ntuple = 0
        dab_monobit2|  12|  65000000|       1|0.06439569|  PASSED 
Avoid weak gamma values in Hedgehog.Internal.Seed
See also:
- http://www.pcg-random.org/posts/bugs-in-splitmix.html
- #191
- http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/687fd7c7986d/src/share/classes/java/util/SplittableRandom.java
- http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/8u40-b25/java/util/SplittableRandom.java

In http://www.pcg-random.org/posts/bugs-in-splitmix.html, most of the output is horribly formatted.

So I went on and formatted the important part of the output from that article, where it highlights the bug:

>```
>Prelude System.Random.SplitMix> map mkSMGen
> [ 0x61c8864680b583eb
> , 0xf8364607e9c949bd
> , 0x88e48f4fcc823718
> , 0x7f83ab8da2e71dd1
> , 0x7957d809e827ff4c
> , 0xf8d059aee4c53639
> , 0x9cd9f015db4e58b7
> , 0xf4077b0dbebc73c0
> , 0x305cb877109d0686
> , 0x359e58eeafebd527
> , 0xbeb721c511b0da6d
> , 0x86466fd0fcc363a6
> , 0xefee3e7b93db3075
> , 0x79629ee76aa83059
> , 0x05d507d05e785673
> , 0x76442b62dddf926c ]
>
> [ SMGen {_seed = 16269636050940713694, _gamma =  1}
> , SMGen {_seed = 11331281847540010644, _gamma =  1}
> , SMGen {_seed = 5619286150931196027 , _gamma =  1}
> , SMGen {_seed = 10281499533683189089, _gamma =  1}
> , SMGen {_seed = 17485363145453140727, _gamma =  7}
> , SMGen {_seed = 17937441790592959953, _gamma =  7}
> , SMGen {_seed = 15529190628949695919, _gamma =  7}
> , SMGen {_seed = 6957216782316743206 , _gamma =  7}
> , SMGen {_seed = 13022414457527836317, _gamma = 15}
> , SMGen {_seed = 2127073997383595602 , _gamma = 15}
> , SMGen {_seed = 7685868475082380828 , _gamma = 15}
> , SMGen {_seed = 6396586367974718702 , _gamma = 15}
> , SMGen {_seed = 13838712901582905874, _gamma = 31}
> , SMGen {_seed = 15464227283579875280, _gamma = 31}
> , SMGen {_seed = 17142043965827948305, _gamma = 31}
> , SMGen {_seed = 10266498716653543189, _gamma = 31} ]
>Prelude System.Random.SplitMix>
>```

It compares just *one* implementation of SplitMix, [this](https://github.com/phadej/splitmix/blob/f6895c4137b14e4752b0575c178199201eb41810/src/System/Random/SplitMix.hs) one, at some particular point in time.

(I can see that the author went and fixed it recently as shown from the diff-output [here](phadej/splitmix@bd0692b) and [here](phadej/splitmix@c6ba2f3), so this means that now *The Bug* section of http://www.pcg-random.org/posts/bugs-in-splitmix.html is outdated.)

---

Here's the exact same test with *Hedgehog.Internal.Seed*:

```
λ :t from
from :: GHC.Word.Word64 -> Seed

λ Prelude.map from
  [ 0x61c8864680b583eb
  , 0xf8364607e9c949bd
  , 0x88e48f4fcc823718
  , 0x7f83ab8da2e71dd1
  , 0x7957d809e827ff4c
  , 0xf8d059aee4c53639
  , 0x9cd9f015db4e58b7
  , 0xf4077b0dbebc73c0
  , 0x305cb877109d0686
  , 0x359e58eeafebd527
  , 0xbeb721c511b0da6d
  , 0x86466fd0fcc363a6
  , 0xefee3e7b93db3075
  , 0x79629ee76aa83059
  , 0x05d507d05e785673
  , 0x76442b62dddf926c ]

[ Seed 7046029254386353131  11400714819323198485
, Seed 17885559969949501885 11400714819323198485
, Seed 9864166656744503064  11400714819323198485
, Seed 9188376289577737681  11400714819323198485
, Seed 8743694738624347980  11400714819323198485
, Seed 17928928724259255865 11400714819323198485
, Seed 11302328716527294647 11400714819323198485
, Seed 17584158569056203712 11400714819323198485
, Seed 3484863033197266566  11400714819323198485
, Seed 3863623312507393319  11400714819323198485
, Seed 13742489918233434733 11400714819323198485
, Seed 9675543792836633510  11400714819323198485
, Seed 17288824720004427893 11400714819323198485
, Seed 8746728143070965849  11400714819323198485
, Seed 420250731748546163   11400714819323198485
, Seed 8521984098521027180  11400714819323198485 ]
λ
```

This looks weird, as I'm getting the *exact same γ* through all the seeds. If I change Hedgehog.Internal.Seed's [`from`](https://github.com/hedgehogqa/haskell-hedgehog/blob/615880fcabcbd07f4c583fd9913c8a6985c16c44/hedgehog/src/Hedgehog/Internal/Seed.hs#L108-L110) according to [Oracle's JDK8 implementation](http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/687fd7c7986d/src/share/classes/java/util/SplittableRandom.java#l214):

```diff
λ :! git diff
diff --git a/hedgehog/src/Hedgehog/Internal/Seed.hs b/hedgehog/src/Hedgehog/Internal/Seed.hs
index 61cacb2..fe1a0be 100644
--- a/hedgehog/src/Hedgehog/Internal/Seed.hs
+++ b/hedgehog/src/Hedgehog/Internal/Seed.hs
@@ -98,11 +98,11 @@ random =
 -- | Create a 'Seed' using a 'Word64'.
 --
 from :: Word64 -> Seed
 from x =
-  Seed x goldenGamma
+  Seed (mix64 x) (mixGamma (x + goldenGamma))
λ
```

I can now reproduce the bug [*exactly* as shown in the article](http://www.pcg-random.org/posts/bugs-in-splitmix.html#the-bug):

```
λ Prelude.map from
  [ 0x61c8864680b583eb
  , 0xf8364607e9c949bd
  , 0x88e48f4fcc823718
  , 0x7f83ab8da2e71dd1
  , 0x7957d809e827ff4c
  , 0xf8d059aee4c53639
  , 0x9cd9f015db4e58b7
  , 0xf4077b0dbebc73c0
  , 0x305cb877109d0686
  , 0x359e58eeafebd527
  , 0xbeb721c511b0da6d
  , 0x86466fd0fcc363a6
  , 0xefee3e7b93db3075
  , 0x79629ee76aa83059
  , 0x05d507d05e785673
  , 0x76442b62dddf926c ]

[ Seed 15210016002011668638  1
, Seed 11409286845259996466  1
, Seed 1931727433621677744   1
, Seed 307741759840609752    1
, Seed 8606169619657412120   7
, Seed 13651108307767328632  7
, Seed 125750466559701114    7
, Seed 6781260234005250507   7
, Seed 15306535823716590088 15
, Seed 7344074043290227165  15
, Seed 9920554987610416076  15
, Seed 3341781972484278810  15
, Seed 12360157267739240775 31
, Seed 600595566262245170   31
, Seed 1471112649570176389  31
, Seed 8100917074368564322  31 ]
λ
```

This is actually great so far, as it points out that we got Hedgehog.Internal.Seed's [`from`](https://github.com/hedgehogqa/haskell-hedgehog/blob/615880fcabcbd07f4c583fd9913c8a6985c16c44/hedgehog/src/Hedgehog/Internal/Seed.hs#L108-L110) wrong and we can now 'fix' it according to the 'reference' source.

(At this point, I can tell that the only reliable source is the one found in Oracle's JDK8 implementation.)

---

And finally, If I go and change the glorious Hedgehog.Internal.Seed [`mixGamma`](https://github.com/hedgehogqa/haskell-hedgehog/blob/615880fcabcbd07f4c583fd9913c8a6985c16c44/hedgehog/src/Hedgehog/Internal/Seed.hs#L202) according to [Oracle's JDK8 implementation](http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/687fd7c7986d/src/share/classes/java/util/SplittableRandom.java#l214) (essentially, reverting #124):

```diff
λ :! git diff
diff --git a/hedgehog/src/Hedgehog/Internal/Seed.hs b/hedgehog/src/Hedgehog/Internal/Seed.hs
index 61cacb2..5428927 100644
--- a/hedgehog/src/Hedgehog/Internal/Seed.hs
+++ b/hedgehog/src/Hedgehog/Internal/Seed.hs
@@ -100,7 +100,7 @@ random =
 --
 from :: Word64 -> Seed
 from x =
-  Seed x goldenGamma
+  Seed (mix64 x) (mixGamma (x + goldenGamma))

 -- | A predefined gamma value's needed for initializing the "root" instances of
 --   'Seed'. That is, instances not produced by splitting an already existing
@@ -192,7 +192,7 @@ mixGamma x =
     y = mix64variant13 x .|. 1
     n = popCount $ y `xor` (y `shiftR` 1)
   in
-    if n >= 24 then
+    if n < 24 then
       y `xor` 0xaaaaaaaaaaaaaaaa
     else
       y
λ
```

It 'fixes' the gamma value:

```
λ Prelude.map from
  [ 0x61c8864680b583eb
  , 0xf8364607e9c949bd
  , 0x88e48f4fcc823718
  , 0x7f83ab8da2e71dd1
  , 0x7957d809e827ff4c
  , 0xf8d059aee4c53639
  , 0x9cd9f015db4e58b7
  , 0xf4077b0dbebc73c0
  , 0x305cb877109d0686
  , 0x359e58eeafebd527
  , 0xbeb721c511b0da6d
  , 0x86466fd0fcc363a6
  , 0xefee3e7b93db3075
  , 0x79629ee76aa83059
  , 0x05d507d05e785673
  , 0x76442b62dddf926c ]

[ Seed 15210016002011668638 12297829382473034411
, Seed 11409286845259996466 12297829382473034411
, Seed 1931727433621677744  12297829382473034411
, Seed 307741759840609752   12297829382473034411
, Seed 8606169619657412120  12297829382473034413
, Seed 13651108307767328632 12297829382473034413
, Seed 125750466559701114   12297829382473034413
, Seed 6781260234005250507  12297829382473034413
, Seed 15306535823716590088 12297829382473034405
, Seed 7344074043290227165  12297829382473034405
, Seed 9920554987610416076  12297829382473034405
, Seed 3341781972484278810  12297829382473034405
, Seed 12360157267739240775 12297829382473034421
, Seed 600595566262245170   12297829382473034421
, Seed 1471112649570176389  12297829382473034421
, Seed 8100917074368564322  12297829382473034421 ]
λ
```
@moodmosaic

This comment has been minimized.

Copy link
Member Author

moodmosaic commented May 23, 2018

In case you find #191 (comment) tl;dr, our from doesn't check the γ value at all―if γ is bad it leaves it as-is.

Now, once you fix from according to the paper

@@ -100,7 +100,7 @@ random =
 --
 from :: Word64 -> Seed
 from x =
-  Seed x goldenGamma
+  Seed (mix64 x) (mixGamma (x + goldenGamma))

image

you have to also 'fix' mixGamma otherwise you end up with weak γs:

@@ -192,7 +192,7 @@ mixGamma x =
     y = mix64variant13 x .|. 1
     n = popCount $ y `xor` (y `shiftR` 1)
   in
-    if n >= 24 then
+    if n < 24 then
       y `xor` 0xaaaaaaaaaaaaaaaa
     else
       y

The 'bug' in the paper is where it does n >= 24 where it should be the opposite.

@Lysxia

This comment has been minimized.

Copy link

Lysxia commented May 23, 2018

A "bad" mixGamma also affects split, I think that's the more important part of the issue.

@moodmosaic

This comment has been minimized.

Copy link
Member Author

moodmosaic commented May 23, 2018

Sure, these changes are what we need here on this PR.

@moodmosaic

This comment has been minimized.

Copy link
Member Author

moodmosaic commented May 25, 2018

/cc @jystic @thumphries

Once we merge this, I'll go and fix the one we have in the F# version. 🦄

@jystic

This comment has been minimized.

Copy link
Member

jystic commented May 25, 2018

💯 from me, feel free to merge

@moodmosaic moodmosaic merged commit 2bd6062 into hedgehogqa:master May 25, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

@moodmosaic moodmosaic deleted the moodmosaic:bugfix/seed-from-and-mixgamma branch May 25, 2018

moodmosaic added a commit to moodmosaic/fsharp-hedgehog that referenced this pull request May 28, 2018

@charleso charleso referenced this pull request Aug 8, 2018

Closed

Switch to SplitMix #27

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.
You signed in with another tab or window. Reload to refresh your session. You signed out in another tab or window. Reload to refresh your session.