Switch Random to use the PCG RNG #778

Closed
wants to merge 1 commit into
base: master
from

Contributor

mgold commented Dec 9, 2016

As we've discussed, this PR switches out the implementation of the random number generator while not altering the public API at all. It uses the PCG generator.
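For readers unfamiliar with PCG, here is a minimal sketch of one 32-bit member of the family (the RXS-M-XS output permutation on top of a linear congruential state update), written in TypeScript because Elm compiles to JavaScript and the same 32-bit tricks apply. The constants are taken from the PCG reference C library; the function names (`mul32`, `nextState`, `output`, `step`) are made up for illustration, and the Elm code in this PR may well use a different variant or different constants.

```typescript
// Multiply two 32-bit values and keep only the low 32 bits, as an unsigned
// number. JavaScript numbers are doubles, so a plain `a * b` would lose bits.
function mul32(a: number, b: number): number {
  const aHi: number = (a >>> 16) & 0xffff;
  const aLo: number = a & 0xffff;
  const bHi: number = (b >>> 16) & 0xffff;
  const bLo: number = b & 0xffff;
  // The aHi * bHi term only affects bits above 32, so it is dropped.
  return (aLo * bLo + (((aHi * bLo + aLo * bHi) << 16) >>> 0)) >>> 0;
}

// Constants from the PCG reference C library (32-bit state). Assumption: the
// PR's Elm implementation is not guaranteed to use these exact values.
const LCG_MULTIPLIER: number = 747796405;
const LCG_INCREMENT: number = 2891336453;
const OUTPUT_MULTIPLIER: number = 277803737;

// Advance the internal state: a linear congruential step modulo 2^32.
function nextState(state: number): number {
  return (mul32(state, LCG_MULTIPLIER) + LCG_INCREMENT) >>> 0;
}

// RXS-M-XS output permutation: a state-dependent xorshift, a multiply,
// then a fixed xorshift. This scrambles the weak low bits of the LCG.
function output(state: number): number {
  const shift: number = (state >>> 28) + 4;
  const word: number = mul32((state >>> shift) ^ state, OUTPUT_MULTIPLIER);
  return ((word >>> 22) ^ word) >>> 0;
}

// One generator step: returns [random 32-bit value, new state].
function step(state: number): [number, number] {
  return [output(state), nextState(state)];
}
```

The whole generator state here is a single 32-bit integer, and every operation is a shift, xor, or multiply, which is the kind of code JavaScript engines handle well.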

You can see my benchmarking code here, but you probably have two questions:

Is it fast?

Yes.

| task | old ops/sec | new ops/sec | speedup factor |
| --- | ---: | ---: | ---: |
| flip a coin | 959,511 | 6,709,634 | 7.0 |
| flip 1000 coins | 878 | 6,308 | 7.2 |
| generate an integer 100-200 | 894,683 | 1,092,072 | 1.2 |
| generate an integer 0-4094 | 895,621 | 1,075,137 | 1.2 |
| generate an integer 0-4095 | 918,955 | 2,044,765 | 2.2 |
| generate an integer 0-4096 | 871,978 | 1,125,907 | 1.3 |
| generate a massive integer | 452,718 | 1,412,437 | 3.1 |
| generate a percentage | 335,062 | 2,089,161 | 6.2 |
| generate 1000 percentages | 443 | 1,980 | 4.5 |
| generate a float 100-200 | 334,993 | 2,092,694 | 6.2 |
| generate a float 0-4094 | 336,427 | 2,072,984 | 6.2 |
| generate a float 0-4095 | 333,357 | 2,061,712 | 6.2 |
| generate a float 0-4096 | 325,416 | 2,022,222 | 6.2 |
| generate a massive float | 332,432 | 1,817,853 | 5.5 |

This is Chrome on Mac, although I ran it on Safari and Firefox and saw similar results. They can vary a bit between runs; I've seen speedup factors as low as 1.2x and higher than 10x, but 6-7x is common. Although the "how much" part is somewhat finicky, it definitely is faster. (Looks like bitwise op inlining paid off!)

Also note that integer generation is faster for ranges that are a power of two (e.g. 0-4095) because a faster algorithm can be used. There is no equivalent for floats, and the tests bear that out.
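To illustrate why a power-of-two range can be cheaper, here is a hedged TypeScript sketch of a common approach: when the range size is a power of two, a single bitmask gives an unbiased result; otherwise the code falls back to rejection sampling to avoid modulo bias. The names `makeSource` and `intInRange` are hypothetical, the source is a plain LCG used only as a stand-in, and this is not claimed to be the exact algorithm in the PR.

```typescript
// Stand-in 32-bit source for the demo only (a plain LCG, not PCG).
function makeSource(seed: number): () => number {
  let state: number = seed >>> 0;
  return function (): number {
    // state * 1664525 stays below 2^53, so the product is exact
    // before `>>> 0` truncates it to 32 bits.
    state = (state * 1664525 + 1013904223) >>> 0;
    return state;
  };
}

// Uniform integer in [lo, hi], assuming 1 <= hi - lo + 1 <= 2^31.
function intInRange(lo: number, hi: number, next: () => number): number {
  const range: number = hi - lo + 1;

  // Fast path: a power of two has exactly one bit set, so
  // range & (range - 1) is zero and masking the low bits is unbiased.
  if ((range & (range - 1)) === 0) {
    return lo + ((next() & (range - 1)) >>> 0);
  }

  // Slow path: reject the first (2^32 mod range) values so that the
  // remaining outputs split evenly into `range` buckets.
  const threshold: number = (0x100000000 - range) % range;
  let x: number = next();
  while (x < threshold) {
    x = next();
  }
  return lo + (x % range);
}

// Usage: 0-4095 has 4096 values (a power of two) and takes the fast path;
// 0-4094 and 0-4096 do not.
const source: () => number = makeSource(2016);
const fastPathSample: number = intInRange(0, 4095, source);
const slowPathSample: number = intInRange(0, 4096, source);
```

This matches the shape of the benchmark results above, where 0-4095 is noticeably faster than 0-4094 or 0-4096.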

Is it random?

It is demonstrably more random than what we have now.

I've tested the statistical properties of PCG in the past, and it has always outperformed the current implementation. I had to rewrite the code that produces the random numbers for 0.18, though, and I think I wound up with fewer random bits before node ran out of memory. Even so, I can definitely state that this patch passes the first test of the dieharder battery, while core fails it. The author describes the test as follows:

Each test determines the number of matching intervals from 512 "birthdays" (by default) drawn on a 24-bit "year" (by default). This is repeated 100 times (by default) and the results cumulated in a histogram. Repeated intervals should be distributed in a Poisson distribution if the underlying generator is random enough, and a chisq and p-value for the test are evaluated relative to this null hypothesis.
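As a rough illustration of the statistic described in that quote, the TypeScript sketch below computes the number of repeated spacings for one trial. The names `drawBirthdays`, `repeatedSpacings`, and `expectedMean` are hypothetical, `Math.random` is only a stand-in source, and dieharder's own bookkeeping may differ in details (for example in how the first interval is treated). The chi-square and p-value steps are not shown.

```typescript
// Draw `count` birthdays on a year of 2^yearBits days by keeping the top
// bits of each 32-bit output from `next`.
function drawBirthdays(next: () => number, count: number, yearBits: number): number[] {
  const birthdays: number[] = [];
  for (let i = 0; i < count; i++) {
    birthdays.push((next() >>> 0) >>> (32 - yearBits));
  }
  return birthdays;
}

// Count how many distinct spacing values occur more than once among the
// gaps between consecutive sorted birthdays.
function repeatedSpacings(birthdays: number[]): number {
  const byValue = function (a: number, b: number): number { return a - b; };
  const sorted: number[] = birthdays.slice().sort(byValue);
  const spacings: number[] = [];
  for (let i = 1; i < sorted.length; i++) {
    spacings.push(sorted[i] - sorted[i - 1]);
  }
  spacings.sort(byValue);

  let repeated: number = 0;
  for (let i = 1; i < spacings.length; i++) {
    const sameAsPrevious: boolean = spacings[i] === spacings[i - 1];
    const startsNewRun: boolean = i === 1 || spacings[i - 1] !== spacings[i - 2];
    if (sameAsPrevious && startsNewRun) {
      repeated++; // count each repeated value once, however long its run
    }
  }
  return repeated;
}

// Poisson mean under the null hypothesis: m^3 / (4n) for m birthdays on an
// n-day year. With the defaults (m = 512, n = 2^24) this is 2.
function expectedMean(m: number, n: number): number {
  return (m * m * m) / (4 * n);
}

// One trial with a stand-in source; a real run would repeat this 100 times
// and compare the histogram of counts against Poisson(expectedMean).
const standInSource = function (): number {
  return Math.floor(Math.random() * 0x100000000);
};
const trialCount: number = repeatedSpacings(drawBirthdays(standInSource, 512, 24));
```

A generator whose outputs are too regular tends to produce far too many or too few repeated spacings, which pushes the histogram away from the Poisson shape.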

This implementation is much less sensitive to the choice of random seed for the first value. Contrast the current implementation:

> import Random
> List.range -53667 53667 |> List.map (Random.initialSeed >> Random.step Random.bool >> Tuple.first) |> List.all identity
True : Bool

This shows that every initial seed between roughly -53 thousand and +53 thousand generates True for its first boolean. True, you shouldn't use the random library like this, but timestamps are similarly distributed in a continuous range that is a small subset of the 32-bit integers. So... we kind of are using the random library like this.

So: it's faster, it's more random, a seed is only one integer, and it uses fewer magic numbers. Sound good?

process-bot commented Dec 9, 2016

Thanks for the pull request! Make sure it satisfies this checklist. My human colleagues will appreciate it!

Here is what to expect next, and if anyone wants to comment, keep these things in mind.

Member

rtfeldman commented Dec 9, 2016

🚀 🚀 🚀 🚀 🚀

Member

rtfeldman commented Dec 9, 2016

THIS IS SO RAD

Member

evancz commented Dec 12, 2016

This PR is really great, thank you for doing all this! I need to do some planning about when to release various things, but I just wanted to comment to say that things are looking good!

Member

rtfeldman commented Apr 21, 2017

@evancz I'd love to see this get the same treatment as https://github.com/elm-lang/core/pull/850 😄

Member

evancz commented Jul 8, 2017

I read through this today and merged it into dev here elm-lang@48e640e

@mgold, I notice you use (|>) in a bunch of bitwise operations. Do you mind switching this code to not use that operator, instead just saying Bitwise.and x y like normal? It would be annoying to have a performance regression because we were depending on a couple optimizations happening in the right order, so I'd just feel more comfortable if we did not use anything extra.

That said, thanks again for your excellent work on this! Excited to see how it goes when we are testing out 0.19 :)

@evancz evancz closed this Jul 8, 2017

Contributor

mgold commented Jul 8, 2017

Pipe removal in #883.

Thanks so much for merging! You are quite welcome.
