Switch Random to use the PCG RNG #643

Closed
wants to merge 1 commit into
base: master
from

mgold commented Jun 10, 2016

As discussed in person, this commit changes the Random library to use the PCG generator. If you have 2 minutes to spare, you can read more about PCG here.

Changes:

  • Seed now contains only an Int64, a new private type.
  • Added 64-bit arithmetic helpers (end of file), and two private functions, peel and next.
  • The guts of int, float, and initialSeed have all been pretty drastically changed. I also implemented bool directly with peel and next.
  • Commented the confusing things, and left a very large comment going over the algorithm generally.
  • Removed Haskell-esque magic factors and one-character variables.
  • Changed the "this is an implementation of" documentation.
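For readers unfamiliar with PCG, the roles of `next` and `peel` can be sketched roughly as below. This is not the code in this commit: it uses JavaScript `BigInt` for the 64-bit state instead of the private Int64 type, it assumes the PCG-XSH-RR 64/32 variant, and `INCREMENT` is only a stand-in for `magicIncrement` (the value shown is the default from the PCG reference code, which may differ from the one used here).

```javascript
// Sketch of one PCG-XSH-RR 64/32 step. Assumption: BigInt stands in for the
// PR's private Int64 type, and INCREMENT stands in for magicIncrement.
const MASK64 = (1n << 64n) - 1n;
const MULTIPLIER = 6364136223846793005n; // LCG multiplier from the PCG reference code
const INCREMENT = 1442695040888963407n;  // assumed value; must be odd

// next: advance the 64-bit linear congruential state by one step.
function next(state) {
  return (state * MULTIPLIER + INCREMENT) & MASK64;
}

// peel: turn a 64-bit state into a 32-bit output using an xorshift
// followed by a rotation whose amount comes from the top 5 bits.
function peel(state) {
  const xorshifted = Number((((state >> 18n) ^ state) >> 27n) & 0xFFFFFFFFn);
  const rot = Number(state >> 59n); // 0..31
  return ((xorshifted >>> rot) | (xorshifted << ((32 - rot) & 31))) >>> 0;
}

// Usage: thread the state through explicitly, as a pure generator would.
let state = next(42n);
const first = peel(state);
state = next(state);
const second = peel(state);
```

The split matters for the design: `next` is the only place the state changes, and `peel` is a pure function of the state, so every generator can be built from those two pieces.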

Not changed:

  • The API. This is a patch change.
  • Other documentation. In particular,
    • The docs for Seed say, "Whenever you want to use a generator, you need to pair it with a seed." This is no longer true (at least for user-managed seeds).
    • The docs for initialSeed say, "A good way to get an unexpected seed is to use the current time." From a randomness perspective, the current time is not as random as it could be; it can be predicted approximately. But more importantly, there's no way to get the current time without using effects of some kind, at which point you might as well use the effects manager.
  • How the initial seed managed by the effects manager is produced. Currently, it's using the current time. That works well enough in practice, but I'll explain in a moment how to do even better.

Opportunities for expansion:

  • Expose constant, like we discussed.
  • Expose initialSeed2 : Int -> Int -> Seed. Currently, the least significant 32 bits are initialized to zero. Like using the current time, this works well enough for most people, but it's not as random as it could be. To implement, pass the new parameter (with >>> 0 to be safe) as the second argument to Int64, which is 0 in initialSeed.
  • Expose expert-only to- and from- seed functions, which third-party libraries can use.
  • Allow splittable random number generators, by removing magicIncrement and keeping the increment in the seed itself. More info is in the comments, and there is a working implementation in mgold/elm-random-pcg. I've found that the best interface is not Seed -> (Seed, Seed) but Generator Seed (so if you want n independent seeds, use Random.list n).
  • Initialize the seed managed by the runtime better. The gold standard is Math.floor(Math.random()*0xFFFFFFFF), which you can wrap in a Task that you don't even need to expose publicly (though that might be nice). Run that task twice, combine the two results with Task.map2, and pass them to initialSeed2. This makes the effect manager's seed really good without changing the API at all.
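The seeding ideas in the list above could look roughly like this on the JavaScript side. It is a sketch under assumptions, not code from this commit: the seed is modeled as a plain `{ hi, lo }` object rather than the private Int64 type, `random32` and `runtimeSeed` are hypothetical names, and any extra mixing the real `initialSeed` performs is left out. One deliberate difference from the expression above: the sketch multiplies by 2^32 rather than 0xFFFFFFFF so that the value 0xFFFFFFFF itself can be produced.

```javascript
// Draw one unsigned 32-bit word from Math.random (hypothetical helper).
function random32() {
  return Math.floor(Math.random() * 0x100000000) >>> 0;
}

// Hypothetical JavaScript counterpart of initialSeed2 : Int -> Int -> Seed.
// `>>> 0` coerces each argument to an unsigned 32-bit integer "to be safe".
function initialSeed2(hi, lo) {
  return { hi: hi >>> 0, lo: lo >>> 0 };
}

// initialSeed as described above: the least significant 32 bits are zero.
function initialSeed(n) {
  return initialSeed2(n, 0);
}

// Seed for the runtime-managed generator: two independent 32-bit draws,
// so all 64 bits of state start out unpredictable (hypothetical helper).
function runtimeSeed() {
  return initialSeed2(random32(), random32());
}
```

In Elm the two draws would be tasks combined with Task.map2; here they are plain function calls to keep the example short.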

mgold commented Jul 7, 2016

Closing due to performance concerns surfaced by project-fuzzball. If a 32-bit version turns out to be acceptably fast and statistically robust, I'll submit another PR.

mgold closed this Jul 7, 2016


evancz commented Jul 7, 2016

Also, it would be good to know if #636 has changed the randomness for the better in the bad cases you found earlier. It will be out in whatever is the next release, but I think that's a good reference point for any changes we want to make.


mgold commented Jul 8, 2016

I will rerun the statistical tests with that change in place and let you know.


mgold commented Jul 8, 2016

The test suite hasn't finished yet but the results are basically unchanged. It still fails the first test after two seconds of scrutiny, so it's not magically better. That said, I have yet to see how PCG in 32 bits will perform.

shmookey commented Jul 8, 2016

@mgold any chance you could elaborate (not necessarily here) on the performance figures you're seeing out of this? It's interesting and a little disappointing that an algorithm selling itself as among the very fastest would turn out to be unacceptably slow.


mgold commented Jul 9, 2016

PCG is (apparently) fast on 64-bit hardware. When you have to simulate that using immutable 32-bit integers, and where bitwise ops become several layers of lookups rather than opcodes, the implementation slows down considerably. See this code.

As for figures, in real-world use, the current elm-test and core Random run in about 30ms. With the next version of elm-test, which uses mgold/elm-random-pcg, it runs in about 3s. With a stubbed version of PCG (same API, but with the logic gutted), it runs in about 250ms. We are still investigating these performance regressions. When I mentioned "how PCG in 32 bits will perform", I meant both speed and statistical quality. We're looking for a compromise point with acceptable levels of each.
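To give a feel for the emulation cost described above, here is a rough sketch of 64-bit addition and multiplication built from 32-bit halves. It is an illustration under assumptions, not the code linked above: the `{ hi, lo }` representation and the names `int64`, `add64`, `mul32Wide`, and `mul64` are hypothetical. A single hardware multiply becomes four partial products, several shifts and masks, and a fresh object per result.

```javascript
// A 64-bit unsigned value as two unsigned 32-bit halves (hypothetical layout).
function int64(hi, lo) {
  return { hi: hi >>> 0, lo: lo >>> 0 };
}

// 64-bit addition modulo 2^64, carrying from the low word into the high word.
function add64(a, b) {
  const lo = (a.lo + b.lo) >>> 0;
  const carry = lo < a.lo ? 1 : 0; // the low word wrapped around
  const hi = (a.hi + b.hi + carry) >>> 0;
  return { hi, lo };
}

// Full 64-bit product of two unsigned 32-bit values, using 16-bit limbs
// so that no intermediate value exceeds what a double represents exactly.
function mul32Wide(a, b) {
  const a1 = a >>> 16, a0 = a & 0xFFFF;
  const b1 = b >>> 16, b0 = b & 0xFFFF;
  const p00 = a0 * b0, p01 = a0 * b1, p10 = a1 * b0, p11 = a1 * b1;
  const mid = (p00 >>> 16) + (p01 & 0xFFFF) + (p10 & 0xFFFF);
  const lo = (((mid & 0xFFFF) << 16) | (p00 & 0xFFFF)) >>> 0;
  const hi = (p11 + (p01 >>> 16) + (p10 >>> 16) + (mid >>> 16)) >>> 0;
  return { hi, lo };
}

// 64-bit multiplication modulo 2^64.
function mul64(a, b) {
  const low = mul32Wide(a.lo, b.lo);
  // Cross terms only affect the high word, so their low 32 bits suffice.
  const cross = Math.imul(a.hi, b.lo) + Math.imul(a.lo, b.hi);
  return { hi: (low.hi + cross) >>> 0, lo: low.lo };
}
```

Each generated number needs at least one such multiply and add, which is consistent with the slowdown relative to a generator that stays within native 32-bit arithmetic.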


mgold commented Jul 10, 2016

For more performance notes, see here.
