Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dSFMT: use fill_array_* API instead of genrand_* API #8808

Closed
wants to merge 1 commit into from

Conversation

rfourquet
Copy link
Member

Mixing both APIs led to bug #6573. As a result fill_array_* functions were no more used. This PR proposes instead to use them exclusively with MersenneTwister objects (I could do the same for the global RNG if there is interest), for faster random generation (up to 2x or 3x on my machine).

Looking at the dSFMT code, my understanding is that both take advantage of simd instructions, but use different algorithms:

  • fill_array_* generates random values (considering only Float64 values for now) into an array of size (at least) dsfmt_min_array_size, using a state array of size dsfmt_min_array_size,
  • whereas genrand_* cyclically overwrites a state array of size dsfmt_min_array_size filled with random values with new random values.

All it takes to use fill_array_* functions is to add a Float64 array of size dsfmt_min_array_size to MersenneTwister instances (which then allocate roughly twice as much), to be cyclically filled with random values (via fill_array_*), which allows rand(r::MersenneTwister) to pick those values one by one.

I guess the same speed differences could be observed for integer generation, but currently MersenneTwister RNGs generate only floats (I would be happy to enable this if #8380 or equivalent got merged, which the use of dsfmt_fill_array_close1_open2 anticipates here).

Both APIs are not very compatible, so let's use fill_array_* functions,
which are often faster but require MersenneTwister to have an additional
array field.
@rfourquet rfourquet mentioned this pull request Oct 24, 2014
@ViralBShah
Copy link
Member

It looks like you are still mixing the two calls in order to get a general purpose implementation, on the same object.

@ViralBShah
Copy link
Member

I hadn't understood what you were saying correctly. Assuming this works, this could get us back to the earlier performance level without the bug.

@ViralBShah
Copy link
Member

Could you do the same for the global RNG too? And also add a test that ensures that #6573 doesn't happen again in both cases?

@ViralBShah
Copy link
Member

We should make sure that Rmath-julia is also consistent with this.

@rfourquet
Copy link
Member Author

Yes I think it should give back exactly the same performance as before. The difference is that for small arrays, instead of falling back to genrand*functions (leading to the bug), we write to an internal array (member variable) big enough to use fill_array whose values are then consumed one by one.

I will try to add relevant test to prevent #6573, but if genrand functions were no used at all anymore, the bug should not happen again.

RE global RNG: in #8380 I tried to minimize the portion of the code which discriminates between global RNG or local MersenneTwister instance, so it would be easier to extend the current PR (to the global RNG) on top of that one, but if you prefer I can keep both PR independant.

@ViralBShah
Copy link
Member

Please extend the current PR.

@ViralBShah
Copy link
Member

The part of the #8399 PR where the global state is kept in Julia could be done as part of this PR too, if it makes it easier.

In any case, let's get the various dSFMT related PRs merged.

@rfourquet
Copy link
Member Author

Yes it makes it easier, as using the global state array from libdSFMT together with a global array in Julia needed to extend this PR to work with the global RNG would seem artificial and be more complicated than just having the global RNG be an instance of MersenneTwister.

@StefanKarpinski
Copy link
Sponsor Member

+100000 to making the global RNG be an instance of MersenneTwister. That makes designing the whole RNG API much cleaner and simpler. Not having access to the global RNG as a first class value is a bit weird too. It's just "out there" somewhere.

@rfourquet
Copy link
Member Author

Superseded by #8832.

@rfourquet rfourquet closed this Oct 28, 2014
@ViralBShah ViralBShah added the domain:randomness Random number generation and the Random stdlib label Nov 22, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain:randomness Random number generation and the Random stdlib
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants