faster rand!(::MersenneTwister, ::Array{T}) for IntTypes and Float16/32 #8958

Merged (5 commits) on Nov 12, 2014

rfourquet
Member

This PR tries to waste fewer random bits when filling arrays, and also uses the dsfmt_fill_array_close1_open2 function for non-Float64 arrays.

Edit: better timings using this function:

function arr(T, n)
    a = Array(T, n)
    m = MersenneTwister()
    tic()
    rand!(m, a)
    toc()
end
for T in [Int8, Int16, Int32, Int64, Int128, Float16, Float32]
    arr(T, 10^8÷sizeof(T))
end
| Type | PR (s) | master, 64aa84d (s) | master/PR |
|---|---|---|---|
| Int8 | 0.364 | 3.74 | 10.3 |
| Int16 | 0.367 | 1.94 | 5.3 |
| Int32/Int64/Int128 | 0.360 | 1.05 | 2.9 |
| Float16 | 2.71 | 5.37 | 2.0 |
| Float32 | 0.455 | 1.27 | 2.8 |

The results are similar for unsigned types (and also a 3x speedup for rand!(::BitArray)).
(I would be interested if someone finds very different speedups).
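
For readers on a current Julia release, where `Array(T, n)`, `tic()` and `toc()` no longer exist, a rough equivalent of the timing loop might look like the sketch below. The name `time_fill`, the fixed seed, the warm-up call and the smaller 10^6-byte buffers are assumptions for illustration, not part of the PR, and timings from it are not comparable with the numbers reported in this thread.

```julia
using Random

# Hypothetical helper: time one in-place fill of a vector of `n` elements of type `T`.
function time_fill(::Type{T}, n::Integer) where {T}
    a = Vector{T}(undef, n)
    rng = MersenneTwister(0)        # fixed seed, chosen arbitrarily
    rand!(rng, a)                   # warm-up call so compilation is not timed
    return @elapsed rand!(rng, a)   # elapsed wall-clock time in seconds
end

# Same element types as in the original loop; each buffer is about 10^6 bytes.
for T in (Int8, Int16, Int32, Int64, Int128, Float16, Float32)
    println(rpad(string(T), 8), time_fill(T, 10^6 ÷ sizeof(T)))
end
```

A single `@elapsed` measurement is noisy; repeating the call and taking the minimum would give steadier numbers.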

@ViralBShah
Member

Should we have some more tests for all the new stuff?

# this is obviously satisfied on the 32 low bits side, and on the high side, the entropy comes
# from bits 33:52 of A128[i] and then from bits 27:32 (which are discarded on the low side)
# this is similar for the 64 high bits of u
A128[i] = mask128(u, T)

It would be great to get a few more eyes on this.

@ViralBShah
Member

@rfourquet The refactoring in this PR is perfect and can be merged right away. The other stuff I would like some eyeballs on. In general, it would help if you could make smaller PRs that each focus on one particular feature, so that they are easy to merge.

@rfourquet
Member Author

OK, will try! (I admit I didn't expect that a refactoring without added functionality would be welcome.)
This definitely needs some eyeballs; there is room for mistakes.
I'm not sure whether this needs more tests, as it only changes the internal implementation (no public API is modified).

@staticfloat
Sponsor Member

The only test I can think of is running BigCrush on this to ensure that we're still providing the same level of randomness as before.

@ivarne
Sponsor Member

ivarne commented Nov 11, 2014

We should at the bare minimum have tests that ensure that all the code paths are executed and that no exceptions are raised. It is really sad to get an UndefVarError in a bug report because of a typo in a code path that was never executed.

We can potentially also check return type and range.
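
A minimal sketch of such a smoke test, written for current Julia rather than the 2014 syntax used in this thread. The list of element types and the array sizes are assumptions; the PR's internal code paths are not shown here, so the sizes are only a guess at exercising both small and large arrays.

```julia
using Random, Test

@testset "rand! on MersenneTwister arrays (smoke test)" begin
    rng = MersenneTwister(0)   # fixed seed, chosen arbitrarily
    types = (Int8, UInt8, Int16, UInt16, Int32, UInt32, Int64, UInt64,
             Int128, UInt128, Float16, Float32, Float64)
    for T in types
        for n in (0, 1, 7, 1000)   # assumed sizes: empty, tiny, odd, larger
            a = Vector{T}(undef, n)
            @test rand!(rng, a) === a    # runs without error, fills in place
            @test eltype(a) == T         # return type check
            if T <: AbstractFloat
                # range check: floats are expected in [0, 1)
                @test all(x -> zero(T) <= x < one(T), a)
            end
        end
    end
end
```

This only checks that each call runs and returns values of the right type and range; it says nothing about statistical quality, which is what BigCrush is for.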

@andreasnoack
Member

I have just tried BigCrush on the scalar and array versions, and they still pass all tests except LinearComp, which they consistently fail (a well-known failure for the MT).

@ViralBShah
Member

@rfourquet Refactoring that makes the code more readable or clear in any way is always acceptable. Also, we generally do not have enough tests for all the random stuff - the existing tests are in no way sufficient. So, as suggested, we should try to have more tests to exercise all code paths and whatever we can test of RNGs in the test framework.

@andreasnoack Does your BigCrush test suite run tests for all the float and int precisions?

@rfourquet
Member Author

I added some tests which should exercise most code paths for added rand! methods (suggestions welcome, I'm a beginner at writing tests).

ViralBShah pushed a commit that referenced this pull request Nov 12, 2014
faster rand!(::MersenneTwister, ::Array{T}) for IntTypes and Float16/32
@ViralBShah ViralBShah merged commit a01e764 into master Nov 12, 2014
@rfourquet rfourquet deleted the rf/rand-fast-array branch November 17, 2014 09:30
@ViralBShah ViralBShah added the domain:randomness Random number generation and the Random stdlib label Nov 22, 2014
@rfourquet
Member Author

I finally learnt how to use RNGTest, so I could check that the BigCrush tests pass for all int precisions.
But for floats I don't understand: TestU01 only tests double-precision floats, so directly testing the raw Float32 values (via Float64(rand(Float32))) fails many tests.
Then, following the description of the function unif01_CreateDoubleGen (page 11 of the manual), I tried to obtain numbers with 53 bits of precision from 3 Float32 values (which have 24 bits of precision) with f53() = (rand(Float32)+rand(Float32)/exp2(24)+rand(Float32)/exp2(48)) % 1.0. But testing f53 passes BigCrush!
So what would be a good method to test Float32 and Float16 without finding that they pass more tests than Float64?
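
The construction described in the comment above can be spelled out as a small function. This is a sketch assuming all three draws use `rand`; the explicit `rng` argument and the intermediate names `a`, `b`, `c` are additions for illustration.

```julia
using Random

# Combine three Float32 draws into one Float64 in [0, 1): the first draw
# supplies the leading bits, the next two are scaled down by 2^24 and 2^48.
function f53(rng::AbstractRNG)
    a = Float64(rand(rng, Float32))
    b = Float64(rand(rng, Float32)) / exp2(24)
    c = Float64(rand(rng, Float32)) / exp2(48)
    return (a + b + c) % 1.0   # wrap the rare sum >= 1 back into [0, 1)
end
```

For example, `f53(MersenneTwister(0))` returns one such value; a function like this is what would be handed to the test battery.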

@andreasnoack
Member

I think your method is fine. Why is it not okay that the tests pass?

@rfourquet
Member Author

As the rand! methods for Float16 and Float32 use the Float64 one, it seemed suspicious to me that they pass more tests (i.e. scomp_LinearComp) than raw Float64. But now I realize that one Float64 (52 bits of entropy) is used to initialize two Float32 (which need only 2x23=46 bits of entropy) or four Float16 (which need 4x10=40 bits of entropy), so that may be okay after all. Speaking of which, would we want an option to generate a Float64 that passes all of BigCrush by consuming more than one Float64 internally?
@andreasnoack I seem to remember seeing somewhere that you tested randn with RNGTest, is that possible?
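
The bit budget in the comment above (52 random bits covering two 23-bit Float32 mantissas) can be illustrated with a short sketch. This is not the PR's code: the helper name `two_float32` and the choice of which bits go to which value are assumptions, and the 52 bits are taken from a `UInt64` draw rather than from a dSFMT Float64.

```julia
using Random

# Hypothetical helper: build two Float32 values in [0, 1) from 52 random bits.
function two_float32(rng::AbstractRNG)
    u  = rand(rng, UInt64) & 0x000fffffffffffff    # keep 52 random bits
    lo = (u % UInt32) & 0x007fffff                 # bits 1:23
    hi = ((u >> 23) % UInt32) & 0x007fffff         # bits 24:46; bits 47:52 are unused
    # 0x3f800000 is the bit pattern of 1.0f0, so OR-ing in a 23-bit mantissa
    # gives a value in [1, 2); subtracting 1 maps it to [0, 1).
    a = reinterpret(Float32, 0x3f800000 | lo) - 1f0
    b = reinterpret(Float32, 0x3f800000 | hi) - 1f0
    return a, b
end
```

Six of the 52 bits go unused in this layout, which matches the 52 - 46 surplus mentioned above.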

@andreasnoack
Member

I define testrandn()=cdf(Normal(), randn()) and pass that function to BigCrush. The surprising result is that it passes the tests even though rand doesn't.

I don't think we should modify our Mersenne-Twister to pass the scomp_LinearComp. Instead I think it would be better to supply alternative RNGs that pass all tests, e.g. the xorshift*, which we already have several gist versions of.
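
For reference, one member of the xorshift* family is small enough to sketch here. This is a generic xorshift64* step with the commonly published shift triple (12, 25, 27) and multiplier, not one of the gist versions mentioned above; the type and function names are hypothetical, and no claim is made about which BigCrush tests this particular variant passes.

```julia
# Hypothetical minimal xorshift64* generator with 64 bits of state.
mutable struct XorShift64Star
    state::UInt64   # must be seeded with a nonzero value
end

# Advance the state and return the next 64-bit output.
function next!(g::XorShift64Star)
    x = g.state
    x ⊻= x >> 12
    x ⊻= x << 25
    x ⊻= x >> 27
    g.state = x
    return x * 0x2545f4914f6cdd1d   # UInt64 multiplication wraps modulo 2^64
end

# Map the top 53 bits of the next output to a Float64 in [0, 1).
nextfloat64!(g::XorShift64Star) = (next!(g) >> 11) * exp2(-53)
```

A generator like this would still need to be wrapped in the `AbstractRNG` interface before it could be used with `rand` and `rand!`.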

@rfourquet
Member Author

Ah thanks!

@ViralBShah
Member

It would certainly be interesting to see how to make Float64 pass BigCrush, and what it means for performance and APIs.
