Skip to content

Conversation

@ViralBShah
Copy link
Member

For #6573 I looked through the code but it appears that we are calling the fill_array routines correctly. For now, to fix this issue, I am disabling the fill_array routines, so that we only call dsfmt_gv_genrand_close_open for all random number generation.

ViralBShah added a commit that referenced this pull request Apr 19, 2014
Disable fill_array calls in DSFMT to fix 6573
@ViralBShah ViralBShah merged commit a117337 into master Apr 19, 2014
@ViralBShah ViralBShah deleted the vs/6573 branch April 19, 2014 09:05
@mbauman
Copy link
Member

mbauman commented Apr 19, 2014

Here's the crux of the issue:

From dSFMT.c:439 (the comment header to dsfmt_fill_array_close1_open2, which isn't the fill array method rand(N) uses, but it's at the beginning of the fill-array section): "This function can not be used after calling genrand_xxx functions, without initialization."

Seems like a very strange limitation. But it's exactly the requirement I noted in #6573. Very unfortunate…

I wonder how many other users of dSFMT might be unknowingly hitting this.

@StefanKarpinski
Copy link
Member

Thanks for tracking that down. That's a pretty terrible limitation that doesn't seem to be noted prominently enough at all.

@andreasnoack
Copy link
Member

You are okay if you use either the genrand or fill_array version, but you are not allowed to mix them unless you reinitialise the RNG. It is not about the global state. The same happens if you store the state youself.

julia> r=Base.LibRandom.DSFMT_state();
julia> Base.LibRandom.dsfmt_init_gen_rand(r, 0x00123);
julia> x=Base.LibRandom.dsfmt_fill_array_close_open!(r, Array(Float64,384));
julia> sum(x.==Base.LibRandom.dsfmt_genrand_close_open(r))
0
julia> x=Base.LibRandom.dsfmt_fill_array_close_open!(r, Array(Float64,384));
julia> sum(x.==Base.LibRandom.dsfmt_genrand_close_open(r))
1

@JeffBezanson
Copy link
Member

Oh wow, so DSFMT is essentially useless.

@mbauman
Copy link
Member

mbauman commented Apr 19, 2014

Right, as I understand it, they both update the state, but they do so in ways that are incompatible with the other.

@StefanKarpinski
Copy link
Member

This is such an utter and complete failure of both API and documentation. These bits of dangerously conflicting functionality don't belong in the same library – they should be split into separate libraries so they can't trample each other like this. We may need a new RNG in the longer term. It's unfortunate that initial testing of Random123 has shown it not to be nearly as fast.

@ViralBShah
Copy link
Member Author

Internally, I believe it still generates random numbers in batches, and the performance of calling genrand repeatedly seems OK. Don't know how I missed that warning.

ViralBShah added a commit that referenced this pull request Apr 20, 2014
used together, without reinitializing the state.

This commit removes wrappers around DSFMT's fill_array
functions. It is easy enough to add them back if required at a later
point.
@StefanKarpinski
Copy link
Member

No worries, I really don't think that burying a DIRE warning like that in some C file is acceptable. That sort of thing should be plastered all over the homepage of the library, not hidden away somewhere.

@ViralBShah
Copy link
Member Author

Or they could have just used a different struct for the state. We had tested the quality of random numbers from DSFMT, but not by mixing the individual and vectorised generation.

@StefanKarpinski
Copy link
Member

I guess one good thing might be that in real production code people may not be so likely to mix these.

@mbauman
Copy link
Member

mbauman commented Apr 21, 2014

I've not fully read the dSFMT papers, so I don't know if this is just unfounded paranoia, but the possibility of having the two states end up parallel to each other is worrying to me.

Really, the performance regression here isn't as bad as I expected. The iterative genrand is only 2x (with 400 elements) to 3x (with 100_000_000 elts) slower than fill_array on my machine. Making the opposite choice (only calling the fill_array flavors) is more problematic: the one element genrand smokes the minimum fill_array by 25x.

function dsfmt_genrand!(A::Array{Float64})
    n = length(A)
    @inbounds for i = 1:n
        A[i] = ccall((:dsfmt_gv_genrand_close_open, :librandom), Float64, ())
    end
end

function dsfmt_fillrand!(A::Array{Float64}) 
    n = length(A) # must be even and > 382
    ccall((:dsfmt_gv_fill_array_close_open, :librandom), Void, (Ptr{Float64}, Int), A, n)
end

@StefanKarpinski
Copy link
Member

The whole selling point of dSFMT in the first place is that it uses SIMD instructions to get a 2x performance improvement. If we're stuck using their non-SIMD code which is 2-3x slower than the SIMD version then what the hell is the point? This experience leaves me inclined to ditch dSFMT altogether.

@nalimilan
Copy link
Member

How about asking them about the traps you suspect are left in their code? ;-)

@ViralBShah
Copy link
Member Author

We are still using the SIMD code and I haven't been able to see a performance drop yet, much to my amazement.

@ViralBShah
Copy link
Member Author

Didn't see the comments above. I Will try to replicate the performance.

@ViralBShah ViralBShah added the randomness Random number generation and the Random stdlib label Nov 22, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

randomness Random number generation and the Random stdlib

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants