
Random: make RandomDevice() instances share the same /dev/urandom file #27936

Merged
5 commits merged on Dec 3, 2019

Conversation

rfourquet
Member

@rfourquet rfourquet commented Jul 4, 2018

This is a solution to the problem that too many RandomDevice() instances couldn't coexist, because each one would open a separate file.

EDIT: the original post below proposed adding a global RANDOM_DEVICE generator to Random, which was the initial solution to the same problem.

Creating a RandomDevice object is cheap but it opens a file
on unix systems, so one can't call rand(RandomDevice())
repeatedly too quickly (e.g. in a loop). It can therefore be
convenient to have a readily available generator.

It was surprisingly difficult to implement this global RandomDevice object (it would of course be easier to use a Ref{RandomDevice}), so users shouldn't be expected to do this themselves, which I think justifies putting it in Random. The current implementation is ugly (and possibly wrong), so improvement suggestions are welcome.
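(For illustration, a minimal way to hit the underlying problem on unix, before this change; the exact limit and error message depend on the system and on GC timing:)

```julia
using Random

# On unix, each RandomDevice() opens /dev/urandom and keeps the handle alive
# until the object is garbage-collected, so a tight loop can run out of
# file descriptors:
for _ in 1:100_000
    rand(RandomDevice())   # may eventually fail with a "too many open files" error
end
```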

@rfourquet rfourquet changed the title from "add a global Random.RANDOM_DEVICE generator" to "Random: add a global RANDOM_DEVICE generator" on Jul 5, 2018
@rfourquet rfourquet added the domain:randomness (Random number generation and the Random stdlib) label on Jul 5, 2018
@rfourquet
Member Author

bump

@rfourquet
Member Author

Bump

@StefanKarpinski
Sponsor Member

@rfourquet: I'm not sure what you need here... review? Who would be good to review?

@rfourquet
Member Author

I need feedback on the idea! I think it's a good one, but I wouldn't dare merge this kind of change without approval. A code review would be nice too, as this touches internals I'm not so comfortable with.

@StefanKarpinski
Sponsor Member

It seems like a fine idea to me. We can always hold off on exporting it for a while in case we change our minds.

@ViralBShah
Member

ViralBShah commented Oct 7, 2018

I have a couple of thoughts. I didn't like the idea of having a default open file descriptor globally, but it probably doesn't matter much in practice. Would this be frowned upon in production code for any reason?

The other thing I notice looking at this code is that it seems to be Unix-only, so code using it won't work on Windows. If we are going to introduce a new global, it would be much nicer if it at least worked the same on all supported operating systems.

One possibility is that we could provide a helper function that makes this convenient and is a no-op on Windows?

@rfourquet
Member Author

rfourquet commented Oct 7, 2018

Let me give some context on what prompted me to implement this global RANDOM_DEVICE: the randtmp() code in Revise's tests, which generates a random temporary directory name. As @testset resets the state of the global RNG, Tim needed to seed!() the global RNG to get a different name in each @testset run.

The obvious solution here would be dirname = joinpath(tempdir(), randstring(RANDOM_DEVICE, 10)), without needing to deal with the GLOBAL_RNG. RandomDevice() can be used instead, but the problem is that it will open many different files (on unix) and cause errors if they are not closed fast enough, so a global RandomDevice needs to be created to avoid this. As a RandomDevice() object holds a file handle on unix, the only sane solution (an "unsane" one being this PR) for instantiating a global in a module (Revise in this case) is to have const rng = Ref{RandomDevice}() and to initialize it at __init__ time. This seems a bit too much for developers who don't care about the internal implementation of RandomDevice.
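(For reference, a sketch of that workaround as it would look in a package; the module and names below are purely illustrative, not actual Revise code:)

```julia
module MyPackage  # hypothetical package, for illustration only

using Random

# RandomDevice opens a file handle on unix, so a global instance cannot be
# baked in at precompile time; store it in a Ref and fill it in __init__.
const DEVICE = Ref{RandomDevice}()

__init__() = (DEVICE[] = RandomDevice(); nothing)

# a random temporary directory name, independent of the global RNG state
randtmp() = joinpath(tempdir(), randstring(DEVICE[], 10))

end # module
```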

So I think that RANDOM_DEVICE should be exported instead of RandomDevice(), which I guess should be used only when thread-safety is an issue (or is RANDOM_DEVICE here already thread-safe?).

Would this be frowned upon in production code for any reason?

I can't say, and don't know whom to ask!

it seems to be Unix-only.

No, in all cases RANDOM_DEVICE is a global RNG, but only in the unix case must it be initialized in __init__, as on Windows there is no file to open.

@StefanKarpinski
Sponsor Member

Could we just cache the random device handle once it’s been opened the first time?

@rfourquet
Member Author

rfourquet commented Nov 30, 2018

Could we just cache the random device handle once it’s been opened the first time?

We could, but I'm not very clear on what problem that would solve... What are the drawbacks of unconditionally opening the pseudo-file "/dev/urandom" in each Julia session?

EDIT: sorry for such a late reply!

@rfourquet
Member Author

Does anyone know whether it's thread-safe to have one global object with one handle to the "/dev/urandom" file?

I will add the triage label here.

Also, another idea: instead of exporting yet another RANDOM_DEVICE object, what about having RandomDevice() be an empty struct that doesn't open a file handle, but instead uses a globally opened one, shared among all RandomDevice() instances? That way, we don't have to warn people against opening too many instances of RandomDevice to avoid too many open file handles.

@rfourquet rfourquet added the status:triage (This should be discussed on a triage call) label on Oct 30, 2019
@StefanKarpinski
Sponsor Member

How does that affect Julia's startup time? If it's negligible, then this is fine; if it's not, we may want to create a special object that opens this file lazily.

@rfourquet
Member Author

How does that affect Julia's startup time?

I think it's negligible. But I will definitely check, and do it lazily if needed. Actually, we create a RandomDevice() at startup time anyway to initialize the global RNG, so we might as well initialize RANDOM_DEVICE instead of a temporary. But I think I like the idea of all RandomDevice() instances sharing the same file better; I just need to know whether it's thread-safe.

@StefanKarpinski
Sponsor Member

At the moment, I believe that all I/O is locked, so it should be, but @JeffBezanson or @vtjnash should be able to say more.

@rfourquet
Member Author

I've updated as follows: the file handle is now a global variable, which is assigned lazily. So RandomDevice() doesn't open a new file handle in general, which means it's fine to call rand(RandomDevice()) in a loop. I therefore removed RANDOM_DEVICE, and this PR just becomes a kind of bug fix.

It remains to be clarified whether it is thread-safe to read "/dev/urandom" from multiple threads (through the same file handle). If not, I will just put the files in thread-local storage and, if required for performance, put a file handle back in the RandomDevice struct that holds a reference to the thread-local file handle (to avoid retrieving it on every rand call).

@chethega
Contributor

chethega commented Nov 1, 2019

This should be thread-safe on unix (from 1.3 on we use thread-safe libuv functions for IO). On prior versions, a segfault seems OK (just like print used to segfault). There is a data race on initialization of the RNG, but that's benign: worst case, we open nthreads many file descriptors to /dev/urandom.

No idea about windows: We use an ancient deprecated cryptographically questionable API.

@rfourquet
Member Author

There is a data race on initialization of the RNG

Ah right, of course. I was going to say that since we open "/dev/urandom" on startup anyway (for default_rng()), the lazy thing is not really necessary, but I didn't realize till now that we don't do that anymore with the threading changes.

OK, so I updated the PR to open the file lazily, per thread, as is done for Random.THREAD_RNGs. It actually doesn't complicate the code much.
But it doesn't seem ideal: I have only two threads on my laptop, so it's not a big deal, but with more cores it's wasteful to open so many file descriptors. The alternative is to open two file descriptors (/dev/random and /dev/urandom) non-lazily at startup, which are then shared by all threads (since it's confirmed that reading from a file descriptor is thread-safe). As RandomDevice() is used for seeding default_rng(), it's not an unlikely scenario to end up creating a RandomDevice() on each thread.
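(Roughly, the per-thread variant looks like the sketch below; the names are illustrative and this is not the exact code in the commit:)

```julia
# One lazily opened /dev/urandom handle per thread, in the spirit of
# Random.THREAD_RNGs (illustrative sketch).
const THREAD_DEV_URANDOM = Union{Nothing,IOStream}[]

function __init__()
    resize!(THREAD_DEV_URANDOM, Threads.nthreads())
    fill!(THREAD_DEV_URANDOM, nothing)
end

function thread_urandom()
    tid = Threads.threadid()
    f = THREAD_DEV_URANDOM[tid]
    if f === nothing
        f = open("/dev/urandom")   # no lock needed: each thread only touches its own slot
        THREAD_DEV_URANDOM[tid] = f
    end
    return f
end
```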

@rfourquet
Member Author

No idea about windows: We use an ancient deprecated cryptographically questionable API.

No idea either! But at least it doesn't share the unix version's problem of opening a file descriptor each time an instance is created!

But definitely an issue if the API is deprecated, although probably more relevant to discuss this in #32954.

@StefanKarpinski
Sponsor Member

No idea about windows: We use an ancient deprecated cryptographically questionable API.

Maybe it would be good to open an issue about this and ideally fix it and use a new, good API 😁

@rfourquet rfourquet changed the title from "Random: add a global RANDOM_DEVICE generator" to "Random: make RandomDevice() instances share the same /dev/urandom file" on Nov 15, 2019
Creating a RandomDevice object is cheap but it opens a file
on unix systems, so one can't call `rand(RandomDevice())`
repeatedly too quickly (e.g. in a loop). It can therefore be
convenient to have a readily available generator.
@rfourquet
Member Author

I believe this is in a good state to be merged. It would be good if someone (maybe @vtjnash?) could review; the change here is very simple.

As a TL;DR: having each RandomDevice() open a new file is bad (it usually can't be used in loops), so the instances now share the same file, opened at __init__ time. The alternatives I see that would allow opening the file lazily are:

  1. have each thread open its own file lazily: doesn't seem great on many-core machines
  2. have only one file, opened lazily, but use locks (sketched below)

I find both options overkill, and I would expect the files to be opened anyway in most programs. Opening the files seems to take sub-millisecond time (about 30 microseconds on my machine), so it's not likely to be a problem for start-up time.
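(Option 2 would look roughly like this; DEV_URANDOM and DEV_URANDOM_LOCK are made-up names for the sketch, not code from this PR:)

```julia
const DEV_URANDOM = Ref{IOStream}()
const DEV_URANDOM_LOCK = ReentrantLock()

# One shared file, opened lazily under a lock; the lock is taken on every
# access, which is the overhead that makes this feel like overkill.
function urandom_file()
    lock(DEV_URANDOM_LOCK) do
        isassigned(DEV_URANDOM) || (DEV_URANDOM[] = open("/dev/urandom"))
        DEV_URANDOM[]
    end
end
```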

@rfourquet rfourquet removed the status:triage (This should be discussed on a triage call) label on Nov 15, 2019
@vtjnash
Sponsor Member

vtjnash commented Nov 15, 2019

I wonder how long until we can switch this to use the new "modern" interface of getentropy (available since early 2017)
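(For context, a sketch of how getentropy could be reached from Julia via ccall; availability depends on the libc, e.g. glibc ≥ 2.25 or macOS ≥ 10.12, and this is not code from this PR:)

```julia
# getentropy(2) fills a buffer with OS-provided entropy; it accepts at most
# 256 bytes per call, so larger requests would need chunking.
function getentropy!(buf::Vector{UInt8})
    @assert length(buf) <= 256
    ret = ccall(:getentropy, Cint, (Ptr{UInt8}, Csize_t), buf, length(buf))
    systemerror("getentropy", ret != 0)
    return buf
end

getentropy!(Vector{UInt8}(undef, 16))
```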

unlimited = deserialize(s)
return RandomDevice(unlimited=unlimited)
end
getfile(rd::RandomDevice) = @inbounds DEV_RANDOM[1 + rd.unlimited]
Sponsor Member

This should not be @inbounds as it is not essential. And while equivalent, I think using ?: will read more clearly.

I'd also suggest going with the lazy approach you mentioned.

Suggested change
getfile(rd::RandomDevice) = @inbounds DEV_RANDOM[1 + rd.unlimited]
function getfile(rd::RandomDevice)
dev = rd.unlimited ? 2 : 1
isassigned(DEV_RANDOM, dev) || (DEV_RANDOM[dev] = open(rd.unlimited ? "/dev/urandom" : "/dev/random"))
return DEV_RANDOM[dev]
end
function __init__()
    ...
    Sys.iswindows() || resize!(empty!(DEV_RANDOM), 2)
    nothing
 end

Member Author

I'd also suggest going with the lazy approach you mentioned.

Your suggestion is basically what I had in the second commit, but @chethega mentioned that this looks thread unsafe:

There is a data race on initialization of the RNG, but that's benign: worst case, we open nthreads many file descriptors to /dev/urandom.
#27936 (comment)

I was quite convinced. That's why I then gave up on laziness: I found it overkill to have one file per thread (third commit), and annoying to use locks explicitly.

But I'm not well versed in data races: what is the worst case here? Opening a file twice? Does DEV_RANDOM[dev] in your example always point to a valid file, even if the location has been written to concurrently?

(and thanks for your review!)

Sponsor Member

Yeah, you'd potentially leak up to nthreads copies of the file descriptors instead of 2. We really need to add a Threads.once function soon, since it's rather absurd to be making design decisions based on the lack of a simple little helper function.

Member Author

Oh, I had misunderstood @chethega's comment, which I thought was suggesting that we "open nthreads many file descriptors" ourselves, i.e. in an array like for default_rng(), to avoid the data race! Sorry for the noise.
So I updated accordingly, with a TODO to add "thread-once" when available.

@StefanKarpinski
Sponsor Member

What's the status of this?

@rfourquet
Member Author

This is good to go on my side. I reverted to a previous version of this PR, which basically matches what @vtjnash suggested, except that it uses two global refs instead of an array of length 2 ("/dev/random" is likely to be opened only very rarely, so if we are lazy, let's be lazy all the way and open only the necessary file(s)). A final quick review might be appropriate. I didn't give a second thought to the name getfile; it's not very important, but I could change it if someone has a better name.
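(In outline, a sketch of the approach just described, not a verbatim copy of the diff:)

```julia
const DEV_RANDOM  = Ref{IOStream}()
const DEV_URANDOM = Ref{IOStream}()

function getfile(rd::RandomDevice)
    devrandom = rd.unlimited ? DEV_URANDOM : DEV_RANDOM
    # Lazy and technically racy: in the worst case a few threads may each open
    # the file once, leaking up to nthreads() descriptors (see above); a future
    # "Threads.once"-style helper would remove the race.
    isassigned(devrandom) || (devrandom[] = open(rd.unlimited ? "/dev/urandom" : "/dev/random"))
    return devrandom[]
end
```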

@rfourquet
Member Author

Side note to whoever merges this: squash the commits, but use the PR title as the commit message, as none of the intermediate commit messages is good.
