Partially Fix Issue 18596: use arc4random when available for unpredictableSeed #6267

n8sh · 2018-03-12T10:14:35Z

EDIT: reduced this PR just to using arc4random when available. Other cases left for another PR.

Use arc4random when available, ~~otherwise replace MinstdRand0 with xorshift64*/32.~~

This pull request does not seek to make std.random.unpredictableSeed cryptographically secure, but just correct some basic deficiencies mentioned in https://issues.dlang.org/show_bug.cgi?id=18596:

Currently std.random.unpredictableSeed returns the result of a thread-local MinstdRand0 instance xor'd against the clock. MinstdRand0 is slow (due to integer division) and somewhat outdated. A particular weakness of using MinstdRand0 is that it is very likely that consecutive calls to unpredictableSeed will return numbers that are identical in the high bit, since MinstdRand0 only produces results in the range 1 .. 2 ^^ 31 - 1.
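The high-bit weakness described above can be checked directly. The following is a minimal sketch, not code from this PR; `highBitAlwaysClear` is a hypothetical helper name. Because every MinstdRand0 output fits in 31 bits, bit 31 of the old unpredictableSeed result came from the clock alone.

```d
import std.random : MinstdRand0;

// MinstdRand0 works modulo 2^^31 - 1, so its largest output is 2^^31 - 2.
static assert(MinstdRand0.min == 1);
static assert(MinstdRand0.max == 2_147_483_646);

/// Hypothetical helper: returns true if none of the first `n` outputs
/// produced from `seed` has bit 31 set.
bool highBitAlwaysClear(uint seed, size_t n)
{
    auto rng = MinstdRand0(seed);
    foreach (_; 0 .. n)
    {
        if (rng.front & 0x8000_0000)
            return false;
        rng.popFront();
    }
    return true;
}
```

Calling `highBitAlwaysClear(42, 1000)` should return `true` for any valid nonzero seed, since the output range never reaches `2 ^^ 31`.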

Proposed remedy:

There are modern PRNG algorithms that have comparable state size to MinstdRand0 (64 bits or 32 bits) but are faster than MinstdRand0 and have output that scores better on randomness tests like BigCrush.

XorShift64*/32 is one example. (Results of randomness tests and speed tests appear in charts in this paper.) It has the virtue that 0 is an illegal state, which means that we don't need a separate flag to indicate whether it is initialized.
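For readers unfamiliar with the generator, here is a minimal sketch of xorshift64*/32. It is not the code from this PR: the function name is hypothetical, and the shift constants (12, 25, 27) and multiplier are the ones commonly given for xorshift64*, so treat them as assumptions and check them against the paper before relying on them.

```d
/// Hypothetical sketch of one xorshift64*/32 step.
/// `state` must be nonzero; a nonzero state never becomes zero, which is
/// why a zero value can double as the "not yet initialized" marker.
uint xorshift64Star32(ref ulong state) @safe pure nothrow @nogc
{
    assert(state != 0, "0 is not a legal xorshift state");
    state ^= state >>> 12;
    state ^= state << 25;
    state ^= state >>> 27;
    // Scramble with a multiplication and keep only the high 32 bits.
    return cast(uint) ((state * 2_685_821_657_736_338_717UL) >>> 32);
}
```

A thread-local `ulong` initialized to 0 could hold the state, with the first call replacing 0 by a bootstrap value before stepping the generator.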

And:

On some platforms we can use functions like arc4random which incorporate system entropy and remove the need to roll our own entropy-gathering function to set an initial state for a PRNG.

dlang-bot · 2018-03-12T10:14:38Z

Thanks for your pull request and interest in making D better, @n8sh! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.
Please verify that your PR follows this checklist:

My PR is fully covered with tests (you can see the annotated coverage diff directly on GitHub with CodeCov's browser extension)
My PR is as minimal as possible (smaller, focused PRs are easier to review than big ones)
I have provided a detailed rationale explaining my changes
New or modified functions have Ddoc comments (with Params: and Returns:)

Please see CONTRIBUTING.md for more information.

If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment.

Bugzilla references

Auto-close	Bugzilla	Severity	Description
✓	18596	enhancement	std.random.unpredictableSeed could use something better than MinstdRand0

n8sh · 2018-03-12T10:19:26Z

More ambitious work was previously explored by @yshui in pull #5230.

@wilzbach re: #6021 (review)

FWIW we should start porting the good stuff from mir-random to Phobos, for example, we could begin with unpredictableSeed.

This is a start, but there is still more to do before the parts you wrote for Linux and Windows are in. I wanted to start small.

wilzbach

A few initial comments and questions

wilzbach · 2018-03-12T10:23:43Z

std/random.d

-@property uint unpredictableSeed() @trusted nothrow @nogc
+alias unpredictableSeed = unpredictableSeedOf!uint;
+/// ditto
+@property UIntType unpredictableSeedOf(UIntType)() @nogc nothrow @trusted


As this is a new public symbol it would require a changelog entry + @andralex approval.
Maybe it's easier to set it to private for now?

Since you have more experience here, if you think splitting it will speed along the review process I'll do that and amend the title.

if you think splitting it will speed along the review process

Yeah I do think it will help. Approval of a new public symbol usually takes > one month (as once it's added it can't be removed or modified anymore), so I would recommend setting it to private and, once this PR is merged, opening a PR that just removes private. With this approach, you aren't blocked on the approval for new symbols.

AFAIK a public alias of a private member doesn't work so I instead removed the unpredictableSeedOf!UIntType template.

wilzbach · 2018-03-12T10:25:30Z

std/random.d

-    if (!seeded)
+    version (AnyARC4Random)
+    {
+        // On macOS if we just need 32 bits it is faster to use


AFAICT this is no longer only macOS, so "/On macOS/d"?

I wrote "on macOS" because I've timed it and this way is faster on macOS, but I haven't timed it on other platforms. I suspect it's faster on OpenBSD etc. but I don't know.

Ah my point was just that the static if below is (UIntType.sizeof <= uint.sizeof), so you don't distinguish between macOS, but mention it in your motivation which is a bit confusing.

I suspect it's faster on OpenBSD etc. but I don't know.

I think so too. It's just returning one register value and no looping should be necessary.
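The size-based dispatch discussed here can be sketched as follows. This is an illustration, not the PR's code: `systemSeed` and the `SketchHasArc4Random` version identifier are hypothetical names, the platform list is an assumption, and the fallback branch simply defers to the existing `unpredictableSeed` so the sketch compiles everywhere.

```d
// Hypothetical version identifier; the PR itself uses AnyARC4Random.
version (OSX) version = SketchHasArc4Random;
else version (FreeBSD) version = SketchHasArc4Random;
else version (OpenBSD) version = SketchHasArc4Random;
else version (NetBSD) version = SketchHasArc4Random;
else version (DragonFlyBSD) version = SketchHasArc4Random;

version (SketchHasArc4Random)
{
    extern (C) private @nogc nothrow
    {
        uint arc4random() @safe;
        void arc4random_buf(scope void* buf, size_t nbytes) @system;
    }
}

/// Hypothetical helper returning a seed of type `T` from the system.
T systemSeed(T)() @trusted nothrow @nogc
if (is(T == uint) || is(T == ulong))
{
    version (SketchHasArc4Random)
    {
        static if (T.sizeof <= uint.sizeof)
        {
            // 32 bits or fewer: a single call returning one register value.
            return arc4random();
        }
        else
        {
            // Wider types: let the C library fill the bytes.
            T result = void;
            arc4random_buf(&result, T.sizeof);
            return result;
        }
    }
    else
    {
        // Fallback for platforms without arc4random (assumption of this sketch).
        import std.random : unpredictableSeed;
        immutable ulong hi = unpredictableSeed;
        immutable ulong lo = unpredictableSeed;
        return cast(T) ((hi << 32) | lo);
    }
}
```

Note that the `static if` depends only on the size of the requested type, not on the operating system, which is the point raised about the "On macOS" comment.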

wilzbach · 2018-03-12T10:29:23Z

std/random.d

+        // generators, scrambled" (Vigna 2016).
+        x ^= x >>> 12;
+        x ^= x << 25;
+        x ^= x >>> 37;


Can't we use the existing Xorshift range?

I would like to but I can't. It is broken for anything but 32-bit xorshift.

I mean xorshift with 32-bit words, although it supports several sizes of that.

EDIT: earlier reported at https://issues.dlang.org/show_bug.cgi?id=18327 "std.random.XorshiftEngine is parameterized by UIntType but only works with uint"

Hmm, that sucks. You are referring to https://issues.dlang.org/show_bug.cgi?id=18327, right?

(edit: I didn't see your response before)

wilzbach · 2018-03-12T10:32:16Z

std/random.d

+        ulong tid = cast(ulong) &_seeder; // Distinct for each thread.
+        tid *= m;
+        tid = (tid ^ (tid >>> 47)) * m;
+        result = (result ^ tid) * m;


This is repeated three times; couldn't it be put in a shift function?

It was tiny enough that I didn't think it was worth the bother, but it might be clearer this way. Will do.
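A sketch of what such a helper could look like, based only on the four lines quoted in the diff above; the name `mixIn` is hypothetical and the final form in the PR may differ.

```d
enum ulong m = 0xc6a4_a793_5bd1_e995UL; // MurmurHash2_64A constant.

/// Hypothetical helper: folds one 64-bit input `x` into `result` using the
/// multiply / xor-shift / multiply step shown in the diff.
ulong mixIn(ulong result, ulong x) @safe pure nothrow @nogc
{
    x *= m;
    x = (x ^ (x >>> 47)) * m;
    return (result ^ x) * m;
}
```

Each source of variation (for example the thread-distinct address, the process ID and the clock) would then be folded in with one call such as `result = mixIn(result, tid);` instead of repeating the three statements.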

wilzbach · 2018-03-12T10:43:03Z

This is a start, but there is still more to do before the parts you wrote for Linux and Windows are in. I wanted to start small.

Fair enough. Thanks a lot for picking this up! I haven't done much with random numbers lately, so this unfortunately was lost in my TODO queue.

but there is still more to do before the parts

Do you plan to replace bootstrapSeed with getRandomX then and just use it as fallback?

FWIW we should start porting the good stuff from mir-random to Phobos

Thinking more about it, porting anything except unpredictableSeed is probably quite hard, and it's more or less a lost cause because we can't make breaking changes. At some point we should probably just deprecate std.random and adapt mir-random to e.g. std.math.random

wilzbach · 2018-03-12T10:56:23Z

std/random.d

+    {
+        ulong result = void;
+        enum ulong m = 0xc6a4_a793_5bd1_e995UL; // MurmurHash2_64A constant.
+        void update_result(ulong x)


Sorry for the nitpicking, but the DStyle requires camelCase not snake_case.

wilzbach · 2018-03-12T11:05:09Z

AFAIK a public alias of a private member doesn't work so I instead removed the unpredictableSeedOf!UIntType template.

Yeah you are correct: https://run.dlang.io/is/Fk9UQy (sorry)

n8sh · 2018-03-12T11:15:52Z

Do you plan to replace bootstrapSeed with getRandomX then and just use it as fallback?

Yeah, either that or internally use getRandomX to implement unpredictableSeed, either way. Might vary by platform depending on benchmarks. On Windows I would probably use CryptGenRandom to produce the result like in mir.random.unpredictableSeed since it seems fast enough.

JackStouffer · 2018-03-12T13:13:12Z

This needs two notes added to the docs:

How secure/insecure this is for cryptographic purposes
I know it's common for users to re-seed their RNGs often with something like this function. As I understand it, that actually makes for lower entropy. If so, please make a note explaining this.

n8sh · 2018-03-12T13:27:21Z

How secure/insecure this is for cryptographic purposes

There is no change to the existing level of security (none). This is currently documented at the top of the file:

$(RED Disclaimer:) The _random number generators and API provided in this
module are not designed to be cryptographically secure, and are therefore
unsuitable for cryptographic or security-related purposes such as generating
authentication tokens or network sequence numbers. For such needs, please use a
reputable cryptographic library instead.

n8sh · 2018-03-12T13:53:53Z

I know it's common for users to re-seed their RNGs often with something like this function. As I understand it, that actually makes for lower entropy. If so, please make a note explaining this.

Added this:

Note:
In general periodically 'reseeding' a PRNG does not improve its quality
and in some cases may harm it. For an extreme example the Mersenne
Twister has `2 ^^ 19937 - 1` distinct states but after `seed(uint)` is
called it can only be in one of `2 ^^ 32` distinct states regardless of
how excellent the source of entropy is.
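To make the note concrete, here is a small sketch contrasting the two ways of seeding a Mersenne Twister. The function names are hypothetical; the range-based call assumes the `seed` overload of `MersenneTwisterEngine` that accepts an input range of words.

```d
import std.algorithm.iteration : map;
import std.random : Mt19937, unpredictableSeed;
import std.range : repeat;

/// Seeding from one 32-bit word: the generator can only start in one of
/// 2^^32 states, however good the seed is.
Mt19937 seededFromOneWord(uint seed)
{
    return Mt19937(seed);
}

/// Seeding every state word from a range of unpredictable values.
Mt19937 seededFromManyWords()
{
    Mt19937 gen;
    gen.seed(map!((a) => unpredictableSeed)(repeat(0)));
    return gen;
}
```

Two generators built with `seededFromOneWord` from the same value produce identical sequences, which is exactly why repeated reseeding from a single `uint` cannot add more than 32 bits of variation.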

JackStouffer · 2018-03-15T19:59:36Z

Pinging people who are qualified to review this @wilzbach @quickfur @andralex

andralex

I'm not an expert but before we'd use the pid, the tid, and the time as sources of "surprise". Now we're only using the time. Isn't that a step backwards? E.g. several threads call the function simultaneously.

n8sh · 2018-03-19T04:30:23Z

I'm not an expert but before we'd use the pid, the tid, and the time as sources of "surprise". Now we're only using the time. Isn't that a step backwards?

Before once for each thread we used the pid and tid and time to initialize a thread-local PRNG, and used the output of that PRNG mixed with the time to produce seeds. We're still doing that. EDIT: The once-per-thread initialization using pid+tid+time is in the private bootstrapSeed function.
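The mechanism described in this reply can be sketched roughly as below. This is an approximation for illustration, not the exact Phobos source; `sketchUnpredictableSeed` is a hypothetical name, and the way the pid, tid and time are combined is an assumption.

```d
import core.thread : Thread;
import core.time : MonoTime;
import std.process : thisProcessID;
import std.random : MinstdRand0;

/// Hypothetical sketch of the pre-existing scheme: a thread-local generator
/// bootstrapped once from pid + tid + time, then mixed with the clock on
/// every call.
uint sketchUnpredictableSeed() @trusted
{
    static bool seeded;      // function-level statics are thread-local in D
    static MinstdRand0 rand; // so each thread gets its own generator
    if (!seeded)
    {
        // Once per thread: derive the initial state from pid, tid and time.
        immutable uint tid = cast(uint) cast(size_t) cast(void*) Thread.getThis();
        immutable uint pid = cast(uint) thisProcessID;
        rand.seed((pid + tid) ^ cast(uint) MonoTime.currTime.ticks);
        seeded = true;
    }
    // Every call: advance the generator and xor its output with the clock.
    rand.popFront();
    return cast(uint) (MonoTime.currTime.ticks ^ rand.front);
}
```

So the pid and tid still contribute, but only through the one-time bootstrap of each thread's generator.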

joseph-wakeling-sociomantic · 2018-03-21T16:23:51Z

otherwise replace MinstdRand0 with xorshift64*/32.

I'm not keen on that particular change because those are generators a user might reasonably want to use from within their own app, and hence it's probably not a good idea to have the unpredictable seed make use of those same algorithms.

Before making any change it's probably worth asking what the state of the art is in other languages and libraries.

joseph-wakeling-sociomantic · 2018-03-21T16:36:14Z

std/random.d

+}
+else
+{
+    private ulong _seeder; // 0 indicates uninitialized.


Minor note: this can be made a static variable inside the bootstrapSeed function, no?

It could be inside unpredictableSeed but I put it outside so it could be used in several different functions (for example, if in the future there are variants of unpredictableSeed with outputs of different sizes).

joseph-wakeling-sociomantic · 2018-03-21T16:38:09Z

Before making any change it's probably worth asking what the state of the art is in other languages and libraries.

To expand on that: the principal problem I see here is that the fallback (non-arc4) solution doesn't meaningfully improve on the mechanism used. It just switches out the pseudo-random algorithm that is used to transform the pid+tid+time-derived seed.

It would be better to look at improved mechanisms for that, in general, than to think that switching the pseudo-random component particularly gives us any benefit.

n8sh · 2018-03-21T17:24:42Z

What I would suggest is that we separate out concerns, i.e. we don't couple the decision to prefer arc4 where available, with the decision to rewrite the existing unpredictableSeed mechanism. Does that sound reasonable?

Sure.

n8sh · 2018-03-22T08:01:20Z

Reduced this PR just to using arc4random when available. Other cases left for another PR.

joseph-wakeling-sociomantic · 2018-03-22T12:49:17Z

Now, specifically w.r.t. whether we should prefer (a)RC4 where it is available: how does this proposal fit with the fact that RC4 is being phased out in other contexts?
https://tools.ietf.org/html/rfc7465
https://www.computerworld.com/article/2489395/encryption/microsoft-continues-rc4-encryption-phase-out-plan-with--net-security-updates.html

@JackStouffer I'm not sure whether we should approve this without discussing RC4/arc4random itself. On the contrary I think it would be a good idea to consider, in detail, the range of options for improving unpredictableSeed.

In particular, w.r.t. this existing doc:

$(RED Disclaimer:) The _random number generators and API provided in this
module are not designed to be cryptographically secure, and are therefore
unsuitable for cryptographic or security-related purposes such as generating
authentication tokens or network sequence numbers. For such needs, please use a
reputable cryptographic library instead.

.... this disclaimer is not a good reason to accept solutions that are considered inadequate or deprecated in other contexts.

JackStouffer · 2018-03-22T13:03:35Z

.... this disclaimer is not a good reason to accept solutions that are considered inadequate or deprecated in other contexts.

On the contrary, the disclaimer is the very reason we can accept it where in other contexts, others cannot. If arc4random produces more randomized results (and is faster) than the current function when available, it should be of no consequence whether it's not suitable for securing data, as that is expressly not the purpose of std.random. As far as I can tell, arc4random is better than the current solution.

n8sh · 2018-03-22T13:05:46Z

Now, specifically w.r.t. whether we should prefer (a)RC4 where it is available: how does this proposal fit with the fact that RC4 is being phased out in other contexts?

FYI, on a number of platforms arc4random doesn't actually use RC4 but some modern cipher. See the code comments in this PR for details.

joseph-wakeling-sociomantic · 2018-03-22T13:13:59Z

On the contrary, the disclaimer is the very reason we can accept it where in other contexts, others cannot.

If other contexts are abandoning RC4, might it not be a good idea to actually examine what they are preferring?

After all, it's not a great look to be newly adopting something that others are actively abandoning. It might even have consequences for adoption (I wouldn't want to assume that, for example, there might not be new standards banning use of any library that uses RC4). We should at least consider whether there are alternative, straightforwardly better options that can be implemented just as readily.

I am strongly opposed to that disclaimer being used as an excuse to accept changes that deliver less than the best possible solution, given the current state of knowledge.

FYI, on a number of platforms arc4random doesn't actually use RC4 but some modern cipher. See the code comments in this PR for details.

That matches my recollection. FWIW I would personally prefer that, if we do this, we only use arc4random on platforms that are known to have improved implementations.

joseph-wakeling-sociomantic · 2018-03-22T13:18:28Z

BTW, @n8sh, just for clarity: I am really pleased that you are working on unpredictableSeed and trying to improve it. I just think that, given the currently not-great situation of it, we might as well put the work in to try to understand all the possible improvements we could make.

It might well be a good effort/result tradeoff to switch to arc4random on selected platforms, but it's worth being clear what the state of the art is and how much trouble it would be to use/implement.

So, I would encourage you not to constrain yourself by concerns like minimal change, or the disclaimer about non-crypto. I'd rather suggest considering the question: if you were designing unpredictableSeed from scratch, to be as good as it can be, how should it be implemented?

wilzbach · 2018-03-22T13:20:21Z

if you were designing unpredictableSeed from scratch, to be as good as it can be, how should it be implemented?

Like mir-random (that applies for almost everything in std.random).
Imho std.random is an (almost) lost cause.

joseph-wakeling-sociomantic · 2018-03-22T13:20:49Z

So what does mir-random do for its unpredictable seed?

joseph-wakeling-sociomantic · 2018-03-22T13:21:32Z

BTW folks, I can't help but feel that, when one reviewer is raising concerns over a PR, it's rather uncool to just click "Approve" without offering some sort of response to those concerns?

wilzbach · 2018-03-22T13:27:32Z

Reduced this PR just to using arc4random when available. Other cases left for another PR.

Ugh. I am sorry @n8sh. I wanted to leave this PR open for a few days and didn't expect this reaction.
FTR

I liked the other bits of your PR and hope you aren't too demotivated by this discussion.
I agree with the others that unpredictableSeed doesn't need to be cryptographically secure, because nothing in std.random is.

wilzbach · 2018-03-22T13:35:30Z

I can't help but feel that, when one reviewer is raising concerns over a PR, it's rather uncool to just click "Approve" without offering some sort of response to those concerns?

Your concerns have already been addressed by other people. I disagree with your assessment (and I did comment stating this, though that had a bit of a delay from my phone).
Hence, I show my support for these changes by approving them.

So what does mir-random do for its unpredictable seed?

This is/was a preparation for porting unpredictableSeed from mir-random to Phobos. Well, before it was cut down to the current state.
For example, it uses Linux's "new" built-in kernel API:
https://github.com/libmir/mir-random/blob/master/source/mir/random/engine/package.d

wilzbach · 2018-03-22T13:41:40Z

Oh and for the record LLVM's STL uses arc4random for its seeding too: https://github.com/llvm-mirror/libcxx/blob/master/src/random.cpp
(see also the other links and sources posted here: libmir/mir-random#13)

JackStouffer · 2018-03-22T13:49:21Z

@joseph-wakeling-sociomantic If there exist better functions than arc4random for this purpose, then I agree that it would make sense to use those if possible. My issue was that we would be stalling this PR for concerns about security when that's not the purpose of std.random.

Can't this also be addressed in a follow-up PR? This is ready to go and is an improvement over the status quo.

wilzbach · 2018-03-22T13:56:38Z

It might well be a good effort/result tradeoff to switch to arc4random on selected platforms, but it's worth being clear what the state of the art is and how much trouble it would be to use/implement.

For the record, arc4random is still state-of-the-art on BSD-like platforms (as mentioned, C++'s STL uses it too) and is a lot better than combining the thread ID + current timestamp. IIRC only FreeBSD/DragonFlyBSD don't use ChaCha20, but it's an OS API like /dev/urandom and /dev/random, and as with /dev/random you essentially have to trust the OS.

There are many articles online like
http://nshipster.com/random/#why-should-i-use-arc4random(3)-instead-of-rand(3)-or-random(3)? or https://security.stackexchange.com/questions/85601/is-arc4random-secure-enough, but the main point is that neither /dev/(u)random nor arc4random is perfect, but the arc4random system call is faster than the file-based /dev/(u)random API.
This PR is about (step-by-step) moving away from the home-brewed ThreadID + Timestamp seeding approach over which arc4random is obviously better.

wilzbach · 2018-03-22T14:06:14Z

FYI, on a number of platforms arc4random doesn't actually use RC4 but some modern cipher. See the code comments in this PR for details.

(for the record copied over from this excellent SO post):

OpenBSD 5.5 (May 1, 2014) switched arc4random to ChaCha20.
NetBSD 7 (Oct 8, 2015) switched arc4random to ChaCha20.
libbsd 0.8.0 (Nov 30, 2015) switched arc4random to ChaCha20. (libbsd provides arc4random for systems like GNU/Linux.)
macOS 10.12 (Sep 20, 2016) seems to have switched arc4random to AES.
Android (unknown version) switched arc4random to ChaCha20.
Illumos (unknown version) added arc4random with ChaCha20.

Also as pointed out on SO

FreeBSD and DragonFlyBSD are the only ones still using RC4
even LibreSSL uses arc4random on some systems...

joseph-wakeling-sociomantic · 2018-03-22T14:30:57Z

My issue was that we would be stalling this PR for concerns about security when that's not the purpose of std.random.

We are having a discussion about what the options are and why we might pick one or another, which leaves a detailed, documented record of the reasons why an important change like this gets made.

It's important to have that discussion -- and that record -- not only because it means that people can then look back later and understand the reasons for the change, but also because it means that we make sure that the change is really the right one for the project.

For example, arc4random is being preferred over /dev/urandom because it's faster ... but does it actually matter to be as fast as possible to generate an unpredictableSeed? How many unpredictableSeed calls do you want to make in any given application, and is it ever likely to be a bottleneck?

An alternative example: @n8sh remarked earlier that the choice of fallback solution was a compromise to ensure minimal behavioural change compared to existing unpredictableSeed. I'm interested in encouraging him to not feel constrained in that way, but to feel free to offer what he thinks is the best solution.

Can't this also be addressed in a follow-up PR? This is ready to go and is an improvement over the status quo.

Well, it would be nice to avoid code churn. It's arguably preferable to identify what we really think is the best option and make that happen ... than to have a series of contradictory changes scattered across different releases. For example, it doesn't look like the mir-random approach would be that hard to adapt for phobos. If that's viewed as preferable, why go for a halfway house?

BTW please note at no point have I said "This PR should not be merged." What I've asked for is clarity on the pros, cons, and alternatives.

I liked the other bits of your PR and hope you aren't too demotivated by this discussion.

For the record, @n8sh, if you're ever feeling demotivated by anything I say, or that I'm being unfair or creating hassle for you ... just raise the point with me. I don't bite, I do care that you feel positive about contributing, and my intention with any review is to ensure the most effective possible changes for phobos. My impression was that you were enjoying the discussion and welcoming the opportunity to clarify details of the impact of these changes.

@wilzbach note that as a result of this discussion, we now have a much better body of material explaining the context and meaning of this proposed change. That is considerably more helpful (for all of us, and for anyone wanting to understand this code) than just an "agree" or "disagree" based on reasons that only exist in the heads of the people (dis)agreeing.

WebDrake

Switching to my home account to write up a slightly more detailed set of feedback and thoughts.

While I recognize the general point about std.random not guaranteeing cryptographic security, I don't think that's a valid argument for introducing a dependency on a standard that is deprecated by the body that proposed it. But a possible compromise might be as follows:

use arc4random in unpredictableSeed only for version (SecureARC4Random) (i.e. where ChaCha20 or AES is the underlying implementation);
expose arc4random, arc4random_buf and arc4random_uniform publicly for any platform that defines them, so that any user can use them (note, we might want to do this in some other module than std.random given that it's platform-dependent);
ensure the different ARC4Random versions are documented, so the user knows how to check the state of things on their platform.

In this way, the user has a free choice to use arc4random directly regardless of the security level, but unpredictableSeed only uses it where the underlying implementation is not a deprecated/legacy version.

The notes below should give some more detail on some of these points.

WebDrake · 2018-03-22T18:38:17Z

std/random.d

+{
+    extern(C) private @nogc nothrow
+    {
+        uint arc4random() @safe;


Is there any reason why arc4random should not be exposed publicly on platforms that offer it, together with arc4random_buf and arc4random_uniform ... ? Perhaps given that these are system-dependent it might be better not to expose them via phobos, but I don't see any reason per se why they should not be made available.

Assuming it can be made public, I would suggest documenting the SecureARC4Random, LegacyARC4Random and AnyARC4Random versions in its documentation, so that the user knows how to check for themselves what quality standards it meets for their platform.

Is there any reason why arc4random should not be exposed publicly on platforms that offer it, together with arc4random_buf and arc4random_uniform ... ?

This PR has been edited not to introduce any new public symbols because new public symbols require @andralex's approval. See discussion: #6267 (comment)

(If we wanted to be more D-esque, the arc4random_buf implementation might be provided as arc4random_buf(void[] buf) and invoke the C API version underneath the hood; or that could be provided as an additional overload.)

This PR has been edited not to introduce any new public symbols because new public symbols require @andralex's approval. See discussion: #6267 (comment)

Fair enough, but note that this does not mean we cannot pursue the compromise outlined above: we'd just have to split between the changes to unpredictableSeed and the introduction of public arc4random functions.

In the event of serious hassle over the latter, we could always revisit the question of which platforms get the under-the-hood arc4random unpredictableSeed.

mir-random does something along those lines, but abstracts away the difference between platforms.

size_t genRandomNonBlocking()(scope ubyte[] buffer) @nogc nothrow @trusted { pragma(inline, true); return mir_random_genRandomNonBlocking(buffer.ptr, buffer.length); }

mir_random_genRandomNonBlocking (the awkward name is for C linkage) could be calling arc4random_buf, or it could be using Linux's getrandom syscall, etc. I think that's the right kind of interface.

Yep, introducing new public symbols is a tricky business.
And even if we get an approval, it should probably go to std.experimental.random as experience shows that it's really hard to get all use-cases right.

we wanted to be more D-esque,

There are a lot of things that could be done, e.g. ubyte[] (and of course camelCase), but I doubt that a user is ever going to care about the underlying details. What they care about is:

cross-platform

blocking/non-blocking

@nogc vs. GC

entropy quality

speed

Sometimes these concerns can be combined, but they don't want to special-case their code just because their code might be used on OSX too.

Is there any reason why arc4random should not be exposed publicly on platforms that offer it,

See reasons above + it's really easy to do so yourself as a user if you really want to.

introduction of public arc4random functions.

BTW declaration of extern(C) functions is done in DRuntime. Phobos is supposed to contain the cross-platform high-level abstractions.

That's what I assumed, hence the remarks about not exposing them via phobos.

WebDrake · 2018-03-22T19:16:41Z

std/random.d

+and in some cases may harm it. For an extreme example the Mersenne
+Twister has `2 ^^ 19937 - 1` distinct states but after `seed(uint)` is
+called it can only be in one of `2 ^^ 32` distinct states regardless of
+how excellent the source of entropy is.


This is a great piece of documentation, which I would suggest extending with discussion on how to more effectively seed generators (for example, MersenneTwisterEngine exposes a method to seed the generator using an InputRange of random words).

However, I would suggest not coupling it with these changes (it's independent of them) and submitting it in a separate PR.

While separating PRs of course has advantages, it also comes with the additional overhead of creating and reviewing them.
So it's really ok to combine doc updates of the function currently touched in a PR. No need for the extra work.

WebDrake · 2018-03-22T19:24:07Z

std/random.d

-    static bool seeded;
-    static MinstdRand0 rand;
-    if (!seeded)
+    version (AnyARC4Random)


As a conservative first option, I would suggest only using arc4random for version (SecureARC4Random).

The consideration here is that in any case we still need to do something to improve unpredictableSeed for platforms that don't define arc4random. Platforms that have only a legacy arc4random may well benefit more from sharing that longer-term solution, than from the legacy arc4random. If we just switch now, the risk is that when the better solution is there, they don't get it, because everyone forgets that the legacy arc4random question needs revisiting.

I think it's worth keeping the pressure to do better ourselves, than just outsourcing to a platform implementation that uses a deprecated standard.

The consideration here is that in any case we still need to do something to improve unpredictableSeed for platforms that don't define arc4random.

These other solutions will also be platform-specific. There is nothing on the immediate horizon better than arc4random that will be usable on FreeBSD.

These other solutions will also be platform-specific. There is nothing on the immediate horizon better than arc4random that will be usable on FreeBSD.

OK, fair enough.

Would you be OK with the idea of breaking out the dangers-of-unwise-seeding doc into a separate patch (just to separate out the concerns)? Otherwise, given the case you've made, I'm fine to move forward with this.

Would you be OK with the idea of breaking out the dangers-of-unwise-seeding doc into a separate patch (just to separate out the concerns)?

Imho there's no need for this. PRs can have doc updates too, this is already a tiny PR and it's related to this PR. This has never been an official criterion for a PR - it's only good practice to keep the focus of a PR small and to a single module only.

Imho there's no need for this. PRs can have doc updates too, this is already a tiny PR and it's related to this PR.

You'll note I said separate patch in the comment you are responding to. (I did make an earlier remark about separate PR, but that was before responses that made clear that this PR should go forward essentially as-is.)

WebDrake · 2018-03-22T19:25:29Z

std/random.d

+// cryptographically secure sources of randomness are needed.
+
+// Performance note: ChaCha20 is about 70% faster than ARC4, contrary to
+// what one might assume from it being more secure.


Minor note/query: I believe at least some /dev/urandom implementations use ChaCha20. How does this impact the question of speed of /dev/urandom versus arc4random ... ? Is it worth considering as a factor?

WebDrake · 2018-03-22T19:35:07Z

std/random.d

+// of randomness, and also so other people reading this source code (as
+// Phobos is often looked to as an example of good D programming practices)
+// do not mistakenly use insecure versions of arc4random in contexts where
+// cryptographically secure sources of randomness are needed.


Oh, and one note that my review missed: this discussion comment is great, but I would suggest moving it inside the unpredictableSeed function, where it is most relevant. Assuming that arc4random might be made public, some of the discussion points could be covered in its documentation instead.

n8sh · 2018-03-22T21:17:34Z

Well, it would be nice to avoid code churn. It's arguably preferable to identify what we really think is the best option and make that happen ... than to have a series of contradictory changes scattered across different releases.

The roadmap forward would be adding more version blocks, so each subsequent PR only adds new cases rather than deleting or reversing previous PRs. Not removing MinstdRand0 in this PR means that there will have to be a little churn when it is replaced because those lines were touched by this PR as an indentation change, but besides that the structure of the code still lends itself to incremental upgrades.

Can't this also be addressed in a follow-up PR? This is ready to go and is an improvement over the status quo.

I second this sentiment.

wilzbach · 2018-03-22T21:47:07Z

FYI: I now put this PR on the merge-queue. Thanks again @n8sh for the hard work!
@joseph-wakeling-sociomantic @WebDrake sorry for being a bit harsh. You are right that what is obvious to one person (as it's "in the head") often isn't obvious to others, so thanks a lot for all your comments and help!
I did put the PR on the merge queue because we have never really been picky about documentation concerns of a PR: once the code has been added, everyone (and not only the PR author) can improve the documentation.

This is a great piece of documentation, which I would suggest extending with discussion on how to more effectively seed generators

This doesn't block this PR - documentation updates can always follow-up after this has been merged ;-)

WebDrake · 2018-03-22T21:50:22Z

I did put the PR on the merge queue because we have never really been picky about documentation concerns of a PR: once the code has been added, everyone (and not only the PR author) can improve the documentation.

No worries, your call ;-) I tend to encourage people to take every opportunity to separate out the concerns of a changeset, just to be clear what really goes together; but of course there are bigger fish to fry.

Thanks @n8sh for a nice piece of work, and for all your patience and effort responding to questions.

n8sh requested a review from wilzbach as a code owner March 12, 2018 10:14

dlang-bot added the Enhancement label Mar 12, 2018

wilzbach reviewed Mar 12, 2018

n8sh force-pushed the unpredictableSeedOf-arc4random branch from 74ef55f to 7ba2c23 Compare March 12, 2018 10:52

n8sh changed the title ~~Fix Issue 18595 & Issue 18596: add unpredictableSeedOf!UIntType and unpredictableSeed could use something better than MinstdRand0~~ Fix 18596: unpredictableSeed could use something better than MinstdRand0 Mar 12, 2018

wilzbach reviewed Mar 12, 2018

n8sh force-pushed the unpredictableSeedOf-arc4random branch 2 times, most recently from 15e5ec5 to bfaf31c Compare March 12, 2018 11:04

n8sh force-pushed the unpredictableSeedOf-arc4random branch from bfaf31c to 84287ed Compare March 12, 2018 11:28

n8sh force-pushed the unpredictableSeedOf-arc4random branch from 84287ed to 5eab123 Compare March 12, 2018 13:53

n8sh force-pushed the unpredictableSeedOf-arc4random branch 5 times, most recently from a3ce0b0 to 9eea216 Compare March 12, 2018 14:58

andralex reviewed Mar 19, 2018

n8sh force-pushed the unpredictableSeedOf-arc4random branch from 9eea216 to 2e12e1e Compare March 19, 2018 04:53

joseph-wakeling-sociomantic reviewed Mar 21, 2018

Commit f39686c: Partially Fix Issue 18596: use arc4random when available for unpredictableSeed

n8sh force-pushed the unpredictableSeedOf-arc4random branch from 2e12e1e to f39686c Compare March 22, 2018 07:59

n8sh changed the title ~~Fix 18596: unpredictableSeed could use something better than MinstdRand0~~ Partially Fix Issue 18596: use arc4random when available for unpredictableSeed Mar 22, 2018

JackStouffer approved these changes Mar 22, 2018

wilzbach approved these changes Mar 22, 2018

WebDrake reviewed Mar 22, 2018

wilzbach added the auto-merge label Mar 22, 2018

dlang-bot merged commit b87d28f into dlang:master Mar 22, 2018

n8sh mentioned this pull request Mar 30, 2018

Fix Issue 18595 & Issue 18596: add unpredictableSeed!UIntType, support non-arc4random entropy sources, & replace MindstdRand0 with SplitMix #6388

Closed
