Support generation of strong random numbers #1372

g-andrade · 2017-03-12T17:25:39Z

This PR proposes two new additions to the crypto module, both named strong_rand_uniform, for effortless generation of cryptographically secure numbers:

crypto:strong_rand_uniform/0: generates floats on the open interval ]0.0, 1.0[
crypto:strong_rand_uniform/1: generates integers on an arbitrary closed interval [1, N]

These follow the same interfaces as rand:uniform/0 and rand:uniform/1, and both use OpenSSL's BN_rand_range method.

Generated floating point values are limited to an effective entropy of up to 51 bits but are expected to be uniformly distributed between 0.0 and 1.0.

Supersedes #1363.

g-andrade · 2017-03-14T23:55:51Z

A conflict had popped up in the meantime, in crypto.c; I've rebased over master and it's ok now.

RaimoNiskanen · 2017-03-17T10:54:04Z

I have finally had the time to think through this PR, and think we should adapt it more to be a rand plugin, plus change rand to actually allow plugins. Thereby we get a uniform API for different random generators and also normal-distribution strong random floats for free.

By the way, do you have an actual use case for strong random integers?

We are phasing out the use of mpint's. Use plain binaries and get_bn_from_bin() instead.

As building blocks we need most of your suggested functions, but I suggest:

Make strong_rand_uniform_nif/2 reflect the backend libcrypto function BN_rand_range(): rename it to strong_rand_range_nif(Range :: binary()) -> binary(), which uses integers in binaries rather than mpint's, takes only the Range width and returns an integer in [0 .. Range-1]. This reduces the BIGNUM handling in the C code, and makes a more flexible building block.
Create a function strong_rand_range(Range :: integer() | binary()) -> binary() that calls the NIF above and also returns [0 .. Range-1] in a binary.
Create a new NIF strong_rand_float_nif/0 that calls BN_rand(p_rnd, 52, -1, 0), then uses BN_bn2bin() to get the bytes, be64toh() from endian.h to get the integer, and then constructs the IEEE double in C to return it via enif_make_double() after subtracting 1.0. This should optimize generation of floats, since BN_rand should be better at power of 2 ranges than BN_rand_range, and constructing the double as well as subtracting 1.0 should be faster in C than in Erlang at roughly the same code size. This may be premature optimization, as the strong_rand_uniform/0 function you already have would do just fine if renamed to strong_rand_float/0 and fixed so that it can return 0.0.
Create a wrapper function strong_rand_float() -> float() that calls the above NIF to generate a random float in the range [0.0 .. 1.0), that is including 0.0 but excluding 1.0. It seems the corresponding function in the deprecated random module is ambiguously documented and in the rand module incorrectly so. The latter can return 0.0.
I do not know if strong_rand_range/1 and strong_rand_float/0 should be documented, nor exported, since our intention now is to call them via the rand module and the interface below. But maybe they are useful enough on their own to be documented...
Create exported and documented seed generators for the rand module: rand_seed() -> State :: rand:state() and rand_seed_s() -> State :: rand:state(). Where State is {AlgHandler,0}, AlgHandler = #{type => crypto, max => infinity, next => fun crypto:strong_rand_next/1, uniform => fun crypto:strong_rand_uniform/1, uniform_n => fun crypto:strong_rand_uniform/2, jump => fun crypto:strong_rand_jump/1}.
Create exported plugin functions (that ignore the seed) to be called from the rand module:
- strong_rand_next(Seed) -> {bytes_to_integer(strong_rand_range(1 bsl 64)),Seed}
- strong_rand_uniform({_,_} = State) -> {strong_rand_float(),State}
- strong_rand_uniform(Max, {_,_} = State) -> {bytes_to_integer(strong_rand_range(Max)) + 1,State}
- strong_rand_jump({_,_} = State) -> State
To actually allow for plugins, open up the types in the rand module: rand:state() and rand:export_state() should no longer be -opaque. This probably means that the types rand:alg(), rand:alg_seed() and rand:alg_handler() need to be exported as well, and that they must be generalized to e.g. rand:alg() :: rand_alg() | atom(), where the rand module internally should use rand_alg() instead of today's alg().

To use this you call crypto:rand_seed() and after that R = rand:uniform(65536), or, if you do not want the process dictionary magic, S0 = crypto:rand_seed_s() and after that {R,S1} = rand:uniform_s(65536, S0), where the S0..Sn weaving is just to please the API.
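A minimal sketch of the two calling styles described above, assuming the seed generators end up exported as crypto:rand_seed/0 and crypto:rand_seed_s/0 as proposed here. The module name strong_usage and the function demo/0 are hypothetical and only serve the illustration.

```erlang
-module(strong_usage).
-export([demo/0]).

%% Draw one integer through the process dictionary API and one through
%% the functional API, both backed by the crypto plugin.
demo() ->
    %% Seed the calling process; the returned state is not needed here.
    _ = crypto:rand_seed(),
    R1 = rand:uniform(65536),
    %% Functional variant: the state is passed and returned explicitly,
    %% even though the plugin ignores it.
    S0 = crypto:rand_seed_s(),
    {R2, _S1} = rand:uniform_s(65536, S0),
    {R1, R2}.
```

Both values fall in [1, 65536], following the rand:uniform/1 convention.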

g-andrade · 2017-03-18T18:04:53Z

> By the way, do you have an actual use case for strong random integers?

I reckon it's something that has been missing for some time, and I find it as useful as having strong random bytes. One could argue that a strong random byte generator (e.g. strong_rand_bytes) is enough to derive randomness for any other data type, but then the case for e.g. floats becomes particularly tricky, as it's very easy to take a naive (but wrong) approach. If the standard library provides it, fewer people will fall into this pitfall in the future.

I've implemented most of your suggestions, with the notable exception of exposing the 'crypto rand plugin' functions; keeping them internal is consistent with the corresponding internal rand module functions for the built-in algorithms. Besides, is it expected that people hot-swap the crypto module at runtime? In any case, I don't mind doing it differently.

As for the crypto:strong_rand_float function, I didn't NIF-ize it yet, as I would first like to know whether you think the current solution is going in the right direction.

RaimoNiskanen · 2017-03-20T11:19:47Z

lib/crypto/c_src/crypto.c

+    }
+
+    bn_rand = BN_new();
+    if (BN_rand_range(bn_rand, bn_range) != 1) {


Should be if (! BN_rand_range(bn_rand, bn_range)) { since the return value of BN_rand_range() is a boolean, not a numerical value.

g-andrade · Mar 22, 2017

Mmmmh, I considered that (as it's a very common pattern), but the documentation explicitly states that either '0' or '1' shall be returned for failure or success, respectively.
It would still work after that change, but I worry that it could suddenly behave unexpectedly if, say, the interface is one day extended and starts returning '2' or '0xBEEF' to signal something else entirely.

RaimoNiskanen · Mar 23, 2017

All right, I found other functions in the OpenSSL documentation that return 0 or 1 like this one, and -1 if not implemented, so keep the != 1.

RaimoNiskanen · 2017-03-20T11:21:13Z

lib/crypto/doc/src/crypto.xml

+
+        <p><em>Example</em></p>
+        <pre>
+crypto:rand_seed(),


_ = crypto:rand_seed(), to be more Dialyzer friendly

Fixed on 1f236ff

RaimoNiskanen · 2017-03-20T11:22:07Z

lib/crypto/doc/src/crypto.xml

+        <pre>
+crypto:rand_seed(),
+_IntegerValue = rand:uniform(42), % [1; 42]
+_FloatValue = rand:uniform().     % [0.0; 1.0]</pre>


The range should be % [0.0; 1.0[

Fixed on 6f6c478

RaimoNiskanen · 2017-03-20T14:32:46Z

lib/stdlib/src/rand.erl

 seed(Alg) ->
    seed_put(seed_s(Alg)).

-spec seed_s(AlgOrExpState::alg() | export_state()) -> state().
+-spec seed_s(AlgOrStateOrExpState::builtin_alg() | state() | export_state()) -> state().
 seed_s(Alg) when is_atom(Alg) ->


To minimize the use of guards maybe reorder these into:

seed_s({AlgHandler,_Seed}) when is_map(AlgHandler) ->
seed_s({Alg0,Seed}) ->
seed_s(Alg) ->

Then we rely on alg_handler() being a map and do not depend on alg() being an atom, since it could be possible to widen the alg() type one day...
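The effect of that clause ordering can be sketched with a stand-in function. This is not the rand module's code: the module clause_order and the function kind/1 are hypothetical, and the returned atoms only label which clause matched.

```erlang
-module(clause_order).
-export([kind/1]).

%% Most specific clause first: only the state tuple needs a guard.
kind({AlgHandler, _Seed}) when is_map(AlgHandler) -> state;
%% Any other 2-tuple is treated as an exported state.
kind({_Alg, _Seed}) -> export_state;
%% Everything else is taken to be an algorithm name, atom or not.
kind(_Alg) -> algorithm.
```

With this order the last clause needs no is_atom/1 guard, so widening the algorithm type later would not require touching it.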

Fixed on 195edd9

RaimoNiskanen · 2017-03-20T14:33:14Z

lib/stdlib/src/rand.erl

+%% Algorithm state
+-type state() :: {alg_handler(), alg_seed()}.
+-type builtin_alg() :: exs64 | exsplus | exs1024.
+-type alg() :: builtin_alg() | term().


-type alg() :: builtin_alg() | atom()

Fixed on 54b89c8

RaimoNiskanen · 2017-03-20T14:58:53Z

I also think this is an obviously missing feature, but my colleagues sometimes point out that it is a thin argument...

Floats are tricky to get right. Our current implementation in the rand module has a strange distribution of the returned numbers, in that the smaller the numbers, the shorter the distance between them. Uniform ranges are also hard to get right. The rand implementation as of today can produce a bad distribution for big ranges.

We will have to fix that for the rand module. But for strong random numbers the distribution has to be good. Therefore I think this PR is a valuable contribution.

The state of this PR looks very good, not exactly like I said but just as I wanted it! So this is definitely the right direction. A few nitpicks above.

The reason I want to use export entry funs (e.g fun crypto:rand_plugin_uniform/1) as plugin interface is that it is possible to upgrade the crypto application. And if you do that a process in the system that holds a reference to the crypto funs would get killed. Therefore it feels safer to have them as internally exported, undocumented, and called as export entry funs.
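The distinction between the two kinds of fun can be observed with erlang:fun_info/2. A minimal sketch, in which the module fun_kinds and its functions are hypothetical and unrelated to the crypto code:

```erlang
-module(fun_kinds).
-export([demo/0, helper/1]).

helper(X) -> X.

%% Build one export entry fun and one local fun for the same function
%% and report how the runtime classifies each of them.
demo() ->
    External = fun ?MODULE:helper/1, % looked up by name at call time
    Local = fun helper/1,            % tied to this version of the module
    {erlang:fun_info(External, type), erlang:fun_info(Local, type)}.
```

The first element reports an external fun and the second a local one. An external fun is looked up by module and function name when it is called, which is why it keeps working across an upgrade of the module it points to.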

I really do not know if crypto:strong_rand_{range,float} should be exported and documented or not. What do you think?

RaimoNiskanen · 2017-03-21T14:52:13Z

lib/crypto/src/crypto.erl

+    end.
+strong_rand_range_nif(_BinRange) -> ?nif_stub.
+
+strong_rand_float() ->


I just got an idea. Wouldn't this produce exactly the same distribution of numbers?

strong_rand_float() ->
    BinFraction = strong_rand_range(1 bsl 53),
    bytes_to_integer(BinFraction) / 9007199254740992.0. % math:pow(2, 53)

If that is true it is much faster and there would probably be no need for a NIF.
I also want to use it in the rand module, unless someone proves me wrong.

Indeed, I ended up rewriting it using a similar approach based on both your and @okeuday 's suggestions.

okeuday · 2017-03-21T16:44:02Z

@RaimoNiskanen Yeah, that approach works. It has been in quickrand for awhile (here):

strong_float() ->
    % 53 bits maximum for double precision floating point representation
    % erlang:round(53.0 / 8) == 7 bytes for random number
    <<I:56/integer>> = crypto:strong_rand_bytes(7),
    I / ?BITS56. % scaled by maximum random number (2 ^ (7 * 8)) - 1

RaimoNiskanen · 2017-03-22T10:18:38Z

@okeuday: I see (fairly certainly) two problematic details with that code, the first is the same "error" as in the current rand module:

I / ?BITS56 feeds 56 random bits into the division, so if the top 1..3 bits are zero we still have 55..52 random bits. This causes numbers in the interval [0.5; 1.0[ to get the distance 2^-53, in [0.25; 0.5[ the distance 2^-54, and in [0.125; 0.25[ the distance 2^-55. They are not equidistant over [0.0; 1.0[.
Since the division is by ?BITS56 (16#FFFFFFFFFFFFFF), not 2^56, there will probably be strange rounding artifacts in the produced range. In addition, 1.0 will be part of the generated range, which I think is wrong for a random range function, since all the ones I have seen include the lower bound and exclude the upper.

Therefore I suggest masking to 53 bits before the division and dividing with 2.0^53 which should make the resulting numbers equidistant (2.0^-53) over [0.0; 1.0[.
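The upper bounds of the two approaches can be checked directly. A minimal sketch, where the module float_bounds and its function names are hypothetical:

```erlang
-module(float_bounds).
-export([max_56/0, max_53/0]).

%% Largest result of I / ?BITS56 when ?BITS56 is 16#FFFFFFFFFFFFFF:
%% the largest 56-bit integer divided by itself.
max_56() ->
    Max = 16#FFFFFFFFFFFFFF,
    Max / Max.

%% Largest result after masking to 53 bits and dividing by 2.0^53.
max_53() ->
    Max = (1 bsl 53) - 1,
    Max / 9007199254740992.0.
```

max_56/0 evaluates to exactly 1.0, so the upper bound is reachable, while max_53/0 stays one step of 2^-53 below 1.0.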

The suggested binary syntax solution also produces equidistant numbers, but with distance 2.0^-52 since subtracting 1.0 shifts in a zero lowest bit, so that bit is not random, which is unfortunate.

The division can also be optimized:

strong_rand_float() ->
    BinFraction = strong_rand_range(1 bsl 53),
    bytes_to_integer(BinFraction) * math:pow(2, -53).

Floating point multiplication should be faster than division, and the compiler evaluates math:pow(2, -53) as a constant, which surprised me. It may be safer to replace it with 1.11022302462515657e-16 in a descriptively named macro.
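For readers who want to try the idea without the internal strong_rand_range/1, here is a minimal sketch built only on the exported crypto:strong_rand_bytes/1. It is an assumption-laden variant, not the code from this PR: it discards 3 of the 56 random bits instead of calling BN_rand_range, and the module, function and macro names are hypothetical.

```erlang
-module(strong_float_sketch).
-export([strong_float/0]).

-define(TWO_POW_MINUS_53, 1.1102230246251565e-16). % math:pow(2, -53)

%% Returns a float in the half-open interval [0.0, 1.0).
strong_float() ->
    %% 7 random bytes hold 56 bits; skip 3 and keep 53 as an integer.
    <<_:3, I:53>> = crypto:strong_rand_bytes(7),
    %% Every 53-bit integer is exactly representable as a double and the
    %% scale factor is a power of two, so no rounding takes place.
    I * ?TWO_POW_MINUS_53.
```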

okeuday · 2017-03-22T17:52:24Z

@RaimoNiskanen Using 56 bits instead of 53 bits should not be a problem here due to 56 bits having a random value that is easier to think of as an integer in the range of [0..72057594037927935] which has uniform distribution due to crypto:strong_rand_bytes/1, assigned to the integer I. While the division I / ?BITS56 may be rounded differently based on the hardware implementation of IEEE754 (e.g., a double rounding that occurs with an extended-based system that stores into a double-precision value when compared to a single/double system) the result should remain uniformly distributed in the range [0.0 .. 1.0] despite the potential variation in different double precision rounding with different hardware.

I believe the range [0.0 .. 1.0] is more useful for various math when compared to the range [0.0 .. 1.0[ and it is my expectation that the range [0.0 .. 1.0[ is a more popular implementation choice for source code that generates random floating point values simply due to the dependence on the IEEE754 double precision binary format with the assignment of 52 bits of randomness, as was done in this pull request.

I agree that using a multiplication instead of a division is better and a good change for efficiency. I also agree that a macro for the value of math:pow(2, -53) seems safer, though it may only matter for older versions of Erlang (1.11022302462515657e-16 is a machine epsilon value for binary64 though the float.h DBL_EPSILON is the more typical math:pow(2, -52) machine epsilon value).
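The relation between the two epsilon values mentioned above can be verified with a few comparisons. A minimal sketch, with a hypothetical module name eps_check and hypothetical function names:

```erlang
-module(eps_check).
-export([same_constant/0, absorbed/0, visible/0]).

%% The decimal literals quoted in this thread and math:pow(2, -53)
%% all denote the same double.
same_constant() ->
    P = math:pow(2, -53),
    (P =:= 1.11022302462515657e-16) andalso (P =:= 1.1102230246251565e-16).

%% Adding 2^-53 to 1.0 is rounded away (round to nearest, ties to even).
absorbed() ->
    1.0 + math:pow(2, -53) =:= 1.0.

%% Adding 2^-52, the DBL_EPSILON value, reaches the next double after 1.0.
visible() ->
    1.0 + math:pow(2, -52) > 1.0.
```

All three functions return true on an IEEE754 binary64 platform.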

g-andrade · 2017-03-22T22:24:27Z

@RaimoNiskanen ,

> The reason I want to use export entry funs (e.g fun crypto:rand_plugin_uniform/1) as plugin interface is that it is possible to upgrade the crypto application. And if you do that a process in the system that holds a reference to the crypto funs would get killed. Therefore it feels safer to have them as internally exported, undocumented, and called as export entry funs.

Fix pushed.

> I really do not know if crypto:strong_rand_{range,float} should be exported and documented or not. What do you think?

I think keeping them out of sight would lead to more elegant use of the functionality, as it would provide people with a single, consistent solution - "use the rand plugin" - while at the same time keeping the crypto interface as slim as it can be.

g-andrade · 2017-03-22T22:35:05Z

As for the alternative approaches to generating uniform numbers over [0.0, 1.0] / [0.0, 1.0[ - very interesting brain food. I've pushed this:

-define(HALF_DBL_EPSILON, 1.1102230246251565e-16). % math:pow(2, -53)

strong_rand_float() ->
    WholeRange = strong_rand_range(1 bsl 53),
    ?HALF_DBL_EPSILON * bytes_to_integer(WholeRange).

Which should generate numbers over the half-open [0.0, 1.0[ interval. If the closed interval is to be preferred, I reckon generating the random integer up to 2**53 should do the job (it being a power of two, there should be no loss of precision.)

RaimoNiskanen · 2017-03-23T16:06:42Z

@g-andrade: Looks good! I prefer the half-open interval, partly because the integer range function and most other range functions I have seen use half-open intervals, and partly for the reason given in the last paragraph below. I agree that extending the strong rand range to ((1 bsl 53) + 1) would be the right way to close the interval.

@okeuday
Using a 53-bit integer and dividing by 2^53 avoids rounding, since all such integers have an exact representation as IEEE754 doubles. Using more bits and a larger 2^N divisor causes rounding. Using a 2^N - 1 divisor also causes rounding.

The resulting numbers after rounding are still uniformly distributed, but not evenly spaced, since the distance between two adjacent numbers varies over the range. It is true that every subrange of [0.0..1.0) of a given width sufficiently larger than the machine epsilon has the same probability, but every possible number is not equally probable. I think that is annoying. See also http://xoroshiro.di.unimi.it/ "Generating uniform doubles in the unit interval" for a discussion.

I also think that the half open range [0.0..1.0) is more useful than the closed range since then you can e.g generate one set of numbers in [0.0..1.0) another set in [1.0..2.0) and join them without getting a probability spike for 1.0.

okeuday · 2017-03-23T19:45:01Z

@RaimoNiskanen Thank you for the reference. I have switched my code to use only 53 bits to avoid rounding and it can remain an alternative for the [0.0 .. 1.0] range.

RaimoNiskanen · 2017-03-24T09:11:58Z

@okeuday

Just to underline again: using a non-2^N divisor will also cause rounding.

To avoid rounding for the [0.0 .. 1.0] range one should produce a random number in the range [0 .. 2^53] and then divide by 2.0^53, i.e. the integer range should contain the upper bound so you can use a 2^N divisor. But then you need to produce a random integer on a range of size 1+2^N rather than 2^N, which is cumbersome but supported by libcrypto's BN_rand_range.
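A minimal sketch of that closed-interval variant, assuming the plugin is available through crypto:rand_seed_s/0 as discussed in this thread and that rand:uniform_s/2 accepts a range of 2^53 + 1. The module closed_float and its function are hypothetical names.

```erlang
-module(closed_float).
-export([strong_float_closed/0]).

%% Returns a float in the closed interval [0.0, 1.0].
strong_float_closed() ->
    S0 = crypto:rand_seed_s(),
    %% rand:uniform_s/2 returns an integer in [1, N], so with
    %% N = 2^53 + 1 the value I - 1 lies in [0, 2^53].
    {I, _S1} = rand:uniform_s((1 bsl 53) + 1, S0),
    %% Integers up to 2^53 are exact doubles and 2^-53 is a power of
    %% two, so the scaling introduces no rounding.
    (I - 1) * math:pow(2, -53).
```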

RaimoNiskanen · 2017-04-03T10:31:29Z

I will add a cleanup commit and run it once more in the daily tests. Therefore removing the 'testing' label, which may be confusing....

g-andrade mentioned this pull request Mar 12, 2017

Support random floating point number generation #1363

Closed

IngelaAndin added the "team:PS" (Assigned to OTP team PS) and "feature" labels Mar 12, 2017

IngelaAndin assigned RaimoNiskanen Mar 14, 2017

IngelaAndin added the "testing" label (currently being tested, tag is used by OTP internal CI) Mar 14, 2017

IngelaAndin assigned IngelaAndin and unassigned IngelaAndin Mar 14, 2017

Support generation of strong random numbers

d07008a

g-andrade force-pushed the crypto/strong_random_numbers branch from ce3e998 to d07008a Compare March 14, 2017 23:54

Restyle crypto strong numeric generators

e50f63f

for usage in rand

g-andrade added 2 commits March 18, 2017 18:06

Support cryptographically strong rand plugin

77039e6

No longer expose strong_rand_(range|float)

5eae0da

g-andrade force-pushed the crypto/strong_random_numbers branch from aee9118 to 5eae0da Compare March 18, 2017 18:06

RaimoNiskanen reviewed Mar 20, 2017

RaimoNiskanen removed the "testing" label (currently being tested, tag is used by OTP internal CI) Mar 21, 2017

RaimoNiskanen reviewed Mar 21, 2017

g-andrade added 3 commits March 22, 2017 21:48

fixup! Support cryptographically strong rand plugin

1f236ff

Be friendlier to Dialyzer

fixup! Support cryptographically strong rand plugin

6f6c478

Fix documented range (interval is half-open.)

fixup! Support cryptographically strong rand plugin

54b89c8

Fix plugin alg type

g-andrade added 3 commits March 22, 2017 21:51

fixup! Support cryptographically strong rand plugin

195edd9

Minimize use of guards.

Allow for crypto upgrades when using rand plugin

ec1e5bc

Attempt faster approach to strong random floats

c84e541

RaimoNiskanen added the "testing" label (currently being tested, tag is used by OTP internal CI) Mar 24, 2017

RaimoNiskanen removed the "testing" label (currently being tested, tag is used by OTP internal CI) Apr 3, 2017

RaimoNiskanen merged commit c84e541 into erlang:master Apr 4, 2017
