-
Notifications
You must be signed in to change notification settings - Fork 2.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support generation of strong random numbers #1372
Support generation of strong random numbers #1372
Conversation
ce3e998
to
d07008a
Compare
A conflict had popped up in the meantime, in |
I have finally had the time to think through this PR, and think we should adapt it more to be a By the way, do you have an actual use case for strong random integers? We are phasing out the use of As building blocks we need most of your suggested functions, but I suggest:
To use this you call |
for usage in rand
I reckon it's something that has been missing for some time, and I find it as useful as having strong random bytes; now, one could argue a strong random byte generator (e.g. I've implemented most of your suggestions, with the notable exception of not having the 'crypto rand plugin' functions' exposed, as it's consistent with the corresponding internal As for the |
aee9118
to
5eae0da
Compare
} | ||
|
||
bn_rand = BN_new(); | ||
if (BN_rand_range(bn_rand, bn_range) != 1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be if (! BN_rand_range(bn_rand, bn_range)) {
since the return value of BN_rand_range()
is a boolean, not a numerical value
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mmmmh, I considered that (as it's a very common pattern), but the documentation explicitly states that either '0' or '1' shall be returned for failure or success, respectively.
It would still work after that change, but I worry whether it could suddenly behave unexpectedly if, let's say, one day the interface gets extended and it starts returning '2' or '0xBEEF' to signal something else entirely?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Allright, i found other functions in the OpenSSL documentation that return 0 or 1 as this one, and -1 if not implemented, so keep the != 1.
lib/crypto/doc/src/crypto.xml
Outdated
|
||
<p><em>Example</em></p> | ||
<pre> | ||
crypto:rand_seed(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_ = crypto:rand_seed(),
to be more Dialyzer friendly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed on 1f236ff
lib/crypto/doc/src/crypto.xml
Outdated
<pre> | ||
crypto:rand_seed(), | ||
_IntegerValue = rand:uniform(42), % [1; 42] | ||
_FloatValue = rand:uniform(). % [0.0; 1.0]</pre> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The range should be % [0.0; 1.0[
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed on 6f6c478
lib/stdlib/src/rand.erl
Outdated
seed(Alg) -> | ||
seed_put(seed_s(Alg)). | ||
|
||
-spec seed_s(AlgOrExpState::alg() | export_state()) -> state(). | ||
-spec seed_s(AlgOrStateOrExpState::builtin_alg() | state() | export_state()) -> state(). | ||
seed_s(Alg) when is_atom(Alg) -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To minimize the use of guards maybe reorder these into:
seed_s({AlgHandler,_Seed}) when is_map(AlgHandler) ->
seed_s({Alg0,Seed}) ->
seed_s(Alg) ->
Then we rely on alg_handler()
being a map and does not use that alg()
must be an atom, since it could be possible to widen the alg()
type one day...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed on 195edd9
lib/stdlib/src/rand.erl
Outdated
%% Algorithm state | ||
-type state() :: {alg_handler(), alg_seed()}. | ||
-type builtin_alg() :: exs64 | exsplus | exs1024. | ||
-type alg() :: builtin_alg() | term(). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-type alg() :: builtin_alg() | atom()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed on 54b89c8
I also think this is an obviously missing feature, but my colleagues sometimes point out that it is a thin argument... Floats are tricky to get right. Our current implementation in the We will have to fix that for the The state of this PR looks very good, not exactly like I said but just as I wanted it! So this is definitely the right direction. A few nitpicks above. The reason I want to use export entry funs (e.g I really do not know if |
end. | ||
strong_rand_range_nif(_BinRange) -> ?nif_stub. | ||
|
||
strong_rand_float() -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just got an idea. Wouldn't this produce exactly the same distribution of numbers?
strong_rand_float() ->
BinFraction = strong_rand_range(1 bsl 53),
bytes_to_integer(BinFraction) / 9007199254740992.0. % math:pow(2, 53)
If that is true it is much faster and there would probably be no need for a NIF.
I also want to use it in the rand
module, unless someone proves me wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, I ended up rewriting it using a similar approach based on both your and @okeuday 's suggestions.
@RaimoNiskanen Yeah, that approach works. It has been in
|
@okeuday: I see (fairly certainly) two problematic details with that code, the first is the same "error" as in the current
Therefore I suggest masking to 53 bits before the division and dividing with 2.0^53 which should make the resulting numbers equidistant (2.0^-53) over [0.0; 1.0[. The suggested binary syntax solution also produces equidistant numbers, but with distance 2.0^-52 since subtracting 1.0 shifts in a zero lowest bit, so that bit is not random, which is unfortunate. The division can also be optimized: strong_rand_float() ->
BinFraction = strong_rand_range(1 bsl 53),
bytes_to_integer(BinFraction) * math:pow(2, -53). Floating point multiplication should be faster than division and the compiler constant evaluates math:pow(2, -53). For me it was a surprise that it does so. Maybe safer to replace with 1.11022302462515657e-16, in a descriptive macro. |
@RaimoNiskanen Using 56 bits instead of 53 bits should not be a problem here due to 56 bits having a random value that is easier to think of as an integer in the range of [0..72057594037927935] which has uniform distribution due to I believe the range [0.0 .. 1.0] is more useful for various math when compared to the range [0.0 .. 1.0[ and it is my expectation that the range [0.0 .. 1.0[ is a more popular implementation choice for source code that generates random floating point values simply due to the dependence on the IEEE754 double precision binary format with the assignment of 52 bits of randomness, as was done in this pull request. I agree that using a multiplication instead of a division is better and a good change for efficiency. I also agree that a macro for the value of |
Be friendlier to Dialyzer
Fix documented range (interval is half-open.)
Fix plugin alg type
Fix pushed.
I think keeping them out of sight would lead to more elegant use of the funcionality, as it would provide people with a single, consistent solution - "use the |
As for the alternative approaches to generating uniform numbers over [0.0, 1.0] / [0.0, 1.0[ - very interesting brain food. I've pushed this: -define(HALF_DBL_EPSILON, 1.1102230246251565e-16). % math:pow(2, -53)
strong_rand_float() ->
WholeRange = strong_rand_range(1 bsl 53),
?HALF_DBL_EPSILON * bytes_to_integer(WholeRange). Which should generate numbers over the half-open [0.0, 1.0[ interval. If the closed interval is to be preferred, I reckon generating the random integer up to 2**53 should do the job (it being a power of two, there should be no loss of precision.) |
@g-andrade: Looks good! I prefer the half-open interval partly because the integer range function and most other range functions i have seen use half-open intervals and partly see the last paragraph. I agree that extending the strong rand range to ((1 bsl 53) + 1) would be the right way to close the interval. @okeuday The resulting numbers after rounding are still uniformly distributed, but not evenly so since the distance between two adjacent numbers varies over the range. It is true that for every sub range to [0.0..1.0) sufficiently larger than the machine epsilon the probability is the same, but every possible number is not equally probable. I think that is annoying. See also http://xoroshiro.di.unimi.it/ "Generating uniform doubles in the unit interval" for a discussion. I also think that the half open range [0.0..1.0) is more useful than the closed range since then you can e.g generate one set of numbers in [0.0..1.0) another set in [1.0..2.0) and join them without getting a probability spike for 1.0. |
@RaimoNiskanen Thank you for the reference. I have switched my code to use only 53 bits to avoid rounding and it can remain an alternative for the [0.0 .. 1.0] range. |
Just to underline again: using a non-2^N divisor will also cause rounding. To avoid rounding for the [0.0 .. 1.0] range one should produce a random number in the range [0 .. 2^53] and then divide by 2.0^53 i.e the integer range should contain the upper bound so you can use a 2^N divisor. But then you need to produce a random integer on a range size not 2^N but 1+2^N, which is cumbersome but supported by |
I will add a cleanup commit and run it once more in the daily tests. Therefore removing the 'testing' label, which may be confusing.... |
This PR proposes two new additions to the crypto module, both named
strong_rand_uniform,
for effortless generation of cryptographically secure numbers:These follow the same interfaces as
rand:uniform/0
andrand:uniform/1
, and both use OpenSSL'sBN_rand_range
method.Generated floating point values are limited to an effective entropy of up to 51 bits but are expected to be uniformly distributed between 0.0 and 1.0.
Supersedes #1363.