RNG and entropy improvements #64

hannesm · 2020-05-13T15:40:20Z

See individual commits for further details. This PR is supposed to improve the RNG and entropy story for MirageOS (adapting the FreeBSD design)

in high load scenarios (lots of events), do not call rdseed/rdrand on every event loop iteration
limit the frequency of reseeding (1 second)
only reseed if a reasonable amount of entropy has been collected (64 bytes, as used by FreeBSD)
use rdrand in favour of rdseed during "normal operation" (every second, in a timer loop)

I'd be happy if anyone is interested in looking into more detail here (keen to answer questions if you have any, maybe @cfcs is interested or others). The code as-is is ready for MirageOS, I'd shuffle it slightly around (into mirage-crypto-rng.mirage) and develop a separate mirage-crypto-rng.lwt for entropy collection with lwt (in the same vein).

…ame in path

- val _default_generator is a g option ref (defaults to None) - provide default_generator : unit -> g, which raises if the default generator is not yet set - provide set_default_generator : g -> unit, which raises if the default generator is already set - remove `Null` generator (moved to use site in test_rsa)

dinosaure

Seems good for me!

return is a closure taking a buffer

previously, the RNG was created and then entropy was fed into it. now, the startup-entropy is fed directly while it is being constructed.

…) do it: - in the interrupt_hook (called on every Lwt loop enter), collect some bits from __rdtsc - in a separate task, call rdrand periodically (each second) for each pool the choice to use rdrand instead of rdseed is due to the fact that rdseed is roughly 3 times as expensive and the CPU can run out of it. since we're mixing the output into the fortuna pools anyways (and not use the output as only seed of the RNG), this should be safe. the previous behaviour was, especially under high load, a bit problematic, since each time the event loop was triggered, rdseed got invoked which is pretty expensive.

…y if pool0 exceeds minimum size

…ument Fortuna: control the upper bound of the reseeding with entropy pool rate to 1s (as recommended by Schneier et al, don't reseed too often, they recommend 100ms, but since our rdrand task uses 1s as well, no need to do this more frequently)

mirage specific parts are in Mirage_crypto_rng_mirage (mirage-crypto-rng.mirage)

…bootstrap be more honest about entropy sources

hannesm · 2020-05-14T18:10:43Z

this PR is now ready. to summarize, entropy collection:

unix uses getrandom(), hardware-assisted CPU rng, whirlwind for seeding
lwt uses the above for seeding, also periodic (1s) rdrand for each pool, and periodic (10s) getrandom into each pool (one syscall, distribute random over pools), plus timer
mirage uses whirlwind and hardware-assisted cpu rng for seeding, and periodic (1s) rdrand feed into each pool
--> the rng is no longer fork()-safe (this is a down-side, but it is faster now, and no longer amplifies high loads)

instead of a mirage-crypto-entropy opam package, everything is now in mirage-crypto-rng, which has three sublibraries: unix (as before), lwt (new), mirage (earlier known as mirage-crypto-entropy).

with this design, virtio-rnd can be easily integrated (though the sources is hardcoded and needs to be extended at a central place -- which makes sense since they need unique ids).

failure/initialization semantics: due to intricate entropy collection and RNG initialization, by default the default generator is not set. set_default_generator raises on second call (not sure anymore about this -- its meant to avoid people shooting themselves into their feet (installing a generator, setting up entropy collection, and then installing another generator)), but it may be inconvenient in the current ecosystem where some libraries call Mirage_crypto_rng_unix.initialize (tls / dns). in the mirage setting, this should be fine since the mirage tools injects the initialization call.

…) now warn (apart from mirage, which errors)

cfcs

Looks good to me, left some comments / suggestions.

Why is it that we cannot just auto-seed from getrandom on Unix on library initialization?

rng/entropy.ml

rng/lwt/mirage_crypto_rng_lwt.ml

rng/lwt/mirage_crypto_rng_lwt.mli

rng/entropy.ml

rng/fortuna.ml

hannesm · 2020-05-18T09:07:11Z

thanks for your review @cfcs. to answer the remaining questions:

Why is it that we cannot just auto-seed from getrandom on Unix on library initialization?

If I understand you correctly, you propose the Mirage_crypto_rng_unix.initialize being called at toplevel, and thus applications get a seeded default RNG? Unfortunately this does not work:

the application still needs to depend on mirage-crypto-rng.unix to get the code linked
only if -linkall is passed to ocamlopt, the Mirage_crypto_rng_unix top-level statements are evaluated, since nothing else is used by the user application
if Lwt is used in the application, it's better (entropy collection, periodic reseeding) to use Mirage_crypto_rng_lwt.initialize in the Lwt event loop

(But why is this allowed on lwt when it's disallowed on mirage? The semantic inconsistency seems unnecessary)

On MirageOS, the mirage tool emits when you use the Mirage_random.S a call to Mirage_crypto_rng_mirage.initialize () as part of its generated main.ml. This means here we can be strict and require this to be the only initialization call.
Historical reasons: at the moment, some libraries (at least: tls, dns-client.lwt, dns-client.unix, maybe more) emit calls to Mirage_crypto_rng_unix.initialize (), and I'd like to keep the ecosystem in a working state (multiple calls to Mirage_crypto_rng_unix.initialize are harmless~~, but once the preferable Mirage_crypto_rng_lwt.initialize is used, nobody should call Mirage_crypto_rng_unix.initialize since that will disable the entropy collection for the default rng)~~. I'm convinced these calls to RNG initialization from library code should go away (and move to the application).

Thinking again about the topic, I pushed a8c7bbd which let's the entropy collection in Mirage_crypto_rng_lwt always feed the default_generator; thus the entropy will reach the right generator (i.e. a call to Mirage_crypto_rng_unix.initialize after Mirage_crypto_rng_lwt.initialize is mostly harmless). This will allow for a smooth migration. Once there aren't any libraries calling Mirage_crypto_rng_unix.initialize, we can move the warnings in initialization to errors (and so unify the semantics between unix/lwt/mirage). WDYT?

cfcs · 2020-05-18T12:48:27Z

@hannesm Thanks for your answers, I think both make sense, and agree with the way forward for initialize (and that libraries should not have to muck about with the rng initialization independently).

@Julow

…0.7.0) CHANGES: * CPU feature detection (AESNI, SSE3, PCLMULQ) at runtime instead of compile time (mirage/mirage-crypto#53 @Julow, fixed MirageOS support mirage/mirage-crypto#61, review by @hannesm) performance hit up to 5% * Revise entropy collection (mirage/mirage-crypto#64 @hannesm review by @dinosaure @cfcs) mirage-crypto-entropy has been folded into mirage-crypto-rng.{unix,lwt,mirage} - the RNG is no longer fork() safe, if you use fork in your code, be sure to reseed the RNG in the child process - on Unix and Lwt, the used RNG is Fortuna, seeded by getrandom(), rdrand/rdseed, and whirlwind - Mirage_crypto_rng_lwt does entropy collection for Lwt applications - entropy collection is now similar to FreeBSD: - rdrand/rdseed is executed in a separate task (by default every second) - on Unix, getrandom() is executed in another separate task (by default every 10 seconds) - on every enter of the Lwt event loop, some bits of rdtsc are collected (rdrand/rdseed is not on each even loop enter anymore) - Fortuna only uses entropy pools if the given period is exhausted (defaults to 1s), and the pool size exceeds 64 bytes - The unseeded generator exception prints instructions how to seed the RNG * 32 bit support (for ghash), requested by @TImada in mirage/mirage-crypto#60, mirage/mirage-crypto#65 @hannesm * use Eqaf_cstruct.find_uint8 instead of Cs.ct_find_uint8 (mirage/mirage-crypto#52 @dinosaure) * add (:standard) in C flags to allow cross-compilation mirage/mirage-crypto#47 @samoht * Mirage_crypto.Uncommon: remove several functions (Cs.create, Option), requires OCaml 4.08 (mirage/mirage-crypto#49 mirage/mirage-crypto#51 @hannesm) * remove ocplib-endian dependency, use Bytes directly (since 4.07) mirage/mirage-crypto#51 @hannesm * bitfn.h cleanup (mirage/mirage-crypto#56 mirage/mirage-crypto#58 @hannesm) * fix build if opam is not available (mirage/mirage-crypto#66 @hannesm) * update test.yml GitHub actions (mirage/mirage-crypto#44 mirage/mirage-crypto#57 @imbsky) * Travis CI for arm64 (mirage/mirage-crypto#55 @hannesm)

hannesm added 2 commits May 13, 2020 13:12

mirage-crypto-rng: alias Hmac_drbg.Make directly, avoiding a module n…

d9c276d

…ame in path

hannesm mentioned this pull request May 13, 2020

release 0.7.0 #50

Closed

dinosaure approved these changes May 13, 2020

View reviewed changes

hannesm force-pushed the rng branch from b201dc6 to 5a54549 Compare May 14, 2020 15:32

hannesm added 10 commits May 14, 2020 18:36

fortuna: define seeded implicitly - if the counter is 0

369296b

Mirage_crypto_rng.accumulate API change: source is an argument,

f7dc8f5

return is a closure taking a buffer

mirage-crypto-entropy: initialize with seed from whirlwind and rdseed

8c9d16a

previously, the RNG was created and then entropy was fed into it. now, the startup-entropy is fed directly while it is being constructed.

fortuna: fold accumulator and g into one record, add pool entropy onl…

04952ab

…y if pool0 exceeds minimum size

entropy: actually pass the time to the RNG

10108f8

mirage_crypto_rng: add pools : int to API to avoid hardcoding 32

b7bb378

mirage-crypto-entropy is now mostly Mirage_crypto_rng.Entropy,

1a6050d

mirage specific parts are in Mirage_crypto_rng_mirage (mirage-crypto-rng.mirage)

mirage-crypto-entropy is no longer around

0ee64fb

hannesm force-pushed the rng branch from 5a54549 to 4f9233b Compare May 14, 2020 16:37

hannesm added 2 commits May 14, 2020 18:54

mirage-crypto-rng-unix: use Fortuna, seed with getrandom and Entropy.…

f2b502c

…bootstrap be more honest about entropy sources

Mirage-crypto-rng: add lwt entropy collection

adbb465

hannesm force-pushed the rng branch from 4f9233b to adbb465 Compare May 14, 2020 17:59

hannesm added 2 commits May 14, 2020 20:13

adapt speed to new rng initialization protocol

fdda3b7

rng: allow set_default_generator being set a second time; initialize(…

7abb618

…) now warn (apart from mirage, which errors)

cfcs approved these changes May 17, 2020

View reviewed changes

hannesm added 2 commits May 18, 2020 10:39

address review remarks from @cfcs

c1e7b5c

mirage-crypto-rng-lwt: always feed the default generator

a8c7bbd

hannesm mentioned this pull request May 18, 2020

fix 32 bit support (ghash_ctmul: a ghash implementation from BearSSL that works on 32 bit LE platforms) #65

Merged

hannesm merged commit e9a9571 into mirage:master May 18, 2020

hannesm mentioned this pull request May 18, 2020

[new release] mirage-crypto-pk, mirage-crypto and mirage-crypto-rng (0.7.0) ocaml/opam-repository#16468

Merged

dinosaure mentioned this pull request Jul 4, 2021

Delete any occurrences of Unix.gettimeofday to be compatible with MirageOS aantron/dream#102

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RNG and entropy improvements #64

RNG and entropy improvements #64

hannesm commented May 13, 2020

dinosaure left a comment

hannesm commented May 14, 2020 •

edited

Loading

cfcs left a comment

hannesm commented May 18, 2020

cfcs commented May 18, 2020

RNG and entropy improvements #64

RNG and entropy improvements #64

Conversation

hannesm commented May 13, 2020

dinosaure left a comment

Choose a reason for hiding this comment

hannesm commented May 14, 2020 • edited Loading

cfcs left a comment

Choose a reason for hiding this comment

hannesm commented May 18, 2020

cfcs commented May 18, 2020

hannesm commented May 14, 2020 •

edited

Loading