Skip to content

Conversation

jacmet
Copy link

@jacmet jacmet commented Sep 15, 2023

arc4random(3) is problematic to use during early boot, as it (as implemented by glibc or libbsd) uses getrandom(2) without the GRND_NONBLOCK flag, causing it to block until the kernel entropy pool is initialized, which may take a very long time.

One example of this is cryptsetup(8), which may be used for bringing up a LUKS encrypted root filesystem during early boot.

So instead prefer getrandom(.., GRND_NONBLOCK) (which will fail before the pool is initialized) and /dev/urandom (which will succeed, but cause the kernel to print a warning about access before the pool is initialized) before falling back to arc4random().

arc4random(3) is problematic to use during early boot, as it (as implemented
by glibc or libbsd) uses getrandom(2) without the GRND_NONBLOCK flag,
causing it to block until the kernel entropy pool is initialized, which may
take a very long time.

One example of this is cryptsetup(8), which may be used for bringing up a
LUKS encrypted root filesystem during early boot.

So instead prefer getrandom(.., GRND_NONBLOCK) (which will fail before the
pool is initialized) and /dev/urandom (which will succeed, but cause the
kernel to print a warning about access before the pool is initialized)
before falling back to arc4random().

Signed-off-by: Peter Korsgaard <peter@korsgaard.com>
@hawicz
Copy link
Member

hawicz commented Sep 15, 2023

Rather than fiddling around with the order in json_c_get_random_seed() to try to guess at what will work for all users of the json-c API, I feel like it's be better to provide a way for programs that know they're going to run in environments where random number generation is broken to explicitly adjust the behavior. e.g. add something like a json_c_avoid_blocking_randomness(int) and if that's set then skip arc4random() (and any other randomness function that might be determined to block).

@jacmet
Copy link
Author

jacmet commented Sep 16, 2023

Rather than fiddling around with the order in json_c_get_random_seed() to try to guess at what will work for all users of the json-c API, I feel like it's be better to provide a way for programs that know they're going to run in environments where random number generation is broken to explicitly adjust the behavior. e.g. add something like a json_c_avoid_blocking_randomness(int) and if that's set then skip arc4random() (and any other randomness function that might be determined to block).

That is also fine, but significantly more effort (E.G. new json-c release, affected reverse dependencies have to (conditionally) add the call, new releases of those projects and integration in distributions / build systems.

What was the reason exactly for adding (and preferring) arc4random() in the first place? From the https://github.com/json-c/json-c/blob/master/ChangeLog#L83-L85 I see:

  • Use getrandom() if available; with GRND_NONBLOCK to allow use of json-c
    very early during boot, such as part of cryptsetup.
  • Use arc4random() if it's available.

So there was already focus on the cryptsetup use case when it was added. How important is the quality of the random seed? From a quick look it seems as if it is only used for the hash table lookups, E.G. unlikely to be an issue for early userspace.

@hawicz
Copy link
Member

hawicz commented Sep 17, 2023

Yes, it's "just" to initialize the hash function to make it more difficult for an attacker to generate content that would be subject to collisions, thus making a DoS more likely.
If we're going to make an effort to mitigate such an attack, we should use the best (cryptographically secure, fast) random number source that's available, which AIUI is arc4random.

Actually, looking at what arc4random actually does, it seems like the right fix for cryptsetup and other early/broken environment use cases is to upgrade to a version of libbsd that actually does pass GRND_NONBLOCK: https://git.hadrons.org/cgit/libbsd.git/tree/src/getentropy_linux.c#n206
The first version of libbsd with that code looks like it's 0.9.0, from 2018.

There's not much point in moving arc4random to the end of the list of things we try in json-c, because we'd effectively never end up using it. Perhaps we can detect whether the arc4random that's available is from a "too early" libbsd, and just not try to use it at all in that case, though I don't actually know a reliable way to do that offhand.

@jacmet
Copy link
Author

jacmet commented Sep 17, 2023

Yes, it's "just" to initialize the hash function to make it more difficult for an attacker to generate content that would be subject to collisions, thus making a DoS more likely. If we're going to make an effort to mitigate such an attack, we should use the best (cryptographically secure, fast) random number source that's available, which AIUI is arc4random.

Actually, looking at what arc4random actually does, it seems like the right fix for cryptsetup and other early/broken environment use cases is to upgrade to a version of libbsd that actually does pass GRND_NONBLOCK: https://git.hadrons.org/cgit/libbsd.git/tree/src/getentropy_linux.c#n206 The first version of libbsd with that code looks like it's 0.9.0, from 2018.

There's not much point in moving arc4random to the end of the list of things we try in json-c, because we'd effectively never end up using it. Perhaps we can detect whether the arc4random that's available is from a "too early" libbsd, and just not try to use it at all in that case, though I don't actually know a reliable way to do that offhand.

Sorry, but I don't quite get that. All the different interfaces to random numbers (on Linux at least) boil down to the same thing (the kernel random number generator). They do not return "better" or "worse" random numbers, they only differ in the behaviour when the kernel entropy pool is not yet initialized or empty:

  • getrandom() reads from (the equivalent of) /dev/random or /dev/urandom. If the kernel random number entropy pool is not initialized yet (E.G. the early userspace situation), then it will block if called without GRND_NONBLOCK, and return -1 if called with it. E.G. from man getrandom(2):
GRND_NONBLOCK
              By default, when reading from the random source, getrandom() blocks if no random bytes are available, and when reading from the urandom source, it blocks if the entropy pool has not yet been initialized.  If the GRND_NONBLOCK flag  is
              set, then getrandom() does not block in these cases, but instead immediately returns -1 with errno set to EAGAIN.
  • /dev/{u,}random reads from the kernel random number entropy pool. If the entropy pool is not initialized yet, then /dev/random will block, /dev/urandom not
  • arc4random calls getrandom() without the GRND_NONBLOCK flag, E.G. it blocks until the pool is initialized

So I don't quite get why we should prefer arc4random(). If it is for non-Linux setups, then they won't have the getrandom() and /dev/{,u}random interfaces, so the ordering doesn't matter.

Using arc4random() from a newer version of libbsd on Linux would just mean that it would end up calling getrandom(,, GRND_NONBLOCK) which would return -1, and it would fall back to using /dev/urandom, exactly as my proposed patch.

@jacmet
Copy link
Author

jacmet commented Sep 23, 2023

@hawicz ping?

@jacmet
Copy link
Author

jacmet commented Nov 18, 2023

@hawicz ping? How can we make progress on this?

@jacmet
Copy link
Author

jacmet commented Dec 1, 2023

@hawicz it would be great to get this merged, thanks

@jacmet
Copy link
Author

jacmet commented Dec 22, 2023

@hawicz ping?

@hawicz
Copy link
Member

hawicz commented Dec 24, 2023

The reason I prefer to keep arc4random first is so json-c makes use of the code that is purpose built to do randomness generation correctly, and thus users of json-c gain a bit of system-wide consistency in how random numbers are generated, and I don't need to think as much about fallbacks or relative suitability and order of the other mechanisms that random_seed.c might call.
arc4random appears to be the "popular" modern choice for generating randomness, and though I'm usually loath to pick something just based on popularity, in this case it seems that this will gain the potential benefit of future improvements/bug-fixes to the RNG without the need to change json-c itself.
If there's a problem with randomness generation, it seems more sensible to upgrade the RNG function rather than all the users of that function, i.e. address the root cause. In the problematic cryptsetup case, is there something that would prevent fixing the blocking issue by upgrading to a newer libbsd (or now glibc,, since apparently >=2.36 includes it)

@jacmet
Copy link
Author

jacmet commented Dec 26, 2023

On Linux, arc4random, getrandom, /dev/{u,}random all get their data from the same entropy pool, so there is no difference in the "quality" of the output. In Glibc and libbsd, arc4random is implemented using getrandom(). The big problem with arc4random() is that it is isn't suitable for early user space because it doesn't take any flags / blocks for a (potentially very long time) for the kernel entropy pool to be filled (E.G. it calls getrandom() without the GRND_NONBLOCK flags):

GRND_NONBLOCK
              By default, when reading from the random source, getrandom() blocks if no random bytes are available, and when reading from the urandom source, it blocks if the entropy pool has not yet been initialized.  If the GRND_NONBLOCK flag  is
              set, then getrandom() does not block in these cases, but instead immediately returns -1 with errno set to EAGAIN.

For a lot (most?) JSON-C users during early boot, calling getrandom(, GRND_NONBLOCK) (which fails fast with -1) and then falling back to /dev/urandom (which succeeds, even though theoretically the "randomness" cannot be guaranteed yet) is very much preferably to hanging for a very long time until arc4random() returns, as the hash collision problem the random seed is supposed to protect against is not an issue (E.G. cryptsetup uses JSON-C to read a JSON formatted configuration file of a few KB).

Once early boot is completed and the kernel entropy pool is initialized, calling getrandom() or arc4random() is completely equivalent. The difference is only about early user space where arc4random() is not suitable and getrandom() / dev/urandom is.

arc4random() is arguably a legacy function from BSD, so I am surprised to hear that you think it is the popular choice, except that it is a bit more portable given libbsd.

@jacmet
Copy link
Author

jacmet commented Feb 5, 2024

@hawicz ping?

@jacmet
Copy link
Author

jacmet commented Feb 29, 2024

@hawicz Can we make progress on this issue please?

@jacmet
Copy link
Author

jacmet commented Mar 22, 2024

what can I do to make progress on this?

@jacmet
Copy link
Author

jacmet commented Jun 13, 2024

@hawicz ping?

@jacmet
Copy link
Author

jacmet commented Oct 15, 2024

hello?

@jacmet
Copy link
Author

jacmet commented Jan 9, 2025

Any news on this?

@hawicz
Copy link
Member

hawicz commented Jan 11, 2025

We should aim to have the preferred interface that json-c uses be the simplest, and closest to "standard" one. I'm still against the idea of abandoning the API that is effectively the posix_random() function (see https://www.austingroupbugs.net/view.php?id=859) just because at one point it had a bug. At most, I can see adding a check to CMakeLists.txt to turn off HAVE_ARC4RANDOM if you're building on a machine with a known broken version of libbsd, but since that was fixed 7 years ago I would hope such machines are few and far between.

@jacmet
Copy link
Author

jacmet commented Jan 12, 2025

@hawicz what do you mean with brokenness in old libbsd versions? Looking at libbsd git, I don't see any recent changes affecting the behaviour at bootup, E.G. getentrophy still uses getrandom(, GRND_NONBLOCK) if available: https://gitlab.freedesktop.org/libbsd/libbsd/-/blob/main/src/getentropy_linux.c?ref_type=heads#L197-214 which fails until the urandom pool is initialized: https://man7.org/linux/man-pages/man2/getrandom.2.html

As a reminder, the issue is when json-c is used early in the boot sequence. My setup is a LUKS-encrypted volume, and LUKS nowadays has some JSON-formatted metadata that cryptsetup wants to read using JSON-C. If the urandom pool is not initialized when this is called (and there are no background randomness sources) then this will block FOREVER, just to initialize the hash tables.

As explained in the libbsd sources, getrandom(.., GRND_NONBLOCK) returns with an error when the urandom pool is not initialized, whereas a read from /dev/urandom will succeed (but trigger a kernel warning as it cannot guarantee the quality for the randomness): https://gitlab.freedesktop.org/libbsd/libbsd/-/blob/main/src/getentropy_linux.c?ref_type=heads#L102-105

I am not sure what the threat model is exactly for wanting a cryptographically strong seed value for the hash tables, but presumably it doesn't trump the denial of service of hanging forever at early boot?

Once the urandom source has been seeded, the behaviour of getrandom(.., GRND_NONBLOCK) and /dev/urandom is identical.

@jacmet
Copy link
Author

jacmet commented Apr 15, 2025

@hawicz can we make progress on this please? Since last I noticed that getrandom() has gained a GRND_INSECURE flag since Linux 5.6 (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=75551dbf112c992bc6c99a972990b3f272247e23) which would probably make sense to use so getrandom() works at early bootup rather than falling back to /dev/urandom. For the hash table initialization here, this seems like a sensible solution, and also match what is done in E.G. systemd: systemd/systemd@87cb1ab

jacmet added a commit to jacmet/json-c that referenced this pull request Apr 15, 2025
As explained in json-c#832, getrandom(..,
GRND_NONBLOCK) fails with EAGAIN if the Linux kernel random number pool is
not yet initialized.

The use case here is initializing the hash table seed rather than strong
randomness for crypto, so use GRND_INSECURE if available (Linux 5.6+) to
make getrandom() return best effort random data during bootup rather than
fail.

Signed-off-by: Peter Korsgaard <peter@korsgaard.com>
jacmet added a commit to jacmet/json-c that referenced this pull request Apr 15, 2025
As explained in json-c#832, getrandom(..,
GRND_NONBLOCK) fails with EAGAIN if the Linux kernel random number pool is
not yet initialized.

The use case here is initializing the hash table seed rather than strong
randomness for crypto, so use GRND_INSECURE if available (Linux 5.6+) to
make getrandom() return best effort random data during bootup rather than
fail.

Signed-off-by: Peter Korsgaard <peter@korsgaard.com>
@jacmet
Copy link
Author

jacmet commented Apr 15, 2025

@hawicz I have sent a separate PR for GRND_INSECURE here: #890

@hmh
Copy link

hmh commented May 30, 2025

Second opinion on this:

IMHO:

  1. If the random number generator is going to be used only once or twice at json-c init time, please just call get_random() with the fix in random_seed.c: get_getrandom_seed(): use GRND_INSECURE if available #890 (to use GNRD_INSECURE when defined).

This will provide randomness that is at the very least as good as what you could get from ark4random(), and likely a lot better. It is more expensive to use, but that is effectively irrelevant if you are going to call it only once.

With the fix for GRND_INSECURE, it would be in fact a better and safer choice than ark4random as the very existence of this issue proves.

You need alternatives to get_random when it is unavailable. However, supposing ark4random is undesired only because of blocking use of get_random(), that still reduces to use get_random (GRND_INSECURE), then fallback to ark4random(), then to whatever else is usable if neither is available.

  1. If the random numbers are going to be used very often (and it is not for cryptography use), then you´d want a faster userspace PRNG, and ark4random instead of get_random() makes sense. But you end up with the need of PRNG alternatives where ark4random() is not available, and as shown in this issue, it is desirable to avoid (IMHO as a build-time choice that defaults to the status-quo) the use of ark4random in some cases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants