Support the getrandom system call if glibc doesn't provide a wrapper #5910

Closed
wants to merge 2 commits into from

kroeckx
Member

@kroeckx kroeckx commented Apr 8, 2018

Glibc only provides getrandom() since version 2.25, while the Linux
kernel has support for it since version 3.17.

Checklist
  • documentation is added or updated

@@ -119,26 +126,52 @@ size_t rand_pool_acquire_entropy(RAND_POOL *pool)
# error "Seeding uses urandom but DEVRANDOM is not configured"
# endif

# if defined(__GLIBC__) && defined(__GLIBC_PREREQ)
Contributor

This line and the next can be merged into a single three-part conditional.

Member Author

No it can't.

Contributor

No. We went through it earlier.

Contributor

thanks, folks.

@kroeckx
Member Author

kroeckx commented Apr 8, 2018

Is someone able to test this on FreeBSD or OpenBSD? I don't have access to it. FreeBSD only added this like 2 weeks ago, so you need something very recent.

@richsalz
Contributor

richsalz commented Apr 8, 2018

And what happens on a FreeBSD system that is older than two weeks?

@kroeckx
Member Author

kroeckx commented Apr 8, 2018

It falls back to trying /dev/urandom ...

@kroeckx
Member Author

kroeckx commented Apr 8, 2018

FreeBSD also has a sysctl MIB (CTL_KERN.KERN_ARND), but it seems undocumented, so I'm not sure how to properly support it.

@kroeckx
Member Author

kroeckx commented Apr 8, 2018

It seems to be documented in the random(4) manpage since FreeBSD 11.0

@kroeckx
Member Author

kroeckx commented Apr 8, 2018

So I added something that should try to use a syscall on FreeBSD > 11. It would be nice if someone can test that on both FreeBSD 11 and 12.

@kroeckx
Member Author

kroeckx commented Apr 9, 2018

@kaduk: Is this something you can test on FreeBSD?

@kaduk
Contributor

kaduk commented Apr 9, 2018

@kroeckx yes, I can test. I probably don't have a less-than-two-weeks-old FreeBSD 12 already stood up, but it shouldn't be too hard to spin one up for a one-off.
Is testing anything more involved than just 'make && make test'?

# endif
# endif

# if (defined(__FreeBSD__) && __FreeBSD__ >= 12)
Contributor

It's probably better to use __FreeBSD_version from sys/param.h and check the actual value against a fine-grained constant, since people might be running a snapshot of 12-current that is too old to have this support. Unfortunately, the addition of getrandom does not seem to appear in https://www.freebsd.org/doc/en/books/porters-handbook/versions.html so some detective work will be required...

Member Author

Oh, I didn't know 12 was already that old. I guess I could go for 1200061 which is the first version there after it's been merged.

Contributor

Yeah, something that calls itself 12 has existed since 11.0 was released.

Member Author

Pushed a commit for that. Any idea when the sysctl support was added so I can do the same for that?

Member Author

I seem to have found it: freebsd/freebsd-src@3ef9d41

Member Author

So I think going for >= 8 should work.

/* Supported since OpenBSD 5.6 */
# if defined(__OpenBSD__) && OpenBSD >= 201411
# if defined(OPENSSL_RAND_SEED_OS)
# define OPENSSL_RAND_SEED_GETRANDOM
Contributor

It feels pretty weird to be conditionally #defineing OPENSSL_RAND_SEED_GETRANDOM in the body of a function like this.

Member Author

I didn't like it either but had no better idea at the time. I have one now, so let me add another commit.

@kroeckx
Member Author

kroeckx commented Apr 9, 2018

I would like you to check that the system calls happen and that it doesn't fall back to /dev/urandom. Something like strace apps/openssl rand 1

# if defined(OPENSSL_HAVE_GETRANDOM)
# include <sys/random.h>
# endif

# if defined(OPENSSL_RAND_SEED_OS)
# if !defined(DEVRANDOM)
# error "OS seeding requires DEVRANDOM to be configured"
Contributor

Is this actually true now, though?

Member Author

It's used as a fallback if the kernel doesn't have a system call. But it's true that it's no longer required if a system call is available.


do {
len = buflen;
if (sysctl(mib, 2, buf, &len, NULL, 0) == -1)
Contributor

This usage seems to have alignment issues if the ultimate buffer is just a char *.

Member Author

It's malloc()'d. I'm not sure why you think there is an alignment issue, or how I need to resolve it.

Contributor

Well, a few lines up it's claimed that the data is returned in multiples of 'long', as if the long type was used internally. The compiler will not complain at us if we pass a void* where a long* is expected, leaving it to the programmer to ensure that the pointer value in question has sufficient alignment for the type in question (since formally, we're supposed to only actually be using a pointer to the type that we end up dereferencing it as). As you note, this buffer is going to start its life as a malloc output, which is not intrinsically type long. That said, malloc output is required to be sufficiently aligned for use as any native type, so there should not be a problem in practice at runtime. (There would maybe be issues if this function was exposed to user-supplied argument values, but I did not attempt to trace if that is possible.)
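
The alignment argument above can be checked with a small sketch. The helper names is_long_aligned() and malloc_is_long_aligned() are hypothetical and not part of the patch; the sketch only illustrates that a malloc() result is aligned well enough to be accessed as a long, while an arbitrary byte offset into a buffer is not.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Return 1 if p is suitably aligned to be accessed as a long (C11 _Alignof). */
int is_long_aligned(const void *p)
{
    return ((uintptr_t)p % _Alignof(long)) == 0;
}

/*
 * malloc() returns storage aligned for any fundamental type, so this
 * returns 1 for a successful allocation and -1 if the allocation failed.
 */
int malloc_is_long_aligned(size_t n)
{
    void *p = malloc(n);
    int ok;

    if (p == NULL)
        return -1;
    ok = is_long_aligned(p);
    free(p);
    return ok;
}
```

A pointer such as (unsigned char *)buf + 1 would fail the same check, which is the case the reviewer is worried about if callers could pass arbitrary buffers.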

Member Author

Looking at this again, the old code used a u_long and ignored the len parameter; the current code uses len and restricts it to 256 bytes.

if (sysctl(mib, 2, buf, &len, NULL, 0) == -1)
return done;
done += len;
buf += len;
Contributor

arithmetic on a void pointer is a GNU extension
(and so at least the strict-warnings build fails. "Who does non-strict-warnings builds anyway?")
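
A minimal sketch of the portable form, assuming the loop keeps its current shape: the void * argument is converted once to unsigned char *, and all arithmetic happens on that pointer. fake_source() is a hypothetical stand-in for the sysctl() call (it caps each request at 256 bytes, as mentioned elsewhere in this thread) and fill_buffer() is an illustrative name, not the function in the patch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for sysctl(): writes at most 256 bytes, updates *len. */
static int fake_source(void *out, size_t *len)
{
    if (*len > 256)
        *len = 256;
    memset(out, 0xA5, *len);   /* placeholder bytes, not random data */
    return 0;
}

size_t fill_buffer(void *buf, size_t buflen)
{
    unsigned char *p = buf;    /* ISO C: arithmetic needs a complete object type */
    size_t done = 0;

    while (done < buflen) {
        size_t len = buflen - done;

        if (fake_source(p, &len) == -1 || len == 0)
            return done;
        done += len;
        p += len;              /* "buf += len" on a void * is a GNU extension */
    }
    return done;
}
```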

Member Author

I didn't even compile this ...

Contributor

This was more about "I have no idea whether a non-strict-warnings build would succeed"; I did not expect you to have built it on FreeBSD already.

Contributor

Any progress on FreeBSD?

Contributor

I finally got to install a FreeBSD 12-current VM and test a recent OpenSSL on it.
The test suite runs clean and I confirmed with gdb that we are using the getrandom() function from libc and it is returning (at least nominally) random data into the buffer.

@kaduk
Contributor

kaduk commented Apr 9, 2018

It turns out that the truss(1) output is not terribly useful for the sysctl case, since so much information is hidden behind pointers that it's tough to be sure that the sysctl in the trace is actually the sysctl we're interested in. That said, a well-placed fprintf() does confirm that the sysctl version runs successfully.

Interestingly, on my FreeBSD 12.0-CURRENT #1 r324846: Sun Oct 22 13:40:14 CDT 2017, I can successfully configure and build --with-rand-seed=getrandom and no build-time error. OPENSSL_HAVE_GETRANDOM ends up not being defined, and syscall_random() falls through to sysctl_random(). This seems a surprising behavior to me; it would be nice if we could arrange for the build to fail if getrandom is requested but not available. (And should we be trying to use sysctl when the 'os' or default seeding is not in use?)

@kroeckx
Member Author

kroeckx commented Apr 9, 2018

I'm not sure why you think it's surprising. I've updated the "--with-rand-seed=getrandom" documentation to say it's using a system call, and sysctl is a system call.

INSTALL Outdated
@@ -224,7 +224,8 @@
os: Use a trusted operating system entropy source.
This is the default method if such an entropy
source exists.
getrandom: Use the L<getrandom(2)> system call if available.
getrandom: Use the L<getrandom(2)> or equivalant system
Contributor

equivalant -> equivalent

# endif

# if defined(__linux) && defined(SYS_getrandom)
return (int)syscall(SYS_getrandom, buf, buflen, 0);
Contributor

I'm only a layman in this matter, so please excuse me if my question is off-topic or stupid: In this LWN Article the author Jonathan Corbet describes that there was a lot of discussion about whether the glibc getrandom() call should be a thread cancellation point or not. Finally it was decided that it should be.

The most extensive argument, though, was over whether getrandom() should be a thread cancellation point.
In other words, what should happen if pthread_cancel() is called on a thread that is currently blocked
in getrandom()? The original patch did make getrandom() into a cancellation point; it still behaves
that way in the version merged for 2.25, but it had to survive a lot of argument to get there.

While the glibc getrandom() is a thread cancellation point, your raw syscall is not. Does this difference have any practical relevance for OpenSSL?

Member Author

I don't think it's relevant. Cancellation points are just a set of functions where the cancellation can happen, typically functions that can block.
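
For reference, a hedged sketch of the raw system call route being discussed. It bypasses the glibc getrandom() wrapper entirely and goes through syscall(). The function name raw_getrandom() and the EINTR retry loop are illustrative assumptions and not necessarily what the patch does.

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE          /* for syscall() under strict standard modes */
#endif
#include <errno.h>
#include <stddef.h>
#include <unistd.h>
#if defined(__linux__)
# include <sys/syscall.h>     /* SYS_getrandom, if the kernel headers know it */
#endif

/* Returns the number of bytes written, or -1 with errno set. */
long raw_getrandom(void *buf, size_t buflen)
{
#if defined(__linux__) && defined(SYS_getrandom)
    long n;

    do {
        n = syscall(SYS_getrandom, buf, buflen, 0);
    } while (n == -1 && errno == EINTR);   /* retry if interrupted by a signal */
    return n;
#else
    (void)buf;
    (void)buflen;
    errno = ENOSYS;           /* no raw system call known on this platform */
    return -1;
#endif
}
```

On a kernel older than 3.17 the call fails with ENOSYS, which is the case where the caller would move on to the next entropy source.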

@mspncp
Contributor

mspncp commented Apr 12, 2018

+1 from me for Kurt's pull request, but I don't have the expertise to assess the correctness of the implementation.

@kroeckx
Member Author

kroeckx commented Apr 12, 2018

I was planning to update some comments; let me do that now.

@kroeckx
Member Author

kroeckx commented Apr 12, 2018

I intend to squash this all into 1 commit and change the commit message to:
Add support for getting entropy using system calls

@kroeckx
Member Author

kroeckx commented Apr 14, 2018

I had to resolve conflicts, I've fixed them and squashed all the commits.

# endif
# endif

# if (defined(__FreeBSD__) && __FreeBSD_version >= 1200061)
Contributor

Is it worth splitting this, as was done for __GLIBC_PREREQ?

If neither __FreeBSD__ nor __FreeBSD_version is defined, isn't there a syntax error?

Member Author

The reason for the __GLIBC_PREREQ() on the next line is that it takes parameters, so you need to be sure it exists before you use it.
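
A small sketch of the difference, under the assumption that the checks look roughly like the ones in the diff. HAVE_GETRANDOM_WRAPPER, HAVE_FREEBSD_GETRANDOM and the two accessor functions are hypothetical names used only for illustration.

```c
#include <stdio.h>            /* any libc header; on glibc it pulls in <features.h> */
#if defined(__FreeBSD__)
# include <sys/param.h>       /* defines __FreeBSD_version */
#endif

/*
 * __GLIBC_PREREQ(maj, min) is a function-like macro. The preprocessor parses
 * the whole #if expression, so in a single combined condition an undefined
 * __GLIBC_PREREQ(2, 25) would become "0(2, 25)", which is a syntax error.
 * Nesting keeps the macro call out of sight unless the macro exists.
 */
#if defined(__GLIBC__) && defined(__GLIBC_PREREQ)
# if __GLIBC_PREREQ(2, 25)
#  define HAVE_GETRANDOM_WRAPPER 1
# endif
#endif
#ifndef HAVE_GETRANDOM_WRAPPER
# define HAVE_GETRANDOM_WRAPPER 0
#endif

/*
 * __FreeBSD_version is an object-like macro. An undefined identifier in #if
 * simply evaluates to 0, so a single-line condition is fine here.
 */
#if defined(__FreeBSD__) && __FreeBSD_version >= 1200061
# define HAVE_FREEBSD_GETRANDOM 1
#else
# define HAVE_FREEBSD_GETRANDOM 0
#endif

int have_getrandom_wrapper(void) { return HAVE_GETRANDOM_WRAPPER; }
int have_freebsd_getrandom(void) { return HAVE_FREEBSD_GETRANDOM; }
```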

* of longs.
*/
if (buflen % sizeof(long) != 0)
return 0;
Contributor

Is there a reason to punish the caller that hard? Why not read buflen - buflen % sizeof(long), and then read one long into temporary storage and copy buflen % sizeof(long) bytes?
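
The suggestion above could look roughly like the following sketch. next_long() is a hypothetical stand-in for an interface that only hands out whole longs (it is a fixed arithmetic sequence, not a random source), and fill_any_length() is an illustrative name.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical source of whole longs; placeholder arithmetic, NOT random. */
static unsigned long next_long(void)
{
    static unsigned long state = 0x9E3779B9UL;

    state = state * 1103515245UL + 12345UL;
    return state;
}

/* Fill buf with buflen bytes even when buflen is not a multiple of a long. */
size_t fill_any_length(unsigned char *buf, size_t buflen)
{
    size_t whole = buflen - buflen % sizeof(unsigned long);
    size_t i;
    unsigned long tmp;

    /* First the part that is an exact number of longs. */
    for (i = 0; i < whole; i += sizeof(tmp)) {
        tmp = next_long();
        memcpy(buf + i, &tmp, sizeof(tmp));
    }
    /* Then one more long into temporary storage, copying only the remainder. */
    if (buflen > whole) {
        tmp = next_long();
        memcpy(buf + whole, &tmp, buflen - whole);
    }
    return buflen;
}
```

The cost is one partially discarded long per call, which is the trade-off against rejecting such lengths outright.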

Member Author

The reason is that the loop is the only example I could find for something that is undocumented, and the comment is based on what I saw in the source, and I'm not even 100% sure that's correct. But that loop clearly has a problem if the implementation is really using a long.

I think trying to detect which implementation is in the kernel, and then starting to use longs, is the wrong way to go. I have no idea if there are other cases where it returns a buffer that's shorter than the one we ask for. But the code with longs didn't look at the length and always filled a whole long.

So I think the easiest way to deal with this is to have the calling code always request a multiple of the size of a long, and throw away the rest if it needs less. We currently always have a multiple of the size of a long anyway.

Contributor

We currently always have a multiple of the size of a long anyway.

Then it would arguably be appropriate to use assert here.

Member Author

Is that ossl_assert()?

Contributor

In this case yes.

Contributor

I rather meant

if (!ossl_assert(buflen % sizeof(long) == 0))
    return 0;

so that it still returns 0 if the condition is not met, and a debug build crashes. If you don't have the condition, a plain assert is more appropriate.

Member Author

Oh, I didn't know it could return a value. I just merged this like an hour ago. I'll make a new PR.

@andreasschulze

@mspncp pushed me to test that patch.

OS: Debian Stretch (9.4)
libc: 2.24
openssl-1.1.1-pre5 + 5910.patch + "./Configure shared --with-rand-seed=getrandom,os ..."
postfix-3.3.0 build with that libssl-dev, smtpd running chroot

  • kernel 2.6 (cheap OpenVZ virtualization)
    -> /var/spool/postfix/dev/urandom required, otherwise postfix smtpd fail on connect with
    "warning: TLS library problem: error:2406E06E:random number generator:RAND_DRBG_reseed:error retrieving entropy:crypto/rand/drbg_lib.c:431:
  • Kernel 4.9.0-6 (Debian native)
    -> /var/spool/postfix/dev/urandom not required

so, it looks like the code works as expected.
Andreas

@mspncp
Contributor

mspncp commented Apr 19, 2018

Thanks for testing! BTW: --with-rand-seed=getrandom,os can be omitted, the default value is os which on linux uses all reasonable random sources (getrandom first, then /dev/*random, etc.).

ping @openssl/committers, can anyone of you review this, so we can merge it before the next release?

@mspncp mspncp added this to the 1.1.1 milestone Apr 19, 2018
@mspncp mspncp added the approval: review pending This pull request needs review by a committer label Apr 19, 2018
Contributor

@mspncp mspncp left a comment

Add support for gettng entropy using system calls

The commit message contains a typo (gettng -> getting). Also, it is a bit vague; after all, a read() is also a system call. How about using the following?

Add fallback for missing glibc getentropy() call

@kroeckx
Member Author

kroeckx commented Apr 20, 2018

We use getrandom() from glibc, not getentropy. And it's also not glibc specific. How about:
Add support for getting entropy from the OS without using files

FreeBSD was tested, but only the version that doesn't use getrandom(), which is what most people will currently end up using anyway.

@kroeckx
Member Author

kroeckx commented Apr 20, 2018

Or just: Add support for getrandom() or equivalent system calls and use them by default

@mspncp
Contributor

mspncp commented Apr 20, 2018

We use getrandom() from glibc, not getentropy.

Yeah, my fault.

Or just: Add support for getrandom() or equivalent system calls and use them by default

That's ok with me.

@kroeckx
Member Author

kroeckx commented Apr 21, 2018

So can someone review this?

@mspncp
Contributor

mspncp commented Apr 22, 2018

If nobody else shows up, you can merge with my approval. But I'd really prefer it if someone else who is more familiar with the technical details would have a look at this.

@kroeckx kroeckx closed this Apr 22, 2018
levitte pushed a commit that referenced this pull request Apr 22, 2018
…y default

Reviewed-by: Dr. Matthias St. Pierre <Matthias.St.Pierre@ncp-e.com>
GH: #5910
@lukastribus

@kroeckx when hitting this code path, openssl version -r still shows Seeding source: os-specific instead of getrandom(), but the library does use getrandom().

Can you take a look at that and make sure this output is consistent with the actual behavior?

@andreasschulze

works as expected here: $ openssl11 version -a
OpenSSL 1.1.1-pre6 (beta) 1 May 2018
built on: Tue May 8 16:51:33 2018 UTC
platform: debian-amd64
options: bn(64,64) rc4(16x,int) des(int) idea(int) blowfish(ptr)
compiler: gcc -fPIC -pthread -m64 -g -O2 -fdebug-prefix-map=/build/dv-openssl-9iOTHP/dv-openssl-1.1.1pre6=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wa,--noexecstack -Wall -g -O2 -fdebug-prefix-map=/build/dv-openssl-9iOTHP/dv-openssl-1.1.1pre6=. -fstack-protector-strong -Wformat -Werror=format-security -DOPENSSL_USE_NODELETE -DL_ENDIAN -DOPENSSL_PIC -DOPENSSL_CPUID_OBJ -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DKECCAK1600_ASM -DRC4_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DGHASH_ASM -DECP_NISTZ256_ASM -DX25519_ASM -DPADLOCK_ASM -DPOLY1305_ASM -DNDEBUG -Wdate-time -D_FORTIFY_SOURCE=2
OPENSSLDIR: "/etc/ssl"
ENGINESDIR: "/usr/lib/x86_64-linux-gnu/engines-1.1"
Seeding source: getrandom-syscall os-specific

@mspncp
Contributor

mspncp commented May 23, 2018

openssl version -r still shows Seeding source: os-specific instead of getrandom(), but the library does use getrandom().

Can you take a look at that and make sure this output is consistent with the actual behavior?

The os-specific option is a bit more generic than options like getrandom-syscall; it means: "use the optimal source depending on the operating system". The current os default for linux is to try 1) the glibc getrandom() call, 2) a raw getrandom() syscall, or 3) read from /dev/Xrandom, in this order and provided the respective method is available (AFAIR). So yes, it is expected that getrandom() is used if it exists and if os-specific is configured (which is the default).
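
The "try each method in order" behaviour can be sketched as a table of sources. This is an illustration only and not how OpenSSL structures its code: the seed_source type, acquire_seed() and src_missing() are hypothetical, and src_dev_urandom() stands in for the device-file fallback.

```c
#include <stddef.h>
#include <stdio.h>

/* A seed source fills buf with len bytes and returns 0, or returns -1. */
typedef int (*seed_source)(unsigned char *buf, size_t len);

/* Hypothetical source that is never available (e.g. a missing system call). */
static int src_missing(unsigned char *buf, size_t len)
{
    (void)buf;
    (void)len;
    return -1;
}

/* Last resort: read from the random device file. */
static int src_dev_urandom(unsigned char *buf, size_t len)
{
    FILE *f = fopen("/dev/urandom", "rb");
    size_t got;

    if (f == NULL)
        return -1;
    got = fread(buf, 1, len, f);
    fclose(f);
    return got == len ? 0 : -1;
}

/* Try each source in order; return the index of the first success, or -1. */
int acquire_seed(const seed_source *sources, size_t n,
                 unsigned char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (sources[i](buf, len) == 0)
            return (int)i;
    return -1;
}
```

In a chroot without /dev/urandom, such as the postfix setup reported earlier in this thread, the last entry fails too and the caller sees -1.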

@lukastribus

So basically Seeding source is not what OpenSSL actually uses as its entropy source; instead, it reflects the exact content of --with-rand-seed in the configure phase (or its default: os).

And in the --with-rand-seed command, os is basically equivalent to getrandom,devrandom (at least on Linux and BSD).

@mspncp
Contributor

mspncp commented May 23, 2018

Correct.
