New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support the getrandom system call if glibc doesn't provide a wrapper #5910
Conversation
@@ -119,26 +126,52 @@ size_t rand_pool_acquire_entropy(RAND_POOL *pool) | |||
# error "Seeding uses urandom but DEVRANDOM is not configured" | |||
# endif | |||
|
|||
# if defined(__GLIBC__) && defined(__GLIBC_PREREQ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line and the next can be merged into a single three-part conditional.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No it can't.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. We went through it earlier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, folks.
Is someone able to test this on FreeBSD or OpenBSD? I don't have access to it. FreeBSD only added this like 2 weeks ago, so you need something very recent. |
And what happens on a FreeBSD system that is older than two weeks? |
It falls back to trying /dev/urandom ... |
FreeBSD also has a sysctl MIB (CTL_KERN.KERN_ARND), but it seems undocumented, so I'm not sure how to properly support it. |
It seems to be documented in the random(4) manpage since FreeBSD 11.0 |
So I added something that should try to use a syscall on FreeBSD > 11. It would be nice if someone can test that on both FreeBSD 11 and 12. |
@kaduk: Is this something you can test on FreeBSD? |
@kroeckx yes, I can test. I probably don't have a less-than-two-weeks-old FreeBSD 12 already stood up, but it shouldn't be too hard to spin one up for a one-off. |
crypto/rand/rand_unix.c
Outdated
# endif | ||
# endif | ||
|
||
# if (defined(__FreeBSD__) && __FreeBSD__ >= 12) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's probably better to use __FreeBSD_version
from sys/param.h
and check the actual value against a fine-grained constant, since people might be running a snapshot of 12-current that is too old to have this support. Unfortunately, the addition of getrandom does not seem to appear in https://www.freebsd.org/doc/en/books/porters-handbook/versions.html so some detective work will be required...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I didn't know 12 was already that old. I guess I could go for 1200061 which is the first version there after it's been merged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, something that calls itself 12 has existed since 11.0 was released.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pushed a commit for that. Any idea when the sysctl support was added so I can do the same for that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Current oldest reference I could find is:
https://lists.freebsd.org/pipermail/svn-src-all/2012-July/056019.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I seem to have found it: freebsd/freebsd-src@3ef9d41
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I think going for >= 8 should work.
crypto/rand/rand_unix.c
Outdated
/* Supported since OpenBSD 5.6 */ | ||
# if defined(__OpenBSD__) && OpenBSD >= 201411 | ||
# if defined(OPENSSL_RAND_SEED_OS) | ||
# define OPENSSL_RAND_SEED_GETRANDOM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It feels pretty weird to be conditionally #define
ing OPENSSL_RAND_SEED_GETRANDOM
in the body of a function like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't like it and had no better idea at that time, but I have now, let me add an other commit.
I would like you to check that the system calls happen and doesn't fall back to /dev/urandom. Something like strace apps/openssl rand 1 |
# if defined(OPENSSL_HAVE_GETRANDOM) | ||
# include <sys/random.h> | ||
# endif | ||
|
||
# if defined(OPENSSL_RAND_SEED_OS) | ||
# if !defined(DEVRANDOM) | ||
# error "OS seeding requires DEVRANDOM to be configured" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this actually true now, though?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's used as fallback if the kernel doesn't have a system call. But it's true that if a system call is available, that it's no longer required.
|
||
do { | ||
len = buflen; | ||
if (sysctl(mib, 2, buf, &len, NULL, 0) == -1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This usage seems to have alignment issues if the ultimate buffer is just a char *
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's malloc()'d. I'm not sure why you think there is an alignment issue, or how I need to resolve it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, a few lines up it's claimed that the data is returned in multiples of 'long', as if the long type was used internally. The compiler will not complain at us if we pass a void*
where a long*
is expected, leaving it to the programmer to ensure that the pointer value in question has sufficient alignment for the type in question (since formally, we're supposed to only actually be using a pointer to the type that we end up dereferencing it as). As you note, this buffer is going to start its life as a malloc output, which is not intrinsically type long. That said, malloc output is required to be sufficiently aligned for use as any native type, so there should not be a problem in practice at runtime. (There would maybe be issues if this function was exposed to user-supplied argument values, but I did not attempt to trace if that is possible.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at this again, the old code used a u_long and ignored the len parameter, the current code uses the len and restricts it to 256 byte.
if (sysctl(mib, 2, buf, &len, NULL, 0) == -1) | ||
return done; | ||
done += len; | ||
buf += len; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
arithmetic on a void pointer is a GNU extension
(and so at least the strict-warnings build fails. "Who does non-strict-warnings builds anyway?")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't even compile this ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was more about "I have no idea whether a non-strict-warnings build would succeed; I did not expect you to have built it on FreeBSD already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any progress on FreeBSD?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I finally got to install a FreeBSD 12-current VM and test a recent OpenSSL on it.
The test suite runs clean and I confirmed with gdb that we are using the getrandom() function from libc and it is returning (at least nominally) random data into the buffer.
It turns out that the truss(1) output is not terribly useful for the sysctl case, since so much information is hidden behind pointers that it's tough to be sure that the sysctl in the trace is actually the sysctl we're interested in. That said, a well-placed fprintf() does confirm that the sysctl version runs successfully. Interestingly, on my |
I'm not sure why you think it's surprising. I've updated the "--with-rand-seed=getrandom" documentation to say it's using a system call, and sysctl is a system call. |
INSTALL
Outdated
@@ -224,7 +224,8 @@ | |||
os: Use a trusted operating system entropy source. | |||
This is the default method if such an entropy | |||
source exists. | |||
getrandom: Use the L<getrandom(2)> system call if available. | |||
getrandom: Use the L<getrandom(2)> or equivalant system |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
equivalant -> equivalent
# endif | ||
|
||
# if defined(__linux) && defined(SYS_getrandom) | ||
return (int)syscall(SYS_getrandom, buf, buflen, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm only a layman in this matter, so please excuse me if my question is off-topic or stupid: In this LWN Article the author Jonathan Corbet describes that there was a lot of discussion about whether the glibc getrandom() call should be a thread cancellation point or not. Finally it was decided that it should be.
The most extensive argument, though, was over whether getrandom() should be a thread cancellation point.
In other words, what should happen if pthread_cancel() is called on a thread that is currently blocked
in getrandom()? The original patch did make getrandom() into a cancellation point; it still behaves
that way in the version merged for 2.25, but it had to survive a lot of argument to get there.
While the glibc getrandom() is a thread cancellation point, your raw syscall is not. Does this difference have any practical relevance for OpenSSL?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's relevant. Cancellation points are just a bunch of functions where the cancellation can happen, which are typically function that can block.
+1 from me for Kurt's pull request, but I don't have the expertise to assess the correctness of the implementation. |
I was planning on updating some comment, let me do that now. |
I intend to squash this all into 1 commit and change the commit message to: |
f32d435
to
bcdae92
Compare
I had to resolve conflicts, I've fixed them and squashed all the commits. |
# endif | ||
# endif | ||
|
||
# if (defined(__FreeBSD__) && __FreeBSD_version >= 1200061) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worth splitting this like was done for __GLIBC_PREREQ
?
If neither __FreeBSD__
nor __FreeBSD_version
is defined, isn't there is a syntax error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason for the __GLIBC_PREREQ() on the next line is that it takes parameters, so you need to be sure it exist before you use it.
crypto/rand/rand_unix.c
Outdated
* of longs. | ||
*/ | ||
if (buflen % sizeof(long) != 0) | ||
return 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there reason to punish caller that hard? Why not read buflen - buflen % sizeof(long), and then read one long to temporary storage and and copy buflen %sizeof(long) bytes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason is that the loop is the only example I could find for something that is undocumented, and the comment is based on what I saw in the source, and I'm not even 100% sure that's correct. But that loop clearly has a problem if the implementation is really using a long.
I think us trying to detect which implementation is in the kernel, and then starting using longs is the wrong way to go. I have no idea if there are other cases where it returns a buffer that's shorter than the one we ask. But the code with longs didn't look at the length and always filled a whole long.
So I think the easiest way to deal with this is have the calling code always request a multiple of the size of a long, and throw away the rest if it needs less. We currently always have a multiple of the size of a long anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We currently always have a multiple of the size of a long anyway.
Then it would arguably be appropriate to use assert here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that ossl_assert()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case yes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I rather meant
if (!ossl_assert(buflen % sizeof(long) == 0))
return 0;
so that it still return 0 in case condition is not met, and debug build crashes. If you don't have condition just assert is more appropriate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I didn't know it could return a value. I just merged this like an hour ago. I'll make a new PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mspncp pushed me to test that patch. OS: Debian Stretch (9.4)
so, it looks like the code works as expected. |
Thanks for testing! BTW: ping @openssl/committers, can anyone of you review this, so we can merge it before the next release? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add support for gettng entropy using system calls
The commit message contains a typo (gettng -> getting). Also, it is a bit vague; after all, a read() is also a system call. How about using the following?
Add fallback for missing glibc getentropy() call
@@ -224,7 +224,8 @@ | |||
os: Use a trusted operating system entropy source. | |||
This is the default method if such an entropy | |||
source exists. | |||
getrandom: Use the L<getrandom(2)> system call if available. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
equivalant -> equivalent
if (sysctl(mib, 2, buf, &len, NULL, 0) == -1) | ||
return done; | ||
done += len; | ||
buf += len; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any progress on FreeBSD?
We use getrandom() from glibc, not getentropy. And it's also not glibc specific. How about: FreeBSD was tested, but only the version that doesn't use getrandom(), which is was most people currently will end up using anyway. |
Or just: Add support for getrandom() or equivalent system calls and use them by default |
Yeah, my fault.
That's ok with me. |
So can someone review this? |
If nobody else shows up, you can merge with my approval. But I'd really prefer if some else would have a look at this who is more familiar with the technical details. |
…y default Reviewed-by: Dr. Matthias St. Pierre <Matthias.St.Pierre@ncp-e.com> GH: #5910
@kroeckx when hitting this code path, Can you take a look at that and make sure this ouput is consistent with the actual behavior? |
works as expected here: $ openssl11 version -a |
The |
So basically And in the |
Correct. |
Glibc only provides getrandom() since version 2.25, while the Linux
kernel has support for it since version 3.17.
Checklist