Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New osrandom_engine in C #3229

Merged
merged 12 commits into from Dec 9, 2016
Merged

New osrandom_engine in C #3229

merged 12 commits into from Dec 9, 2016

Conversation

@tiran
Copy link
Contributor

@tiran tiran commented Nov 10, 2016

Inspired by Python/random.c and the old implementation. Let's ask travis if it works on all platforms.

Signed-off-by: Christian Heimes christian@python.org

  • Test /dev/urandom path
  • Test getrandom() path
  • Test getentropy() path
  • Test CryptGenRandom path
@mention-bot
Copy link

@mention-bot mention-bot commented Nov 10, 2016

@tiran, thanks for your PR! By analyzing the history of the files in this pull request, we identified @reaperhulk, @glyph and @alex to be potential reviewers.

@glyph
Copy link
Contributor

@glyph glyph commented Nov 10, 2016

On the face of it, this is a terrible idea; ~400 lines of C to replace ~40 lines of Python; 2 open file descriptors for your whole process lifetime instead of 1.

However, knowing the history of this I am sure there's a very good reason for this change. Can you link it to an issue describing said reasons, please? I think it's super important to thoroughly document what is going on here, since it is incredibly confusing; without very clear explication I am concerned that a future maintainer will haplessly reintroduce the bug that this is fixing.

(Secretly, of course, I want to read that description and have a brilliant idea that will salvage the Python implementation and not have to drag around half a thousand lines of duplicative tricky wrapper C code; alas, if I haven't by now, I very much doubt I'll be able to.)

@tiran tiran force-pushed the tiran:new_osrandom_engine branch from b96bbe5 to 26836ff Nov 10, 2016
@tiran
Copy link
Contributor Author

@tiran tiran commented Nov 10, 2016

Ha, mentionbot betrayed me to @glyph :)

@reaperhulk ask me to look into the matter. My patch is still in a very early phase. Basically I'm just abusing the build and CI testing to test my quick hack on other platforms. I'll take care of documentation later.

@reaperhulk
Copy link
Member

@reaperhulk reaperhulk commented Nov 11, 2016

@glyph: The answer is that we run into endless problems with calling into Python from C on the margins. The most recent of which were apps like Kodi, which embed Python as a scripting option. A Kodi plugin adds cryptography, which registers a callback into Python...but there's only one instance of OpenSSL in the process so now things that aren't even aware of Python are attempting to call back into Python. You could argue that disabling the osrandom engine is the right solution in such a scenario but "right" and "things people will do" are not always overlapping sets.

We also continue to have subinterpreter problems which are probably soluble but I am so very tired of trying to fix it when moving it entirely to C will instantly resolve this entire class of issue.

Combine that with our interpreter shutdown issues in newer Pythons and the fact that we need to entirely abandon our Python-based locking callbacks (see: #3226 and linked issues) and there's a strong incentive to move this entirely to C because frankly users will have less problems.

The cost is that we are once again hoisting the complexity CPython formerly provided for us into our own domain and will be responsible for managing that. I don't believe we can continue to say "this isn't a big deal" though. Every time cryptography fails in a weird, undebuggable way when dangerous, unsupported libraries like PyCrypto do not we lose.

Sorry for the semi-rambling response, but I thought you deserved a reasonable dump of my thoughts on why this has become a desirable scenario. The tl;dr is "being reliably functional in all the environments Python is used in trumps the ideological purity of using os.urandom"

@reaperhulk
Copy link
Member

@reaperhulk reaperhulk commented Nov 11, 2016

@tiran Thanks for the awesome work here so far. I look forward to reviewing it.

As an aside: CCRandomGenerateBytes might be an avenue we could pursue in a future PR on the mac side (available in 10.10+). SecRandomCopyBytes is also available since 10.7 but we'd have to investigate the implementation of each of those to see if they're truly suitable.

@tiran tiran force-pushed the tiran:new_osrandom_engine branch 4 times, most recently from 77e9f2b to 1d38d6f Nov 12, 2016
@tiran tiran changed the title New osrandom_engine in C (WIP) New osrandom_engine in C Nov 12, 2016
@tiran
Copy link
Contributor Author

@tiran tiran commented Nov 12, 2016

@reaperhulk The PR is ready for review. I added a CCRandomGenerateBytes implementation.

@tiran tiran force-pushed the tiran:new_osrandom_engine branch from 1d38d6f to aa43143 Nov 12, 2016
@tiran
Copy link
Contributor Author

@tiran tiran commented Nov 12, 2016

All Net/Free/Open BSD seem to use the same syscall name SYS_getentropy. I'm using #ifdef SYS_getentropy to select getentropy() instead of /dev/urandom.
https://bz-attachments.freebsd.org/attachment.cgi?id=149067

@tiran tiran force-pushed the tiran:new_osrandom_engine branch from aa43143 to 6ed7f91 Nov 12, 2016
Copy link
Member

@reaperhulk reaperhulk left a comment

First round of review, haven't looked closely at the actual functionality yet

/****************************************************************************
* BSD getentropy
*/
#if CRYPTOGRAPHY_OSRANDOM_ENGINE == CRYPTOGRAPHY_OSRANDOM_ENGINE_GETRANDOM

This comment has been minimized.

@reaperhulk

reaperhulk Nov 12, 2016
Member

This should be CRYPTOGRAPHY_OSRANDOM_ENGINE_GETENTROPY

static int osrandom_rand_status(void) {
return 1;
}
#endif /* CRYPTOGRAPHY_OSRANDOM_ENGINE_GETRANDOM */

This comment has been minimized.

@reaperhulk

reaperhulk Nov 12, 2016
Member

This comment should be CRYPTOGRAPHY_OSRANDOM_ENGINE_CC_RANDOM

This comment has been minimized.

@tiran

tiran Nov 13, 2016
Author Contributor

Thx, I messed up some find and replace operations. The error was consistent over all files so it just worked.

static int osrandom_rand_status(void) {
return 1;
}
#endif /* CRYPTOGRAPHY_OSRANDOM_ENGINE_GETRANDOM */

This comment has been minimized.

@reaperhulk

reaperhulk Nov 12, 2016
Member

CRYPTOGRAPHY_OSRANDOM_ENGINE_GETENTROPY

}
}
}
#endif

This comment has been minimized.

@reaperhulk

reaperhulk Nov 12, 2016
Member

Should this have a CRYPTOGRAPHY_OSRANDOM_NEEDS_DEV_URANDOM comment?

* Linux getrandom engine with fallback to dev_urandom
*/

#if CRYPTOGRAPHY_OSRANDOM_ENGINE == CRYPTOGRAPHY_OSRANDOM_ENGINE_GETENTROPY

This comment has been minimized.

@reaperhulk

reaperhulk Nov 12, 2016
Member

CRYPTOGRAPHY_OSRANDOM_ENGINE_GETRANDOM

}
return 1;
}
#endif /* CRYPTOGRAPHY_OSRANDOM_ENGINE_GETENTROPY */

This comment has been minimized.

@reaperhulk

reaperhulk Nov 12, 2016
Member

CRYPTOGRAPHY_OSRANDOM_ENGINE_GETRANDOM

/* for defined(BSD) */
#include <sys/param.h>

#ifdef BSD

This comment has been minimized.

@reaperhulk

reaperhulk Nov 12, 2016
Member

Does FreeBSD define this? I think they typically use __FreeBSD__. What about macOS (which apparently has SYS_getentropy as of 10.12)?

#define CRYPTOGRAPHY_OSRANDOM_ENGINE CRYPTOGRAPHY_OSRANDOM_ENGINE_CC_RANDOM
#elif defined(BSD) && defined(SYS_getentropy)
/* OpenBSD 5.6+ */
#define CRYPTOGRAPHY_OSRANDOM_ENGINE CRYPTOGRAPHY_OSRANDOM_ENGINE_GETRANDOM

This comment has been minimized.

@reaperhulk

reaperhulk Nov 12, 2016
Member

This should be CRYPTOGRAPHY_OSRANDOM_ENGINE_GETENTROPY with the naming fixes requested above

#define CRYPTOGRAPHY_OSRANDOM_ENGINE CRYPTOGRAPHY_OSRANDOM_ENGINE_GETRANDOM
#elif defined(__linux__) && defined(SYS_getrandom)
/* Linux 3.4.17+ */
#define CRYPTOGRAPHY_OSRANDOM_ENGINE CRYPTOGRAPHY_OSRANDOM_ENGINE_GETENTROPY

This comment has been minimized.

@reaperhulk

reaperhulk Nov 12, 2016
Member

And this should be CRYPTOGRAPHY_OSRANDOM_ENGINE_GETRANDOM

#define CRYPTOGRAPHY_OSRANDOM_NEEDS_DEV_URANDOM 1
#endif

/* Open SSL 1.1.0+ has no ERR_R_RAND_LIB */

This comment has been minimized.

@reaperhulk

reaperhulk Nov 12, 2016
Member

What do you think about always passing 0 and not worrying about a macro here?

@reaperhulk
Copy link
Member

@reaperhulk reaperhulk commented Nov 12, 2016

Coverage is failing because _osrandom_rand_bytes and _osrandom_rand_status need to be removed from src/cryptography/hazmat/bindings/openssl/binding.py

@tiran tiran force-pushed the tiran:new_osrandom_engine branch 2 times, most recently from da3f779 to 807f437 Nov 13, 2016
assert name
if sys.platform.startswith('linux'):
assert name in ['getrandom', '/dev/urandom']
elif sys.platform == 'win32':

This comment has been minimized.

@reaperhulk

reaperhulk Nov 13, 2016
Member

Our CI can't cover this branch (we don't get coverage from our windows builders, although this is being executed) so we'll need to figure out a way to rewrite this. Maybe make the asserts a separate function that takes a string _engine_name_assert(sys.platform) and then

def _engine_name_assert(platform):
    if platform.startswith('linux'):
        assert name in ['getrandom', '/dev/urandom']
   elif platform == 'win32':
        assert name == 'CryptGenRandom'

and then a separate test for _engine_name_assert that explicitly passes win32?

@alex
Copy link
Member

@alex alex commented Nov 13, 2016

Failing docs tests are relevant; looks like fallback needs to be added to the wordlist :-)

@tiran tiran force-pushed the tiran:new_osrandom_engine branch 2 times, most recently from 952ebe9 to 2f00fae Nov 14, 2016
@tiran
Copy link
Contributor Author

@tiran tiran commented Nov 14, 2016

I added fallback to spelling wordlist. On my machine the doc tests are failing with a different error. My spell checker knows about fallback but does not recognize personalization:

hazmat/primitives/cryptographic-hashes.rst:129: (personalization)
@tiran tiran force-pushed the tiran:new_osrandom_engine branch from 2f00fae to 075de30 Nov 14, 2016
@reaperhulk
Copy link
Member

@reaperhulk reaperhulk commented Nov 14, 2016

https://codecov.io/gh/pyca/cryptography/compare/3a15b03e92c9fdeadff04ddd2ce505028b279b86...075de30232f1fd6f1e0d541844e01eab93a006b3#74657374732F68617A6D61742F6261636B656E64732F746573745F6F70656E73736C2E7079-280...281 is our coverage problem that's causing the status check failures.

We do have FreeBSD 10 (but not 11) in our jenkins cluster, but no OpenBSD and we don't get coverage from those builders right now. 😢

In an ideal world what would CI look like here?

@tiran tiran force-pushed the tiran:new_osrandom_engine branch 3 times, most recently from 669295e to d4ae881 Nov 14, 2016
if (urandom_cache.fd >= 0) {
do {
n = close(fd);
} while (n < 0 && errno == EINTR);

This comment has been minimized.

@vstinner

vstinner Nov 15, 2016

It would be kind to call PyErr_CheckSignals() to fail fast on CTRL+C: PEP 475.

This comment has been minimized.

@tiran

tiran Nov 15, 2016
Author Contributor

The engine does not not use any Python API at all. I don't consider SIGINT a problem here. The engine is a low level part of OpenSSL's CPRNG. The CPRNG API should take care of of interrupts.

This comment has been minimized.

@alex

alex Dec 6, 2016
Member

Is closign an fd in a loop safe? I seem to recall it being dangerous :-/

Copy link
Member

@reaperhulk reaperhulk left a comment

Looks like this needs a rebase as well now

@@ -40,6 +40,10 @@ greater.
Activates the OS random engine. This will effectively disable OpenSSL's
default CSPRNG.

.. method:: osrandom_engine_implementation()

This comment has been minimized.

@reaperhulk

reaperhulk Nov 29, 2016
Member

.. versionadded:: 1.7 please

This comment has been minimized.

@tiran

tiran Nov 29, 2016
Author Contributor

fixed

@tiran tiran force-pushed the tiran:new_osrandom_engine branch from e7ddb5f to c67e987 Nov 29, 2016
@reaperhulk reaperhulk dismissed their stale review Dec 1, 2016

changes made

LICENSE Outdated
The code used in the OpenSSL locking callback is derived from the same in
Python itself, and is licensed under the terms of the PSF License Agreement.
The code used in the OpenSSL locking callback and OS random engine is derived
from the same in Python itself, and is licensed under the terms of the PSF

This comment has been minimized.

@alex

alex Dec 6, 2016
Member

Ugh, nit, not technically a bug in your PR (but if I go fix it separately it'll just cause a conflict). Can you s/Python/CPython/

| Windows | ``CryptGenRandom()`` |
+------------------------------------------+------------------------------+
| Linux >= 3.4.17 with working | ``getrandom(GRND_NONBLOCK)`` |
| ``SYS_getrandom`` syscall | |

This comment has been minimized.

@alex

alex Dec 6, 2016
Member

You can drop the leading SYS here.

This comment has been minimized.

@tiran

tiran Dec 6, 2016
Author Contributor

Unless you really insist, I'd rather keep SYS_getrandom because that is what the random engine actually uses. It's going to make it easier to distinguish between platforms with the syscall and platforms with a glibc wrapper later.

#ifndef GRND_NONBLOCK
#define GRND_NONBLOCK 0x0001
#endif /* GRND_NONBLOCK */
#endif /* _linux__ */

This comment has been minimized.

@alex

alex Dec 6, 2016
Member

Missing leading _ here

#ifdef __linux__
/* for SYS_getrandom */
#include <sys/syscall.h>
#ifndef GRND_NONBLOCK

This comment has been minimized.

@alex

alex Dec 6, 2016
Member

In what circumstances it this not defined?

This comment has been minimized.

@tiran

tiran Dec 6, 2016
Author Contributor

The constant is defined in user land headers. Sometimes the running Kernel has the syscall but the userland hasn't caught up yet.

@@ -282,6 +271,23 @@ def test_activate_builtin_random_already_active(self):
e = backend._lib.ENGINE_get_default_RAND()
assert e == backend._ffi.NULL

def test_osrandom_engine_implementation(self):
name = backend.osrandom_engine_implementation()
assert name in ['/dev/urandom', 'CCRandomGenerateBytes',

This comment has been minimized.

@alex

alex Dec 6, 2016
Member

CCRandomGenerateBytes is not used anywhere.

}
}

static const char* osurandom_get_implementation(void) {

This comment has been minimized.

@alex

alex Dec 6, 2016
Member

* next to the function name, rather than the type please.

if (urandom_cache.fd >= 0) {
do {
n = close(fd);
} while (n < 0 && errno == EINTR);

This comment has been minimized.

@alex

alex Dec 6, 2016
Member

Is closign an fd in a loop safe? I seem to recall it being dangerous :-/

/* Can fail with:
* - ENOSYS: Kernel does not support the syscall.
* - ENOPERM: seccomp prevents syscall.
* - EAGAIN: Kernel CRPNG has not been seeded yet.

This comment has been minimized.

@alex

alex Dec 6, 2016
Member

In the even of EAGAIN I'm pretty sure we want to still use getrandom, not fall back.

This comment has been minimized.

@tiran

tiran Dec 6, 2016
Author Contributor

Let's discuss this matter on IRC later. I don't want to repeat the /dev/urandom discussion from python-dev on github. :)

This comment has been minimized.

@alex

alex Dec 7, 2016
Member

Ok, you know how to ping me.

This comment has been minimized.

@socketpair

socketpair Nov 15, 2018
Contributor

  1. Please give link to that conversation.
  2. It would be nice to place explanation in documentation.
long n;
while (size > 0) {
do {
n = syscall(SYS_getrandom, buffer, size, GRND_NONBLOCK);

This comment has been minimized.

@alex

alex Dec 6, 2016
Member

What does this do on large (>256) size?

This comment has been minimized.

@tiran

tiran Dec 6, 2016
Author Contributor

http://man7.org/linux/man-pages/man2/getrandom.2.html

If the
/dev/urandom pool has been initialized, reads of up to 256 bytes will
always return as many bytes as requested and will not be interrupted
by signals. No such guarantees apply for larger buffer sizes. For
example, if the call is interrupted by a signal handler, it may
return a partially filled buffer, or fail with the error EINTR.

}

static int osrandom_rand_status(void) {
if (urandom_cache.fd < 0) {

This comment has been minimized.

@alex

alex Dec 6, 2016
Member

Simplifies to return urandom_cache.fd > 0

This comment has been minimized.

@tiran

tiran Dec 6, 2016
Author Contributor

>= 0 or != -1. Theoretically the fd can by 0,

tiran and others added 11 commits Nov 10, 2016
Inspired by Python/random.c and the old implementation.

Signed-off-by: Christian Heimes <christian@python.org>
* Fix naming bug caused by search 'n replace mistake
* Make it easier to override osrandom auto-detection
* Add engine ctrl and backend API to get implementation from ENGINE

Signed-off-by: Christian Heimes <christian@python.org>
Signed-off-by: Christian Heimes <christian@python.org>
Signed-off-by: Christian Heimes <christian@python.org>
* read() returns size_t

Signed-off-by: Christian Heimes <christian@python.org>
This change allows us to test all the engines in our CI:
* getentropy (tested by macOS sierra)
* getrandom (tested on several linux builders)
* /dev/urandom (tested on FreeBSD, OS X 10.11 and below, & older linux)
* CryptGenRandom (tested on windows builders)

I also fixed bugs preventing compilation in the getentropy code
Signed-off-by: Christian Heimes <christian@python.org>
Signed-off-by: Christian Heimes <christian@python.org>
Signed-off-by: Christian Heimes <christian@python.org>
@tiran tiran force-pushed the tiran:new_osrandom_engine branch from c05ea01 to 8575fb2 Dec 6, 2016
@alex
Copy link
Member

@alex alex commented Dec 7, 2016

The EAGAIN issue is the last remaining issue for me.

Add error reporting strings for various error cases. This gives us much
nicer and understandable error messages.

SYS_getrandom() EAGAIN is now an error. Cryptography refuses to
initialize its osrandom engine when the Kernel's CPRNG hasn't been
seeded yet.

Signed-off-by: Christian Heimes <christian@python.org>
@tiran tiran force-pushed the tiran:new_osrandom_engine branch from 8d8df86 to 2abbab4 Dec 8, 2016
@reaperhulk reaperhulk merged commit 2e71776 into pyca:master Dec 9, 2016
6 checks passed
6 checks passed
codecov/patch 100% of diff hit (target 100%)
Details
codecov/project 100% (target 100%)
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
jenkins/docs Docs built successfully.
Details
jenkins/pep8 PEP8 passed.
Details
jenkins/pr Completed successfully.
Details
@tiran tiran deleted the tiran:new_osrandom_engine branch Dec 9, 2016
jsonn pushed a commit to jsonn/pkgsrc that referenced this pull request Dec 19, 2016
1.7.1 - 2016-12-13
~~~~~~~~~~~~~~~~~~

* Fixed a regression in ``int_from_bytes`` where it failed to accept
  ``bytearray``.

1.7 - 2016-12-12
~~~~~~~~~~~~~~~~

* Support for OpenSSL 1.0.0 has been removed. Users on older version of OpenSSL
  will need to upgrade.
* Added support for Diffie-Hellman key exchange using
  :meth:`~cryptography.hazmat.primitives.asymmetric.dh.DHPrivateKeyWithSerialization.exchange`
* The OS random engine for OpenSSL has been rewritten to improve compatibility
  with embedded Python and other edge cases. More information about this change
  can be found in the
  `pull request <https://github.com/pyca/cryptography/pull/3229>`_.
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 4, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

7 participants
You can’t perform that action at this time.