Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace C implementation of OS Random engine with Python one that just calls os.urandom #2073

Merged
merged 29 commits into from Jul 1, 2015

Conversation

glyph
Copy link
Contributor

@glyph glyph commented Jun 27, 2015

This is alternate fix for #2007 that moves file descriptor shenanigans (and C code) out of cryptography entirely. It is intended to supersede #2035 .

apparently (?) ENGINE_by_id treats its ID as an opaque *pointer* key and
not actually as a string, and while CPython's CFFI support seems to
manage to preserve the pointer identity when using the same Python
string, PyPy doesn't.  Fix things to use a cffi-wrapped pointer again
and tests pass on PyPy.
@glyph glyph mentioned this pull request Jun 27, 2015
void (*add)(const void *, int, double);
int (*pseudorand)(unsigned char *, int);
int (*status)();
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be written typedef struct { stuff } RAND_METHOD; so you don't need to know the name

@retain
@cls.ffi.callback("ENGINE_GEN_INT_FUNC_PTR", error=0)
def osrandom_finish(engine):
return 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I think finish and init can be totally dropped.

make sure we're not in an error state when we start, because then all
bets are off and we might consume an error we didn't cause. then clear
the error queue, which restores the behavior of the way the C module was
previously checking for existence of its engine.
assert result == 1
looked_up_engine = lib.ENGINE_by_id(_osrandom_engine_id)
assert looked_up_engine != ffi.NULL
return 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The return values aren't being used here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a test which fails without them.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, we should update the test. It was written on the assumption of a C API. Now that this code is python we can do something sane, instead of C's bonkers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"something sane" being "raise an exception on the second call"?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds reasonable

On Sat, Jun 27, 2015 at 9:25 PM, Glyph notifications@github.com wrote:

In src/cryptography/hazmat/bindings/openssl/binding.py
#2073 (comment):

  • engine = lib.ENGINE_new()
  • try:
  •    result = lib.ENGINE_set_id(engine, _osrandom_engine_id)
    
  •    assert result == 1
    
  •    result = lib.ENGINE_set_name(engine, _osrandom_engine_name)
    
  •    assert result == 1
    
  •    result = lib.ENGINE_set_RAND(engine, method)
    
  •    assert result == 1
    
  •    result = lib.ENGINE_add(engine)
    
  •    assert result == 1
    
  • finally:
  •    result = lib.ENGINE_free(engine)
    
  •    assert result == 1
    
  • looked_up_engine = lib.ENGINE_by_id(_osrandom_engine_id)
  • assert looked_up_engine != ffi.NULL
  • return 1

"something sane" being "raise an exception on the second call"?


Reply to this email directly or view it on GitHub
https://github.com/pyca/cryptography/pull/2073/files#r33417937.

"I disapprove of what you say, but I will defend to the death your right to
say it." -- Evelyn Beatrice Hall (summarizing Voltaire)
"The people's good is the highest law." -- Cicero
GPG Key fingerprint: 125F 5C67 DFE9 4084

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any recommendations for what a good exception type would be?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RuntimeError? (Let's just get something down on paper, we can argue over
that detail later)

On Sat, Jun 27, 2015 at 9:29 PM, Glyph notifications@github.com wrote:

In src/cryptography/hazmat/bindings/openssl/binding.py
#2073 (comment):

  • engine = lib.ENGINE_new()
  • try:
  •    result = lib.ENGINE_set_id(engine, _osrandom_engine_id)
    
  •    assert result == 1
    
  •    result = lib.ENGINE_set_name(engine, _osrandom_engine_name)
    
  •    assert result == 1
    
  •    result = lib.ENGINE_set_RAND(engine, method)
    
  •    assert result == 1
    
  •    result = lib.ENGINE_add(engine)
    
  •    assert result == 1
    
  • finally:
  •    result = lib.ENGINE_free(engine)
    
  •    assert result == 1
    
  • looked_up_engine = lib.ENGINE_by_id(_osrandom_engine_id)
  • assert looked_up_engine != ffi.NULL
  • return 1

Any recommendations for what a good exception type would be?


Reply to this email directly or view it on GitHub
https://github.com/pyca/cryptography/pull/2073/files#r33417944.

"I disapprove of what you say, but I will defend to the death your right to
say it." -- Evelyn Beatrice Hall (summarizing Voltaire)
"The people's good is the highest law." -- Cicero
GPG Key fingerprint: 125F 5C67 DFE9 4084

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Works for me.

@glyph
Copy link
Contributor Author

glyph commented Jun 28, 2015

The last commit addresses @alex's last bit of feedback, so I think this is good to go now?

import threading

from cryptography.hazmat.bindings._openssl import ffi, lib


@ffi.callback("int (*)(unsigned char *, int)", error=-1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@reaperhulk do we need to call ERR_put_error on error here, or will stuff work ok without it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Things will work fine without it, but I'm curious why we're returning -1 here in case of error? 0 is what the old code returned.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://www.openssl.org/docs/crypto/RAND_bytes.html -1 seems more correct
than 0 (particularly for psueobytes)

On Sun, Jun 28, 2015 at 3:49 PM, Paul Kehrer notifications@github.com
wrote:

In src/cryptography/hazmat/bindings/openssl/binding.py
#2073 (comment):

import threading

from cryptography.hazmat.bindings._openssl import ffi, lib

+@ffi.callback("int (*)(unsigned char *, int)", error=-1)

Things will work fine without it, but I'm curious why we're returning -1
here in case of error? 0 is what the old code returned.


Reply to this email directly or view it on GitHub
https://github.com/pyca/cryptography/pull/2073/files#r33426214.

"I disapprove of what you say, but I will defend to the death your right to
say it." -- Evelyn Beatrice Hall (summarizing Voltaire)
"The people's good is the highest law." -- Cicero
GPG Key fingerprint: 125F 5C67 DFE9 4084

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-1 is definitely the right response for pseudo (since that's a terrible response code system and we handled it that way in the past), but I'm not convinced it's the right one for RAND_bytes. It's probably not a big deal, but I do not know what OpenSSL does when you tell it the random engine doesn't support RAND_bytes as opposed to telling it that it's broken.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everysingle caller of RAND_bytes in ssl/ uses <= 0 to check the
status, so it seems to be immaterial.

On Sun, Jun 28, 2015 at 3:58 PM, Paul Kehrer notifications@github.com
wrote:

In src/cryptography/hazmat/bindings/openssl/binding.py
#2073 (comment):

import threading

from cryptography.hazmat.bindings._openssl import ffi, lib

+@ffi.callback("int (*)(unsigned char *, int)", error=-1)

-1 is definitely the right response for pseudo (since that's a terrible
response code system and we handled it that way in the past), but I'm not
convinced it's the right one for RAND_bytes. It's probably not a big
deal, but I do not know what OpenSSL does when you tell it the random
engine doesn't support RAND_bytes as opposed to telling it that it's broken.


Reply to this email directly or view it on GitHub
https://github.com/pyca/cryptography/pull/2073/files#r33426295.

"I disapprove of what you say, but I will defend to the death your right to
say it." -- Evelyn Beatrice Hall (summarizing Voltaire)
"The people's good is the highest law." -- Cicero
GPG Key fingerprint: 125F 5C67 DFE9 4084

@alex
Copy link
Member

alex commented Jun 28, 2015

Besides that comment this LGTM, @reaperhulk what do you think?

Now that this is more easily worked with (because python), we should file some follow up issues to write tests that stub out os.urandom and verify that the results are used correctly.

@reaperhulk
Copy link
Member

Since we do have the ability to do some monkeypatching in tests I'd like to see at least one test that verifies the data from os.urandom is being copied into OpenSSL-land properly. It can really just call RAND_bytes directly.

Incidentally, I did a quick crappy test to see if there's any significant speed difference (as this method should result in at least one additional memory copy in addition to calling into Python) and a loop of 10000 fetches of 8192 bytes of randomness took about the same amount of time with both the old and new approach.

@alex
Copy link
Member

alex commented Jun 29, 2015

@reaperhulk 8192 is a pretty big read for /dev/urandom, if you've got the script handy, could you try with 16 byte reads?

@reaperhulk
Copy link
Member

100,000 16 byte reads takes ~0.6-0.7s with the C engine code, ~0.9-1.0s with this PR.

@reaperhulk
Copy link
Member

Here's an even more minimal script that attempts to solely measure the cost of the RAND_bytes call (my previous code allocated a new buffer and also pulled out the data after each call)

from cryptography.hazmat.backends.openssl import backend

buf = backend._ffi.new("char[16]")
for _ in range(100000):
    res = backend._lib.RAND_bytes(buf, 16)
    assert res == 1

On my machine running time python testrand.py (best run of 5)

This PR

real    0m0.848s
user    0m0.567s
sys 0m0.276s

Variance between 0.848s and 0.860s

Master

real    0m0.535s
user    0m0.326s
sys 0m0.205s

Variance between 0.535s and 0.554s

I don't necessarily think this is going to be a serious problem, just something to be aware of. It could conceivably negatively affect TLS handshake performance? PyOpenSSL won't be affected though because it doesn't activate this engine.

@glyph
Copy link
Contributor Author

glyph commented Jun 29, 2015

I cranked up the iteration count by another 10x, switched to xrange instead of range, and measured with PyPy.

best of 3:

This PR

real    0m2.193s
user    0m0.731s
sys 0m1.463s

Variance between 2.292 and 2.193

Master

real    0m1.857s
user    0m0.407s
sys 0m1.447s

Variance between 1.857 and 1.945

It looks to me like we're pretty clearly just measuring function call / buffer copy overhead for your Python runtime here, and pypy does commensurately better. How much random data does one TLS handshake require? My suspicion is that this amount of additional overhead (on pypy, approximately 3µs/call - and it is per call, not per byte, as the overhead all but disappears with fewer calls generating the same number of bytes) will verge on undetectable. Just ballpark-wise, http://www.semicomplete.com/blog/geekery/ssl-latency.html (from 2010) puts TLS handshake latency at 322000µs, so we'd have to be calling RAND_bytes 1000 times in the course of the handshake to cause a 1% degradation to performance.

@alex
Copy link
Member

alex commented Jun 29, 2015

Yeah, I think the performance implications of this are totally 100% fine :-)

@glyph
Copy link
Contributor Author

glyph commented Jun 30, 2015

That last commit should address @reaperhulk's feedback. Anything else?

@codecov-io
Copy link

Current coverage is 99.8%

Merging #2073 into master will change coverage by 0% by cd8977b

Coverage Diff

@@            master   #2073   diff @@
======================================
  Files          112     112       
  Stmts        10632   10683    +51
  Branches      1223    1225     +2
  Methods          0       0       
======================================
+ Hit          10611   10662    +51
  Partial         21      21       
  Missed           0       0       

Powered by Codecov

@reaperhulk
Copy link
Member

The patched test is failing on libressl.

@glyph
Copy link
Contributor Author

glyph commented Jun 30, 2015

The patched test is failing on libressl.

whyyyyyyyy

OK. Any tips on how to set up a libressl build environment that I can test with?

@glyph
Copy link
Contributor Author

glyph commented Jun 30, 2015

For those following along at home, I did a brew install libressl and then this helpful shell function to switch OpenSSL's:

function kegger () {
    for kegtoadd in "$@"; do
        local kegpath="/usr/local/opt/$kegtoadd";
        export LDFLAGS="-L${kegpath}/lib $LDFLAGS";
        export CPPFLAGS="-I${kegpath}/include $CPPFLAGS";
        export PATH="${kegpath}/bin:$PATH";
    done;
}

@glyph
Copy link
Contributor Author

glyph commented Jun 30, 2015

OpenSSL:

int RAND_bytes(unsigned char *buf, int num)
{
    const RAND_METHOD *meth = RAND_get_rand_method();
    if (meth && meth->bytes)
        return meth->bytes(buf, num);
    return (-1);
}

LibreSSL:

/*
 * Hurray. You've made it to the good parts.
 */
int
RAND_bytes(unsigned char *buf, int num)
{
    if (num > 0)
        arc4random_buf(buf, num);
    return 1;
}

See also libressl/portable#17

@glyph
Copy link
Contributor Author

glyph commented Jun 30, 2015

Regarding the build failures:

  • the travis failure is a spurious issue with installing dependencies on the docs builder.
  • the codecov failure is a result of LibreSSL not being part of the test matrix; I added that code as a result of @reaperhulk's feedback, so I'm assuming LibreSSL should be in that matrix?

@reaperhulk
Copy link
Member

Libre is in our jenkins cluster and historically we haven't been able to correlate coverage from that. codecov may give us that ability now, but that's going to be a separate investigation. For now I'd say you should make a new function that triggers a skip if you pass a string with "LibreSSL" in it, call that from your test, and then add one more function that tests the skip (using with pytest.raises(pytest.skip.Exception):)

@reaperhulk
Copy link
Member

LGTM, awesome work @glyph. We should (probably as a separate PR) update our docs around the OS random engine to note explicitly that it uses the Python os.urandom implementation (and what that entails) along with a note that LibreSSL patches random itself.

@alex
Copy link
Member

alex commented Jul 1, 2015

Merge?

On Tue, Jun 30, 2015 at 8:18 PM, Paul Kehrer notifications@github.com
wrote:

LGTM, awesome work @glyph https://github.com/glyph. We should (probably
as a separate PR) update our docs around the OS random engine to note
explicitly that it uses the Python os.urandom implementation (and what
that entails) along with a note that LibreSSL patches random itself.


Reply to this email directly or view it on GitHub
#2073 (comment).

"I disapprove of what you say, but I will defend to the death your right to
say it." -- Evelyn Beatrice Hall (summarizing Voltaire)
"The people's good is the highest law." -- Cicero
GPG Key fingerprint: 125F 5C67 DFE9 4084

@glyph
Copy link
Contributor Author

glyph commented Jul 1, 2015

YES PLEASE

On Jun 30, 2015, at 6:08 PM, Alex Gaynor notifications@github.com wrote:

reaperhulk added a commit that referenced this pull request Jul 1, 2015
Replace C implementation of OS Random engine with Python one that just calls os.urandom
@reaperhulk reaperhulk merged commit 0a4c9cc into pyca:master Jul 1, 2015
@glyph
Copy link
Contributor Author

glyph commented Jul 1, 2015

🎉 🎈 ⭐ 🌟 🌟 ⭐ ✨ 💖 👍 🆗 🎆

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 9, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Development

Successfully merging this pull request may close these issues.

None yet

4 participants