-
-
Notifications
You must be signed in to change notification settings - Fork 30.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
os.urandom() fails under high load #62956
Comments
I have seen complains from e.g. Tarek that os.urandom() fails under high load: https://twitter.com/tarek_ziade/status/362281268215418880 The problem is caused by file descriptor limits. os.urandom() opens /dev/urandom for every call. How about os.urandom() uses a persistent file descriptor? That should eliminate the error. It may alsos speed up os.urandom() because a persistent FD gets rid of open() and close() syscalls.
|
dev_urandom_python() should handle ENFILE and ENOENT differently to raise a different exception. Or it should always call PyErr_SetFromErrno(PyExc_OSError); ? Can tarek tell us more about its usecases: is he directly calling os.urandom() or does he use the random module? How many threads?
os.urandom() is called at Python startup to generate a "secret key" for random hash. If the file descriptor is never closed, the next file descriptor will be 4 instead of the expect 3. Always keeping an internal file descriptor open may have unexpected effects like leaking a file descriptor to a child process... (see the PEP-446, not implemented yet). I'm ok to keep a fd open if the user controls the lifetime of the object (lifetime of the fd). For example, I expect that rng = SystemRandom() opens /dev/urandom when the object is created, and closes it when the object is destroyed. |
I don't think that's bug in os.urandom(). If os.urandom() doesn't fail, something else will fail soon after. |
I agree with Antoine. Exhausting the FDs is not the problem, the problem is the misleading error. |
the random pool can be exhausted, but this is not "soon after" I think. In Linux and Mac OS X, ulimit -n defaults to 512 and 256. It's very easy to reach that limit if you write a web app that uses this API.
Do you suggest that we should not use os.urandom on high load ? Opening an FD on every call sounds under optimal, I am not seeing any drawback not to try to optimize that API. |
I was using ws4py inside greenlets. ws4py uses os.urandom() to generate some keys. So one single thread, many greenlets. |
I don't think he's referring to the entropy pool, but to RLIMIT_NOFILE.
What does high load mean?
Well, first we'll have to make the code thread-safe, if we want to |
It's highly unlikely that you are every going to exhaust the CPRNG to a
The drawback is a slightly more complicated implementation that has to |
a web app with a few hundreds concurrent requests.
I use greenlets. But, I don't know - are you suggesting os.urandom() should be marked in the documentation as "DOES NOT SCALE" and I should use another API ? Which one ? |
2013/8/16, Tarek Ziadé <report@bugs.python.org>:
Well, even with greenlets, I assume you're using at least one FD |
I do many calls on urandom() so that's the FD bottleneck.
Of course it is. But it looks like you know better without having looked at the code. :)
Let me know how to do this without being able to prevent the API to close the FD everytime.
Then we should improve it or deprecate it. |
Am 16.08.2013 18:24, schrieb Charles-François Natali:
Why locking? /dev/urandom is a pseudo char device. You can have multiple |
Unless you're doing many calls *in parallel* it's unlikely to be a os.urandom() is a convenience function, it doesn't have to be extremely |
So please explain me :-)
Simply open('/dev/urandom', 'rb').
I don't think it can be fixed. I think Christian's working on a PEP |
You must put a lock around the open() call, though, to avoid calling it |
Am 16.08.2013 18:47, schrieb Charles-François Natali:
In the light of the recent Android issue with PRNGs [1] I don't think |
That's what we're saying since message 1. Antoine, allo quoi! :)
I suggest that you tell it the documentation then, and explain that it does not scale and people should write their own thing. |
Exactly (unless the FD is open during the module initialization, |
Yeah, sure. |
Attaching a patch to make error reporting better. |
it sounded like you did not really want any explanation
do you mean that we cannot have two parallel calls of that function ?
a web socket application that spawns one socket per connection, then uses a lib that calls many times os.urandom(), will generate most of its FDs on os urandom but since you said that os.urandom() should not be used in the first place - that's what I will keep in mind |
Just to be explicit, |
Good point, Donald. os.urandom() is the only (simple) way to access the Windows randomness pool. |
Why didn't you include ENODEV? Otherwise, wouldn't self.addCleanup be simpler than the large |
That isn't mentioned in the POSIX open() spec: However ENODEV still seems to be a standard errno constant, so why not:
The problem is if some code tries to create a fd before the cleanup |
Looking at random.SystemRandom it appears it would suffer from the same FD exhaustion problem. So as of right now afaik none of the sources of cryptographically secure random in the python stdlib offer a way to open a persistent FD. The primary question on my mind is if os.urandom can't be modified to maintain a persistent FD can Python offer a urandom class that *will* maintain a persistent FD? |
Well, if we want to offer such a facility, let's bundle it in |
Attached patch to make os.urandom's fd persistent. |
Updated error handling patch testing for ENODEV. |
LGTM, you can apply to 2.7 and 3.x (I just hope all those errnos are |
New changeset 193bcc12575d by Antoine Pitrou in branch '3.3': New changeset fe949918616c by Antoine Pitrou in branch 'default': |
New changeset ec296a36156b by Antoine Pitrou in branch '2.7': |
Ok, committed. We're left with the persistent fd patch for 3.4. |
Updated patch for persistent fd. |
Updated patch after Christian's comments. |
LGTM |
Tarek: try to use ssl.RAND_bytes(), it is secure, fast and don't use a file IMO if something can be improved, it is in the random.SystemRandom() class: |
haypo: It's been suggested by a number of security professionals that using the OpenSSL random (or really any random) instead of urandom is likely to be a smarter idea. The likelyhood that urandom is broken is far less than any other source of random. This can be seen in the recent issues on the Android platform. This is not to say that there's a reason to believe that OpenSSL is broken currently, but that the chances are higher for it to be than /dev/urandom. An example of when this happened was http://www.debian.org/security/2008/dsa-1571. There's no reason to believe that OpenSSL is wrong right now, but the chances of OpenSSL being wrong are greater than the chances of /dev/urandom being There's been a few threads on twitter about it in light of the Android SecureRandom issue (don't need to read these, just here for reference): I don't think it actually matters if os.urandom or random.SystemRandom is the preferred interface that keeps the FD open but I do believe there should be one implementation that will use the OS source of random and maintain a persistent FD. |
Ok, you're gonna laugh, the simplified patch has a complication (not theoretical, it would trip test_cmd_line). Attaching patch. |
Antoine, this changeset broke Tiger buildbots badly: Traceback (most recent call last):
File "/Users/db3l/buildarea/3.x.bolen-tiger/build/Lib/test/test_os.py",
line 1033, in test_urandom_failure
ValueError: not allowed to raise maximum limit
""" http://buildbot.python.org/all/builders/x86%20Tiger%203.x/builds/6826/steps/test/logs/stdio """ The code trying to reset RLIMIT_NOFILE to its previous value fails, It looks like Tiger getrlimit() return a nonsensical value (maybe -1), I'd suggest two things:
|
Or more precisely, just run the test in a subprocess. That should fix |
New changeset b9e62929460e by Antoine Pitrou in branch '3.3': New changeset 68ff013b194c by Antoine Pitrou in branch 'default': New changeset 869df611c138 by Antoine Pitrou in branch '2.7': |
Ok, the tiger should feel better now :-) |
So, to come back to the original topic, is everyone sold on the idea of caching the urandom fd lazily? |
Lazily opening urandom and holding it open sounds like a sane thing to do to me +1 |
New changeset acc7439b1406 by Antoine Pitrou in branch 'default': |
Ok, I've committed the patch for the lazy opening approach. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: