select.get_fileno passed IOClosed object #154
Maybe it shouldn't pass `on_error` as the listener's throwback; how about just a `lambda _: None`? Similarly, the "max recursion depth exceeded" in #137 is triggered because an exception is raised when `mark_as_closed` is called in `listener.defang`: it was passed as `lambda x: None`, however it is invoked as a method taking zero arguments.
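A minimal sketch of the arity mismatch described above (the `defang` name mirrors the comment, but the shape of the call site is an assumption, not eventlet's exact source):

```python
# A callback registered as a one-argument lambda, invoked with zero
# arguments, raises TypeError -- the same class of failure as in #137.

def defang(mark_as_closed):
    # The caller invokes the close callback with no arguments.
    mark_as_closed()

# One-argument lambda where a zero-argument callable is expected: fails.
try:
    defang(lambda x: None)
except TypeError as e:
    print("TypeError:", e)

# A zero-argument lambda matches the call site and succeeds.
defang(lambda: None)
```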
Just copying Jakub's comment here for ease of future debugging:
da87716#diff-0195c9ec76921f751bef650c49d902b1R257
Hi, we have been using QPID as the RPC driver in our OpenStack environment, and we hit this eventlet error, just as described in this bug's description. After a very time-consuming job in the openstack-nova-compute service, an RPC timeout is reported, and this triggers the eventlet TypeError: first "TypeError: <lambda>() takes exactly 1 argument (0 given)" and then "TypeError: cannot concatenate 'str' and 'type' objects" (the error log is pasted at the end). This error takes the OpenStack compute service down, and it never comes back up unless restarted. Is there any fix or workaround for this issue? Thanks in advance. We are using the OpenStack Juno release, which lists the eventlet version requirement as eventlet>=0.15.1,<=0.15.2, so the version we use in our environment is eventlet 0.15.2. openstack nova compute.log:
The issue described in my last comment has been fixed by changing the following line:

`listeners.append(hub.add(hub.READ, k, on_read, on_error, lambda x: None))`

I only changed the last two arguments to lambdas, and this fixed our issue. Here is some detailed analysis of the argument errors.

As for the first exception: since the lambda function does nothing but return None, it doesn't depend on any arguments, so I just removed the lambda's argument. This fixes the `mark_as_closed` TypeError listed in my last comment.

As for the second exception, "TypeError: Expected int or long, got IOClosed object": from the code, we can see the `on_error` function expects an fd as its argument, but the `close_one` function in hub.py calls `on_error` with an IOClosed object via:

`listener.tb(IOClosed(errno.ENOTCONN, "Operation on closed file"))`

This fixed our issue: QPID reconnects successfully, and our openstack-nova-compute service resumes to up status automatically, so the problem is gone.
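A hedged sketch of the second mismatch: `close_one` hands the error callback an `IOClosed` exception instance, but downstream code treats that value as a file descriptor, which is what produces "Expected int or long, got IOClosed object". The `IOClosed` class here is a stand-in, not eventlet's actual definition:

```python
import errno
import select


class IOClosed(IOError):
    """Stand-in for eventlet's IOClosed exception."""


# What close_one() passes to listener.tb / on_error:
arg = IOClosed(errno.ENOTCONN, "Operation on closed file")

# Feeding that object where a file descriptor belongs fails, because
# select only accepts ints or objects with a fileno() method.
try:
    select.select([arg], [], [], 0)
except TypeError as e:
    print("select rejected IOClosed:", e)
```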
eventlet#154 This is a workaround and not a complete solution. In eventlet#154 (comment) it is mentioned that this fixes the issue.
It's a shame to admit this - my Smarkets colleagues (pinging @u-quark) and I managed to produce a simple test case for this some time ago and came up with a way to make this code behave less confusingly, but we haven't had time to contribute the change properly yet. For the record: this failure is most likely the result of a programmer error (an fd reused while a select using it is in progress, if I remember correctly), so this is still going to result in an error, but one that actually says something useful and points to the source of the issue.
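The failure mode described above (an fd invalidated while select wants to use it) can be reproduced outside eventlet. This is an illustrative sketch, not the test case mentioned in the comment:

```python
import select
import socket

# Create a connected socket pair, then close one end before selecting
# on it -- simulating an fd that was closed/reused behind select's back.
s1, s2 = socket.socketpair()
s1.close()

try:
    # On Python 3 the closed socket's fileno() is -1, so select rejects it
    # with a clear error instead of silently misbehaving.
    select.select([s1], [], [], 0)
except (ValueError, OSError) as e:
    print("select rejected the closed socket:", e)

s2.close()
```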
I've tried @jstasiak's code fix; it works for me.
@hardys @yittg @dongyanyang @jstasiak @dalinhuang the relevant patch was merged into master. It is expected that this issue should be fixed as well. Please try it and write back how it goes.
Sorry to leave this open so long. Probably fixed, but there is no direct confirmation from the OP.
Since the fix for #137, the infinite recursion is resolved, but we now see the issue below.
This happens when an RPC timeout occurs on OpenStack, such that the select times out; the error then occurs when you reconnect the network to resolve the RPC interruption.