ValueError: file descriptor cannot be a negative integer (-1) #63
Comments
Thank you for the report and the debug log, which helped with looking at this.

Having looked through the log, it appears that the remote Actor closed the socket at some point while there were still messages queued to send to that remote, and the send was attempted anyway. It's entirely possible for sockets to close, and the TCPTransport should accommodate those situations, but as you noted, it wasn't properly handling the ValueError.

I've uploaded a fix (a99fbf9) that should properly handle the ValueError. It's entirely possible that there's some other underlying issue that resulted in the socket disconnect; please check that the updated version is giving you the expected results and let me know if there's something else I can help with here.

[Note: apparently pushing a commit with the word "Fix" followed by a hash and a bug number causes GitHub to automatically close an issue. TIL. I've re-opened the issue to wait for your confirmation and further testing.]
With this commit we upgrade from Thespian 3.10.0 to 3.10.1 which includes a bugfix that improves stability when a remote actor has closed a socket which still had pending messages. Relates kquick/Thespian#63
Environment
uname -a: Darwin io 18.7.0 Darwin Kernel Version 18.7.0
TCPTransport (on a single machine)
Error
When running an actor system on a single machine we get the following stack trace after a while:
In the error log we also see:
I saw that the code in TCPTransport has handling for invalid file descriptors, but it expects OSError to be raised.

I can reproduce this in our application when an actor sends large log messages, so I wonder whether it is related. I also managed to capture a complete debug log output of Thespian (~13 MB expanded; the error is towards the end of the log file).
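The mismatch between the exception that is raised and the exception that is handled can be shown in isolation. The following is a minimal sketch, not Thespian's code and not the change made in a99fbf9: `poll_readable` is a hypothetical helper, and the sketch assumes CPython, where a closed socket reports `fileno() == -1` and `select.select` rejects that value with ValueError rather than OSError.

```python
import select
import socket


def poll_readable(socks, timeout=0.0):
    """Return (readable, dropped) for a list of sockets.

    Hypothetical helper for illustration only. Sockets that were
    closed locally are reported in `dropped` instead of crashing
    the poll.
    """
    try:
        readable, _, _ = select.select(socks, [], [], timeout)
        return readable, []
    except (OSError, ValueError):
        # A closed socket object has fileno() == -1, which select()
        # rejects with ValueError. OSError covers descriptors that
        # are invalid at the OS level (for example EBADF).
        live = [s for s in socks if s.fileno() >= 0]
        dropped = [s for s in socks if s.fileno() < 0]
        if not dropped:
            # Nothing we can filter out, so surface the original error.
            raise
        readable, _, _ = select.select(live, [], [], timeout)
        return readable, dropped


def closed_socket_error_type():
    """Show which exception select() raises for a closed socket."""
    a, b = socket.socketpair()
    a.close()
    try:
        select.select([a], [], [], 0)
    except (OSError, ValueError) as exc:
        return type(exc)
    finally:
        b.close()
    return None
```

A handler that catches only OSError would let the ValueError from the closed socket propagate, which matches the behaviour described in this report. Catching both and dropping the dead socket is one possible way to keep the polling loop alive.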
The error is fairly reproducible, but the reproduction scenario is a bit involved (it requires running an application that is somewhat complex to set up), so I'm not sure whether it is helpful to describe it here. I am of course happy to replicate the steps here, provide further details, and/or participate in testing, but I wanted to get the issue out here first.