Alternative to ssh's ServerAliveInterval and ServerAliveCountMax client options #918
Comments
Btw., I use Fedora's
Thanks for the report! Not a feature I use myself (& not the original author) so I haven't really touched it; glancing at it, [...] The actual keeping-alive timer functionality also looks implemented: there's a timestamp that gets updated on writes, or on reads when a timeout occurs (in the same spot that triggers the message send, when the time since last transmission exceeds the interval, as one would expect). There's a chance for bugs in there, of course. But I think the real problem is...there's no equivalent of [...] Been a while since I dealt with "remote server went away" issues so I don't remember our normal behavior in that case, but assuming it wouldn't trigger socket/etc errors in your situation, it certainly seems the current setup would run forever, and implementing the equiv [...] Afraid I don't have time to deep dive into it now but I'd certainly entertain a PR.
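To make the missing piece concrete, here is a minimal sketch of the counting logic that ssh's ServerAliveInterval and ServerAliveCountMax describe (going by the issue title, that is presumably the equivalent being discussed). It is pure Python and does not touch paramiko at all; `KeepaliveMonitor`, its methods, and the injected `clock` are hypothetical names made up for illustration, not part of any library.

```python
import time


class KeepaliveMonitor:
    """Hypothetical model of ServerAliveInterval / ServerAliveCountMax."""

    def __init__(self, interval, count_max, clock=time.monotonic):
        self.interval = interval      # seconds of silence before probing
        self.count_max = count_max    # unanswered probes tolerated
        self.clock = clock            # injectable so the logic is testable
        self.last_event = clock()     # last server traffic or last probe sent
        self.unanswered = 0

    def on_traffic(self):
        """Call whenever any data arrives from the server."""
        self.last_event = self.clock()
        self.unanswered = 0

    def poll(self):
        """Return 'idle', 'probe' (send a keepalive now) or 'dead'."""
        if self.clock() - self.last_event < self.interval:
            return "idle"
        if self.unanswered >= self.count_max:
            # Every allowed probe went unanswered: give up on the connection.
            return "dead"
        self.unanswered += 1
        self.last_event = self.clock()
        return "probe"
```

The caller would send a keepalive on `"probe"` and close the transport on `"dead"`. With `interval=5` and `count_max=2`, a silent server is declared dead after roughly 15 seconds, whereas an interval-only timer has no point at which it gives up.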
I have seen issues with the existing keepalive logic not being able to detect some duff connections. Never found the root cause, but what I was able to see was that the client keepalive messages were reported as succeeding, but only buffered in the socket Send-Q monitored via netstat. In order to track the server replies, the sent ssh "global-request" message needs to set the [...] I implemented a makeshift solution that never got enough polish on it to submit back as a PR, but did seem to properly detect the issues that I was encountering. Lingering doubt as to the proper handling of the [...] See code here and if @bitprophet can provide guidance, I can work on converting that into a PR if it looks good.
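For reference, the reply tracking mentioned above presumably hinges on the "want reply" boolean that RFC 4254 defines for SSH_MSG_GLOBAL_REQUEST. The sketch below only encodes the message payload with the standard library, to show where that flag sits; `encode_global_request` is a hypothetical helper, not a paramiko function, and the request name is just an example.

```python
import struct

SSH_MSG_GLOBAL_REQUEST = 80  # message number assigned in RFC 4254


def encode_global_request(name, want_reply):
    """Build the payload of a global request (before SSH packet framing)."""
    raw = name.encode("ascii")
    return (
        bytes([SSH_MSG_GLOBAL_REQUEST])
        + struct.pack(">I", len(raw))       # SSH "string": 4-byte length prefix
        + raw
        + bytes([1 if want_reply else 0])   # the "want reply" boolean
    )


# Example request name; with want_reply=True the peer must answer.
probe = encode_global_request("keepalive@openssh.com", want_reply=True)
```

With the flag set, the server has to answer with either a success or a failure message, even for a request name it does not recognize, so any answer shows the peer is still there. With the flag cleared, a successful local write says nothing about whether the bytes ever left the Send-Q.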
This allows us to detect SSH connection issues earlier instead of hanging forever. Paramiko has issues with keep-alive packets and timeouts: paramiko/paramiko#918 So now, reschedule the build in case of SSH issues. Also, depend heavily on the openssh client configuration.
Currently, the documentation states that keepalive maps to ClientAliveInterval, which is a server side setting, unlike what the name indicates. The client side setting is called ServerAliveInterval. You can see references to this in the below two discussions: paramiko/paramiko#918 https://unix.stackexchange.com/a/3027/6475
I'm not sure whether this is implemented or not. Based on the code, it looks like it could work, but it does not. When I do conn.get_transport().set_keepalive(5) and conn.exec_command('...', timeout=10), my code hangs indefinitely even if I kill the remote ssh server. Can this be fixed/implemented?