Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

threading bugfix for socket closure #156

Closed
wants to merge 2 commits into from
Closed

threading bugfix for socket closure #156

wants to merge 2 commits into from

Conversation

@tycho
Copy link
Contributor

@tycho tycho commented Apr 11, 2013

Submitting a bugfix on behalf of a coworker on my team.

Make sure the Thread.run() method has terminated before closing the
socket. Currently, the socket is closed through Packetizer.close(),
which happens too early. Move the socket.close() into Transport.close()
and after the Thread.join() call.

While at it, modify the stop_thread() method and use it in
Transport.close() to avoid code duplication. Use join() with a timeout
to make it possible to terminate the main thread with KeyboardInterrupt.
Also, remove the now obsolete socket.close() from Transport.atfork().

This fixes a potential infinite loop if paramiko.SSHClient is connected
through a paramiko.Channel instead of a regular socket (tunneling).

Details:

Using a debug patch to dump the current stack of the thread every
couple of seconds while trying to close it, I've seen the following
over and over again:

Thread could not be stopped, still running.
Current traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 524, in __bootstrap
    self.__bootstrap_inner()
  File "/usr/lib/python2.7/threading.py", line 551, in __bootstrap_inner
    self.run()
  File ".../paramiko/transport.py", line 1564, in run
    self._channel_handler_table[ptype](chan, m)
  File ".../paramiko/channel.py", line 1102, in _handle_close
    self.transport._send_user_message(m)
  File ".../paramiko/transport.py", line 1418, in _send_user_message
    self._send_message(data)
  File ".../paramiko/transport.py", line 1398, in _send_message
    self.packetizer.send_message(data)
  File ".../paramiko/packet.py", line 319, in send_message
    self.write_all(out)
  File ".../paramiko/packet.py", line 248, in write_all
    n = self.__socket.send(out)
  File ".../paramiko/channel.py", line 732, in send
    self.lock.release()

The thread was running Packetizer.write_all() in an endless loop:

while len(out) > 0:
    ...
    n = Channel.send(out) # n == 0 because channel got closed
    ...
    out = out[n:]         # essentially out = out

Signed-off-by: Frank Arnold <farnold@amazon.com>
Make sure the Thread.run() method has terminated before closing the
socket. Currently, the socket is closed through Packetizer.close(),
which happens too early. Move the socket.close() into Transport.close()
and after the Thread.join() call.

While at it, modify the stop_thread() method and use it in
Transport.close() to avoid code duplication. Use join() with a timeout
to make it possible to terminate the main thread with KeyboardInterrupt.
Also, remove the now obsolete socket.close() from Transport.atfork().

This fixes a potential infinite loop if paramiko.SSHClient is connected
through a paramiko.Channel instead of a regular socket (tunneling).

Details:

Using a debug patch to dump the current stack of the thread every
couple of seconds while trying to close it, I've seen the following
over and over again:

Thread could not be stopped, still running.
Current traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 524, in __bootstrap
    self.__bootstrap_inner()
  File "/usr/lib/python2.7/threading.py", line 551, in __bootstrap_inner
    self.run()
  File ".../paramiko/transport.py", line 1564, in run
    self._channel_handler_table[ptype](chan, m)
  File ".../paramiko/channel.py", line 1102, in _handle_close
    self.transport._send_user_message(m)
  File ".../paramiko/transport.py", line 1418, in _send_user_message
    self._send_message(data)
  File ".../paramiko/transport.py", line 1398, in _send_message
    self.packetizer.send_message(data)
  File ".../paramiko/packet.py", line 319, in send_message
    self.write_all(out)
  File ".../paramiko/packet.py", line 248, in write_all
    n = self.__socket.send(out)
  File ".../paramiko/channel.py", line 732, in send
    self.lock.release()

The thread was running Packetizer.write_all() in an endless loop:

while len(out) > 0:
    ...
    n = Channel.send(out) # n == 0 because channel got closed
    ...
    out = out[n:]         # essentially out = out

Signed-off-by: Frank Arnold <farnold@amazon.com>
@tycho

This comment has been minimized.

Copy link
Owner

@tycho tycho commented on paramiko/transport.py in 068bf63 Apr 11, 2013

There seems to be a documentation bug for Python. The "is_alive" here needs to be changed to "isAlive" for Python 2.5 compatibility.

…ive()

Python's documentation has a bug[1], in that it doesn't correctly annotate
is_alive as being a function introduced in Python 2.6.

[1] http://bugs.python.org/issue15126

Signed-off-by: Steven Noonan <snoonan@amazon.com>
@bitprophet
Copy link
Member

@bitprophet bitprophet commented Apr 28, 2013

Thanks to you & Frank for this! I acknowledge the bug & fix, though I'd love it if you could provide more details on how to reproduce. The the tunnel-friendly sock option is recent, but I don't recall seeing this issue when I was testing out the new code, nor have I run into it since (though I personally do not use the feature in my daily work).

Also, minor, but please add a changelog entry in NEWS under the 1.11 section (or whichever is latest when you get to it) & crediting whoever wants credit (it can be both of course).

@frank-arnold
Copy link

@frank-arnold frank-arnold commented May 6, 2013

With the following script and paramiko.git/master I'm able to reproduce the
issue once in a while. It's a race condition though, so it doesn't trigger every
time.

The underlying problem is, that the send() method of regular socket objects
will raise an exception. This exception cancels the loop in Packetizer.write_all().
The send() method of paramiko.Channel objects just returns 0 if the channel
got closed. This can lead to the race condition resulting in an infinite loop.

#!/usr/bin/env python
import sys
from optparse import OptionParser

sys.path.insert(0, "./paramiko")
import paramiko

def get_option_parser():
    parser = OptionParser(
        usage="%prog --target <target ip> --gateway "
            "<gateway ip> --user <username> --keyfile <key file> <command>",
        description="Reproducer for possible Paramiko infinite loop while "
            "using tunneling (Paramiko pull request #156). Execute a command "
            "on a target host via a tunneled SSH connection through gateway.")
    parser.add_option("--target", action="store", dest="target", default=None,
        help="Name or IP of the target host")
    parser.add_option("--gateway", action="store", dest="gateway", default=None,
        help="Name or IP of the gateway")
    parser.add_option("--user", action="store", dest="username", default=None,
        help="The username to be used for the target host and gateway")
    parser.add_option("--keyfile", action="store", dest="keyfile", default=None,
        help="The keyfile to be used for the target host and gateway")

    return parser

def main():
    parser = get_option_parser()
    (options, args) = parser.parse_args()

    command = " ".join(args)

    failed = False
    if not command:
        sys.stderr.write("Need command to execute on the target.\n")
        failed = True
    if not options.target:
        sys.stderr.write("No target IP given.\n")
        failed = True
    if not options.gateway:
        sys.stderr.write("No gateway IP given.\n")
        failed = True
    if not options.username:
        sys.stderr.write("No username given.\n")
        failed = True
    if not options.keyfile:
        sys.stderr.write("No keyfile given.\n")
        failed = True
    if failed:
        parser.print_usage()
        sys.exit(2)

    # Gateway setup
    gateway_client = paramiko.SSHClient()
    gateway_client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    gateway_client.connect(
        options.gateway,
        username=options.username,
        key_filename=options.keyfile,
        allow_agent=True,
        look_for_keys=False
    )
    transport = gateway_client.get_transport()
    transport.set_keepalive(30)
    channel = transport.open_channel(
        "direct-tcpip",
        (options.target, 22),
        ("dummy", 31337)
    )

    # Target setup
    target_client = paramiko.SSHClient()
    target_client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    target_client.connect(
        options.target,
        username=options.username,
        key_filename=options.keyfile,
        allow_agent=True,
        look_for_keys=False,
        sock=channel
    )

    # Execute command on target
    stdout = target_client.exec_command(command)[1]
    output = "".join(stdout.readlines())

    # Close target and gateway
    target_client.close()
    gateway_client.close()

    # Print stdout from target
    sys.stdout.write(output)

if __name__ == "__main__":
    main()
@bitprophet
Copy link
Member

@bitprophet bitprophet commented Sep 27, 2013

@frank-arnold Thanks - I can replicate with that (happened on my 2nd try, even.) Using this snippet I can reliably get it to appear & tell me how many runs it took:

$ while true; do gtimeout 5 python test.py --target <host> --gateway <otherhost> --user <name> --keyfile ~
/.ssh/id_rsa uname || break; done | wc -l

Prints out 20 or 21 usually, so getting it on my 2nd try was some luck.

@bitprophet
Copy link
Member

@bitprophet bitprophet commented Sep 28, 2013

Sweet deal, when I switch to your branch that script ran for literally an hour without timing out. I'd say that indicates a fix! Merged into the 1.11 branch (figure that's the best target since that's what you would've based off of) and will publish tonight.

dkhapun pushed a commit to cyberx-labs/paramiko that referenced this pull request Jun 7, 2018
If the channel is closed the send method returs a response length
of 0. This is not handled correctly by the packetizer and puts it
in an infinite loop. (Fixes paramiko#156 for real :-)

We make sure we don't do more than 10 iteration on a 0 length respose,
but raise an EOFError.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

3 participants