
ProxyCommand objects need the socket-like-obj _closed fix too #789

Closed
bitprophet opened this issue Jul 30, 2016 · 10 comments
Comments

@bitprophet (Member) commented Jul 30, 2016

See #774 (comment); also related to #520.

@bitprophet (Member Author) commented Jul 30, 2016

Interestingly, ProxyCommand doesn't appear to expose any 'closed' attributes whatsoever - no closed, no _closed, etc.

It only implements ClosingContextManager which calls self.close, but no state is ever tracked for whether it's closed or not.

Since it wraps a subprocess.Popen, I think the "most correct" analogue to "is the socket closed" would be checking whether process.returncode is not None.
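That suggestion can be sketched as follows (illustrative only; `ProxyCommandSketch` is a hypothetical name and the `cat` child is my stand-in, not Paramiko's actual code):

```python
from subprocess import Popen, PIPE

class ProxyCommandSketch:
    """Hypothetical sketch: expose socket-like "closed" flags over a Popen."""

    def __init__(self, cmd):
        self.process = Popen(cmd, stdin=PIPE, stdout=PIPE, bufsize=0)

    @property
    def closed(self):
        # Popen.returncode stays None until the exit status has been
        # collected (via poll()/wait()), so a non-None value is the
        # closest analogue to "the socket is closed".
        self.process.poll()  # refresh returncode if the child has exited
        return self.process.returncode is not None

    # Python 3 socket objects also expose `_closed`; alias it.
    _closed = closed
```

Usage: pass an instance wherever a socket-like object is expected; `closed` flips to True once the child exits and its status has been reaped.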

bitprophet added a commit that referenced this issue Jul 30, 2016
@bitprophet (Member Author) commented Jul 30, 2016

@nvgoldin If possible, please try cherry-picking or just applying 228ed87 to your local Paramiko. It's pretty basic so I think it'll work, but I haven't actually tested it myself yet. Hope to later.

@nvgoldin commented Jul 30, 2016

The exception is gone, thanks!!
But there is another problem which I'm not sure is directly related to this (tell me if I should open another issue). It seems the proxy command process isn't killed properly and leaves zombie processes. Running the following for several loops:

import paramiko
import time
import logging
import os
ssh_host='localhost'
proxy_cmd='ssh -o StrictHostKeyChecking=no -W localhost:22 localhost'
logging.basicConfig(level=logging.DEBUG)
while True:
    logging.debug('running PID %s', os.getpid())
    ssh = paramiko.SSHClient()
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    ssh_proxy = paramiko.ProxyCommand(proxy_cmd)
    ssh.connect(sock=ssh_proxy, hostname=ssh_host)
    sftp = ssh.open_sftp()
    sftp.close()
    ssh.close()
    ssh_proxy.close()
    logging.debug('going to sleep')
    time.sleep(3)

Results in (20125 is the program's PID):

>ps  xao pid,ppid,pgid,sid,comm | grep 20125

20126 20125 20125 15530 ssh <defunct>
20169 20125 20125 15530 ssh <defunct>
20228 20125 20125 15530 ssh <defunct>
20271 20125 20125 15530 ssh <defunct>
20330 20125 20125 15530 ssh <defunct>
20373 20125 20125 15530 ssh <defunct>
20416 20125 20125 15530 ssh <defunct>
20459 20125 20125 15530 ssh <defunct>
20502 20125 20125 15530 ssh <defunct>
20545 20125 20125 15530 ssh <defunct>
20588 20125 20125 15530 ssh <defunct>

The logs after authentication look like this:

DEBUG:paramiko.transport:[chan 0] Max packet in: 32768 bytes
DEBUG:paramiko.transport:[chan 0] Max packet out: 32768 bytes
DEBUG:paramiko.transport:Secsh channel 0 opened.
DEBUG:paramiko.transport:[chan 0] Sesch channel 0 request ok
INFO:paramiko.transport.sftp:[chan 0] Opened sftp connection (server version 3)
INFO:paramiko.transport.sftp:[chan 0] sftp session closed.
DEBUG:paramiko.transport:[chan 0] EOF sent (0)
DEBUG:root:going to sleep
DEBUG:paramiko.transport:EOF in transport thread

I tried changing ProxyCommand's close method from os.kill to Popen's kill, with no success (from the docs, Popen.kill looks like the recommended approach, though that's not directly related to this).

Maybe I should use a different order of closing (i.e. sftp/ssh/proxy)?
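On ordering: teardown generally wants to go innermost-first (sftp, then ssh, then proxy). A standalone sketch of that pattern using contextlib.ExitStack, with dummy objects standing in for the real paramiko ones:

```python
from contextlib import ExitStack

class FakeResource:
    """Stand-in for the sftp/ssh/proxy objects; records close order."""
    closed_order = []

    def __init__(self, name):
        self.name = name

    def close(self):
        FakeResource.closed_order.append(self.name)

# ExitStack runs its callbacks LIFO, so the last-registered (innermost)
# resource is closed first: sftp, then ssh, then proxy.
with ExitStack() as stack:
    proxy = FakeResource("proxy")
    stack.callback(proxy.close)
    ssh = FakeResource("ssh")
    stack.callback(ssh.close)
    sftp = FakeResource("sftp")
    stack.callback(sftp.close)

print(FakeResource.closed_order)  # ['sftp', 'ssh', 'proxy']
```

The same LIFO guarantee is what nested `with` blocks give you for free, which is one reason ProxyCommand implements ClosingContextManager.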

@nvgoldin commented Jul 30, 2016

Update: changing proxy.py's close method to:

    def close(self):
        self.process.kill()
        self.process.poll()

resolves the issue (no zombie process leftovers), though I'm not sure whether it has any side effects.
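For context on why the poll() matters: on POSIX, a terminated child lingers as a `<defunct>` (zombie) entry until the parent reaps its exit status, and poll()/wait() is what performs that reap. A minimal standalone demonstration (assumes a POSIX system with a `sleep` binary; this is my illustration, not Paramiko code):

```python
import signal
import subprocess

# Start a long-running child, then terminate it the way
# ProxyCommand.close() does, via SIGTERM.
child = subprocess.Popen(["sleep", "60"])
child.send_signal(signal.SIGTERM)

# Without this wait()/poll(), the dead child would remain a zombie
# (<defunct> in ps output) until the parent itself exits; reaping
# collects the exit status and removes the process-table entry.
child.wait(timeout=5)
print(child.returncode)  # -15 on POSIX, i.e. -signal.SIGTERM
```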

@nvgoldin commented Aug 18, 2016

Ping. I wonder if we can get this going. I've been testing it for the past few weeks and the above fix seems to be working (no zombie processes).
Want me to create a new PR? (Based on 228ed87, adding the process.poll() call to update the exit status.)

@bitprophet (Member Author) commented Sep 8, 2016

Thanks for #811, I'll try verifying it on my end when I get to the next bugfix release. (May need to bump it to a feature release since we're manipulating the public API, but either way it'll get looked at.)

@bitprophet (Member Author) commented Dec 6, 2016

Starting to wonder if I should investigate using Invoke's Runner for this stuff, sigh (it, too, has to handle all sorts of subprocess shutdown and the like). Not worth it in the short term, though; it's not explicitly designed for wholly non-interactive byte-forwarding (even though that SHOULD work fine, and I very much want it to if it does not).

Poking #811 now...

@bitprophet (Member Author) commented Dec 6, 2016

I can:

  • confirm the before/after behavior re: my old branch about this, which adds closed to ProxyCommand
  • NOT confirm any zombie processes as described, though my methodology is a bit different (using unpublished Fabric 2 code, and netcat as the gateway instead of ssh)

Both the "normal" style (no explicit close) and something closer to your style of execution (close on the connection object in each loop iteration) work fine, with no zombies. FWIW I'm doing this on OS X 10.11.

I also double-checked, and the implementation of ProxyCommand.close has been this way since 2012, so it's unlikely to be badly broken or I'd have expected tickets about zombies before now. Makes me wonder if some extra factor is at work in your case?

Either way I don't think it should block the basic attribute-error-fixing commit from merging so I'm gonna do that and we can spin this discussion into a new ticket if it's still affecting you. Let me know. Thanks!

@bitprophet bitprophet closed this Dec 6, 2016
bitprophet added a commit that referenced this issue Dec 6, 2016
@nvgoldin commented Dec 14, 2016

@bitprophet - thanks for looking into this. I'll open a new issue if this happens again.

dkhapun pushed a commit to cyberx-labs/paramiko that referenced this issue Jun 7, 2018
@ciscorat commented Mar 6, 2020

Hello,

I'm running into the same problem using an SSH tunnel (netcat on the jumphost). I get "Error reading SSH protocol banner", but only when the device is down.
This is the Netmiko code that waits for the device to come up:

upstate = False
while not upstate:
    sys.tracebacklimit = None
    try:
        net = ConnectHandler(**lab)
    except Exception:
        continue
    else:
        upstate = True
print("Switch is up")

I applied the fix from #798 in paramiko/proxy.py but I'm still getting this error. Any clue what I should do next?

from datetime import datetime
import os
from shlex import split as shlsplit
import signal
from subprocess import Popen, PIPE
from select import select
import socket
import time

from paramiko.ssh_exception import ProxyCommandFailure
from paramiko.util import ClosingContextManager

class ProxyCommand(ClosingContextManager):
    """
    Wraps a subprocess running ProxyCommand-driven programs.

    This class implements the socket-like interface needed by the
    `.Transport` and `.Packetizer` classes. Using this class instead of a
    regular socket makes it possible to talk with a Popen'd command that
    will proxy traffic between the client and a server hosted in another
    machine.

    Instances of this class may be used as context managers.
    """

    def __init__(self, command_line):
        """
        Create a new CommandProxy instance. The instance created by this
        class can be passed as an argument to the `.Transport` class.

        :param str command_line:
            the command that should be executed and used as the proxy.
        """
        self.cmd = shlsplit(command_line)
        self.process = Popen(self.cmd, stdin=PIPE, stdout=PIPE, stderr=PIPE,
                             bufsize=0)
        self.timeout = None

    def send(self, content):
        """
        Write the content received from the SSH client to the standard
        input of the forked command.

        :param str content: string to be sent to the forked command
        """
        try:
            self.process.stdin.write(content)
        except IOError as e:
            # There was a problem with the child process. It probably
            # died and we can't proceed. The best option here is to
            # raise an exception informing the user that the informed
            # ProxyCommand is not working.
            raise ProxyCommandFailure(' '.join(self.cmd), e.strerror)
        return len(content)

    def recv(self, size):
        """
        Read from the standard output of the forked program.

        :param int size: how many chars should be read

        :return: the string of bytes read, which may be shorter than requested
        """
        try:
            buffer = b''
            start = time.time()
            while len(buffer) < size:
                select_timeout = None
                if self.timeout is not None:
                    elapsed = (time.time() - start)
                    if elapsed >= self.timeout:
                        raise socket.timeout()
                    select_timeout = self.timeout - elapsed
                r, w, x = select(
                    [self.process.stdout], [], [], select_timeout)
                if r and r[0] == self.process.stdout:
                    buffer += os.read(
                        self.process.stdout.fileno(), size - len(buffer))
            return buffer
        except socket.timeout:
            if buffer:
                # Don't raise socket.timeout, return partial result instead
                return buffer
            raise  # socket.timeout is a subclass of IOError
        except IOError as e:
            raise ProxyCommandFailure(' '.join(self.cmd), e.strerror)

    def close(self):
        os.kill(self.process.pid, signal.SIGTERM)

    @property
    def closed(self):
        return self.process.returncode is not None

    @property
    def _closed(self):
        # Concession to Python 3 socket-like API
        print("bla")
        return self.closed

    def settimeout(self, timeout):
        self.timeout = timeout