Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changed channel.transport to a weakref to fix reference cycle #367

Closed
wants to merge 1 commit into from
Closed

Changed channel.transport to a weakref to fix reference cycle #367

wants to merge 1 commit into from

Conversation

alexsacr
Copy link

We're using a number of Paramiko-based SSH proxies and all of them were leaking multiple gigabytes of RAM per day. After some investigation, I noticed Channel objects were not being collected properly to a reference cycle:

image

I pared down our server code considerably and sent a couple hundred of requests through the app with a stock 1.13 package to demonstrate the leak:

http://snapfiber.com/ssh-server_paramiko-1-13-unpatched.svg

After applying this change, the leak no longer exists:

http://snapfiber.com/ssh-server_paramiko-1-13-patched.svg

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.17%) when pulling 3e43eec on alex-sf:fix_reference_cycle into c0667e1 on paramiko:1.11.

@alexsacr
Copy link
Author

Note to others that come across this: If you can't/won't apply this, calling:

del channel.transport

After closing down your channel removes the reference and allows channels to be collected.

@lndbrg
Copy link
Contributor

lndbrg commented Aug 20, 2014

Hmm, I think this bug should be solved in another way. According to the RFC when you want to close a channel you SHOULD send an EOF, followed by a MUST send of a CHANNEL_CLOSED. The recipient of a CHANNEL_CLOSED must respond with a CHANNEL_CLOSED. In our code the flow should be:

channel.close() -> send EOF and CHANNEL_CLOSED

Wait for the channel to receive a CHANNEL_CLOSED -> call channel._handle_close() where we should unlink the transport.

It seems that we never get to the _handle_close() method. I don't know why, but the correct fix should be triggering the unlink from there.

@alex-sf what do you think? Can you see if you can find out why this isn't called with your test system?

@lndbrg
Copy link
Contributor

lndbrg commented Sep 30, 2014

@alex-sf sorry to ping you again, but i would prefer to solve it in another way if possible, did you have a chance to look at what i wrote? :)

@mdellavo
Copy link

@rectalogic
Copy link
Contributor

I think the issue is that Channel implements __del__ and so will not be GC'd if it is involved in any cycles. In our sftp server, we do get CHANNEL_CLOSED and the Channel does unlink itself from the Transport but we still get leaks because Transport.server_accepts array contains the Channel which references the Transport - so another cycle, and it is uncollectable due to the __del__.
https://docs.python.org/2/library/gc.html#gc.garbage

Here is a test sftp server:

import logging
import socket
import threading
import gc
import sys
from StringIO import StringIO
import paramiko

log = logging.getLogger("sftp_server")

HOST, PORT = 'localhost', 2222
HOSTKEY = """
-----BEGIN EC PRIVATE KEY-----
MHcCAQEEINnMNKzog0Cbnun+EKuFu4kcpc0iKJALw5LkC1E1x+9EoAoGCCqGSM49
AwEHoUQDQgAEt11fabDR77CPL/1OUfayQ/m3C1uRI/039pBmYagk4zQ4Mxwow6L6
UkKHL9pHHO3aWdUulY9OU88mT7O8Lg5hyA==
-----END EC PRIVATE KEY-----
"""


class AllAccessServer(paramiko.ServerInterface):
    def check_auth_none(self, username):
        return paramiko.AUTH_SUCCESSFUL

    def check_auth_password(self, username, password):
        return paramiko.AUTH_SUCCESSFUL

    def check_auth_publickey(self, username, key):
        return paramiko.AUTH_SUCCESSFUL

    def check_channel_request(self, kind, chanid):
        return paramiko.OPEN_SUCCEEDED


def start_server(host, port):
    hostkey = paramiko.ECDSAKey.from_private_key(StringIO(HOSTKEY))
    server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, True)
    server_socket.bind((host, port))
    server_socket.listen(10)

    while True:
        conn, address = server_socket.accept()

        transport = paramiko.Transport(conn)
        transport.load_server_moduli()
        transport.add_server_key(hostkey)
        transport.set_subsystem_handler("sftp", paramiko.SFTPServer, paramiko.SFTPServerInterface)
        transport.start_server(server=AllAccessServer(), event=threading.Event())


# Connect with:
# $ echo quit | sftp -oPort=2222 localhost
#
if __name__ == '__main__':
    logging.basicConfig(level=logging.INFO)
    try:
        start_server(HOST, PORT)
    except KeyboardInterrupt:
        gc.collect()
        for g in gc.garbage:
            log.info("garbage: %r", g)
            if isinstance(g, paramiko.Channel):
                log.info("       g.transport.server_accepts: %r", g.transport.server_accepts)
                log.info("       g.transport._channels.values(): %r", g.transport._channels.values())
        sys.exit(0)

Connect and cleanly disconnect to this a number of times via
echo quit | sftp -oPort=2222 localhost

Then kill the server with Ctrl-C and it will dump uncollectable garbage.

After connecting/disconnecting a few times we have some uncollectable Transport/Channel pairs:

$ env/bin/python sftp_server.py
INFO:paramiko.transport:Connected (version 2.0, client OpenSSH_7.1)
INFO:paramiko.transport:Auth granted (none).
INFO:paramiko.transport:Disconnect (code 11): disconnected by user
INFO:paramiko.transport:Connected (version 2.0, client OpenSSH_7.1)
INFO:paramiko.transport:Auth granted (none).
INFO:paramiko.transport:Disconnect (code 11): disconnected by user
INFO:paramiko.transport:Connected (version 2.0, client OpenSSH_7.1)
INFO:paramiko.transport:Auth granted (none).
INFO:paramiko.transport:Disconnect (code 11): disconnected by user
INFO:paramiko.transport:Connected (version 2.0, client OpenSSH_7.1)
INFO:paramiko.transport:Auth granted (none).
INFO:paramiko.transport:Disconnect (code 11): disconnected by user
^C
INFO:sftp_server:garbage: <paramiko.Channel 0 (closed) -> <paramiko.Transport at 0x33c6fd0L (unconnected)>>
INFO:sftp_server:       g.transport.server_accepts: [<paramiko.Channel 0 (closed) -> <paramiko.Transport at 0x33c6fd0L (unconnected)>>]
INFO:sftp_server:       g.transport._channels.values(): []
INFO:sftp_server:garbage: <paramiko.Channel 0 (closed) -> <paramiko.Transport at 0x33f2c10L (unconnected)>>
INFO:sftp_server:       g.transport.server_accepts: [<paramiko.Channel 0 (closed) -> <paramiko.Transport at 0x33f2c10L (unconnected)>>]
INFO:sftp_server:       g.transport._channels.values(): []
INFO:sftp_server:garbage: <paramiko.Channel 0 (closed) -> <paramiko.Transport at 0x33f2ed0L (unconnected)>>
INFO:sftp_server:       g.transport.server_accepts: [<paramiko.Channel 0 (closed) -> <paramiko.Transport at 0x33f2ed0L (unconnected)>>]
INFO:sftp_server:       g.transport._channels.values(): []

All of the Transport ChannelMaps are empty (transport._channels.values()) because the channels unlinked themselves in _handle_close(). But each Transport.server_accepts references its owning Transport.

Calling Transport.accept() would clear the server_accepts list - but it's not obvious where this should be done when passing an Event to Transport.start_server() to make it async.

@bitprophet
Copy link
Member

This and #440 / #441 sound like they describe the same problem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants