SFTP client failures for large directories #73

Closed
bennoleslie opened this Issue Jan 10, 2017 · 3 comments

@bennoleslie

bennoleslie commented Jan 10, 2017

If a directory has a large number of files, the packet returned in response to FXP_READDIR can be too large for the client. Common clients (e.g. the OpenSSH sftp client) have a maximum packet size of 256 KiB.

The following code can be used to demonstrate the issue.

I'm testing against master (33761e9)

import asyncio
import stat
import sys

import asyncssh
from asyncssh import SFTPAttrs
from asyncssh.sftp import SFTPName

class MySFTPServer(asyncssh.SFTPServer):
    def __init__(self, conn):
        # Deliberately skip SFTPServer.__init__ so no chroot is applied
        self._chroot = None
        self._cwd = '/'

    def listdir(self, path):
        # Return far more entries than fit in a single 256 KiB reply
        return [SFTPName('{:05d}'.format(i).encode('ascii'), '',
                         SFTPAttrs(permissions=stat.S_IRUSR))
                for i in range(100000)]


async def start_server():
    await asyncssh.listen('', 8022, server_host_keys=['ssh_host_key'],
                          authorized_client_keys='ssh_user_ca',
                          sftp_factory=MySFTPServer)

loop = asyncio.get_event_loop()

try:
    loop.run_until_complete(start_server())
except (OSError, asyncssh.Error) as exc:
    sys.exit('Error starting server: ' + str(exc))

loop.run_forever()

Using the standard OpenSSH sftp client (in this case OpenSSH_6.9p1, LibreSSL 2.1.8, though from code inspection the same thing would happen with later versions as well).

The output is:

% echo "ls -la" | sftp -b -  -P 8022 localhost
sftp> ls -la
Received message too long 8200009

You can tune the size of the directory by changing '100000' in the script. If you dial it down far enough, the reply becomes small enough to fit under the client's 256 KiB limit.
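Some back-of-the-envelope arithmetic from the failure above: 8200009 bytes for 100000 entries works out to roughly 82 bytes per encoded name, so only a few thousand entries of this size can fit in one 256 KiB reply. A rough sketch (the per-entry figure is derived from the observed output here, not from the spec):

```python
# Estimate how many directory entries fit in one FXP_READDIR reply,
# based on the observed failure: 8200009 bytes for 100000 entries.
OBSERVED_REPLY_BYTES = 8200009
OBSERVED_ENTRIES = 100000
CLIENT_LIMIT = 256 * 1024              # OpenSSH client's max packet size

bytes_per_entry = OBSERVED_REPLY_BYTES / OBSERVED_ENTRIES  # ~82 bytes
max_entries = int(CLIENT_LIMIT // bytes_per_entry)

print(bytes_per_entry)   # 82.00009
print(max_entries)       # 3196 entries of this size fit under the limit
```

Any fixed batch size well below that (like the 128 chosen below) leaves generous headroom for longer filenames.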

The way to fix this is to have _process_readdir return smaller batches (the client keeps issuing FXP_READDIR requests until EOF is reached anyway). A prototype implementation is something like:

_MAXIMUM_NAMES_PER_READDIR = 128

@asyncio.coroutine
def _process_readdir(self, packet):
    """Process an incoming SFTP readdir request"""

    handle = packet.get_string()
    packet.check_end()

    names = self._dir_handles.get(handle)

    if names:
        # Return at most _MAXIMUM_NAMES_PER_READDIR entries per reply;
        # the client keeps sending FXP_READDIR until FX_EOF is raised
        batch = names[:_MAXIMUM_NAMES_PER_READDIR]
        del names[:_MAXIMUM_NAMES_PER_READDIR]
        return batch
    else:
        raise SFTPError(FX_EOF, '')
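The slicing logic above can be exercised standalone. This sketch (independent of asyncssh; names and helper are hypothetical) drains a 100000-entry listing the way repeated FXP_READDIR calls would, and checks that no batch exceeds the cap:

```python
_MAXIMUM_NAMES_PER_READDIR = 128

def read_batches(names):
    """Drain a directory listing in READDIR-sized batches, mimicking
    the repeated FXP_READDIR requests a client makes until EOF."""
    while names:
        batch = names[:_MAXIMUM_NAMES_PER_READDIR]
        del names[:_MAXIMUM_NAMES_PER_READDIR]
        yield batch

names = ['{:05d}'.format(i) for i in range(100000)]
batches = list(read_batches(names))

assert all(len(b) <= _MAXIMUM_NAMES_PER_READDIR for b in batches)
assert sum(len(b) for b in batches) == 100000
print(len(batches))   # 782 round trips for 100000 entries
```

Note the extra round trips are cheap relative to the cost of listing a 100000-entry directory in the first place.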

Now, a really good question is where 128 comes from, and why it is even approximately correct. The answer is unsatisfying; the latest SFTP draft specification (https://tools.ietf.org/html/draft-ietf-secsh-filexfer-13, which coincidentally expired 10 years ago tomorrow) has little to say about the issue:

The maximum size of a packet is in practice determined by the
client (the maximum size of read or write requests that it sends,
plus a few bytes of packet overhead). All servers SHOULD support
packets of at least 34000 bytes (where the packet size refers to
the full length, including the header above). This should allow
for reads and writes of at most 32768 bytes.

Unfortunately the first sentence is fine for read requests, but not for FXP_READDIR. Unlike the underlying kernel system call, FXP_READDIR doesn't allow the client to specify how many records it expects to receive back, so the server simply has to guess, which is of course ridiculous.

Given the broken spec, the best we can do is look at common usage. The OpenSSH server just hardcodes it to 100: https://github.com/openssh/openssh-portable/blob/master/sftp-server.c#L1072

The Go implementation picks 128 instead: https://github.com/pkg/sftp/blob/master/server.go#L449

Per the XXX comments in the OpenSSH implementation it would make more sense to track the size of the packet rather than the number of entries (since the entry size is variable), however that complicates the code somewhat, so picking a number of records seems easier.

And, even if you did something based on the packet size, what packet size could you even pick? There is no minimum packet size in the spec that indicates what a client must support, although probably 32KiB is reasonable given the above quote. In that case 128 gives you 256 bytes per record on average to play with, which is probably OK, but really, who knows.

It would probably be ideal to provide a way to configure either the number of records or the minimum buffer size so it can be tuned based on the clients any user may need to support.
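One way to expose that tuning knob (purely a sketch; asyncssh has no such parameter, and all names here are hypothetical) is a class attribute that subclasses override per deployment:

```python
class BatchingReaddirMixin:
    """Hypothetical mixin: cap FXP_READDIR replies by entry count.

    readdir_batch_size is the tunable; 128 matches the Go sftp
    implementation, 100 would match OpenSSH's server.
    """

    readdir_batch_size = 128

    def next_batch(self, names):
        """Split off one reply's worth of entries."""
        n = self.readdir_batch_size
        return names[:n], names[n:]


class SmallBatchServer(BatchingReaddirMixin):
    readdir_batch_size = 32      # tuned down for a fussy client


batch, rest = SmallBatchServer().next_batch(list(range(100)))
print(len(batch), len(rest))     # 32 68
```

A byte-budget variant would replace readdir_batch_size with a maximum reply size and sum encoded entry lengths instead, at the cost of the extra bookkeeping noted above.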

Overall, this is a very unsatisfying issue as there isn't really a right answer, but such is life.

@ronf

ronf (Owner) commented Jan 11, 2017
Thanks for your detailed analysis, Ben, and for your suggested fix!

Given that there's nothing negotiated on the wire for the maximum SFTP message size (unlike the max supported SSH packet size, which is actually exchanged when a channel is set up), there's no way to be sure what size is "safe", but it still seems worth at least trying to stay under the hardcoded maximum value in the OpenSSH implementation, since that's probably what is at the other end in most cases. If that turns out to not be sufficient, it might be possible to do something based on the identification string in the version the peer sends at startup. I can also look into making this configurable, if that's not good enough.

I'm also torn about whether to go with a max number of bytes or max number of entries, but given that we can't know the "safe" maximum number of bytes, I think a simpler fix like what you suggest here is probably the best answer for now. This could be a problem if a directory contained multiple really long filenames, but given that many OSes limit filenames to a few hundred bytes, it's pretty unlikely we'd run into that issue with 128 entries per response, especially at the OpenSSH limit of 256 KiB.

I've been making other changes to the SFTP support in the "develop" branch to make it work better on Windows. I'll put this fix in there as well, and roll it into the next release. Thanks again!


@ronf ronf self-assigned this Jan 11, 2017

@ronf ronf added the bug label Jan 11, 2017

@ronf

ronf (Owner) commented Jan 11, 2017
A fix for this is now checked into the "develop" branch.


@ronf

ronf (Owner) commented Feb 19, 2017
This fix is now available in the 1.9.0 release.


@ronf ronf closed this Feb 19, 2017
