Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ftplib: command response shift - mismatch #69644

Open
peterpan mannequin opened this issue Oct 22, 2015 · 15 comments
Open

ftplib: command response shift - mismatch #69644

peterpan mannequin opened this issue Oct 22, 2015 · 15 comments
Labels
3.7 (EOL) end of life stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@peterpan
Copy link
Mannequin

peterpan mannequin commented Oct 22, 2015

BPO 25458
Nosy @giampaolo, @native-api
Files
  • ftplib.py: patched ftplib.py
  • test_ftp.py: test for patched ftplib.py
  • ftp_error_illustration.txt
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2015-10-22.10:35:52.716>
    labels = ['3.7', 'type-bug', 'library']
    title = 'ftplib: command response shift - mismatch'
    updated_at = <Date 2016-12-14.22:48:22.756>
    user = 'https://bugs.python.org/peterpan'

    bugs.python.org fields:

    activity = <Date 2016-12-14.22:48:22.756>
    actor = 'Ivan.Pozdeev'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2015-10-22.10:35:52.716>
    creator = 'peterpan'
    dependencies = []
    files = ['42099', '42100', '45820']
    hgrepos = []
    issue_num = 25458
    keywords = []
    message_count = 15.0
    messages = ['253326', '261374', '261389', '261417', '261418', '282775', '282793', '282796', '282797', '282899', '283073', '283125', '283215', '283217', '283225']
    nosy_count = 3.0
    nosy_names = ['giampaolo.rodola', 'Ivan.Pozdeev', 'peterpan']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = None
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue25458'
    versions = ['Python 2.7', 'Python 3.3', 'Python 3.4', 'Python 3.5', 'Python 3.6', 'Python 3.7']

    @peterpan
    Copy link
    Mannequin Author

    peterpan mannequin commented Oct 22, 2015

    When handling the transfer socket manually the asynchronous status message "226 transfer complete" on the control socket is falsely taken as response for the last sent command.
    ftplib reads the response too late and the command/response order becomes invalid.

    I can avoid that by using the undocumented ftplib internal method FTP.getline() after the transfer socket is closed and not sending more commands while the transfer socket is open.

    It would be useful, if ftplib empties the response socket buffer before sending the next command. But maybe the best solution is an optional function callback when the "226" response appears, while it is ignored when not matching the last sent command.

    Example code that triggers the problem:

    import ftplib
    import socket
    import re
    
    
    ftp = ftplib.FTP()
    ftp.set_debuglevel(1)
    ftp.connect('ftp.debian.org', timeout=10)
    ftp.login('anonymous','user@example.com')
    ftp.sendcmd('TYPE A')
    
    s = ftp.transfercmd('LIST')

    '''
    #manual transfer socket - should result in same behaviour
    r = ftp.sendcmd('EPSV')
    r = re.search('\|([0-9]+)\|', r)
    port = int( r.group(1) )
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect(('ftp.debian.org', port))
    ftp.sendcmd('LIST')
    '''

    fp = s.makefile('r')
    fp.read()
    fp.close()
    s.close()
    
    #ftplib falsely sees "226 transfer complete" as response to next command
    ftp.sendcmd('NOOP')

    @peterpan peterpan mannequin added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Oct 22, 2015
    @giampaolo
    Copy link
    Contributor

    I'm not sure I understood the problem. Perhaps if you provide a patch that would make things more clear and you'd have more chances to see this issue fixed.

    @peterpan
    Copy link
    Mannequin Author

    peterpan mannequin commented Mar 8, 2016

    The problem in my example is ftplib reports a "226" response to command "NOOP" which is nonsense. ftplib received "226" before "ftp.sendcmd('NOOP')" was called.

    Since "transfercmd()" returns a socket, ftplib was planned to allow for manual transfer socket handling, but it is currently unable to handle the asynchronous responses from the server when the transfer is done. This is a bug or design error.

    Multiple changes are needed to support manual transfer socket handling. I suggest the following:

    Since asynchronous responses from the server are possible on the command socket, "set_debuglevel(1)" must report them at once, but this would require multithreading. A good compromise is to debug print and clear any buffered status right before sending the next command.
    We also need a method to check the last status code, in order to know the result of the last manual transfer. ftplib has to store it separately as an attribute.

    New attribute
    -------------

    this.last_status_code = None #has no effect to any command or debug output
    this.last_status_message

    New internal method
    -------------------

    #loop: look for buffered status response; if present, print debug and assign this.last_status = buffer.pop()
    .unbuffer():
    ...

    New user methods
    ----------------

    #Set last status to None so we can use "get_last_status" to check/poll for the next one.
    .clear_last_status():
    this.last_status_code = None
    this.last_status_message = ""
    return

    #Return the last status received from the server; if there is one on the buffer, unbuffer and return that.
    .get_last_status():
    this.unbuffer()
    return [this.last_status_code, this.last_status_message]

    @peterpan
    Copy link
    Mannequin Author

    peterpan mannequin commented Mar 9, 2016

    I've updated "ftplib.py" from the 3.5.1 source code release.
    This should fix issues:
    http://bugs.python.org/issue25458
    http://bugs.python.org/issue25491

    @peterpan
    Copy link
    Mannequin Author

    peterpan mannequin commented Mar 9, 2016

    Here is a small test for the new version.

    (To see the original ftplib.py version failing copy+paste the code from my initial post into a python file and run)

    @native-api native-api mannequin added the 3.7 (EOL) end of life label Dec 9, 2016
    @native-api
    Copy link
    Mannequin

    native-api mannequin commented Dec 9, 2016

    The current solution looks fishy to me. We should stick to https://tools.ietf.org/html/rfc959 . In particular, 226 is sent when the server closes the auxiliary socket, so the module should react accordingly. Debug printing and/or issuing warnings is an obvious no-go.

    Also attached a text file illustrating the problem.

    @native-api
    Copy link
    Mannequin

    native-api mannequin commented Dec 9, 2016

    Found the root problem: a 1xx response doesn't complete a LIST command, it should wait further for a 2xx one. See RFC 959 section 6 (state diagrams).

    This could be urllib's rather than ftplib's fault: the former calls low-level subroutines of the latter.

    @native-api
    Copy link
    Mannequin

    native-api mannequin commented Dec 9, 2016

    Darn, my problem _is_ in urllib and thus is different that the one in this ticket. Though it too results in a "command response shift".

    @native-api
    Copy link
    Mannequin

    native-api mannequin commented Dec 9, 2016

    The solution for the OP's problem is:

    • after closing the data socket (whether it was returned by transfercmd()'/ntransfercmd()' or opened manually), an additional FTP.getresp()'/FTP.voidresp()' is required to process the 226 response.

    The built-in retrbinary()' and retrlines()' do implement this logic.

    But the documentation doesn't mention this requirement in transfercmd()'/ntransfercmd() entries (and getresp()'/voidresp()' are undocumented outright).

    urllib' authors have fallen into this same trap: they use transfercmd' directly but don't call `voidresp()' afterwards.

    @native-api
    Copy link
    Mannequin

    native-api mannequin commented Dec 11, 2016

    Spawned the `urllib' issue to bpo-28931.

    @native-api
    Copy link
    Mannequin

    native-api mannequin commented Dec 13, 2016

    I tried to fix urllib' and ultimately failed. In a nutshell, handling of the aftermath of an ntransfercmd' is broken.

    Since ntransfercmd'/transfercmd' returns a socket, handling of an end-of-transmission response is done in independently invoked code - upon the socket's close. So, if any other operation on the command connection is attempted before that, `ftplib's handling of the session breaks.

    The plan to fix follows, but first, some background:

    According to http://stackoverflow.com/questions/2549829/using-ftplib-for-multithread-uploads , FTP actually doesn't, and is not designed to, handle transfers in "background". In that you surely can send a further command on the socket, but the server won't actually read it until it's done with the transfer.

    According to http://stackoverflow.com/questions/31560701/ftp-data-connections-reuse , data connections cannot be reused.

    (RFC959 is vague on both points)

    Now, the proposed fix design:

    • upon starting a transfer, an FTP object sets a flag, `transfer_in_progress'.
    • any public subroutine that would send a further command checks the flag. Then, there are a few options:
      a) refuse any further commands outright until the user explicitly closes the socket (whose close handler would also handle the end-of-transfer response)
      b) check the wire for an end-of-transfer response and if it's there, handle it and allow the command. Closing the socket may or may not handle the response in its own right.
      c) allow the command even if there's no end-of-transfer response. Then handling of the transfer response is done with the function parsing the response for the new command - which will hang until the transfer is complete.
    • the code handling the end-of-transfer response clears the flag.

    Each option has drawbacks, so I'm not sure which one is the most pythonic.
    I would go with b) because it would allow syntax like urllib.urlopen(<...>).read() which it does allow for HTTP.

    @native-api
    Copy link
    Mannequin

    native-api mannequin commented Dec 13, 2016

    One more concern about the fix (so that you don't assume I didn't think of this ;) ) - handling of errors signified by the end-of-transfer response.

    Handling a response in a close handler prevents us from actually checking its code:

    Now,

    • The errors where the transfer never starts are detected by `ntransfercmd' either when opening the socket (425) or checking the first response (55x)
      • an exception when opening the socket would result in the response not being read.
    • The errors when the transfer ends prematurely
      • are either handled by the socket (connection reset/timeout)
      • or can be detected by checking data length against the real one if it's available <- these are not currently handled
        • if it results from the user closing the socket prematurely (426), it _should_ be ignored
        • otherwise, we can wrap recv(), too, and check the response if the underlying fn returns ''
    • If the error is local (an exception is raised in other code), the server's response doesn't matter

    Looks like fixing this part warrants a separate ticket, though it does affect which option to take at this step - it speaks in favor of wrapping the data socket.

    I'll ask at python-dev for some feedback before I go any way.

    @giampaolo
    Copy link
    Contributor

    Maybe it's me but I still fail to understand what's the issue here. From what I understand from the last message(s) it seems there is complaint about not being able to use ftplib with threads.
    FTP is a "command - response" protocol which is designed to handle one request at a time, meaning you send a command then you are supposed to *wait* to get a response back.
    If you want to do multiple transfers in parallel then you need a separate connection, not a separate thread using the same connection.
    ftplib should *not* be thread-safe because of how FTP works.

    @native-api
    Copy link
    Mannequin

    native-api mannequin commented Dec 14, 2016

    • The initial msg253326 has a code snippet using ftplib directly and showing the error.
    • The linked bpo-28931 has another snippet that uses ftplib through urllib and results in the same error.

    There isn't a single mention of "thread" in either.
    Could you just run either of them and see for yourself?

    I can rub your nose in it, but I thought you're better than that :)

    Oh well...

    C:\Ivan>python
    Python 2.7.12 (v2.7.12:d33e0cf91556, Jun 27 2016, 15:19:22) [MSC v.1500 32 bit Intel)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import ftplib
    >>> ftp = ftplib.FTP()
    >>> ftp.connect('ftp.debian.org', timeout=10)
    '220 ftp.debian.org FTP server'
    >>> ftp.login('anonymous','user@example.com')
    '230 Login successful.'
    >>> ftp.sendcmd('TYPE A')
    '200 Switching to ASCII mode.'
    >>> s = ftp.transfercmd('LIST')
    >>> fp = s.makefile('r')
    >>> fp.read()
    'drwxr-xr-x    9 1176     1176         4096 Dec 14 15:08 debian\r\n'
    >>> fp.close()
    >>> s.close()
    >>> # Now the session is broken:
    ...
    >>> ftp.sendcmd('NOOP')
    '226 Directory send OK.'
    >>> ftp.dir()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "C:\Py\lib\ftplib.py", line 534, in dir
        self.retrlines(cmd, func)
      File "C:\Py\lib\ftplib.py", line 437, in retrlines
        conn = self.transfercmd(cmd)
      File "C:\Py\lib\ftplib.py", line 376, in transfercmd
        return self.ntransfercmd(cmd, rest)[0]
      File "C:\Py\lib\ftplib.py", line 334, in ntransfercmd
        host, port = self.makepasv()
      File "C:\Py\lib\ftplib.py", line 312, in makepasv
        host, port = parse227(self.sendcmd('PASV'))
      File "C:\Py\lib\ftplib.py", line 830, in parse227
        raise error_reply, resp
    ftplib.error_reply: 200 Switching to ASCII mode.

    @native-api
    Copy link
    Mannequin

    native-api mannequin commented Dec 14, 2016

    Python 3.5.2 (v3.5.2:4def2a2901a5, Jun 25 2016, 22:01:18) [MSC v.1900 32 bit (Intel)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import ftplib
    <...>
    >>> ftp.sendcmd('NOOP')
    '226 Directory send OK.'

    There are no changes in and `ntransfercmd' between 2.7 and default branches save for cosmetic ones.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    1 participant