-
-
Notifications
You must be signed in to change notification settings - Fork 30.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add MLSD command support to ftplib #55281
Comments
From RFC-3659: The MLST and MLSD commands are intended to standardize the file and The patch in attachment adds support for MLSD command. Example: >>> import ftplib
>>> from pprint import pprint as pp
>>> f = ftplib.FTP()
>>> f.connect("localhost")
>>> f.login("anonymous")
>>> ls = f.mlsd()
>>> pp(ls)
{'modify': 20100814164724,
'name': 'svnmerge.py',
'perm': 'r',
'size': 90850,
'type': 'file',
'unique': '80718b568'},
{'modify': 20101207185033,
'name': 'README',
'perm': 'r',
'size': 53731,
'type': 'file',
'unique': '80718aafe'},
{'modify': 20100417183215,
'name': 'install-sh',
'perm': 'r',
'size': 7122,
'type': 'file',
'unique': '80718b2d2'},
{'modify': 20110129210053,
'name': 'Include',
'perm': 'el',
'size': 4096,
'type': 'dir',
'unique': '8071a2bc4'}]
>>> |
Why the callback option? |
I agree that the callback isn't needed, and it reflects the older coding style of much of the library (such as in retrlines). Instead, I'd make this a generator, yielding each of the dicts. (Actually in some ideal rewrite of ftplib, the whole callback "feature" used by retrlines would go away except as a compatibility feature, and internally everything would use generators instead of callbacks.) Aren't you modifying the state on the server (via "OPTS MLST"), and then if you make a subsequent call without specifying "facts" you'll be using the value of "facts" from the previous call to MLSD? I don't think that's desirable, but short of calling "OPTS MLST" every time (possibly with the results of an initial FEAT) there's no way around it. Just today I saw a filename with "; " in the name. I think you want to use .partition(' ') isolate the facts from the filename, and add a test with such a filename. I don't like the "isdigit" test. Shouldn't this decision be left to the caller? What if a fact happens to look like an integer some of the time, but not always? Maybe "unique" is a hex string. I don't think you'd want it converted to an int on the occasions where it was all decimal digits, but a string otherwise. Also look at your "modify" facts. Those are not very useful as ints. I don't think you should invent a pseudo-fact "name", in case the standard ever uses that. I'd prefer if you yielded tuples of (filename, fact-dict). MLSD_DATA has newlines in it, so you get "\r\n\n" sequences in your test data. If a fact=value string does not have an '=', you silently ignore it. I'd rather this raise an exception and not pass silently. It's not compliant. You should have a test for this. It's possible for a value to have an '=' in it, so you don't want to use .split('='), you should use .partition('='). You should add such a fact to the test data. Thanks for working on this. It will be a great addition to ftplib. |
Here is a patch incorporating some of the changes proposed by Eric:
This is the behaviour according to RFC. MLSD will return the set of facts, until a new OPTS MLST issued.
Drop the isdigit test: some value such as timestamps ('modify') might be "digits" for one files and would remain strings for the others.
I don't think it should raise an error, i'd rather just pass it to the caller. There is a wide variety of possible non-compliant responses. For example there is a rigid format for time stamps, do we have to check for that? One thing, that my patch doesn't do at the moment, but I think would be useful is to normalise all fact names to lower case. What do you think? I hope that MLSD_DATA now covers wider range of cases (it's from http://tools.ietf.org/html/rfc3659#section-7.7.4). Note that server MUST NOT return facts that weren't requested. What would be the return values check here? |
Thanks for the great review Eric.
Correct. This is how OPTS/MLSD are supposed to work. |
facts_found.strip(" ").rstrip(";") strip is redundant since facts_found is a first element of partitioning by the same string. rstrip is wrong since you're potentially deleting more than one character (there is no test for that). In your test you're checking whether returned facts contain every requested fact, this is not guaranteed by RFC. Also, not sure what the last for statement is supposed to mean. Is it a typo? |
You're right about r/strip(), thanks (new patch in attachment).
Yes, but we're using a dummy FTP server returning static MLSD_DATA, so this is not important.
It makes sure whether in case of directory empty (see: no data returned) we don't enter in the "for" block. |
I'll give this a proper review in the next day or so (busy at PyCon). Despite the fact that I typically hate changing values returned by the server, I agree on case-folding the fact names to lowercase upon reading them. The RFC clearly states they're case insensitive: "Fact names are case-insensitive. Size, size, SIZE, and SiZe are the same fact." You should have the facts in MLSD_DATA be mixed case, to ensure they're being lowercased correctly. I agree that there's nothing that can be done in the case where no facts are specified: the spec clearly says just use the values we've previously set. Caller beware. |
This is already tested by: + # case sensitiveness |
So it is. I told you I hadn't done a proper review! I was mainly trying to say I agree with lowercasing. I'll shut up until I can read the whole patch. |
Eric, any further comments about the patch? |
Yes, I think this should be committed. I think the API is reasonable, which is the primary concern. If there are implementation bugs, they can be addressed as they're found. |
New changeset 56bce38b274f by Giampaolo Rodola' in branch 'default': |
Has something prevent you from implementing suggestion provided in my review? |
Which ones in particular? msg130474? |
a review on rietveld http://bugs.python.org/review/11072/show |
In my view, aside from the documentation note they're all minor/styling-related changes but feel free to provide a patch if you want to and I'll commit it. |
New changeset 153bd8fc22c7 by Giampaolo Rodola' in branch 'default': |
New changeset ecef6b3f6639 by Gregory P. Smith in branch '3.5': New changeset 287bb82768a7 by Gregory P. Smith in branch 'default': |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: