tarfile module should have a command line #57686
Comments
The tarfile module should have a simple command line that allows it to be executed with "-m" — even if its only ability was to take a filename and extract it to the current directory, it could be a lifesaver on Windows machines where Python has been installed but nothing else. Would such a patch be welcome if I could write one up? |
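For reference, the "take a filename and extract it to the current directory" behaviour asked for above can already be approximated with the library API. The following is only a minimal sketch, not the patch discussed in this issue: the helper name `extract_here` is hypothetical, and the `filter="data"` argument is passed only on Python versions whose tarfile module provides extraction filters.

```python
import tarfile


def extract_here(archive_path, dest="."):
    """Extract every member of archive_path into dest and return the member names."""
    # Mode "r" is equivalent to "r:*": tarfile detects gzip/bz2/xz transparently.
    with tarfile.open(archive_path, "r") as tf:
        names = tf.getnames()
        # Use the safer "data" extraction filter where this Python version has it.
        kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        tf.extractall(dest, **kwargs)
    return names
```

Calling `extract_here("Python-3.3.0.tgz")` would unpack the archive into the current directory, which is roughly what a one-argument command line would do.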
The feature request seems reasonable to me, but this can only go in 3.3. |
This is not a bad idea. I recommend keeping it as simple as possible. I would definitely not be supportive of a full tar clone. List, extract, create - that should be enough. There are two possible command line choices: do what the zipfile module does or emulate tar. I am in favor of the latter. |
Patch looks good! Some minor comments on Rietveld. Could you add tests? |
+1 for adding a CLI and +1 for keeping it minimal. |
I was also working on this issue so thought I should also submit my patch.
|
Thanks for the review, Éric.
Done. Here's the new patch with Éric's comments addressed. |
Thanks for your comments Serhiy. And I am writing tests. |
It would be good if Berker and Ankur merged their patches. Ankur's patch has some very useful features, but Berker's patch looks more mature. I prefer to emulate a subset of the tar utility interface too. |
I am more in favor of having something simple and similar to zipfile, like Lars, rather than following tar. |
This can confuse users. Note that even jar (which works with zip-like files) honors the tar interface. |
Yeah, that’s always the discussion when writing a Python utility that has a unix equivalent: do you want to be familiar to Python users or to the unix tool users? I don’t have a strong opinion. I think unix users would have no reason to use python -m tarfile, and windows users won’t have the expectation that the interface is the same as tar—unless they are unix people who are using a windows machine for whatever reason. If it were me, I’d just start with python -m tarfile --help, so I’d have no expectations :) |
+ parser.add_argument('--gz', '--gunzip', '--gzip', '--tgz', '-z',
Do we really need so many names for the same option? Where do these names come from?
main() should exit after extract and create so that it performs only one operation and does not always display the usage. It would be better not to duplicate the list of options, and to use parser.print_help() instead of sys.stdout.write(doc). Some consistency tests on exclusive options (bzip/gzip/lzma and list/create/extract) would be nice.
tar options on Linux:
For tarfile, I propose to have a shorter list, and try to stay somehow compatible with tar:
Users of the TAR format usually come from UNIX, so using the same command line options should not be so surprising. I don't like the idea of an optional argument for --extract: "--extract file1 file2" is usually understood/read as "--extract=filename archive.tar". If you really think that we need to support "only extract some files", it should be a different option. Linux tar command has no such option. I propose to drop this feature (always extract all files). |
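The review points above (one operation per run, mutually exclusive actions, `parser.print_help()` instead of a duplicated usage string, and an extract option that takes only the archive) can be sketched with argparse. This is an illustration under assumptions, not the patch under review: the program name and the option spellings `-l`, `-e`, `-c` are placeholders, and `main()` only reports which action was selected instead of touching any archive.

```python
import argparse


def build_parser():
    """Build a parser in which list, extract and create exclude one another."""
    parser = argparse.ArgumentParser(prog="tarfile-sketch")
    group = parser.add_mutually_exclusive_group()
    group.add_argument("-l", "--list", metavar="ARCHIVE",
                       help="list the members of ARCHIVE")
    # Extract takes only the archive name: all members are always extracted.
    group.add_argument("-e", "--extract", metavar="ARCHIVE",
                       help="extract all members of ARCHIVE")
    group.add_argument("-c", "--create", nargs="+", metavar="NAME",
                       help="create an archive (first NAME) from the remaining NAMEs")
    return parser


def main(argv=None):
    """Return the single selected action, or print the help and return None."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.list is not None:
        return ("list", args.list)
    if args.extract is not None:
        return ("extract", args.extract)
    if args.create is not None:
        return ("create", args.create)
    # No action given: show the generated help rather than a hand-written copy.
    parser.print_help()
    return None
```

With this layout, passing two actions at once (for example `-l a.tar -e b.tar`) makes argparse report an error and exit, so the consistency check on exclusive options comes for free.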
New patch (issue13477_v3.diff) attached. Changes:
The current docstring of tarfile module does not give much |
I was trying to implement all the formats mentioned in Serhiy's review (and also different names for the same format). |
FTR Lars said that he preferred compatibility with the zipfile CLI, which is: Usage: |
Did you get all the review comments? Some of them were made on older versions of the patch, and don’t seem to be addressed in the latest version. Thanks. Ankur, could you submit a contributor agreement? http://www.python.org/psf/contrib/contrib-form/ |
I am still unclear about the outcome of the discussion. I am confused about which features need to be kept and which are to be removed.
|
Modern tar programs don't need to be told the compression method--they infer it. If they can do it in C, we can do it in Python. So we should simply omit the "-bz2" stuff. As for what the interface should look like, I'm definitely in favor of it looking like tar. unzip has the same interface on different platforms; so does 7zip, so does unrar. I think it's reasonable to expect that tar would take the same interface on different platforms. We don't need to coddle Windows users here. We're already expecting them to be sophisticated enough to handle the EOL conversion we're not doing for them. |
Note that the --create command should support the --directory option too.
An archive may have no extension or have a nonstandard extension. And stdin/stdout does not have a name. |
Huh. tar *can* infer it from the data itself. On the other hand, it chooses explicitly not to.
% cat ~/Downloads/Python-3.3.0.tar.bz2 | tar xvf -
% cat ~/Downloads/Python-3.3.0.tgz | tar xvf -
I guess "tar" knows explicit is better than implicit too ;-) |
Larry Hastings <report@bugs.python.org> writes:
I am told that the refusal of "tar" to introspect the data is because:
(a) Tar runs "gunzip -c" (for example) as an external program; it does
(b) Streams in UNIX cannot be rewound. Tar cannot look at the first
(c) Given (a) and (b), tar could only support data introspection of
(d) Therefore, tar refuses to even look.
Since Python does bundle compression in its standard library, it can |
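To make the "look at the data itself" idea concrete, here is a small sketch of recognising a compression format from the first bytes of a stream. It is a hand-rolled illustration, not what tar or the tarfile module does internally; the function name `sniff_compression` is hypothetical. The magic numbers used are the standard signatures of gzip, bzip2 and xz.

```python
import bz2
import gzip
import lzma

# Leading bytes ("magic numbers") that identify each compressed format.
MAGIC = (
    (b"\x1f\x8b", "gz"),
    (b"BZh", "bz2"),
    (b"\xfd7zXZ\x00", "xz"),
)


def sniff_compression(head):
    """Return "gz", "bz2" or "xz" for the given leading bytes, else None."""
    for magic, name in MAGIC:
        if head.startswith(magic):
            return name
    return None


# Compress the same payload three ways and inspect only the first six bytes.
samples = {
    "gz": gzip.compress(b"data"),
    "bz2": bz2.compress(b"data"),
    "xz": lzma.compress(b"data"),
}
detected = {label: sniff_compression(blob[:6]) for label, blob in samples.items()}
```

On a stream that cannot be rewound, the caller would have to keep the bytes it peeked at and hand them to the decompressor before reading further.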
I don't think that we need to support compressing/decompressing using 2013/3/20 Brandon Craig Rhodes <report@bugs.python.org>:
|
I'd like to re-emphasize that it is best to keep the whole thing as simple and straightforward as possible. Offer some basic operations and that's it. Although I am pretty accustomed to the original tar command line, I think we should copy zipfile's interface. It makes more sense to offer some kind of unified "Python" command line approach for archive access than keeping to old traditions. I agree with Victor that we don't really need support for stdin/stdout. It only complicates matters. If everybody still votes for stdin/stdout, I'd like to point out that tarfile supports compression detection for streams. It would be best to use mode="r|*" throughout because it works for both normal files and stdin. Use mode="w|(compression)" for writing to files and stdout accordingly. If we do not support stdin/stdout we no longer need all these compression options because for reading we do autodetection and for writing we could deduce the compression from the file extension (which is just some kind of autodetection too). Another side note: We should be aware of the effects discussed in bpo-17102 and bpo-1044. In my opinion tarfile as a library is obligated to behave like that, but maybe that's not acceptable for a command line tool. |
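The two mechanisms mentioned in this comment, reading with mode="r|*" and deducing the write compression from the file extension, can be sketched as follows. This is an illustration under assumptions rather than the committed code: `ReadOnlyPipe`, `write_mode_for`, `make_archive` and `list_stream` are hypothetical helpers, and the extension table is only an example mapping.

```python
import io
import os
import tarfile


class ReadOnlyPipe:
    """Stand-in for stdin: it offers read() only, so it cannot be rewound."""

    def __init__(self, data):
        self._buf = io.BytesIO(data)

    def read(self, size=-1):
        return self._buf.read(size)


def write_mode_for(filename):
    """Deduce a stream write mode such as "w|gz" from the file extension."""
    suffixes = {".gz": "gz", ".tgz": "gz", ".bz2": "bz2",
                ".tbz2": "bz2", ".xz": "xz", ".txz": "xz"}
    ext = os.path.splitext(filename)[1].lower()
    # An unknown extension falls back to "w|", an uncompressed stream.
    return "w|" + suffixes.get(ext, "")


def make_archive(mode):
    """Write a one-member archive to memory using the given stream mode."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode=mode) as tf:
        payload = b"hello"
        info = tarfile.TarInfo("hello.txt")
        info.size = len(payload)
        tf.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def list_stream(fileobj):
    """List member names from a non-seekable stream, autodetecting compression."""
    with tarfile.open(fileobj=fileobj, mode="r|*") as tf:
        return [member.name for member in tf]
```

For example, `list_stream(ReadOnlyPipe(make_archive(write_mode_for("a.tar.bz2"))))` reads the bzip2 stream back without being told the compression, which is the behaviour that would make separate compression options unnecessary for reading.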
Then I propose to add an alternative tarfile command-line interface as Tools/scripts/tar.py for those who prefer a well-known and well-tested traditional interface. |
Regenerated patch against latest default (fixing conflicts). |
Thanks for the rebase, Antoine. Here is an updated patch:
|
From a quick glance, the patch looks ok. Serhiy, do you want to review it any further? |
Yes, this is in my plans. |
I have added comments on Rietveld. |
Attached an updated patch that addresses Serhiy's comments. Thanks! |
I think Berker has misunderstood me. Here is a patch based on issue13477_v5.diff with some cherry-picked changes from issue13477_v6.diff and several other changes:
I'm going to commit this patch shortly. Known bugs:
Besides all this I think the patch can be committed. |
New changeset a5b6c8cbc473 by Serhiy Storchaka in branch 'default': |
New changeset 70b9d22b900a by Serhiy Storchaka in branch 'default': |
Thank you Antoine. |
New changeset 5b52db6fc7dc by R David Murray in branch 'default': |