mc cd foo.tar#utar does not handle POSIX ustar archives, only GNU tar vendor-specific/legacy ones #1952

Closed
mc-butler opened this issue Jan 10, 2010 · 43 comments
Labels: area: vfs (Virtual File System support), prio: medium (Has the potential to affect progress)

@mc-butler

Important

This issue was migrated from Trac:

Origin https://midnight-commander.org/ticket/1952
Reporter mirabilos
Mentions miros-discuss@….org, zaytsev (@zyv), mrmazda@….net, nerijus@….sourceforge.net, szotsaki@….com

Hi,
please see http://www.opengroup.org/onlinepubs/9699919799/utilities/pax.html#tag_20_92_13_06
for the specification of the POSIX ustar interchange format.
GNU cpio (-Hustar), paxtar, and GNU tar --format=ustar
all create archives of this format; bsdtar probably does
as well. However, I cannot cd#utar or “Enter” them in
either mc-4.6.1-16 (MirPorts) or mc_3:4.7.0-1 (Debian sid).
After looking at tar.c I think you only support the legacy
or vendor-specific/proprietary GNU tar archive format.
The new boot floppies of MirBSD as of today are ustar
archives, with the bootsector squeezed into a ustar
header and closely following the standard. Introspection
would be nice.
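
For illustration only, here is a minimal Python sketch of how the two header flavours discussed above can be told apart. It is not MC's tar.c code. It relies on the header layout from the pax specification linked above (a 6-byte magic field at offset 257 followed by a 2-byte version field) and on the well-known old GNU variant, which stores "ustar" followed by two spaces and a NUL in the same eight bytes. The function name and the returned labels are made up for this example.

```python
BLOCK_SIZE = 512  # tar headers are 512-byte blocks


def classify_tar_header(block: bytes) -> str:
    """Classify a tar header block by its magic/version fields."""
    if len(block) < BLOCK_SIZE:
        raise ValueError("a tar header block is 512 bytes")
    magic = block[257:263]    # "magic" field, 6 bytes
    version = block[263:265]  # "version" field, 2 bytes
    if magic == b"ustar\x00" and version == b"00":
        return "posix"    # POSIX ustar (pax archives use the same magic)
    if magic == b"ustar " and version == b" \x00":
        return "gnu-old"  # pre-POSIX GNU tar layout
    return "unknown"      # e.g. V7 tar, or not a tar header at all
```

A real reader would also verify the header checksum before trusting these fields; that step is left out here to keep the sketch short.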

Note

Original attachments:

@mc-butler

Changed by andrew_b (@aborodin) on Jun 17, 2010 at 13:17 UTC (comment 1)

  • Milestone changed from 4.7 to 4.7.3
  • Status changed from new to accepted
  • Owner set to andrew_b
  • Severity changed from no branch to on review

Created 1952_branch. Parent branch is master.
[ff37dc26d46f652538c34475fec3f2b9bc9aa536]

In this branch, MC uses an external tar program instead of parsing tar archives itself. This branch also fixes #2201.

@mc-butler

Changed by andrew_b (@aborodin) on Jun 17, 2010 at 16:42 UTC (comment 2)

  • Severity changed from on review to on rework

@mc-butler

Changed by andrew_b (@aborodin) on Jun 17, 2010 at 18:03 UTC (comment 3)

  • Severity changed from on rework to on review

Fixed extraction of files from tar archives.

@mc-butler

Changed by andrew_b (@aborodin) on Jun 18, 2010 at 5:27 UTC (comment 4)

  • Severity changed from on review to on rework

There are problems with devices.

@mc-butler

Changed by andrew_b (@aborodin) on Jun 18, 2010 at 6:44 UTC (comment 5)

  • Severity changed from on rework to on review

I hope that's all. :)

Initial [fbacd863dd2d9ae65903954827ae7547699d64db]

@mc-butler

Changed by mirabilos on Jun 19, 2010 at 14:40 UTC (comment 6)

I’m reading the unidiff… this now looks better, but the various tar
utilities’ output formats *also* differ:

GNU tar

tg@frozenfish:~ $ tar tzvf mksh_39.3.orig.tar.gz
-rw-r--r-- root/wheel 296033 2010-02-25 22:03 mksh-39.3.orig/mksh-R39c.cpio.gz
-rw-r--r-- root/wheel 11840 2010-01-28 16:22 mksh-39.3.orig/printf.c.1.14

paxtar (OpenBSD, MirBSD, maybe others; I have a Debian package):

tg@blau:~ $ tar tzvf mksh_39.3.orig.tar.gz
-rw-r--r-- 1 root wheel 296033 Feb 25 22:03 mksh-39.3.orig/mksh-R39c.cpio.gz
-rw-r--r-- 1 root wheel 11840 Jan 28 16:21 mksh-39.3.orig/printf.c.1.14

bsdtar (libarchive-based; native on FreeBSD, MidnightBSD and others):

mirabilos@stargazer:~ $ tar tzvf mksh_39.3.orig.tar.gz
-rw-r--r-- 0 root wheel 296033 Feb 25 22:03 mksh-39.3.orig/mksh-R39c.cpio.gz
-rw-r--r-- 0 root wheel 11840 Jan 28 16:22 mksh-39.3.orig/printf.c.1.14

There may very well be others, but these three are the most often
used – although, on FreeWRT, we have busybox tar (because one of the
libc functions paxtar uses seems to be broken with µClibc):

root@wlan1:~ # tar tvf mksh_39.3.orig.tar
-rw-r--r-- 0/0 296033 2010-02-25 23:03:39 mksh-39.3.orig/mksh-R39c.cpio.gz
-rw-r--r-- 0/0 11840 2010-01-28 17:22:12 mksh-39.3.orig/printf.c.1.14

And yes, I’m also the maintainer of mc on FreeWRT ;-)

@mc-butler

Changed by andrew_b (@aborodin) on Jun 20, 2010 at 14:06 UTC (comment 7)

OK, I see.

What can we do?

  • We can parse the output of tar --version and call the corresponding function for each tar utility (GNU, paxtar, bsdtar, ...) in the new utar script.
  • We can support all tar formats in the binary (as MC currently does), but that will increase the size of the main MC executable (for reference, the GNU tar binary is more than 200 kB).
  • We can use some 3rd-party library or framework that supports tar archives:
  • Something else
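
As a rough sketch of the first option, the dispatch on the tar --version banner could look like the Python below. It is illustrative only; the real extfs helper is a script and none of this is MC code. The substrings matched are assumptions based on commonly seen banners (GNU tar mentions "GNU tar", the libarchive tool mentions "bsdtar" and "libarchive", BusyBox mentions "busybox") and should be checked against real installations. The function name and labels are hypothetical.

```python
def guess_tar_flavor(version_banner: str) -> str:
    """Guess which tar implementation printed the given --version banner."""
    text = version_banner.lower()
    if "gnu tar" in text:
        return "gnu"
    if "bsdtar" in text or "libarchive" in text:
        return "bsdtar"
    if "busybox" in text:
        return "busybox"
    # Anything else, including a tar that rejects --version entirely,
    # lands here and needs another detection method.
    return "unknown"
```

The caller would run the tar binary once, pass whatever it printed to this function, and pick the matching listing parser.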

@mc-butler

Changed by mirabilos on Jun 20, 2010 at 14:26 UTC (comment 8)

Only GNU tar supports --long-options.

I see two ways:

Either support all formats (I see two "major" differences with two subtle
subformats each) in the vfs script, or detect which tar /bin/tar is at
configure time (e.g. by checking a minimal tar file, I can produce one
which is 2K in size) and patch the vfs script (using the .in mechanism
would be fine) and hardcode /bin/tar as $TAR.

We could whitelist the four supported output formats (also consider that
gid 0 can be root, wheel, or something else…) and reject unknowns, thus
getting people to send in the actual output THEY get. Locale settings may
be an issue with GNU software (and some other) too.

This would break cross compilation though.

Or we could just try to apply guesswork (for instance, uid/gid or
uid<whitespace>gid, and it doesn’t matter whether uid and gid are
numeric or not… just the time/date format is annoying – the ls(1)-like
format is something I loathe to parse, but you can relatively easily
check for it). FWIW:

tg@blau:~ $ tar tzvf /MirOS/dist/mir/mksh/mksh-R24.cpio.gz | head -1
-rw-r--r-- 1 root wheel 125442 Jul 6 2005 mksh/mksh.1

This is the format I see with “old” files.

So I’d be all for the first way – support all of them in the vfs script.
If you want I could have a look at hacking this too; I have access to
Solaris, possibly HP-UX and AIX (if they get the lpar to boot/work again),
so I could test it on relatively many systems. I’d need to be pointed to
a specification of what exact arguments, input and output the vfs scripts
receive and are supposed to output though.
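
To make the guesswork idea concrete, here is a small Python sketch (not the actual vfs script, which would more likely be shell or Perl) that accepts both owner layouts and both date layouts seen in the listings quoted in this ticket. The regular expression and the function name are assumptions made for this example. It ignores device nodes, link targets and locale-dependent month names.

```python
import re

# Owner/group appear either as "uid/gid" (GNU tar, busybox) or as
# "nlink uid gid" (paxtar, bsdtar). The date is either ISO-like or ls(1)-like.
_LISTING = re.compile(
    r"^(?P<mode>\S{10})\s+"
    r"(?:(?P<owner1>[^\s/]+)/(?P<group1>\S+)"
    r"|(?P<nlink>\d+)\s+(?P<owner2>\S+)\s+(?P<group2>\S+))\s+"
    r"(?P<size>\d+)\s+"
    r"(?P<date>\d{4}-\d{2}-\d{2}\s+\d{2}:\d{2}(?::\d{2})?"
    r"|[A-Z][a-z]{2}\s+\d{1,2}\s+(?:\d{2}:\d{2}|\d{4}))\s"
    r"(?P<name>.*)$"
)


def parse_listing_line(line: str):
    """Return a dict describing one 'tar tvf' line, or None if unrecognized."""
    m = _LISTING.match(line.rstrip("\n"))
    if m is None:
        return None  # unknown format: reject instead of guessing further
    return {
        "mode": m["mode"],
        "owner": m["owner1"] or m["owner2"],
        "group": m["group1"] or m["group2"],
        "size": int(m["size"]),
        "date": " ".join(m["date"].split()),  # normalize inner whitespace
        "name": m["name"],
    }
```

Returning None for anything unrecognized matches the whitelisting suggestion above: unknown output is rejected, so users can report the exact lines they get.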

@mc-butler

Changed by andrew_b (@aborodin) on Jun 20, 2010 at 14:42 UTC (comment 8.9)

  • Description edited

Replying to mirabilos:

Only GNU tar supports --long-options.

Long options are not used in the recent version of the vfs script.

I see two ways:
[skip]
This would break cross compilation though.

Cross compilation wouldn't be broken.

I’d need to be pointed to
a specification of what exact arguments, input and output the vfs scripts
receive and are supposed to output though.

You can find that in the MC source tree (lib/vfs/mc-vfs/extfs/README) or in the installed MC on your system (/usr/libexec/mc/extfs.d/README or /usr/lib/mc/extfs.d/README).

Thanks!

@mc-butler

Changed by andrew_b (@aborodin) on Jul 5, 2010 at 10:18 UTC (comment 10)

  • Severity changed from on review to on rework

@mc-butler

Changed by angel_il (@ilia-maslakov) on Jul 5, 2010 at 20:27 UTC (comment 11)

  • Milestone changed from 4.7.3 to 4.7.4

@mc-butler

Changed by andrew_b (@aborodin) on Jul 26, 2010 at 12:12 UTC

@mc-butler

Changed by andrew_b (@aborodin) on Aug 14, 2010 at 15:11 UTC

@mc-butler

Changed by zaytsev (@zyv) on Sep 9, 2010 at 16:28 UTC (comment 14)

  • Cc changed from miros-discuss@….org to miros-discuss@….org, zaytsev

There is a re-implementation of the tar script in the Debian bug tracker:

http://bugs.debian.org/500693

Maybe you can steal something from there.

@mc-butler

Changed by andrew_b (@aborodin) on Sep 20, 2010 at 16:28 UTC (comment 15)

  • Severity changed from on rework to on review
  • Keywords ustar, tar, vfs deleted
  • Milestone changed from 4.7.4 to 4.7.5
  • Version changed from 4.6.1 to master

Branch 1952_tar. Parent: master.
[cae7459699f6a22d63272e66dcfa4eedc017a765]

@mc-butler

Changed by andrew_b (@aborodin) on Sep 28, 2010 at 17:33 UTC (comment 16)

Recent master contains a modified VFS layer. Branch 1952_tar has been rebased.
Initial [dbf60df91916ca167270aa06d2cd1c88c0ac3cc7]

@mc-butler

Changed by slavazanko (@slavaz) on Oct 29, 2010 at 11:23 UTC (comment 17)

  • Severity changed from on review to on hold
  • Blocked by set to #3

Ticket frozen until #3 is fixed.

@mc-butler

Changed by slavazanko (@slavaz) on Jul 8, 2011 at 9:30 UTC (comment 18)

  • Branch state set to on hold
  • Severity changed from on hold to no branch

@mc-butler

Changed by andrew_b (@aborodin) on Jul 8, 2011 at 9:46 UTC (comment 19)

  • Milestone changed from 4.7.5 to 4.8

@mc-butler

Changed by andrew_b (@aborodin) on Aug 28, 2011 at 14:46 UTC

@mc-butler

Changed by andrew_b (@aborodin) on Jun 18, 2015 at 18:25 UTC (comment 21)

  • Milestone changed from 4.8 to Future Releases

@mc-butler

Changed by andrew_b (@aborodin) on Apr 25, 2017 at 10:37 UTC

@mc-butler

Changed by mrmazda (mrmazda@….net) on Apr 7, 2020 at 4:39 UTC (comment 23)

I've been extracting mozilla.org's Linux archives for two decades on various GNU/Linux distributions using MC exclusively, in virtually all cases with the MC version packaged by the distro.

http://archive.mozilla.org/pub/firefox/releases/68.5.0esr/linux-x86_64/en-US/firefox-68.5.0esr.tar.bz2 2020-02-10 is the last version I was able to do this with successfully.

As of http://archive.mozilla.org/pub/firefox/releases/68.6.0esr/linux-x86_64/en-US/firefox-68.6.0esr.tar.bz2 2020-03-09 the destination has corrupted timestamps, 1970-01-01 for ordinary files, current date/time for directories, using 4.8.24 on Fedora 32, Debian Testing/Bullseye and openSUSE Tumbleweed.

Same problem with http://archive.mozilla.org/pub/firefox/releases/68.7.0esr/linux-x86_64/en-US/firefox-68.7.0esr.tar.bz2 2020-04-06.

@mc-butler

Changed by mrmazda (mrmazda@….net) on Apr 7, 2020 at 4:47 UTC (comment 24)

  • Cc changed from miros-discuss@….org, zaytsev to miros-discuss@….org, zaytsev, mrmazda@….net

@mc-butler

Changed by nerijus (nerijus@….sourceforge.net) on Jul 12, 2020 at 10:38 UTC (comment 25)

  • Cc changed from miros-discuss@….org, zaytsev, mrmazda@….net to miros-discuss@….org, zaytsev, mrmazda@….net, nerijus@….sourceforge.net

@mc-butler

Changed by zaytsev (@zyv) on Sep 29, 2020 at 7:55 UTC

@mc-butler

Changed by zaytsev (@zyv) on Sep 29, 2020 at 7:56 UTC

@mc-butler

Changed by zaytsev (@zyv) on Sep 29, 2020 at 9:21 UTC (comment 27)

So, the SUSE people updated the script in mid-2018, and apparently it has been working well for quite some time. Andrew, what's your opinion? Is there a good reason (performance? availability on embedded systems without a tar executable?) why we should keep our tar code?

If it makes more sense to keep our code, I wonder if we could steal from somewhere a modern and clean implementation of all the tar subformats floating around, instead of having our own old, unmaintained implementation, which was probably branched from somewhere at some point...

@mc-butler

Changed by andrew_b (@aborodin) on Sep 29, 2020 at 10:08 UTC (comment 28)

File extraction will be too slow (as with uzip). A tar archive doesn't contain a list of files. To extract a file, you have to walk through the archive to find it. To extract the next file, you have to walk through the archive again. Again and again.

In MC's tar implementation, the positions of all files are stored while the archive is read and are then used when files are read or extracted.

I'm working on an update of tar -- I'm trying to sync its code with GNU tar's. Unfortunately, I don't have enough time for that. It's not a trivial task because MC's tar is GNU tar from approx. 25 years ago.
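
The position-caching idea can be sketched in a few lines of Python using the standard tarfile module. This is only an illustration of the technique, not MC's C implementation. The function names are invented for the example, and it assumes an uncompressed, seekable archive.

```python
import io
import tarfile


def build_index(archive):
    """Walk the archive once and remember where each regular file's data is."""
    index = {}
    with tarfile.open(fileobj=archive, mode="r:") as tar:
        for member in tar:
            if member.isreg():
                index[member.name] = (member.offset_data, member.size)
    return index


def read_member(archive, index, name):
    """Fetch one file's bytes by seeking directly, without rescanning."""
    offset, size = index[name]
    archive.seek(offset)
    return archive.read(size)


def make_demo_archive():
    """Build a tiny in-memory ustar archive to try the functions on."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name, data in (("a.txt", b"alpha"), ("dir/b.txt", b"bravo!")):
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return buf
```

With the index built once, `read_member(buf, index, "dir/b.txt")` costs one seek and one read. Driving an external tar program for each file would have to scan from the start of the archive every time, which is the slowdown described above.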

@mc-butler

Changed by andrew_b (@aborodin) on Sep 29, 2020 at 10:13 UTC (comment 28.29)

Replying to andrew_b:

tar doesn't contain a list of files.

It couldn't help in any case in the current VFS implementation (see #3).

@mc-butler

Changed by zaytsev (@zyv) on Sep 29, 2020 at 11:44 UTC (comment 30)

Oh wow, thank you very much for the explanation. Yes, if you think that it's possible to sync up the code with GNU tar, this would be perfect. Hopefully if done right, later syncs will be much easier. One could also try to steal code from libarchive. No idea if it's any easier and/or better...

@mc-butler

Changed by szotsaki (szotsaki@….com) on Dec 28, 2020 at 15:39 UTC (comment 31)

  • Cc changed from miros-discuss@….org, zaytsev, mrmazda@….net, nerijus@….sourceforge.net to miros-discuss@….org, zaytsev, mrmazda@….net, nerijus@….sourceforge.net, szotsaki@….com

@mc-butler

Changed by andrew_b (@aborodin) on May 7, 2023 at 9:47 UTC (comment 32)

  • Blocked by #3 deleted

@mc-butler

Changed by andrew_b (@aborodin) on May 7, 2023 at 9:52 UTC (comment 33)

  • Milestone changed from Future Releases to 4.8.30
  • Branch state changed from on hold to on review

MC's tar now supports various extended headers (including long file names and sparse files). The implementation is taken from GNU tar. Please test.

Branch: 1952_tar
Initial [78a25f7]

@mc-butler

Changed by zaytsev (@zyv) on May 7, 2023 at 11:59 UTC (comment 34)

This is awesome work! I wonder if the code can be organised somehow such that updates from GNU tar will be easier in the future by checking the diff and just stealing the code...

@mc-butler

Changed by andrew_b (@aborodin) on May 20, 2023 at 16:44 UTC (comment 35)

  • Votes set to andrew_b
  • Branch state changed from on review to approved

@mc-butler

Changed by andrew_b (@aborodin) on May 20, 2023 at 16:47 UTC (comment 36)

  • Votes changed from andrew_b to committed-master
  • Resolution set to fixed
  • Branch state changed from approved to merged
  • Status changed from accepted to testing

Merged to master: [e5911c1].

git log --pretty=oneline 86a9e0be2..e5911c1ef

@mc-butler

Changed by andrew_b (@aborodin) on May 20, 2023 at 16:50 UTC (comment 37)

  • Status changed from testing to closed

@mc-butler

Changed by ukr (@ePubRepo) on May 24, 2023 at 15:33 UTC

@mc-butler

Changed by andrew_b (@aborodin) on May 24, 2023 at 17:09 UTC (comment 39)

  • Resolution fixed deleted
  • Status changed from closed to reopened

@mc-butler

Changed by andrew_b (@aborodin) on May 24, 2023 at 17:50 UTC (comment 40)

  • Branch state changed from merged to on review
  • Votes committed-master deleted

Timestamps in tar archive are shown as "Jan 1, 1970".

Branch: 195_tar_timestamp
[c9169c0aa8c162ce6b5fd15636753865f9c3f844]

@mc-butler

Changed by andrew_b (@aborodin) on May 28, 2023 at 16:21 UTC (comment 41)

  • Branch state changed from on review to approved
  • Votes set to andrew_b

@mc-butler

Changed by andrew_b (@aborodin) on May 28, 2023 at 16:25 UTC (comment 42)

  • Branch state changed from approved to merged
  • Resolution set to fixed
  • Votes changed from andrew_b to committed-master
  • Status changed from reopened to closed

Merged to master: [5ac1e86].
