Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

extractAll() for Tarballs in tarPath not extracting files #6

Closed
theAkito opened this issue Jan 31, 2021 · 3 comments
Closed

extractAll() for Tarballs in tarPath not extracting files #6

theAkito opened this issue Jan 31, 2021 · 3 comments

Comments

@theAkito
Copy link

theAkito commented Jan 31, 2021

I want to extract a Tarball located at tarPath to tmpDir (not existing sub-directory). However, nothing gets extracted.

When I extract this Tarball with PeaZip, everything gets extracted perfectly and works fine. However, when I do it with this procedure

proc extractAll(tarPath, dest: string) {.raises: [IOError, ZippyError, OSError], tags: [
 ReadIOEffect, ReadDirEffect, ReadEnvEffect, WriteDirEffect, WriteIOEffect].}

the files never get extracted.

There is no exception, no message, nothing. No idea what is going on there.

The Tarball is named properly, suffixed with a .tar and works perfectly fine.

Update 1

After debugging tarball.nim I found out that inside the extractAll procedure, the loop after

NEVER gets entered. It just skips the entire rest of the procedure at this line.
Watching open, this procedure seems to work fine (but perhaps is not adding the files properly to the tarball) and finds all filenames from the tarball.

Update 2

As expected, the provided Tarball() does not have contents, so nothing is extracted, as nothing is provided by open():

ref 0000000004913fb0 --> [contents = [data = [],
counter = 0,
first = -1,
last = -1]]

Getting the above when running echo tarball.repr here:

Update 3

Through further debugging open() my suspicion was confirmed: open() never fills the tarball.contents; they always remain empty.

Update 4

if typeFlag == '0' or typeFlag == '5':

The above line ensures that tarball.contents are only filled, when the typeFlag either equals 0 or 5.
The problem is: typeFlag is always empty!

Update 5

fileNamePrefix =

The above line should ensure, that a valid fileNamePrefix is given, if the header is specifying a ustar archive.
The issue is, that fileNamePrefix always remains empty. because neither header[345 ..< 500].trim() nor header[345 ..< 500] returns any output. Not sure if this is required or optional. It however does not seem to be directly related to the bigger problem at hand, but it's worth noting.

Update 6

I found the mistake!

When using writeTarball(), the headers for each file are built manually:

var header = newStringOfCap(512)
header.add(path)
header.setLen(100)
header.add("000777 \0") # File mode
header.add(toOct(0, 6) & " \0") # Owner's numeric user ID
header.add(toOct(0, 6) & " \0") # Group's numeric user ID
header.add(toOct(entry.contents.len, 11) & ' ') # File size
header.add(toOct(entry.lastModified.toUnix(), 11) & ' ') # Last modified time
header.add(" ") # Empty checksum for now
header.setLen(156)
header.add(ord(entry.kind).char)
header.setLen(257)
header.add("ustar\0") # UStar indicator
header.add(toOct(0, 2)) # UStar version
header.setLen(329)
header.add(toOct(0, 6) & "\0 ") # Device major number
header.add(toOct(0, 6) & "\0 ") # Device minor number
header.setLen(512)

Now, when we look at the tar standard format, we see that the header byte 156 needs to contain char typeflag;. This typeflag is meant to be set here:

header.add(ord(entry.kind).char)

Now in this case, the number (from ord()) does not get converted to a char; it returns nothing. That's why the typeFlag is missing

typeFlag = header[156]

and that is why the files never are put into the tarball.contents
if typeFlag == '0' or typeFlag == '5':
tarball.contents[(fileNamePrefix / fileName).toUnixPath()] =
TarballEntry(
kind: EntryKind(typeFlag),
contents: data[pos ..< pos + fileSize]
)

and that is why the Tarball() is never actually extracted:
for path, entry in tarball.contents:
if path.isAbsolute():
raise newException(
ZippyError,
"Extracting absolute paths is not supported (" & path & ")"
)
if path.contains(".."):
raise newException(
ZippyError,
"Extracting paths containing `...` is not supported (" & path & ")"
)
case entry.kind:
of ekNormalFile:
createDir(tmpDir / splitFile(path).dir)
writeFile(tmpDir / path, entry.contents)
if entry.lastModified > Time():
setLastModificationTime(tmpDir / path, entry.lastModified)
of ekDirectory:
createDir(tmpDir / path)

@guzba
Copy link
Owner

guzba commented Jan 31, 2021

Picking up where we left off over here: 9a0482b

Before I merge a fix in for this issue, I'd like to have a repro so I can see it, keep it in mind as I work, and also to add a test for it, etc. Currently I cannot repro the issue but I believe it is real.

To start off, if you don't mind, could you checkout to master of guzba/zippy (head at 1f9fa49), cd to the project base dir, and run nim c --gc:arc -r .\tests\test_tarballs.nim (importantly, use --gc:arc), the test should pass. Then uncomment the block starting at line 33 through 50 so it runs in addition to the tests above it in the file. Again with --gc:arc, the test should pass without any changes (it does for me at least).

Does this also pass for you without changes or do you run into the empty Tarball issue?

@guzba
Copy link
Owner

guzba commented Jan 31, 2021

My current theory is that the issue is typeFlag can be '\0' in addition to '0' for normal files (per https://www.gnu.org/software/tar/manual/html_node/Standard.html), but that I currently do not include '\0' here:

if typeFlag == '0' or typeFlag == '5':

So I'm thinking this may be the fix:

if typeFlag == '0' or typeFlag == '\0':
  tarball.contents[(fileNamePrefix / fileName).toUnixPath()] =
    TarballEntry(
      kind: ekNormalFile,
      contents: data[pos ..< pos + fileSize]
    )
elif typeFlag == '5':
  tarball.contents[(fileNamePrefix / fileName).toUnixPath()] =
    TarballEntry(
      kind: ekDirectory
    )

This would happen for tarballs created by something other than Zippy since I should always be using '0' for files.

@guzba
Copy link
Owner

guzba commented Apr 11, 2021

I have merged and released the above change to address this issue a while back. Feel free to re-open if this issue persists.

@guzba guzba closed this as completed Apr 11, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants