-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for over-long filepaths via GNU extension and fix hardlinks #77
Conversation
@Bodigrim do you want a review for this? It's quite a long time ago I wrote this. |
11d6191
to
33ddcda
Compare
@hasufell do you remember how you tested this branch long ago? |
Pretty sure I used actual tar archives. I can try to give this a go maybe. |
Quite interestingly |
So, to test long filepaths, I changed the arbitrary instance like so: diff --git a/test/Codec/Archive/Tar/Types/Tests.hs b/test/Codec/Archive/Tar/Types/Tests.hs
index d2acc43..912bebb 100644
--- a/test/Codec/Archive/Tar/Types/Tests.hs
+++ b/test/Codec/Archive/Tar/Types/Tests.hs
@@ -48,7 +48,7 @@ instance Arbitrary TarPath where
arbitrary = either error id
. toTarPath False
. FilePath.Posix.joinPath
- <$> listOf1ToN (255 `div` 5)
+ <$> listOf1ToN 500
(elements (map (replicate 4) "abcd"))
shrink = map (either error id . toTarPath False) And then discovered that commit 0af50df seems to break the code. If I revert it, the test suite passes. |
I was able to fix it this way: diff --git a/test/Codec/Archive/Tar/Types/Tests.hs b/test/Codec/Archive/Tar/Types/Tests.hs
index d2acc43..134b141 100644
--- a/test/Codec/Archive/Tar/Types/Tests.hs
+++ b/test/Codec/Archive/Tar/Types/Tests.hs
@@ -45,19 +45,26 @@ instance Arbitrary Entry where
| perms' <- shrinkIntegral perms ]
instance Arbitrary TarPath where
- arbitrary = either error id
- . toTarPath False
+ arbitrary = fromTarPathResult
+ . toTarPath' False
. FilePath.Posix.joinPath
- <$> listOf1ToN (255 `div` 5)
+ <$> listOf1ToN 500
(elements (map (replicate 4) "abcd"))
- shrink = map (either error id . toTarPath False)
+ shrink = map (fromTarPathResult . toTarPath' False)
. map FilePath.Posix.joinPath
. filter (not . null)
. shrinkList shrinkNothing
. FilePath.Posix.splitPath
. fromTarPathToPosixPath
+-- errors only on FileNameEmpty, because long filenames are supported
+fromTarPathResult :: ToTarPathResult -> TarPath
+fromTarPathResult FileNameEmpty = error "File name empty"
+fromTarPathResult (FileNameOK tp) = tp
+fromTarPathResult (FileNameTooLong tp) = tp
+
+
instance Arbitrary LinkTarget where
arbitrary = maybe (error "link target too large") id
. toLinkTarget |
This actually makes me question the code. See the tests here: tar/test/Codec/Archive/Tar/Tests.hs Lines 25 to 42 in 7d0517d
I'm not sure we would expect long filenames to work under non-GNU formats? |
My tests show that It doesn't seem we allow great control to the users over the format, but the following patch might be an improvement in strictness anyway (and will also make ustar and v7 property tests fail with long filenames, which I would expect): diff --git a/Codec/Archive/Tar/Write.hs b/Codec/Archive/Tar/Write.hs
index 0d76e14..f3dcf5e 100644
--- a/Codec/Archive/Tar/Write.hs
+++ b/Codec/Archive/Tar/Write.hs
@@ -38,6 +38,8 @@ write es = LBS.concat $ map putEntry es ++ [LBS.replicate (512*2) 0]
putEntry :: Entry -> LBS.ByteString
putEntry entry = case entryContent entry of
NormalFile content size -> LBS.concat [ header, content, padding size ]
+ OtherEntryType 'L' _ _
+ | entryFormat entry /= GnuFormat -> error "Long filename support is a GNU extension"
OtherEntryType _ content size -> LBS.concat [ header, content, padding size ]
_ -> header
where |
And sorry to bother for more, but I think we should also support
https://www.gnu.org/software/tar/manual/html_node/Standard.html Otherwise it seems rather inconsistent. I can write a patch for that. |
33ddcda
to
303c0f5
Compare
Sure, that would be most appreciated. Shall we merge this then? I tested it on examples from #49, seems doing well. |
I'd rather have it cleaned up a bit more and add tests with adjusted arbitrary instance? There are also a few other things I'm starting to notice. E.g. handling of unknown filetypes isn't entirely correct. From the spec:
|
a0c8b4a
to
4c0c755
Compare
I've added a test to check that pack/unpack roundtrips with long file names + restricted long names to |
4c0c755
to
213fdd8
Compare
I'm already done, just adding tests. Should I push to this PR or create another one? |
Hmm, something feels off: both
and even
just freeze on my machine. And yet the test suite succeeds. How is it possible?.. |
Not sure why |
I'm on MacOS, so it's
It might not support GNU extensions well indeed... |
Nevermind, |
Btw, now I remember when I debug printed filepaths, that they had a trailing NUL char in Most filepath operations rely on It may make sense to sanitize them after extraction from the headers. |
And clean up unpackEntries
ff975c0
to
ffba5e9
Compare
ffba5e9
to
65112d9
Compare
Let's go! |
This is #50 rebased, all credit due to @hasufell.