Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework checkEntrySecurity #82

Merged
merged 19 commits into from
Dec 7, 2023
Merged

Rework checkEntrySecurity #82

merged 19 commits into from
Dec 7, 2023

Conversation

Bodigrim
Copy link
Contributor

@Bodigrim Bodigrim commented Nov 20, 2023

Key points:

  • .. in a path should be allowed as long as it does not point outside of the archive,
  • soft links should be validated as relative paths,
  • long names / links should not be exempt from validation.

This is meant to close #27, enabling us to unpack https://nodejs.org/dist/v8.9.3/node-v8.9.3-linux-x64.tar.xz.

@Bodigrim Bodigrim force-pushed the rework-check branch 2 times, most recently from 213a39d to 2f4b331 Compare November 20, 2023 00:56
@hasufell
Copy link
Member

I'm challenging whether symlinks pointing outside the archive is a security issue.

  1. we're not reading the target during unpack, unless the platform doesn't support symlinks (we could reject it during the fallback action)

in doesDirectoryExist absTarget >>= \case
True -> handleSymlinkError (copyDirectoryRecursive absTarget absPath)
$ createDirectoryLink relLinkTarget absPath
False -> handleSymlinkError (copyFile absTarget absPath)
$ createFileLink relLinkTarget absPath
where
handleSymlinkError action =
handle (\e -> if ioeGetErrorType e `elem` [IllegalOperation
,PermissionDenied
,InvalidArgument]
then action
else throwIO e
)

  1. symlinks don't circumvent file permissions/ownership

The only danger I can see is buggy implementations doing recursive deletion and following symlinks during that operation. In that case, you have much bigger problems on your hands.

Comment on lines 71 to 72
SymbolicLink link -> check (FilePath.Native.takeDirectory (entryPath entry)
FilePath.Native.</> fromLinkTarget link)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why Native? tar expects unix style separators ("slash" in the spec text) for its entries and so we should also expect these semantics for symlink target.

Now I also realize we're quite sloppy about this fact in unpack and pack. I think a symlink created on windows might not survive unpack on unix.

I'll probably have to test that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I tested it. Symlinks are broken when packed on windows and trying to unpack on unix. We have to splitPath and joinPath them.

OtherEntryType 'K' longPath _
-- We don't know yet whether it's a hard or a soft link.
-- Let's err on the safe side and validate as a hard one.
-> check (Char8.unpack $ Char8.takeWhile (/= '\0') longPath)
Copy link
Member

@hasufell hasufell Nov 21, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we not considering the basepath ((entryPath entry)) here? Will that not yield false-positives?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See cases for hard and soft links above. AFAIU from the difference between handleHardLinks and handleSymLinks in Unpack.hs, hard links are "absolute" in the context of an archive (thus check longPath) and relative to the current position in the archive (thus check $ takeDirectory tarPath </> longPath). Here we don't know whether it's a hard or a soft link, so assuming the worst.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I mean is that symlink targets like ./../../bar will throw here afaiu, because it's assumed to be the tar root.

However, it may be a safe symlink.

I've written a patch where we don't use checkEntrySecurity but only operate on Entries e and can look as far ahead as we want, similar to unpack. It's a bit involved, so I haven't finished yet.

@Bodigrim
Copy link
Contributor Author

Bodigrim commented Nov 22, 2023

Looking at other checks, it's all broken in the presence of long filepaths / long symlinks.

We can probably benefit from implementing a transformation

Entries e -> [(FilePath /* path */, Maybe (FilePath /* link */, Bool /* is hard link? */))]

and expressing all the checks in terms of [(FilePath, Maybe (FilePath, Bool))] - they care only about filenames.

@Bodigrim
Copy link
Contributor Author

I'm challenging whether symlinks pointing outside the archive is a security issue.

I agree that such check does not make much sense for a general purpose archiver. But given that the main clients of tar are cabal-install and hackage-server, which do not perform any other sanity checks on their side, I'd rather keep link validation in place. Otherwise nothing prevents people from uploading Hackage packages with funny symlinks / hardlinks, happily unpacked by Cabal to clients. Given that Hackage is a permanent storage...

Potentially we can refactor the module, providing more flexible and fine-grained checks.

@hasufell
Copy link
Member

But given that the main clients of tar are cabal-install and hackage-server, which do not perform any other sanity checks on their side, I'd rather keep link validation in place

Can we not expose a stricter checkSecurity then? I'm not a fan of accomodating the needs of only a single user, even if it's cabal.

I've used tar in ghcup in the past and know there are more users. For other use cases you may want to be much more lenient.

That said, it isn't a regression, so a follow-up PR could introduce different "strictness" properties for unpack?

@Bodigrim
Copy link
Contributor Author

We can add unpackLenient which does not checkSecurity.

@Bodigrim Bodigrim marked this pull request as ready for review December 2, 2023 21:19
@hasufell
Copy link
Member

hasufell commented Dec 3, 2023

I think the callback stuff introduced a slight performance regression for unpack. Unpacking the cabal index takes 27s with it and 23s without. Not sure why.

@Bodigrim Bodigrim merged commit 0d26f10 into master Dec 7, 2023
26 of 30 checks passed
@Bodigrim
Copy link
Contributor Author

Bodigrim commented Dec 7, 2023

I factored out the support of long file names into a separate module. Hopefully I did not break anything else %)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Can't extract tar file: inappropriate type (Is a directory)
2 participants