Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Userland: Fix tar/unzip directory traversal vulnerability #5713

Conversation

bblenard
Copy link

This change validates the filenames within a tar/zip archive during
extraction. If the filename within a the archive is outside of the
current working directory the file will be skipped and not extracted
onto the host system.

Closes #3991
Closes #3992

@bblenard bblenard force-pushed the fix_archive_directory_traversal_bugs branch from 7165ec7 to bec265a Compare March 10, 2021 01:53
@bcoles
Copy link
Collaborator

bcoles commented Mar 10, 2021

I had a quick read over the code, but haven't tested.

In the tar patch, it looks like there's no validation on the file name beyond this check:

if (!file_name.starts_with(current_path.to_string()))

Wouldn't this still allow traversal, as a filename of /my/current/path/../../../../../something/else still technically starts with /my/current/path ?

The approach for unzip looks better; however, it uses realpath.

I started on a patch for this ages ago (see testing code below). The issue I ran into is that realpath only work on paths which exist, and some paths won't exist prior to extraction. IIRC, this was an issue when the compressed archive contained directories as the directory had not yet been created.

diff --git a/Userland/unzip.cpp b/Userland/unzip.cpp
index 2617c353d..bb67a83bc 100644
--- a/Userland/unzip.cpp
+++ b/Userland/unzip.cpp
@@ -28,9 +28,13 @@
 #include <AK/NumberFormat.h>
 #include <LibCore/ArgsParser.h>
 #include <LibCore/File.h>
+#include <limits.h>
+#include <stdlib.h>
 #include <string.h>
 #include <sys/stat.h>
+#include <unistd.h>
 
+bool verbose = false;
 static const u8 central_directory_file_header_sig[] = "\x50\x4b\x01\x02";
 
 static bool seek_and_read(u8* buffer, const MappedFile& file, off_t seek_to, size_t bytes_to_read)
@@ -120,31 +124,54 @@ static bool unpack_file_for_central_directory_index(off_t central_directory_inde
         return false;
     file_name[file_name_length] = '\0';
 
-    if (file_name[file_name_length - 1] == '/') {
-        if (mkdir(file_name, 0755) < 0) {
+    if (strlen(file_name) > PATH_MAX) {
+        fprintf(stderr, "Warning: skipped file path (length %lu exceeds %d): %s\n", strlen(file_name), PATH_MAX, file_name);
+        return false;
+    }
+
+    char output_directory[PATH_MAX];
+    // FIXME: change once output directory (-d) is supported
+    getcwd(output_directory, sizeof(output_directory));
+
+    char file_path[PATH_MAX];
+    //realpath(String::format("%s/%s", output_directory, file_name).characters(), file_path);
+    realpath(file_name, file_path);
+    printf("OUT: %s\n", output_directory);
+    printf("path: %s\n", file_path);
+
+    if (strncmp(output_directory, file_path, strlen(output_directory)) != 0) {
+        fprintf(stderr, "Warning: ignored path traversal: %s\n", file_path);
+        //fprintf(stderr, "Warning: ignored path traversal: %s\n", file_name);
+        return false;
+    }
+
+    if (verbose)
+        printf(" extracting: %s\n", file_path);
+
+    if (file_path[strlen(file_path) - 1] == '/') {
+        if (mkdir(file_path, 0755) < 0) {
             perror("mkdir");
             return false;
         }
     } else {
-        auto new_file = Core::File::construct(String { file_name });
+        auto new_file = Core::File::construct(String { file_path });
         if (!new_file->open(Core::IODevice::WriteOnly)) {
-            fprintf(stderr, "Can't write file %s: %s\n", file_name, new_file->error_string());
+            fprintf(stderr, "Can't write file %s: %s\n", file_path, new_file->error_string());
             return false;
         }
 
-        printf(" extracting: %s\n", file_name);
         u8 raw_file_contents[compressed_file_size];
-        if (!seek_and_read(raw_file_contents, file, local_file_header_index + LFHFileNameBaseOffset + file_name_length + extra_field_length, compressed_file_size))
+        if (!seek_and_read(raw_file_contents, file, local_file_header_index + LFHFileNameBaseOffset + strlen(file_path) + extra_field_length, compressed_file_size))
             return false;
 
         // FIXME: Try to uncompress data here. We're just ignoring it as no decompression methods are implemented yet.
         if (!new_file->write(raw_file_contents, compressed_file_size)) {
-            fprintf(stderr, "Can't write file contents in %s: %s\n", file_name, new_file->error_string());
+            fprintf(stderr, "Can't write file contents in %s: %s\n", file_path, new_file->error_string());
             return false;
         }
 
         if (!new_file->close()) {
-            fprintf(stderr, "Can't close file %s: %s\n", file_name, new_file->error_string());
+            fprintf(stderr, "Can't close file %s: %s\n", file_path, new_file->error_string());
             return false;
         }
     }
@@ -159,6 +186,7 @@ int main(int argc, char** argv)
 
     Core::ArgsParser args_parser;
     args_parser.add_option(map_size_limit, "Maximum chunk size to map", "map-size-limit", 0, "size");
+    args_parser.add_option(verbose, "Verbose output", "verbose", 'v');
     args_parser.add_positional_argument(path, "File to unzip", "path", Core::ArgsParser::Required::Yes);
     args_parser.parse(argc, argv);
 

@bcoles
Copy link
Collaborator

bcoles commented Mar 10, 2021

IIRC, this was an issue when the compressed archive contained directories as the directory had not yet been created.

Confirmed. I tested the unzip patch in this PR and ran into the same issue when the compressed zip file contains a directory, even when there's no traversal shenanigans at play.

unzip

user@ubuntu:~/Desktop/serenity$ unzip -l Base/home/anon/asdf.zip 
Archive:  Base/home/anon/asdf.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
        0  2020-11-15 09:42   test/
       77  2020-11-15 09:42   test/Terminal.ini
---------                     -------
       77                     2 files
user@ubuntu:~/Desktop/serenity$ 

@bblenard
Copy link
Author

bblenard commented Mar 10, 2021

🤦 While cleaning up my code I must have accidentally removed the realpath call in tar. So yes you are right tar is still borked.

My testing wasn't good enough to catch the other issue you pointed out so thanks for mentioning it. I'll have to take this one back to the drawing board and see how to fix this.

My testing:
serenity

@bcoles
Copy link
Collaborator

bcoles commented Mar 10, 2021

My testing wasn't good enough to catch the other issue you pointed out so thanks for mentioning it. I'll have to take this one back to the drawing board and see how to fix this.

Using realpath is good and the correct approach (better than trying to perform string matching on .. or using starts_with, both of which are doomed to failure).

Unfortunately this issue is more complex than it appears on the surface. It [the unzip patch] definitely resolves the issue, but it does so by breaking existing functionality. I got about as far as you did before I decided it was someone else's problem - good luck :P

@bblenard
Copy link
Author

We might be in luck! I have to look this over but implementing this was going to be my next idea. Looks like its already been done though (maybe still have to review)

void LexicalPath::canonicalize()

@bblenard
Copy link
Author

@bcoles although I think using Lexical path could work I'm wondering if a cleaner solution would be to utilize unveil instead. If we just unveil our current working directory for rwc I would think unzip / tar should fail if there is a funky archive. I'll give that a try too but I'm curious on your thoughts?

@bcoles
Copy link
Collaborator

bcoles commented Mar 10, 2021

@bcoles although I think using Lexical path could work I'm wondering if a cleaner solution would be to utilize unveil instead. If we just unveil our current working directory for rwc I would think unzip / tar should fail if there is a funky archive. I'll give that a try too but I'm curious on your thoughts?

unveil is a good idea but I don't consider that the cleanest approach.

The paths should still be canonicalized appropriately. A combination would be best.

I forsee no immediate issues with using unveil. This will need to support writing to a user-specified output directory if one is provided (not sure if tar and unzip support these yet). There may also need to be support for temporary files created during extraction (maybe).

@bblenard bblenard force-pushed the fix_archive_directory_traversal_bugs branch from bec265a to ff15c7c Compare March 12, 2021 02:32
@bblenard
Copy link
Author

Hopefully that fixes the issue properly now. I wasn't able to verify with a non malicious zip because I couldn't figure out how to make one that passed VERIFY(compression_method == None); 🤦 not sure what I was doing wrong but tested with a non malicious tar archive so I'm fairly confident. Below is what I did to test

serenity

@bblenard
Copy link
Author

Also I'm not 100% sold on the design of my code but wasn't sure how I would make it better so feel free to rip that apart :)

@bcoles
Copy link
Collaborator

bcoles commented Mar 12, 2021

I wasn't able to verify with a non malicious zip because I couldn't figure out how to make one

You'll need to create the ZIP on the host then copy it to the filesystem. Every directory in Base is copied to the filesystem during ninja image as part of build-root-filesystem.sh.

mkdir test
echo 'hello friends' > test/hello
zip -r hello.zip test
mv hello.zip $SERENITY_ROOT/Base/home/anon/hello.zip

@bblenard
Copy link
Author

I wasn't able to verify with a non malicious zip because I couldn't figure out how to make one

You'll need to create the ZIP on the host then copy it to the filesystem. Every directory in Base is copied to the filesystem during ninja image as part of build-root-filesystem.sh.

mkdir test
echo 'hello friends' > test/hello
zip -r hello.zip test
mv hello.zip $SERENITY_ROOT/Base/home/anon/hello.zip

Thats good to know! I've been using a simple http server to download things with pro. That being said I tried to create a zip on my host with zip -r.... but it unzip on serenity complained, which is strange since thats how you are creating the zips

@bcoles
Copy link
Collaborator

bcoles commented Mar 12, 2021

I wasn't able to verify with a non malicious zip because I couldn't figure out how to make one

You'll need to create the ZIP on the host then copy it to the filesystem. Every directory in Base is copied to the filesystem during ninja image as part of build-root-filesystem.sh.

mkdir test
echo 'hello friends' > test/hello
zip -r hello.zip test
mv hello.zip $SERENITY_ROOT/Base/home/anon/hello.zip

Thats good to know! I've been using a simple http server to download things with pro. That being said I tried to create a zip on my host with zip -r.... but it unzip on serenity complained, which is strange since thats how you are creating the zips

What does "complained" mean ? I suggest that the complaint is a result of your changes to unzip.

unzip hello.zip

@bblenard
Copy link
Author

bblenard commented Mar 12, 2021

It happens on a clean master build too. @bcoles are you in the IRC channel because I could ping you there :)

clean-master

Edit:

Worked out the issue on IRC, I didn't read the error I was getting 🤦 and was able to verify creating a zip without compression decompressed fine

works

@bcoles
Copy link
Collaborator

bcoles commented Mar 12, 2021

For reference, this was discussed on IRC and the issue with decompression was resolved (and was unrelated to the patches in this PR).

Copy link
Collaborator

@bcoles bcoles left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a look at the unzip patch only. My commentary may also apply to tar.

fprintf(stderr, "PATH unveil\n");
perror("unveil");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using both fprintf and perror seems redundant. Simply perror("unveil") is sufficient.

Userland/Utilities/unzip.cpp Show resolved Hide resolved
StringBuilder target_path;
// Currently unzip doesn't support extracting to a specified location. If that is implemented the target_path
// should be replaced with the target path from the command line arguements instead of "."
target_path.append(realpath(".", nullptr));
target_path.append("/");
auto target_path_string = target_path.to_string();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The target_path is effectively the same as cwd which is defined and unveiled earlier. target_path is a more future-proof name and it would make sense to group these together and use the same variable.

This way, if someone adds support for decompressing to a user-specified directory (ie, unzip -d /output/path) in the future, they only need to change one line as there's no hard coded assumption of operating in the working directory.

fprintf(stderr, "CWD unveil\n");
perror("unveil");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using both fprintf and perror seems redundant. Simply perror("unveil") is sufficient.

@bcoles
Copy link
Collaborator

bcoles commented Mar 12, 2021

I tried the patched tar utility on an old .tar.gz file I had lying around (after first decompressing with gunzip) which results in some unexpected output.

tar -v -x -f php.tar

An archive containing ./ is certainly unusual, but the output stating that it is not within the target directory is also a lie. Perhaps there should be a hard coded rejection for ./ entries with an appropriate error message.

On the other hand, tar on Ubuntu silently ignores it. For reference, here's the output for Ubuntu:

user@ubuntu:~/test$ tar --list -f php.tar 
./
./README
./index.php
./lang/
./lang/english.php
./lang/german.php
./lang/polish.php
./lang/rus-1251.php
./lang/rus-koi8r.php
./themes/
./themes/black-green.php
./themes/black-orange.php
./themes/default.php
user@ubuntu:~/test$ tar xvf php.tar 
./
./README
tar: ./README: implausibly old time stamp -9223372036854775808
./index.php
tar: ./index.php: implausibly old time stamp -9223372036854775808
./lang/
./lang/english.php
tar: ./lang/english.php: implausibly old time stamp -9223372036854775808
./lang/german.php
tar: ./lang/german.php: implausibly old time stamp -9223372036854775808
./lang/polish.php
tar: ./lang/polish.php: implausibly old time stamp -9223372036854775808
./lang/rus-1251.php
tar: ./lang/rus-1251.php: implausibly old time stamp -9223372036854775808
./lang/rus-koi8r.php
tar: ./lang/rus-koi8r.php: implausibly old time stamp -9223372036854775808
./themes/
tar: ./lang: implausibly old time stamp -9223372036854775808
./themes/black-green.php
tar: ./themes/black-green.php: implausibly old time stamp -9223372036854775808
./themes/black-orange.php
tar: ./themes/black-orange.php: implausibly old time stamp -9223372036854775808
./themes/default.php
tar: ./themes/default.php: implausibly old time stamp -9223372036854775808
tar: ./themes: implausibly old time stamp -9223372036854775808
tar: .: implausibly old time stamp -9223372036854775808
user@ubuntu:~/test$ ls -la
total 96
drwxr-xr-x  4 user user  4096 Dec 13  1901 .
drwxr-xr-x 21 user user  4096 Mar 12 06:07 ..
-rw-r--r--  1 user user 24508 Dec 13  1901 index.php
drwxr-xr-x  2 user user  4096 Dec 13  1901 lang
-rw-r--r--  1 user user 51200 Mar 12 05:57 php.tar
-rw-r--r--  1 user user  1300 Dec 13  1901 README
drwxr-xr-x  2 user user  4096 Dec 13  1901 themes
user@ubuntu:~/test$ 

On the other other hand, perhaps this is outside the scope of this PR.

@bblenard
Copy link
Author

I think its in scope. The issue you saw was from some sloppy programming on my part. The issue happens because I use the following check to determine if a file is in the target path

if (!destination_path.string().starts_with(target_path_string)) {

The reason you see this error is because for ./ the destination path ends up being /path/to/some/folder and the target_path_string is /path/to/some/folder/. I intentionally add the / to the target path because I wanted a way to ensure you were extracting under the target directory and not to a path that shares the same prefix but I don't know if that is a realistic thing to worry about.

@bcoles
Copy link
Collaborator

bcoles commented Mar 17, 2021

The reason you see this error is because for ./ the destination path ends up being /path/to/some/folder and the target_path_string is /path/to/some/folder/. I intentionally add the / to the target path because I wanted a way to ensure you were extracting under the target directory and not to a path that shares the same prefix but I don't know if that is a realistic thing to worry about.

It is a realistic thing to worry about, but you should be able to use the canonicalized path for the starts_with comparison, ensuring /some/path/./ == /some/path/.

This change makes LexicalPath objects directory aware. This allows
programs to resolve directory paths that contain special characters
without having to juggle appending a '/'. This doesn't matter a ton but
it helps when coding things that are checking paths.
@bblenard bblenard force-pushed the fix_archive_directory_traversal_bugs branch from ff15c7c to 2063dd9 Compare March 18, 2021 01:56
@bblenard
Copy link
Author

Hopefully I'm getting close with this now. I did make an ease of use change to LexicalPath. If that isn't an acceptable change I would love to hear some ideas on how to handle archive file paths that share prefixes with the target path but are not the same directory. For example

archive file path: /some/path_right_here
target file path:/some/path

serenity

This change validates the filenames within a tar/zip archive during
extraction. If the filename within a the archive is outside of the
current working directory the file will be skipped and not extracted
onto the host system.

Closes SerenityOS#3991
Closes SerenityOS#3992
@bblenard bblenard force-pushed the fix_archive_directory_traversal_bugs branch from 2063dd9 to 3844e85 Compare March 18, 2021 23:57
@linusg
Copy link
Member

linusg commented Apr 14, 2021

@bblenard there are two merge conflicts, please rebase this PR :)

@stale
Copy link

stale bot commented May 5, 2021

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions!

@stale stale bot added the stale label May 5, 2021
@stale
Copy link

stale bot commented May 13, 2021

This pull request has been closed because it has not had recent activity. Feel free to re-open if you wish to still contribute these changes. Thank you for your contributions!

@stale stale bot closed this May 13, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
3 participants