FIX #6413 pip install <url> allow directory traversal #6418

gzpan123 · 2019-04-17T13:28:23Z

cjerdonek

Thanks for the report and patch! Some comments.

cjerdonek · 2019-04-18T05:58:26Z

news/6413.bugfix

@@ -0,0 +1 @@
+Fixed pip install <url> allow directory traversal when a malicious server (or a network MitM if downloading over HTTP) send a Content-Disposition header with filename which contains "../ or ..\".


This should be in the imperative tone (so "Fix pip install <url> ..."): https://pip.pypa.io/en/stable/development/contributing/#contents-of-a-news-entry

You should also wrap lines to 80 characters and surround words that should be code with double backticks before and after (since it is reStructuredText).

cjerdonek · 2019-04-18T06:02:24Z

src/pip/_internal/download.py

+        _filename = params.get('filename')
+        if _filename:
+            _filename = os.path.basename(_filename)
+        filename = _filename or filename


To increase testability, I would define a parse_content_disposition(headers, default_filename) function that returns the file name to use, and then add a couple unit tests of the function using @pytest.mark.parametrize.
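A minimal sketch of the kind of helper suggested above, not the code that was merged. It assumes the function receives the raw Content-Disposition header value rather than the full headers mapping, and it parses with the standard library's email.message.Message instead of cgi.parse_header. RFC 2231 encoded filenames are not handled. The CASES table is illustrative; under pytest the same tuples could be passed to @pytest.mark.parametrize.

```python
import os
from email.message import Message


def parse_content_disposition(content_disposition, default_filename):
    """Return the file name to use for a download.

    Falls back to default_filename when the header carries no usable
    filename parameter. Any directory part is discarded.
    """
    msg = Message()
    msg["Content-Disposition"] = content_disposition
    filename = msg.get_param("filename", header="content-disposition")
    if filename:
        # Keep only the final path component to block "../" traversal.
        filename = os.path.basename(filename)
    return filename or default_filename


# (header value, default, expected) tuples; illustrative only.
CASES = [
    ("attachment", "default.txt", "default.txt"),
    ('attachment; filename="pkg.tar.gz"', "default.txt", "pkg.tar.gz"),
    ('attachment; filename="../../evil.txt"', "default.txt", "evil.txt"),
    ('attachment; filename=""', "default.txt", "default.txt"),
]

for header, default, expected in CASES:
    assert parse_content_disposition(header, default) == expected
```

Keeping the parsing in one small function means the traversal cases can be tested without mocking a session or a response.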

cjerdonek · 2019-04-18T06:03:54Z

tests/unit/test_download.py

+    }
+    session.get.return_value = resp
+
+    temp_dir = mkdtemp()


You can use the pytest tmpdir fixture so you won't need to manually create and delete a temp dir.
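As a rough illustration of that suggestion, the test below takes pytest's built-in tmpdir fixture as an argument, so no mkdtemp() or rmtree() call is needed. The save_download() helper is hypothetical and only stands in for the real download step; it is not part of pip.

```python
import os


def save_download(download_dir, header_filename, content):
    """Hypothetical stand-in for the download step (not pip's API)."""
    target = os.path.join(download_dir, os.path.basename(header_filename))
    with open(target, "wb") as f:
        f.write(content)
    return target


def test_download_stays_inside_download_dir(tmpdir):
    # pytest injects tmpdir and removes it afterwards.
    download_dir = tmpdir.mkdir("sub-dir")
    save_download(str(download_dir), "../out_dir_file", b"payload")
    # The file lands in the sub-directory, not in its parent.
    assert os.listdir(str(download_dir)) == ["out_dir_file"]
    assert not os.path.exists(os.path.join(str(tmpdir), "out_dir_file"))
```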

cjerdonek · 2019-04-18T06:06:29Z

tests/unit/test_download.py

+    temp_dir = mkdtemp()
+    temp_sub_dir = mkdtemp(dir=temp_dir, prefix="sub-dir-")
+    hashes = None
+    progress_bar = "on"


You can pass these two variables as keyword argument to save two lines.

cjerdonek · 2019-04-18T06:13:00Z

tests/unit/test_download.py

@@ -157,6 +157,43 @@ def test_unpack_http_url_bad_downloaded_checksum(mock_unpack_file):
        rmtree(download_dir)


+def test_download_http_url_with_directory_traversal(data):
+    """
+    It should download file to temp_sub_dir


I would be more explicit in the docstring and say something like, "Test downloading a file when the content-disposition header contains a filename with a ".." path part."

cjerdonek · 2019-04-18T10:17:31Z

src/pip/_internal/download.py

+    type, params = cgi.parse_header(content_disposition)
+    filename = params.get('filename')
+    if filename:
+        filename = os.path.basename(filename).split("\\")[-1].rstrip(".")


Seeing this implementation, it might actually be better to add a second function (called something like sanitize_filename()) and put the bulk of your unit tests in the unit tests of that function. Also, why is it necessary to split on \ after calling basename(), and why split only on that character as opposed to others? Some code comments are probably warranted here explaining what we are protecting against. Lastly, what about left-stripping ., too?

gzpan123 · 2019-04-19

I added .split("\\")[-1] because os.path.basename("dir\\file") returns "dir\file" on Linux.
Later I found out that "dir\file" is a valid filename on Linux, so .split("\\")[-1] can be removed.

I added .rstrip(".") because os.path.basename("dir/..") returns "..", and ".." is not a valid filename.
But ".." on its own cannot lead to directory traversal, so .rstrip(".") can also be removed.

os.path.basename(filename) alone already fixes this security issue, so I decided to remove .split("\\")[-1] and .rstrip("."). Is this OK?

cjerdonek · 2019-04-19

Isn't something like that still needed for Windows, though?

gzpan123 · 2019-04-19

On Windows, both os.path.basename("dir\\file") and os.path.basename("dir/file") return "file".
I don't think anything else needs to be done for Windows.
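The platform difference discussed here can be checked on any machine, because os.path is posixpath on Linux and ntpath on Windows, and both modules are importable everywhere. The snippet below is only an illustration of that behaviour; the sample names are made up.

```python
import ntpath     # what os.path is on Windows
import posixpath  # what os.path is on Linux and macOS

# Backslash is an ordinary filename character on POSIX systems,
# so the whole string is the base name there.
assert posixpath.basename("dir\\file") == "dir\\file"
assert posixpath.basename("dir/file") == "file"

# On Windows both separators are recognised.
assert ntpath.basename("dir\\file") == "file"
assert ntpath.basename("dir/file") == "file"

# A trailing ".." survives basename() on both platforms.
assert posixpath.basename("dir/..") == ".."
assert ntpath.basename("dir/..") == ".."

# The traversal payloads are reduced to a plain name.
assert posixpath.basename("../../evil.txt") == "evil.txt"
assert ntpath.basename("..\\..\\evil.txt") == "evil.txt"
```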

cjerdonek

Thanks for your updates (and I agree with using os.path.basename()). Some more comments.

src/pip/_internal/download.py

news/6413.bugfix

src/pip/_internal/download.py

tests/unit/test_download.py

src/pip/_internal/download.py

cjerdonek

Some final(?) comments. Thanks for your work on this.

src/pip/_internal/download.py

tests/unit/test_download.py

src/pip/_internal/download.py

tests/unit/test_download.py

gzpan123 · 2019-05-04T13:02:11Z

any problems?

BrownTruck · 2019-05-10T18:30:08Z

Hello!

I am an automated bot and I have noticed that this pull request is not currently able to be merged. If you are able to either merge the master branch into this pull request or rebase this pull request against master then it will be eligible for code review and hopefully merging!

cjerdonek · 2019-05-11T01:16:34Z

any problems?

No, just haven't gotten to this. But it looks like you need to do a rebase in the meantime.

BrownTruck · 2019-06-04T00:00:06Z

Hello!

I am an automated bot and I have noticed that this pull request is not currently able to be merged. If you are able to either merge the master branch into this pull request or rebase this pull request against master then it will be eligible for code review and hopefully merging!

cjerdonek · 2019-06-09T18:32:14Z

@gzpan123 I made some formatting and wording tweaks before planning to merge this, if that's okay.

cjerdonek · 2019-06-11T08:12:48Z

Thanks so much for all your work on this, @gzpan123, and for sticking with it! 👍

cjerdonek · 2019-06-11T09:13:34Z

@gzpan123 In the course of reviewing your PR, I noticed a related issue that looks like it could use fixing. It seems like the "default" filename (from the Link object) could also result in directory traversal. For example:

>>> from pip._internal.models.link import Link
>>> link = Link('https://example.com/..%2Ffoo.txt')
>>> link.filename
'../foo.txt'

This is because Link.filename unquotes its basename before returning:

pip/src/pip/_internal/models/link.py

Lines 61 to 68 in 5776ddd

    
           @property 
        
           def filename(self): 
        
               # type: () -> str 
        
               _, netloc, path, _, _ = urllib_parse.urlsplit(self.url) 
        
               name = posixpath.basename(path.rstrip('/')) or netloc 
        
               name = urllib_parse.unquote(name) 
        
               assert name, ('URL %r produced no filename' % self.url) 
        
               return name
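The behaviour can be reproduced with the standard library alone. In the sketch below, link_filename() mirrors the order of operations in the property quoted above, while safer_link_filename() is a hypothetical variant that unquotes before taking the base name. It is shown only to illustrate why the ordering matters and is not necessarily how pip resolved the issue. Note that posixpath does not treat a decoded backslash as a separator, so this variant alone would not cover Windows.

```python
import posixpath
from urllib.parse import unquote, urlsplit


def link_filename(url):
    """Same order as the quoted property: basename first, then unquote."""
    _, netloc, path, _, _ = urlsplit(url)
    name = posixpath.basename(path.rstrip('/')) or netloc
    return unquote(name)


def safer_link_filename(url):
    """Hypothetical variant: unquote first, so an encoded "/" is split on."""
    _, netloc, path, _, _ = urlsplit(url)
    return posixpath.basename(unquote(path).rstrip('/')) or unquote(netloc)


url = 'https://example.com/..%2Ffoo.txt'
print(link_filename(url))        # ../foo.txt
print(safer_link_filename(url))  # foo.txt
```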

cjerdonek added the "C: download" (About fetching data from PyPI and other sources), "type: bugfix", and "type: security" (Has potential security implications) labels Apr 17, 2019

cjerdonek suggested changes Apr 18, 2019

cjerdonek reviewed Apr 18, 2019

cjerdonek reviewed Apr 19, 2019

BrownTruck added the "needs rebase or merge" label (PR has conflicts with current master) May 10, 2019

gzpan123 force-pushed the master branch from 01913f0 to 09c2418 Compare May 11, 2019 04:29

pypa-bot removed the "needs rebase or merge" label (PR has conflicts with current master) May 11, 2019

BrownTruck added the "needs rebase or merge" label (PR has conflicts with current master) Jun 4, 2019

pypa-bot removed the "needs rebase or merge" label (PR has conflicts with current master) Jun 4, 2019

gzpan123 force-pushed the master branch 2 times, most recently from a2ea2a2 to 42dfa4f Compare June 9, 2019 10:00

cjerdonek force-pushed the master branch from 42dfa4f to cd032f7 Compare June 9, 2019 18:30

cjerdonek mentioned this pull request Jun 9, 2019

pip install <url> allow directory traversal, leading to arbitrary file write #6413

Closed

cjerdonek approved these changes Jun 9, 2019

FIX pypa#6413 pip install <url> allow directory traversal

a4c735b

cjerdonek force-pushed the master branch from cd032f7 to a4c735b Compare June 11, 2019 07:23

cjerdonek merged commit 5776ddd into pypa:master Jun 11, 2019

cjerdonek mentioned this pull request Jun 22, 2019

Parse the url when creating a Link object #6635

Merged

lock bot added the "auto-locked" label (Outdated issues that have been locked by automation) Jul 11, 2019

lock bot locked as resolved and limited conversation to collaborators Jul 11, 2019
