TarSlip vulnerability improvements #10851
Closed
11 commits
ee2644c TarSlip improvements (Sim4n6)
5ca9c26 Update change note for TarSlip vuln (Sim4n6)
0cb2a78 Revert "Update change note for TarSlip vuln" (Sim4n6)
7d66972 Update change note for TarSlip vuln (Sim4n6)
6634e27 Merge branch 'github:main' into main (Sim4n6)
49830f2 Update python/ql/lib/semmle/python/security/dataflow/TarSlipCustomiza… (Sim4n6)
391a209 Update python/ql/lib/semmle/python/security/dataflow/TarSlipCustomiza… (Sim4n6)
0d25261 Update python/ql/lib/semmle/python/security/dataflow/TarSlipCustomiza… (Sim4n6)
23f35ed rename to ExtractAllWithoutMembersSink (Sim4n6)
81c9a00 update the comment for ExtractAllwMembersSink (Sim4n6)
f0fb8e5 rename to ExtractAllwithMembersSink (Sim4n6)
5 changes: 5 additions & 0 deletions
python/ql/lib/change-notes/2022-10-17-TarSlipCustomizations-Improv.md
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,5 @@ | ||
| --- | ||
| category: majorAnalysis | ||
| --- | ||
| * Added new sources: `tarfile.TarFile()`, `MKtarfile.Tarfile.open()`, and `contextlib.closing(XXXX)`. | ||
| * Added new sinks: `_extract_member()`, and additional uses of `extractall()`. |
60 changes: 60 additions & 0 deletions
python/ql/src/experimental/Security/CWE-022bis/TarSlip.qhelp
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,60 @@ | ||
| <!DOCTYPE qhelp PUBLIC | ||
| "-//Semmle//qhelp//EN" | ||
| "qhelp.dtd"> | ||
| <qhelp> | ||
|
|
||
| <overview> | ||
| <p>Extracting files from a malicious tarball without validating that the destination file path | ||
| is within the destination directory can cause files outside the destination directory to be | ||
| overwritten, due to the possible presence of directory traversal elements (<code>..</code>) in | ||
| archive path names.</p> | ||
|
|
||
| <p>Tarballs contain archive entries representing each file in the archive. These entries | ||
| include a file path for the entry, but these file paths are not restricted and may contain | ||
| unexpected special elements such as the directory traversal element (<code>..</code>). If these | ||
| file paths are used to determine an output file to write the contents of the archive item to, then | ||
| the file may be written to an unexpected location. This can result in sensitive information being | ||
| revealed or deleted, or an attacker being able to influence behavior by modifying unexpected | ||
| files.</p> | ||
|
|
||
| <p>For example, if a tarball contains a file entry <code>../sneaky-file</code>, and the tarball | ||
| is extracted to the directory <code>/tmp/tmp123</code>, then naively combining the paths would result | ||
| in an output file path of <code>/tmp/tmp123/../sneaky-file</code>, which would cause the file to be | ||
| written to <code>/tmp/</code>.</p> | ||
|
|
||
| </overview> | ||
| <recommendation> | ||
|
|
||
| <p>Ensure that output paths constructed from tarball entries are validated | ||
| to prevent writing files to unexpected locations.</p> | ||
|
|
||
| <p>The recommended way of writing an output file from a tarball entry is to validate each member name first, and only then call <code>extract()</code> or <code>extractall()</code> on the checked members. | ||
| </p> | ||
|
|
||
| </recommendation> | ||
|
|
||
| <example> | ||
| <p> | ||
| In this example an archive is extracted without validating file paths. | ||
| </p> | ||
|
|
||
| <sample src="examples/TarSlip_1.py" /> | ||
|
|
||
| <p>To fix this vulnerability, check each member name for <code>..</code> and pass only the checked members to <code>extractall()</code>. | ||
| </p> | ||
|
|
||
| <sample src="examples/NoHIT_TarSlip_1.py" /> | ||
|
|
||
| </example> | ||
| <references> | ||
| <li> | ||
| Snyk: | ||
| <a href="https://snyk.io/research/zip-slip-vulnerability">Zip Slip Vulnerability</a>. | ||
| </li> | ||
|
|
||
| <li> | ||
| Python tarfile documentation: | ||
| <a href="https://docs.python.org/3/library/tarfile.html#tarfile.TarFile.extractall">extractall() warning</a>. | ||
| </li> | ||
| </references> | ||
| </qhelp> |
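The path arithmetic in the qhelp overview can be checked directly. The following is a minimal sketch, not part of the PR, that reuses the directory and entry name from the overview text. It assumes POSIX-style paths and uses `posixpath` so the result does not depend on the platform running it.

```python
import posixpath

dest = "/tmp/tmp123"
entry_name = "../sneaky-file"

# Naive combination keeps the traversal element in the path.
joined = posixpath.join(dest, entry_name)

# Normalising collapses "..", showing where the file would really land.
resolved = posixpath.normpath(joined)

print(joined)    # /tmp/tmp123/../sneaky-file
print(resolved)  # /tmp/sneaky-file
```

The resolved path sits directly under `/tmp/`, outside the intended destination directory, which is the case the overview describes.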
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,24 @@ | ||
| /** | ||
|  * @name Arbitrary file write during archive extraction ("Tar Slip") | ||
|  * @description Extracting files from a malicious archive without validating that the | ||
|  *              destination file path is within the destination directory can cause files outside | ||
|  *              the destination directory to be overwritten. | ||
|  * @kind path-problem | ||
|  * @id py/tarslip | ||
|  * @problem.severity error | ||
|  * @security-severity 7.5 | ||
|  * @precision high | ||
|  * @tags security | ||
|  *       external/cwe/cwe-022 | ||
|  */ | ||
|
|
||
| import python | ||
| import semmle.python.security.dataflow.TarSlipCustomizations | ||
| import semmle.python.security.dataflow.TarSlipQuery | ||
| import DataFlow::PathGraph | ||
|
|
||
| from Configuration config, DataFlow::PathNode source, DataFlow::PathNode sink | ||
| where config.hasFlowPath(source, sink) | ||
| select source.getNode(), source, sink, | ||
|   "This unsanitized archive entry, which may contain '..', is used in a $@.", sink.getNode(), | ||
|   "file system operation" |
19 changes: 19 additions & 0 deletions
python/ql/src/experimental/Security/CWE-022bis/examples/NoHIT_TarSlip_1.py
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,19 @@ | ||
| import sys | ||
| import tarfile | ||
|
|
||
| def managed_members_archive_handler(filename): | ||
|     tar = tarfile.open(filename) | ||
|     result = [] | ||
|     for member in tar: | ||
|         if ".." in member.name: | ||
|             raise ValueError("Path in member name !!!") | ||
|         result.append(member) | ||
|     path = sys.argv[2] | ||
|     #print("files are extracted to: ", path) | ||
|     tar.extractall(path=path, members=result) | ||
|     tar.close() | ||
|
|
||
| if __name__ == "__main__": | ||
|     if len(sys.argv) > 1: | ||
|         filename = sys.argv[1] | ||
|         managed_members_archive_handler(filename) |
27 changes: 27 additions & 0 deletions
python/ql/src/experimental/Security/CWE-022bis/examples/NoHIT_TarSlip_2.py
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,27 @@ | ||
| import sys | ||
| import tarfile | ||
| import tempfile | ||
|
|
||
| def managed_members_archive_handler(filename): | ||
|     tar = tarfile.open(filename) | ||
|     tar.extractall(path=tempfile.mkdtemp(), members=members_filter(tar)) | ||
|     tar.close() | ||
|
|
||
|
|
||
| def members_filter(tarfile): | ||
|     result = [] | ||
|     for member in tarfile: | ||
|         if '../' in member.name: | ||
|             print('Member name contains directory traversal sequence') | ||
|             continue | ||
|         elif member.issym() or member.islnk(): | ||
|             print('Symlink to external resource') | ||
|             continue | ||
|         result.append(member) | ||
|     return result | ||
|
|
||
|
|
||
| if __name__ == "__main__": | ||
|     if len(sys.argv) > 1: | ||
|         filename = sys.argv[1] | ||
|         managed_members_archive_handler(filename) |
8 changes: 8 additions & 0 deletions
python/ql/src/experimental/Security/CWE-022bis/examples/NoHIT_TarSlip_3.py
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,8 @@ | ||
| import tarfile | ||
| import sys | ||
|
|
||
| with tarfile.open(sys.argv[1]) as tar: | ||
|     for entry in tar: | ||
|         if ".." in entry.name: | ||
|             raise ValueError("Illegal tar archive entry") | ||
|         tar.extract(entry, "/tmp/unpack/") |
13 changes: 13 additions & 0 deletions
python/ql/src/experimental/Security/CWE-022bis/examples/NoHIT_TarSlip_4.py
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,13 @@ | ||
| import tarfile | ||
| import sys | ||
| import os | ||
|
|
||
| def _validate_archive_name(name, target): | ||
|     if not os.path.abspath(os.path.join(target, name)).startswith(target + os.path.sep): | ||
|         raise ValueError(f"Provided language pack contains invalid name {name}") | ||
|
|
||
| with tarfile.open(sys.argv[1]) as tar: | ||
|     target = "/tmp/unpack" | ||
|     for entry in tar: | ||
|         _validate_archive_name(entry.name, target) | ||
|         tar.extract(entry, target) |
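The example files use two styles of check: a substring test for `..` (NoHIT_TarSlip_1.py, NoHIT_TarSlip_3.py) and a resolved-path test (NoHIT_TarSlip_4.py). The sketch below is not part of the PR; it compares the two on a few made-up entry names. The helper names `has_dotdot` and `escapes_target` are hypothetical, and `posixpath` is used so that the behaviour is the same on any platform. It says nothing about which checks the query itself recognises as sanitizers.

```python
import posixpath


def has_dotdot(name):
    # Substring check, in the style of the `".." in member.name` examples.
    return ".." in name


def escapes_target(name, target):
    # Resolved-path check, in the style of _validate_archive_name.
    resolved = posixpath.abspath(posixpath.join(target, name))
    return not resolved.startswith(target + posixpath.sep)


target = "/tmp/unpack"
# Illustrative entry names only; none of these come from the PR.
names = ["docs/readme.txt", "../sneaky-file", "/etc/passwd", "notes..txt"]

for name in names:
    print(name, has_dotdot(name), escapes_target(name, target))
```

Under these assumptions the two checks disagree on `/etc/passwd` (no `..`, yet it resolves outside the target because joining with an absolute path discards the target) and on `notes..txt` (contains `..`, yet stays inside the target).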
29 changes: 29 additions & 0 deletions
python/ql/src/experimental/Security/CWE-022bis/examples/NoHIT_TarSlip_5.py
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,29 @@ | ||
| # https://github.com/PyCQA/bandit | ||
|
|
||
| import sys | ||
| import tarfile | ||
| import tempfile | ||
|
|
||
| def managed_members_archive_handler(filename): | ||
|     tar = tarfile.open(filename) | ||
|     tar.extractall(path=tempfile.mkdtemp(), members=members_filter(tar)) | ||
|     tar.close() | ||
|
|
||
|
|
||
| def members_filter(tarfile): | ||
|     result = [] | ||
|     for member in tarfile.getmembers(): | ||
|         if '../' in member.name: | ||
|             print('Member name contains directory traversal sequence') | ||
|             continue | ||
|         elif (member.issym() or member.islnk()) and ('../' in member.linkname): | ||
|             print('Symlink to external resource') | ||
|             continue | ||
|         result.append(member) | ||
|     return result | ||
|
|
||
|
|
||
| if __name__ == "__main__": | ||
|     if len(sys.argv) > 1: | ||
|         filename = sys.argv[1] | ||
|         managed_members_archive_handler(filename) |
26 changes: 26 additions & 0 deletions
python/ql/src/experimental/Security/CWE-022bis/examples/NoHIT_TarSlip_6.py
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,26 @@ | ||
| # https://github.com/OctoPrint/OctoPrint/ | ||
|
|
||
| import sys | ||
| import tarfile | ||
| import os | ||
|
|
||
| def _validate_tar_info(info, target): | ||
|     _validate_archive_name(info.name, target) | ||
|     if not (info.isfile() or info.isdir()): | ||
|         raise ValueError("Provided language pack contains invalid file type") | ||
|
|
||
| def _validate_archive_name(name, target): | ||
|     if not os.path.abspath(os.path.join(target, name)).startswith(target + os.path.sep): | ||
|         raise ValueError(f"Provided language pack contains invalid name {name}") | ||
|
|
||
| target = "/tmp/unpack" | ||
|
|
||
| with tarfile.open(sys.argv[1], "r") as tar: | ||
|
|
||
|     # sanity check | ||
|     for info in tar.getmembers(): | ||
|         _validate_tar_info(info, target) | ||
|
|
||
|     # unpack everything | ||
|     tar.extractall(target) | ||
|
|
29 changes: 29 additions & 0 deletions
python/ql/src/experimental/Security/CWE-022bis/examples/NoHIT_TarSlip_7.py
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,29 @@ | ||
| # https://github.com/PyCQA/bandit | ||
|
|
||
| import sys | ||
| import tarfile | ||
| import tempfile | ||
|
|
||
| def managed_members_archive_handler(filename): | ||
|     tar = tarfile.open(filename) | ||
|     tar.extractall(path=tempfile.mkdtemp(), members=members_filter(tar)) | ||
|     tar.close() | ||
|
|
||
|
|
||
| def members_filter(tarfile): | ||
|     result = [] | ||
|     for member in tarfile.getmembers(): | ||
|         if '../' in member.name: | ||
|             print('Member name contains directory traversal sequence') | ||
|             continue | ||
|         elif member.issym() or member.islnk(): | ||
|             print('Symlink to external resource') | ||
|             continue | ||
|         result.append(member) | ||
|     return result | ||
|
|
||
|
|
||
| if __name__ == "__main__": | ||
|     if len(sys.argv) > 1: | ||
|         filename = sys.argv[1] | ||
|         managed_members_archive_handler(filename) |
18 changes: 18 additions & 0 deletions
python/ql/src/experimental/Security/CWE-022bis/examples/TarSlip_1.py
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,18 @@ | ||
| # https://github.com/PyCQA/bandit | ||
|
|
||
| import sys | ||
| import tarfile | ||
| import tempfile | ||
|
|
||
| def managed_members_archive_handler(filename): | ||
|     tar = tarfile.open(filename) | ||
|     tarf = tar.getmembers() | ||
|     for f in tarf: | ||
|         if not f.issym(): | ||
|             tar.extractall(path=tempfile.mkdtemp(), members=[f]) | ||
|     tar.close() | ||
|
|
||
| if __name__ == "__main__": | ||
|     if len(sys.argv) > 1: | ||
|         filename = sys.argv[1] | ||
|         managed_members_archive_handler(filename) |
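TarSlip_1.py filters members only with `issym()`, which does not look at the member name. The sketch below is not part of the PR. It builds a small archive in memory (nothing is extracted to disk) and applies the same filter; the helper `build_archive` and the entry names are hypothetical.

```python
import io
import tarfile


def build_archive(names):
    # Build an uncompressed tar archive in memory with one small
    # regular file per name.
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in names:
            data = b"demo"
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return buf


archive = build_archive(["ok.txt", "../sneaky-file"])

with tarfile.open(fileobj=archive, mode="r") as tar:
    # Same condition as TarSlip_1.py: keep every member that is not a symlink.
    passed = [m.name for m in tar.getmembers() if not m.issym()]

print(passed)
```

The entry with the traversal element is a regular file, so it passes the `issym()` filter unchanged and would reach `extractall()` in TarSlip_1.py.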
The extra source added here is better handled by an additional taint step. Paths will then start at the open call rather than at the closing call, and if the two are separated in the code, data flow will connect them.
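To illustrate the situation the comment describes, here is a minimal Python sketch (not taken from the PR or its tests) in which the open call and the `contextlib.closing` call are separated. The archive is built in memory and the entry name is made up. It only shows the runtime behaviour that motivates a taint step, namely that `closing(x)` hands back `x` itself; it is not the CodeQL modelling.

```python
import contextlib
import io
import tarfile

# Build a one-entry archive in memory so the sketch is self-contained.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w") as writer:
    data = b"demo"
    info = tarfile.TarInfo(name="ok.txt")
    info.size = len(data)
    writer.addfile(info, io.BytesIO(data))
buf.seek(0)

tar = tarfile.open(fileobj=buf, mode="r")  # the open call

# ... unrelated code could sit between the two calls ...

with contextlib.closing(tar) as wrapped:   # the closing call
    same_object = wrapped is tar
    names = wrapped.getnames()

print(same_object, names, tar.closed)
```

Because `wrapped` is the object returned by `tarfile.open`, treating `closing(...)` as a step that passes taint through lets a reported path begin at the open call, as the comment suggests.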