Ianhelle/tar slip did protection 2026 04 27 #889
Merged
7 commits
c3c21b4 Fix tar-slip/zip-slip path traversal in archive extraction (ianhelle)
165ea3c Re-save pickle test data to fix NumPy 2.4 align deprecation warning (ianhelle)
d0edbe9 Address PR review: improve archive extraction safety (ianhelle)
70ebbc4 Merge branch 'main' into ianhelle/tar-slip-did-protection-2026-04-27 (ianhelle)
783c157 Merge branch 'main' into ianhelle/tar-slip-did-protection-2026-04-27 (ianhelle)
67b7016 Fix mypy str-bytes-safe error in archive_utils (ianhelle)
2f57041 Refactor ZipFile to use direct context manager (ianhelle)
5 binary files not shown.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,190 @@ | ||
| # ------------------------------------------------------------------------- | ||
| # Copyright (c) Microsoft Corporation. All rights reserved. | ||
| # Licensed under the MIT License. See License.txt in the project root for | ||
| # license information. | ||
| # -------------------------------------------------------------------------- | ||
| """Safe archive extraction utilities to prevent path traversal attacks.""" | ||
| | ||
| from __future__ import annotations | ||
| | ||
| import logging | ||
| import sys | ||
| import tarfile | ||
| import zipfile | ||
| from pathlib import Path | ||
| | ||
| from .exceptions import MsticpyUserError | ||
| | ||
| logger: logging.Logger = logging.getLogger(__name__) | ||
| | ||
| | ||
| def validate_archive_member_path( | ||
|     member_name: str, | ||
|     dest_dir: str | Path, | ||
| ) -> Path: | ||
|     """ | ||
|     Validate that an archive member path does not escape dest_dir. | ||
| | ||
|     Checks for absolute paths, parent directory references, and | ||
|     resolved path containment within the destination directory. | ||
| | ||
|     Parameters | ||
|     ---------- | ||
|     member_name : str | ||
|         The archive member name/path to validate. | ||
|     dest_dir : str or Path | ||
|         The intended extraction destination directory. | ||
| | ||
|     Returns | ||
|     ------- | ||
|     Path | ||
|         The resolved target path within dest_dir. | ||
| | ||
|     Raises | ||
|     ------ | ||
|     MsticpyUserError | ||
|         If the member path would escape the destination directory. | ||
| | ||
|     """ | ||
|     member_path = Path(member_name) | ||
|     if member_path.is_absolute() or member_name.startswith("/"): | ||
|         raise MsticpyUserError( | ||
|             f"Archive member has an absolute path '{member_name}'.", | ||
|             "This may indicate a malicious archive (path traversal attack).", | ||
|             title="Unsafe archive member path", | ||
|         ) | ||
|     if ".." in member_path.parts: | ||
|         raise MsticpyUserError( | ||
|             f"Archive member contains parent directory reference: '{member_name}'.", | ||
|             "This may indicate a malicious archive (path traversal attack).", | ||
|             title="Unsafe archive member path", | ||
|         ) | ||
|     dest = Path(dest_dir).resolve() | ||
|     target = (dest / member_name).resolve() | ||
|     if not target.is_relative_to(dest): | ||
|         raise MsticpyUserError( | ||
|             f"Archive member path escapes the destination directory: '{member_name}'.", | ||
|             f"Resolved path '{target}' is outside '{dest}'.", | ||
|             "This may indicate a malicious archive (path traversal attack).", | ||
|             title="Unsafe archive member path", | ||
|         ) | ||
|     logger.debug("Validated archive member path: %s", member_name) | ||
|     return target | ||
| | ||
| | ||
| def safe_tar_extract( | ||
|     tar: tarfile.TarFile, | ||
|     member: tarfile.TarInfo, | ||
|     dest_dir: str | Path, | ||
| ) -> None: | ||
|     """ | ||
|     Safely extract a single tar archive member after path validation. | ||
| | ||
|     Validates that the member path does not escape dest_dir and | ||
|     rejects symlinks or hardlinks that could be used for traversal. | ||
| | ||
|     Parameters | ||
|     ---------- | ||
|     tar : tarfile.TarFile | ||
|         The open tar archive. | ||
|     member : tarfile.TarInfo | ||
|         The tar member to extract. | ||
|     dest_dir : str or Path | ||
|         The destination directory for extraction. | ||
| | ||
|     Raises | ||
|     ------ | ||
|     MsticpyUserError | ||
|         If the member path is unsafe or the member is a | ||
|         symlink/hardlink pointing outside dest_dir. | ||
| | ||
|     """ | ||
|     if not (member.isreg() or member.isdir()): | ||
|         if member.issym() or member.islnk(): | ||
|             _validate_tar_link(member, dest_dir) | ||
|         else: | ||
|             raise MsticpyUserError( | ||
|                 "Archive contains an unsupported member" | ||
|                 f" type: '{member.name}'" | ||
|                 f" (type={member.type!r}).", | ||
|                 "Only regular files and directories are allowed.", | ||
|                 title="Unsafe archive member type", | ||
|             ) | ||
|     validate_archive_member_path(member.name, dest_dir) | ||
|     if sys.version_info >= (3, 12): | ||
|         tar.extract(member, dest_dir, filter="data") | ||
|     else: | ||
|         tar.extract(member, dest_dir) | ||
| | ||
| | ||
| def safe_zip_extract( | ||
|     zip_file: zipfile.ZipFile, | ||
|     file_name: str, | ||
|     dest_dir: str | Path, | ||
| ) -> None: | ||
|     """ | ||
|     Safely extract a single zip archive member after path validation. | ||
| | ||
|     Validates that the member path does not escape dest_dir. | ||
| | ||
|     Parameters | ||
|     ---------- | ||
|     zip_file : zipfile.ZipFile | ||
|         The open zip archive. | ||
|     file_name : str | ||
|         The name of the file to extract. | ||
|     dest_dir : str or Path | ||
|         The destination directory for extraction. | ||
| | ||
|     Raises | ||
|     ------ | ||
|     MsticpyUserError | ||
|         If the member path would escape the destination directory. | ||
| | ||
|     """ | ||
|     validate_archive_member_path(file_name, dest_dir) | ||
|     zip_file.extract(file_name, path=dest_dir) | ||
| | ||
| | ||
| def _validate_tar_link( | ||
|     member: tarfile.TarInfo, | ||
|     dest_dir: str | Path, | ||
| ) -> None: | ||
|     """ | ||
|     Validate that a symlink or hardlink target is within dest_dir. | ||
| | ||
|     Parameters | ||
|     ---------- | ||
|     member : tarfile.TarInfo | ||
|         The tar member (symlink or hardlink) to validate. | ||
|     dest_dir : str or Path | ||
|         The destination directory for extraction. | ||
| | ||
|     Raises | ||
|     ------ | ||
|     MsticpyUserError | ||
|         If the link target escapes the destination directory. | ||
| | ||
|     """ | ||
|     dest = Path(dest_dir).resolve() | ||
|     link_target = member.linkname | ||
|     if Path(link_target).is_absolute(): | ||
|         raise MsticpyUserError( | ||
|             "Archive contains a link with an absolute" | ||
|             f" target: '{member.name}'" | ||
|             f" -> '{link_target}'.", | ||
|             "This may indicate a malicious archive (path traversal attack).", | ||
|             title="Unsafe archive link target", | ||
|         ) | ||
|     # Resolve link target relative to the member's directory | ||
|     member_dir = (dest / member.name).resolve().parent | ||
|     resolved_link = (member_dir / link_target).resolve() | ||
|     if not resolved_link.is_relative_to(dest): | ||
|         raise MsticpyUserError( | ||
|             "Archive contains a link that escapes the" | ||
|             f" destination: '{member.name}'" | ||
|             f" -> '{link_target}'.", | ||
|             f"Resolved link target '{resolved_link}' is outside '{dest}'.", | ||
|             "This may indicate a malicious archive (path traversal attack).", | ||
|             title="Unsafe archive link target", | ||
|         ) | ||
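
The three checks in `validate_archive_member_path` (absolute path, `..` component, resolved containment) can be tried in isolation. The following is a minimal sketch, not the code from this PR: `resolve_inside` and `classify` are hypothetical names, and `ValueError` stands in for `MsticpyUserError` so the snippet runs without msticpy installed.

```python
import tempfile
from pathlib import Path
from typing import Union


def resolve_inside(member_name: str, dest_dir: Union[str, Path]) -> Path:
    """Return the resolved target, or raise ValueError if it leaves dest_dir."""
    member_path = Path(member_name)
    # Check 1: absolute names such as "/etc/passwd" are rejected.
    if member_path.is_absolute() or member_name.startswith("/"):
        raise ValueError(f"absolute member path: {member_name!r}")
    # Check 2: any ".." component is rejected outright.
    if ".." in member_path.parts:
        raise ValueError(f"parent directory reference: {member_name!r}")
    # Check 3: the fully resolved target must stay under the resolved root.
    dest = Path(dest_dir).resolve()
    target = (dest / member_name).resolve()
    if not target.is_relative_to(dest):  # Path.is_relative_to needs Python 3.9+
        raise ValueError(f"escapes destination: {member_name!r}")
    return target


def classify(names, dest_dir):
    """Map each candidate member name to 'ok' or 'rejected'."""
    results = {}
    for name in names:
        try:
            resolve_inside(name, dest_dir)
            results[name] = "ok"
        except ValueError:
            results[name] = "rejected"
    return results


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(classify(["data/file.txt", "/etc/passwd", "a/../../evil.txt"], tmp))
```

The third check looks redundant for plain names, but because `resolve()` follows symlinks that already exist on disk, it can also catch a member name that passes through a symlink already present under the destination.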
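
`safe_tar_extract` works on one member at a time, so a caller is expected to loop over the archive. The sketch below shows that calling pattern under some assumptions: `build_demo_tar` and `extract_checked` are hypothetical helpers, the checks are inlined rather than imported from this PR, unsafe members are skipped and recorded instead of raising, and links are rejected entirely (the PR instead validates link targets via `_validate_tar_link`). The `filter="data"` version guard mirrors the one in the diff.

```python
import io
import sys
import tarfile
import tempfile
from pathlib import Path


def build_demo_tar() -> bytes:
    """Build an in-memory tar with one benign and one traversal member."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in ("docs/readme.txt", "../escape.txt"):
            payload = b"hello"
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


def extract_checked(archive: bytes, dest_dir):
    """Extract safe members one by one; return (extracted, rejected) names."""
    dest = Path(dest_dir).resolve()
    extracted, rejected = [], []
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r") as tar:
        for member in tar.getmembers():
            name_path = Path(member.name)
            target = (dest / member.name).resolve()
            safe = (
                (member.isreg() or member.isdir())  # no links, devices, FIFOs
                and not name_path.is_absolute()
                and not member.name.startswith("/")
                and ".." not in name_path.parts
                and target.is_relative_to(dest)
            )
            if not safe:
                rejected.append(member.name)
                continue
            if sys.version_info >= (3, 12):
                # Same guard as the PR: use the stdlib "data" filter when available.
                tar.extract(member, dest, filter="data")
            else:
                tar.extract(member, dest)
            extracted.append(member.name)
    return extracted, rejected


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(extract_checked(build_demo_tar(), tmp))
```

Validating per member rather than calling `extractall` keeps the decision about each entry explicit, at the cost of a loop in the caller.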
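
For zip archives the same name check runs before `ZipFile.extract`. A rough sketch of one way to use it follows; `make_zip` and `extract_zip_checked` are hypothetical names, `ValueError` again replaces `MsticpyUserError`, and validating every name before extracting anything is a design choice of this sketch rather than something the PR states. As far as I know, `ZipFile.extract` already strips leading slashes and `..` components on its own, so the explicit check mainly turns silent sanitizing into a visible error.

```python
import io
import tempfile
import zipfile
from pathlib import Path


def make_zip(names) -> bytes:
    """Build an in-memory zip containing the given member names."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in names:
            zf.writestr(name, b"data")
    return buf.getvalue()


def extract_zip_checked(archive: bytes, dest_dir):
    """Validate all member names first, then extract; raise on any unsafe name."""
    dest = Path(dest_dir).resolve()
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        names = zf.namelist()
        # Pass 1: reject the whole archive if any member name is unsafe.
        for name in names:
            name_path = Path(name)
            target = (dest / name).resolve()
            if (
                name_path.is_absolute()
                or name.startswith("/")
                or ".." in name_path.parts
                or not target.is_relative_to(dest)
            ):
                raise ValueError(f"unsafe zip member: {name!r}")
        # Pass 2: every name passed, so extract them all.
        for name in names:
            zf.extract(name, path=dest)
    return names


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(extract_zip_checked(make_zip(["ok/a.txt"]), tmp))
```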