Harden runtime_env zip extraction path containment#63786
Conversation
There was a problem hiding this comment.
Code Review
This pull request refactors the path traversal protection logic in unzip_package by resolving paths using os.path.realpath and checking if the resolved path is within the target directory. It also adds unit tests to verify that malicious zip entries attempting path traversal are correctly skipped. The reviewer noted that using startswith for path containment checks is fragile and error-prone, particularly on Windows due to case-insensitivity and drive letters, and suggested using os.path.commonpath instead to ensure robust path validation.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| if not ( | ||
| resolved == target_real or resolved.startswith(target_real + os.sep) | ||
| ): | ||
| logger.warning(f"Skipping unsafe path in zip: {member}") | ||
| continue |
There was a problem hiding this comment.
Using startswith for path containment checks can be fragile and error-prone, especially on Windows where paths are case-insensitive and can reside on different drives. For example, if there is a casing mismatch (such as drive letter C: vs c: or directory casing), startswith will perform a case-sensitive comparison and incorrectly skip valid files.
Using os.path.commonpath is much more robust as it automatically handles case-normalization on Windows (via os.path.normcase internally), correctly handles path components (avoiding trailing slash issues), and safely raises a ValueError if paths are on different drives, which can then be logged appropriately.
| if not ( | |
| resolved == target_real or resolved.startswith(target_real + os.sep) | |
| ): | |
| logger.warning(f"Skipping unsafe path in zip: {member}") | |
| continue | |
| try: | |
| if os.path.commonpath([target_real, resolved]) != target_real: | |
| logger.warning(f"Skipping unsafe path in zip: {member}") | |
| continue | |
| except ValueError: | |
| logger.warning(f"Skipping path on different drive in zip: {member}") | |
| continue |
There was a problem hiding this comment.
Thanks for the suggestion. I updated the containment check to use os.path.commonpath() on the resolved paths, with os.path.normcase() for comparison and ValueError handling for different drives.
Signed-off-by: wonyunkang <rojo.wk@gmail.com>
4ccfae7 to
db0fe0a
Compare
edoakes
left a comment
There was a problem hiding this comment.
LGTM assuming tests pass. Thanks for the contribution.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes using default effort and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 16f9b36. Configure here.
Head branch was pushed to by a user without write access
92883ac to
169c5ab
Compare
Signed-off-by: H4ck2 <H4ck2@users.noreply.github.com>
169c5ab to
82444d1
Compare
|
This looks like an infra/network flake rather than a PR failure: the retry also failed while pip was fetching psutil metadata from files.pythonhosted.org during Docker/setup, before tests ran. It may pass on a fresh retry or different worker once the network issue clears. I don’t see evidence that this is caused by the PR changes. Could we retry CI when appropriate? |
|
Kicked off CI to re-run, will monitor |
|
Thanks for the fix @H4ck2 |

Why are these changes needed?
This PR hardens
runtime_envzip package extraction by resolving candidatemember paths before checking whether they remain inside the extraction target.
The existing
unzip_package()implementation builds a candidate path from thetarget directory and zip member name, then checks containment before resolving
path components such as
... This makes the zip extraction path inconsistentwith the safer resolved-path containment logic already used by
untar_package().This change updates
unzip_package()to:This keeps zip extraction behavior aligned with the intended path containment
invariant and with the existing tar extraction implementation.
Related issue number
N/A
Checks
Testing
I also ran the three new regression cases in a focused standalone pytest harness
against the patched
unzip_package()implementation: 3 passed.