Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-113356: Ignore errors in "._ABC.pth" #113357

Closed
wants to merge 5 commits into from

Conversation

ronaldoussoren
Copy link
Contributor

@ronaldoussoren ronaldoussoren commented Dec 21, 2023

On macOS the system can create a "._" file next to a regular file when the system needs to store metadata that is not supported by the filesystem (such as storing extended attributes on an exFAT filesystem).

Those files are binary data and cause startup failures when the base file is a ".pth" file.

On macOS the system can create a "._" file next to a
regular file when the system needs to store metadata
that is not supported by the filesystem (such as storing
extended attributes on an exFAT filesystem).

Those files are binary data and cause startup failures when
the base file is a ".pth" file.
@ronaldoussoren
Copy link
Contributor Author

ronaldoussoren commented Dec 21, 2023

I'm not 100% happy with this PR because the block of code where I ignore UnicodeDecodeErrors is fairly large, and includes the invocation of exec for PTH lines starting with an import statement.

I don't check the filename before trying to open it out of an abundance of caution. There's bound to be someone that intentionally uses a filename with "._" as a prefix in the name.

@ronaldoussoren ronaldoussoren added needs backport to 3.11 only security fixes needs backport to 3.12 bug and security fixes labels Dec 21, 2023
Copy link
Member

@serhiy-storchaka serhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is better to skip all dot-files in addsitedir().

@ronaldoussoren
Copy link
Contributor Author

I think it is better to skip all dot-files in addsitedir().

I agree, but that is a change of functionality. I'd like to back port this to 3.11 and 3.12 as the issue affects users.

I can do a follow-up PR for just main that ignores all dot-files (probably with a whatsnew entry)

@serhiy-storchaka
Copy link
Member

I think that it is very unlikely that anybody intentionally uses a pth file with a name starting with a dot. Should not the base name match the package name, which cannot start with a dot? And if it is used intentionally, it looks like a malicious use, since dot-files are "hidden" by default. So we can fix a minor security issue of executing code in hidden files (normally dot-files cannot be imported).

The heuristic used in this PR is not completely reliable, because by accident the first line can be decodable. Even if macOS only supports UTF-8 locale (does it?), the disk mounted on macOS can later be used on other OS with Latin1 or other 8-bit locale.

@ronaldoussoren
Copy link
Contributor Author

I think that it is very unlikely that anybody intentionally uses a pth file with a name starting with a dot. Should not the base name match the package name, which cannot start with a dot? And if it is used intentionally, it looks like a malicious use, since dot-files are "hidden" by default. So we can fix a minor security issue of executing code in hidden files (normally dot-files cannot be imported).

The heuristic used in this PR is not completely reliable, because by accident the first line can be decodable. Even if macOS only supports UTF-8 locale (does it?), the disk mounted on macOS can later be used on other OS with Latin1 or other 8-bit locale.

The ._ file is in AppleDouble format, which is binary. I don't expect this to ever be compatible with ASCII, this is the output of od -c on such a file for a file on exFAT with a single extended attribute named "python" and with the value "gil":

0000000   \0 005 026  \a  \0 002  \0  \0   M   a   c       O   S       X
0000020                                   \0 002  \0  \0  \0  \t  \0  \0
0000040   \0   2  \0  \0 016 260  \0  \0  \0 002  \0  \0 016 342  \0  \0
0000060  001 036  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
0000100   \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
0000120   \0  \0  \0  \0   A   T   T   R 377 377 367 250  \0  \0 016 342
0000140   \0  \0  \0 214  \0  \0  \0 003  \0  \0  \0  \0  \0  \0  \0  \0
0000160   \0  \0  \0  \0  \0  \0  \0 001  \0  \0  \0 214  \0  \0  \0 003
0000200   \0  \0  \a   p   y   t   h   o   n  \0  \0  \0   g   i   l  \0
0000220   \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
*
0007340   \0  \0  \0  \0 001  \0  \0  \0 001  \0  \0  \0  \0  \0  \0  \0
0007360   \0 036   T   h   i   s       r   e   s   o   u   r   c   e    
0007400    f   o   r   k       i   n   t   e   n   t   i   o   n   a   l
0007420    l   y       l   e   f   t       b   l   a   n   k            
0007440   \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
*
0007740   \0  \0  \0  \0 001  \0  \0  \0 001  \0  \0  \0  \0  \0  \0  \0
0007760   \0 036  \0  \0  \0  \0  \0  \0  \0  \0  \0 034  \0 036 377 377
0010000

Note that there are bytes with value 0xFF on the line starting at 0000120, and that's before values I can influence directly.

@ronaldoussoren
Copy link
Contributor Author

ronaldoussoren commented Dec 23, 2023

@Yhg1s and @pablogsal: This PR attempts to fix an issue where macOS can create "._" files next to regular files on (in particular) exFAT files when the filesystem driver needs to store metadata that is not supported by the operating system.

File listing of a directory with a file "text.txt" where I added a single extended attribute:

-rwx------  1 ronald  staff  4096 Dec 23 13:23 ._test.txt
-rwx------@ 1 ronald  staff     0 Dec 22 09:03 test.txt

The PR currently tries to be fairly targeted to this particular issue. Would it be acceptable to just ignore all ".pth" files whose name starts with a dot? As @serhiy-storchaka notes such files can be seen as security risk because they are hidden by default and can execute arbitrary code.

UPDATE: asked the wrong release manager for the 3.11 back port. Sorry, @pablogsal and @ambv...

@ronaldoussoren
Copy link
Contributor Author

I've slightly tweaked the approach:

  • Test test now generates a "._" file that contains the prefix from an actual AppleDouble file generated on macOS 14
  • Site.py looks for the magic value at the start of a AppleDouble file instead of relying on a unicode decoding problem

@serhiy-storchaka
Copy link
Member

Opened a security issue #113659.

@ronaldoussoren
Copy link
Contributor Author

The fix in #113659 is better than this very targeted PR and has been merged. I'm therefore closing this PR.

@ronaldoussoren ronaldoussoren deleted the gh-113356 branch January 18, 2024 11:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants