Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle symlinks when extracting .whl files #10935

Open
1 task done
mergian opened this issue Mar 3, 2022 · 9 comments
Open
1 task done

Handle symlinks when extracting .whl files #10935

mergian opened this issue Mar 3, 2022 · 9 comments
Labels
state: needs discussion This needs some more discussion type: feature request Request for a new feature

Comments

@mergian
Copy link

mergian commented Mar 3, 2022

Description

PIP does not export symlinks correctly when processing WHL archives. See my proof below. Expected output should be:

-rw-rw-r-- 1 user user 0 Mar   3 08:00 bla.txt
lrwxrwxrwx 1 user user 7 Mar   3 08:00 bla.txt.link -> bla.txt # << this is a symlink
-rw-rw-r-- 1 user user 0 Mar   3 08:00 __init__.py
drwxrwxr-x 2 user user 37 Mar  3 08:00 __pycache__

But PIP creates bla.txt.link as a text file containing "bla.txt"

-rw-rw-r-- 1 user user 0 Mar   3 08:38 bla.txt
-rw-rw-r-- 1 user user 7 Mar   3 08:38 bla.txt.link # << this is just a text file containing "bla.txt"
-rw-rw-r-- 1 user user 0 Mar   3 08:00 __init__.py
drwxrwxr-x 2 user user 37 Mar  3 08:00 __pycache__

Fixing this issue is rather easy. The code at pip/_internal/operations/install/wheel.py#L388 needs to be changed to:

with self._zip_file.open(zipinfo) as f:
        if (zipinfo.external_attr & (1 << 29)):
                os.symlink(f.read(), self.dest_path)
        else:
                with open(self.dest_path, "wb") as dest:
                        shutil.copyfileobj(f, dest)

Explanation: the 29th bit in zipinfo.external_attr indicates if this is a symlink. In that case, the content of the file is the link destination.

Expected behavior

Symlinks should be extracted correctly and not as text files.

pip version

22.0.3

Python version

3.7

OS

Linux

How to Reproduce

# 1. Create a dummy package with a symlink
mkdir build
pushd build

mkdir testsymlink
pushd testsymlink
touch __init__.py
touch bla.txt
ln -s bla.txt bla.txt.link
popd # testsymlink

mkdir testsymlink-1.0.dist-info
pushd testsymlink-1.0.dist-info
echo "Metadata-Version: 2.1"			        > METADATA
echo "Name: testsymlink"			        >> METADATA
echo "Version: 1.0.0"				        >> METADATA
echo "Summary: UNKNOWN"				        >> METADATA
echo "Home-page: UNKNOWN"			        >> METADATA
echo "Author: UNKNOWN"				        >> METADATA
echo "Author-email: UNKNOWN"			        >> METADATA
echo "License: UNKNOWN"				        >> METADATA
echo "Platform: UNKNOWN"			        >> METADATA
echo "Requires-Python: >=3.7"			        >> METADATA
 
echo "testsymlink/__init__.py,,"			> RECORD
echo "testsymlink/bla.txt,,"				>> RECORD
echo "testsymlink/bla.txt.link,,"			>> RECORD
echo "testsymlink-1.0.dist-info/METADATA,,"		>> RECORD
echo "testsymlink-1.0.dist-info/WHEEL,,"		>> RECORD
echo "testsymlink-1.0.dist-info/top_level.txt,,"	>> RECORD
echo "testsymlink-1.0.dist-info/RECORD,,"		>> RECORD

echo "testsymlink" > top_level.txt

echo "Wheel-Version: 1.0"			        > WHEEL
echo "Generator: bdist_wheel (0.37.1)"		        >> WHEEL
echo "Root-Is-Purelib: true"			        >> WHEEL
echo "Tag: py3-none-any"			        >> WHEEL

popd # testsymlink-1.0.dist-info

zip -ry testsymlink-1.0-py3-none-any.whl testsymlink testsymlink-1.0.dist-info

# 2. Install the dummy package
python3 -m pip install testsymlink-1.0-py3-none-any.whl

popd # build

# 3. Print folder of the dummy package
ls -l $(python3 -c "import os; import testsymlink; print(os.path.dirname(testsymlink.__file__))")

Output

-rw-rw-r-- 1 user user 0 Mar   3 08:38 bla.txt
-rw-rw-r-- 1 user user 7 Mar   3 08:38 bla.txt.link
-rw-rw-r-- 1 user user 0 Mar   3 08:00 __init__.py
drwxrwxr-x 2 user user 37 Mar  3 08:00 __pycache__

Code of Conduct

@mergian mergian added S: needs triage Issues/PRs that need to be triaged type: bug A confirmed bug or unintended behavior labels Mar 3, 2022
@pradyunsg
Copy link
Member

Explanation: the 29th bit in zipinfo.external_attr indicates if this is a symlink. In that case, the content of the file is the link destination.

Do you have a source for this information?

@mergian
Copy link
Author

mergian commented Mar 3, 2022

Here they use https://stackoverflow.com/a/60691331 (0xA000) << 16 which is identical.

Although using stat seems to be more stable approach as shown here https://stackoverflow.com/a/65817451 . Important here is that they do this << 16 shift, as file permissions are offset by 16 in external_attr.

You can verify this by printing the ZipInfo of the file. It should print as something like this:
<ZipInfo filename='testlink' compress_type=deflate filemode='lrwxrwxrwx' file_size=0 compress_size=2> So the filemode needs to show the "l" flag.

@pfmoore
Copy link
Member

pfmoore commented Mar 3, 2022

Wheels (and more importantly, I believe the standard library) don't support symlinks, This is a known limitation, and the fact that you have to maually create a wheel containing symlinks demonstrates that it's not simply a pip issue, but the whole toolchain doesn't support them.

There have been discussions about how to support symlinks in wheels (and what to do about wheels containing symlinks on platforms that don't support them) but as yet there's been no conclusion. See, for example, https://discuss.python.org/t/symbolic-links-in-wheels/1945

@uranusjr
Copy link
Member

uranusjr commented Mar 3, 2022

Also see https://bugs.python.org/issue27318

A lot of layers need to be fixed before pip has a chance to support symlinks.

@mergian
Copy link
Author

mergian commented Mar 3, 2022

I understand concerns about symlinks in Wheels AND if they are non-standard, then setuptools, distutils, and pip should at least warn about them and not just writing text files and let the user spend hours looking for possible reason.

However, as Wheels are supposed to be binary distributions (mainly meant for a particular platform) I think it should be in the package owners responsibility to make sure that he builds working packages.

So an easy extension to my proposed fix would be:

if is_symlink():
	if platform_supports_symlinks():
		os.symlink(...)
	else:
		raise Exception('This platform does not support symlinks, please contact package provider...')

And yes, setuptools and others don't support to add symlinks yet, which does not mean that PIP should not be the first to support them (Chicken Egg problem?).

@uranusjr: yes I agree but I think that rather aims for the extractall() and extract(file) operations. But PIP uses shutil.copyfileobj(f, dest) ( https://github.com/python/cpython/blob/3.10/Lib/shutil.py#L187 ) which just does a copy through f.read() and dest.write(). That means that no file attributes are applied. So even IF zipfile would fix handling of symlinks the PIP implementation would still not work.

@uranusjr
Copy link
Member

uranusjr commented Mar 3, 2022

The problem is less about where the fix should go, but that any fix pip introduces would encounter the same issue faced by extractall. Exactly how a symlink should be extracted is not a solved issue, and it would not be desired for pip’s extracting logic to deviate from the stdlib zipfile implementation. So pip needs to wait for either CPython decides on the issue (including if they decide to not fix the issue, in which case we can do whatever we want) before we can make a move.

@pfmoore
Copy link
Member

pfmoore commented Mar 3, 2022

Personally, I think that the symlinks-in-wheels issue should not be solved by putting actual symlinks in the wheel (which, as has been noted, relies on functions which have various issues) but rather by having metadata in the wheel stating "installers should make these symlinks". I'm working on a proposal based on this idea.

I suggest this discussion gets taken to the Discourse thread I linked previously - I don't expect pip to implement support for symlinks in wheels until it's standardised, so having the discussion here is pointless (and means that not everyone with an interest in the topic has visibility of it).

@pfmoore
Copy link
Member

pfmoore commented Mar 3, 2022

Also, if you can't include symlinks in wheels using standard tools like setuptools or flit, adding support in pip is mostly pointless...

@arcivanov
Copy link

You can get symlink support in wheels using wheel-axle, a drop-in replacement for the wheel that is fully backwards compatible. All modernish versions of PIP are supported.

https://github.com/karellen/wheel-axle

For an example, please see: https://github.com/karellen/wheel-axle-example

@pradyunsg pradyunsg changed the title PIP does not handle symlinks correctly when extracting WHL archives. Handle symlinks when extracting .whl files Sep 9, 2023
@pradyunsg pradyunsg added state: needs discussion This needs some more discussion type: feature request Request for a new feature and removed type: bug A confirmed bug or unintended behavior S: needs triage Issues/PRs that need to be triaged labels Sep 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
state: needs discussion This needs some more discussion type: feature request Request for a new feature
Projects
None yet
Development

No branches or pull requests

5 participants