bdist_wheel makes absolute data_files relative to site-packages #92

Closed
agronholm opened this issue Dec 7, 2013 · 81 comments

@agronholm agronholm commented Dec 7, 2013

Originally reported by: Marcus Smith (Bitbucket: qwcode, GitHub: qwcode)


bdist_wheel doesn't handle absolute paths in the "data_files" keyword like standard setuptools installs do, which honor them as absolute (which seems to match the examples in the distutils docs).

when using absolute paths, the data ends up in the packaged wheel at the top level, and gets installed relative to site-packages (along with the project packages)

so, bdist_wheel is interpreting distutils' "data_files" differently. maybe better for wheel to fail to build projects with absolute data_files, than to just reinterpret it.


@agronholm agronholm commented Dec 7, 2013

Original comment by Marcus Smith (Bitbucket: qwcode, GitHub: qwcode):


I.e. it's either that the wheel spec has to grow to cover absolute data_files (I don't see how it could handle them now; putting them into {distribution}-{version}.data doesn't help because that's relative to sys.prefix), or bdist_wheel just needs to fail to build in that case.

@agronholm agronholm commented Dec 27, 2013

Original comment by Marcus Smith (Bitbucket: qwcode, GitHub: qwcode):


btw, relative "data_files" paths are handled as expected and end up in the "*.data" dir in the packaged wheel.

@agronholm agronholm commented Feb 21, 2014

Original comment by Donald Stufft (Bitbucket: dstufft, GitHub: dstufft):


I don't think we should allow absolute paths.

@agronholm agronholm commented Feb 21, 2014

Original comment by Daniel Holth (Bitbucket: dholth, GitHub: dholth):


Absolute paths need to be allowed but it may be acceptable to restrict to absolute paths within the sdist.

There's a place in setuptools where certain kinds of paths cause errors and I run into it from time to time. I don't remember the details atm, only that it would be much easier to use if it did allow absolute paths.

@agronholm agronholm commented Feb 21, 2014

Original comment by Marcus Smith (Bitbucket: qwcode, GitHub: qwcode):


Why does it have to be allowed? If bdist_wheel and sdist were consistent, that would be one thing, but they're not and can't be at the current time, so it seems wrong for wheels to build absolute paths and then place them into site-packages

@agronholm agronholm commented Feb 21, 2014

Original comment by Daniel Holth (Bitbucket: dholth, GitHub: dholth):


I could be thinking about setuptools' /other/ bug ;-)

@agronholm agronholm commented Feb 21, 2014

Original comment by Donald Stufft (Bitbucket: dstufft, GitHub: dstufft):


I don't see any reason why absolute paths have to be allowed. I think they are a bad design in general, everything should be rooted in sys.prefix. It's not a very good thing for a Wheel to be able to override /etc/hosts for instance.

@agronholm agronholm commented Feb 21, 2014

Original comment by Marcus Smith (Bitbucket: qwcode, GitHub: qwcode):


btw, there's a metadata issue open for whether wheel would grow the ability to handle platform-specific paths (including absolute I guess) https://bitbucket.org/pypa/pypi-metadata-formats/issue/13/add-a-new-subdirectory-to-allow-wheels-to

for me, this issue isn't about that discussion.

it's about the oddity of placing absolute paths into site-packages

since wheel has no ability to properly place absolute files currently, it shouldn't build projects that declare them

@agronholm agronholm commented Feb 21, 2014

Original comment by Daniel Holth (Bitbucket: dholth, GitHub: dholth):


packagename-1.0.data/data/ is currently a way to place absolute files. This is an accidental feature but I don't have any particular beef with it.

They are absolute relative to the root of the virtualenv :-) Or if no virtualenv is in use, probably /

@agronholm agronholm commented Feb 21, 2014

Original comment by Donald Stufft (Bitbucket: dstufft, GitHub: dstufft):


That's not what absolute means, that's a relative path. An absolute file is one that will install to /this/exact/path/even/in/a/virtualenv

@agronholm agronholm commented Feb 21, 2014

Original comment by Marcus Smith (Bitbucket: qwcode, GitHub: qwcode):


so take this setup.py which defines an absolute data file at "/opt/data_file": https://gist.github.com/qwcode/9144129
(and assuming there is a "data_file" relative to it)

build an sdist and wheel and then install each, and see where "data_file" goes.

  • for the sdist: /opt/data_file
  • for the wheel: ../site-packages/opt/data_file

on the other hand, relative data files get packaged into *.data/data and get installed relative to sys.prefix
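
The difference can be summarized with a small toy model. This is only a sketch of the behaviour reported in this thread, not code taken from wheel, pip, or setuptools; the function name `installed_location` and all of its parameters are hypothetical.

```python
import posixpath


def installed_location(target_dir, filename, prefix, site_packages, via_wheel):
    """Toy model of where a data_files entry ends up, per the reports above."""
    if posixpath.isabs(target_dir):
        if via_wheel:
            # Reported wheel behaviour: the absolute path is re-rooted
            # under site-packages.
            return posixpath.join(site_packages, target_dir.lstrip("/"), filename)
        # Reported sdist behaviour: the absolute path is honored as-is.
        return posixpath.join(target_dir, filename)
    # Relative targets are installed relative to sys.prefix in both cases.
    return posixpath.join(prefix, target_dir, filename)
```

With `target_dir="/opt"` and `filename="data_file"`, the model reproduces the two locations listed above; with a relative `target_dir` both install routes agree.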

@agronholm agronholm commented Mar 3, 2014

Original comment by Michael Hoglan (Bitbucket: mhoglan, GitHub: mhoglan):


Graphite does a similar thing, not specifically their data files, but the lib files are specified in an absolute location (/opt/graphite/webapp) in the setup.cfg, and it results in the files being under site-packages/opt/graphite/... when you build a wheel and install it in a virtualenv.

When building from source, I would specify --install-option to change those locations to be relative to the virtualenv, but it does not seem possible to pass those options to pip wheel.

Removing the prefix / lib configurations in the setup.cfg causes the wheel and source installs to behave the same (the files end up in site-packages). Altering the wheel and getting rid of the /opt/graphite/webapp at the top level achieves the same thing (since it would have assumed a prefix of . at install).

btw, I would say it's not good practice for a module to be specifying absolute paths... not virtualenv friendly, and I would hope projects would fix that. I see this as more of having to work with projects that are not defined cleanly. And probably allowing there to be consistency between a src install and a wheel install.

@agronholm agronholm commented Mar 23, 2015

Original comment by Benjamin Bach (Bitbucket: benjaoming, GitHub: benjaoming):


> btw, I would say it's not good practice for a module to be specifying absolute paths... not virtualenv friendly, and I would hope projects would fix that.

Agreed!

> I see this as more of having to work with projects that are not defined cleanly.

Well, actually the current problem is to work with package installers and virtualenvs that are defined cleanly!

Problem is that you may be able to put a data file somewhere using setup(data_files=xx) -- but can you determine where it went from your application instance!?

That's the main problem I'm facing with setuptools right now... when using setuptools, all paths for the data_files kwarg are relative to sys.prefix, but when installing in a virtualenv, they're not..

@agronholm agronholm commented Mar 29, 2015

Original comment by Keerthan Jaic (Bitbucket: jck2, GitHub: jck2):


Is there a uniform way to find (relative) package data which works irrespective of whether the package is installed globally or in a venv?

@agronholm agronholm commented May 18, 2015

Original comment by Joo Tsao (Bitbucket: nuwa, GitHub: nuwa):


Need support for setup(data_files=/opt/xxx)

@agronholm agronholm commented Jun 4, 2015

Original comment by pombredanne NA (Bitbucket: pombredanne, GitHub: pombredanne):


For reference, this bug is essentially the same as #120
And since pip 7.0.0 all packages are now wheeled before install, meaning that this bug and #120 are getting prime exposure in several packages.
See pypa/pip#2874

@agronholm agronholm commented Jun 5, 2015

Original comment by pombredanne NA (Bitbucket: pombredanne, GitHub: pombredanne):


@jck2 the simplest way for me is to only use package data effectively stored in a package directory side by side with the python code that needs them and never use data files.
Once you have this, dirname and __file__ will let you navigate to these data file locations relative to your python code location. Since the data is always in the same place relative to the calling code, it no longer matters whether you are installed globally, in a venv, or elsewhere.

As a simple example of this approach:
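
A minimal sketch of the idea, which is not the commenter's own example: the helper name `package_data_path` and the `data/defaults.cfg` layout are hypothetical, and the data file is assumed to be shipped inside the package directory as package data.

```python
import os


def package_data_path(module_file, *parts):
    """Return the path of a data file stored next to the given module file."""
    package_dir = os.path.dirname(os.path.abspath(module_file))
    return os.path.join(package_dir, *parts)


# Inside a module of the package, this would typically be called as:
#     config_path = package_data_path(__file__, "data", "defaults.cfg")
```

Because the result is computed from the module's own location, it is the same lookup for a global install and for a virtualenv install.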

@agronholm agronholm commented Jun 5, 2015

Original comment by Benjamin Bach (Bitbucket: benjaoming, GitHub: benjaoming):


My work-around for pip 7.0 (because pip automatically creates wheels from sdists) is to include this in setup.py:

import sys
# Abort wheel builds so that pip falls back to a plain sdist install.
if 'bdist_wheel' in sys.argv:
    raise RuntimeError("This setup.py does not support wheels")

Pip will automatically skip the .whl packaging and run the normal sdist installation.

Why on earth the decision was made to have an unfinished packaging system deploy, by default, things that weren't intended for it is beyond my belief :( People who've made sdist installations, released them, and tested them can create their .whl files themselves... this new bdist_wheel call prolongs the installation process and creates new unexpected behavior.

@agronholm agronholm commented Dec 10, 2015

Original comment by Benjamin Reedlunn (Bitbucket: breedlun, GitHub: breedlun):


I just released my first python package, and it is affected by this issue. I would like to avoid absolute paths, as suggested, but I do not know the proper way. Can someone give me a hand?

Here is a link to my stack overflow question that goes into more detail.

@agronholm agronholm commented Jan 18, 2017

Original comment by joe_code (Bitbucket: joe_code, GitHub: Unknown):


This is a real problem for me as well.

My setup.py script works as expected for data_files that use an absolute path: 'python setup.py install' honors them. However, when I run 'python setup.py bdist_wheel' and then pip install the resulting wheel, those same data_files ARE NOT installed correctly and wind up relative to site-packages, i.e. site-packages/usr/lib/blah/blah

If I want to install a file outside of site-packages (say to an arbitrary place on the filesystem) I should be able to do that. The behaviour is inconsistent. I'd really like to see this fixed because right now I can't use wheels and that's exactly what I want to use.

@agronholm agronholm commented Jan 18, 2017

Original comment by Benjamin Bach (Bitbucket: benjaoming, GitHub: benjaoming):


@joe_code - I can recommend finding a workaround, not using setup.py's setup(). Ultimately, that's what we did, and to be honest and despite my previous harsh rhetoric in this thread, it's nice to get rid of data_files and have a Python project that works inside virtual environments again and can be distributed with Wheel :)

@agronholm agronholm commented Jan 18, 2017

Original comment by joe_code (Bitbucket: joe_code, GitHub: Unknown):


Hey Benjamin, thanks for your reply. Could you elaborate a little bit on your solution please?

@agronholm agronholm commented Jan 18, 2017

Original comment by Benjamin Bach (Bitbucket: benjaoming, GitHub: benjaoming):


In our case, we could factor out most of the files in /usr/share and turn them into "package data". The remaining files are now handled by OS installers (for instance debian packages, pkg for Mac, setup.exe for Windows, etc).

In case you don't want to create OS installers, you can take a "run first" approach in your application: on startup, check if not os.path.exists(...) and copy the files into place, possibly adding a file containing your project's version. The disadvantage is uninstallation.
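
A rough sketch of such a "run first" step, under assumptions: the function name `ensure_data_installed`, the marker file name `.installed-version`, and the directory arguments are all hypothetical, and only plain files in `source_dir` are copied.

```python
import os
import shutil


def ensure_data_installed(source_dir, target_dir, version):
    """Copy bundled files into target_dir unless this version is already there.

    Returns True if files were copied, False if the marker was up to date.
    """
    # Hypothetical marker file recording which version was last copied.
    marker = os.path.join(target_dir, ".installed-version")
    if os.path.exists(marker):
        with open(marker, encoding="utf-8") as fh:
            if fh.read().strip() == version:
                return False
    os.makedirs(target_dir, exist_ok=True)
    for name in os.listdir(source_dir):
        src = os.path.join(source_dir, name)
        if os.path.isfile(src):
            shutil.copy2(src, os.path.join(target_dir, name))
    with open(marker, "w", encoding="utf-8") as fh:
        fh.write(version)
    return True
```

Nothing in this sketch records the copied files with the package installer, which is why uninstalling the package would leave them behind.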

@agronholm agronholm commented Jan 24, 2017

Original comment by Erik Bray (Bitbucket: embray, GitHub: embray):


I think it's a real problem that this decision has effectively broken a use case that many packages have relied on--in some cases for bad reasons, but in other cases for good reasons.

Although I personally feel like the reasoning behind the breakage has some merit, breaking things without offering some kind of guidance on how best to handle outside-Python resource files has created yet another sore point against Python packaging that has been raised by some of my colleagues, and it's a valid complaint.

I think the argument "well we shouldn't just allow installing files to arbitrary system locations" is well meaning but ultimately spurious. It's true that, depending on what install_data gets set to, the paths which can be installed to are somewhat limited, making it hard, say, to overwrite /etc/hosts. Yet pip will also happily overwrite executables in /usr/bin, for example, which I think is awful and it shouldn't. So really you're making a security-related argument that falls apart because there's actually no promise of security when installing a wheel system-wide (outside a virtualenv). Meanwhile it's possible to hand-craft wheels with files in the .data directory that can be installed almost anywhere within /usr at the very least.

I think a better approach would be to not make arbitrary decisions for software developers who know what they're doing, and where necessary protect users (and developers who don't know what they're doing) by not allowing pip to overwrite files that already exist on their system (especially for "data files").

@agronholm agronholm commented Jan 24, 2017

Original comment by Donald Stufft (Bitbucket: dstufft, GitHub: dstufft):


Allowing absolute paths breaks the isolation of virtual environments.

@agronholm agronholm commented Jan 31, 2017

Original comment by Erik Bray (Bitbucket: embray, GitHub: embray):


So treat absolute paths as relative to the root of a virtualenv when installing in a virtualenv, and don't break their semantics on system installs.

@agronholm agronholm commented Jan 31, 2017

Original comment by pombredanne NA (Bitbucket: pombredanne, GitHub: pombredanne):


@embray but is pip aware of being in a virtualenv at all?

@agronholm agronholm commented Jan 31, 2017

Original comment by pombredanne NA (Bitbucket: pombredanne, GitHub: pombredanne):


Actually it is: https://github.com/pypa/pip/blob/d86d1713647f791979b9267ffc5773479d0ef469/pip/locations.py#L39
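
For illustration, a common way to make this check from Python code is sketched below. It is not necessarily identical to what the linked pip module does; the function name is chosen here for the example.

```python
import sys


def running_under_virtualenv():
    """Best-effort check for an active virtual environment."""
    # Older virtualenv releases set sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # venv (PEP 405) environments make sys.prefix differ from sys.base_prefix.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)
```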

@agronholm agronholm commented Jan 31, 2017

Original comment by Erik Bray (Bitbucket: embray, GitHub: embray):


Yes, it has to be--especially to be able to deal with the nuances between virtualenvs with and without "global site-packages".

@agronholm agronholm commented Mar 5, 2017

Original comment by Benno Fünfstück (Bitbucket: bennofs, GitHub: bennofs):


Is there a way right now to install some file into site-packages that works with both setuptools and bdist_wheel? For example, if I want to install a native library that is later loaded by my application. Or should I not use site-packages for that?

@pfmoore pfmoore commented Sep 13, 2018

I agree. But I do think we need to clarify (somewhere) what does constitute a "proper Python package". People are using the packaging toolset for all sorts of things that often go beyond the description of "a basic Python library" (for example, pip itself is not a library, rather it's a command line application, but it's distributed as a wheel).

My informal view on what constitutes a "valid Python package" is:

  1. Must support installation in any Python environment, including the system Python, user-site, and a virtualenv.
  2. Must not need installation of any files in absolute locations, or OS-specific locations. It's perfectly OK to look for such files at runtime, but don't ship files that should be pre-installed.
  3. Must not require integration with OS services (e.g. installation as a system service, integration with system documentation services like manpages, registration with system package managers or in a system registry, ...)

If you fail any of these criteria (or can't work out some compromise of your own, like asking users to run a post-install script manually) then the Python packaging toolset isn't what you want. Of course, many people will use it anyway because the alternatives aren't that straightforward. But they might have to find their own fixes for things we don't support.

I'm happy if that's not actually what we choose to take as a definition - for example, if someone wanted to extend the wheel spec to support (in a suitably cross-platform way) absolute paths and/or locations for things like manpages, then I'd be fine with that. But until that happens, the above is my rule of thumb on what's in scope and supported for packaging.

@sashkab sashkab commented Sep 13, 2018

> 2. Must not need installation of any files in absolute locations,

What about a relative location, but not inside of site-packages? I.e., we need to install something into $VIRTUAL_ENV/xxx. What's the best way here? Still come up with the install script for the user to run manually? I need to support installation into a virtual environment; I don't need to support any other installation method, because of the package specifics.

@benoit-pierre benoit-pierre commented Sep 13, 2018

@sashkab: Assuming that would work at installation time, what would the code to be able to use those resources at runtime look like?
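
One possible shape for such runtime code, as a sketch only: it assumes the files were installed relative to `sys.prefix` (as reported earlier in this thread for relative data_files) and that the package only ever runs inside a virtualenv. The helper name `env_data_path` is hypothetical.

```python
import os
import sys


def env_data_path(*parts, prefix=None):
    """Build a path below the environment prefix (sys.prefix by default)."""
    base = sys.prefix if prefix is None else prefix
    return os.path.join(base, *parts)


# For files installed under $VIRTUAL_ENV/xxx this would be, for example:
#     path = env_data_path("xxx", "some_file")
```

This lookup does not cover other install schemes such as user-site installs, where data files are not placed under `sys.prefix`.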

@dholth dholth commented Sep 13, 2018

The status quo discourages Python as an application development language, and I think that is a shame. setup.py didn't start out as a library management tool, but it became that during the decade when web development was the most important domain for Python, and no one noticed the broken setup.py features. If we respect the authors, packages should be able to contain anything that their authors want to put in. The strict separation between packaging and installation that we get from wheel also gives the person who uses that package complete control of how it gets installed.
I think in many cases it is more likely that the prospective Python application developer, writing a mostly-cross-platform application, will choose a different programming language whose dominant packaging tool better supports applications rather than try to make a distribution-specific package.
@pfmoore is correct that without further work a "valid Python package" has those enumerated properties.

@njsmith njsmith commented Sep 13, 2018

> 1. Must support installation in any Python environment, including the system Python, user-site, and a virtualenv.

I just want to note that the other 2 requirements pretty much follow from this one. If we want to support man pages, the way to do that is to extend the definition of a "Python environment" to include a man-pages directory, which would require figuring out what that means in all of these cases.

Wheels are a high-level representation of a Python package, abstracted over the specific details of the installation environment. If you want a high-level representation of an arbitrary application, that's just a different thing, and wheels are not well-suited to that problem. There are many other tools that are designed to solve that problem, like rpms, debs, MacOS/Windows application installers, etc.

@dholth dholth commented Sep 13, 2018

@njsmith njsmith commented Sep 13, 2018

If you have data you want to access at runtime, then we already have a standard and well-supported solution (it even works for packages installed in zips!): https://docs.python.org/3.7/library/importlib.html#module-importlib.resources
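
A minimal sketch of reading package data this way. It uses `importlib.resources.files()`, which was added in Python 3.9 (later than the 3.7 documentation linked above); the helper name `read_packaged_text` is hypothetical.

```python
import importlib.resources


def read_packaged_text(package, resource_name):
    """Read a text resource shipped inside an importable package."""
    resource = importlib.resources.files(package).joinpath(resource_name)
    return resource.read_text(encoding="utf-8")


# For a hypothetical package "mypkg" that ships "defaults.cfg" as package data:
#     text = read_packaged_text("mypkg", "defaults.cfg")
```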

@dholth dholth commented Sep 13, 2018

@pfmoore pfmoore commented Sep 13, 2018

> I've suggested that if a wheel contained a package-1.0.data/docs/ directory, that the installer could place those files into e.g. $virtualenv/share/docs/$packagename-$packageversion by default. Imagine that plus a few more categories.

Indeed. If someone wanted to flesh out that proposal, put it into the form of a PEP/standard and get it approved and then implemented in the various tools, then that would probably cover a lot of the use cases I've seen mentioned in the past. Of course, no-one has yet volunteered to champion the suggestion. It really needs someone with an actual stake in the issue to step up, or it's going to forever sit behind other priorities.

@dholth dholth commented Sep 13, 2018

@jdemeyer jdemeyer commented Sep 14, 2018

> So IIUC, the data directory in wheels has never worked in a useful way,

Wrong! The data directory (and data_files in setup.py) is useful in several ways. For example, it can be used to install Jupyter files such as Jupyter kernel specs or Jupyter notebook extensions (example). And I see nothing wrong with installing man pages or documentation using data.

@jdemeyer jdemeyer commented Sep 14, 2018

> What about relative location, but not inside of site-package?

That's exactly the use case that data_files solves.

@jdemeyer jdemeyer commented Sep 14, 2018

> it wouldn't be any more useful than putting the data files into the package directory. Also, data-in-package-dir is what pkg_resources and similar tools have standardized on.

You are confusing two different use cases for package_data and data_files.

package_data is useful for data files used by the package itself (or possibly other Python tools looking there).

data_files on the other hand is useful for data files used by other software (which may not even be written in Python).
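
To make the distinction concrete, here is a sketch of how the two keywords might appear side by side. The project name `mykernel` and the file names are hypothetical; the keyword arguments are collected in a dict that a real setup.py would pass to `setuptools.setup(**SETUP_KWARGS)`.

```python
# Keyword arguments that a setup.py could pass to setuptools.setup(**SETUP_KWARGS).
SETUP_KWARGS = dict(
    name="mykernel",  # hypothetical project name
    version="1.0",
    packages=["mykernel"],
    # package_data: files read by the package itself; they are installed
    # inside the package directory in site-packages.
    package_data={"mykernel": ["templates/*.txt"]},
    # data_files: files meant for other software; relative target directories
    # are installed relative to the installation prefix.
    data_files=[("share/jupyter/kernels/mykernel", ["kernel.json"])],
)
```

The `data_files` target here is relative on purpose, since absolute targets are exactly the case this issue reports as mishandled in wheels.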

@njsmith njsmith commented Sep 15, 2018

what's this other software, that has nothing to do with Python, but it understands about Python environment layouts, including the data directory that even most Python software doesn't understand, but that doesn't know how to find package_data?

@jdemeyer jdemeyer commented Sep 15, 2018

> understands about Python environment layouts

"environments" are not specific to Python at all. Most open source software packages have a concept of installation prefix, analogous to sys.prefix. Conda for example installs everything (Python packages but also other packages) in a common prefix.

@jdemeyer jdemeyer commented Sep 15, 2018

> what's this other software

Jupyter packages are a good example. While many Jupyter kernels are written using Python, that is not a requirement: it is possible to implement the Jupyter protocol without Python. So they decided to use data_files for that, which makes it work the same way for Python packages and non-Python packages.

@jdemeyer jdemeyer commented Sep 15, 2018

And the man pages example is also a good one (even though I personally don't know any Python package which installs a man page).

@agronholm agronholm commented Sep 30, 2018

The consensus (?) seems to be that this needs a new standard and that wheel itself is currently not doing anything wrong. If someone wants this to be reopened, be specific about what changes are required for the wheel project. Otherwise a new issue could be opened when a new standard emerges that requires implementation here.

@agronholm agronholm closed this Sep 30, 2018
openstack-mirroring pushed a commit to openstack/kolla that referenced this issue Jul 20, 2020
This change modifies the ironic base container
to copy rootwrap filters from the virtual
env rather than the source code directory. This
is needed because some required filters have
been moved to ironic-lib and are not present in
the /ironic dir. The rootwrap filters are not
automatically installed in /etc/... due to kolla's
use of virtual envs and pypa/wheel#92

Closes-Bug: #1886663
Change-Id: Idb0a675d92bab8b9a0cf5209f0a06e996e96033c
openstack-mirroring pushed a commit to openstack/openstack that referenced this issue Jul 20, 2020
openstack-mirroring pushed a commit to openstack/kolla that referenced this issue Jul 21, 2020
(cherry picked from commit b6c7110)
@sinoroc sinoroc mentioned this issue Nov 20, 2020