bdist_wheel makes absolute data_files relative to site-packages #92

Closed
agronholm opened this Issue Dec 7, 2013 · 81 comments

Contributor

agronholm commented Dec 7, 2013

Originally reported by: Marcus Smith (Bitbucket: qwcode, GitHub: qwcode)


bdist_wheel doesn't handle absolute paths in the "data_files" keyword the way standard setuptools installs do. Those installs honor the paths as absolute (which seems to match the examples in the distutils docs).

when using absolute paths, the data ends up at the top level of the packaged wheel and gets installed relative to site-packages (along with the project packages)

so, bdist_wheel is interpreting distutils' "data_files" differently. maybe it would be better for wheel to fail to build projects with absolute data_files than to silently reinterpret them.


Contributor

agronholm commented Dec 7, 2013

Original comment by Marcus Smith (Bitbucket: qwcode, GitHub: qwcode):


I.e. it's either that the wheel spec has to grow to cover absolute data_files (I don't see how it could handle them now; putting them into {distribution}-{version}.data doesn't help because that's relative to sys.prefix), or bdist_wheel just needs to fail to build in that case.

Contributor

agronholm commented Dec 27, 2013

Original comment by Marcus Smith (Bitbucket: qwcode, GitHub: qwcode):


btw, relative "data_files" paths are handled as expected and end up in the "*.data" dir in the packaged wheel.

Contributor

agronholm commented Feb 21, 2014

Original comment by Donald Stufft (Bitbucket: dstufft, GitHub: dstufft):


I don't think we should allow absolute paths.

Contributor

agronholm commented Feb 21, 2014

Original comment by Daniel Holth (Bitbucket: dholth, GitHub: dholth):


Absolute paths need to be allowed but it may be acceptable to restrict to absolute paths within the sdist.

There's a place in setuptools where certain kinds of paths cause errors and I run into it from time to time. I don't remember the details atm, only that it would be much easier to use if it did allow absolute paths.

Contributor

agronholm commented Feb 21, 2014

Original comment by Marcus Smith (Bitbucket: qwcode, GitHub: qwcode):


Why does it have to be allowed? If bdist_wheel and sdist were consistent, that would be one thing, but they're not and can't be at the current time, so it seems wrong for wheels to build absolute paths and then place them into site-packages

Contributor

agronholm commented Feb 21, 2014

Original comment by Daniel Holth (Bitbucket: dholth, GitHub: dholth):


I could be thinking about setuptools' /other/ bug ;-)

Contributor

agronholm commented Feb 21, 2014

Original comment by Donald Stufft (Bitbucket: dstufft, GitHub: dstufft):


I don't see any reason why absolute paths have to be allowed. I think they are a bad design in general, everything should be rooted in sys.prefix. It's not a very good thing for a Wheel to be able to override /etc/hosts for instance.

Contributor

agronholm commented Feb 21, 2014

Original comment by Marcus Smith (Bitbucket: qwcode, GitHub: qwcode):


btw, there's a metadata issue open for whether wheel would grow the ability to handle platform-specific paths (including absolute I guess) https://bitbucket.org/pypa/pypi-metadata-formats/issue/13/add-a-new-subdirectory-to-allow-wheels-to

for me, this issue isn't about that discussion.

it's about the oddity of placing absolute paths into site-packages

since wheel has no ability to properly place absolute files currently, it shouldn't build projects that declare them

Contributor

agronholm commented Feb 21, 2014

Original comment by Daniel Holth (Bitbucket: dholth, GitHub: dholth):


packagename-1.0.data/data/ is currently a way to place absolute files. This is an accidental feature but I don't have any particular beef with it.

They are absolute relative to the root of the virtualenv :-) Or if no virtualenv is in use, probably /

Contributor

agronholm commented Feb 21, 2014

Original comment by Donald Stufft (Bitbucket: dstufft, GitHub: dstufft):


That's not what absolute means, that's a relative path. An absolute file is one that will install to /this/exact/path/even/in/a/virtualenv

Contributor

agronholm commented Feb 21, 2014

Original comment by Marcus Smith (Bitbucket: qwcode, GitHub: qwcode):


so take this setup.py which defines an absolute data file at "/opt/data_file": https://gist.github.com/qwcode/9144129
(and assuming there is a "data_file" relative to it)

build an sdist and wheel and then install each, and see where "data_file" goes.

  • for the sdist: /opt/data_file
  • for the wheel: ../site-packages/opt/data_file

on the other hand, relative data files get packaged into *.data/data and get installed relative to sys.prefix
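The "fail to build" behaviour suggested earlier in this thread can be sketched as a small pre-build check. This is only an illustration and not part of wheel or setuptools: the function names `absolute_data_file_targets` and `check_data_files` are hypothetical, and the sketch assumes the distutils convention that each `data_files` item is either a `(target_dir, [files])` tuple or a bare file name.

```python
import ntpath
import posixpath


def absolute_data_file_targets(data_files):
    """Return the data_files target directories that are absolute paths.

    Hypothetical helper: data_files is assumed to follow the distutils
    convention of (target_dir, [source_files]) tuples or bare file names.
    """
    absolute = []
    for entry in data_files:
        if isinstance(entry, str):
            # A bare file name carries no explicit target directory.
            continue
        target_dir = entry[0]
        # Apply both POSIX and Windows rules so the answer does not depend
        # on the platform where the build happens to run.
        if posixpath.isabs(target_dir) or ntpath.isabs(target_dir):
            absolute.append(target_dir)
    return absolute


def check_data_files(data_files):
    """Raise instead of silently relocating absolute targets."""
    bad = absolute_data_file_targets(data_files)
    if bad:
        raise ValueError(
            "absolute data_files targets cannot be placed by a wheel: %r" % (bad,)
        )
```

With the setup.py above, `check_data_files([("/opt", ["data_file"])])` would raise, while a relative target such as `("share/doc", ["README"])` would pass.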

Contributor

agronholm commented Mar 3, 2014

Original comment by Michael Hoglan (Bitbucket: mhoglan, GitHub: mhoglan):


Graphite does a similar thing, not specifically with its data files, but the lib files are specified in an absolute location (/opt/graphite/webapp) in the setup.cfg, which results in the files ending up under site-packages/opt/graphite/... when you build a wheel and install it in a virtualenv.

When building from source, I would specify --install-option to change those locations to be relative to the virtualenv, but it does not seem possible to pass those options to pip wheel.

Removing the prefix / lib configuration from the setup.cfg causes the wheel and source installs to behave the same (everything ends up in site-packages). Altering the wheel to get rid of /opt/graphite/webapp at the top level achieves the same thing (since it would have assumed a prefix of . at install).

btw, I would say it's not good practice for a module to be specifying absolute paths... not virtualenv friendly, and I would hope projects would fix that. I see this as more of having to work with projects that are not defined cleanly, and probably of allowing consistency between a src install and a wheel install.

Contributor

agronholm commented Mar 23, 2015

Original comment by Benjamin Bach (Bitbucket: benjaoming, GitHub: benjaoming):


"btw, I would say it's not good practice for a module to be specifying absolute paths... not virtualenv friendly, and I would hope projects would fix that."

Agreed!

"I see this as more of having to work with projects that are not defined cleanly."

Well, actually the current problem is to work with package installers and virtualenvs that are defined cleanly!

Problem is that you may be able to put a data file somewhere using setup(data_files=xx) -- but can you determine where it went from your application instance!?

That's the main problem I'm facing with setuptools right now... when using setuptools, all paths for the data_files kwarg are relative to sys.prefix, but when installing in a virtualenv, they're not..

Contributor

agronholm commented Mar 29, 2015

Original comment by Keerthan Jaic (Bitbucket: jck2, GitHub: jck2):


Is there a uniform way to find (relative) package data which works irrespective of whether the package is installed globally or in a venv?

Contributor

agronholm commented May 18, 2015

Original comment by Joo Tsao (Bitbucket: nuwa, GitHub: nuwa):


Need support for setup(data_files=/opt/xxx)

Contributor

agronholm commented Jun 4, 2015

Original comment by pombredanne NA (Bitbucket: pombredanne, GitHub: pombredanne):


For reference, this bug is essentially the same as #120
And since pip 7.0.0 all packages are now wheeled before install, meaning that this bug and #120 are getting prime exposure in several packages.
See pypa/pip#2874

Contributor

agronholm commented Jun 5, 2015

Original comment by pombredanne NA (Bitbucket: pombredanne, GitHub: pombredanne):


@jck2 the simplest way for me is to only use package data, stored in a package directory side by side with the python code that needs it, and never use data files.
Once you have this, dirname and __file__ will let you navigate to these data file locations relative to your python code location. Since the data is always in the same place relative to the calling code, it no longer matters whether you are installed globally, in a venv, or somewhere else.

As a simple example of this approach:
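A minimal sketch of that idea (an illustration, not the example originally referenced), assuming a hypothetical helper named `package_data_path` and a hypothetical `data/` directory shipped inside the package:

```python
import os


def package_data_path(module_file, *parts):
    """Build the path to a data file stored next to the given module file.

    module_file is meant to be the calling module's __file__; the data
    location is derived from it, never from sys.prefix or an absolute path.
    """
    package_dir = os.path.dirname(os.path.abspath(module_file))
    return os.path.join(package_dir, *parts)


# Inside a module of the package this would typically be called as:
#     template = package_data_path(__file__, "data", "template.txt")
```

For the files to be installed alongside the code, they would also need to be declared as package data in setup.py, for instance with setuptools' `package_data` keyword; the exact patterns depend on the project layout.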

Contributor

agronholm commented Jun 5, 2015

Original comment by Benjamin Bach (Bitbucket: benjaoming, GitHub: benjaoming):


My work-around for pip 7.0 (because pip automatically creates wheels from sdists) is to include this in setup.py:

import sys
# Refuse wheel builds so that pip falls back to the regular sdist install.
if 'bdist_wheel' in sys.argv:
    raise RuntimeError("This setup.py does not support wheels")

Pip will automatically skip the .whl packaging and run the normal sdist installation.

Why on earth the decision was made to have an unfinished packaging system deploy, by default, things that weren't intended for it is beyond me :( People who've made sdist installations, released them, and tested them can create their .whl files themselves... this new bdist_wheel call prolongs the installation process and creates new unexpected behavior.

Contributor

agronholm commented Dec 10, 2015

Original comment by Benjamin Reedlunn (Bitbucket: breedlun, GitHub: breedlun):


I just released my first python package, and it is affected by this issue. I would like to avoid absolute paths, as suggested, but I do not know the proper way. Can someone give me a hand?

Here is a link to my Stack Overflow question that goes into more detail.

Contributor

agronholm commented Jan 18, 2017

Original comment by joe_code (Bitbucket: joe_code, GitHub: Unknown):


This is a real problem for me as well.

My setup.py script works as expected with data_files that use an absolute path and honors them when I do 'python setup.py install'. However, when I do 'python setup.py bdist_wheel' and then pip install my wheel, the data_files that I specified with an absolute path (and that were correctly installed by a straight setup.py install) ARE NOT installed correctly from the wheel and wind up relative to site-packages, i.e. site-packages/usr/lib/blah/blah

If I want to install a file outside of site-packages (say to an arbitrary place on the filesystem) I should be able to do that. The behaviour is inconsistent. I'd really like to see this fixed because right now I can't use wheels and that's exactly what I want to use.

Contributor

agronholm commented Jan 18, 2017

Original comment by Benjamin Bach (Bitbucket: benjaoming, GitHub: benjaoming):


@joe_code - I can recommend finding a workaround, not using setup.py's setup(). Ultimately, that's what we did, and to be honest and despite my previous harsh rhetoric in this thread, it's nice to get rid of data_files and have a Python project that works inside virtual environments again and can be distributed with Wheel :)

Contributor

agronholm commented Jan 18, 2017

Original comment by joe_code (Bitbucket: joe_code, GitHub: Unknown):


Hey Benjamin, thanks for your reply. Could you elaborate a little bit on your solution please?

Contributor

agronholm commented Jan 18, 2017

Original comment by Benjamin Bach (Bitbucket: benjaoming, GitHub: benjaoming):


In our case, we could factor out most of the files in /usr/share and turn them into "package data". The remaining files are now handled by OS installers (for instance Debian packages, pkg for Mac, setup.exe for Windows, etc.).

In case you don't want to create OS installers, you can take a "run first" approach in your application: on startup, check if not os.path.exists(...) and put the files in place, possibly adding a file containing your project's version. The disadvantage is uninstallation.

Contributor

agronholm commented Jan 24, 2017

Original comment by Erik Bray (Bitbucket: embray, GitHub: embray):


I think it's a real problem that this decision has effectively broken a use case that many packages have relied on--in some cases for bad reasons, but in other cases for good reasons.

Although I personally feel like the reasoning behind the breakage has some merit, breaking things without offering some kind of guidance on how best to handle outside-Python resource files has created yet another sore point against Python packaging that has been raised by some of my colleagues, and it's a valid complaint.

I think the argument "well we shouldn't just allow installing files to arbitrary system locations" is well meaning but ultimately spurious. It's true that, depending on what install_data gets set to, the paths which can be installed to are somewhat limited, making it hard, say, to overwrite /etc/hosts. Yet pip will also happily overwrite executables in /usr/bin, for example, which I think is awful and it shouldn't. So really you're making a security-related argument that falls apart because there's actually no promise of security when installing a wheel system-wide (outside a virtualenv). Meanwhile it's possible to hand-craft wheels with files in the .data directory that can be installed almost anywhere within /usr at the very least.

I think a better approach would be to not make arbitrary decisions for software developers who know what they're doing, and where necessary protect users (and developers who don't know what they're doing) by not allowing pip to overwrite files that already exist on their system (especially for "data files").

Contributor

agronholm commented Jan 24, 2017

Original comment by Donald Stufft (Bitbucket: dstufft, GitHub: dstufft):


Allowing absolute paths breaks the isolation of virtual environments.

Contributor

agronholm commented Jan 31, 2017

Original comment by Erik Bray (Bitbucket: embray, GitHub: embray):


So treat absolute paths as relative to the root of a virtualenv when installing in a virtualenv, and don't break their semantics on system installs.

Contributor

agronholm commented Jan 31, 2017

Original comment by pombredanne NA (Bitbucket: pombredanne, GitHub: pombredanne):


@embray but is pip aware of being in a virtualenv at all?

Contributor

agronholm commented Jan 31, 2017

Original comment by pombredanne NA (Bitbucket: pombredanne, GitHub: pombredanne):


Actually it is: https://github.com/pypa/pip/blob/d86d1713647f791979b9267ffc5773479d0ef469/pip/locations.py#L39
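For illustration, a check of this kind can be sketched with the standard library alone, together with the path handling proposed above (absolute paths re-rooted inside a virtualenv, left alone otherwise). This is not pip's code: `running_in_virtualenv` and `resolve_install_target` are hypothetical names, and the sketch only handles POSIX-style paths.

```python
import posixpath
import sys


def running_in_virtualenv():
    """Best-effort check for an active virtual environment."""
    # Older virtualenv releases set sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # venv (and newer virtualenv) make sys.prefix differ from sys.base_prefix.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def resolve_install_target(path, prefix, in_venv):
    """Re-root an absolute POSIX path under prefix when inside a virtualenv.

    Relative paths and system installs are returned unchanged.
    """
    if in_venv and posixpath.isabs(path):
        return posixpath.join(prefix, path.lstrip("/"))
    return path
```

Under this proposal, "/opt/data_file" would land in "<venv>/opt/data_file" inside a virtualenv and in "/opt/data_file" on a system install.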

Contributor

agronholm commented Jan 31, 2017

Original comment by Erik Bray (Bitbucket: embray, GitHub: embray):


Yes, it has to be--especially to be able to deal with the nuances between virtualenvs with and without "global site-packages".

Contributor

agronholm commented Mar 5, 2017

Original comment by Benno Fünfstück (Bitbucket: bennofs, GitHub: bennofs):


Is there a way right now to install some file into site-packages that works with both setuptools and bdist_wheel? For example, if I want to install a native library that is later loaded by my application. Or should I not use site-packages for that?

bennofs Aug 6, 2018

@agronholm The problem is not how wheel does it, the problem is that behavior differs between wheel and setuptools installation methods.

This is also why what @cheshirekow is doing is a bad idea: try installing your package not using wheels (pip will do this if the wheel package is not installed), and it'll fail: when not using wheels, a path like /xxxxx will install directly into the root directory, but when using wheels, it installs it into the site-packages dir.

@agronholm I believe reliably installing into site-packages is not easily possible. There are two hacky solutions I can think of, but no guarantee that they work all the time:

  • use the / variant for wheels (detecting wheel build in setup.py is also only possible with hacks, such as inspecting sys.argv) and use absolute paths for non-wheel builds
  • try to generate the correct relative path such that sys.prefix + rel_path ends up in site_packages. I am not sure if that's even possible in all cases, not sure what configurations python allows (for example: could you configure python such that its libdir is in /lib/python... while the prefix is /usr?)
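
To make the second option concrete, here is a minimal sketch of computing such a relative path with the standard sysconfig module. It assumes that wheel data files land under the "data" scheme path (usually sys.prefix) and that the interpreter building the package has the same layout as the one installing it, which is exactly the part that is not guaranteed. The function name is made up for illustration.

```python
import os
import sysconfig


def site_packages_relative_to_data():
    """Express the 'purelib' (site-packages) directory relative to the 'data' directory."""
    data_dir = sysconfig.get_path("data")
    purelib_dir = sysconfig.get_path("purelib")
    # os.path.relpath may raise ValueError on Windows if the two
    # directories live on different drives.
    return os.path.relpath(purelib_dir, data_dir)


rel_path = site_packages_relative_to_data()
# Joining the relative path back onto the data directory should give site-packages again.
resolved = os.path.normpath(os.path.join(sysconfig.get_path("data"), rel_path))
```

The value is only valid for the interpreter that ran this code, so a path baked in at build time can still be wrong on a distribution that lays out site-packages differently.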

In any case, here are a few things I think you have to watch out for when testing your solution:

  • different Python versions
  • wheel vs no-wheel
  • some distributions handle site-packages differently (for example, compare Arch vs Ubuntu. I think Ubuntu uses dist-packages?)

I tried to do this once in https://github.com/bennofs/capstone/blob/23fe9f36622573c747e2bab6119ff245437bf276/bindings/python/setup.py, but this was too long ago so I can no longer say with confidence that this is the right approach.

gsemet Aug 6, 2018

I am thinking about making a getpkgdata library that would actually work in all cases. It only needs to detect whether the package is running from a development environment (pip install -e) or from an installed distribution package.

agronholm Aug 6, 2018

Contributor

The problem is not how wheel does it, the problem is that behavior differs between wheel and setuptools installation methods.

Regardless, this was opened as an issue on the wheel bug tracker. Therefore I need to know what I have to do in terms of modifying the wheel code in order to close this issue.

agronholm Aug 6, 2018

Contributor

The crux of the problem is, IMHO, the fact that the wheel PEP speaks so vaguely about the .data directory and how it's supposed to work.

For reference: https://www.python.org/dev/peps/pep-0427/#the-data-directory

dholth Aug 17, 2018

Member

The reported problem goes all the way back to the invention of virtualenv and eggs. It is possible that wheel breaks it further, but the problem is that the "data" category (*.data/data/**) is installed in different places depending on the environment, and we don't store which path that actually was when we install your program.
The way to fix it would be to allow custom folder names in .data/ with interpolated target paths, and a record of where those members were installed that a pkg_resources-type function could access.
Then the packager says "I would like to define /etc/hosts" but if you are installing into a virtualenv the file actually winds up in $virtualenv/etc/hosts or anywhere else depending on how the installer was invoked.
Several people have been interested in this kind of solution but so far not enough to actually write the PEP.
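
As a rough illustration of that idea (not an existing spec or API), the sketch below interpolates hypothetical category templates against an installer's base paths and keeps a record of where each category ended up. The category names, the templates, and the /tmp/venv path are all assumptions made for the example.

```python
import json
import sysconfig

# Hypothetical category templates; none of these names come from a published spec.
CATEGORY_TEMPLATES = {
    "docs": "{data}/share/doc/{name}-{version}",
    "config": "{data}/etc",
}


def resolve_targets(name, version, base_paths=None):
    """Interpolate each category template against the installer's base paths."""
    paths = dict(base_paths or sysconfig.get_paths())
    return {
        category: template.format(name=name, version=version, **paths)
        for category, template in CATEGORY_TEMPLATES.items()
    }


# An installer targeting a virtualenv rooted at /tmp/venv (an example path).
targets = resolve_targets("example", "1.0", {"data": "/tmp/venv"})
# The record an installer could store so the package can find its files later.
record = json.dumps(targets, sort_keys=True)
```

With a record like this written at install time, a pkg_resources-type lookup would read the stored path instead of guessing it from sys.prefix.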

agronholm Aug 18, 2018

Contributor

The gist of it seems to be that the behavior of .data is not documented anywhere and (some) people don't like how it works in practice. Correct?

pfmoore Aug 18, 2018

Member

Sounds about right - expanding on https://www.python.org/dev/peps/pep-0427/#the-data-directory, initially to capture current practice, and ultimately to document what people want to happen, in a way that gives installers enough detail to implement unambiguously, would be good.

https://www.python.org/dev/peps/pep-0491 made a start in that direction, but I don't believe it was ever finalised or approved - @dholth is that correct?

njsmith Sep 13, 2018

Member

So IIUC, the data directory in wheels has never worked in a useful way, and even if it did work it wouldn't be any more useful than putting the data files into the package directory. Also, data-in-package-dir is what pkg_resources and similar tools have standardized on.

If that's right then I would deprecate and remove data, not try to fix it. There's nothing here to fix.

agronholm Sep 13, 2018

Contributor

Where would you put things like man pages then?

benoit-pierre Sep 13, 2018

Member

If you need to package things like man pages, make a proper distribution package. IMHO those have no place in a Python package.

pfmoore Sep 13, 2018

Member

I agree. But I do think we need to clarify (somewhere) what does constitute a "proper Python package". People are using the packaging toolset for all sorts of things that often go beyond the description of "a basic Python library" (for example, pip itself is not a library, rather it's a command line application, but it's distributed as a wheel).

My informal view on what constitutes a "valid Python package" is:

  1. Must support installation in any Python environment, including the system Python, user-site, and a virtualenv.
  2. Must not need installation of any files in absolute locations, or OS-specific locations. It's perfectly OK to look for such files at runtime, but don't ship files that should be pre-installed.
  3. Must not require integration with OS services (e.g. installation as a system service, integration with system documentation services like manpages, registration with system package managers or in a system registry, ...)

If you fail any of these criteria (or can't work out some compromise of your own, like asking users to run a post-install script manually) then the Python packaging toolset isn't what you want. Of course, many people will use it anyway because the alternatives aren't that straightforward. But they might have to find their own fixes for things we don't support.

I'm happy if that's not actually what we choose to take as a definition - for example, if someone wanted to extend the wheel spec to support (in a suitably cross-platform way) absolute paths and/or locations for things like manpages, then I'd be fine with that. But until that happens, the above is my rule of thumb on what's in scope and supported for packaging.

sashkab Sep 13, 2018

2. Must not need installation of any files in absolute locations,

What about a relative location that is not inside site-packages? I.e., we need to install something into $VIRTUAL_ENV/xxx. What's the best way here? Should we still come up with an install script for the user to run manually? I need to support installation into a virtual environment; I don't need to support any other installation method, because of the package specifics.

benoit-pierre Sep 13, 2018

Member

@sashkab: Assuming that would work at installation time, what would the code to be able to use those resources at runtime look like?

dholth Sep 13, 2018

Member

The status quo discourages Python as an application development language, and I think that is a shame. setup.py didn't start out as a library management tool, but it became that during the decade when web development was the most important domain for Python, and no one noticed the broken setup.py features. If we respect the authors, packages should be able to contain anything that their authors want to put in. The strict separation between packaging and installation that we get from wheel also gives the person who uses that package complete control of how it gets installed.
I think in many cases it is more likely that the prospective Python application developer, writing a mostly-cross-platform application, will choose a different programming language whose dominant packaging tool better supports applications rather than try to make a distribution-specific package.
@pfmoore is correct that without further work a "valid Python package" has those enumerated properties.

njsmith Sep 13, 2018

Member

  1. Must support installation in any Python environment, including the system Python, user-site, and a virtualenv.

I just want to note that the other 2 requirements pretty much follow from this one. If we want to support man pages, the way to do that is to extend the definition of a "Python environment" to include a man-pages directory, which would require figuring out what that means in all of these cases.

Wheels are a high-level representation of a Python package, abstracted over the specific details of the installation environment. If you want a high-level representation of an arbitrary application, that's just a different thing, and wheels are not well-suited to that problem. There are many other tools that are designed to solve that problem, like rpms, debs, macOS/Windows application installers, etc.

dholth Sep 13, 2018

Member

njsmith Sep 13, 2018

Member

If you have data you want to access at runtime, then we already have a standard and well-supported solution (it even works for packages installed in zips!): https://docs.python.org/3.7/library/importlib.html#module-importlib.resources
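
For readers who have not used it, here is a minimal sketch of that approach. It builds a throwaway package in a temporary directory so it can run on its own; the names demo_pkg_92 and greeting.txt are hypothetical. It uses importlib.resources.files, which assumes Python 3.9 or newer (the 3.7 documentation linked above describes the older read_text/path functions).

```python
import importlib
import importlib.resources
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk so the example is self-contained.
# "demo_pkg_92" and "greeting.txt" are hypothetical names.
root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_pkg_92"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("")
(pkg_dir / "greeting.txt").write_text("hello from package data\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

# Look the file up through the import system, not through a hard-coded filesystem path.
resource = importlib.resources.files("demo_pkg_92") / "greeting.txt"
text = resource.read_text(encoding="utf-8")
```

Because the lookup goes through the package itself, the code does not need to know whether the package was installed into the system Python, user-site, or a virtualenv.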

dholth Sep 13, 2018

Member

pfmoore Sep 13, 2018

Member

I've suggested that if a wheel contained a package-1.0.data/docs/ directory, that the installer could place those files into e.g. $virtualenv/share/docs/$packagename-$packageversion by default. Imagine that plus a few more categories.

Indeed. If someone wanted to flesh out that proposal, put it into the form of a PEP/standard and get it approved and then implemented in the various tools, then that would probably cover a lot of the use cases I've seen mentioned in the past. Of course, no-one has yet volunteered to champion the suggestion. It really needs someone with an actual stake in the issue to step up, or it's going to forever sit behind other priorities.

dholth Sep 13, 2018

Member

jdemeyer Sep 14, 2018

So IIUC, the data directory in wheels has never worked in a useful way,

Wrong! The data directory (and data_files in setup.py) is useful in several ways. For example, it can be used to install Jupyter files such as Jupyter kernel specs or Jupyter notebook extensions (example). And I see nothing wrong with installing man pages or documentation using data.

jdemeyer Sep 14, 2018

What about relative location, but not inside of site-packages?

That's exactly the use case that data_files solves.
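
A small sketch of what that looks like, assuming a relative data_files target and an installer that places wheel data under the "data" scheme path (typically sys.prefix, or the virtualenv root). The project name, the file name, and the /tmp/venv root are hypothetical; the helper only predicts the location and does not install anything.

```python
import os
import sysconfig

# Keyword arguments one might pass to setuptools.setup(); the project name
# and file names here are hypothetical.
setup_kwargs = {
    "name": "example-kernel",
    "version": "1.0",
    "data_files": [
        # (target directory relative to the install prefix, [source files])
        ("share/jupyter/kernels/example", ["kernel.json"]),
    ],
}


def expected_locations(data_files, data_root=None):
    """Map each source file to where a wheel install would be expected to put it."""
    root = data_root or sysconfig.get_path("data")
    return {
        source: os.path.join(root, target_dir, os.path.basename(source))
        for target_dir, sources in data_files
        for source in sources
    }


locations = expected_locations(setup_kwargs["data_files"], data_root="/tmp/venv")
```

At runtime the same join against sysconfig.get_path("data") is a guess rather than a guarantee, since the installer does not record the path it actually used.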

jdemeyer Sep 14, 2018

it wouldn't be any more useful than putting the data files into the package directory. Also, data-in-package-dir is what pkg_resources and similar tools have standardized on.

You are confusing two different use cases for package_data and data_files.

package_data is useful for data files used by the package itself (or possibly other Python tools looking there).

data_files, on the other hand, is useful for data files used by other software (which may not even be written in Python).

njsmith Sep 15, 2018

Member

What's this other software that has nothing to do with Python, yet understands Python environment layouts, including the data directory that even most Python software doesn't understand, but doesn't know how to find package_data?

jdemeyer Sep 15, 2018

understands about Python environment layouts

"environments" are not specific to Python at all. Most open source software packages have a concept of installation prefix, analogous to sys.prefix. Conda for example installs everything (Python packages but also other packages) in a common prefix.

jdemeyer Sep 15, 2018

what's this other software

Jupyter packages are a good example. While many Jupyter kernels are written using Python, that is not a requirement: it is possible to implement the Jupyter protocol without Python. So they decided to use data_files for that, which makes it work the same way for Python packages and non-Python packages.

jdemeyer Sep 15, 2018

And the man pages example is also a good one (even though I personally don't know any Python package which installs a man page).

agronholm Sep 30, 2018

Contributor

The consensus (?) seems to be that this needs a new standard and that wheel itself is currently not doing anything wrong. If someone wants this to be reopened, be specific about what changes are required for the wheel project. Otherwise a new issue could be opened when a new standard emerges that requires implementation here.
