On Debian and derivatives, python and python3 modules are wrong about installation paths #8739

Open
woju opened this issue May 7, 2021 · 15 comments
Labels
modules:python Issues specific to the python module

Comments

@woju

woju commented May 7, 2021

Describe the bug
On Debian and derivatives, Python paths returned by import('python3').sysconfig_path() and import('python').find_installation().get_path() are not in sys.path (for --prefix=/usr/local and --prefix=/usr), because they have slight differences (dist-packages vs site-packages, pythonX vs pythonX.Y).

To Reproduce

python3_pkgdir = join_paths(import('python3').sysconfig_path('platlib'), 'packagename')
install_data('__init__.py', install_dir: python3_pkgdir)

import('python').find_installation().install_sources('module.py')

Expected behavior
I expect that if using --prefix=/usr or --prefix=/usr/local the package will be installed so that it can be imported without explicitly setting $PYTHONPATH.
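The mismatch can be checked directly on any interpreter. The following is a minimal sketch, not part of the original report; the helper name missing_from_sys_path and its injectable arguments are made up for illustration. It lists which of the directories reported by sysconfig are absent from sys.path.

```python
import os
import sys
import sysconfig


def missing_from_sys_path(paths=None, search_path=None):
    """Return the sysconfig install dirs that the interpreter would not search.

    Both arguments are injectable (an assumption of this sketch) so the check
    can be exercised with hand-written data; by default the running
    interpreter is inspected.
    """
    if paths is None:
        paths = {key: sysconfig.get_path(key) for key in ('purelib', 'platlib')}
    if search_path is None:
        search_path = sys.path
    # Normalize so that trailing slashes do not cause false mismatches.
    searched = {os.path.normpath(entry) for entry in search_path if entry}
    return {key: path for key, path in paths.items()
            if os.path.normpath(path) not in searched}


if __name__ == '__main__':
    # An empty dict means everything sysconfig reports is importable.
    print(missing_from_sys_path())
```

On a system affected by this issue the result is expected to be non-empty, because sysconfig reports site-packages while sys.path only contains dist-packages.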

system parameters

  • Is this a cross build or just a plain native build (for the same computer)?
    • native (didn't check for cross-compilation)
  • What operating system (e.g. MacOS Catalina, Windows 10, CentOS 8.0, Ubuntu 18.04, etc.)?
    • Ubuntu 18.04, Debian 10, Ubuntu 20.04
  • What Python version are you using (e.g. 3.8.0)?
    • 3.6, 3.7, 3.8 respectively, i.e. system python
  • What meson --version?
    • 0.45.1, 0.49.2, 0.53.2 respectively, but I believe this is reproducible on current master
  • What ninja --version, if it's a Ninja build?
    • 1.8.2, 1.8.2, 1.10.0 respectively (N/A I think)

further information

There is a post https://discuss.python.org/t/pep-632-deprecate-distutils-module/5134/122 (the discussion is revolving around PEP 632, which deprecates distutils package) which describes the status quo, but the explanation is incomplete. What really happens is that Debian patches only distutils module and not sysconfig, and furthermore those changes only really work for --prefix=/usr, because with --prefix=/usr/local the paths are invalid because of the pythonX vs pythonX.Y difference.

There are 4 approaches to this problem using only standard library:

  • sysconfig (doesn't work, because on debian and derivatives it returns wrong things)
  • distutils.sysconfig (only works for --prefix=/usr)
  • site (AFAIK there is no clear way to infer the correct data from any function there)
  • distutils.command.install.INSTALL_SCHEMES IMHO is what really works with proper detection of the corner case
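To see why the site module alone does not settle the question, the candidates can be put side by side. This is only an illustrative sketch: the function name candidate_site_dirs is hypothetical, the output differs per distribution, and distutils is left out on purpose because it is no longer in the standard library as of Python 3.12.

```python
import site
import sys
import sysconfig


def candidate_site_dirs():
    """Collect what each stdlib source reports, without judging which is right."""
    # site.getsitepackages() may be missing in some old virtualenv setups,
    # hence the getattr guard.
    getsitepackages = getattr(site, 'getsitepackages', None)
    return {
        'site.getsitepackages()': list(getsitepackages()) if getsitepackages else [],
        'sysconfig purelib': [sysconfig.get_path('purelib')],
        'sysconfig platlib': [sysconfig.get_path('platlib')],
        'sys.path': [entry for entry in sys.path if entry],
    }


if __name__ == '__main__':
    for source, dirs in candidate_site_dirs().items():
        print(source, dirs)
```

site.getsitepackages() returns a list of directories without saying which one is meant for pure or platform-specific modules under a given prefix, which is the information an installer needs.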

The patches are here: https://salsa.debian.org/cpython-team/python3-stdlib/-/tree/master/debian/patches, the relevant one is called distutils-install-layout.diff.

Without blaming anyone, the situation is that both Python modules in meson are unfit for their stated purpose on Debian and derivatives.

workaround

Currently we use a workaround for this: gramineproject/graphene#2353.

import distutils.command.install
import distutils.sysconfig
import distutils.util
import pathlib
import sys

def get_platlib(prefix):
    # Debian's patched distutils adds a 'deb_system' install scheme
    is_debian = 'deb_system' in distutils.command.install.INSTALL_SCHEMES

    # this takes care of `/` at the end, though not `/usr/../usr/local`
    is_usr_local = pathlib.PurePosixPath(prefix).as_posix() == '/usr/local'

    if is_debian and is_usr_local:
        # we have to compensate for Debian
        return distutils.util.subst_vars(
            distutils.command.install.INSTALL_SCHEMES['unix_local']['platlib'],
            {
                'platbase': '/usr',
                'py_version_short': '.'.join(map(str, sys.version_info[:2])),
            })

    return distutils.sysconfig.get_python_lib(plat_specific=True, prefix=prefix)

This is obviously not sufficient, because there is also purelib, but we don't use it, so in the original code I didn't bother.

@eli-schwartz
Member

Shouldn't meson's behavior here be exactly identical to debian users using other mechanisms such as unpatched setuptools/pip via get-pip.py, or using pep 517/518 build systems that handle installation themselves?

I knew debian is a hot mess here, but I didn't realize it was this bad! What do debian packagers packaging software built with meson, do to fix their distro packaging recipes?

@woju
Author

woju commented May 7, 2021

Shouldn't meson's behavior here be exactly identical to debian users using other mechanisms such as unpatched setuptools/pip via get-pip.py, or using pep 517/518 build systems that handle installation themselves?

Sorry to pick the language, but "unpatched pip" is a misconception: pip, setuptools etc. internally use distutils stdlib package, so it's both unpatched and patched at the same time. You can't really get "unpatched pip" on system's python if by "unpatched" you mean the totality of behaviour.

This is getting less true as we speak:

  • PEP 632 promises that setuptools vendor their own distutils and import distutils is deprecated.
  • Some time between 3.8 and 3.10 distutils.command.install.INSTALL_SCHEMES converted from static definition to some convoluted import from sysconfig._INSTALL_SCHEMES because "single source of truth".

Now debian circumvented it, their own truth is still not reflected in sysconfig's functions, so if something doesn't change I'm afraid they will still patch that vendored distutils in setuptools, at the same time ignoring sysconfig.

I knew debian is a hot mess here, but I didn't realize it was this bad! What do debian packagers packaging software built with meson, do to fix their distro packaging recipes?

I honestly don't know, I haven't seen any official packages doing that, and I've seen other approaches like libvirt: the main library is built using meson and libvirt-python (another git repo) is built with usual setup.py. I suspect installing python using meson is a niche use case, so if they do it, they might just fix it up with mv after DESTDIR install. And they know it's their own "hot mess", so they clean it up themselves, I doubt you (= meson upstream) will see any report from debian packagers about this problem.

I'd totally understand if you WONTFIX this and instead say this is debian's fault for partially patching some functions (distutils) and not patching other, more documented (sysconfig), but I'd be grateful if meson could at least consider aligning with what pip (patched or unpatched) does. The partial solution I posted might be a start, and if you need, I can post Signed-off-by for this or similar piece of code.

@woju
Author

woju commented May 7, 2021

For posterity, here's a script I used while designing a solution:

#!/usr/bin/env python3

import distutils.command.install
import distutils.sysconfig
import pprint
import sys
import sysconfig

for expr in (
    "sys.path",
    "sysconfig.get_paths()",
    "sysconfig.get_paths(vars={'base': '/usr/local', 'platbase': '/usr/local'})",
    "distutils.sysconfig.get_python_lib()",
    "distutils.sysconfig.get_python_lib(prefix='/usr/local')",
    "distutils.sysconfig.get_python_lib(plat_specific=True, prefix='/usr/local')",
    "distutils.command.install.INSTALL_SCHEMES",
):
    print(f'{expr}={pprint.pformat(eval(expr))}')

@eli-schwartz
Member

^^ Cinnamon has gone the route of get_install_dir() but having a meson_options.txt option used only in the debian packaging, to override the installation directory. The install dir is then used for install_data().

@woju
Author

woju commented Jun 25, 2021 via email

@eli-schwartz
Member

Yeah, a proper method is of course best...

Documentation updates are cheap, I'd be open to mentioning the Debian corner case.

We could also continue to discuss special casing this in meson just for Debian's sake so workarounds are not needed (we even do this for --libdir defaults although there we don't have an official API like sysconfig to return what the OS is supposed to use), but so far I'm the only core maintainer who responded to the ticket so I dunno what everyone thinks here. I'd really like to know what the general view of the team is on handling Debian here...

@jpakkane
Member

The desired end user experience is fairly obvious: when doing the default thing with default settings, the end result should work on all distros. If every Python project needs to add workaround code then that is bad. The question is if and how we can make that happen. I've read through this entire thread and still don't fully understand the issue...

@jpakkane
Member

Also note that the python3 module is deprecated and you should use python instead. Is the same issue in that module as well?

@eli-schwartz
Member

eli-schwartz commented Jun 28, 2021

Yes. The ticket's title specifically mentions they're both wrong.

Repeating the original report:

Describe the bug
On Debian and derivatives, Python paths returned by import('python3').sysconfig_path() and import('python').find_installation().get_path() are not in sys.path

This calls out both modules.

The problem is that they both rely on the python stdlib sysconfig.get_path(), and on Debian this stdlib module is lying through its teeth.[0] The one that works, distutils, which Debian patches to report the patched sys.path, is per python upstream "deprecated and should not be used". There is no viable solution today. There may or may not be one in the future, depending on whether the python upstream developers decide on some brand new API specifically designed to support Debian.

@rgommers
Contributor

The deprecated function won't disappear till Oct 2023 (Python 3.12.0), so probably using that now while silencing the deprecation warning is the pragmatic thing to do. And then revisit in 6 or 12 months to see if setuptools has decided to expose that particular distutils API as its own (it's in the process of doing this for all of distutils, which it vendored).

And put it behind a version check:

import sys

if sys.version_info >= (3, 12):
    # use sysconfig - Debian is weird here, see gh-8739.
    import sysconfig
else:
    # This does the right thing:
    from distutils.command.install import INSTALL_SCHEMES
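Silencing the deprecation warning, as suggested above, could look roughly like this. It is a sketch under assumptions: the helper name load_install_schemes is hypothetical, and returning None when distutils cannot be imported is a choice made here so that the caller can fall back to sysconfig.

```python
import warnings


def load_install_schemes():
    """Return distutils' INSTALL_SCHEMES, or None if distutils is unavailable."""
    try:
        with warnings.catch_warnings():
            # Importing distutils emits a DeprecationWarning on Python 3.10
            # and 3.11; suppress it only for the duration of the import.
            warnings.simplefilter('ignore', DeprecationWarning)
            from distutils.command.install import INSTALL_SCHEMES
    except ImportError:
        # distutils was removed from the standard library in Python 3.12.
        return None
    return INSTALL_SCHEMES


if __name__ == '__main__':
    schemes = load_install_schemes()
    print('deb_system' in schemes if schemes is not None else 'no distutils')
```

Catching ImportError instead of comparing version numbers keeps the fallback working even if a distribution ships or drops distutils on a different schedule.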

@eli-schwartz
Member

I'm not arguing "don't use a deprecated thing"...

I'm pointing out it's technically the wrong approach, which only matters in the context of "that's why using it constitutes a Debian hack".

Using distutils as "the Debian hack" is a perfectly valid argument to make, at least IMO.

@eli-schwartz
Member

(No, we are not adding a runtime dependency on setuptools, meson has a stdlib-only policy.)

@woju
Author

woju commented Jun 29, 2021

There is a degree of uncertainty about what Debian packaging will do in the future, so I'd respectfully suggest that if you'd like to include some solution in Meson, it should at least be coordinated with Debian. Apart from Linux Mint people, I haven't seen anyone from the Debian packaging community weighing in on this issue yet.

@jpakkane >I've read through this entire thread and still don't fully understand the issue...

That's OK, I don't think anyone does. For me it took about two days of reading through patches and testing a solution, and I don't think I understand this either, but I'll try to explain this as I know it. Sorry in advance if you know any part already.

So let's start by stating that Python has an interpreter-wide variable sys.path, which is a list of locations (directories and zipfiles) from which the interpreter will load modules. This variable can be modified by multiple parties for various reasons: users can define envvar $PYTHONPATH or create so-called "virtual environments" which is basically a directory for modules separate from system's python and a script which spawns a subshell with $PYTHONPATH adjustment. Distros can affect sys.path by compilation options (like --prefix etc.) and by dropping site.py module which gets autoloaded in every interpreter process before parsing user's script (i.e. before the first import).

Python has two different types of modules, written in python or written in C. Modules can be from stdlib or installed otherwise. So python has mechanisms to have all that cartesian product (they're called {pure,plat}{,std}lib) in separate directories, though stdlib and platstdlib is usually combined in /usr/lib/pythonX.Y and additional modules are in /usr/lib/pythonX.Y/site-packages, though there may be variations like /usr/lib64/pythonX.Y depending on distro. There's also a possibility to install multiple Python 3s and have pure modules shared (because why not) in /usr/lib/pythonX. Python ecosystem has a variety of packaging and distribution toolchains (setuptools install from source, pip install from pypi repo, whole distros like [ana]conda, ...) and all that tooling is expected to extract the .py and .so files in the right places.
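For reference, the four directories of that cartesian product can be queried like this. A minimal sketch using only sysconfig; the function name library_dirs is made up for illustration, and on Debian the purelib and platlib values are exactly the ones this issue is about.

```python
import sysconfig


def library_dirs():
    """Map each of {pure,plat}{,std}lib to the directory sysconfig reports."""
    names = ('stdlib', 'platstdlib', 'purelib', 'platlib')
    return {name: sysconfig.get_path(name) for name in names}


if __name__ == '__main__':
    for name, path in library_dirs().items():
        print(f'{name:<10} {path}')
```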

To this end, python's standard library provides a way to query for those directories at runtime. Here comes first hard part: there are two different mechanisms for that, older distutils module and newer sysconfig module. PEP 632 deprecates the former in favour of the latter.

Note that mainline Python has no LSB/FHS-style concept of separation of /usr and /usr/local -- in Python's worldview there is a single prefix (this slight oversimplification is bending the truth, but holds in all major distros).

Now back to Debian. Because distros are free to adjust site.py and also have the technical capability to patch any modules from stdlib, Debian took the liberty to add directories from /usr/local and change site-packages into dist-packages. The official explanation is that in case you compile python yourself (perhaps another, newer version on almost 2 yo Debian stable), packages installed into system's python don't interfere, esp. compiled ones -- in Python, pure .py modules might be compatible between a range of 3.Y versions, but compiled .so modules are compatible with a single, minor X.Y version. (In practice this is, shall we say, less than helpful, because you don't ever install self-compiled python into /usr).

But the technical way they substitute site- -> dist- is incomplete (and this is a charitable way to describe this, the less charitable way would be "clueless"), because they changed only the distutils module, since that's what's historically used by pip and other assorted tooling in the ecosystem, all of which was written before the sysconfig module. sysconfig was never fixed, even when mainline python moved their source-of-truth definitions from distutils to sysconfig and rewrote distutils to be a wrapper around sysconfig. Debian rewrote their patches and dutifully maintained the difference. Here's the limit of my understanding, in particular I don't know why they didn't fix the situation already. It's possible that they're caught in some compatibility problems with their own packages which might break if they changed something, or they have a problem with no /usr/local support in Python's API, or just the less charitable assessment applies and they don't understand the implications of what they're doing. IDK.

So the script above, which compares sys.path against a variety of ways that you can get "some" info about installation paths, run against Linux Mint 20 (which is based on Ubuntu 20.04, so a Debian derivative) on system's python 3.8, returns this:

sys.path=['',
 '/usr/lib/python38.zip',
 '/usr/lib/python3.8',
 '/usr/lib/python3.8/lib-dynload',
 '/usr/local/lib/python3.8/dist-packages',
 '/usr/lib/python3/dist-packages',
 '/usr/lib/python3.8/dist-packages']
sysconfig.get_paths()={'data': '/usr',
 'include': '/usr/include/python3.8',
 'platinclude': '/usr/include/python3.8',
 'platlib': '/usr/lib/python3.8/site-packages',
 'platstdlib': '/usr/lib/python3.8',
 'purelib': '/usr/lib/python3.8/site-packages',
 'scripts': '/usr/bin',
 'stdlib': '/usr/lib/python3.8'}
sysconfig.get_paths(vars={'base': '/usr/local', 'platbase': '/usr/local'})={'data': '/usr/local',
 'include': '/usr/include/python3.8',
 'platinclude': '/usr/include/python3.8',
 'platlib': '/usr/local/lib/python3.8/site-packages',
 'platstdlib': '/usr/local/lib/python3.8',
 'purelib': '/usr/local/lib/python3.8/site-packages',
 'scripts': '/usr/local/bin',
 'stdlib': '/usr/lib/python3.8'}
distutils.sysconfig.get_python_lib()='/usr/lib/python3/dist-packages'
distutils.sysconfig.get_python_lib(prefix='/usr/local')='/usr/local/lib/python3/dist-packages'
distutils.sysconfig.get_python_lib(plat_specific=True, prefix='/usr/local')='/usr/local/lib/python3/dist-packages'
distutils.command.install.INSTALL_SCHEMES={'deb_system': {'data': '$base',
                'headers': '$base/include/python$py_version_short/$dist_name',
                'platlib': '$platbase/lib/python3/dist-packages',
                'purelib': '$base/lib/python3/dist-packages',
                'scripts': '$base/bin'},
 'nt': {'data': '$base',
        'headers': '$base/Include/$dist_name',
        'platlib': '$base/Lib/site-packages',
        'purelib': '$base/Lib/site-packages',
        'scripts': '$base/Scripts'},
 'nt_user': {'data': '$userbase',
             'headers': '$userbase/Python$py_version_nodot/Include/$dist_name',
             'platlib': '$usersite',
             'purelib': '$usersite',
             'scripts': '$userbase/Python$py_version_nodot/Scripts'},
 'unix_home': {'data': '$base',
               'headers': '$base/include/python/$dist_name',
               'platlib': '$base/lib/python',
               'purelib': '$base/lib/python',
               'scripts': '$base/bin'},
 'unix_local': {'data': '$base/local',
                'headers': '$base/local/include/python$py_version_short/$dist_name',
                'platlib': '$platbase/local/lib/python$py_version_short/dist-packages',
                'purelib': '$base/local/lib/python$py_version_short/dist-packages',
                'scripts': '$base/local/bin'},
 'unix_prefix': {'data': '$base',
                 'headers': '$base/include/python$py_version_short$abiflags/$dist_name',
                 'platlib': '$platbase/lib/python$py_version_short/site-packages',
                 'purelib': '$base/lib/python$py_version_short/site-packages',
                 'scripts': '$base/bin'},
 'unix_user': {'data': '$userbase',
               'headers': '$userbase/include/python$py_version_short$abiflags/$dist_name',
               'platlib': '$usersite',
               'purelib': '$usersite',
               'scripts': '$userbase/bin'}}

$base and $platbase are /usr. $py_version_short is X.Y (3.8 in this case), so please pay attention to every place that in fact has 3 instead, and especially where the /usr and /usr/local versions disagree.
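To make the comparison concrete, the platlib templates from the dump above can be expanded by hand. This is a small sketch that uses string.Template instead of distutils so it runs on any Python version; the SCHEMES dict is copied from the output above and the helper name expand is made up for illustration.

```python
from string import Template

# platlib templates copied from the INSTALL_SCHEMES dump above
# (Linux Mint 20, system Python 3.8).
SCHEMES = {
    'deb_system': {'platlib': '$platbase/lib/python3/dist-packages'},
    'unix_local': {'platlib': '$platbase/local/lib/python$py_version_short/dist-packages'},
    'unix_prefix': {'platlib': '$platbase/lib/python$py_version_short/site-packages'},
}


def expand(scheme, key, base='/usr', py_version_short='3.8'):
    """Substitute $base, $platbase and $py_version_short into a scheme path."""
    return Template(SCHEMES[scheme][key]).substitute(
        base=base, platbase=base, py_version_short=py_version_short)


if __name__ == '__main__':
    for scheme in SCHEMES:
        print(scheme, expand(scheme, 'platlib'))
```

With the defaults, deb_system and unix_local expand to /usr/lib/python3/dist-packages and /usr/local/lib/python3.8/dist-packages, both of which appear in the sys.path listed above, while unix_prefix expands to /usr/lib/python3.8/site-packages, which does not.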

@jpakkane
Member

Is it possible to add a new method, say python_mod.detect_the_thing_where_i_should_install_my_stuff_into() that does all the required detection magic, does the expected thing 99% of the time and which people can then use and not have to care about any of this?

@woju
Author

woju commented Jun 30, 2021 via email

stefanor added a commit to stefanor/meson that referenced this issue Dec 24, 2022
Debian now (since Python 3.10.2-6) adds the deb_system scheme to
sysconfig. Newer distutils (such as bundled with setuptools >= 60) adds
fetch schemes from sysconfig, rather than duplicating the sysconfig
schemes statically in distutils.command.install.

This change broke meson's deb_system check.

This patch replaces that mechanism (for newer Debian releases) with
explicit scheme selection, which is far simpler.
But it also retains the old mechanism, for older Debian releases that
require it (Debian <= 11).

Fixes: mesonbuild#8739 (for python module)
Fixes: https://bugs.debian.org/1026312
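As an illustration of what "explicit scheme selection" could mean with sysconfig, here is a rough sketch. It is not the code from the referenced commit: the names pick_scheme and platlib_for_prefix are hypothetical, the fallback for Python older than 3.10 is an assumption, and the /usr/local corner case discussed earlier in this thread is not handled.

```python
import os
import sysconfig


def pick_scheme(available=None, default=None):
    """Prefer Debian's 'deb_system' scheme when sysconfig knows about it."""
    if available is None:
        available = sysconfig.get_scheme_names()
    if 'deb_system' in available:
        return 'deb_system'
    if default is not None:
        return default
    # sysconfig.get_default_scheme() is public API only since Python 3.10.
    get_default = getattr(sysconfig, 'get_default_scheme', None)
    if get_default is not None:
        return get_default()
    # Assumed fallback for older interpreters.
    return 'posix_prefix' if os.name == 'posix' else 'nt'


def platlib_for_prefix(prefix):
    """Expand the chosen scheme's platlib template for the given prefix."""
    return sysconfig.get_path(
        'platlib', scheme=pick_scheme(),
        vars={'base': prefix, 'platbase': prefix})


if __name__ == '__main__':
    print(pick_scheme(), platlib_for_prefix('/usr'))
```

The available and default parameters exist only so the selection logic can be tried out on machines that are not running Debian.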
stefanor added a commit to stefanor/meson that referenced this issue Dec 24, 2022
stefanor added a commit to stefanor/meson that referenced this issue Dec 24, 2022
…n Debian

Fixes: mesonbuild#8739 (for python3 module, on Debian >= 12)
5 participants