makesetup: must link C extensions to libpython when compiled in shared mode #78995

Closed
vstinner opened this issue Sep 26, 2018 · 29 comments
Labels
3.8 build The build process and cross-build

Comments

@vstinner
Member

vstinner commented Sep 26, 2018

BPO 34814
Nosy @Yhg1s, @warsaw, @nascheme, @doko42, @pitrou, @vstinner, @ericvsmith, @ned-deily, @encukou, @vadmium, @koobs, @yan12125, @serge-sans-paille, @JulienPalard, @stratakis
PRs
  • bpo-34814: Fix Modules/makesetup for shared libraries #9593
  • bpo-34814: don't link C extensions to libpython on Unix #9912
Files
  • Setup.patch

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2019-04-16.15:39:01.687>
    created_at = <Date 2018-09-26.16:55:34.465>
    labels = ['invalid', 'build', '3.8']
    title = 'makesetup: must link C extensions to libpython when compiled in shared mode'
    updated_at = <Date 2019-04-29.16:16:50.803>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2019-04-29.16:16:50.803>
    actor = 'reimar'
    assignee = 'none'
    closed = True
    closed_date = <Date 2019-04-16.15:39:01.687>
    closer = 'vstinner'
    components = ['Build']
    creation = <Date 2018-09-26.16:55:34.465>
    creator = 'vstinner'
    dependencies = []
    files = ['47831']
    hgrepos = []
    issue_num = 34814
    keywords = ['patch']
    message_count = 29.0
    messages = ['326486', '326487', '326490', '326491', '326492', '326494', '326495', '326539', '326540', '326541', '326543', '326544', '326545', '326546', '326547', '326549', '326551', '326560', '327822', '327823', '327906', '334135', '340349', '340350', '340355', '341047', '341084', '341087', '341095']
    nosy_count = 16.0
    nosy_names = ['twouters', 'barry', 'nascheme', 'doko', 'pitrou', 'vstinner', 'eric.smith', 'ned.deily', 'petr.viktorin', 'martin.panter', 'koobs', 'yan12125', 'serge-sans-paille', 'mdk', 'cstratak', 'reimar']
    pr_nums = ['9593', '9912']
    priority = 'normal'
    resolution = 'not a bug'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue34814'
    versions = ['Python 3.8']

    @vstinner
    Member Author

    vstinner commented Sep 26, 2018

    Python can be compiled in "shared" mode: "./configure --enable-shared", Py_ENABLE_SHARED is defined in pyconfig.h. Most Linux distributions use this configuration.

    By default, Python builds most C extensions using setup.py, which is based on distutils. The get_libraries() method of Lib/distutils/command/build_ext.py explicitly adds a dependency on libpythonX.Y if Py_ENABLE_SHARED is defined.
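
The condition described above can be sketched in Python. This is a simplified illustration, not the distutils source: the helper name `libpython_link_libs` is hypothetical, and it assumes the library name is built from the `LDVERSION` configuration variable of Python 3 (for example `3.6m` gives `python3.6m`).

```python
import sysconfig


def libpython_link_libs(config_vars=None):
    """Return the extra libraries to link for a shared-library Python build.

    Hypothetical helper: `config_vars` defaults to the running interpreter's
    build configuration, but a plain dict can be passed for experiments.
    """
    if config_vars is None:
        config_vars = sysconfig.get_config_vars()
    # Py_ENABLE_SHARED is set when Python was configured with --enable-shared.
    if not config_vars.get("Py_ENABLE_SHARED"):
        return []
    # Assumption: the library is named "python" + LDVERSION.
    return ["python" + config_vars["LDVERSION"]]
```

Calling `libpython_link_libs()` with no argument reports what the running interpreter's configuration implies.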

    But it is possible to use Modules/Setup to build some C extensions using the Makefile rather than setup.py. If "*shared*" is in Modules/Setup, the modules listed after it are compiled as shared libraries (".so" files on Linux). For example, RHEL and Fedora use this configuration for many C extensions. Problem: C extensions compiled this way are not linked to libpython.

    Example of the issue on Fedora 28 with Python 2.7:

    $ ldd $(python2 -c 'import _struct; print(_struct.__file__)')
    	linux-vdso.so.1 (0x00007ffeedf38000)
    	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fb4da876000)
    	libc.so.6 => /lib64/libc.so.6 (0x00007fb4da4b7000)
    	/lib64/ld-linux-x86-64.so.2 (0x00007fb4daca1000)

    => notice the lack of libpython

    Python 3.6 is fine:

    $ ldd $(python3 -c 'import _struct; print(_struct.__file__)')
    	linux-vdso.so.1 (0x00007ffd493dd000)
    	libpython3.6m.so.1.0 => /lib64/libpython3.6m.so.1.0 (0x00007f47b9160000)
    	...
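
The same check can be scripted. The sketch below is only an illustration: the function names are hypothetical, and it assumes `ldd` prints one `name => path` line per dependency, as in the outputs above.

```python
import re
import subprocess

# Matches a dependency line such as "libpython3.6m.so.1.0 => /lib64/...".
_LIBPYTHON = re.compile(r"^\s*(libpython\S*\.so\S*)\s+=>", re.MULTILINE)


def linked_libpython(ldd_output):
    """Return the libpython name listed in ldd output, or None if absent."""
    match = _LIBPYTHON.search(ldd_output)
    return match.group(1) if match else None


def ldd_of(path):
    """Run `ldd path` and return its standard output (requires ldd)."""
    return subprocess.run(["ldd", path], capture_output=True, text=True).stdout
```

For example, `linked_libpython(ldd_of(_struct.__file__))` would report whether `_struct` depends on libpython, provided `_struct` is built as a shared library and not compiled into the interpreter.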

    Patch used by Fedora to build _struct (and other modules) using Makefile:

    https://src.fedoraproject.org/rpms/python2/blob/f27/f/python-2.7.1-config.patch

    Another example of patch, to build _contextvars as a shared library:

    diff --git a/Modules/Setup b/Modules/Setup
    index a0622cc8c6..975aeff70d 100644
    --- a/Modules/Setup
    +++ b/Modules/Setup
    @@ -148,7 +148,7 @@ _symtable symtablemodule.c
     # modules are to be built as shared libraries (see above for more
     # detail; also note that *static* or *disabled* cancels this effect):
     
    -#*shared*
    +*shared*
     
     # GNU readline.  Unlike previous Python incarnations, GNU readline is
     # now incorporated in an optional module, configured in the Setup file
    @@ -166,7 +166,7 @@ _symtable symtablemodule.c
     #array arraymodule.c   # array objects
     #cmath cmathmodule.c _math.c # -lm # complex math library functions
     #math mathmodule.c _math.c # -lm # math library functions, e.g. sin()
    -#_contextvars _contextvarsmodule.c  # Context Variables
    +_contextvars _contextvarsmodule.c  # Context Variables
     #_struct _struct.c     # binary structure packing/unpacking
     #_weakref _weakref.c   # basic weak reference support
     #_testcapi _testcapimodule.c    # Python C API test module

    Attached PR fixes Modules/makesetup to:

    • (1) Make the Makefile target of each shared module depend on libpython, so that parallel compilation works as expected
    • (2) Link the compiled shared library (".so" file on Linux) to libpythonX.Y

    @vstinner vstinner added 3.8 build The build process and cross-build labels Sep 26, 2018
    @vstinner
    Member Author

    vstinner commented Sep 26, 2018

    Setup.patch: Example of patch to modify Modules/Setup to compile _contextvars as a shared library, to test the fix.

    @vstinner
    Member Author

    vstinner commented Sep 26, 2018

    Example of the bug:

    ---

    $ git apply ~/Setup.patch
    $ ./configure --with-pydebug --enable-shared
    $ make
    $ grep _contextvars Makefile
    (...)

    Modules/_contextvarsmodule.o: $(srcdir)/Modules/_contextvarsmodule.c; $(CC) $(CCSHARED) $(PY_CFLAGS) $(PY_CPPFLAGS) -c $(srcdir)/Modules/_contextvarsmodule.c -o Modules/_contextvarsmodule.o

    Modules/_contextvars$(EXT_SUFFIX): Modules/_contextvarsmodule.o; $(BLDSHARED) Modules/_contextvarsmodule.o -o Modules/_contextvars$(EXT_SUFFIX)

    $ find -name "_contextvars.*so"
    ./Modules/_contextvars.cpython-38dm-x86_64-linux-gnu.so
    
    $ ldd $(find -name "_contextvars.*so")
    	linux-vdso.so.1 (0x00007ffd27973000)
    	libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fd081433000)
    	libc.so.6 => /lib64/libc.so.6 (0x00007fd081074000)
    	/lib64/ld-linux-x86-64.so.2 (0x00007fd081854000)

    The _contextvars shared library is not linked to libpython. There is no "-lpythonX.Y" in the Makefile rule.

    Now with the patch:
    ---

    $ git clean -fdx
    $ git apply ~/Setup.patch
    $ ./configure --with-pydebug --enable-shared
    $ make
    $ grep _contextvars Makefile
    (...)

    Modules/_contextvarsmodule.o: $(srcdir)/Modules/_contextvarsmodule.c; $(CC) $(CCSHARED) $(PY_CFLAGS) $(PY_CPPFLAGS) -c $(srcdir)/Modules/_contextvarsmodule.c -o Modules/_contextvarsmodule.o

    Modules/_contextvars$(EXT_SUFFIX): Modules/_contextvarsmodule.o $(LDLIBRARY); $(BLDSHARED) Modules/_contextvarsmodule.o $(BLDLIBRARY) -o Modules/_contextvars$(EXT_SUFFIX)

    $ find -name "_contextvars.*so"
    ./Modules/_contextvars.cpython-38dm-x86_64-linux-gnu.so
    
    $ ldd $(find -name "_contextvars.*so")
    	linux-vdso.so.1 (0x00007ffd1e918000)
    	libpython3.8dm.so.1.0 => not found
            (...)

    With my patch, _contextvars.cpython-38dm-x86_64-linux-gnu.so is linked to libpython3.8dm.so.1.0 as expected. The Makefile rule adds $(LDLIBRARY) to the dependencies of the _contextvars(...).so rule and it adds $(BLDLIBRARY) to the linker flags of this rule.
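
To make the difference between the two generated rules explicit, here is a small Python sketch that reproduces them as strings. It is only a model of the output: Modules/makesetup itself is a shell script, and the function name and parameters below are hypothetical.

```python
def shared_module_rule(name, objects, link_libpython):
    """Format the Makefile rule for a module built as a shared library.

    With link_libpython=True, $(LDLIBRARY) is added to the prerequisites
    and $(BLDLIBRARY) to the link command, as in the patched output.
    """
    target = f"Modules/{name}$(EXT_SUFFIX)"
    objs = " ".join(objects)
    prerequisites = objs + (" $(LDLIBRARY)" if link_libpython else "")
    link_inputs = objs + (" $(BLDLIBRARY)" if link_libpython else "")
    return f"{target}: {prerequisites}; $(BLDSHARED) {link_inputs} -o {target}"
```

Passing `"_contextvars"` and `["Modules/_contextvarsmodule.o"]` yields the rule shown before the patch with `link_libpython=False` and the rule shown after it with `link_libpython=True`.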

    @vstinner
    Member Author

    vstinner commented Sep 26, 2018

    Downstream (RHEL) issue:
    https://bugzilla.redhat.com/show_bug.cgi?id=1585201

    @vstinner
    Member Author

    vstinner commented Sep 26, 2018

    I copied the nosy list from bpo-32430: people who understand and care about the Modules/Setup file :-)

    @pitrou
    Member

    pitrou commented Sep 26, 2018

    Why do you call this a bug?
    For me it's the reverse: it's linking to libpython.so which is a bug. It means a C extension compiled with a shared-library Python cannot be imported on a monolithic Python (which doesn't have libpython.so). It's a real problem when you want to redistribute compiled C extensions: if you compile it on RedHat/CentOS, it won't work on Ubuntu/Debian (the reverse works).

    I even opened an issue about that: bpo-21536
    ("extension built with a shared python cannot be loaded with a static python")

    @pitrou
    Member

    pitrou commented Sep 26, 2018

    Of course, one workaround to satisfy everyone would be to build an (empty) libpython.so even on static Python builds. But I'm not sure Debian/Ubuntu would package it.

    @vstinner
    Member Author

    vstinner commented Sep 27, 2018

    I checked whether C extensions of the Python standard library are always linked to libpython... it's complicated. I tested the _ctypes, _hashlib and _struct modules:

    • Debian and Ubuntu: NOT linked to libpython
    • Conda: LINKED to libpython
    • Mageia 7: LINKED to libpython
    • Fedora 28, RHEL 7: LINKED to libpython on Python 2.7 and 3.6, except _struct which is NOT linked to libpython on Python 2.7

    It means that using dlopen("libpython2.7.so.1.0", RTLD_LOCAL | RTLD_NOW) may or may not work depending on the Linux distribution and depending on the imported C extensions...

    If we use the example of Fedora: some C extensions are compiled using Makefile (the Fedora package modifies Modules/Setup, as I showed previously), but others are compiled by setup.py. For example, _ctypes and _hashlib are compiled by setup.py.

    @vstinner
    Member Author

    vstinner commented Sep 27, 2018

    Ah, it seems that bpo-832799 (reported in 2003) is similar to the RHEL bug:
    https://bugzilla.redhat.com/show_bug.cgi?id=1585201

    Extract of the RHEL bug report:
    ---

    pythontest.c:
    #include <dlfcn.h>
    
    int main(int argc, char *argv[])
    {
        void *pylib = dlopen("libpython2.7.so.1.0", RTLD_LOCAL | RTLD_NOW);
        void (*Py_Initialize)(void) = dlsym(pylib, "Py_Initialize");
        Py_Initialize();
        int (*PyRun_SimpleStringFlags)(const char *, void *) = dlsym(pylib, "PyRun_SimpleStringFlags");
        PyRun_SimpleStringFlags("import json\n", 0);
        return 0;
    }

    1. Compile with "gcc -Wall -o pythontest pythontest.c -ldl -g"

    2. Run ./pythontest -

    Actual results:

    it will fail with ImportError: /usr/lib64/python2.7/lib-dynload/_struct.so: undefined symbol: PyFloat_Type
    ---

    The reporter is already aware of the fallback on RTLD_GLOBAL: "(optionally) change RTLD_LOCAL to RTLD_GLOBAL and see that it works".
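
The reason the fallback works: with RTLD_GLOBAL, the symbols of libpython become visible to libraries loaded later, so an extension such as _struct.so that is not itself linked to libpython can still resolve PyFloat_Type. The flag choice can be sketched in Python with ctypes, which passes `mode` to dlopen() on Unix. The helper names are hypothetical, and `path` is whatever shared library the caller wants to open.

```python
import ctypes
import os


def dlopen_mode(global_symbols):
    """Build dlopen() flags: bind now, with global or local symbol visibility."""
    visibility = os.RTLD_GLOBAL if global_symbols else os.RTLD_LOCAL
    return os.RTLD_NOW | visibility


def load_library(path, global_symbols=True):
    """Open a shared library with the chosen visibility (Unix only)."""
    return ctypes.CDLL(path, mode=dlopen_mode(global_symbols))
```

The sketch only illustrates the flags; it is not a recommendation to load a second libpython into a running interpreter.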

    @pitrou
    Member

    pitrou commented Sep 27, 2018

    On 27/09/2018 at 12:49, STINNER Victor wrote:

    I checked whether C extensions of the Python standard library are always linked to libpython... it's complicated. I tested the _ctypes, _hashlib and _struct modules:

    • Debian and Ubuntu: NOT linked to libpython

    Do you realize libpython.so doesn't exist on Debian / Ubuntu?

    @vstinner
    Member Author

    vstinner commented Sep 27, 2018

    Extract of Antoine's comment on bpo-21536:

    (AFAIK, systems notorious for providing shared library Pythons are RedHat-alike systems, while Debian/Ubuntu provide statically linked Pythons)

    Oh. I didn't notice this major difference...

    • Ubuntu 16.04: Python 2.7.12 and Python 3.5.2 are not linked to libpython
    • Fedora 28: Python 2.7.15 and Python 3.6.6 are linked to libpython (--enable-shared)
    • FreeBSD 12 (alpha): Python 2.7 and Python 3.6 are linked to libpython (--enable-shared)

    @vstinner
    Member Author

    vstinner commented Sep 27, 2018

    • FreeBSD 12 (alpha): Python 2.7 and Python 3.6 are linked to libpython (--enable-shared)

    Note: _ctypes, _hashlib and _struct are all linked to libpython, on Python 2 and Python 3.

    Antoine:

    Do you realize libpython.so doesn't exist on Debian / Ubuntu?

    No, I didn't :-)

    @vstinner
    Member Author

    vstinner commented Sep 27, 2018

    Antoine:

    Why do you call this a bug?

    "./configure --enable-shared && make" links C extensions to libpython. It's surprising that C extensions compiled by Makefile behave differently (not linked to libpython). We need consistency: either *never* link to libpython, or *always* link to libpython.

    @vstinner
    Member Author

    vstinner commented Sep 27, 2018

    It means a C extension compiled with a shared-library Python cannot be imported on a monolithic Python (which doesn't have libpython.so). It's a real problem when you want to redistribute compiled C extensions: if you compile it on RedHat/CentOS, it won't work on Ubuntu/Debian (the reverse works).

    Is it a real use case? Why would anyone use a RHEL binary on Debian? Debian already provides the full standard library.

    C extensions of the standard library are tightly coupled to CPython. For example, it may be dangerous to use a C extension of Python 2.7.5 on Python 2.7.15.

    I'm talking about the very specific case of C extensions which are part of the stdlib.

    Third party C extensions distributed as portable wheel packages using the stable ABI are a different use case.

    @pitrou
    Member

    pitrou commented Sep 27, 2018

    Is it a real use case? Why would anyone use a RHEL binary on Debian? Debian already provides the full standard library.

    I'm not talking about the standard library obviously. I don't remember my original use case exactly, but I must have been compiling a C extension on a system and expected it to work on another.

    C extensions of the standard library are tightly coupled to CPython. For example, it may be dangerous to use a C extension of Python 2.7.5 on Python 2.7.15.

    I don't believe that. Binary wheels uploaded to PyPI seem to work fine regardless of the exact bugfix version.

    Third party C extensions distributed as portable wheel packages using the stable ABI are a different use case.

    Most wheel packages don't use the stable ABI. They are tied to a Python version such as 2.7, but they don't differentiate between e.g. 2.7.5 and 2.7.15. We don't break the ABI between bugfix releases.

    @vstinner
    Member Author

    vstinner commented Sep 27, 2018

    I'm not talking about the standard library obviously. I don't remember my original use case exactly, but I must have been compiling a C extension on a system and expected it to work on another.

    It seems like we are talking about two different things:

    • My issue is restricted to the C extensions compiled by Makefile and setup.py: C extensions of the standard libraries
    • You are talking about third party extensions: that's out of the scope of this issue, since my issue is about Modules/makesetup.

    @pitrou
    Member

    pitrou commented Sep 27, 2018

    But as you said, we need consistency: either *never* link to libpython, or *always* link to libpython. You are proposing to always link, I'm arguing we should never link.

    @doko42
    Member

    doko42 commented Sep 27, 2018

    Debian/Ubuntu doesn't link against the library because it would add dependencies on all supported Python versions. Normally this only matters during transition times, but e.g. for the upcoming Ubuntu 18.10 release we didn't finish the transition and so ship two Python 3 versions. The packaging tools would add package dependencies on both 3.6 and 3.7, which you don't want.

    @serge-sans-paille
    Mannequin

    serge-sans-paille mannequin commented Oct 16, 2018

    I'm not an expert on the Python build, but I've created a few "reverse engineering challenges" where I had to ship a modified version of the interpreter, so I have played with it a bit.

    I agree consistency is nice to reason about. It looks better to me to not link with libpython.so directly. This is probably better as this does not make libpython an install requirement (e.g. when one wants to embed a minimal version of python)

    As a short check, I ran

    nm libpython3.so | grep ' [tT] ' | cut -d ' ' -f 3 | while read line; do nm python | grep ' [tT] ' | cut -d ' ' -f 3 | grep $line >/dev/null || { echo "bad: $line"; break; }; done
    

    and everything looks fine, so all symbols should already be in the interpreter.
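
A rough Python equivalent of that check is sketched below, assuming `nm` prints `address type name` for defined symbols. The function names are hypothetical. Unlike the shell one-liner, which uses substring matching with grep and stops at the first miss, this version compares exact names and reports every missing symbol.

```python
def text_symbols(nm_output):
    """Collect the names of symbols that nm reports with type t or T."""
    symbols = set()
    for line in nm_output.splitlines():
        fields = line.split()
        # Defined symbols have three fields: address, type, name.
        if len(fields) == 3 and fields[1] in ("t", "T"):
            symbols.add(fields[2])
    return symbols


def missing_from_executable(lib_nm_output, exe_nm_output):
    """Return library text symbols that the executable does not define."""
    return sorted(text_symbols(lib_nm_output) - text_symbols(exe_nm_output))
```

An empty result corresponds to the "everything looks fine" outcome: every text symbol of the library is also defined in the interpreter binary.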

    I've also checked whether that's an issue or not for user-defined native extensions and everything runs smoothly without the explicit dep.

    So the argument would be: why add this dependency when it's not needed?

    @vstinner
    Member Author

    vstinner commented Oct 16, 2018

    I wrote PR 9912 to ensure that C extensions are never linked to libpython.

    I tested my change using:

    git clean -fdx
    ./configure --with-pydebug --enable-shared
    make
    for SO in build/lib.*/*.so Modules/*.so; do ldd $SO|grep libpython; done
    # grep must not display anything
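
The loop above can also be written as a small Python sketch, assuming the intended globs are `build/lib.*/*.so` and `Modules/*.so`. The function names are hypothetical, and the `ldd` parameter exists only so the scan can be exercised without a real build tree.

```python
import glob
import subprocess


def run_ldd(path):
    """Return the text printed by `ldd path` (requires ldd, e.g. on Linux)."""
    result = subprocess.run(["ldd", path], capture_output=True, text=True)
    return result.stdout


def extensions_linked_to_libpython(patterns, ldd=run_ldd):
    """Return the files matching `patterns` whose ldd output mentions libpython."""
    linked = []
    for pattern in patterns:
        for path in sorted(glob.glob(pattern)):
            # Same test as `ldd $SO | grep libpython` in the shell loop.
            if "libpython" in ldd(path):
                linked.append(path)
    return linked
```

Run from the top of the build tree, `extensions_linked_to_libpython(["build/lib.*/*.so", "Modules/*.so"])` should return an empty list when no extension is linked to libpython.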

    I tested 3 configurations on my Fedora 28 (Linux):

    • ./configure --with-pydebug --enable-shared: NEVER linked
    • Modified Modules/Setup (*) with ./configure --with-pydebug --enable-shared: NEVER linked
    • ./configure --with-pydebug: NEVER linked (well, it doesn't use libpython, but I wanted to test all cases :-))

    (*) I modified Modules/Setup to compile 37 C extensions as shared libraries using Makefile. Extract of Modules/Setup:

    errno errnomodule.c # posix (UNIX) errno values
    pwd pwdmodule.c # this is needed to find out the user's home dir
    _sre _sre.c # Fredrik Lundh's new regular expressions
    _codecs _codecsmodule.c # access to the builtin codecs and codec registry
    _weakref _weakref.c # weak references
    _operator _operator.c # operator.add() and similar goodies
    _collections _collectionsmodule.c # Container types
    _abc _abc.c # Abstract base classes
    itertools itertoolsmodule.c # Functions creating iterators for efficient looping
    atexit atexitmodule.c # Register functions to be run at interpreter-shutdown
    _stat _stat.c # stat.h interface
    _locale _localemodule.c # -lintl
    faulthandler faulthandler.c
    _tracemalloc _tracemalloc.c hashtable.c
    _symtable symtablemodule.c
    readline readline.c -lreadline -ltermcap
    array arraymodule.c # array objects
    cmath cmathmodule.c _math.c # -lm # complex math library functions
    math mathmodule.c _math.c # -lm # math library functions, e.g. sin()
    _contextvars _contextvarsmodule.c # Context Variables
    _struct _struct.c # binary structure packing/unpacking
    _weakref _weakref.c # basic weak reference support
    _testcapi _testcapimodule.c # Python C API test module
    _random _randommodule.c # Random number generator
    _pickle _pickle.c # pickle accelerator
    _datetime _datetimemodule.c # datetime accelerator
    _bisect _bisectmodule.c # Bisection algorithms
    _heapq _heapqmodule.c # Heap queue algorithm
    _asyncio _asynciomodule.c # Fast asyncio Future
    unicodedata unicodedata.c # static Unicode character database
    fcntl fcntlmodule.c # fcntl(2) and ioctl(2)
    spwd spwdmodule.c # spwd(3)
    grp grpmodule.c # grp(3)
    select selectmodule.c # select(2); not on ancient System V
    mmap mmapmodule.c
    _csv _csv.c
    _socket socketmodule.c
    _crypt _cryptmodule.c # -lcrypt # crypt(3); needs -lcrypt on some systems
    termios termios.c # Steen Lumholt's termios module
    resource resource.c # Jeremy Hylton's rlimit interface
    _posixsubprocess _posixsubprocess.c # POSIX subprocess module helper
    audioop audioop.c # Operations on audio samples
    _md5 md5module.c
    _sha1 sha1module.c
    _sha256 sha256module.c
    _sha512 sha512module.c
    syslog syslogmodule.c # syslog daemon interface
    _gdbm _gdbmmodule.c -I/usr/local/include -L/usr/local/lib -lgdbm
    binascii binascii.c
    parser parsermodule.c
    zlib zlibmodule.c -I$(prefix)/include -L$(exec_prefix)/lib -lz
    xxsubtype xxsubtype.c

    @ned-deily
    Member

    ned-deily commented Oct 17, 2018

    Perhaps you should bring up this proposed change in distutils-sig before committing. It's probably an OK change but it would be good to try to get some feedback from the downstream users who might be affected by it.

    @vstinner
    Member Author

    vstinner commented Jan 21, 2019

    Another Fedora bug report, on Python 2:
    https://bugzilla.redhat.com/show_bug.cgi?id=1667914

    @vstinner
    Member Author

    vstinner commented Apr 16, 2019

    Downstream (RHEL) issue:
    https://bugzilla.redhat.com/show_bug.cgi?id=1585201

    This issue has been closed as "not a bug".

    --

    Since this issue was created, no consensus has been reached, so I am closing it to keep the status quo.

    In short, RTLD_LOCAL is not supported.

    I also close this issue as not a bug.

    @pitrou
    Member

    pitrou commented Apr 16, 2019

    What do you mean, "no consensus could be found"? I don't see anyone objecting to the change.

    @vstinner
    Member Author

    vstinner commented Apr 16, 2019

    What do you mean, "no consensus could be found"? I don't see anyone objecting to the change.

    I proposed one change to always link and another to never link. I don't see any traction toward either option. It seems like there are issues on Android.

    Anyway, this issue only seems to be theoretical since libpython must not be used with RTLD_LOCAL.

    @vstinner
    Member Author

    vstinner commented Apr 29, 2019

    FYI I modified Python 3.8 to never link C extensions to libpython on Unix (except on Android): bpo-21536, but with a different rationale (better reasons to do so).

    @reimar
    Mannequin

    reimar mannequin commented Apr 29, 2019

    In short, RTLD_LOCAL is not supported.

    I am sorry, this is not a workable stance.
    This does not just affect loading libpython directly.
    It also affects dlopen of a library that links to a library ... that links to libpython.
    For a developer it is impossible to know whether some library might ever bring in libpython through dozens of dependencies.
    Thus your stance is essentially equivalent to "nobody must ever use RTLD_LOCAL". I find it hard to consider that an acceptable "solution".

    @vstinner
    Member Author

    vstinner commented Apr 29, 2019

    In short, RTLD_LOCAL is not supported.

    reimar: "I am sorry, this is not a workable stance. This does not just affect loading libpython directly. (...)"

    This issue is now closed, as is bpo-21536. Would you mind opening a new issue to clearly explain your own case? Please mention your platform.

    @reimar
    Mannequin

    reimar mannequin commented Apr 29, 2019

    Sorry for my laziness. I opened bpo-36753.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022