Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reducing and removing runtime binding errors for CPython extensions on macOS #103306

Closed
rickmark opened this issue Apr 6, 2023 · 13 comments
Closed
Labels
build The build process and cross-build OS-mac type-feature A feature request or enhancement

Comments

@rickmark
Copy link

rickmark commented Apr 6, 2023

Feature or enhancement

Utilize -bundle_loader to make C extensions to CPython more reliable.

Pitch

Today the majority of CPython extensions on macOS are built with lazy, dynamic bound symbols (-undefined dynamic_lookup). This is due to the fact that the linker would fail for a great many inputs. The reason for this is that distutils and others are not using the macOS system linker as intended. When compiling a "bundle" (a specialized type of dylib) one can and should pass an executable "bundle loader" (-bundle_loader) that defines all the symbols that are expected to be ambient prior to the loading of the bundle. This allows for the linker to correctly validate the closure of all symbols when producing the linked product. Due to the fact that it is possible to compile python the binary with a shared library, it would also be required that the binary need "re-export" all symbols from the shared library. This would make the python executable suitable for being used as the bundle loader during extension linking. The undefined switch can then be removed, and the linker will provide actual valuable information as to the extensions ability to execute at time of compile, perhaps due to an ABI change, rather then breaking at time of load.

As far as I can tell, there is no downside to "re-exporting" shared library symbols in the python binary, other then a negligable increase in binary size for the defined symbols.

From my current understanding, LLVM's lld will only accept a "bundle" or "executable" for the bundle loader. This means we cannot simply pass the shared library in as current. An alternative to re-exporting the symbols would be to create a "dummy" bundle that exports all the same symbols as the shared library. This dummy bundle on macOS can be used as the ABI and should always exactly match the headers used by the extension.

It may be even better to influence lld (and possibly ld64) to accept tbd or text based symbols or some other format of textual symbol as to not have to worry about the dummy bundle or re-exporting (after all the linker is really only using the exported symbols of the executable to verify that all symbols are in fact defined, not to bind them - the loader is not defined as a load command as that would create a circular dependency). The -undefined is too course grained if its on, then you have no verification all symbols are defined.

A simple proof of concept can be generated to create the dummy bundle and to use it in conjunction with the building of included core modules.

R

@rickmark rickmark added the type-feature A feature request or enhancement label Apr 6, 2023
@arhadthedev arhadthedev added OS-mac build The build process and cross-build labels Apr 6, 2023
@ronaldoussoren
Copy link
Contributor

-bundle_loader doesn't work when the loader binary is dynamically linked to libpython. There doesn't seem to be a way to report symbols in a way that will be work with -bundle_loader. I've checked, see #97524.

Other than that I'd love to have a way to avoid -undefined dynamic_lookup for the reasons you mention (one that's does not "ship a second python binary that is statically linked").

@ronaldoussoren
Copy link
Contributor

One option that could work is to using -U <symbol> for every exported symbol in libpython while linking (and dropping -undefined dynamic_lookup, but that both leads to extremely long command lines while linking and requires some automated process to keep the list of symbols up-to-date (and likely needs two lists, one for extensions using the limited ABI and one for other extensions).

@rickmark
Copy link
Author

rickmark commented Apr 6, 2023

One option that could work is to using -U <symbol> for every exported symbol in libpython while linking (and dropping -undefined dynamic_lookup, but that both leads to extremely long command lines while linking and requires some automated process to keep the list of symbols up-to-date (and likely needs two lists, one for extensions using the limited ABI and one for other extensions).

Yeah - it's a shame that there's not an option for ld64 or lld to specify a list of undefined symbols. Suppose this can be added to lld via a patch since it is useful in such cases.

@rickmark
Copy link
Author

rickmark commented Apr 6, 2023

-bundle_loader doesn't work when the loader binary is dynamically linked to libpython. There doesn't seem to be a way to report symbols in a way that will be work with -bundle_loader. I've checked, see #97524.

Other than that I'd love to have a way to avoid -undefined dynamic_lookup for the reasons you mention (one that's does not "ship a second python binary that is statically linked").

What are your thoughts on creating a separate python.abi file as a bundle (as in MH_BUNDLE) during build that simply defines the symbols exported (potentially as as aliases to a single stub function)? This would only used for ABI checking and have no code (the symbols must actually be exported, undefined symbols are not sufficient). I'll see if building such a thing works.

As a snide side note, moving to CMake would make this so much easier, autotools is a bit difficult for platform specific quirks these days.

@ronaldoussoren
Copy link
Contributor

One option that could work is to using -U <symbol> for every exported symbol in libpython while linking (and dropping -undefined dynamic_lookup, but that both leads to extremely long command lines while linking and requires some automated process to keep the list of symbols up-to-date (and likely needs two lists, one for extensions using the limited ABI and one for other extensions).

Yeah - it's a shame that there's not an option for ld64 or lld to specify a list of undefined symbols. Suppose this can be added to lld via a patch since it is useful in such cases.

Turns out there is an option for that, use @/path/to/symbols.txt with symbols.txt containing a list of -U <symbol> arguments (see the first option under "Rarely used Options" in the manpage.

@ronaldoussoren
Copy link
Contributor

-bundle_loader doesn't work when the loader binary is dynamically linked to libpython. There doesn't seem to be a way to report symbols in a way that will be work with -bundle_loader. I've checked, see #97524.
Other than that I'd love to have a way to avoid -undefined dynamic_lookup for the reasons you mention (one that's does not "ship a second python binary that is statically linked").

What are your thoughts on creating a separate python.abi file as a bundle (as in MH_BUNDLE) during build that simply defines the symbols exported (potentially as as aliases to a single stub function)? This would only used for ABI checking and have no code (the symbols must actually be exported, undefined symbols are not sufficient). I'll see if building such a thing works.

That could work, but still requires something to maintain that and integrate it in the build process. An the @ option mentioned in my previous comment is likely less work (build as usual and then use nm to extract the list of symbols from the shared library and/or main binary (for static builds). This wouldn't differentiate between the normal and limited/stable API, but that's fine.

As a snide side note, moving to CMake would make this so much easier, autotools is a bit difficult for platform specific quirks these days.

I'm not getting into that can of worms ;-).

@rickmark
Copy link
Author

rickmark commented Apr 6, 2023

-bundle_loader doesn't work when the loader binary is dynamically linked to libpython. There doesn't seem to be a way to report symbols in a way that will be work with -bundle_loader. I've checked, see #97524.
Other than that I'd love to have a way to avoid -undefined dynamic_lookup for the reasons you mention (one that's does not "ship a second python binary that is statically linked").

What are your thoughts on creating a separate python.abi file as a bundle (as in MH_BUNDLE) during build that simply defines the symbols exported (potentially as as aliases to a single stub function)? This would only used for ABI checking and have no code (the symbols must actually be exported, undefined symbols are not sufficient). I'll see if building such a thing works.

That could work, but still requires something to maintain that and integrate it in the build process. An the @ option mentioned in my previous comment is likely less work (build as usual and then use nm to extract the list of symbols from the shared library and/or main binary (for static builds). This wouldn't differentiate between the normal and limited/stable API, but that's fine.

The -U file version is fine for cpython's building of itself, but for out of tree extensions we should probably provide something simpler like the bundle. (since its a single cmdline switch less likely to conflict with other switches provided by the extension author).

@ronaldoussoren
Copy link
Contributor

-bundle_loader doesn't work when the loader binary is dynamically linked to libpython. There doesn't seem to be a way to report symbols in a way that will be work with -bundle_loader. I've checked, see #97524.
Other than that I'd love to have a way to avoid -undefined dynamic_lookup for the reasons you mention (one that's does not "ship a second python binary that is statically linked").

What are your thoughts on creating a separate python.abi file as a bundle (as in MH_BUNDLE) during build that simply defines the symbols exported (potentially as as aliases to a single stub function)? This would only used for ABI checking and have no code (the symbols must actually be exported, undefined symbols are not sufficient). I'll see if building such a thing works.

That could work, but still requires something to maintain that and integrate it in the build process. An the @ option mentioned in my previous comment is likely less work (build as usual and then use nm to extract the list of symbols from the shared library and/or main binary (for static builds). This wouldn't differentiate between the normal and limited/stable API, but that's fine.

The -U file version is fine for cpython's building of itself, but for out of tree extensions we should probably provide something simpler like the bundle. (since its a single cmdline switch less likely to conflict with other switches provided by the extension author).

Both should be equivalent for users and both can be provided as part of python3-config or sysconfig.

I'm afraid both would only be usable by default in Python 3.13 at the earliest, although it might be possible to get into 3.12. Given our backward compatibility policy neither option can be added to bug fix releases.

@rickmark
Copy link
Author

rickmark commented Apr 7, 2023 via email

@rickmark
Copy link
Author

rickmark commented Apr 7, 2023

So good news bad news time, I have an "improvement" but its missing support for embedded python (clearly a bad thing) - i worked around that by using -flat_namespace but the real fix would be to have dlopen provide the bundle loader as the shared library or the executable (where-ever the runtime symbols are in fact defined).

#103356

@rickmark
Copy link
Author

rickmark commented Apr 8, 2023

More specifics - Without flat namespaces the symbols from the bundle loader are bound as such:

    -fixups:
        segment      section          address                 type   target
        __DATA_CONST __got            0x0000C000              bind  main-executable/_PyBool_FromLong
        __DATA_CONST __got            0x0000C008              bind  main-executable/_PyBool_Type
        __DATA_CONST __got            0x0000C010              bind  main-executable/_PyErr_Clear
        __DATA_CONST __got            0x0000C018              bind  main-executable/_PyErr_ExceptionMatches
        __DATA_CONST __got            0x0000C020              bind  main-executable/_PyErr_Format
        __DATA_CONST __got            0x0000C028              bind  main-executable/_PyErr_NoMemory
        __DATA_CONST __got            0x0000C1F8              bind  libSystem.B.dylib/___error
        __DATA_CONST __got            0x0000C200              bind  libSystem.B.dylib/___stack_chk_fail
        __DATA_CONST __got            0x0000C208              bind  libSystem.B.dylib/___stack_chk_guard
        __DATA_CONST __got            0x0000C210              bind  libSystem.B.dylib/_acos
        __DATA_CONST __got            0x0000C218              bind  libSystem.B.dylib/_acosh
        __DATA_CONST __got            0x0000C220              bind  libSystem.B.dylib/_asin
        __DATA_CONST __got            0x0000C228              bind  libSystem.B.dylib/_asinh
        __DATA_CONST __got            0x0000C230              bind  libSystem.B.dylib/_atan
        __DATA_CONST __got            0x0000C238              bind  libSystem.B.dylib/_atan2
        __DATA_CONST __got            0x0000C240              bind  libSystem.B.dylib/_atanh

This would be very bad as python can often be a host in another process and cannot and should not assume it is the main executable, and even if it is python, we may be using the shared library ourself. But when we go to a flat namespace, then we loose proper dylib binding for other libraries brining back the fragile issue:

Modules/math.cpython-312-darwin.so [arm64]:
    -fixups:
        segment      section          address                 type   target
        __DATA_CONST __got            0x0000C000              bind  flat-namespace/_PyBool_FromLong
        __DATA_CONST __got            0x0000C008              bind  flat-namespace/_PyBool_Type
        __DATA_CONST __got            0x0000C010              bind  flat-namespace/_PyErr_Clear
        __DATA_CONST __got            0x0000C018              bind  flat-namespace/_PyErr_ExceptionMatches
        __DATA_CONST __got            0x0000C020              bind  flat-namespace/_PyErr_Format
        __DATA_CONST __got            0x0000C028              bind  flat-namespace/_PyErr_NoMemory
        __DATA_CONST __got            0x0000C030              bind  flat-namespace/_PyErr_Occurred
        __DATA_CONST __got            0x0000C038              bind  flat-namespace/_PyErr_SetFromErrno
        __DATA_CONST __got            0x0000C040              bind  flat-namespace/_PyErr_SetString
        __DATA_CONST __got            0x0000C048              bind  flat-namespace/_PyExc_MemoryError
        __DATA_CONST __got            0x0000C050              bind  flat-namespace/_PyExc_OverflowError
        __DATA_CONST __got            0x0000C058              bind  flat-namespace/_PyExc_StopIteration
        __DATA_CONST __got            0x0000C060              bind  flat-namespace/_PyExc_TypeError
        __DATA_CONST __got            0x0000C068              bind  flat-namespace/_PyExc_ValueError
        __DATA_CONST __got            0x0000C070              bind  flat-namespace/_PyFloat_AsDouble
        __DATA_CONST __got            0x0000C078              bind  flat-namespace/_PyFloat_FromDouble

The ideal and correct output would be flat-namespace binding for the Python symbols, but two-level namespace for everything else. This can be done (as in the fixups can define this by using the MH_FLAT_LOOKUP opcode for python and ordinals for the others) - but the question is how to pull this off. Perhaps there is something in the generation of the .pyabi bundle file that can provide the difference in behavior? (can a bundle be compiled in a way to say "always bind to me as a flat namespace")?

@rickmark
Copy link
Author

rickmark commented Apr 8, 2023

Ideally there would be a new BIND_SPECIAL_DYLIB_LOADER_LOOKUP that allowed a symbol to be bound to a context that was passed to dlopen() so that you could specify the loader dependencies are provided by whatever image hosts the python runtime. (maybe even hacking this by letting the dylib be "main executable" for the purposes of the load) - but that can't probably happen without support from the OS. -U would be the only choice for now to use bundle loader...

@ronaldoussoren
Copy link
Contributor

I'm closing this as a duplicate of #97524, because fixing that issue will also fix this issue by switch to the -U option alternative I sketched earlier. That's still the best we can do within the design constraints given Apple tools.

@ronaldoussoren ronaldoussoren closed this as not planned Won't fix, can't repro, duplicate, stale Dec 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
build The build process and cross-build OS-mac type-feature A feature request or enhancement
Projects
None yet
Development

No branches or pull requests

3 participants