Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

building: macOS: use @rpath to rewrite binaries' dependency paths #7664

Merged

Conversation

rokm
Copy link
Member

@rokm rokm commented May 27, 2023

When rewriting the dylib identifier and paths to linked non-system shared libraries in collected binaries, use @rpath instead of directly using @loader_path.

E.g., for a binary collected as mypackage/lib/libsomething.dylib, instead of rewriting the identifier and paths to linked dylibs as @loader_lib/../../<something>, rewrite as @rpath/<something> and set rpath to @loader_path/../...

This ensures that the dylib identifier of a shared library (now @rpath/libsomething.dylib) always matches the name/path used by another collected binary to refer to this shared library (the dependency path now also being @rpath/libsomething.dylib), even if the shared library and referring binary are located at different directory depth w.r.t. the top-level application directory. Under the old approach that directly uses @loader_path, the dylib identifier of shared library might end up being @loader_path/../../libsomething.dylib, but the library path in the binary located one directory higher would end up being @loader_path/../libsomething.dylib.

Mismatches in dylib identifiers are typically not a problem, because the linked dylib is actually found at the specified location. However, some packages do not place shared libraries in "correct" location, and work around this by "preloading" the shared libraries from another location, for example via ctypes. In this case, for a binary to recognize an already-loaded library, the referring name and the target dylib's identifier must match. And in PyInstaller's case, the only way to achieve that is to have identifiers (and referring paths) normalized to @rpath/<name>, and store the actual relative path to the top-level application directory in the rpath.

Using @rpath, however, means that we now need to ensure that our rpath is the only one set in the binary; i.e., we need to remove all other rpaths, and add our target rpath if necessary. To do so, we need to use the external macOS-provided install_name_tool utility, which we now use for all header modifications: modification of dylib identifier, modification of dependend dylib paths, and removal/addition of rpaths.

@rokm rokm force-pushed the macos-rewrite-dylib-paths-using-rpath branch 2 times, most recently from bbf38a8 to 79fe2a0 Compare May 27, 2023 19:41
When rewriting the dylib identifier and paths to linked
non-system shared libraries in collected binaries, use
`@rpath` instead of directly using `@loader_path`.

E.g., for a binary collected as `mypackage/lib/libsomething.dylib`,
instead of rewriting the identifier and paths to linked dylibs as
`@loader_lib/../../<something>`, rewrite as `@rpath/<something>`
and set rpath to `@loader_path/../..`.

This ensures that the dylib identifier of a shared library
(now `@rpath/libsomething.dylib`) always matches the name/path
used by another collected binary to refer to this shared library
(the dependency path now also being `@rpath/libsomething.dylib`),
even if the shared library and referring binary are located at
different directory depth w.r.t. the top-level application
directory. Under the old approach that directly uses
`@loader_path`, the dylib identifier of shared library might
end up being `@loader_path/../../libsomething.dylib`, but the
library path in the binary located one directory higher would
end up being `@loader_path/../libsomething.dylib`.

Mismatches in dylib identifiers are typically not a problem,
because the linked dylib is actually found at the specified
location. However, some packages do not place shared libraries
in "correct" location, and work around this by "preloading"
the shared libraries from another location, for example via
`ctypes`. In this case, for a binary to recognize an
already-loaded library, the referring name and the target
dylib's identifier must match. And in PyInstaller's case, the
only way to achieve that is to have identifiers (and referring
paths) normalized to `@rpath/<name>`, and store the actual
relative path to the top-level application directory in the
rpath.

Using `@rpath`, however, means that we now need to ensure that
our rpath is the only one set in the binary; i.e., we need to
remove all other rpaths, and add our target rpath if necessary.
To do so, we need to use the external macOS-provided
`install_name_tool` utility, which we now use for all header
modifications: modification of dylib identifier, modification
of dependend dylib paths, and removal/addition of rpaths.
@rokm rokm force-pushed the macos-rewrite-dylib-paths-using-rpath branch from 79fe2a0 to 67eb981 Compare May 27, 2023 19:46
@rokm rokm marked this pull request as ready for review May 27, 2023 20:01
@rokm
Copy link
Member Author

rokm commented May 27, 2023

The dylib identifier mismatch came up while I was looking into pyinstaller/pyinstaller-hooks-contrib#591.

The torchtext/_torchtext.so is originally linked against libtorchtext.so with @rpath-based path:

$ otool -L venv/lib/python3.10/site-packages/torchtext/_torchtext.so 
venv/lib/python3.10/site-packages/torchtext/_torchtext.so:
	@rpath/_torchtext.so (compatibility version 0.0.0, current version 0.0.0)
	@rpath/libtorchtext.so (compatibility version 0.0.0, current version 0.0.0)
	@rpath/libtorch_python.dylib (compatibility version 0.0.0, current version 0.0.0)
	@rpath/libc10.dylib (compatibility version 0.0.0, current version 0.0.0)
	@rpath/libtorch.dylib (compatibility version 0.0.0, current version 0.0.0)
	@rpath/libtorch_cpu.dylib (compatibility version 0.0.0, current version 0.0.0)
	@loader_path/.dylibs/libc++.1.0.dylib (compatibility version 1.0.0, current version 1300.36.0)
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1319.0.0)

but there is actually no rpath set in this binary.

So the libtorchtext.so, which is located at torchtext/lib/libtorchtext.so actually cannot be loaded by dynamic lib loader. And it is also (correctly!) proclaimed as missing by our binary dependency scanner, and thus a --collect-binaries tortchtext is required...

The only reason the package actually works is that it pre-loads the torchtext/lib/libtorchtext.so via a wrapper for ctypes.CDLL before importing _torchtext.

So in the frozen application, even if we collect the offending shared library via --collect-binaries, the pre-loading trick does not work, because after the path rewriting, the dylib identifier of collected torchtext/lib/libtorchtext.so ends up being @loader_path/../../libtorchtext.so, while torchtext/_torchtext.so is referring to it as @loader_path/../libtorchtext.so, and the pre-loaded instance is not recognized.


The same sort of problem has been observed with pyexiv2: https://github.com/orgs/pyinstaller/discussions/6404#discussioncomment-1724640.

@rokm rokm merged commit 49fe7b1 into pyinstaller:develop May 30, 2023
18 checks passed
@rokm rokm deleted the macos-rewrite-dylib-paths-using-rpath branch May 30, 2023 09:46
@HealsCodes
Copy link

It would have been nice if there had been an option to force the old-style behaviour.

There's proprietary libraries like ODA out there that have additional runtime modules with a non-dylib extension (.tx*) and the actual dylibs rely specific @loader_path/plugins/ entries in order to be able to load these.

  • pinning our apps to == 5.11.0 for the time being

@rokm
Copy link
Member Author

rokm commented Jun 9, 2023

It would have been nice if there had been an option to force the old-style behaviour.

There's proprietary libraries like ODA out there that have additional runtime modules with a non-dylib extension (.tx*) and the actual dylibs rely specific @loader_path/plugins/ entries in order to be able to load these.

* pinning our apps to == 5.11.0 for the time being

Can you provide a specific example of the paths and identity of such dylibs and plugins, and what the requirement is (i.e., does it require @loader_path prefix)? Because under old behavior, any plugin binary in a plugins directory (provided it was collected as a binary and not as a data file) would have their identity rewritten as @loader_path/../<name>.

I'd really prefer not to go back to the old behavior (or having to support both); for libraries with special requirements such as the ones you mention and which we cannot test against, it might make more sense to undo the changes using the install-name-tool.

@HealsCodes
Copy link

I can understand not wanting to revert your changes as they probably improve the majority of use-cases on macOS.
We already have a python wrapper around pyinstaller in use that does some post-processing on the frozen app bundle.
I'll look into adding a way to specify extra library paths to be re-applied to a subset of the collected dylibs.

(I'll also so see if I can provide a reasonably useable example but I suppose this is a very niece case)

rokm added a commit to fgdroege/pyinstaller that referenced this pull request Jun 20, 2023
…d commands

Extend the new `install_name_tool`-based library path rewriting
(introduced in pyinstaller#7664) to the following list of load commands:
`LC_LOAD_DYLIB`, `LC_LOAD_UPWARD_DYLIB`, `LC_LOAD_WEAK_DYLIB`,
`LC_PREBOUND_DYLIB`, and `LC_REEXPORT_DYLIB`.

This should match the behavior of the old `macholib`-based path
rewriting.
rokm added a commit that referenced this pull request Jun 21, 2023
…d commands (#7704)

Extend the new `install_name_tool`-based library path rewriting (introduced in #7664) to the following list of load commands: `LC_LOAD_DYLIB`, `LC_LOAD_UPWARD_DYLIB`, `LC_LOAD_WEAK_DYLIB`, `LC_PREBOUND_DYLIB`, and `LC_REEXPORT_DYLIB`.

This should match the behavior of the old `macholib`-based path rewriting.

---------

Co-authored-by: Rok Mandeljc <rok.mandeljc@gmail.com>
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 9, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants