Specify correct runtime (versus compile time) libraries on Linux #84

Open
manolis-andr opened this issue Apr 17, 2020 · 3 comments
Labels: enhancement, validated (issue has been validated by a maintainer)

Problem

  1. I used ctypesgen (1.0.2) in environment A (Debian Stretch), to generate a Python wrapper for a header file that includes ext2fs.h and declares functions exported by the libext2fs.so library.
ctypesgen <my_header.h> --library libext2fs.so
  2. I tried to use the wrapper in a different environment B (also Debian Stretch). Importing the Python file resulted in the following error:
File "/usr/lib/python2.7/dist-packages/my_pacakge/my_header.py", line 367, in load_library
ImportError: libext2fs.so not found.

However, I could successfully compile in A a C executable that uses the same header file (to link against the shared library I passed -lext2fs to gcc), and I could also successfully run the executable in B without any issues.

Inspecting environment B, I noticed that libext2fs.so did not exist there, since it is part of the e2fslibs-dev package. It makes sense, however, for a development package to be absent from a runtime environment. I also could not pass libext2fs.so.2, which exists in B, to ctypesgen, because that file does not exist in A; again, there is probably no reason for it to exist in A, since I do not need to run an executable linked against libext2fs there. I could pass libext2fs.so.2.4 to ctypesgen, which exists in both environments, but then updating libext2fs to 2.5 in B would needlessly break the wrapper: since the library's major version number would be unchanged, the update would not affect the library's interface, so I should still be able to use the same Python wrapper.

In conclusion, it seems that on Linux, ctypesgen does not differentiate between the compile/link-time libraries and the runtime libraries.

Background

According to my current understanding, ctypesgen loads the provided libraries twice: once at generation time (the processing phase), to determine which library contains each function and variable, and once at runtime, to load the library and call the underlying function. The printer writes the library name into the generated Python file, and at runtime the libraryloader code takes care of finding and loading that library.

With GCC, one links against a shared library using the -l option (-lname), which, after compilation, is passed to the static linker. If I have understood correctly, the linker automatically searches for the development-specific library name, formed by adding the lib prefix and the .so suffix to the name provided (libname.so). Although this can be the ELF library itself, it is usually a symbolic link to the actual ELF file, which carries the full-version suffix (.so.X.Y). The static linker is responsible for recording the dependency on the shared library. However, it does not record the .so name as the dependency, because in another environment this may be a symlink to a different library version, which would break the executable. It does not even record the full-version name (.so.X.Y), since this is more restrictive than necessary. It usually records the ABI version of the library (.so.X), which guarantees that the ABI is compatible so that the executable does not break, while still allowing internal changes in the library that do not affect its interface (or the ABI in general).

As far as I know, implementation-wise, this process is facilitated by the SONAME entry in an ELF library. For most properly built libraries this field contains the ABI library name (.so.X), which implies ABI compatibility at the symbol level. Therefore, the static linker probably opens the .so library and records the library's SONAME as the dependency of the final executable. If the .so library is a symlink, the linker actually opens the real library file (.so.X.Y) and reads the SONAME from there; it should still be the ABI library name (.so.X).

Proposal

If the above holds and I am not missing something, then in ctypesgen's case the user could provide the .so library name through the --library option, and ctypesgen could inspect that library with objdump and record the SONAME value as the library to be loaded at runtime. The same semantics that apply to normal executables would then apply to the wrapper.
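
A minimal sketch of the lookup proposed above, assuming GNU objdump is on the PATH at generation time. The helper names parse_soname and read_soname are hypothetical and are not part of ctypesgen; the sketch relies only on `objdump -p`, which prints the dynamic section, including a SONAME line for libraries that define one.

```python
import re
import subprocess

# Matches a dynamic-section line such as "  SONAME               libext2fs.so.2"
_SONAME_RE = re.compile(r"^\s*SONAME\s+(\S+)\s*$", re.MULTILINE)


def parse_soname(objdump_output):
    """Return the SONAME found in `objdump -p` output, or None if absent."""
    match = _SONAME_RE.search(objdump_output)
    return match.group(1) if match else None


def read_soname(library_path):
    """Run `objdump -p` on library_path and return its SONAME (or None).

    objdump follows symlinks, so passing the development name
    (libname.so) reads the SONAME of the real file it points to.
    Hypothetical helper: not an existing ctypesgen function.
    """
    result = subprocess.run(
        ["objdump", "-p", library_path],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_soname(result.stdout)
```

Called on the development symlink in environment A, read_soname should return the ABI name (.so.X) discussed in the Background section, which could then be written into the generated wrapper instead of the name given to --library.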

This also solves the potential issue of using two distinct environments for generating and using the Python wrappers. Generation can happen in an environment that contains only the .so file that points to .so.X.Y (but whose SONAME is .so.X), and actual use of the wrappers can happen in an environment that contains only the .so.X file that points to .so.X.Y. This is quite a common practice in Debian, which distinguishes the development packages, which contain the files necessary to link against a shared library, from the regular packages, which contain just the shared library itself (ELF).

However, since I am not aware of the project details, what is the rationale behind the --library option? Is there something that I am missing regarding the specification of libraries? Do you think that extending ctypesgen to specify the ABI name of a library as the runtime dependency would make sense?

Thanks in advance and sorry for the length of the issue.

Alan-R changed the title from "Specify correct runtime libraries on Linux" to "Specify correct runtime (versus compile time) libraries on Linux" on Aug 1, 2021
Alan-R added the Type-Enhancement, validated (issue has been validated by a maintainer), and enhancement labels and removed the Type-Enhancement label on Aug 1, 2021

ldo commented Mar 5, 2023

The distinction between development and runtime library names is important. Note that the latter has a version number suffix: this changes whenever some backward-incompatible change is made to the ABI. It makes sense for the generated ctypes wrapper to explicitly load the filename with the correct version suffix, in case some other version is also present on the system.
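
As a rough illustration of loading by versioned name with plain ctypes: the helper load_first below is hypothetical (it is not ctypesgen's loader), and the library name in the usage comment is taken from the report above. It uses only ctypes.CDLL, which raises OSError when a name cannot be loaded.

```python
import ctypes


def load_first(candidates):
    """Return a ctypes.CDLL for the first name in candidates that loads.

    Raises ImportError listing every attempted name if none can be loaded.
    """
    errors = []
    for name in candidates:
        try:
            return ctypes.CDLL(name)
        except OSError as exc:
            errors.append("%s: %s" % (name, exc))
    raise ImportError("no candidate library could be loaded: " + "; ".join(errors))


# Hypothetical usage: only the ABI-versioned runtime name is tried, so a
# library with a different major version is never picked up by accident.
# ext2fs = load_first(["libext2fs.so.2"])
```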

I admit that this is probably not something easy to do with automatically-generated ctypes wrappers. I create all mine by hand, because I feel that is the only way to create a truly Pythonic library wrapper.

mara004 (Contributor) commented Dec 17, 2023

The pypdfium2-team branch of ctypesgen uses ctypes.util.find_library() to locate system libraries, which to my understanding automatically searches by plain name (without prefix/suffix/version), and expands to what it considers the best matching filename, with major version if available. We store this in _libs_info[...]["path"].
Then I believe it should be possible to achieve the OP's goal in a very lean way, by splitting out the version and asserting on it at import time in calling code.
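
A sketch of what such an import-time check might look like in calling code. The helpers so_major and check_major are hypothetical names, and the expected major version is an assumption the caller supplies; only ctypes.util.find_library() is a real API here.

```python
import ctypes.util
import re

# Captures the first numeric component after ".so", e.g. "2" in "libext2fs.so.2.4"
_SO_MAJOR_RE = re.compile(r"\.so\.(\d+)")


def so_major(filename):
    """Return the major version from a lib*.so.X[.Y] name, or None."""
    if filename is None:
        return None
    match = _SO_MAJOR_RE.search(filename)
    return int(match.group(1)) if match else None


def check_major(plain_name, expected_major):
    """Raise ImportError unless find_library() resolves to expected_major.

    plain_name is the bare library name (e.g. "ext2fs"), as accepted by
    ctypes.util.find_library().
    """
    found = ctypes.util.find_library(plain_name)
    if so_major(found) != expected_major:
        raise ImportError(
            "%s: expected major version %s, found %r"
            % (plain_name, expected_major, found)
        )
    return found
```

Calling code could run something like check_major("ext2fs", 2) before importing the generated wrapper, so that an ABI mismatch fails early with a clear message.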

mara004 (Contributor) commented Dec 19, 2023

uses ctypes.util.find_library()

Note, though, that this code path is currently only active if you do not use custom runtime libdirs, i.e. when the library lies in standard system directories. Presumably this would be the case here.
Also, on Linux, find_library() understands LD_LIBRARY_PATH, so you could set that if the library lies in a custom directory. Unfortunately, I'm not aware of a cross-OS solution for using custom libdirs with find_library(), so we still need some code of our own for that.

find_library() unfortunately does not tell us the full path on Linux (see python/cpython#65241), but that's a separate matter, which shouldn't affect the versioning story.
