Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

building: validate binaries returned by analysis of ctypes calls #7984

Merged
merged 1 commit into from Oct 5, 2023

Conversation

rokm
Copy link
Member

@rokm rokm commented Oct 5, 2023

Validate binaries returned by analysis of ctypes calls in collected modules; the analysis might return files that are in PATH but are not binaries, which then cause errors during binary dependency analysis. An example of such problematic case is the gmsh package on Windows, where ctypes.util.find_library('gmsh') ends up resolving the python script called gmsh in the environment's Scripts directory (see #7982).

Validate binaries returned by analysis of `ctypes` calls in collected
modules; the analysis might return files that are in `PATH` but are
not binaries, which then cause errors during binary dependency analysis.
An example of such problematic case is the `gmsh` package on Windows,
where `ctypes.util.find_library('gmsh')` ends up resolving the python
script called `gmsh` in the environment's Scripts directory.
@rokm
Copy link
Member Author

rokm commented Oct 5, 2023

This is the last-line-of-defense kind of thing. In the future, we might want to also remove the Scripts directory from library search paths (both in ctypes-based analysis and the regular binary dependency analysis).

@bwoodsend
Copy link
Member

Doesn't that mean that there is still a gmsh.dll which we are not collecting but should? Shouldn't we be swapping out the ctypes.util.find_library() for an implementation which does the filtering and keeps searching if an invalid file is picked up?

@rokm
Copy link
Member Author

rokm commented Oct 5, 2023

Doesn't that mean that there is still a gmsh.dll which we are not collecting but should? Shouldn't we be swapping out the ctypes.util.find_library() for an implementation which does the filtering and keeps searching if an invalid file is picked up?

Indeed. But it is actually called gmsh-4.11.dll, and lives in the environment's Lib, which is not in PATH, and thus cannot be discovered by ctypes.util.find_library().

Which is why the gmsh search code looks like this (the released version is 4.11, the git version is 4.12):
https://gitlab.onelab.info/gmsh/gmsh/-/blob/master/api/gmsh.py?ref_type=heads#L41-86

[...]
moduledir = os.path.dirname(os.path.realpath(__file__))
parentdir1 = os.path.dirname(moduledir)
parentdir2 = os.path.dirname(parentdir1)

if platform.system() == "Windows":
    libname = "gmsh-4.12.dll"
elif platform.system() == "Darwin":
    libname = "libgmsh.4.12.dylib"
else:
    libname = "libgmsh.so.4.12"

# Searching lib in various subfolders
libpath = None
possible_libpaths = [os.path.join(moduledir, libname),
                     os.path.join(moduledir, "lib", libname),
                     os.path.join(moduledir, "Lib", libname),
                     # first parent dir
                     os.path.join(parentdir1, libname),
                     os.path.join(parentdir1, "lib", libname),
                     os.path.join(parentdir1, "Lib", libname),
                     # second parent dir
                     os.path.join(parentdir2, libname),
                     os.path.join(parentdir2, "lib", libname),
                     os.path.join(parentdir2, "Lib", libname)
                     ]

for libpath_to_look in possible_libpaths:
    if os.path.exists(libpath_to_look):
        libpath = libpath_to_look
        break

# if we couldn't find it, use ctype's find_library utility...
if not libpath:
    if platform.system() == "Windows":
        libpath = find_library("gmsh-4.11")
        if not libpath:
            libpath = find_library("gmsh")
    else:
        libpath = find_library("gmsh")

# ... and print a warning if everything failed
if (not libpath) or (not os.path.exists(libpath)):
    print("Warning: could not find Gmsh shared library " + libname)
    print("Searched at these locations: " + str(possible_libpaths))

lib = CDLL(libpath)
[...]

So the library is not actually discoverable by ctypes, at least not on Windows. But they still have a codepath for it, and that codepath triggers resolution of gmsh script in Script directory.

@rokm
Copy link
Member Author

rokm commented Oct 5, 2023

So for actually collecting the lib, the user would have to use --add-binary or write a hook that queries gmsh.libpath.

@rokm rokm merged commit 43f3d06 into pyinstaller:develop Oct 5, 2023
18 checks passed
@rokm rokm deleted the validate-ctypes-binaries branch October 5, 2023 17:13
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 5, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants