New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clean up bindepend and building codebase #7784
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Fix RemovedCipherFeatureError to display the whole error message instead of just the passed argument.
Remove the deprecated and ignored `key` argument from `main`.
rokm
force-pushed
the
bindepend-cleanup
branch
2 times, most recently
from
July 23, 2023 20:10
d34f707
to
3056181
Compare
Remove support for generating external (side-by-side) manifest for the Windows executable. This has been available for onedir builds, but is now broken due to relocation of all bundle contents into sub-directory. Instead of trying to fix it, simply drop the option; if anyone needs to post-process the manifest, they can used 3rd party tools to extract and re-embed it.
… to EXE Defer building of the default executable manifest from analysis stage to the EXE instantiation. Up until now, the default manifest was built in the analysis stage, because scanning of the python executable for its dependencies also added potential WinSxS dependencies to the generated manifest. This "dependency" manifest was then picked up by EXE and used to generate the final version (with applied uac_admin and uac_uiaccess options). However, contemporary python versions have no WinSxS dependencies (python 2.7 and <= 3.3 used to reference VC9 runtime this way) except for MS CommonControls, which we can reference in our default manifest template, without having to rely on the results of the said analysis. So, move all executable-manifest-related processing into EXE.
Remove support for analysis and collection of WinSxS assemblies. The amount of code involved is disproportionate to the chance of encountering a binary making use of WinSxS (and the required assembly not being available on the target system) in contemporary python and python packages (e.g., VC9 runtime from VS2008 was the last VC runtime that made use of WinSxS; and was used to build python 2.7 and python <= 3.3).
Remove helpers for dealing with zipped eggs, as part of wider clean-up that will remove support for zipped eggs.
With support for zipped eggs being removed, there is no need to check whether the path to otherwise non-existant file happens to point inside of the egg. Also remove the `is_path_to_egg` from `depend.utils`.
bwoodsend
approved these changes
Jul 29, 2023
Remove the `building.toc_conversion` module and its `DependencyProcessor` class. The main purpose of this class was aggregation of `binaries` and `datas` TOC lists returned from the hooks, with additional collection of files from eggs. In contemporary python environments, however, that extra data collection should be handled by hooks, which should work with non-zipped eggs as well as wheel-installed packages. So remove the `DependencyProcessor` and replace it with two helper methods on `PyiModuleGraph` that return `binaries` and `datas` TOC lists for entries returned by hooks. This removes the support for zipped eggs from PyInstaller; the `zipfiles` and `zipped_data` TOC lists of Analysis should now be always empty.
The `zipfiles` and `zipped_data` TOC lists are now always empty, so remove them from the spec templates, and remove their references from the docs (mostly from examples of spec files). They can still be passed to the corresponding targets, so existing .spec files are not affected.
In binary-vs-data-reclassification tests, adjust the indices of `binaries` and `datas` TOCs in the serialized `Analysis` data struct; this is needed because elements were removed from the object's guts. Fix `test_egg_unzipped`. The test expects the data file from egg to be automatically collected, but it should in fact be collected via hook. So add the said hook. Remove the `test_egg_zipped`, as zipped eggs are not supported anymore. Remove the egg-based variant of tests for `pkg_resources` provider. As we do not support zipped eggs anymore, drop the (zipped) egg variant of pkg_resources provider tests.
Rename the `checkCache` helper to `process_collected_binary`, and clean up naming of internal variables. Also use `exist_ok=True` with `os.makedirs` instead of explicitly checking if the directory exists.
Remove special error message about installing python-dev package if pyconfig.h cannot be found. Even when we were trying to collect that file, the general-purpose `format_binaries_and_datas` helper was not the best place for such a message...
Reorder the functions in `depend.bindepend` module, and group them by functionality (high-level import analysis, low-level import analysis, library full-path resolution, etc.). This commit should make it easier to track the changes made in functions themselves in the subsequent commits.
Rename getImports* functions to get_imports*, and clean them up.
During analysis, avoid performing separate binary dependency analysis on python executable, and always use the `get_python_library_path` helper from `bindepend` instead. This ensures that the same search logic is always employed in both analysis and later on in EXE/PKG construction, where we search for python shared library again using `get_python_library_path` (which is necessary due to inability to pass the value from Analysis to EXE without breaking existing .spec design). Also perform the second `get_python_library_path` in EXE and store the result, which can be used both in `EXE.assemble` (in macOS codepath) as well as passed on to PKG, to avoid having to call it there again.
Sort the obtained imports and print them on separate lines to make them easier to read.
Rework the _get_imports_macholib helper, especially the part handling run paths. Instead of trying to manually substitute @loader_path and @executable_path in an @rpath, record the run path entries as they are; when prefixing the library path with run-paths to obtain candidate paths, have `dyld_find` from `macholib` deal with @loader_path and @executable_path prefix in the resulting library candidate path.
Up until now, `get_imports` (and its predecessor) had inconsistent behavior w.r.t. resolution of full paths to the referenced libraries. On Unix-like systems, path resolution is performed by ldd, so the corresponding codepath always returned full paths. On macOS, referenced libraries were explicitly resolved in post-processing codepath. On Windows, no resolution is attempted. Rework the get_imports and its helpers to always (attempt to) resolve the library paths. In addition, return (name, fullpath) tuples, where name is the referenced library name (which, depending on the platform, may be just a basename or not). The platform-specific behavior w.r.t referenced name is allowed in order to give caller more context when library cannot be resolved (e.g., a system-provided library on macOS). This will also allow us to move the warning messages out of the get_imports and let the caller handle them as neccessary. The `get_imports` helper for linux (based on `ldd`) and macOS (based on `macholib`) now fall back to searching the supplied search paths (if any) when the base platform-specific resolution fails. This will provide a mechanism for suppressing missing-library warnings in cases libraries cannot be resolved due to missing rpath but could be found in a search path provided by the corresponding hook. The library path resolution helpers have been consolidated and cleaned up. The high-level function is now called `resolve_library_path` (formerly `findSystemLibrary`), and the low-level helper for Unix (formerly called `findLibrary`) has been renamed to `_resolve_library_path_unix`. Compatibility aliases for `findSystemLibrary` and `findLibrary` are made (both pointing to high-level `resolve_library_path`) due their use in contributed hooks repository. The Windows-specific `getfullnameof` helper has been split into generic `_resolve_library_path_in_search_paths` (which is also used by `get_imports` helpers for search path fallback), while the Windows-specific search paths are now handled in Windows-specific codepath of `resolve_library_path`.
Consolidate the `Dependencies` and `selectImports` helper functions into single `binary_dependency_analysis` function, and get rid of global `seen` variable.
Replace the manifest parser (and processor) in winmanifest module with a simplified version. The former parser/processor included code for WinSxS assembly processing, look-up and binding redirection; with support for WinSxS processing dropped from PyInstaller, this is not needed anymore. If provided with manifest XML, the former parser attempted to load it into its structures, and then write them back into XML when embedding the manifest into the executable. Due to processing only a subset of manifest XML schema, this inherently meant that elements unknown to the parser were lost in the process. Which in turn negated main reason for providing external manifest (aside from being able to set custom application name and version). The new manifest processing code encodes the default manifest as an XML, making it easier to read and edit. When creating application manifest, the base XML (the template or externally-provided XML) is processed just to: * write uac_admin and uac_uiaccess options into corresponding attributes of the <requestedExecutionLevel> element (creating it and its parents if necessary) * ensure that <dependency> element exists for MS Common-Controls dependent assembly, and that the attributes of corresponding <assemblyIdentity> element are set to the expected values (this is required by windowed bootloader for the unhandled exception dialog). All other parts of the manifest XML are left untouched. The new manifest processing does not require writing manifest to the disk any more, as all processing is done in-memory. The winmanifest module can now be imported on all platforms, as `winresource` module is imported only within the functions that read and write the manifest from/to the executable.
Having icon group with ID 0 seems to abort the enumeration of resources using `winresource.GetResources` helper with `win32ctypes.pywin32.pywintypes.error: (15106, 'EnumResourceNamesW', 'User stopped resource enumeration.')`. Presumably this is because the 0 ID evaluates to falsely value somewhere in the underyling win32-ctypes code. Even if this is underlying win32-ctypes problem, Windows executables seem to have their icon group IDs start with 1, so keep in line with that.
Move the code for embedding or copying win32 resources to a separate helper function to make code in `assemble` easier to follow. Improve resource parameters validation and error messages, and raise exceptions on errors instead of swallowing them.
Turn the function-local makeabs() helper function (used for icons) into a static method called _makeabs(), so it can be reused in several other places where relative paths need to be anchored to the spec file location.
Reduce the amount of functions to the ones that are actually used, and clean up the names and implementation.
As the missing-library warnings are now handled by the high-level binary dependency analysis function, we can be a bit smarter about them, and suppress the warnings for dependencies that happen to be collected anyway. This might happen either because the library was collected by some other mechanism (for example, via hook, or supplied by the user), or because it was discovered during the analysis of another binary (which, for example, had properly set run-paths on Linux/macOS or was located next to that other analyzed binary on Windows). Therefore, we defer the missing-library warnings until after binary dependency analysis is complete; at that point, we can compare the basenames of unresolved dependencies to basenames of discovered (collected) binaries for improved warning suppression.
The global missing-lib warning suppression for libshiboken should now be redundant, thanks to the improved missing-lib warning suppression from the preceding commit.
On Windows, have `resolve_library_path` first process the supplied search paths, and then fall back to searching only locations found in the `PATH` environment variable. I.e., change the old behavior for Windows library path resolution using the former `getfullnameof`, where following the supplied search paths, we would search all locations in `sys.path`, then attempt to search pywin32's `pywin32_system32` directory, then search Windows' system directories, and then locations in `PATH`. We do not search `sys.path` anymore, because doing so may lead to corner-case issues when system-installed programs modify PYTHONPATH to point to directories that happen to contain program's bundled DLLs (pyinstaller#5560). Adding `pywin32_system32` directory to search paths is now handled further up the call chain, in `find_binary_dependencies` helper of the `building.build_main` module, where other search paths are set as well. The search for `pywin32_system32` directory (and other `pywin32` directories) is now done by treating the directory as a namespace package and looking up its path, which should allow us to find it without assuming the prefix path of the `site-packages` directory. This also removes the use of deprecated `distutils.sysconfig` and its `get_python_lib` function, which was triggerring pyinstaller#7347. Also note that `PATH` should already contain Windows' system directories, so we can avoid explicitly searching paths returned by `winutils.get_system_path()`.
And remove unnecessary intermediate variable in `get_system_path`.
There is little point in trying to work around broken pywin32 installations on our side; if pythoncom/pywintypes cannot be imported in the given python environment as-is, the user cannot run and test their unfrozen program, either. Remove the `PyInstaller.utils.hooks.win32` module and move its only function, `get_pywin32_module_file_attribute` into `PyInstaller.utils.hooks`. The two hooks in contributed hooks repository that make use of this function are importing it from this package, anyway. Reduce the implementation into simple `get_module_attribute(name, '__file__')` call.
Remove part of the code that extends sys.path with contents of the `os.path.join(sys._MEIPASS, 'eggs')`, as we should not be collecting eggs there anymore.
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This cleanup was initially aimed at
PyInstaller.depend.bindepend
, and attempting to sort outgetImports
,selectImports
,Dependencies
,getfullnameof
. These are now reworked into low-levelget_imports
and high-levelbinary_dependency_analysis
, and the name resolving functionality has been consolidated intoresolve_library_path
(with compatibility aliasesfindLibrary
andfindSystemLibrary
pointing to it, because some hooks in the contrib repository are making use of that).The low-level
get_imports
and its platform-specific implementations do not generate missing-lib warnings anymore; instead, they return list of tuples in form of(ref_name, lib_path)
, whereref_name
is the referenced dependency name (a base-name, or something like@rpath/dependency.dylib
) andlib_name
is resolved library full-path, if resolved (andNone
otherwise). The missing-lib warnings are now handled by the caller (e.g., the high-levelbinary_dependency_analysis
), who has more information available to decide whether to emit the warning or not.Further clean up spread to other parts of the codebase, and involves removal of:
support for analyzing, collecting, and processing WinSxS assemblies. These are mostly a thing of the past (last VC runtime that used WinSxS was VC9 from VS 2008, used to build Python 2.7 and 3.3). Considering the amount of (poorly-tested) code required to support this, and considering that mangling of collected assembly manifests went unnoticed for a rather long time (Error in Microsoft.VC90.CRT.manifest file generates side by side issue #6509), I'm opting for removal of this feature. If someone requires collection of binaries compiled with VS2008 and requires the VC9 runtime to be automatically collected, they should stick to PyInstaller 5.x series.
support for zipped eggs. Similar to the above, the related codepaths are slowly bit-rotting due to lack of testing, both on our CI and in the wild. In addition, remove the attempt at collecting data files from egg-installed packages (zipped or non-zipped); this should be handled by hooks, regardless of how the package is installed, in order to have consistent and predictable behavior.
The removal of WinSxS processing simplifies the processing of collected binaries, and allows us to replace
PyInstaller.utils.win32.winmanifest
with a simplified version that just applies theuac_admin
anduac_uiaccess
settings and ensures that MS Common-Controls dependency is listed among dependencies. This is done by searching and modifying the corresponding DOM nodes in the manifest, which means that any other elements in user-provided manifest are preserved (as opposed to old approach that parsed the manifest into internal structures, made modifications, and generated output XML, thus removing any elements that it did not recognize, even if they were valid).The command-line options
--win-private-assemblies
and--win-no-prefer-redirects
have been removed; the corresponding arguments toAnalysis
have been deprecated and raise error if not set toTrue
(the old default), and they have been removed from .spec template. The option for external executable manifest in onedir mode has also been removed;--no-embed-manifest
command line option is gone, and correspondingembed_manifest
argument toEXE
has been deprecated to raise an error if not set toTrue
(the old default).As part of the cleanup, binary dependency analysis does not search
sys.path
anymore, so we can close #5560. In addition, it does not use deprecateddistutils.sysconfig
module, which should prevent an issue withtensorflow
and its plugins; hence closes #7347.Closes #6509 as collection from WinSxS is not supported anymore - that particular issue was likely "fixed" by c5a64a0, but only because it accidentally disables the processing of manifests unless caching is required (i.e., upx or strip is enabled)...