Let there be symbolic links #7619

Merged
merged 36 commits into from Jul 1, 2023

Commits on Jun 30, 2023

  1. bootloader: refactor file opening when extracting to filesystem

    Split the parent directory creation into separate helper that we
    can reuse elsewhere.
    
    Use low-level helpers for file and directory manipulation
    (checking whether path exists, creating a directory, etc.)
    to reduce amount of OS-specific code in higher-level helpers.
    rokm committed Jun 30, 2023
    c1f2612
  2. bootloader: add support for creating symbolic links

    Introduce new TOC entry type, SYMLINK ('n') and implement
    creation/reconstruction of symbolic links from the CArchive.
    rokm committed Jun 30, 2023
    f123ee3
  3. tests: add tests for symbolic link collection

    The tests verify that when collecting a symbolic link and its
    target file, the link is collected/re-created as symbolic link
    instead of a hard copy. On the other hand, when collecting only
    a symbolic link without its corresponding target, the symbolic
    link is collected as hard copy.
    
    The tests cover three scenarios: symbolic link points to a
    file in the same directory, to a file in sub-directory, and to
    a file in a parent directory. Similarly, they cover both explicit
    collection (symbolic link and corresponding file are added via
    individual `--add-data` switches) and implicit collection (whole
    data directory is added via `--add-data` switch).
    rokm committed Jun 30, 2023
    18c8335
  4. archive: writer: add support for symbolic link entries

    Store the NULL-terminated link target name in the data blob.
    rokm committed Jun 30, 2023
    8b7be25
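
Commit 4 stores the link target as a NULL-terminated name in the data blob. A minimal sketch of such an encoding is shown here; the helper names and the choice of UTF-8 are assumptions made for illustration and are not taken from the archive writer itself.

```python
def encode_link_target(target: str) -> bytes:
    """Return the link target as bytes followed by a single NUL terminator."""
    if "\0" in target:
        raise ValueError("link target must not contain embedded NUL characters")
    # Assumption: the target name is encoded as UTF-8.
    return target.encode("utf-8") + b"\0"


def decode_link_target(blob: bytes) -> str:
    """Recover the link target: everything up to the first NUL byte."""
    return blob.split(b"\0", 1)[0].decode("utf-8")
```

A reader on the other side (for example, a bootloader written in C) can then treat the blob as an ordinary C string.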
  5. building: implement symbolic link support

    Implement basic symlink support in `Analysis` and in `PKG` and
    `COLLECT` build targets.
    
    Towards the end of `Analysis`, we now scan the combined (and
    normalized) `binaries` and `datas` TOC for symbolic links. If
    the original file is also being collected, we convert those
    entries to SYMLINK entries, else we leave them as is (which
    creates a hard copy). The SYMLINK entries are returned via
    the `datas` TOC, even if their original entry came from the
    `binaries` TOC.
    
    The `PKG` build target writes the SYMLINK to the `CArchive`,
    while the `COLLECT` target re-creates the symlink in the
    frozen application directory.
    rokm committed Jun 30, 2023
    c786582
  6. tests: skip symlink tests if OS cannot create symbolic link

    Creating symbolic links in unprivileged mode on Windows 10 requires
    Developer mode being enabled. Therefore, if a symlink test fails
    to create symbolic link in the test data directory, skip the test.
    rokm committed Jun 30, 2023
    ccf9ccf
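
The skip condition from commit 6 boils down to probing whether the OS lets the current user create a symbolic link. The sketch below shows one way to do that; `can_create_symlink` and `probe_in_temp_dir` are hypothetical helpers, not the ones used in the test suite.

```python
import os
import tempfile


def can_create_symlink(directory: str) -> bool:
    """Try to create a throwaway symlink in `directory`; report whether it worked."""
    target = os.path.join(directory, "probe_target.txt")
    link = os.path.join(directory, "probe_link.txt")
    with open(target, "w") as fp:
        fp.write("probe")
    try:
        os.symlink("probe_target.txt", link)
    except (OSError, NotImplementedError):
        # E.g. unprivileged Windows 10 without Developer mode enabled.
        return False
    else:
        os.remove(link)
        return True
    finally:
        os.remove(target)


def probe_in_temp_dir() -> bool:
    """Run the probe in a fresh temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        return can_create_symlink(tmp)
```

A test could call such a probe on its data directory and skip itself (for instance via `pytest.skip`) when the probe returns `False`.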
  7. building: improve processing of TOC entries for symlink discovery

    Skip entries without valid source path (e.g., OPTION entries),
    and skip SYMLINK entries.
    rokm committed Jun 30, 2023
    d51dfab
  8. tests: symlinks: add scenario where file destination changes

    Test the cases when we collect original, but place it in a different
    location. This effectively invalidates symbolic link, so we should
    collect a hard copy.
    rokm committed Jun 30, 2023
    11708ce
  9. building: when evaluating symlinks, check if relationship is preserved

    Check whether the path relationship between collected file and
    symbolic link is preserved in the frozen bundle. If not, create
    a hard copy.
    rokm committed Jun 30, 2023
    5517a3f
  10. tests: symlinks: add scenario where file is collected multiple times

    The original file might be collected multiple times, and the
    symlink processing code should account for that.
    rokm committed Jun 30, 2023
    78b3145
  11. building: symlinks: handle cases when file is collected multiple times

    Extend the symbolic link handling code to properly resolve symbolic
    link in cases when the original file is collected multiple times
    in different locations. If one of those matches the original
    path relationship, the symbolic link is collected as a link,
    otherwise it is collected as a hard copy.
    rokm committed Jun 30, 2023
    c80fd63
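
The rule built up in commits 5, 9 and 11 (keep the link only if the original is collected at a destination that preserves the relative path relationship, otherwise fall back to a hard copy) can be sketched as a pure function. This is an illustration under stated assumptions, not the code from `Analysis`: the function name, the `collected` mapping, and the decision never to preserve absolute links are all hypothetical.

```python
import posixpath


def resolve_symlink(link_src, link_dest, link_target, collected):
    """
    Return the relative target to store in a SYMLINK entry, or None if the
    link should be collected as a hard copy.

    link_src    -- source path of the symbolic link on the build machine
    link_dest   -- destination name of the link inside the frozen bundle
    link_target -- target as stored in the link (e.g. from os.readlink)
    collected   -- dict: source path -> set of destination names in the bundle
    """
    if posixpath.isabs(link_target):
        return None  # assumption: absolute links are never preserved

    # Source file that the link points to.
    target_src = posixpath.normpath(
        posixpath.join(posixpath.dirname(link_src), link_target))
    dest_names = collected.get(target_src)
    if not dest_names:
        return None  # the original file is not collected at all

    # Where the original must live in the bundle for the link to stay valid.
    expected_dest = posixpath.normpath(
        posixpath.join(posixpath.dirname(link_dest), link_target))
    if expected_dest in dest_names:
        return link_target  # one of the collected copies matches
    return None
```

Because `collected` maps each source to a set of destination names, a file that is collected several times is handled naturally: the link survives as long as any one of the destinations matches.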
  12. dd70c14
  13. building: add support for SYMLINK entries to MERGE

    The SYMLINK entries must not be turned into DEPENDENCY entries,
    but they do need to be moved to the "references" list so that
    they end up in `a.dependencies` TOC after the merge.
    rokm committed Jun 30, 2023
    26a3454
  14. bindepend: try to preserve location of dependent binaries

    Enable preservation of parent directory structure for
    dependent binaries on macOS and Linux (and other POSIX
    systems) to match the behavior that was earlier
    implemented only for Windows.
    
    Instead of automatically collecting the detected dependent
    binaries into _MEIPASS, try putting them into corresponding
    package sub-directory (provided they originated from one of
    site-packages paths).
    
    But create a symlink into _MEIPASS to emulate former behavior
    and avoid potential issues with LD_LIBRARY_PATH on Linux and
    breakage on macOS due to library path rewriting logic assuming
    shared libraries are collected into _MEIPASS.
    rokm committed Jun 30, 2023
    9b0c0aa
  15. building: implement automatic Info.plist collection

    On macOS, have Analysis automatically pick up `Info.plist` files
    for .framework bundles from which we collect (the main) binary,
    but only if we preserve the corresponding .framework directory
    layout in the frozen application.
    
    For now, only versioned layout (`QtCore.framework/Versions/5/QtCore`)
    is supported.
    rokm committed Jun 30, 2023
    d236db1
  16. hookutils: qt: clean up QtWebEngine file collection

    Remove the explicit .framework collection (which resulted in
    duplication of data). This is now unnecessary, because we preserve
    original library location and thus .framework directory structure.
    
    So all we really need is to properly collect Helper and Resources
    directories from QtWebEngineCore.framework.
    
    This fixes issues with QtWebEngine in onefile builds on macOS, as
    well as onedir applications being incorrectly named QtWebEngineProcess
    due to "misplaced" Info.plist file from the bundle.
    rokm committed Jun 30, 2023
    347d626
  17. hooks: remove run-time hooks for QtWebEngine

    We do not need to override QTWEBENGINEPROCESS_PATH anymore, as
    the helper executable's location is now correctly inferred from
    the shared library location within the .framework bundle.
    
    And with PySide6/PyQt6 we do not need to disable sandboxing anymore,
    either - it should work with the properly preserved
    QtWebEngineCore.framework bundle layout.
    rokm committed Jun 30, 2023
    0cd6956
  18. tests: enable QtWebEngine tests in onefile mode on macOS

    These should work now.
    rokm committed Jun 30, 2023
    230bb0f
  19. bindepend: further try to preserve .framework bundles

    When we are collecting a shared library into top-level directory
    and this shared library comes from a .framework bundle, re-create
    the .framework bundle directory structure in the top-level
    directory, and create symlink to library for backward compatibility.
    
    The Info.plist should then also be automatically collected as
    part of previously-added collection mechanism.
    
    This functionality aims to further preserve .framework bundle
    structures, this time in cases when they are collected from
    system-wide library directory instead of from a package; for
    example, when collecting Qt .framework bundles from Homebrew
    installation.
    rokm committed Jun 30, 2023
    2c07d99
  20. hookutils: qt: fix QtWebEngineCore helper collection for Homebrew

    With Homebrew python and Qt, we are collecting the Qt libraries
    from system-wide installation, into top-level application
    directory (while preserving the .framework bundle directories).
    
    So the extra helper files from QtWebEngineCore.framework bundle
    need to be collected into QtWebEngineCore.framework directory in
    top-level application directory instead of the sub-directory
    in PySide/PyQt package directory.
    rokm committed Jun 30, 2023
    5707aba
  21. building: fix preserved .framework bundle structure

    When collecting the `Info.plist` file, we need to collect the
    original file from `Versions/<version>/Resources`, rather than
    from symlinked `Resources` in the top-level directory. Otherwise
    `codesign` verification reports `embedded framework contains
    modified or invalid version`.
    
    We also need to (re)create the symlink `Versions/Current` pointing
    to `Versions/<version>` directory from which we collected the
    binary. Framework bundles shipped with python packages may not
    include this, but it is required by `codesign`, so we always
    manually create this symlink (i.e., do not attempt to collect it
    even if it is present).
    
    Neither signing nor verification with `codesign` seems to require
    us to symlink `Versions/Current/Resources` and `Versions/Current/<binary>`
    to the top-level directory, so for now, we do not do that.
    rokm committed Jun 30, 2023
    45745e5
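
The `Versions/Current` step from commit 21 can be illustrated with a few lines of Python. This is only a sketch: `ensure_current_link` is a hypothetical name, and using a relative link target is an assumption that mirrors how framework bundles are usually laid out.

```python
import os
import tempfile


def ensure_current_link(framework_dir: str, version: str) -> str:
    """(Re)create Versions/Current as a relative symlink to Versions/<version>."""
    versions_dir = os.path.join(framework_dir, "Versions")
    if not os.path.isdir(os.path.join(versions_dir, version)):
        raise FileNotFoundError(f"no Versions/{version} directory in {framework_dir!r}")
    current = os.path.join(versions_dir, "Current")
    if os.path.islink(current):
        os.unlink(current)  # always re-create instead of trusting an existing link
    os.symlink(version, current)
    return current


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        framework = os.path.join(tmp, "QtCore.framework")
        os.makedirs(os.path.join(framework, "Versions", "5"))
        link = ensure_current_link(framework, "5")
        print(link, "->", os.readlink(link))
```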
  22. building: BUNDLE: add optional strict bundle verification

    Add optional strict verification of the generated bundle w.r.t.
    codesign requirements. It can be enabled by setting the
    `PYINSTALLER_VERIFY_BUNDLE_SIGNATURE` environment variable to
    a value different from 0, and is meant to supplement the existing
    `PYINSTALLER_STRICT_BUNDLE_CODESIGN_ERROR`.
    
    The first step is verification with `codesign --verify`; it seems
    that bundles can fail verification even though preceding signing
    command succeeded without complaints.
    
    The second step is scanning the extended attributes of all files
    collected in the bundle, to see if any of them store the
    code-signing data. This indicates that the files were not relocated
    properly (e.g., data files being located outside of the
    `Contents/Resources` directory) and risk losing the codesign
    data when bundle is transferred using methods that do not preserve
    the extended attributes.
    rokm committed Jun 30, 2023
    0f8176e
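
The opt-in switch from commit 22 is described as "a value different from 0". A minimal sketch of such a check follows; the function name is hypothetical, and treating an unset variable as disabled and comparing the raw string against `"0"` are assumptions about the actual implementation.

```python
import os


def strict_verification_enabled(environ=None) -> bool:
    """Return True if PYINSTALLER_VERIFY_BUNDLE_SIGNATURE is set to a value other than 0."""
    if environ is None:
        environ = os.environ
    # Assumption: an unset variable means the verification is disabled.
    return environ.get("PYINSTALLER_VERIFY_BUNDLE_SIGNATURE", "0") != "0"
```

Passing the environment as an argument keeps the check easy to exercise in tests without modifying `os.environ`.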
  23. ci: enable strict macOS bundle verification

    Enable `PYINSTALLER_STRICT_BUNDLE_CODESIGN_ERROR` and
    `PYINSTALLER_VERIFY_BUNDLE_SIGNATURE`, which will force us to
    mop up the codesign-related issues, such as dots in the directory
    names and Qt data files not being relocated to `Contents/Resources`.
    rokm committed Jun 30, 2023
    6baab1f
  24. building: fix preserved .framework bundle structure, try 2

    While the well-formed versioned .framework bundles should contain
    their `Info.plist` in the `Versions/<version>/Resources` directory,
    we are likely to come across the bundles that contain `Info.plist`
    in the top-level `Resources` directory (while the binary is in the
    versioned directory). This seems to be the case with contemporary
    `PySide2`, `PyQt5`, and `PyQt6`.
    
    Accommodate such cases, but ensure that the file is collected into
    versioned `Resources` directory, which seems to be a requirement
    for code-signing.
    rokm committed Jun 30, 2023
    6ebf20b
  25. building: BUNDLE: rewrite the resource relocation mechanism

    Redesign and rewrite the resource relocation mechanism used to divert
    data files into `Contents/Resources` directory while keeping the
    binaries in `Contents/MacOS`.
    
    The relocation is now performed as a pre-processing step on the input
    TOC, and produces the "final" TOC, which is then used by the `assemble`
    method to perform the actual collection. The separation of relocation
    step from the collection step makes it easier to make adjustments to
    the relocation implementation. At the same time, it significantly
    simplifies the collection itself, with the code there now being very
    similar to what we have in the COLLECT target.
    
    The relocation mechanism has been completely redesigned. We now
    relocate ALL data files to `Contents/Resources`. That includes the
    source .py and byte-compiled .pyc files (which were exempted from
    relocation in # in an attempt to fix `cv2` loader scripts expecting
    to find the extension binary next to source .py file). It also includes
    data files from the PySide2/PySide6/PyQt5/PyQt6 which have been
    explicitly excluded from relocation up until now.
    
    The nested .framework bundles are treated as monolithic BINARY
    entities, and thus kept in `Contents/MacOS` (i.e., the DATA entries
    from within such .framework bundles are not relocated).
    
    While previously only data files (and data-only directories) have
    been symlinked from `Contents/Resources` back to `Contents/MacOS`,
    we now automatically cross-link all files between the two directory
    structures in an attempt to maintain the illusion of mixed-content
    directories at either location. Data-only and binary-only directories
    are cross-linked at directory level, while files in mixed-content
    directories are cross-linked at individual file level.
    
    An automatic work-around for directories in `Contents/MacOS` that
    have dot in their name is implemented; the directory is created
    with a modified name, and a symbolic link pointing to it is created
    under the original name.
    
    A series of tests has been added to verify the basic rules of
    relocation.
    rokm committed Jun 30, 2023
    0b39689
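
The dot work-around described in commit 25 (create the directory under a modified name and place a symbolic link under the original name) can be sketched as follows. The helper name and the `__dot__` substitution are assumptions chosen for illustration; the commit message does not state which modified name is used.

```python
import os
import tempfile


def create_dir_with_dot_workaround(parent: str, name: str) -> str:
    """Create directory `name` under `parent` and return the real directory path.

    If `name` contains a dot, the real directory gets a sanitized name and a
    relative symlink with the original name points to it.
    """
    if "." not in name:
        path = os.path.join(parent, name)
        os.makedirs(path, exist_ok=True)
        return path

    safe_name = name.replace(".", "__dot__")  # hypothetical naming scheme
    real_dir = os.path.join(parent, safe_name)
    os.makedirs(real_dir, exist_ok=True)
    link = os.path.join(parent, name)
    if not os.path.lexists(link):
        os.symlink(safe_name, link)  # relative link keeps the bundle relocatable
    return real_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(os.path.basename(create_dir_with_dot_workaround(tmp, "python3.11")))
        print(sorted(os.listdir(tmp)))
```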
  26. building: fix preserved .framework bundle structure, try 3

    As part of post-processing of collected files from .framework bundles,
    re-create the symlinks to the content within the `Versions/Current`
    (which itself is a re-created symlink to the `Versions/<version>`)
    from the top-level .framework directory: the binary file, the
    Resources directory, and auxiliary directories, such as Helpers.
    
    While this is a generic solution, it is primarily aimed at
    seamlessly accommodating the QtWebEngine in contemporary PySide/PyQt,
    where the QtWebEngineProcess helper process is sought in the
    top-level Helpers directory and the resources are sought in the
    top-level Resources directory. On the other hand, we need to collect
    these directories in the `Versions/<version>` directory to be
    compliant with codesign...
    rokm committed Jun 30, 2023
    ac388df
  27. hookutils: qt: fix collection of extras from QtWebEngineCore.framework

    The Resources and Helpers directory that we collect from the
    QtWebEngineCore.framework need to be collected in the versioned
    directory (Versions/<version>/Helpers and Versions/<version>/Resources)
    in order to be compliant with codesign, so adjust the paths
    accordingly.
    
    The .framework bundles shipped with contemporary PyPI PyQt/PySide wheels
    are naturally not well-formed, so we need to fall back to collecting
    stuff from top-level .framework directory. But collect into versioned
    directory to be compliant with codesign.
    
    The helper process and resources are actually sought in the top-level
    .framework directory (which is why everything except the codesign
    worked so far) - this is now handled by symlinks to the top-level
    directory that are automatically created by .framework post-processing
    code in the `PyInstaller.utils.osx.collect_files_from_framework_bundles`
    helper function (see preceding commit).
    rokm committed Jun 30, 2023
    2c6df95
  28. building: BUNDLE: use xattr package for extended attribute check

    Instead of using system-provided `/usr/bin/xattr` utility to list
    the extended attributes on collected files, add an optional
    dependency on `xattr` package and use `xattr.listxattr` function.
    
    Turns out that system-provided `/usr/bin/xattr` is in fact just an
    entry-point provided by the `xattr` package installed in the
    system-provided python, which on macOS <= 11 is still python2.
    Therefore, trying to run it in a sub-process from python3 might
    lead to weird compatibility issues due to python3 site-packages
    leaking into the python2 process, as seen on our CI:
    
    ```
     output: Traceback (most recent call last):
      File "/usr/bin/xattr", line 8, in <module>
        from pkg_resources import load_entry_point
      File "/Users/runner/hostedtoolcache/Python/3.10.11/x64/lib/python3.10/site-packages/pkg_resources/__init__.py", line 1423
        local = f"sanitized.{_safe_segment(rest)}".strip(".")
                                                  ^
    SyntaxError: invalid syntax
    ```
    rokm committed Jun 30, 2023
    25ccae5
  29. rthooks: pyqt/pyside: extend qml import paths in macOS .app bundles

    To satisfy codesign requirements, we are forced to split the
    collected `qml` directory into two parts; one that keeps only
    binaries (rooted in the `Contents/MacOS`) and one that keeps
    only data files (rooted in the `Contents/Resources`), with files
    from one directory tree being symlinked to the other to maintain the
    illusion of a single mixed-content directory.
    
    As Qt seems to compute the identifier of its QML components based
    on location of the `qmldir` file w.r.t. the registered QML import
    paths, we need to register both paths, because the `qmldir` file
    for a component could be reached via either directory tree.
    rokm committed Jun 30, 2023
    4738bb1
  30. building: BUNDLE: relocate dylibs to Contents/Frameworks directory

    When assembling macOS .app bundle, put the dylibs and nested
    .framework bundles into `Contents/Frameworks` directory, so that
    only the program executable is left in the `Contents/MacOS`.
    
    This requires us to relocate `sys._MEIPASS` from `Contents/MacOS`
    (= parent of the executable directory) to `Contents/Frameworks`
    in lieu of having to cross-link everything back (which would
    defeat the purpose of the relocation in the first place).
    
    A side effect is that `sys._MEIPASS` does not correspond to
    `os.path.dirname(sys.executable)` in onedir .app bundles anymore.
    But that never held for onefile builds (regular or .app bundles),
    and may not hold for regular onedir builds in the near future.
    rokm committed Jun 30, 2023
    22831f6
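
A practical consequence of commit 30 is that application code should locate bundled files via `sys._MEIPASS` rather than via `os.path.dirname(sys.executable)`. A minimal sketch of that pattern is shown here; the helper names and the explicit `fallback` argument (used when running unfrozen) are assumptions for illustration.

```python
import os
import sys


def bundle_dir(fallback: str) -> str:
    """Return sys._MEIPASS when running frozen, else the given fallback directory."""
    return getattr(sys, "_MEIPASS", fallback)


def resource_path(relative_name: str, fallback: str) -> str:
    """Build the full path of a bundled resource."""
    return os.path.join(bundle_dir(fallback), relative_name)
```

In an unfrozen run, the caller would typically pass the directory of its own source file as `fallback`.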
  31. analysis: implement automatic binary vs data (re)classification

    Perform automatic BINARY vs. DATA (re)classification for TOC
    entries received from user via Analysis' input arguments and
    from hooks. These entries might have incorrect typecode due to
    various reasons: user passing binaries as datas or vice versa, a
    hook brute-force collecting a directory as datas even though it
    also contains binaries, or due to incorrect classification done
    by our hook utilities.
    
    Automatic (re)classification helps ensure that *all* binaries
    undergo proper binary dependency analysis.
    
    Furthermore, in generated macOS .app bundles, we rely on proper
    data/binary typecodes to put the files into their corresponding
    directory structure (`Contents/Resources` vs `Contents/MacOS` or
    `Contents/Frameworks`) to satisfy code-signing requirements.
    
    The classification is currently implemented for:
     - Windows: attempt to open the file using `pefile`
     - macOS: attempt to open the file using `macholib`
     - Linux: see if `objdump -a` recognizes the file
    
    On these platforms, the TOC entry's typecode is automatically
    corrected and the entry is moved to the correct TOC list (`binaries`
    or `datas`). On other platforms, no changes are made.
    rokm committed Jun 30, 2023
    da3bd9d
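
To illustrate the idea behind commit 31, the sketch below reclassifies `(dest_name, src_name, typecode)` entries by looking at the first bytes of each file. Note that this is a deliberately simplified stand-in: the commit uses `pefile`, `macholib`, and `objdump -a` on the respective platforms, whereas this example only compares well-known magic numbers and does so regardless of platform. All function names are hypothetical.

```python
# Leading bytes that identify common executable formats.
ELF_MAGIC = b"\x7fELF"
PE_MAGIC = b"MZ"  # DOS stub that precedes the PE header
MACHO_MAGICS = {
    b"\xfe\xed\xfa\xce", b"\xce\xfa\xed\xfe",  # 32-bit, both byte orders
    b"\xfe\xed\xfa\xcf", b"\xcf\xfa\xed\xfe",  # 64-bit, both byte orders
    b"\xca\xfe\xba\xbe",  # fat/universal binary (also used by Java .class files)
}


def classify_header(header: bytes) -> str:
    """Return 'BINARY' or 'DATA' based on the first bytes of a file."""
    if header.startswith(ELF_MAGIC) or header.startswith(PE_MAGIC):
        return "BINARY"
    if header[:4] in MACHO_MAGICS:
        return "BINARY"
    return "DATA"


def classify_file(path: str) -> str:
    """Classify a file on disk by reading its first four bytes."""
    with open(path, "rb") as fp:
        return classify_header(fp.read(4))


def reclassify(entries, classify=classify_file):
    """Split (dest_name, src_name, typecode) entries into binaries and datas,
    replacing the incoming typecode with the detected one."""
    binaries, datas = [], []
    for dest_name, src_name, _old_typecode in entries:
        typecode = classify(src_name)
        bucket = binaries if typecode == "BINARY" else datas
        bucket.append((dest_name, src_name, typecode))
    return binaries, datas
```

A magic-number check is much cruder than actually parsing the file, which is presumably why the commit relies on dedicated parsers instead.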
  32. building: use os.makedirs(exist_ok=True) when creating directories

    When creating parent directory structure for collected files
    in COLLECT and BUNDLE, avoid explicitly checking whether the
    directory already exists. Instead, call `os.makedirs` with
    `exist_ok=True` and catch the `FileExistsError`.
    rokm committed Jun 30, 2023
    8ef40c2
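
The pattern from commit 32 looks roughly like the sketch below. `ensure_parent_dir` is a hypothetical helper, and converting the `FileExistsError` into a `RuntimeError` is an assumption; the commit message does not say what happens after the exception is caught.

```python
import os
import tempfile


def ensure_parent_dir(dest_path: str) -> None:
    """Create the parent directory of `dest_path` without a prior existence check."""
    parent = os.path.dirname(dest_path)
    if not parent:
        return
    try:
        os.makedirs(parent, exist_ok=True)
    except FileExistsError as exc:
        # Still raised when `parent` exists but is not a directory.
        raise RuntimeError(f"cannot create {parent!r}: a non-directory is in the way") from exc


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        dest = os.path.join(tmp, "pkg", "sub", "data.txt")
        ensure_parent_dir(dest)
        ensure_parent_dir(dest)  # second call is a no-op
        print(os.path.isdir(os.path.dirname(dest)))
```

Even with `exist_ok=True`, `os.makedirs` raises `FileExistsError` if the final path component exists as a regular file, so catching it remains meaningful.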
  33. building: BUNDLE: avoid cross-linking the EXECUTABLE file

    The .app bundle's executable's location should be always
    discoverable via `sys.executable`, so there is no reason to
    cross-link it into `Contents/Frameworks` and `Contents/Resources`.
    
    By not cross-linking the executable, we also prevent potential
    conflicts between the executable (i.e., program/app bundle name)
    and an eponymous package (from which we collect binaries or data
    files), which arise in scenario when same base name is used for
    the entry-point script (`something.py`) and a package (`something`).
    rokm committed Jun 30, 2023
    2f90649
  34. tests: re-enable PyQt6 >= 6.5.1 on macOS

    The crash caused by missing Info.plist should be solved now that we
    properly preserve the Qt .framework bundles.
    rokm committed Jun 30, 2023
    86d5a33
  35. building: improve handling of symlink chains with uncollected links

    If we have a symbolic link chain with intermediate links that we
    do not collect, attempt to rewrite the links to "jump over" the
    missing ones.
    
    This attempts to mitigate potential issues under Homebrew python,
    where shared libraries for `wxwidgets` have the following layout:
     * libwx_baseu-3.2.0.2.1.dylib
     * libwx_baseu-3.2.0.dylib -> libwx_baseu-3.2.0.2.1.dylib
     * libwx_baseu-3.2.dylib -> libwx_baseu-3.2.0.dylib
    
    If the other collected binaries reference only two of them,
    `libwx_baseu-3.2.0.2.1.dylib` and `libwx_baseu-3.2.dylib`, we
    end up missing the intermediate link, and the final link ends
    up as a hard copy, causing issues due to duplication.
    
    Therefore, if we do not seem to be collecting the file referenced
    by a symlink, but the referenced file itself is a symlink, follow
    it again and see if we happen to be collecting that file (and if
    that file happens to be a symlink, follow it again...).
    rokm committed Jun 30, 2023
    9a5d98b
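
The "jump over" logic from commit 35 amounts to following the link chain one hop at a time until a collected file is found. The sketch below illustrates this; the function name, the `collected_sources` set, and the hop limit are assumptions rather than the actual implementation.

```python
import os
import tempfile


def find_collected_target(link_path, collected_sources, max_hops=32):
    """Follow a symlink chain hop by hop; return the first collected file, or None."""
    current = link_path
    for _ in range(max_hops):
        if not os.path.islink(current):
            return None  # chain ended at a file that is not collected
        target = os.readlink(current)
        current = os.path.normpath(os.path.join(os.path.dirname(current), target))
        if current in collected_sources:
            return current
    return None  # give up on overly long (possibly cyclic) chains


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        real = os.path.join(tmp, "libwx_baseu-3.2.0.2.1.dylib")
        middle = os.path.join(tmp, "libwx_baseu-3.2.0.dylib")
        final = os.path.join(tmp, "libwx_baseu-3.2.dylib")
        open(real, "wb").close()
        os.symlink(os.path.basename(real), middle)
        os.symlink(os.path.basename(middle), final)
        # Only the real file and the final link are collected; the middle link is not.
        found = find_collected_target(final, {real, final})
        print(os.path.relpath(found, tmp))
```

With the result in hand, the final link can be rewritten to point directly at the collected file instead of the missing intermediate link.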
  36. building: rework symbolic link check to handle linked parent dir

    Rework the symbolic link check to handle the problem with linked
    parent directory, as observed on macOS:
     * /usr/local/Cellar/wxwidgets/3.2.2.1_1/lib/libwx_baseu-3.2.0.2.1.dylib
     * /usr/local/Cellar/wxwidgets/3.2.2.1_1/lib/libwx_baseu-3.2.0.dylib -> libwx_baseu-3.2.0.2.1.dylib
     * /usr/local/Cellar/wxwidgets/3.2.2.1_1/lib/libwx_baseu-3.2.dylib -> libwx_baseu-3.2.0.dylib
    and
     * /usr/local/opt/wxwidgets/lib/libwx_baseu-3.2.0.2.1.dylib
     * /usr/local/opt/wxwidgets/lib/libwx_baseu-3.2.0.dylib -> libwx_baseu-3.2.0.2.1.dylib
     * /usr/local/opt/wxwidgets/lib/libwx_baseu-3.2.dylib -> libwx_baseu-3.2.0.dylib
    which are actually the same, because
     * /usr/local/opt/wxwidgets -> ../Cellar/wxwidgets/3.2.2.1_1
    
    Other binaries end up referencing
    `/usr/local/opt/wxwidgets/lib/libwx_baseu-3.2.dylib` and
    `/usr/local/Cellar/wxwidgets/3.2.2.1_1/lib/libwx_baseu-3.2.0.2.1.dylib`.
    
    So in addition to the problem with chained links and missing
    intermediate link, we also need to be able to handle the situation
    where the files appear to be originating from different locations
    due to the linking of parent directories.
    rokm committed Jun 30, 2023
    d011333
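
The linked-parent-directory problem from commit 36 can be illustrated by canonicalizing only the directory part of each source path, so that two spellings of the same location compare equal while the final component (which may itself be a link) is left untouched. This is a sketch of one possible approach, not necessarily the one taken in the commit; `canonical_source_path` is a hypothetical helper.

```python
import os
import tempfile


def canonical_source_path(path: str) -> str:
    """Resolve symlinked parent directories, but keep the final component as-is."""
    parent, name = os.path.split(path)
    return os.path.join(os.path.realpath(parent), name)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        lib_dir = os.path.join(tmp, "Cellar", "wxwidgets", "3.2.2.1_1", "lib")
        os.makedirs(lib_dir)
        open(os.path.join(lib_dir, "libwx_baseu-3.2.0.2.1.dylib"), "wb").close()
        os.symlink("libwx_baseu-3.2.0.2.1.dylib",
                   os.path.join(lib_dir, "libwx_baseu-3.2.dylib"))
        os.makedirs(os.path.join(tmp, "opt"))
        os.symlink("../Cellar/wxwidgets/3.2.2.1_1",
                   os.path.join(tmp, "opt", "wxwidgets"))

        via_opt = os.path.join(tmp, "opt", "wxwidgets", "lib", "libwx_baseu-3.2.dylib")
        via_cellar = os.path.join(lib_dir, "libwx_baseu-3.2.dylib")
        print(canonical_source_path(via_opt) == canonical_source_path(via_cellar))
```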