Add bunch of hooks for contemporary ML stuff #676
Commits on Dec 21, 2023
-
tests: create separate test file for torch and its libraries
Create a separate `test_pytorch.py` test for `torch` and its associated libraries. Move existing tests to the new file.
(ecade84)
-
tests: improve the enforcement of onedir-only tests for torch
Explicitly force the `pyi_builder` fixture into onedir-only mode instead of skipping onefile tests. This reduces the number of reported tests.
(84bf7f1)
-
tests: torchvision: add test that uses torchscript
Add a test that uses `torchvision.datasets`, `torchvision.transforms`, and torchscript. The latter demonstrates the need for collecting source .py files from `torchvision`.
(e2105a4)
-
hooks: torchvision: collect source .py files
Rename `hook-torchvision.ops.py` to `hook-torchvision.py`, and add `module_collection_mode = 'pyz+py'` to collect source .py files for torch JIT/torchscript.
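The collection-mode switch is a one-liner in the hook. A minimal sketch of what the renamed hook might contain (`module_collection_mode` is a real PyInstaller hook attribute; the rest of the hook's contents are omitted here):

```python
# hook-torchvision.py (sketch; module_collection_mode requires
# PyInstaller >= 5.3)

# Collect torchvision modules both into the PYZ archive *and* as plain
# source .py files on disk, so that torch.jit / TorchScript can read
# the sources at run-time.
module_collection_mode = 'pyz+py'
```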
(f428a74)
-
hooks: torch: explicitly collect versioned .so files
When collecting binaries in the PyInstaller >= 6.0 codepath, explicitly collect versioned .so files by adding '*.so.*' to the list of search patterns passed to `collect_dynamic_libs`, in case any of those libraries are dynamically loaded at run-time.
(5bf7e4f)
-
hooks: torch: add support for external nvidia.* packages on linux
The contemporary PyPI torch wheels for Linux use CUDA libraries that are installed via `nvidia-*` packages. Therefore, attempt to convert the `nvidia-*` requirements from the `torch` metadata into hidden imports. This way, we can provide hooks for the `nvidia.*` packages that collect the shared libraries, in case any of them are dynamically loaded (which currently seems to be the case with some of the shared libraries from `nvidia.cudnn`).
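The dist-name to module-name conversion could be sketched as a small helper. This is an illustrative reimplementation under assumed mapping rules (strip the `nvidia-` prefix and `-cuXX` suffix, turn dashes into underscores), not the hook's actual code:

```python
import re

def nvidia_dist_to_module(dist_name):
    """Map an nvidia-* distribution name to the nvidia.* module it is
    assumed to provide, e.g. nvidia-cudnn-cu12 -> nvidia.cudnn.
    Returns None for non-nvidia distributions."""
    m = re.match(r'^nvidia-(.+?)(?:-cu\d+)?$', dist_name)
    if m is None:
        return None
    # Dist names use dashes; module names use underscores.
    return 'nvidia.' + m.group(1).replace('-', '_')
```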
(c901c07)
Commits on Dec 22, 2023
-
hooks: add hooks for subpackages of nvidia package
On Linux, NVIDIA CUDA 11.x and 12.x shared libraries can be installed via PyPI wheels (e.g., `nvidia-cuda-runtime-cu12`, `nvidia-cudnn-cu12`, `nvidia-cublas-cu12`). These all provide sub-packages under the `nvidia` top-level package (e.g., `nvidia.cuda_runtime`, `nvidia.cudnn`, `nvidia.cublas`). Add hooks for these sub-packages that ensure all shared libraries from each sub-package's `lib` directory are collected, in case they are dynamically loaded. For example, the `torch` PyPI wheels for Linux do not bundle CUDA inside `torch/lib` (whereas the wheels from PyTorch's own server, built against "non-default" CUDA versions, do bundle them), and dynamically load `libcudnn_ops_infer.so.8` from `nvidia.cudnn.lib`.
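A hypothetical sketch of one such sub-package hook, as a hook fragment built on PyInstaller's standard `collect_dynamic_libs` helper (the pattern list mirrors the versioned-.so handling above; the actual hooks in the PR may differ):

```python
# hook-nvidia.cudnn.py (sketch; runs only inside a PyInstaller build)
from PyInstaller.utils.hooks import collect_dynamic_libs

# Collect all shared libraries from the sub-package's directories,
# including versioned ones such as libcudnn_ops_infer.so.8, in case
# torch dlopen()s them by name at run-time.
binaries = collect_dynamic_libs('nvidia.cudnn', search_patterns=['*.so', '*.so.*'])
```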
(d6a66a0)
-
tests: add test for torchaudio that uses torchscript
Add a test for torchaudio that uses a transform to resample a signal. The test shows the need to collect binaries from the torchaudio package and, due to the use of torchscript, also the need to collect source .py files.
(a9389f1)
-
hooks: add hook for torchaudio
Add hook for torchaudio that collects binaries and ensures that source .py files are collected.
(ae2b12b)
-
tests: add test for torchtext that uses torchscript
Add a test for torchtext that uses a tokenization transform from the RoBERTa encoder. The test shows the need to collect binaries from the torchtext package and, due to the use of torchscript, also the need to collect source .py files. We perform only the tokenization part of the processing, in order to avoid having to download the whole model (~240 MB).
(061efe9)
-
hooks: add hook for torchtext
Add hook for torchtext that collects binaries and ensures that source .py files are collected.
(dd3bed6)
-
tests: move tensorflow tests to their own file
Move tensorflow tests into a separate `test_tensorflow` file. Improve the `tensorflow_onedir_only` test mark (force pyi_builder into generating only the onedir case instead of skipping the onefile case).
(293a458)
-
tests: add tests for transformers package
Add a basic `transformers` pipeline test. Add a basic import test for `transformers.DebertaModel`, which shows that we need to collect source .py files for TorchScript.
(ceff7b1)
-
hooks: add hook for transformers
Add hook for the Hugging Face `transformers` package. Attempt to automatically collect metadata for all of the package's dependencies (as declared in the `deps` dictionary in the `transformers.dependency_versions_table` module). Collect source .py files, as some of the functionality uses TorchScript.
(663df32)
-
hooks: add hook for fastai
Add a hook for fastai, which ensures that the package's source .py files are collected, as they are required for TorchScript. Add a test based on the "building models from tabular data" example from https://docs.fast.ai/quick_start.html, which demonstrates the need for the source .py files.
(fc744e9)
-
hooks: torchvision: ensure torchvision.io.image works
`torchvision.io.image` attempts to dynamically load the `torchvision.image` extension, so add a hook that ensures the latter is collected. Add a basic image encoding/decoding test.
(bbb9872)
-
hooks: add hook for timm (Hugging Face PyTorch image models)
Add hook for timm, which ensures that the package's source .py files are collected, as they are required for TorchScript. Add a basic model listing and creation test.
(030e9bb)
-
tests: add test for lightning
Add test for `lightning`, based on their "Getting started" example with an autoencoder trained on the MNIST dataset. Requires `torchvision` for the dataset.
(64751d9)
-
hooks: add hook for lightning
Add hook for (PyTorch) `lightning`. Currently, the main functionality is to ensure that the `version.info` file from the package is collected. We do not collect source .py files: it seems that even when `lightning.LightningModule.to_torchscript()` is used, it requires the source of the module defining the model that inherits from `lightning.LightningModule`, rather than `lightning`'s own sources. We can always add source .py file collection later, if it proves necessary.
(aa493db)
-
hooks: add hooks for bitsandbytes and triton
Add hooks for `bitsandbytes` and its dependency `triton`. Both packages have dynamically-loaded extension libraries, and both require collection of source .py files for (`triton`'s) JIT module. With `triton`, some submodules need to be collected only as source .py files (not into the PYZ archive), because their code naively reads the files pointed to by the `__file__` attribute under the assumption that they are source .py files. So we must not collect these modules into the PYZ.
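Per-module collection modes can be expressed as a dictionary in the hook. A sketch with illustrative entries (the actual list of source-only triton submodules is not reproduced here):

```python
# hook-triton.py (sketch; module_collection_mode requires
# PyInstaller >= 5.3)

module_collection_mode = {
    # The JIT needs importable sources alongside the byte-compiled modules.
    'triton': 'pyz+py',
    # Hypothetical example of a submodule whose code reads __file__ and
    # expects a real source file on disk; collect it as source only,
    # keeping it out of the PYZ archive.
    'triton.runtime.jit': 'py',
}
```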
(a81ba9a)
-
hooks: add hook for linear_operator
Add hook and basic test for `linear_operator`. The package uses torchscript/JIT, so we need to collect its source .py files.
(7f57989)
-
tests: add test for gpytorch
Add a basic test for GPyTorch, based on their "Simple GP Regression" example.
(85bf2ce)
-
hooks: add hook for fvcore
Add hook for `fvcore.nn` to collect its source .py files for torchscript/JIT. Add a basic import test that demonstrates the need for that.
(368fc11)
-
hooks: add hook for detectron2
Add hook for `detectron2` to collect its source .py files for torchscript/JIT. Add a basic import test that demonstrates the need for that.
(5b51e85)
-
hooks: add hook for Hugging Face datasets
Add hook for `datasets` to collect its source .py files for torchscript/JIT. Add a basic dataset loading test that demonstrates the need for that.
(4ba8a67)
-
tests: add a basic test for Hugging Face accelerate
Add basic test for Hugging Face `accelerate`; it demonstrates that for the tested subset of functionality, no hook is necessary.
(fac988f)
-
hook: tensorflow: reformat line wrapping
Use 120-character lines to reduce the amount of line wrapping and make the hook easier to read.
(4319060)
-
hook: tensorflow: revise the _pywrap_tensorflow_internal hack
Remove the `tensorflow.python._pywrap_tensorflow_internal` hack (adding it to excluded modules to avoid duplication) for PyInstaller >= 6.0, where the issue is alleviated thanks to the binary dependency analysis preserving the parent directory layout of discovered/collected shared libraries. This should fix the problem with `tensorflow` builds where the `_pywrap_tensorflow_internal` module is not used as a shared library, as seen in `tensorflow` builds for Raspberry Pi: pyinstaller#121
(2b4eeae)
-
hooks: tensorflow: collect sources for tensorflow.python.autograph
Collect sources for `tensorflow.python.autograph` to avoid run-time warning about AutoGraph being unavailable. Not sure if we need to collect sources for other parts of `tensorflow`, though; if that proves to be the case, we can adjust the collection mode later.
(8103da5)
-
Add _pyinstaller_hooks_contrib.compat module
Add a `_pyinstaller_hooks_contrib.compat` module to hide the gory details of stdlib `importlib.metadata` vs. `importlib_metadata`. Import `importlib_metadata` from `PyInstaller.compat` if available (PyInstaller >= 6.0); otherwise, duplicate the logic. Copy the `importlib_metadata` and `packaging` requirements from PyInstaller to pyinstaller-hooks-contrib.
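The fallback chain might be sketched like this; the real module also duplicates PyInstaller's preference logic between the backport and the stdlib (version checks on `importlib_metadata`), which is simplified away here:

```python
# Sketch of a compat shim for importlib metadata access.
try:
    # PyInstaller >= 6.0 exposes its own choice of implementation.
    from PyInstaller.compat import importlib_metadata
except ImportError:
    try:
        # Standalone backport package, if installed.
        import importlib_metadata
    except ImportError:
        # Fall back to the stdlib module (Python >= 3.8).
        import importlib.metadata as importlib_metadata
```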
(c85cc87)
-
hooks: tensorflow: rework the tensorflow version check
Determine tensorflow's dist name from a list of potential candidates and, if a dist is found, retrieve the version from it. Otherwise, fall back to reading `tensorflow.__version__`. The `tensorflow` package is available in several variants, and sometimes the `tensorflow` dist installs a separate dist (e.g., `tensorflow-intel` on Windows); the user can also install this separate dist without installing the "top-level" `tensorflow` one. In PyInstaller v5 and earlier, `is_module_satisfies` fell back to querying `tensorflow.__version__` if the dist could not be found; in v6, the implementation of `is_module_satisfies` (or rather, `check_requirement`) checks only the metadata. As an added bonus, direct version comparisons are nicer to read than comparisons against `tf_pre_1_15_0`, `tf_post_1_15_0`, etc.
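A sketch of the candidate-based lookup; the helper name, the candidate list, and the `None` fallback are illustrative, not taken from the actual hook:

```python
from importlib import metadata

def resolve_dist_version(candidates):
    """Return the version of the first installed distribution among the
    candidates, or None if none is found (the hook would then fall back
    to reading tensorflow.__version__)."""
    for name in candidates:
        try:
            return metadata.version(name)
        except metadata.PackageNotFoundError:
            continue
    return None

# Hypothetical candidate list; the real hook's list may differ.
TF_DIST_CANDIDATES = ('tensorflow', 'tensorflow-cpu', 'tensorflow-gpu', 'tensorflow-intel')
```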
(6ef9304)
-
hooks: tensorflow: generate hidden imports for nvidia.* modules
Linux builds of `tensorflow` can install CUDA via `nvidia-*` packages, enabled by the `and-cuda` extra marker. So parse the dist metadata for requirements, and turn the `nvidia-*` requirements into `nvidia.*` hidden imports. Consolidate the shared code for converting an `nvidia-*` dist name into an `nvidia.*` module name in a utility function, and use it in both the `torch` and `tensorflow` hooks.
(39f7dee)
-
pytest: consolidate pytest.ini into setup.cfg
Consolidate the `pytest` configuration from `pytest.ini` into `setup.cfg`, to match what we have in the main PyInstaller repository. Add `-v`, `-rsxXfE`, and `--doctest-glob=` to the test flags. The addition of `-v` ensures that in manual (local) pytest runs, the test names are displayed as they run (the CI workflows seem to explicitly specify `-v` as part of their pytest commands).
(cce1021)
-
tests: add multiprocessing.freeze_support() call to lightning test
When running `test_lightning_mnist_autoencoder` on arm64 macOS, `multiprocessing` seems to be activated at some point, and the test gets stuck due to the lack of a `multiprocessing.freeze_support()` call.
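The fix boils down to calling `multiprocessing.freeze_support()` at the very top of the entry-point module. A minimal sketch of the pattern (not the actual test program):

```python
import multiprocessing

if __name__ == '__main__':
    # In a frozen (PyInstaller) application this must run before any
    # other multiprocessing machinery; otherwise spawned child processes
    # re-execute the whole program and the parent hangs.
    multiprocessing.freeze_support()
    # ... the rest of the test program would follow here ...
```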
(0772f24)
-
hooks: tensorflow: collect plugins from tensorflow-plugins
Have the `tensorflow` standard hook collect binaries from the `tensorflow-plugins` package; this contains plugins for tensorflow's pluggable device architecture (such as `tensorflow-metal` for macOS and `tensorflow-directml-plugin` for Windows). Have the `tensorflow` run-time hook override `site.getsitepackages()` with a custom implementation that allows us to trick `tensorflow` into loading the plugins.
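The run-time hook's override might be sketched as follows; `sys._MEIPASS` only exists inside a frozen application, so this illustration falls back to the executable's directory, and the real hook's behavior may differ:

```python
import os
import site
import sys

def _pyi_getsitepackages():
    # Present the frozen application's top-level directory as the sole
    # "site-packages" directory, so that tensorflow's plugin discovery
    # finds the bundled tensorflow-plugins directory there.
    return [getattr(sys, '_MEIPASS', os.path.dirname(os.path.abspath(sys.executable)))]

# Replace the stock implementation for the lifetime of the frozen app.
site.getsitepackages = _pyi_getsitepackages
```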
(5ec43b8)