Allow models to specify target framework versions #348

Open
3 of 4 tasks
VivekPanyam opened this issue May 14, 2020 · 0 comments

VivekPanyam commented May 14, 2020

At export time, a model should be able to specify the version(s) of a framework that it is compatible with.

When we load a model, Neuropod will attempt to find a backend that satisfies the specified range.

Combined with out-of-process execution (OPE), this lets an environment provide many supported versions of each framework and lets each model choose which one to use.

This is useful in several cases:

  • Supporting models whose custom ops are only compatible with one version of the framework (e.g. custom ops built against TF 1.15 headers vs. TF 1.13.1 headers)
  • Ensuring that runtime and test environments are as similar as possible
  • Avoiding bugs in a specific version of the underlying framework

Target version ranges are specified as semver ranges (see https://semver.org/, https://docs.npmjs.com/misc/semver#ranges, https://docs.npmjs.com/misc/semver#advanced-range-syntax), which provide a very flexible way of defining the required framework version.
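As a rough illustration of how a range could be matched against installed backends, here is a minimal sketch. It is not Neuropod's implementation: the names `parse_version`, `satisfies`, and `pick_backend` are hypothetical, only space-separated comparators (`>=`, `<=`, `>`, `<`, `=`) on full `MAJOR.MINOR.PATCH` versions are handled, and picking the highest matching version is an assumption. The full npm range syntax (`^`, `~`, `||`, hyphen ranges, pre-release tags) is deliberately left out.

```python
import re
from typing import List, Optional, Tuple

Version = Tuple[int, int, int]

# One comparator: an optional operator followed by MAJOR.MINOR.PATCH
_COMPARATOR = re.compile(r"^(>=|<=|>|<|=)?(\d+\.\d+\.\d+)$")


def parse_version(text: str) -> Version:
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of ints so it compares numerically."""
    major, minor, patch = (int(part) for part in text.strip().split("."))
    return (major, minor, patch)


def satisfies(version: str, range_spec: str) -> bool:
    """Return True if `version` matches every comparator in `range_spec`.

    Simplified subset of semver ranges: comparators are separated by
    whitespace and all of them must hold (logical AND).
    """
    v = parse_version(version)
    for token in range_spec.split():
        match = _COMPARATOR.match(token)
        if match is None:
            raise ValueError(f"unsupported comparator: {token!r}")
        op = match.group(1) or "="
        bound = parse_version(match.group(2))
        ok = {
            ">=": v >= bound,
            "<=": v <= bound,
            ">": v > bound,
            "<": v < bound,
            "=": v == bound,
        }[op]
        if not ok:
            return False
    return True


def pick_backend(available: List[str], range_spec: str) -> Optional[str]:
    """Pick the highest available version satisfying the range (an assumed policy)."""
    candidates = [v for v in available if satisfies(v, range_spec)]
    return max(candidates, key=parse_version) if candidates else None
```

For example, a model exported with the range `>=1.13.1 <1.15.0` would be matched against whatever framework versions the environment provides, and loading would fail (here, return `None`) if none of them fall inside the range.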

TODO:

Once this is production ready, using dockerized OPE worker processes will also let us specify required CUDA versions (if any) and provide even more isolation.

VivekPanyam added a commit that referenced this issue Oct 14, 2020
### Summary:
This PR adds some comments to the backend loading system and refactors the loading behavior to support default backends installed at absolute paths.

This helps us start to solve a few problems:
- There are a growing number of interfaces to Neuropod (C++, Python, C, Java, etc.) and we don't want a different way of installing backends for each language.
- There are also issues with Java and `LD_LIBRARY_PATH` on macOS because of System Integrity Protection. This makes it quite difficult to use `LD_LIBRARY_PATH` to add backends for Neuropod to `dlopen`.
- Models can't choose the version of a framework to use. All the backend `so` files for a given type have the same name regardless of version, so the backend loading system cannot load a specific version. Even if the names included the version (e.g. `libneuropod_torchscript_1_4_0_cpu_backend.so`), we would still see issues when adding their containing folders to the library loader search path, because the backends ship with other unversioned shared objects that would conflict (e.g. `libc10.so` from torch 1.1.0 and `libc10.so` from torch 1.4.0).


We start to solve these by allowing users to install backends at a "well known" path on the system. This way, all installed backends will be available from all languages without any additional configuration.

For now, the "well known" location is `/usr/local/lib/neuropod/`, but this may change or become user-configurable in the future.

We solve the last problem mentioned above by installing each backend in a versioned location (that is not on the system's library path). For example, `/usr/local/lib/neuropod/0.2.0/backends/torchscript_1.4.0_gpu/libneuropod_torchscript_backend.so`.

This means each backend version can now be independently loaded without being on the system library path. See #348 for more details. 
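The directory layout described above can be sketched as a small path-building helper. This is only an illustration inferred from the single example path in this description: the function name `backend_path`, its parameters, and the `<type>_<framework version>_<device>` directory pattern are assumptions, not Neuropod's actual loader code.

```python
from pathlib import PurePosixPath

# The "well known" location named above; it may change or become configurable.
DEFAULT_BASE = PurePosixPath("/usr/local/lib/neuropod")


def backend_path(
    neuropod_version: str,
    backend_type: str,
    framework_version: str,
    device: str,
    base: PurePosixPath = DEFAULT_BASE,
) -> PurePosixPath:
    """Build the versioned install path of a backend shared object.

    Hypothetical helper: the naming pattern is inferred from the one
    example path given in the description.
    """
    # Each (type, framework version, device) combination gets its own folder,
    # so unversioned libraries bundled with different backends never collide.
    directory = f"{backend_type}_{framework_version}_{device}"
    # The shared object keeps the same name regardless of framework version.
    library = f"libneuropod_{backend_type}_backend.so"
    return base / neuropod_version / "backends" / directory / library
```

Because each folder is addressed by absolute path rather than added to the loader search path, two versions of the same backend can sit side by side without their bundled libraries shadowing each other.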

### Test Plan:
CI + Local testing

Will explore tests with multiple framework versions in CI