Request for stripped down / inference only pytorch wheels #12609
Labels
module: build
Build system issues
triaged
This issue has been looked at by a team member, and triaged and prioritized into an appropriate module
🚀 Feature
Create a precompiled, trimmed-down, inference-only version of the PyTorch wheel file.
Motivation
Right now PyTorch wheels average ~400 MB zipped and expand to 1+ GB unzipped. That's not a big deal for training and prototyping, where the wheel is generally installed only once, but it is a problem when productionizing with service providers like SageMaker, Algorithmia, etc.
Pitch
If we can create a trimmed-down, potentially inference-only wheel file, we can directly improve the load-time performance of these algorithms in serverless algorithm delivery environments, which would strengthen PyTorch's ability to compete in the HPC serverless marketplace.
Alternatives
We could also provide a clear way for users to build their own wheels, by simplifying and documenting the build process enough that optional features can be toggled at compile time.
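As a rough sketch of what the do-it-yourself path could look like: PyTorch's `setup.py` already reads a number of environment variables that disable optional components. The flags below are illustrative assumptions; check `setup.py` in the source tree for the authoritative list and exact behavior on a given version.

```shell
# Sketch: build a slimmer, CPU-only wheel from source by disabling
# optional components via environment variables read by setup.py.
# (Flag names are assumptions; verify against your PyTorch checkout.)
git clone --recursive https://github.com/pytorch/pytorch
cd pytorch

export USE_CUDA=0 USE_CUDNN=0   # drop CUDA/cuDNN support entirely
export USE_DISTRIBUTED=0        # drop multi-node training support
export BUILD_TEST=0             # skip building C++ test binaries

python setup.py bdist_wheel     # resulting wheel lands in dist/
```

Documenting which of these switches are safe for inference-only use, and how much each one saves, would get users most of the way there without any new release artifacts.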
Additional context
Full disclosure, I'm an employee at Algorithmia and this change would make my life much easier 😄
cc @malfet @seemethere @walterddr