
Request for stripped down / inference only pytorch wheels #12609

Open
zeryx opened this issue Oct 12, 2018 · 4 comments
Assignees
Labels
module: build (Build system issues), triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)

Comments

zeryx commented Oct 12, 2018

🚀 Feature

Create a precompiled PyTorch wheel file that is a trimmed-down, inference-only version.

Motivation

Right now PyTorch wheels are on average ~400 MB zipped and over 1 GB unzipped. That's not a big deal for training and prototyping, since the wheels are generally installed only once, but it is a real cost when productionizing with service providers like SageMaker, Algorithmia, etc.

Pitch

If we can create a trimmed-down, potentially inference-only wheel file, we can directly improve the load-time performance of these algorithms in serverless algorithm-delivery environments, which could directly improve PyTorch's ability to compete in the serverless HPC marketplace.

Alternatives

We could also provide a clear way for users to create their own wheels, by simplifying and documenting the build process so that optional features can be enabled or disabled at compile time.
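
As a rough illustration, a minimal sketch of what such a documented build might look like, using environment variables that PyTorch's `setup.py` reads (exact flag names can differ across PyTorch versions, so treat this as an assumption, not the official recipe):

```bash
# Build a CPU-only, inference-oriented wheel from source.
# Flag names are read by pytorch/setup.py; they may vary by
# version, so this is a sketch rather than a supported workflow.
git clone --recursive https://github.com/pytorch/pytorch
cd pytorch

# USE_CUDA=0        drop all CUDA/cuDNN kernels
# USE_DISTRIBUTED=0 omit distributed-training support
# BUILD_TEST=0      skip building C++ test binaries
USE_CUDA=0 USE_CUDNN=0 USE_DISTRIBUTED=0 BUILD_TEST=0 \
    python setup.py bdist_wheel
```

The point of documenting flags like these would be to let each deployment target decide which features it can afford to drop, rather than maintaining one official stripped wheel per configuration.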

Additional context

Full disclosure, I'm an employee at Algorithmia and this change would make my life much easier 😄

cc @malfet @seemethere @walterddr

ezyang (Contributor) commented Oct 12, 2018

Do you need CPU only, or CUDA as well? A CPU wheel is substantially smaller than the CUDA one.

soumith (Member) commented Oct 12, 2018

As Edward said, the CPU-only wheel is ~10x smaller; would it be sufficient for your purposes? You can download it from the website by selecting "CUDA: None".
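
For reference, a hedged example of pulling the CPU-only build with pip. The authoritative command is whatever the selector on pytorch.org generates; the CPU package index below is shown as an illustration:

```bash
# Install the CPU-only build instead of the default CUDA wheel.
# The index URL here points at PyTorch's CPU wheel index; check
# the pytorch.org selector for the command matching your setup.
pip install torch --index-url https://download.pytorch.org/whl/cpu
```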

zeryx (Author) commented Oct 12, 2018

CPU-only definitely gets us somewhere, although we still find it slower to download and unpack than some other ML frameworks. However, we also want to ensure that our users can access a GPU-enabled wheel without sacrificing load-time performance anywhere near as severely as they do today.
At the moment, because of how long the PyTorch GPU wheel takes to zip and unzip on our infrastructure, inference algorithms (think autoencoders and other DNN models) that depend on the GPU wheel are nearly impossible to use.

ezyang (Contributor) commented Oct 13, 2018

I think it'd generally be a good idea for us to pave the wheel-building codepath. Users can very easily, without our intervention, get stripped-down builds by, for example, specifying only the architectures they care about (see the sketch below). It will be much harder to omit backward kernels, however.
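
As a concrete sketch of the architecture-limiting point: `TORCH_CUDA_ARCH_LIST` restricts which GPU architectures get compiled into the binary. The variable is read by PyTorch's build, but the specific value and flag combination below are just an example:

```bash
# Compile CUDA kernels only for Volta (sm_70 here) rather than
# every supported architecture; fewer fatbinary entries means a
# noticeably smaller wheel. The value "7.0" is only an example.
TORCH_CUDA_ARCH_LIST="7.0" USE_DISTRIBUTED=0 BUILD_TEST=0 \
    python setup.py bdist_wheel
```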

@soumith self-assigned this Nov 7, 2018
@jbschlosser added the module: build and triaged labels Apr 13, 2021